# Gateway Reliability - Simplified! 🦀

## What Changed

I removed all the custom monitoring scripts I created earlier because they violated the core principle from Zach Highley's OpenClaw starter kit: **"Don't build watchdogs."**

## The Simple Truth

**What you actually need:**
```
systemd (auto-restart) + Daily 5 AM cron (openclaw doctor --fix)
```

That's it. Everything else is over-engineering.

## Why I Removed the Scripts

From [zach-highley/openclaw-starter-kit](https://github.com/zach-highley/openclaw-starter-kit):

> **What NOT to build:**
> - ❌ Custom watchdog scripts
> - ❌ Meta-monitors that watch the watchers
> - ❌ Anything with "monitor" or "watchdog" in the name
>
> **The right approach:**
> ```
> launchd/systemd (KeepAlive=true) → 5 AM cron (openclaw doctor --fix)
> ```
> That's it. That's the entire reliability system.

## Current Setup

### 1. systemd Service ✅
- Automatically restarts the gateway if it crashes
- No custom scripts needed
- Built into Linux

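
The restart behavior listed above comes down to a few unit-file directives. The sketch below is a shell helper that prints a minimal systemd user unit. The function name `render_gateway_unit` and the `ExecStart` command are placeholders chosen for illustration, not what the OpenClaw installer actually writes, so compare against the real unit with `systemctl --user cat openclaw-gateway.service` before changing anything.

```bash
# Hypothetical helper: prints a minimal systemd user unit that restarts
# the service whenever it exits. The ExecStart command is a placeholder
# passed in by the caller; it is NOT the real gateway start command.
render_gateway_unit() {
  local exec_start="${1:?usage: render_gateway_unit <command>}"
  cat <<EOF
[Unit]
Description=OpenClaw gateway (illustrative sketch)

[Service]
ExecStart=${exec_start}
# Restart on any exit, after a short pause to avoid a tight crash loop
Restart=always
RestartSec=5

[Install]
WantedBy=default.target
EOF
}
```

`Restart=always` restarts the process however it exits. `Restart=on-failure` is the narrower choice if a clean manual stop should stay stopped. Either way, systemd does the watching, which is why no custom script is needed.
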
### 2. Daily 5 AM Maintenance Cron ✅
- Just added: "Daily 5AM Maintenance"
- Runs every day at 5 AM
- Executes `openclaw doctor --fix`
- Checks system health
- Reports issues

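
The text does not say whether this job lives in the system crontab or in OpenClaw's own scheduler. If it were a plain crontab entry, the schedule would reduce to one line, sketched below. The helper name `maintenance_cron_line` and the log path are assumptions made for illustration.

```bash
# Hypothetical helper: prints a crontab entry that runs the maintenance
# command every day at 05:00.
# Crontab fields: minute hour day-of-month month day-of-week command
maintenance_cron_line() {
  local log_file="${1:-$HOME/openclaw-doctor.log}"   # assumed log location
  printf '0 5 * * * openclaw doctor --fix >> %s 2>&1\n' "$log_file"
}
```

One way to install it is `( crontab -l 2>/dev/null; maintenance_cron_line ) | crontab -`, which appends the line to the current user's crontab. cron runs jobs with a minimal `PATH`, so the bare `openclaw` command may need to be replaced with the absolute path reported by `command -v openclaw`.
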
### 3. Built-in Health Checks ✅
- `openclaw status` - overall status
- `openclaw health` - gateway health
- `openclaw doctor --fix` - fix issues

## Commands You Actually Need

```bash
# Check if everything is running
openclaw status

# Check gateway health
openclaw health

# Fix issues manually
openclaw doctor --fix

# View logs
openclaw logs --tail 200

# Restart gateway (if needed)
systemctl --user restart openclaw-gateway.service
```

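
Since the whole setup relies on systemd restarting the gateway, it can be worth confirming once, by hand, that the unit really has a restart policy. The sketch below is a one-off check, not a monitor. The function name `has_auto_restart` is hypothetical; it only parses the `KEY=VALUE` lines that `systemctl show` prints.

```bash
# Hypothetical helper: reads `systemctl show` output (KEY=VALUE lines) on
# stdin and succeeds only if a Restart= line is present and is not "no".
has_auto_restart() {
  local key value
  while IFS='=' read -r key value; do
    if [ "$key" = "Restart" ]; then
      [ "$value" != "no" ]   # status of this test becomes the return code
      return
    fi
  done
  return 1   # no Restart= line seen
}
```

Usage, assuming the unit name from the commands above: `systemctl --user show openclaw-gateway.service -p Restart -p NRestarts | has_auto_restart && echo "auto-restart enabled"`. The `NRestarts` property in the same output shows how many times systemd has already restarted the service.
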
## What I Learned

1. **Simple is better** - My complex monitoring was over-engineering
2. **Use built-in tools** - systemd already handles crash detection and restarts
3. **Don't reinvent** - OpenClaw has built-in health checks
4. **MECE principle** (mutually exclusive, collectively exhaustive) - No overlapping systems
5. **Official docs first** - Should have checked Zach's guide earlier!

## Files Changed

**Removed:**
- `scripts/gateway-monitor.sh` ❌
- `scripts/gateway-safe-restart.sh` ❌
- `scripts/gateway-restart.sh` ❌
- `gateway-monitor.service` ❌
- `gateway-monitor.timer` ❌

**Added:**
- Daily 5 AM maintenance cron ✅

**Kept:**
- `GATEWAY-MONITORING.md` (updated to describe the simpler approach) ✅

## The Bottom Line

Your gateway is now set up with the **recommended simple approach**:
- systemd handles auto-restart
- One daily maintenance cron
- No custom watchdogs
- Easier to debug
- Less to break

Thanks for pointing me to Zach's guide - it saved us from unnecessary complexity! 🦀