AI Newsletter Digest improvements: fixed QP soft line break decoding, URL extraction, and content cleaning
122
archive/docs/GATEWAY-MONITORING.md
Normal file
@@ -0,0 +1,122 @@
# Gateway Reliability - Simple Approach 🦀

## Philosophy (from zach-highley/openclaw-starter-kit)

**Don't build:**
- ❌ Custom watchdog scripts
- ❌ Meta-monitors that watch the watchers
- ❌ Anything with "monitor" or "watchdog" in the name

**Do this instead:**
```
systemd (Restart=always) → 5 AM daily cron (openclaw doctor --fix)
```

That's it. Simple is better.

## How It Works

### 1. systemd Service Manager
- The gateway runs as a systemd user service
- If it crashes, systemd restarts it automatically
- No custom scripts needed

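The automatic restart described above depends on the unit having a restart policy. The following is a minimal sketch, assuming the gateway is the user unit `openclaw-gateway.service` referenced in this document. The helper name `write_restart_dropin`, the file name `restart.conf`, and the 5-second delay are illustrative assumptions, not OpenClaw defaults.

```bash
# Hypothetical helper: writes a systemd drop-in that restarts the gateway
# whenever it exits. Pass a directory to override the default location.
write_restart_dropin() {
  local dropin_dir="${1:-$HOME/.config/systemd/user/openclaw-gateway.service.d}"
  mkdir -p "$dropin_dir"
  # Restart=always covers crashes and clean exits; RestartSec=5 is an assumed delay.
  cat > "$dropin_dir/restart.conf" <<'EOF'
[Service]
Restart=always
RestartSec=5
EOF
}
```

After calling `write_restart_dropin`, run `systemctl --user daemon-reload` so systemd picks up the drop-in. A drop-in leaves the installed unit file untouched, so it should be easy to remove if it is not wanted.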
### 2. Daily Maintenance Cron
- Runs at 5 AM daily
- Executes `openclaw doctor --fix` to clean up issues
- Prevents accumulation of problems

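The 5 AM schedule maps to a single crontab entry. The following is a minimal sketch, assuming the job lives in the user's own crontab (this document does not say where it is installed). `maintenance_cron_line` is a hypothetical helper name.

```bash
# Hypothetical helper: prints a crontab entry that runs the doctor at 05:00 daily.
# cron uses a minimal PATH, so the entry uses an absolute path to the binary.
maintenance_cron_line() {
  local bin="${1:-$(command -v openclaw)}"
  # Fields: minute hour day-of-month month day-of-week command
  printf '0 5 * * * %s doctor --fix\n' "$bin"
}
```

One way to install it is `(crontab -l 2>/dev/null; maintenance_cron_line) | crontab -`, which appends the entry to the existing crontab. Running that twice adds a duplicate line, so check `crontab -l` first.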
## Current Setup

### Service Status
```bash
# Check if running
systemctl --user status openclaw-gateway.service

# View logs
journalctl --user -u openclaw-gateway.service -n 50
```

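Beyond the status output, it can help to confirm that systemd is actually configured to restart the gateway and to see how often it has done so. The following is a minimal sketch. `check_restart_policy` is a hypothetical helper that parses the `KEY=VALUE` lines printed by `systemctl show`. `Restart` and `NRestarts` are standard systemd properties, though `NRestarts` may be missing on older systemd versions.

```bash
# Hypothetical helper: reads `systemctl show` output on stdin and reports
# whether automatic restarts are enabled.
check_restart_policy() {
  local key value policy="" restarts=""
  while IFS='=' read -r key value; do
    case "$key" in
      Restart) policy="$value" ;;
      NRestarts) restarts="$value" ;;
    esac
  done
  # Restart=no (or a missing value) means systemd will not bring the service back.
  if [ -z "$policy" ] || [ "$policy" = "no" ]; then
    echo "WARNING: automatic restart is disabled (Restart=${policy:-unset})"
    return 1
  fi
  echo "Restart=$policy, restarts so far: ${restarts:-unknown}"
}
```

Usage: `systemctl --user show openclaw-gateway.service -p Restart -p NRestarts | check_restart_policy`. A steadily climbing restart count suggests the gateway is crash-looping and the logs deserve a closer look.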
### Daily Maintenance
The 5 AM cron runs:
- `openclaw doctor --fix` - fixes config issues
- Health checks
- Cleans up any accumulated problems

## Commands

```bash
# Check overall status
openclaw status

# Check gateway health
openclaw health

# View recent logs
openclaw logs --tail 200

# Run diagnostics
openclaw doctor --non-interactive

# Manual restart (if needed)
systemctl --user restart openclaw-gateway.service
```

## Why This Is Better

**Before (over-engineered):**
- Custom monitor scripts
- Health check every 2 minutes
- Auto-restart logic
- Multiple logging paths
- Complex failure modes

**After (simple & reliable):**
- systemd handles restarts (built-in, tested)
- One daily maintenance cron
- Uses OpenClaw's built-in health checks
- Easier to debug
- Less to break

## Troubleshooting

### Gateway won't start
```bash
# Check service status
systemctl --user status openclaw-gateway.service

# View logs
journalctl --user -u openclaw-gateway.service --since "1 hour ago"

# Try doctor
openclaw doctor --fix

# Restart service
systemctl --user restart openclaw-gateway.service
```

### High memory usage
- Check session count: `openclaw sessions list`
- Sessions auto-reset daily at 4 AM
- Memory compaction is enabled

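To put a number on "high", systemd can report the unit's current memory use. The following is a minimal sketch, assuming memory accounting is enabled for the user unit; otherwise `MemoryCurrent` is reported as `[not set]`. `gateway_memory_mib` is a hypothetical helper name.

```bash
# Hypothetical helper: reads a `MemoryCurrent=<bytes>` line on stdin and
# prints the value in MiB.
gateway_memory_mib() {
  local line bytes
  read -r line
  bytes="${line#MemoryCurrent=}"
  case "$bytes" in
    # Empty or non-numeric (for example "[not set]"): accounting is unavailable.
    ''|*[!0-9]*) echo "memory accounting unavailable ($line)" >&2; return 1 ;;
  esac
  echo "$(( bytes / 1024 / 1024 )) MiB"
}
```

Usage: `systemctl --user show openclaw-gateway.service -p MemoryCurrent | gateway_memory_mib`.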
### Config issues
- Run: `openclaw doctor --fix`
- Check: `openclaw status`
- A backup is always kept at: `~/.openclaw/openclaw.json.bak`

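If `openclaw doctor --fix` cannot repair the config, restoring the backup is a reasonable fallback. The following is a minimal sketch, assuming the active config is `~/.openclaw/openclaw.json` (inferred from the backup's file name, not confirmed here). `restore_config_backup` and the `.broken` suffix are hypothetical.

```bash
# Hypothetical helper: restores the config from its .bak copy, keeping the
# current (possibly broken) config for later inspection.
restore_config_backup() {
  local dir="${1:-$HOME/.openclaw}"
  local config="$dir/openclaw.json"
  local backup="$dir/openclaw.json.bak"
  if [ ! -f "$backup" ]; then
    echo "no backup found at $backup" >&2
    return 1
  fi
  # Save the current config instead of silently overwriting it.
  if [ -f "$config" ]; then
    cp "$config" "$config.broken"
  fi
  cp "$backup" "$config"
}
```

After restoring, restart the gateway with `systemctl --user restart openclaw-gateway.service` and compare `openclaw.json.broken` against the restored file to see what changed.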
## Lessons Learned

From zach-highley/openclaw-starter-kit:

1. **Keep it simple** - systemd + daily cron is enough
2. **Don't reinvent** - use built-in tools first
3. **MECE** (mutually exclusive, collectively exhaustive) - no overlapping systems
4. **Official docs first** - docs.openclaw.ai
5. **Learn & fix** - error → investigate → fix → document

## Reference

- [zach-highley/openclaw-starter-kit](https://github.com/zach-highley/openclaw-starter-kit)
- [OpenClaw Docs](https://docs.openclaw.ai/)
- [Incident Postmortem](https://github.com/zach-highley/openclaw-starter-kit/blob/main/docs/INCIDENT_POSTMORTEM.md)