AI Newsletter Digest improvements: fixed QP soft line break decoding, URL extraction, and content cleaning
This commit is contained in:
125
skills/openclaw-self-healing/marketing/dev-to-article.md
Normal file
125
skills/openclaw-self-healing/marketing/dev-to-article.md
Normal file
@@ -0,0 +1,125 @@
|
||||
---
|
||||
title: "I Built a Self-Healing AI System Where Claude Code Acts as Emergency Doctor"
|
||||
published: false
|
||||
description: "4-tier autonomous recovery for OpenClaw Gateway — featuring the world's first AI-powered diagnosis and repair system"
|
||||
tags: ai, devops, automation, opensource
|
||||
cover_image: https://raw.githubusercontent.com/Ramsbaby/openclaw-self-healing/main/docs/images/architecture.png
|
||||
canonical_url: https://github.com/Ramsbaby/openclaw-self-healing
|
||||
---
|
||||
|
||||
## TL;DR
|
||||
|
||||
- **Problem**: AI agents crash at night, no one's awake to fix them
|
||||
- **Solution**: 4-tier self-healing (Watchdog → Health Check → Claude Doctor → Alert)
|
||||
- **Result**: Recovery time 30min → 5min, zero manual intervention for 90% of issues
|
||||
- **Unique**: Claude Code as autonomous emergency doctor (world's first!)
|
||||
|
||||
---
|
||||
|
||||
## The Wake-Up Call
|
||||
|
||||
*"Jarvis, why aren't you responding?"*
|
||||
|
||||
2 AM. My AI assistant was dead. Again.
|
||||
|
||||
The process was alive, but HTTP responses were timing out. Memory looked fine, but API calls were failing. Traditional process monitoring couldn't catch these "zombie" states.
|
||||
|
||||
**The irony**: An AI that runs 24/7 needs someone to watch it 24/7. But I need sleep.
|
||||
|
||||
So I built a system where **AI heals AI**.
|
||||
|
||||
---
|
||||
|
||||
## Architecture: 4 Levels of Defense
|
||||
|
||||
```
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Level 1: Watchdog (60s interval) │
|
||||
│ └─ Process dead? → Restart │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
↓
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Level 2: Health Check (300s interval) │
|
||||
│ └─ HTTP 200 failing? → 3 retries → Level 3 │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
↓
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Level 3: Claude Emergency Recovery (30m timeout) 🧠 │
|
||||
│ ├─ Launch Claude Code in tmux PTY │
|
||||
│ ├─ Autonomous diagnosis (logs, config, ports) │
|
||||
│ ├─ Autonomous repair (fix & restart) │
|
||||
│ └─ Generate recovery report │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
↓
|
||||
┌─────────────────────────────────────────────────────────┐
|
||||
│ Level 4: Human Alert (Discord notification) 🚨 │
|
||||
│ └─ Only when AI doctor fails │
|
||||
└─────────────────────────────────────────────────────────┘
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## The Secret Sauce: Claude as Doctor
|
||||
|
||||
Level 3 is where the magic happens. When Levels 1-2 fail, we spawn Claude Code in a tmux PTY session with this prompt:
|
||||
|
||||
```
|
||||
You are an OpenClaw Gateway emergency doctor.
|
||||
The gateway has been unresponsive for 5+ minutes.
|
||||
|
||||
Diagnose and fix the issue:
|
||||
1. Check `openclaw status`
|
||||
2. Analyze recent logs
|
||||
3. Validate configuration
|
||||
4. Check for port conflicts
|
||||
5. Attempt repairs
|
||||
6. Verify HTTP 200 response
|
||||
|
||||
You have 30 minutes. Save humanity.
|
||||
```
|
||||
|
||||
Claude then autonomously:
|
||||
- Reads logs and identifies patterns
|
||||
- Checks configuration for errors
|
||||
- Restarts services with fixes
|
||||
- Validates the repair worked
|
||||
|
||||
**It's like having a senior DevOps engineer on call 24/7.**
|
||||
|
||||
---
|
||||
|
||||
## Real-World Results
|
||||
|
||||
| Metric | Before | After |
|
||||
|--------|--------|-------|
|
||||
| Avg recovery time | 30 min | 5 min |
|
||||
| Night incidents resolved | 0% | 90% |
|
||||
| Manual interventions/week | 5 | 0.5 |
|
||||
|
||||
The system has been running in production for 2 weeks. Level 3 (Claude Doctor) has been triggered twice and successfully resolved both issues without human intervention.
|
||||
|
||||
---
|
||||
|
||||
## Try It Yourself
|
||||
|
||||
```bash
|
||||
# One-click install
|
||||
curl -sSL https://raw.githubusercontent.com/Ramsbaby/openclaw-self-healing/main/install.sh | bash
|
||||
```
|
||||
|
||||
**GitHub**: https://github.com/Ramsbaby/openclaw-self-healing
|
||||
**ClawHub**: `clawhub install openclaw-self-healing`
|
||||
|
||||
---
|
||||
|
||||
## What's Next?
|
||||
|
||||
- [ ] Linux support (currently macOS only)
|
||||
- [ ] Multi-node healing
|
||||
- [ ] Cost optimization (Claude API isn't free!)
|
||||
|
||||
---
|
||||
|
||||
*Have you built self-healing systems for AI agents? I'd love to hear your approach in the comments!*
|
||||
|
||||
🦞 Built with love for the OpenClaw community
|
||||
33
skills/openclaw-self-healing/marketing/hn-repost.md
Normal file
33
skills/openclaw-self-healing/marketing/hn-repost.md
Normal file
@@ -0,0 +1,33 @@
|
||||
# Hacker News 재포스팅 전략
|
||||
|
||||
## 현재 상태
|
||||
- 기존 포스트: 1 point, 0 comments (묻힘)
|
||||
- URL: https://news.ycombinator.com/item?id=46913226
|
||||
|
||||
## 재포스팅 규칙
|
||||
- HN은 같은 URL 재포스팅 금지 (30일 이내)
|
||||
- 다른 URL 사용하거나, 30일 후 재시도
|
||||
|
||||
## 대안 전략
|
||||
|
||||
### Option 1: 블로그 포스트로 우회
|
||||
1. Dev.to에 아티클 게시
|
||||
2. Dev.to URL로 HN 포스팅
|
||||
3. 제목: "Show HN: Claude Code as 24/7 Emergency Doctor for AI Agents"
|
||||
|
||||
### Option 2: 30일 후 재포스팅
|
||||
- 예정일: 2026-03-08
|
||||
- 최적 시간: US 저녁 = KST 오전 9-11시 (화~목)
|
||||
- 제목 개선: "Show HN: 4-tier self-healing for AI agents – Claude diagnoses and fixes itself"
|
||||
|
||||
### 최적 포스팅 시간
|
||||
| 시간대 | KST | 이유 |
|
||||
|--------|-----|------|
|
||||
| US 아침 | 22:00-24:00 | 출근 전 HN 체크 |
|
||||
| US 점심 | 02:00-04:00 | 점심시간 브라우징 |
|
||||
| US 저녁 | 09:00-11:00 | **최고** - 퇴근 후 여유 |
|
||||
|
||||
## 권장 액션
|
||||
1. ✅ Dev.to 아티클 먼저 게시
|
||||
2. ✅ 24시간 후 Dev.to URL로 HN 포스팅 (KST 오전 10시)
|
||||
3. ✅ 동시에 Reddit r/selfhosted 포스팅
|
||||
42
skills/openclaw-self-healing/marketing/reddit-selfhosted.md
Normal file
42
skills/openclaw-self-healing/marketing/reddit-selfhosted.md
Normal file
@@ -0,0 +1,42 @@
|
||||
# r/selfhosted Post
|
||||
|
||||
**Title**: I built a 4-tier self-healing system for my self-hosted AI agent — Claude Code acts as emergency doctor
|
||||
|
||||
**Subreddit**: r/selfhosted
|
||||
|
||||
---
|
||||
|
||||
**Body**:
|
||||
|
||||
I run OpenClaw (open-source AI assistant) on my Mac Mini 24/7. The problem? It crashes at night when I'm asleep.
|
||||
|
||||
Traditional watchdogs just restart the process, but that doesn't help when:
|
||||
- Process is alive but HTTP is timing out
|
||||
- Memory looks fine but API calls fail
|
||||
- Config got corrupted somehow
|
||||
|
||||
So I built a **4-tier self-healing system**:
|
||||
|
||||
1. **Level 1 - Watchdog** (60s): Process dead? Restart.
|
||||
2. **Level 2 - Health Check** (5min): HTTP failing? Try 3x, then escalate.
|
||||
3. **Level 3 - Claude Doctor** (30min): AI diagnoses and fixes the issue autonomously
|
||||
4. **Level 4 - Discord Alert**: Only bothers me if AI can't fix it
|
||||
|
||||
The interesting part is Level 3: Claude Code runs in a tmux PTY session, reads logs, checks config, and attempts repairs. It's like having a DevOps engineer on call 24/7.
|
||||
|
||||
**Results after 2 weeks**:
|
||||
- Recovery time: 30min → 5min
|
||||
- Night incidents auto-resolved: 90%
|
||||
- Manual interventions: 5/week → 0.5/week
|
||||
|
||||
**GitHub**: https://github.com/Ramsbaby/openclaw-self-healing
|
||||
|
||||
One-click install: `curl -sSL .../install.sh | bash`
|
||||
|
||||
Currently macOS only. Linux support coming.
|
||||
|
||||
Anyone else doing self-healing for their self-hosted AI agents? Curious how others approach this.
|
||||
|
||||
---
|
||||
|
||||
**Flair**: Automation / AI
|
||||
80
skills/openclaw-self-healing/marketing/twitter-thread.md
Normal file
80
skills/openclaw-self-healing/marketing/twitter-thread.md
Normal file
@@ -0,0 +1,80 @@
|
||||
# Twitter/X Thread
|
||||
|
||||
## Tweet 1 (Main)
|
||||
🦞 I built a self-healing AI system where Claude Code acts as emergency doctor
|
||||
|
||||
When my AI agent crashes at 2AM, it now fixes itself.
|
||||
|
||||
4-tier recovery:
|
||||
⚡ Watchdog → Health Check → Claude Doctor → Alert
|
||||
|
||||
The AI literally heals itself 🧠
|
||||
|
||||
Thread 🧵👇
|
||||
|
||||
---
|
||||
|
||||
## Tweet 2
|
||||
The problem with traditional watchdogs:
|
||||
|
||||
❌ Process alive but HTTP dead
|
||||
❌ Memory fine but API timing out
|
||||
❌ Config corrupted
|
||||
|
||||
They just restart blindly.
|
||||
|
||||
My system diagnoses WHY it failed.
|
||||
|
||||
---
|
||||
|
||||
## Tweet 3
|
||||
Level 3 is the magic ✨
|
||||
|
||||
Claude Code runs in a tmux PTY session:
|
||||
• Reads logs
|
||||
• Checks config
|
||||
• Identifies root cause
|
||||
• Attempts repair
|
||||
• Validates fix
|
||||
|
||||
30 min timeout. Fully autonomous.
|
||||
|
||||
---
|
||||
|
||||
## Tweet 4
|
||||
Results after 2 weeks in production:
|
||||
|
||||
📉 Recovery time: 30min → 5min
|
||||
🌙 Night issues auto-resolved: 90%
|
||||
👋 Manual fixes: 5/week → 0.5/week
|
||||
|
||||
---
|
||||
|
||||
## Tweet 5
|
||||
Try it yourself:
|
||||
|
||||
```
|
||||
curl -sSL https://raw.githubusercontent.com/Ramsbaby/openclaw-self-healing/main/install.sh | bash
|
||||
```
|
||||
|
||||
GitHub: github.com/Ramsbaby/openclaw-self-healing
|
||||
|
||||
⭐ if you find it useful!
|
||||
|
||||
#OpenClaw #AI #DevOps #SelfHealing
|
||||
|
||||
---
|
||||
|
||||
## Single Tweet Version (280 chars)
|
||||
🦞 Built a self-healing system for AI agents
|
||||
|
||||
When my bot crashes at 2AM, Claude Code wakes up as emergency doctor:
|
||||
• Diagnoses logs
|
||||
• Fixes config
|
||||
• Restarts services
|
||||
|
||||
AI heals AI. No human needed.
|
||||
|
||||
github.com/Ramsbaby/openclaw-self-healing
|
||||
|
||||
#AI #DevOps
|
||||
Reference in New Issue
Block a user