# 2026-02-08 Daily Log

## 🚨 Critical Incident #2: Self-Healing System Failure

### Timeline

- **11:29**: Gateway enters an infinite restart loop (caused by the `tools.exec.allowlist` config key)
- **11:29-11:40**: Watchdog retry count goes 130→158, then Exponential Backoff triggers
- **11:40:08**: Backoff completes, restart attempted (still failing)
- **11:41-11:44**: Retry count goes 164→181, Backoff re-entered
- **11:44**: Jungwoo (정우님) connects directly to the Mac mini and recovers manually
- **11:45:19**: Gateway back to normal (PID 80021)
- **12:17**: Config fully cleaned up with `openclaw doctor --fix`

### Root Cause

**The `tools.exec.allowlist` key was still present in the config**

- Key not in the schema → `exit_1` error on Gateway startup
- Either added by mistake during an earlier config patch, or its removal was missed

### Self-Recovery System Failure Analysis

#### Level 1 (Watchdog): ⚠️ Partially working

**Problem:**

- More than 6 retries → Exponential Backoff (10-minute wait)
- Could not resolve the root cause (config) → just repeated restarts indefinitely
- **185** retries in total

**Lesson:**

- Watchdog is "runaway prevention", not a "cure"
- Config problems need outside intervention

#### Level 2 (Health Check): ❌ Ineffective

**Problem:**

- If the Gateway is dead, the HTTP check is meaningless
- Even after a restart, it dies again because of the config problem

**Lesson:**

- Health Check can only recover from "transient failures"
- "Structural problems" need Level 3

#### Level 3 (Emergency Recovery): ❌ Did not run

**Problem:**

- Claude CLI not installed
- Could not even start the tmux session

**Lesson:**

- **Immediate action:** Claude CLI must be installed
- Emergency Recovery is the "last resort"

#### Level 4 (Discord Alert): ❌ Did not run

**Problem:**

- HTTP 404 (channel ID problem)
- The failed alert delayed Jungwoo noticing the incident

**Lesson:**

- **Immediate action:** Discord channel ID must be verified

### Recovery Process

1. **Manual intervention by Jungwoo** (11:40-11:44)
   - Connected to the Mac mini
   - Fixed the config or restarted manually
2. **Gateway back to normal** (11:45:19)
   - Watchdog ran the cron catch-up automatically
3. **Config cleanup** (12:17)
   - Ran `openclaw doctor --fix`
   - Removed the `tools.exec.allowlist` key

### Lessons (Second Infinite Restart Incident)

#### 1. No Config Validation

- **Problem:** An invalid config key blocks Gateway startup
- **Fix:** `openclaw doctor` is mandatory on every config change
- **Improvement:** Add schema validation before each config patch

#### 2. Self-Healing Single Point of Failure

- **Problem:** Levels 2-4 all depend on the Gateway → if the Gateway dies, they all go down with it
- **Fix:** Install Level 3 (Claude CLI) → gives an independent recovery path
- **Improvement:** Add a LaunchAgent-based Level 0 (config validation + fix)

#### 3. Emergency Recovery Dependencies Not Verified

- **Problem:** Level 3 dependency (Claude CLI) not installed
- **Problem:** Level 4 dependency (Discord channel) misconfigured
- **Fix:** Write a dependency check script
- **Improvement:** Verify automatically at system startup

#### 4. Limits of Watchdog Backoff

- **Problem:** Backoff is "runaway prevention" but cannot "resolve the root cause"
- **Fix:** Call Emergency Recovery immediately on entering Backoff
- **Improvement:** Adjust the Backoff threshold dynamically (escalate sooner when a config problem is detected)

### Immediate Action Items

- [x] **Install Claude CLI (activate Level 3)**: done 12:28 (v2.1.37)
- [x] **Fix Discord alert (Level 4)**: done 12:28 (webhook removed, stdout used instead)
- [x] **Config Validator (Level 0)**: done 12:34 (LaunchAgent registered, 5-minute interval)
- [x] **Watchdog v5.2 (Backoff → Level 3)**: done 12:34 (restarted)
- [ ] Add Self-Healing dependency check (optional)

### Statistics

- **Total retries:** 185
- **Outage duration:** ~15 min (11:29-11:45)
- **Manual intervention:** Jungwoo (11:40-11:44)
- **Recovery complete:** 11:45:19
- **Config cleanup:** 12:17

---

## Self-Healing v2.0 Improvement Plan

### Goal

"Whatever happens while Jungwoo is asleep, everything must be running normally by morning."

### Improvement Directions

#### 1. Add Level 0: Config Guardian

- **Purpose:** Detect config problems before the Gateway starts
- **Implementation:** LaunchAgent (independent process)
- **Checks:**
  - Run `openclaw doctor`
  - Invalid config key detected → automatic `--fix`
  - Schema validation
- **Frequency:** Every 5 minutes + whenever a config change is detected

#### 2. Strengthen Level 3: Remove the Claude CLI Dependency

- **Current:** Claude CLI required → does not work without it
- **Improvement:**
  - Fallback #1: Analyze the config diff with the OpenClaw CLI
  - Fallback #2: Roll back to the last known-good config
  - Fallback #3: Call Claude CLI (optional)

#### 3. Watchdog v5.2: Smart Escalation

- **Current:** Enters Backoff → waits 10 minutes (does nothing)
- **Improvement:**
  - Call Level 3 immediately on entering Backoff
  - Config problem pattern detected → call Level 0
  - 3 consecutive failures → force Emergency Recovery

#### 4. Fix Level 4: Multi-channel Alert

- **Current:** Single Discord channel (fails on 404)
- **Improvement:**
  - Add Telegram fallback
  - SMS as last resort (Twilio API)
  - Email alert (SMTP)

### Implementation Priority

1. **HIGH:** Install Claude CLI (immediately)
2. **HIGH:** Fix Discord channel ID (immediately)
3. **MEDIUM:** Level 0 Config Guardian (this week)
4. **MEDIUM:** Watchdog v5.2 Smart Escalation (this week)
5. **LOW:** Multi-channel Alert (next week)

---

## 📊 Overall System Check (12:00)

### Completed

- **Gateway update:** 2026.2.6-3 applied
- **exec.security:** changed `full` → `allowlist` (22 patterns)
- **Cron check:** 39 active crons running normally
- **Self-Healing:** all 4 layers restored

### Security Improvements

- allowlist patterns: bash, node, git, python3, curl, docker, etc.
- `ask: on-miss`: unregistered commands require approval

---

## 🔄 Context Compaction (12:35)

### Session State

- Context: 70% (141k/200k)
- Compactions: 50
- Model: claude-opus-4-5 (Thinking: high)

### Key Progress Before Compaction

1. ✅ Self-Healing v2.0.1 complete (GitHub + ClawHub synced)
2. ✅ V5.0.1 self-evaluation system complete (9.80/10)
3. ✅ Watchdog v5.1 upgrade (zombie SIGKILL + cron catch-up)
4. ✅ Blog trilogy drafts complete (awaiting publication)
5. ✅ Daily nudge cron created (weekdays 06:20)
6. ✅ PitchHut registration complete

### In Progress

- [ ] Strengthen Clawdex skill verification policy
- [ ] Publish blog (`draft: false`)
- [ ] Start n8n workflow

### Next Steps

1. Reply to the Discord question ("In progress?")
2. Update TOOLS.md (gog account info)
3. Monitor the daily nudge (first run: Mon 2/10)

---

## MEMORY.md Update Complete

- [x] Second infinite restart incident recorded (Important Decisions)
- [x] Self-Healing limits documented
- [x] v2.0 improvement plan referenced
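
### Appendix: Schema validation sketch

The "schema validation before each config patch" idea from the lessons and the Config Guardian plan could look roughly like the sketch below. It only illustrates the unknown-key check that would have caught `tools.exec.allowlist`. The `SCHEMA` contents, the placement of `security` and `ask` under `tools.exec`, and both function names are assumptions made for illustration; the real OpenClaw schema and the internals of `openclaw doctor` are not shown in this log.

```python
from copy import deepcopy
from typing import Any

# Hypothetical schema: nested dicts list the allowed keys, None marks a leaf.
# This is NOT the real OpenClaw schema, only a stand-in for the example.
SCHEMA: dict[str, Any] = {
    "tools": {"exec": {"security": None, "ask": None}},
}


def find_unknown_keys(
    config: dict[str, Any], schema: dict[str, Any], prefix: str = ""
) -> list[str]:
    """Return dotted paths of keys that appear in config but not in schema."""
    unknown: list[str] = []
    for key, value in config.items():
        path = f"{prefix}.{key}" if prefix else key
        if key not in schema:
            unknown.append(path)
        elif isinstance(value, dict) and isinstance(schema[key], dict):
            # Recurse only where both sides describe a nested section.
            unknown.extend(find_unknown_keys(value, schema[key], path))
    return unknown


def strip_unknown_keys(config: dict[str, Any], schema: dict[str, Any]) -> dict[str, Any]:
    """Return a cleaned copy of config; the input dict is left untouched."""
    cleaned: dict[str, Any] = {}
    for key, value in config.items():
        if key not in schema:
            continue  # drop keys the schema does not know about
        if isinstance(value, dict) and isinstance(schema[key], dict):
            cleaned[key] = strip_unknown_keys(value, schema[key])
        else:
            cleaned[key] = deepcopy(value)
    return cleaned


if __name__ == "__main__":
    # A patched config that still carries the stale key from this incident.
    patched = {
        "tools": {
            "exec": {
                "security": "allowlist",
                "ask": "on-miss",
                "allowlist": ["bash", "git"],
            }
        }
    }
    print(find_unknown_keys(patched, SCHEMA))  # ['tools.exec.allowlist']
    print(strip_unknown_keys(patched, SCHEMA))
```

Running the check before a patch is written, and refusing the patch when the list is non-empty, would stop the bad key before the Gateway ever sees it. Returning a cleaned copy instead of editing in place keeps the original around for a diff or rollback.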