AI Newsletter Digest improvements: fixed QP soft line break decoding, URL extraction, and content cleaning

Krilly
2026-03-04 13:29:22 +00:00
parent 29a98137a7
commit 57dd294675
13706 changed files with 2114953 additions and 237629 deletions

@@ -0,0 +1,79 @@
# Coding-Agent Runbook (PTY + Background)
This orchestrator MUST delegate implementation tasks to a coding agent.
Do not hand-code feature work directly when the skill is active.
Supported agents:
- `codex`
- `claude`
- `opencode`
- `pi`
## First Rule
At skill start, ask:
1) Which coding agent should run tasks?
2) Which fallback agent should be used if the primary fails?
## Launch Patterns
### Codex
```bash
codex exec --full-auto "<gate task prompt>"
```
### Claude
```bash
claude "<gate task prompt>"
```
### OpenCode
```bash
opencode run "<gate task prompt>"
```
### Pi
```bash
pi -p "<gate task prompt>"
```
OpenClaw execution recommendation:
- `pty:true` for interactive CLIs
- `background:true` for long-running work
- `workdir:<project-root>`
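The launch patterns above can be captured in a small lookup table. This is a minimal sketch, assuming only the argument prefixes shown in the bash blocks; `AGENT_COMMANDS` and `build_agent_command` are hypothetical names for illustration, not part of any shipped script.

```python
import shlex

# Argument prefixes copied from the launch patterns above; the prompt goes last.
AGENT_COMMANDS = {
    "codex": ["codex", "exec", "--full-auto"],
    "claude": ["claude"],
    "opencode": ["opencode", "run"],
    "pi": ["pi", "-p"],
}


def build_agent_command(agent: str, prompt: str) -> list[str]:
    """Return the argv list for the chosen agent (hypothetical helper)."""
    try:
        prefix = AGENT_COMMANDS[agent]
    except KeyError:
        raise ValueError(f"unsupported coding agent: {agent!r}") from None
    return [*prefix, prompt]


def render_for_shell(argv: list[str]) -> str:
    """Quote the argv list so it can be pasted into a terminal safely."""
    return shlex.join(argv)
```

Passing an argv list instead of a single shell string avoids quoting problems when the gate prompt contains quotes or newlines.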
## Required Orchestration Loop
1. Generate the gate prompt (`generate_gate_prompt.py`).
2. Execute the selected coding agent with that prompt (`agent_exec.py` or equivalent).
3. Require the coding agent to update docs immediately after task completion:
   - docs/tasks.md
   - docs/progress.md
   - docs/change-log.md
   - docs/traceability.md
   - docs/test-results.md
   - docs/agent-handoff.md
4. The OpenClaw agent runs verification itself:
   - CLI checks in terminal
   - Browser/manual checks for web journeys
5. If validation fails:
   - summarize the issue clearly with command/flow + output
   - re-spawn the coding agent with a fix prompt (same task/spec)
   - require docs updates again
   - re-test
6. Update gate status only after verification passes.
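The loop above can be sketched as a single function. This is an illustration under stated assumptions: `run_agent` and `verify` are hypothetical callables standing in for the coding-agent launch and the orchestrator's own checks, and the limit of 2 fix attempts is an assumed default, not something this runbook prescribes.

```python
from typing import Callable, Tuple


def run_gate_task(
    run_agent: Callable[[str], None],
    verify: Callable[[], Tuple[bool, str]],
    gate_prompt: str,
    max_fix_attempts: int = 2,  # assumed default
) -> str:
    """Run the agent, verify, and re-spawn with a fix prompt until checks pass."""
    prompt = gate_prompt
    for _attempt in range(max_fix_attempts + 1):
        run_agent(prompt)       # steps 2-3: agent implements and updates docs
        ok, details = verify()  # step 4: orchestrator runs its own checks
        if ok:
            return "PASS"       # step 6: gate status may change only now
        # Step 5: same task/spec, plus the concrete failure evidence.
        prompt = (
            f"{gate_prompt}\n\n"
            f"Fix the following verification failure:\n{details}"
        )
    return "FAIL"
```

Keeping the original gate prompt inside every fix prompt means the agent always sees the same task and spec, as step 5 requires.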
## Manual Review Responsibility
Even in autonomous mode, the OpenClaw agent performs manual verification itself:
- Web/UI flows: run in browser tools, test critical journeys.
- CLI flows: run required commands in terminal and inspect outputs.
If checks fail, send concrete failure details to the coding agent, request a fix, and retest.
## Completion Wake Pattern
For long runs, require coding agent wake messages to include task + verification handoff.
"When fully done, run:
openclaw gateway wake --text 'Done: <gate> | task: <summary> | handoff: see docs/agent-handoff.md for CLI+Browser checks' --mode now"

@@ -0,0 +1,76 @@
# Gate Checklists
## G0 Intake Complete
- Planning questionnaire started
- Mission, scope, journeys captured
- project_mode and execution_mode selected
## G1 Planning Approved
- Requirements testable and clear
- **Specs created for all v1 features** (in `docs/specs/` or `docs/requirements.md`)
- **Each spec has acceptance criteria (testable, not vague)**
- Definition of Done captured
- Acceptance tests drafted
- Risks and assumptions listed
## G2 Architecture Approved
Common:
- architecture doc updated
- ADR-0001 completed with alternatives
- **ADRs reference relevant specs**
- test strategy and security baseline included
- **`docs/specs/` directory exists with feature specs**
Greenfield preconditions:
- bootstrap architecture is complete
- **At least one feature spec approved**
Brownfield preconditions:
- as-is architecture complete
- system inventory + dependency map complete
- characterization baseline exists
- migration plan + compatibility matrix complete
- **Existing behavior documented before changes specced**
## G3 Slice-1 Build Verified
- **Task references spec section** (e.g., `Spec: specs/auth.md#login`)
- first vertical slice implemented
- **Implementation matches spec acceptance criteria**
- unit tests pass for slice
- integration test for key path passes
- manual smoke path passes
- docs updated
## G4 Full Build Verified
- **All tasks in g4-task-plan.md have spec references**
- lint/type/build pass
- unit + integration suite pass
- e2e critical paths pass
- contract checks pass (if API boundaries exist)
- migration checks pass (brownfield)
- **All spec acceptance criteria verified**
## G5 Security & Quality Verified
- secret scanning baseline
- dependency vulnerability baseline
- auth/authorization checks
- input validation checks
- error handling/logging checks
- performance smoke checks
## G6 Release Candidate Verified
- release checklist complete
- rollback instructions tested/validated
- monitoring/alerts configured
- open risks acknowledged
## G7 Production/Handover Complete
- post-deploy smoke passes
- handover notes complete
- incident/runbook notes complete
- backlog of follow-ups created
## State Transitions
Allowed: PENDING -> IN_PROGRESS -> PASS/FAIL/BLOCKED
- FAIL requires evidence + remediation plan
- BLOCKED requires owner + unblock condition
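The transition rules above can be expressed as a small guard function. A minimal sketch: the field names (`evidence`, `remediation_plan`, `owner`, `unblock_condition`) and the `transition` helper are hypothetical, chosen to mirror the two bullets above.

```python
from typing import Optional

# Allowed: PENDING -> IN_PROGRESS -> PASS/FAIL/BLOCKED
ALLOWED = {
    "PENDING": {"IN_PROGRESS"},
    "IN_PROGRESS": {"PASS", "FAIL", "BLOCKED"},
}

# Extra information each terminal state must carry (hypothetical field names).
REQUIRED_FIELDS = {
    "FAIL": ("evidence", "remediation_plan"),
    "BLOCKED": ("owner", "unblock_condition"),
}


def transition(current: str, new: str, details: Optional[dict] = None) -> str:
    """Return the new state, or raise if the move or its details are invalid."""
    details = details or {}
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"transition {current} -> {new} is not allowed")
    missing = [f for f in REQUIRED_FIELDS.get(new, ()) if not details.get(f)]
    if missing:
        raise ValueError(f"{new} requires: {', '.join(missing)}")
    return new
```

For example, `transition("IN_PROGRESS", "BLOCKED", {"owner": "alice"})` raises because the unblock condition is missing.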

@@ -0,0 +1,168 @@
# Gate Prompt Templates (Codex)
Copy/adapt these templates per gate. Keep prompts explicit and evidence-oriented.
## Common Prompt Header
```text
You are implementing Gate <GATE_ID> for this project.
Constraints:
- Follow AGENTS.md workflow rules exactly.
- Update documentation after every meaningful change.
- Run required validations and report evidence.
- Do not claim completion without test outputs.
- Do not assume requirements; if unclear, stop and ask.
- If spec reference is missing for implementation work, return BLOCKED and do not code.
Output contract (mandatory):
- STATUS: DONE | BLOCKED
- TASK: <single task>
- SPEC_REF: <reference or BLOCKED reason>
- FILES_CHANGED: <list>
- VALIDATION_RUN: <commands + outcomes>
- OPENCLAW_VERIFY: <cli checks + browser checks or N/A>
- RISKS: <list or NONE>
Required docs updates:
- docs/tasks.md
- docs/progress.md
- docs/change-log.md
- docs/traceability.md
- docs/test-results.md
- docs/agent-handoff.md
When fully done, run:
openclaw gateway wake --text "Done: <GATE_ID> completed with evidence" --mode now
```
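The output contract in the header above can be checked mechanically before the orchestrator accepts a report. A minimal sketch, assuming the agent emits each field on one line as `FIELD: value` (optionally prefixed with `- `); `parse_contract` is a hypothetical helper, and multi-line values are not handled.

```python
CONTRACT_FIELDS = (
    "STATUS", "TASK", "SPEC_REF", "FILES_CHANGED",
    "VALIDATION_RUN", "OPENCLAW_VERIFY", "RISKS",
)


def parse_contract(report: str) -> dict:
    """Extract 'FIELD: value' lines and reject reports that break the contract."""
    found = {}
    for line in report.splitlines():
        # Drop an optional list marker, then split on the first colon only.
        key, sep, value = line.strip().lstrip("- ").partition(":")
        if sep and key.strip() in CONTRACT_FIELDS:
            found[key.strip()] = value.strip()
    missing = [f for f in CONTRACT_FIELDS if f not in found]
    if missing:
        raise ValueError(f"missing contract fields: {', '.join(missing)}")
    if found["STATUS"] not in ("DONE", "BLOCKED"):
        raise ValueError(f"invalid STATUS: {found['STATUS']!r}")
    return found
```

Rejecting incomplete reports early keeps a "done" claim without validation output from reaching the gate status.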
## G1 Planning Prompt
```text
Objective: Complete planning artifacts.
Tasks:
1) Finalize requirements with testable acceptance criteria.
2) Capture Definition of Done.
3) List assumptions and risks.
4) If research_mode=true, produce docs/research-notes.md with options and recommendation.
Validations:
- Ensure requirements are testable and unambiguous.
- Ensure acceptance criteria map to at least one test each.
Done condition:
- docs/requirements.md complete
- docs/plan.md updated
- docs/progress.md updated for G1
```
## G2 Architecture Prompt
```text
Objective: Complete architecture baseline and ADR.
Tasks:
1) Update docs/architecture.md (components, data flow, deployment, security baseline).
2) Update docs/adr/ADR-0001-initial-architecture.md with alternatives and trade-offs.
3) For brownfield, ensure as-is architecture + migration artifacts are current.
Validations:
- Architecture supports must-have journeys.
- ADR includes at least 2 alternatives.
Done condition:
- G2 artifacts complete and cross-linked in docs/traceability.md
```
## G3 Slice-1 Prompt
```text
Objective: Deliver and verify first vertical slice.
Tasks:
1) Implement first slice for the top priority user journey.
2) Add unit and integration tests for this slice.
3) Execute manual smoke test for the slice.
Validations:
- unit tests pass
- integration tests pass
- manual smoke scenario recorded in docs/test-results.md
Done condition:
- slice-1 works end-to-end with evidence
```
## G4 Full Build Prompt
```text
Objective: Complete full build and baseline verification.
Tasks:
1) Implement remaining in-scope v1 features.
2) Run full validation suite.
3) Resolve failures or document blockers.
Validations:
- lint/type/build pass
- unit/integration/e2e pass
- contract checks pass if API boundaries exist
Done condition:
- all in-scope features implemented and verified
```
## G5 Security & Quality Prompt
```text
Objective: Execute security and quality gate.
Tasks:
1) Run dependency/secret baseline checks.
2) Verify auth/input validation/error handling.
3) Run performance smoke checks.
Validations:
- no unresolved critical/high issues
- mitigation plan logged for medium/low issues
Done condition:
- security and quality evidence logged
```
## G6 Release Candidate Prompt
```text
Objective: Prepare and verify release candidate.
Tasks:
1) Complete release checklist.
2) Validate rollback instructions.
3) Confirm monitoring/alerts baseline.
Validations:
- release-checklist complete
- rollback approach validated
- docs versioned and coherent
Done condition:
- RC ready for approval/deployment
```
## G7 Handover Prompt
```text
Objective: Complete handover and close orchestration.
Tasks:
1) Execute post-deploy smoke tests.
2) Finalize handover notes + runbook pointers.
3) Create next-iteration backlog.
Validations:
- critical journeys pass in deployed environment
- unresolved risks have owners
Done condition:
- docs/progress.md reaches 100% and project is handover-ready
```

@@ -0,0 +1,72 @@
# Manual Test Templates
Use these templates for human-verifiable checks. Record all runs in `docs/test-results.md`.
## Mandatory Orchestrator Behavior
- The orchestrator itself performs manual verification after coding agent changes.
- For web/UI systems: run real browser checks.
- For CLI systems: run actual commands and inspect outputs.
- If verification fails: orchestrator re-spawns coding agent with a fix prompt, then re-tests.
## Web App Manual Tests
### WT-001: Auth Login Journey (if auth exists)
- Preconditions: test user account exists
- Steps:
  1. Open login page in a real browser
  2. Submit valid credentials
  3. Confirm landing on authenticated area
- Expected: login succeeds, no console/server errors
### WT-002: Core CRUD Journey
- Steps:
  1. Create an entity
  2. View it in listing/detail
  3. Edit it
  4. Delete it
- Expected: data lifecycle works end-to-end
### WT-003: Failure Path
- Steps:
  1. Trigger invalid input
  2. Trigger API/server failure scenario
- Expected: graceful errors, no crash, clear recovery path
### WT-004: Payment Journey (if payments exist)
- Steps:
  1. Execute success path
  2. Execute failure/cancel path
- Expected: both handled correctly with consistent state
## CLI Manual Tests
### CT-001: Happy Path Command
- Steps: run primary command with valid inputs
- Expected: success exit code and expected output
### CT-002: Invalid Input Handling
- Steps: run command with malformed/missing args
- Expected: clear error, non-zero exit, no crash
### CT-003: Config Handling
- Steps: run with expected config + missing config
- Expected: explicit behavior and guidance
### CT-004: Output Contract
- Steps: verify stdout/stderr format against docs
- Expected: output consistent and parseable if required
## Brownfield Migration Tests
### BT-001: Legacy/Modern Parity Check
- Steps: run same scenario against old and new path
- Expected: equivalent behavior for supported scope
### BT-002: Rollback Rehearsal
- Steps: deploy migration slice then execute rollback procedure
- Expected: service restored cleanly to prior known-good state
### BT-003: Contract Compatibility
- Steps: verify consumer/provider boundary contracts
- Expected: no breaking contract changes

@@ -0,0 +1,42 @@
# Modes
## 1) Project Mode
### greenfield
Use for new systems from scratch.
Expected pre-architecture outputs:
- requirements baseline
- architecture baseline
- ADR-0001
- initial CI/test strategy
### brownfield
Use for onboarding and evolving existing systems.
Expected pre-architecture outputs:
- as-is architecture
- system inventory
- dependency map
- legacy risk register
- characterization test baseline
- migration strategy with rollback points
- compatibility matrix
## 2) Execution Mode
### autonomous
- proceed automatically when gate checks pass
- auto-repair up to configured retries (default 2)
- pause only on persistent failures/blockers
### gated
- pause at every gate
- present pass/fail evidence
- require explicit user go-ahead to proceed
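The two execution modes can be read as a decision function evaluated after each gate check. This is a sketch only: the returned action names (`PROCEED`, `AUTO_REPAIR`, `PAUSE_ON_FAILURE`, `PAUSE_FOR_APPROVAL`) and the `next_action` helper are hypothetical labels, not identifiers from the orchestrator.

```python
def next_action(
    mode: str,
    gate_passed: bool,
    repairs_used: int,
    max_retries: int = 2,  # "configured retries (default 2)"
) -> str:
    """Decide what the orchestrator does after a gate check."""
    if mode == "gated":
        # Gated mode pauses at every gate, pass or fail, and waits for go-ahead.
        return "PAUSE_FOR_APPROVAL"
    if mode != "autonomous":
        raise ValueError(f"unknown execution mode: {mode!r}")
    if gate_passed:
        return "PROCEED"
    if repairs_used < max_retries:
        return "AUTO_REPAIR"
    # Persistent failure: retries are exhausted, so hand control back.
    return "PAUSE_ON_FAILURE"
```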
## Recommended Defaults
- Unknown/new domain → `gated`
- High-risk brownfield migration → `gated`
- Well-understood internal greenfield project → `autonomous`

@@ -0,0 +1,65 @@
# Planning Questionnaire (Mandatory)
Ask these in order. Do not start implementation until critical answers are provided.
## 0) Coding Agent Selection (Ask First)
1. Which coding agent should run implementation tasks? (`codex` | `claude` | `opencode` | `pi`)
2. What is the fallback coding agent if the primary fails repeatedly?
## A) Outcome and Scope
3. What are we building (one-sentence mission)?
4. Who are the target users?
5. What is in scope for v1?
6. What is explicitly out of scope?
7. What is the deadline (if any)?
## B) User Journeys and Success
8. What are the top 3 user journeys?
9. What must work on day one (must-have features)?
10. What metrics define success (adoption, conversion, latency, reliability)?
11. What does “Definition of Done” mean for this project?
## C) Product and Compliance Constraints
12. Any legal/compliance constraints (privacy, data residency, PCI, HIPAA, etc.)?
13. Any accessibility level target (e.g., WCAG baseline)?
14. Any browser/device/platform constraints?
15. Any third-party integrations required?
## D) Technical Constraints
16. Preferred stack (frontend/backend/database/infra)?
17. Existing repo or greenfield?
18. Required hosting target (Cloudflare, Vercel, AWS, on-prem, etc.)?
19. Required CI/CD platform?
20. Auth requirements (roles, SSO, OAuth providers)?
21. Payments/subscriptions needed?
22. Data model complexity and expected scale?
## E) Quality and Operations
23. Required test levels (unit/integration/e2e/perf/security)?
24. Availability target/SLO?
25. Logging/monitoring/alerting requirements?
26. Rollback expectations?
27. Backup and disaster recovery expectations?
## F) Orchestration Preferences
28. Mode: `autonomous` or `gated`?
29. Should `research_mode` run during planning? (`true/false`)
30. In gated mode, who approves each gate?
31. In autonomous mode, should the orchestrator auto-repair failures, with up to 2 retries? (`true/false`)
32. Preferred progress update frequency?
## G) Acceptance and Sign-off
33. What are the exact acceptance tests for launch?
34. What evidence is required at each gate?
35. Final approver for release?
## Minimum Inputs Required to Start Build
- Primary coding agent choice
- Mission
- Top user journeys
- v1 scope
- Hosting target
- Stack preference (or explicit “recommend one”)
- Mode (`autonomous` or `gated`)
- Definition of Done
- Acceptance tests
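The minimum-inputs list above lends itself to a simple pre-build check. A minimal sketch, assuming questionnaire answers are collected in a dictionary; the key names and the `missing_inputs` helper are hypothetical, chosen to mirror the bullets above.

```python
# One hypothetical key per required input listed above.
REQUIRED_INPUTS = (
    "primary_agent", "mission", "top_user_journeys", "v1_scope",
    "hosting_target", "stack_preference", "mode",
    "definition_of_done", "acceptance_tests",
)

VALID_MODES = ("autonomous", "gated")


def missing_inputs(answers: dict) -> list:
    """Return the required inputs that are absent, empty, or invalid."""
    missing = [key for key in REQUIRED_INPUTS if not answers.get(key)]
    mode = answers.get("mode")
    if mode and mode not in VALID_MODES:
        missing.append("mode")  # present but not one of the two allowed values
    return missing
```

An empty return value means the build may start; anything else is a list of questions to go back and ask.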

@@ -0,0 +1,39 @@
# Research Playbook
Use during planning when `research_mode=true`.
## Goals
- Reduce architecture risk before implementation
- Provide transparent option comparison
- Tie decisions to requirements and constraints
## Research Procedure
1. Restate research questions from planning gaps.
2. Define decision criteria (cost, complexity, speed, security, scale, lock-in).
3. Generate 2-4 viable options per major decision:
- app architecture
- data layer
- deployment model
- auth model
- testing strategy
4. For each option, record:
- fit for requirements
- trade-offs
- operational burden
- risk profile
5. Recommend one option with confidence score (low/medium/high).
6. Convert recommendation into ADR draft.
## Output Template (`docs/research-notes.md`)
- Questions
- Decision Criteria
- Options Compared
- Recommendation
- Risks and Mitigations
- Follow-up Questions
## Quality Rules
- Prefer primary docs and well-established references.
- Avoid single-source decisions for critical architecture choices.
- Mark unknowns explicitly.
- Do not present uncertain conclusions as facts.

@@ -0,0 +1,185 @@
# Spec-Driven Development (Non-Negotiable)
This is the governing principle of the orchestrator: **no code without a spec**.
## Core Rule
The coding agent MUST NOT write implementation code until a written, approved spec exists for what it is about to build. This prevents:
- Guessing at requirements
- Making assumptions about behavior
- Building features the user didn't ask for
- Architectural drift from undocumented decisions
## What Counts as a Spec
A spec is a written document (in `docs/` or inline in a task file) that includes:
1. **What** is being built (feature/component/fix)
2. **Why** it's needed (user story, problem statement)
3. **Acceptance criteria** (testable conditions for "done")
4. **Constraints** (tech stack, performance, security, compatibility)
5. **Out of scope** (what this does NOT do)
Minimum viable spec for a single task:
```markdown
## Task: [Name]
**Goal:** [One sentence]
**Acceptance Criteria:**
- [ ] Criterion 1
- [ ] Criterion 2
**Constraints:** [Any limits]
**Out of Scope:** [What we're not doing]
```
## Spec Lifecycle
### 1. Spec Creation (Before G2)
- Orchestrator (or user) writes the spec
- Spec is stored in `docs/specs/` or embedded in `docs/requirements.md`
- For brownfield: existing behavior must be documented first
### 2. Spec Approval (Before Implementation)
- User reviews and approves (in gated mode)
- Or orchestrator validates completeness (in autonomous mode)
- Spec is marked APPROVED in `docs/specs/` or status.json
### 3. Spec → Task Mapping (G3/G4)
- Each task in `docs/g4-task-plan.md` MUST reference a spec section
- Format: `Spec: requirements.md#feature-name` or `Spec: specs/auth.md`
- Tasks without spec references are BLOCKED
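The mapping rule above can be checked with a regular expression. This is a sketch that assumes each task carries its reference on its own line in the `Spec: <file>.md[#section]` format shown above; `find_spec_ref`, `task_status`, and the `READY` label are hypothetical.

```python
import re
from typing import Optional, Tuple

# Matches e.g. "Spec: requirements.md#feature-name" or "- Spec: specs/auth.md".
SPEC_REF = re.compile(
    r"^\s*(?:[-*]\s*)?Spec:\s*(?P<path>[\w./-]+\.md)(?:#(?P<anchor>[\w-]+))?\s*$",
    re.MULTILINE,
)


def find_spec_ref(task_text: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return (path, anchor) for the first spec reference, or None."""
    match = SPEC_REF.search(task_text)
    if match is None:
        return None
    return match.group("path"), match.group("anchor")


def task_status(task_text: str) -> str:
    """Tasks without a spec reference are BLOCKED."""
    return "READY" if find_spec_ref(task_text) else "BLOCKED"
```

This only checks that a reference is present and well-formed; confirming that the referenced file and section exist would be a separate step.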
### 4. Implementation (Coding Agent)
- Agent receives: spec + task description + context
- Agent MUST NOT invent features not in spec
- Agent MUST flag spec gaps and request clarification (not guess)
### 5. Verification Against Spec
- Orchestrator checks implementation against acceptance criteria
- Deviation from spec = FAIL (not creative license)
## Enforcement Points
### Gate G1 (Planning Approved)
- `docs/requirements.md` must exist with testable requirements
- Acceptance criteria must be explicit, not vague
### Gate G2 (Architecture Approved)
- `docs/specs/` directory must exist with at least one spec file
- Or `docs/requirements.md` must have spec-level detail for v1 features
- ADR references must point to spec decisions
### Gate G3/G4 (Build)
- Each task prompt MUST include:
- Spec reference
- Acceptance criteria from spec
- Explicit boundaries
- `run_gate.py` blocks tasks without `--spec-ref` argument
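One way a script could enforce the `--spec-ref` requirement is to declare it as a required command-line option. The sketch below is not the actual `run_gate.py`; only the `--spec-ref` flag comes from this document, while `--gate`, `--task`, and `build_parser` are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that refuses to run a task without a spec reference."""
    parser = argparse.ArgumentParser(
        description="Run a gate task (illustrative sketch, not the real run_gate.py)."
    )
    parser.add_argument("--gate", required=True, help="gate id, e.g. G3 (assumed flag)")
    parser.add_argument("--task", required=True, help="task summary (assumed flag)")
    # argparse exits with status 2 and an error message if this is omitted.
    parser.add_argument(
        "--spec-ref",
        required=True,
        help="spec reference, e.g. specs/auth.md#login",
    )
    return parser
```

With this shape, an invocation that omits `--spec-ref` fails before any agent is launched, which matches the "blocks tasks" behavior described above.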
### Coding Agent Prompt Template
All coding agent prompts MUST include this preamble:
```text
## SPEC-DRIVEN RULES
1. You are implementing ONLY what is specified below.
2. Do NOT add features, abstractions, or "improvements" not in spec.
3. If the spec is unclear or incomplete, STOP and ask for clarification.
4. Do NOT guess at requirements. Ever.
5. Your output will be verified against the acceptance criteria below.
## SPEC
[Insert spec section here]
## ACCEPTANCE CRITERIA
[Insert criteria here]
## TASK
[Insert specific task]
```
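Assembling the preamble programmatically helps ensure no prompt goes out without it. A minimal sketch: the rule text is copied from the template above, while `build_prompt` and its parameters are hypothetical.

```python
SPEC_PREAMBLE = """\
## SPEC-DRIVEN RULES
1. You are implementing ONLY what is specified below.
2. Do NOT add features, abstractions, or "improvements" not in spec.
3. If the spec is unclear or incomplete, STOP and ask for clarification.
4. Do NOT guess at requirements. Ever.
5. Your output will be verified against the acceptance criteria below.
"""


def build_prompt(spec: str, criteria: list, task: str) -> str:
    """Fill the template; refuse to build a prompt with no spec or criteria."""
    if not spec.strip():
        raise ValueError("BLOCKED: spec section is empty")
    if not criteria:
        raise ValueError("BLOCKED: acceptance criteria are missing")
    criteria_lines = "\n".join(f"- {item}" for item in criteria)
    return (
        f"{SPEC_PREAMBLE}"
        f"## SPEC\n{spec}\n"
        f"## ACCEPTANCE CRITERIA\n{criteria_lines}\n"
        f"## TASK\n{task}\n"
    )
```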
## Red Flags (Auto-Fail)
The following trigger automatic gate failure:
- Task executed without spec reference
- Coding agent added unrequested features
- Acceptance criteria missing or vague ("should work well")
- Implementation diverged from spec without change request
- Assumptions documented as facts
## Change Requests
If requirements change mid-build:
1. Run `change_impact.py` to assess impact
2. Update spec documents
3. Re-approve affected specs
4. Update traceability matrix
5. Only then resume implementation
No "I'll just add this quickly" — all changes go through spec update.
## Spec Templates
### Feature Spec (`docs/specs/feature-name.md`)
```markdown
# Feature: [Name]
## Overview
[1-2 sentences]
## User Story
As a [user type], I want [goal] so that [benefit].
## Acceptance Criteria
- AC-1: Given [context], when [action], then [result]
- AC-2: Given [context], when [action], then [result]
## Allowed Scope Files
- src/path/to/feature/**
- tests/path/to/feature/**
## Technical Constraints
- [Stack/performance/security constraints]
## Dependencies
- [Other features, APIs, services]
## Out of Scope
- [What this feature explicitly does NOT do]
## Open Questions
- [Anything needing clarification before implementation]
```
### API Endpoint Spec
```markdown
# Endpoint: [Method] [Path]
## Purpose
[What this endpoint does]
## Request
- Method: [GET/POST/etc]
- Path: [/api/v1/resource]
- Auth: [Required/None/Scope]
- Body: [Schema or example]
## Response
- Success: [Status + schema]
- Errors: [Status codes + meanings]
## Validation Rules
- [Field validations]
## Side Effects
- [Database changes, events emitted, etc]
```
## Summary
**Spec → Approve → Implement → Verify**
No shortcuts. No guessing. No "I assumed you wanted..."
The spec is the contract. Deviate = Fail.

@@ -0,0 +1,71 @@
# Testing Matrix (Gate-Based)
Apply this matrix on every project. Expand when domain-specific risks appear.
## Gate G1 (Planning)
- Validate requirements clarity
- Validate acceptance criteria are testable
- Validate risks and assumptions listed
## Gate G2 (Architecture)
- Validate architecture supports all must-have journeys
- Validate threat model baseline exists
- Validate ADR exists with alternatives and trade-offs
## Gate G3 (Slice-1 Build)
- Unit tests for first slice pass
- Integration test for key flow passes
- Manual smoke test of one critical journey passes
- Docs updated for slice
## Gate G4 (Full Build)
- Lint/type/build clean
- Unit/integration suite pass
- E2E critical paths pass
- API contract checks pass (if relevant)
- Data migration checks pass (if relevant)
## Gate G5 (Security & Quality)
- Secret scanning baseline
- Dependency vulnerability scan baseline
- AuthN/AuthZ checks
- Input validation checks
- Error handling/logging checks
- Performance smoke checks
## Gate G6 (Release Candidate)
- Release checklist complete
- Rollback steps tested or validated
- Monitoring/alerts configured
- Versioned docs complete
## Gate G7 (Production/Handover)
- Post-deploy smoke tests pass
- Incident runbook available
- Handover notes complete
- Open risks tracked with owners
## Manual Testing Requirements
For Web Projects:
- Login flow (if auth exists)
- Core create/read/update/delete journey
- Payment happy path + failure path (if payments exist)
- Error page and recovery behavior
For CLI Projects:
- Core command success path
- Invalid input handling
- Config loading behavior
- Output format consistency
## Evidence Format
For every gate, record in `docs/test-results.md`:
- test name
- command or steps
- expected result
- actual result
- pass/fail
- evidence link/snippet
- timestamp
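The evidence fields above map naturally onto a small record type. This is a sketch under assumptions: `EvidenceRecord` and `to_markdown` are hypothetical names, and the bullet layout is one possible rendering for `docs/test-results.md`, not a prescribed format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class EvidenceRecord:
    test_name: str
    steps: str            # command or manual steps
    expected: str
    actual: str
    passed: bool
    evidence: str         # link or short output snippet
    timestamp: str = ""   # filled with the current UTC time when left empty

    def to_markdown(self) -> str:
        """Render the record as a bullet list, one field per line."""
        stamp = self.timestamp or datetime.now(timezone.utc).isoformat(timespec="seconds")
        return "\n".join([
            f"- test name: {self.test_name}",
            f"- command or steps: {self.steps}",
            f"- expected result: {self.expected}",
            f"- actual result: {self.actual}",
            f"- pass/fail: {'pass' if self.passed else 'fail'}",
            f"- evidence link/snippet: {self.evidence}",
            f"- timestamp: {stamp}",
        ])
```

Recording expected and actual results side by side makes a failed gate easier to hand back to the coding agent with concrete details.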