AI Newsletter Digest improvements: fixed QP soft line break decoding, URL extraction, and content cleaning

Krilly
2026-03-04 13:29:22 +00:00
parent 29a98137a7
commit 57dd294675
13706 changed files with 2114953 additions and 237629 deletions

@@ -0,0 +1,70 @@
# AI Newsletter Digest Template
This is the proper format for AI newsletter digests, matching the quality of the OpenClaw Daily Digest.
## Required Elements
### Header
- Date (formatted nicely)
- Count of newsletters analyzed
- List of sources
### Each Story Should Include:
1. **Headline** — Clear, descriptive title
2. **Source** — Which newsletter it came from
3. **What happened** — 2-3 sentences summarizing the key news
4. **Why it matters** — Context on significance
5. **Links** — ALL clickable URLs to read more (CRITICAL - always include these!)
### Format Example:
```
🤖 AI NEWSLETTER DIGEST — Wednesday, March 4, 2026
Your synthesized briefing from 7 AI newsletters
═══════════════════════════════════════════════════════════
📊 TODAY'S OVERVIEW
═══════════════════════════════════════════════════════════
• 7 Newsletters Analyzed
• 3 Major Stories: Pentagon/AI politics, Copyright law, Model releases
• 4 Sources: The Rundown AI, AI Secret, AI Valley, The Deep View
═══════════════════════════════════════════════════════════
🔥 TOP STORIES
═══════════════════════════════════════════════════════════
📌 OpenAI's Pentagon Deal Backlash
Source: The Rundown AI, AI Valley, The Deep View
What happened: [2-3 sentence summary of the news]
Why it matters: [Context on significance]
Link: [URL]
```
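As a rough illustration of the story layout above, one entry could be rendered along these lines. This is only a sketch: the `Story` dataclass and `render_story` function are hypothetical names and are not part of the committed scripts.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    """One digest entry; the field names are illustrative assumptions."""
    headline: str
    sources: list[str]
    what_happened: str
    why_it_matters: str
    links: list[str] = field(default_factory=list)


def render_story(story: Story) -> str:
    """Render a story in the layout shown in the format example."""
    lines = [
        f"📌 {story.headline}",
        f"Source: {', '.join(story.sources)}",
        f"What happened: {story.what_happened}",
        f"Why it matters: {story.why_it_matters}",
    ]
    # The template treats links as mandatory, so every URL gets its own line.
    lines.extend(f"Link: {url}" for url in story.links)
    return "\n".join(lines)
```

Keeping the five required elements as separate fields makes it easy to reject a story that lacks a summary or a link before it reaches the digest.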
## What NOT to do (Current broken format):
❌ Just list newsletter names and headlines
❌ No summaries or context
❌ No links to click
❌ No "why it matters" explanations
## Implementation Notes
The current digest uses:
- `parse-emails.py` — Extracts content from emails
- `format-digest.py` — Formats into readable digest
Future enhancement: Use LLM to:
1. Synthesize similar stories from multiple sources
2. Extract key quotes
3. Generate "why it matters" context
4. Group related stories together
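Until an LLM step exists, grouping related stories could be approximated without one. The sketch below swaps in a plain token-overlap (Jaccard) heuristic on headlines. The function names and the 0.4 threshold are assumptions chosen for illustration, not values taken from the scripts.

```python
import re


def tokens(headline: str) -> set[str]:
    """Lowercased word tokens of a headline, ignoring words of one or two characters."""
    return {w for w in re.findall(r"[a-z0-9]+", headline.lower()) if len(w) > 2}


def jaccard(a: set[str], b: set[str]) -> float:
    """Size of the intersection divided by size of the union (0.0 for empty sets)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_stories(headlines: list[str], threshold: float = 0.4) -> list[list[str]]:
    """Greedy grouping: a headline joins the first group whose first member is similar enough."""
    groups: list[list[str]] = []
    for headline in headlines:
        for group in groups:
            if jaccard(tokens(headline), tokens(group[0])) >= threshold:
                group.append(headline)
                break
        else:
            # No existing group matched, so this headline starts a new one.
            groups.append([headline])
    return groups
```

A heuristic like this only catches headlines that share wording. Stories described in different terms would still need the LLM synthesis listed above.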
## Files
- `/automations/ai-newsletter-digest/daily-digest.sh` — Main script
- `/automations/ai-newsletter-digest/parse-emails.py` — Email parser
- `/automations/ai-newsletter-digest/format-digest.py` — Digest formatter
- `/automations/ai-newsletter-digest/TEMPLATE.md` — This file

@@ -1,38 +1,40 @@
#!/bin/bash
# Daily AI Newsletter Digest - Fast Reliable Version
set -e
# Daily AI Newsletter Digest - Enhanced with LLM Summarization
# Creates properly formatted digests with AI-powered summarization
EMAIL_SKILL="/home/openclaw/.openclaw/workspace/skills/imap-smtp-email"
OUTPUT_FILE="/tmp/ai-newsletter-emails.json"
SCRIPT_DIR="/home/openclaw/.openclaw/workspace/skills/imap-smtp-email"
CHECK_SCRIPT="$SCRIPT_DIR/scripts/check-anthonymau-email.js"
DIGEST_DIR="/home/openclaw/.openclaw/workspace/automations/ai-newsletter-digest"
echo "🤖 Daily AI Newsletter Digest" >&2
echo "============================================================" >&2
echo "🤖 Daily AI Newsletter Digest (Enhanced)" >&2
echo "$(date)" >&2
echo "" >&2
echo "🔍 Searching for AI newsletters from last 48 hours..." >&2
cd "$SCRIPT_DIR"
# Single search for all recent emails, then filter locally
cd "$EMAIL_SKILL"
echo "🔍 Checking for AI newsletters..." >&2
# Get recent emails and filter for AI newsletters (expanded to 48h and more sources)
ALL_EMAILS=$(node scripts/imap.js search --recent 48h --limit 100 2>/dev/null | jq '[.[] | select(.from | test("AI Valley|AI Secret|DeepView|Deep View|The Rundown|TLDR|Benedict|aivalley|aisecret|deepview|therundown|tldr|benedict"; "i"))]' 2>/dev/null || echo "[]")
RESULT=$(NODE_TLS_REJECT_UNAUTHORIZED=0 timeout 60 node "$CHECK_SCRIPT" 2>&1)
# Save results
echo "$ALL_EMAILS" > "$OUTPUT_FILE"
AI_COUNT=$(echo "$RESULT" | grep "^AI_COUNT:" | cut -d: -f2)
EMAIL_COUNT=$(echo "$ALL_EMAILS" | jq '. | length')
echo "" >&2
echo "🎯 Found $EMAIL_COUNT AI-related emails" >&2
if [ "$EMAIL_COUNT" -eq 0 ]; then
echo "No new AI newsletters in the last 24 hours." >&2
echo "[]"
if [ -z "$AI_COUNT" ] || [ "$AI_COUNT" = "0" ]; then
echo "No AI newsletters found" >&2
echo "🤖 No AI newsletters today. Check back tomorrow!"
exit 0
fi
echo "" >&2
echo "📧 Ready to process $EMAIL_COUNT newsletters" >&2
echo "Found $AI_COUNT newsletters" >&2
# Output the emails
cat "$OUTPUT_FILE"
# Write result to temp file for Python parsing
TMPFILE=$(mktemp)
echo "$RESULT" > "$TMPFILE"
# Parse AI_EMAIL / AI_CONTENT pairs with improved content extraction
PARSED=$(python3 "$DIGEST_DIR/parse-emails.py" "$TMPFILE")
echo "🧠 Generating LLM-powered summary..." >&2
# Use LLM to summarize (or fallback to basic formatting)
echo "$PARSED" | python3 "$DIGEST_DIR/summarize.py"
rm -f "$TMPFILE"

@@ -0,0 +1,85 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>OpenClaw Daily Digest</title>
</head>
<body style="margin: 0; padding: 20px; background-color: #1a1a2e; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif;">
<table role="presentation" cellpadding="0" cellspacing="0" width="100%" style="max-width: 680px; margin: 0 auto; background-color: #0f0f1a; border-radius: 16px; box-shadow: 0 20px 60px rgba(0,0,0,0.4);">
<!-- Header -->
<tr>
<td style="background: linear-gradient(135deg, #ff6b6b 0%, #ee5a24 50%, #ff9f43 100%); padding: 40px 30px; text-align: center; border-radius: 16px 16px 0 0;">
<table role="presentation" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td style="text-align: center;">
<span style="font-size: 56px; display: block; margin-bottom: 12px;">🦀</span>
<h1 style="font-size: 32px; font-weight: 800; color: #ffffff; margin: 0; text-shadow: 2px 2px 4px rgba(0,0,0,0.2); letter-spacing: -0.5px;">OpenClaw Daily</h1>
<p style="color: rgba(255,255,255,0.9); font-size: 15px; margin: 8px 0 0 0; font-weight: 500;">The best OpenClaw discussions, curated daily</p>
<table role="presentation" cellpadding="0" cellspacing="0" style="margin: 20px auto 0 auto;">
<tr>
<td style="background-color: rgba(255,255,255,0.2); padding: 8px 20px; border-radius: 30px; border: 1px solid rgba(255,255,255,0.3);">
<span style="font-size: 14px; font-weight: 600; color: #ffffff;">{{DATE}}</span>
</td>
</tr>
</table>
</td>
</tr>
</table>
</td>
</tr>
<!-- Stats Bar -->
<tr>
<td style="background-color: #1a1a2e; padding: 25px 30px; border-bottom: 1px solid #2a2a3e;">
<table role="presentation" cellpadding="0" cellspacing="0" width="100%">
<tr>
<td width="33%" style="text-align: center;">
<span style="font-size: 28px; font-weight: 800; color: #ff6b6b;">{{REDDIT_COUNT}}</span>
<p style="font-size: 12px; color: #888; text-transform: uppercase; letter-spacing: 1px; margin: 4px 0 0 0;">Reddit</p>
</td>
<td width="33%" style="text-align: center;">
<span style="font-size: 28px; font-weight: 800; color: #ff6b6b;">{{NEWS_COUNT}}</span>
<p style="font-size: 12px; color: #888; text-transform: uppercase; letter-spacing: 1px; margin: 4px 0 0 0;">News</p>
</td>
<td width="33%" style="text-align: center;">
<span style="font-size: 28px; font-weight: 800; color: #ff6b6b;">{{TWITTER_COUNT}}</span>
<p style="font-size: 12px; color: #888; text-transform: uppercase; letter-spacing: 1px; margin: 4px 0 0 0;">X/Twitter</p>
</td>
</tr>
</table>
</td>
</tr>
<!-- Content -->
<tr>
<td style="padding: 35px 30px;">
{{CONTENT}}
</td>
</tr>
<!-- Footer -->
<tr>
<td style="background-color: #0a0a12; padding: 30px; text-align: center; border-top: 1px solid #2a2a3e; border-radius: 0 0 16px 16px;">
<table role="presentation" cellpadding="0" cellspacing="0" style="margin: 0 auto;">
<tr>
<td style="width: 60px; height: 60px; background: linear-gradient(135deg, #ff6b6b, #ff9f43); border-radius: 50%; text-align: center; vertical-align: middle;">
<span style="font-size: 28px;">🦀</span>
</td>
</tr>
</table>
<p style="font-size: 16px; color: #ffffff; font-weight: 600; margin: 15px 0 5px 0;">Curated daily for Anthony Martin</p>
<p style="font-size: 13px; color: #888; margin: 0;">by Krilly the Crab</p>
<table role="presentation" cellpadding="0" cellspacing="0" style="margin: 20px auto 0 auto;">
<tr>
<td style="padding: 0 10px;"><a href="https://github.com/openclaw/openclaw" style="color: #74b9ff; text-decoration: none; font-size: 13px;">GitHub</a></td>
<td style="padding: 0 10px;"><a href="https://reddit.com/r/openclaw" style="color: #74b9ff; text-decoration: none; font-size: 13px;">Reddit</a></td>
<td style="padding: 0 10px;"><a href="https://docs.openclaw.ai" style="color: #74b9ff; text-decoration: none; font-size: 13px;">Docs</a></td>
</tr>
</table>
<p style="font-size: 11px; color: #555; margin-top: 20px;">{{TIMESTAMP}} • Perth, Australia (AWST)</p>
</td>
</tr>
</table>
</body>
</html>
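The template above relies on `{{DATE}}`, `{{REDDIT_COUNT}}`, `{{NEWS_COUNT}}`, `{{TWITTER_COUNT}}`, `{{CONTENT}}` and `{{TIMESTAMP}}` placeholders, but this commit does not show how they are filled. A minimal sketch of one way to do it, assuming plain string substitution (`fill_template` is a hypothetical helper):

```python
import re

# Matches placeholders such as {{DATE}} or {{REDDIT_COUNT}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def fill_template(template: str, values: dict[str, str]) -> str:
    """Replace each {{NAME}} with values[NAME]; unknown names are left untouched."""
    def substitute(match: re.Match) -> str:
        return values.get(match.group(1), match.group(0))

    return PLACEHOLDER.sub(substitute, template)
```

Leaving unknown placeholders in place makes a missing value visible in the rendered email instead of silently producing an empty cell. Values other than `CONTENT` would normally be passed through `html.escape` first.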

@@ -0,0 +1,550 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>OpenClaw Daily Digest</title>
<style>
* { margin: 0; padding: 0; box-sizing: border-box; }
body {
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif;
background: linear-gradient(135deg, #1a1a2e 0%, #16213e 100%);
color: #e4e4e4;
line-height: 1.6;
padding: 20px;
}
.container {
max-width: 680px;
margin: 0 auto;
background: #0f0f1a;
border-radius: 16px;
overflow: hidden;
box-shadow: 0 20px 60px rgba(0,0,0,0.4);
}
/* Header */
.header {
background: linear-gradient(135deg, #ff6b6b 0%, #ee5a24 50%, #ff9f43 100%);
padding: 40px 30px;
text-align: center;
position: relative;
overflow: hidden;
}
.header::before {
content: '';
position: absolute;
top: -50%;
left: -50%;
width: 200%;
height: 200%;
background: radial-gradient(circle, rgba(255,255,255,0.1) 1px, transparent 1px);
background-size: 20px 20px;
opacity: 0.3;
}
.crab-emoji {
font-size: 56px;
margin-bottom: 12px;
display: block;
animation: float 3s ease-in-out infinite;
}
@keyframes float {
0%, 100% { transform: translateY(0); }
50% { transform: translateY(-8px); }
}
.header h1 {
font-size: 32px;
font-weight: 800;
color: #fff;
text-shadow: 2px 2px 4px rgba(0,0,0,0.2);
letter-spacing: -0.5px;
}
.header .tagline {
color: rgba(255,255,255,0.9);
font-size: 15px;
margin-top: 8px;
font-weight: 500;
}
.date-pill {
display: inline-block;
background: rgba(255,255,255,0.2);
backdrop-filter: blur(10px);
padding: 8px 20px;
border-radius: 30px;
margin-top: 20px;
font-size: 14px;
font-weight: 600;
color: #fff;
border: 1px solid rgba(255,255,255,0.3);
}
/* Stats Bar */
.stats-bar {
display: flex;
justify-content: center;
gap: 40px;
padding: 25px 30px;
background: #1a1a2e;
border-bottom: 1px solid #2a2a3e;
}
.stat {
text-align: center;
}
.stat-number {
font-size: 28px;
font-weight: 800;
background: linear-gradient(135deg, #ff6b6b, #ff9f43);
-webkit-background-clip: text;
-webkit-text-fill-color: transparent;
background-clip: text;
}
.stat-label {
font-size: 12px;
color: #888;
text-transform: uppercase;
letter-spacing: 1px;
margin-top: 4px;
}
/* Content */
.content {
padding: 35px 30px;
}
/* Section Headers */
.section {
margin-bottom: 35px;
}
.section-header {
display: flex;
align-items: center;
gap: 12px;
margin-bottom: 20px;
padding-bottom: 12px;
border-bottom: 2px solid #2a2a3e;
}
.section-icon {
width: 40px;
height: 40px;
border-radius: 12px;
display: flex;
align-items: center;
justify-content: center;
font-size: 20px;
}
.reddit-icon { background: linear-gradient(135deg, #ff4500, #ff6347); }
.hackernews-icon { background: linear-gradient(135deg, #ff6600, #ff8533); }
.twitter-icon { background: linear-gradient(135deg, #1da1f2, #0d8bd9); }
.section-title {
font-size: 20px;
font-weight: 700;
color: #fff;
}
/* Cards */
.card {
background: linear-gradient(145deg, #1a1a2e 0%, #151525 100%);
border-radius: 12px;
padding: 20px;
margin-bottom: 16px;
border: 1px solid #2a2a3e;
transition: all 0.2s ease;
}
.card:hover {
border-color: #ff6b6b;
transform: translateX(4px);
}
.card-title {
font-size: 15px;
font-weight: 600;
color: #fff;
margin-bottom: 10px;
line-height: 1.5;
}
.card-title a {
color: #74b9ff;
text-decoration: none;
transition: color 0.2s;
}
.card-title a:hover {
color: #ff6b6b;
}
.card-meta {
display: flex;
align-items: center;
gap: 15px;
font-size: 13px;
color: #888;
flex-wrap: wrap;
}
.author {
color: #a29bfe;
font-weight: 500;
}
.badge {
display: inline-flex;
align-items: center;
gap: 4px;
padding: 4px 10px;
border-radius: 20px;
font-size: 12px;
font-weight: 600;
}
.badge-upvotes {
background: rgba(255, 107, 107, 0.15);
color: #ff6b6b;
}
.badge-comments {
background: rgba(116, 185, 255, 0.15);
color: #74b9ff;
}
.card-excerpt {
font-size: 14px;
color: #aaa;
margin-top: 12px;
line-height: 1.6;
}
/* Coming Soon Banner */
.coming-soon {
background: linear-gradient(135deg, #2d3436 0%, #1a1a2e 100%);
border: 2px dashed #444;
border-radius: 12px;
padding: 30px;
text-align: center;
color: #888;
}
.coming-soon-icon {
font-size: 36px;
margin-bottom: 12px;
}
/* Footer */
.footer {
background: #0a0a12;
padding: 30px;
text-align: center;
border-top: 1px solid #2a2a3e;
}
.footer-avatar {
width: 60px;
height: 60px;
background: linear-gradient(135deg, #ff6b6b, #ff9f43);
border-radius: 50%;
margin: 0 auto 15px;
display: flex;
align-items: center;
justify-content: center;
font-size: 28px;
}
.footer-text {
font-size: 16px;
color: #fff;
font-weight: 600;
margin-bottom: 5px;
}
.footer-subtext {
font-size: 13px;
color: #888;
}
.footer-links {
display: flex;
justify-content: center;
gap: 20px;
margin-top: 20px;
}
.footer-links a {
color: #74b9ff;
text-decoration: none;
font-size: 13px;
transition: color 0.2s;
}
.footer-links a:hover {
color: #ff6b6b;
}
.timestamp {
font-size: 11px;
color: #555;
margin-top: 20px;
}
/* Mobile */
@media (max-width: 640px) {
body { padding: 10px; }
.header { padding: 30px 20px; }
.header h1 { font-size: 26px; }
.stats-bar { gap: 25px; padding: 20px; }
.content { padding: 25px 20px; }
.card { padding: 16px; }
}
</style>
</head>
<body>
<div class="container">
<!-- Header -->
<div class="header">
<span class="crab-emoji">🦀</span>
<h1>OpenClaw Daily</h1>
<p class="tagline">The best OpenClaw discussions, curated daily</p>
<div class="date-pill">Sunday, March 1, 2026</div>
</div>
<!-- Stats -->
<div class="stats-bar">
<div class="stat">
<div class="stat-number">24</div>
<div class="stat-label">Reddit</div>
</div>
<div class="stat">
<div class="stat-number">9</div>
<div class="stat-label">News</div>
</div>
<div class="stat">
<div class="stat-number">0</div>
<div class="stat-label">X/Twitter</div>
</div>
</div>
<!-- Content -->
<div class="content">
<!-- Reddit Section -->
<div class="section">
<div class="section-header">
<div class="section-icon reddit-icon">🔥</div>
<h2 class="section-title">Reddit Highlights</h2>
</div>
<div class="card">
<div class="card-title">
<a href="https://reddit.com/r/openclaw/comments/1rhxaj1/openclaw_usecases_to_make_life_easier_11k_stars/">
Openclaw Usecases to Make life easier. 11k+ Stars Github Repo
</a>
</div>
<div class="card-meta">
<span class="author">u/HuckleberryEntire699</span>
<span class="badge badge-upvotes">↑ 74</span>
<span class="badge badge-comments">💬 6</span>
</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://reddit.com/r/openclaw/comments/1rhrtr2/to_the_many_people_here_wondering_about_local_models/">
To the many people here wondering about local models… just use an API
</a>
</div>
<div class="card-meta">
<span class="author">u/Valuable-Run2129</span>
<span class="badge badge-upvotes">↑ 64</span>
<span class="badge badge-comments">💬 51</span>
</div>
<div class="card-excerpt">I've read many posts in the past days of newbies asking about what computer they need to use Open Claw locally. Ironically some ask it as a solution...</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://reddit.com/r/openclaw/comments/1rhwu6h/openclaw_is_very_buggy/">
Openclaw is very buggy
</a>
</div>
<div class="card-meta">
<span class="author">u/Ok-Profession-2143</span>
<span class="badge badge-upvotes">↑ 48</span>
<span class="badge badge-comments">💬 69</span>
</div>
<div class="card-excerpt">I don't understand why everyone is crazy about openclaw. Its super buggy. You cannot change models easily. When you update it gets stuck etc etc.</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://reddit.com/r/openclaw/comments/1ri9nt0/me_every_time_i_touch_the_openclawjson/">
Me, every time I touch the openclaw.json
</a>
</div>
<div class="card-meta">
<span class="author">u/Patient_Lie_9310</span>
<span class="badge badge-upvotes">↑ 33</span>
<span class="badge badge-comments">💬 13</span>
</div>
<div class="card-excerpt">It's always followed by errors and hunting for the stuff I broke.</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://reddit.com/r/openclaw/comments/1rhzht7/i_built_a_voice_assistant_with_openclaw_alexa/">
I built a voice assistant with OpenClaw + Alexa + Local LLM (Ollama) — here's how
</a>
</div>
<div class="card-meta">
<span class="author">u/cormazacl</span>
<span class="badge badge-upvotes">↑ 29</span>
<span class="badge badge-comments">💬 10</span>
</div>
<div class="card-excerpt">Hey everyone! I've been building a voice-first assistant using OpenClaw as the brain, and wanted to share what I've got working so far.</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://reddit.com/r/openclaw/comments/1rhwhv6/does_openclaw_make_sense_without_claude_max/">
Does OpenClaw make sense without Claude Max?
</a>
</div>
<div class="card-meta">
<span class="author">u/btwiz</span>
<span class="badge badge-upvotes">↑ 17</span>
<span class="badge badge-comments">💬 26</span>
</div>
<div class="card-excerpt">I don't have any max accounts (no OAuth tokens), so I have to use API keys. Just setting up OpenClaw, I blew through $10 of OpenRouter credits...</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://reddit.com/r/openclaw/comments/1ri3i9p/openclaw_gogcli_google_account_suspension/">
OpenClaw + GogCLI = Google Account Suspension
</a>
</div>
<div class="card-meta">
<span class="author">u/Admir-Rusidovic</span>
<span class="badge badge-upvotes">↑ 11</span>
<span class="badge badge-comments">💬 20</span>
</div>
<div class="card-excerpt">Over the last couple of days, I experimented with integrating Google Docs and Gmail into my OpenClaw instance. The goal was simple...</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://reddit.com/r/openclaw/comments/1ri2zh4/my_ai_agent_has_made_the_same_lie_12x_in_25_days/">
My AI agent has made the same lie 12x in 25 days... all same root cause, rules don't fix it
</a>
</div>
<div class="card-meta">
<span class="author">u/fartpsychic</span>
<span class="badge badge-upvotes">↑ 5</span>
<span class="badge badge-comments">💬 23</span>
</div>
<div class="card-excerpt">I run a multi-agent setup on OpenClaw (Claude Opus). My orchestration agent, Bob, has a consistent failure mode: optimizing for appearing competent...</div>
</div>
</div>
<!-- Hacker News Section -->
<div class="section">
<div class="section-header">
<div class="section-icon hackernews-icon">🟧</div>
<h2 class="section-title">News & Hacker News</h2>
</div>
<div class="card">
<div class="card-title">
<a href="https://justaniceguy.ai/posts/001-building-jarvis">
Building Jarvis Parallel Tool-Calling Voice Agent Layer on Top of OpenClaw
</a>
</div>
<div class="card-meta">
<span class="badge badge-upvotes">↑ 3</span>
<span class="badge badge-comments">💬 1</span>
</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://janhoon.com/blog/building-with-an-ai-that-remembers/">
Building with an AI that remembers A blog by my OpenClaw Assistant
</a>
</div>
<div class="card-meta">
<span class="badge badge-upvotes">↑ 2</span>
<span class="badge badge-comments">💬 1</span>
</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://usplus.ai/">
Show HN: Usplus.ai Build an AI-Native Company with Agents in your Org Chart
</a>
</div>
<div class="card-meta">
<span class="badge badge-upvotes">↑ 2</span>
<span class="badge badge-comments">💬 1</span>
</div>
<div class="card-excerpt">Hey HN, I'm the founder of usplus.ai, and I've been building this for a while now in San Diego. The core idea: What if you...</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://clawapis.com/">
X402 based pay-as-you-go Twitter API and helius/solscan API for your OpenClaw
</a>
</div>
<div class="card-meta">
<span class="badge badge-upvotes">↑ 1</span>
<span class="badge badge-comments">💬 1</span>
</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://github.com/swarmclawai/swarmclaw">
Show HN: SwarmClaw Orchestration dashboard for OpenClaw and AI agents
</a>
</div>
<div class="card-meta">
<span class="badge badge-upvotes">↑ 2</span>
</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://github.com/Enriquefft/openclaw-kapso-whatsapp">
Show HN: OpenClaw-kapso, Give OpenClaw a stable WhatsApp number
</a>
</div>
<div class="card-meta">
<span class="badge badge-upvotes">↑ 2</span>
</div>
<div class="card-excerpt">Built an OpenClaw plugin that gives your agent a WhatsApp number through the official Cloud API via Kapso.</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://openclawdirectory.co.uk/">
Show HN: OpenClaw Directory Compare Deployers, Skills, and Tools for OpenClaw
</a>
</div>
<div class="card-meta">
<span class="badge badge-upvotes">↑ 1</span>
</div>
<div class="card-excerpt">Discover the essential OpenClaw tools, including deployers, skills, hosting, and plugins, along with direct links to test them out.</div>
</div>
<div class="card">
<div class="card-title">
<a href="https://openclaw.ai/blog/virustotal-partnership">
OpenClaw Partners with VirusTotal for Skill Security
</a>
</div>
<div class="card-meta">
<span class="badge badge-upvotes">↑ 1</span>
</div>
</div>
</div>
<!-- Twitter Section -->
<div class="section">
<div class="section-header">
<div class="section-icon twitter-icon">𝕏</div>
<h2 class="section-title">From X</h2>
</div>
<div class="coming-soon">
<div class="coming-soon-icon">🚧</div>
<p>X/Twitter integration coming soon</p>
</div>
</div>
</div>
<!-- Footer -->
<div class="footer">
<div class="footer-avatar">🦀</div>
<div class="footer-text">Curated daily for Anthony Martin</div>
<div class="footer-subtext">by Krilly the Crab</div>
<div class="footer-links">
<a href="https://github.com/openclaw/openclaw">GitHub</a>
<a href="https://reddit.com/r/openclaw">Reddit</a>
<a href="https://docs.openclaw.ai">Docs</a>
</div>
<div class="timestamp">2026-03-01 • Perth, Australia (AWST)</div>
</div>
</div>
</body>
</html>

@@ -0,0 +1,107 @@
#!/usr/bin/env python3
"""
AI Newsletter Digest - Enhanced Version
Creates properly summarized digests like the OpenClaw Daily Digest
"""
import json
import sys
import re
from datetime import datetime
def format_digest(newsletters_json):
    """Format newsletters into a proper digest with summaries."""
    newsletters = json.loads(newsletters_json)
    # Group by source
    by_source = {}
    for nl in newsletters:
        source = nl['from'].split('<')[0].strip()
        if source not in by_source:
            by_source[source] = []
        by_source[source].append(nl)
    # Build digest
    lines = [
        "🤖 **AI NEWSLETTER DIGEST — {date}**",
        "Your synthesized briefing from {count} AI newsletters",
        "",
        "═" * 60,
        "📊 TODAY'S OVERVIEW",
        "═" * 60,
        "{count} Newsletters Analyzed",
        "• Sources: {sources}",
        "",
        "═" * 60,
        "🔥 TOP STORIES",
        "═" * 60,
        ""
    ]
    # Format date
    date_str = datetime.now().strftime("%A, %B %d, %Y")
    sources_str = ", ".join(by_source.keys())
    digest = "\n".join(lines).format(
        date=date_str,
        count=len(newsletters),
        sources=sources_str
    )
    # Add each newsletter with proper formatting
    for i, nl in enumerate(newsletters, 1):
        source = nl['from'].split('<')[0].strip()
        subject = nl['subject']
        content = nl.get('content', '')[:800]  # First 800 chars
        urls = nl.get('urls', [])
        # Clean up content
        content = re.sub(r'\s+', ' ', content)
        content = content.replace('= ', '').replace('=20', ' ')
        # Extract key sentence: first of the leading three longer than 50 chars
        key_sentence = ""
        sentences = content.split('.')
        for s in sentences[:3]:
            if len(s.strip()) > 50:
                key_sentence = s.strip() + "."
                break
        digest += f"\n📌 **{subject}**\n"
        digest += f" Source: {source}\n"
        if key_sentence:
            digest += f" \n {key_sentence}\n"
        # Include the first three URLs found
        if urls:
            digest += f" \n 🔗 Links:\n"
            for url in urls[:3]:  # Max 3 links
                digest += f"{url}\n"
        digest += "\n---\n"
    digest += "\n🦀 Krilly the Crab | AI Newsletter Digest\n"
    digest += f"Generated: {datetime.now().strftime('%A, %B %d, %Y — %I:%M %p AWST')}\n"
    return digest
def main():
    """Read JSON from a file argument or stdin and output the formatted digest."""
    if len(sys.argv) > 1:
        # Read from file
        with open(sys.argv[1], 'r') as f:
            data = f.read()
    else:
        # Read from stdin
        data = sys.stdin.read()
    try:
        digest = format_digest(data)
        print(digest)
    except Exception as e:
        print(f"Error formatting digest: {e}", file=sys.stderr)
        # Fallback: just print the raw data
        print(data)


if __name__ == '__main__':
    main()
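Reviewer note on the key-sentence step in `format_digest`: splitting on every `.` also cuts inside version numbers and domain names. A possible refinement, offered only as a sketch, is to split where sentence-ending punctuation is followed by whitespace. The function name and its defaults are assumptions that mirror the 50-character and three-sentence limits used above.

```python
import re


def first_key_sentence(text: str, min_len: int = 50, max_scan: int = 3) -> str:
    """Return the first of the leading sentences longer than min_len, or ''."""
    # Split only where ., ! or ? is followed by whitespace,
    # so tokens like "2.5" or "example.com" stay in one piece.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    for sentence in sentences[:max_scan]:
        if len(sentence) > min_len:
            return sentence
    return ""
```

The sentence keeps its own terminating punctuation, so no trailing period needs to be appended afterwards.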

@@ -0,0 +1,225 @@
#!/usr/bin/env python3
"""Parse AI_EMAIL / AI_CONTENT pairs from imap check script output."""
import sys
import json
import re
# Senders to exclude
BLOCKLIST = [
    'googlenews-noreply@google.com',
]
# URL patterns to skip (ads, tracking, social, email)
URL_SKIP = re.compile(
    r'unsubscribe|mailto|twitter\.com|instagram\.com|facebook\.com|'
    r'youtube\.com/unsubscribe|youtube\.com/channel|'
    r'utm_source=dlvr|utm_medium=email|'
    r'genstore|typeform|pigment\.com|maton\.ai|'
    r'youtu\.be|medium\.com/@|linkedin\.com/posts|'
    r'cdn-cgi|imageproxy',
    re.IGNORECASE
)


def decode_subject(subject):
    try:
        from email.header import decode_header
        parts = decode_header(subject)
        decoded = ''
        for part, charset in parts:
            if isinstance(part, bytes):
                decoded += part.decode(charset or 'utf-8', errors='replace')
            else:
                decoded += part
        return decoded.strip()
    except Exception:
        return subject.strip()
def decode_qp(content):
    """Decode quoted-printable content before URL extraction."""
    # First decode soft line breaks (the main issue) - must be first!
    # Match = followed by newline (any type)
    content = re.sub(r'=\r?\n', '', content)
    content = re.sub(r'=20', '', content)  # Remove =20 (space) encoding
    # More aggressive: remove any = followed by lowercase letter and space
    content = re.sub(r'=[a-z] ', '', content)
    content = re.sub(r'=[a-z]$', '', content, flags=re.MULTILINE)

    def qp_decode(m):
        try:
            return bytes.fromhex(m.group(1)).decode('utf-8', errors='replace')
        except Exception:
            return m.group(0)

    # Decode quoted-printable hex codes
    content = re.sub(r'=([0-9A-Fa-f]{2})', qp_decode, content)
    # Clean up any remaining = in URLs
    content = content.replace('=', '')
    return content
def extract_urls(content):
    """Extract clean article URLs from email content (after full QP decoding)."""
    # Decode QP FIRST - this is the key fix
    content = decode_qp(content)
    # Also extract markdown-style links: [text](https://...)
    markdown_urls = re.findall(r'\[([^\]]+)\]\((https?://[^\s"<>)\]\']+)\)', content)
    # Extract regular URLs
    urls = re.findall(r'https?://[^\s"<>)\]\']+', content)
    seen = set()
    clean = []
    # First add markdown URLs (they tend to be cleaner)
    for text, url in markdown_urls:
        if url not in seen and not URL_SKIP.search(url):
            # Clean tracking
            url = re.sub(r'[?&]utm_[^&]+', '', url)
            url = re.sub(r'[?&]_bhlid=\w+', '', url)
            url = re.sub(r'[?&]jwt_token=\w+', '', url)
            url = re.sub(r'[?&]ref=[^&]+', '', url)
            url = url.rstrip('.,;)\'"')
            if len(url) > 15 and not any(x in url.lower() for x in ['.jpg', '.png', '.gif', '.jpeg', '.svg', '/cdn-cgi/']):
                seen.add(url)
                clean.append(url)
    # Then add regular URLs
    for url in urls:
        url = url.rstrip('.,;)\'"')
        # Skip tracking-heavy URLs
        if URL_SKIP.search(url):
            continue
        # Clean up common tracking garbage
        url = re.sub(r'[?&]utm_[^&]+', '', url)
        url = re.sub(r'[?&]_bhlid=\w+', '', url)
        url = re.sub(r'[?&]jwt_token=\w+', '', url)
        url = re.sub(r'[?&]ref=[^&]+', '', url)
        url = url.rstrip('.,;)\'"')
        # Must be reasonably long to be a real article
        if len(url) > 15 and url not in seen:
            # Also skip image/video URLs
            if any(x in url.lower() for x in ['.jpg', '.png', '.gif', '.jpeg', '.svg', '/image/', '/cdn-cgi/', '/media/', '/assets/']):
                continue
            seen.add(url)
            clean.append(url)
    return clean[:5]
def clean_content(content):
    """Best-effort text extraction from messy email body."""
    # Decode QP
    content = decode_qp(content)
    # Remove HTML tags
    content = re.sub(r'<[^>]+>', ' ', content)
    # Remove email formatting artifacts
    content = re.sub(r'\[\[.*?\]\]', ' ', content)  # [[markup]]
    content = re.sub(r'\{\{.*?\}\}', ' ', content)  # {{markup}}
    content = re.sub(r'\{\|\|.*?\|\|\}', ' ', content)  # {||markup||}
    content = re.sub(r'\^[^\^]+\^', ' ', content)  # ^markup^
    content = re.sub(r'~~[^~]+~~', ' ', content)  # ~~markup~~
    content = re.sub(r'__[^_]+__', ' ', content)  # __markup__
    # Remove base64/encoded blocks
    content = re.sub(r'[A-Za-z0-9+/]{60,}', '', content)
    # Convert markdown links to just text
    content = re.sub(r'\[([^\]]+)\]\([^\)]+\)', r'\1', content)
    # Clean up whitespace
    content = re.sub(r'[ \t]+', ' ', content)
    content = re.sub(r'\n{3,}', '\n\n', content)
    # Split into lines and filter - be LESS aggressive
    lines = [l.strip() for l in content.split('\n')]
    # Filter out noise lines
    noise_patterns = [
        r'^[-=_\*\^|>.@#]+$',  # Separator lines
        r'^read online$',
        r'^sign up$',
        r'^advertise$',
        r'^sponsor$',
        r'^view all$',
        r'^ unsubscribe$',
        r'^\d+ more$',
    ]
    filtered = []
    for line in lines:
        if len(line) < 20:
            continue
        if re.match(r'^[\W_]+\s*[\W_]+$', line):
            continue
        skip = False
        for pattern in noise_patterns:
            if re.match(pattern, line, re.IGNORECASE):
                skip = True
                break
        if not skip:
            filtered.append(line[:300])
    result = '\n'.join(filtered)
    # Final cleanup
    result = re.sub(r'^\s*(by|from|to|subject|date):.*$', '', result, flags=re.IGNORECASE | re.MULTILINE)
    return result[:3000].strip()
def is_blocked(sender):
    return any(b in sender.lower() for b in BLOCKLIST)


if len(sys.argv) < 2:
    print("[]")
    sys.exit(0)
with open(sys.argv[1], 'r', errors='replace') as f:
    text = f.read()
# Split into structured records
emails = []
current_from = None
current_subject = None
content_lines = []
in_content = False
for line in text.split('\n'):
    if line.startswith('AI_EMAIL:'):
        if current_from and in_content and not is_blocked(current_from):
            raw = '\n'.join(content_lines)
            emails.append({
                'from': current_from,
                'subject': decode_subject(current_subject),
                'urls': extract_urls(raw),
                'content': clean_content(raw)
            })
        meta = line[9:]
        parts = meta.split(' | ', 1)
        current_from = parts[0].strip()
        current_subject = parts[1].strip() if len(parts) > 1 else ''
        content_lines = []
        in_content = False
    elif line.startswith('AI_CONTENT:') and current_from:
        content_lines = [line[12:]]  # strip prefix
        in_content = True
    elif in_content and not line.startswith(('STATUS:', 'TOTAL:', 'LAST_UID:', 'RECENT:', 'AI_COUNT:')):
        content_lines.append(line)
# Last one
if current_from and in_content and not is_blocked(current_from):
    raw = '\n'.join(content_lines)
    emails.append({
        'from': current_from,
        'subject': decode_subject(current_subject),
        'urls': extract_urls(raw),
        'content': clean_content(raw)
    })
print(json.dumps(emails, indent=2))
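Reviewer note on `decode_qp` and the tracking cleanup: the final `content.replace('=', '')` also removes the `=` inside query strings, so patterns such as `_bhlid=\w+` can no longer match afterwards. In addition, deleting a parameter by regex can turn `?utm_source=x&id=42` into `&id=42`. A possible alternative, sketched here under the assumption that the raw body reaches the parser with its line breaks intact, is to use the standard library's `quopri` decoder and to edit the parsed query string. The helper names and the set of tracking keys are illustrative, taken from the parameters the script already targets.

```python
import quopri
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Tracking parameters the committed script already tries to strip.
TRACKING_PREFIXES = ("utm_",)
TRACKING_KEYS = {"_bhlid", "jwt_token", "ref"}


def decode_quoted_printable(raw: str) -> str:
    """Decode soft line breaks and =XX escapes with the stdlib decoder."""
    decoded = quopri.decodestring(raw.encode("utf-8"))
    return decoded.decode("utf-8", errors="replace")


def strip_tracking(url: str) -> str:
    """Drop tracking parameters while keeping the rest of the query intact."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith(TRACKING_PREFIXES) and key not in TRACKING_KEYS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

With this approach a literal `=` that the sender encoded as `=3D` survives decoding, so `id=42` stays `id=42`. If the upstream check script has already joined or re-wrapped lines, the heuristics in `decode_qp` may still be needed as a fallback.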

@@ -0,0 +1,225 @@
#!/usr/bin/env python3
"""Parse AI_EMAIL / AI_CONTENT pairs from imap check script output."""
import sys
import json
import re
# Senders to exclude
BLOCKLIST = [
    'googlenews-noreply@google.com',
]
# URL patterns to skip (ads, tracking, social, email)
URL_SKIP = re.compile(
    r'unsubscribe|mailto|twitter\.com|instagram\.com|facebook\.com|'
    r'youtube\.com/unsubscribe|youtube\.com/channel|'
    r'utm_source=dlvr|utm_medium=email|'
    r'genstore|typeform|pigment\.com|maton\.ai|'
    r'youtu\.be|medium\.com/@|linkedin\.com/posts|'
    r'cdn-cgi|imageproxy',
    re.IGNORECASE
)


def decode_subject(subject):
    try:
        from email.header import decode_header
        parts = decode_header(subject)
        decoded = ''
        for part, charset in parts:
            if isinstance(part, bytes):
                decoded += part.decode(charset or 'utf-8', errors='replace')
            else:
                decoded += part
        return decoded.strip()
    except Exception:
        return subject.strip()
def decode_qp(content):
    """Decode quoted-printable content before URL extraction."""
    # First decode soft line breaks (the main issue) - must be first!
    # Match = followed by newline (any type) so wrapped URLs are rejoined
    content = re.sub(r'=\r?\n', '', content)
    # More aggressive: remove any = followed by lowercase letter and space
    content = re.sub(r'=[a-z] ', '', content)
    content = re.sub(r'=[a-z]$', '', content, flags=re.MULTILINE)
    def qp_decode(m):
        # Decode a whole run of =XX escapes at once so that multi-byte
        # UTF-8 characters (e.g. =E2=80=99) are not split into invalid bytes
        try:
            return bytes.fromhex(m.group(0).replace('=', '')).decode('utf-8', errors='replace')
        except Exception:
            return m.group(0)
    # Decode quoted-printable hex codes; =20 becomes a real space here
    content = re.sub(r'(?:=[0-9A-Fa-f]{2})+', qp_decode, content)
    # Any remaining = is kept: query strings such as ?id=42 need it, and
    # the tracking-parameter patterns in extract_urls match on it
    return content
def extract_urls(content):
    """Extract clean article URLs from email content (after full QP decoding)."""
    # Decode QP FIRST - this is the key fix
    content = decode_qp(content)
    # Also extract markdown-style links: [text](https://...)
    markdown_urls = re.findall(r'\[([^\]]+)\]\((https?://[^\s"<>)\]\']+)\)', content)
    # Extract regular URLs
    urls = re.findall(r'https?://[^\s"<>)\]\']+', content)
    seen = set()
    clean = []
    # First add markdown URLs (they tend to be cleaner)
    for text, url in markdown_urls:
        if url not in seen and not URL_SKIP.search(url):
            # Clean tracking
            url = re.sub(r'[?&]utm_[^&]+', '', url)
            url = re.sub(r'[?&]_bhlid=\w+', '', url)
            url = re.sub(r'[?&]jwt_token=\w+', '', url)
            url = re.sub(r'[?&]ref=[^&]+', '', url)
            url = url.rstrip('.,;)\'"')
            if len(url) > 15 and not any(x in url.lower() for x in ['.jpg', '.png', '.gif', '.jpeg', '.svg', '/cdn-cgi/']):
                seen.add(url)
                clean.append(url)
    # Then add regular URLs
    for url in urls:
        url = url.rstrip('.,;)\'"')
        # Skip tracking-heavy URLs
        if URL_SKIP.search(url):
            continue
        # Clean up common tracking garbage
        url = re.sub(r'[?&]utm_[^&]+', '', url)
        url = re.sub(r'[?&]_bhlid=\w+', '', url)
        url = re.sub(r'[?&]jwt_token=\w+', '', url)
        url = re.sub(r'[?&]ref=[^&]+', '', url)
        url = url.rstrip('.,;)\'"')
        # Must be reasonably long to be a real article
        if len(url) > 15 and url not in seen:
            # Also skip image/video URLs
            if any(x in url.lower() for x in ['.jpg', '.png', '.gif', '.jpeg', '.svg', '/image/', '/cdn-cgi/', '/media/', '/assets/']):
                continue
            seen.add(url)
            clean.append(url)
    return clean[:5]
def clean_content(content):
    """Best-effort text extraction from messy email body."""
    # Decode QP
    content = decode_qp(content)
    # Remove HTML tags
    content = re.sub(r'<[^>]+>', ' ', content)
    # Remove email formatting artifacts
    content = re.sub(r'\[\[.*?\]\]', ' ', content)  # [[markup]]
    content = re.sub(r'\{\{.*?\}\}', ' ', content)  # {{markup}}
    content = re.sub(r'\{\|\|.*?\|\|\}', ' ', content)  # {||markup||}
    content = re.sub(r'\^[^\^]+\^', ' ', content)  # ^markup^
    content = re.sub(r'~~[^~]+~~', ' ', content)  # ~~markup~~
    content = re.sub(r'__[^_]+__', ' ', content)  # __markup__
    # Remove base64/encoded blocks
    content = re.sub(r'[A-Za-z0-9+/]{60,}', '', content)
    # Convert markdown links to just text
    content = re.sub(r'\[([^\]]+)\]\([^\)]+\)', r'\1', content)
    # Clean up whitespace
    content = re.sub(r'[ \t]+', ' ', content)
    content = re.sub(r'\n{3,}', '\n\n', content)
    # Split into lines and filter - be LESS aggressive
    lines = [l.strip() for l in content.split('\n')]
    # Filter out noise lines (lines are already stripped, so the patterns
    # must not expect leading or trailing whitespace)
    noise_patterns = [
        r'^[-=_\*\^|>.@#]+$',  # Separator lines
        r'^read online$',
        r'^sign up$',
        r'^advertise$',
        r'^sponsor$',
        r'^view all$',
        r'^unsubscribe$',
        r'^\d+ more$',
    ]
    filtered = []
    for line in lines:
        # Short lines are rarely article text
        if len(line) < 20:
            continue
        # Lines made up only of punctuation/symbols
        if re.match(r'^[\W_]+\s*[\W_]+$', line):
            continue
        skip = False
        for pattern in noise_patterns:
            if re.match(pattern, line, re.IGNORECASE):
                skip = True
                break
        if not skip:
            filtered.append(line[:300])
    result = '\n'.join(filtered)
    # Final cleanup
    result = re.sub(r'^\s*(by|from|to|subject|date):.*$', '', result, flags=re.IGNORECASE | re.MULTILINE)
    return result[:3000].strip()
def is_blocked(sender):
    return any(b in sender.lower() for b in BLOCKLIST)
if len(sys.argv) < 2:
    print("[]")
    sys.exit(0)
with open(sys.argv[1], 'r', errors='replace') as f:
    text = f.read()
# Split into structured records
emails = []
current_from = None
current_subject = None
content_lines = []
in_content = False
for line in text.split('\n'):
    if line.startswith('AI_EMAIL:'):
        # A new record starts: flush the previous one first
        if current_from and in_content and not is_blocked(current_from):
            raw = '\n'.join(content_lines)
            emails.append({
                'from': current_from,
                'subject': decode_subject(current_subject),
                'urls': extract_urls(raw),
                'content': clean_content(raw)
            })
        meta = line[9:]
        parts = meta.split(' | ', 1)
        current_from = parts[0].strip()
        current_subject = parts[1].strip() if len(parts) > 1 else ''
        content_lines = []
        in_content = False
    elif line.startswith('AI_CONTENT:') and current_from:
        content_lines = [line[12:]]  # strip prefix
        in_content = True
    elif in_content and not line.startswith(('STATUS:', 'TOTAL:', 'LAST_UID:', 'RECENT:', 'AI_COUNT:')):
        content_lines.append(line)
# Last one
if current_from and in_content and not is_blocked(current_from):
    raw = '\n'.join(content_lines)
    emails.append({
        'from': current_from,
        'subject': decode_subject(current_subject),
        'urls': extract_urls(raw),
        'content': clean_content(raw)
    })
print(json.dumps(emails, indent=2))
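
For comparison, the soft-line-break handling in `decode_qp` can also be sketched with the standard library's `quopri` module. This is an alternative under stated assumptions, not the approach this commit takes: `decode_qp_stdlib` and the sample string are hypothetical, and the sketch assumes the body really is quoted-printable. Text that has already been decoded would be misread wherever a literal `=` happens to be followed by two hex digits.

```python
import quopri
import re


def decode_qp_stdlib(text):
    """Hypothetical alternative: let the stdlib decode quoted-printable."""
    # quopri works on bytes; it drops "=\n" soft line breaks and turns
    # =XX escapes back into raw bytes, which are then decoded as UTF-8.
    raw = quopri.decodestring(text.encode('utf-8'))
    return raw.decode('utf-8', errors='replace')


# A URL wrapped by a soft line break, an encoded "=" (=3D) in the query
# string, and a two-byte UTF-8 character (=C3=A9).
wrapped = "Read more: https://example.com/articles/very-long-=\nslug?id=3D42 caf=C3=A9"

decoded = decode_qp_stdlib(wrapped)
urls = re.findall(r'https?://[^\s"<>)\]\']+', decoded)

print(decoded)
print(urls)
```

The point of the sample is that the URL only becomes extractable in one piece after the soft line break is removed, which is why decoding has to run before the URL regex.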

@@ -0,0 +1,46 @@
#!/bin/bash
# AI Newsletter Digest - Multi-Channel Sender
# Sends digest to both Telegram and Discord
SCRIPT_DIR="/home/openclaw/.openclaw/workspace/automations/ai-newsletter-digest"
DIGEST_SCRIPT="$SCRIPT_DIR/daily-digest.sh"
# Generate the digest (path quoted so it survives spaces)
RESULT=$("$DIGEST_SCRIPT" 2>/dev/null)
# Check if we have newsletters
if echo "$RESULT" | grep -q '"count": 0' || echo "$RESULT" | grep -q 'No AI newsletters'; then
    echo "No newsletters to send"
    exit 0
fi
# Parse the digest data
COUNT=$(echo "$RESULT" | python3 -c "import json,sys; d=json.load(sys.stdin); print(d.get('count',0))")
SOURCES=$(echo "$RESULT" | python3 -c "import json,sys; d=json.load(sys.stdin); print(', '.join(d.get('sources',[])))")
if [ "$COUNT" -eq 0 ]; then
    exit 0
fi
# Format the message
MESSAGE="🤖 Daily AI Newsletter Digest
Found $COUNT AI newsletters:
$SOURCES
$(echo "$RESULT" | python3 -c "
import json,sys
d=json.load(sys.stdin)
for n in d.get('newsletters',[]):
    print(f\"📧 {n.get('subject','')} - {n.get('from','')}\")
")
Full analysis available - reply for details!"
# Send to Telegram
openclaw message send --channel telegram --message "$MESSAGE" 2>/dev/null || echo "Telegram send failed" >&2
# Send to Discord (#krilly channel)
openclaw message send --channel discord --target "#krilly" --message "$MESSAGE" 2>/dev/null || echo "Discord send failed" >&2
echo "Digest sent to Telegram and Discord!"
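
The sender script above starts `python3` three times to read the same JSON. One possible simplification is a single helper that builds the whole message in one pass. The sketch below assumes the digest JSON has the `count`, `sources`, and `newsletters` keys the script reads; `build_message` and the sample payload are hypothetical and not part of this commit.

```python
import json


def build_message(result_json):
    """Hypothetical helper: build the chat message from the digest JSON."""
    d = json.loads(result_json)
    count = d.get('count', 0)
    if count == 0:
        # Nothing to send; the caller can exit quietly.
        return None
    lines = [
        "🤖 Daily AI Newsletter Digest",
        f"Found {count} AI newsletters:",
        ', '.join(d.get('sources', [])),
    ]
    # One line per newsletter, in the same format as the shell script.
    for n in d.get('newsletters', []):
        lines.append(f"📧 {n.get('subject', '')} - {n.get('from', '')}")
    lines.append("Full analysis available - reply for details!")
    return '\n'.join(lines)


# Illustrative payload only; real data comes from daily-digest.sh.
sample = json.dumps({
    'count': 1,
    'sources': ['The Rundown AI'],
    'newsletters': [{'subject': 'Daily brief', 'from': 'The Rundown AI'}],
})

message = build_message(sample)
print(message)
```

Returning `None` for an empty digest mirrors the script's two early exits, so a caller would need only one check before sending.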

@@ -0,0 +1,151 @@
#!/usr/bin/env python3
"""
AI Newsletter Summarizer
Uses LLM to synthesize and summarize newsletter content
"""
import json
import sys
import os
import subprocess
from pathlib import Path
from datetime import datetime
def call_llm(prompt, model="kilocode/kilo/auto-free"):
    """Call LLM via OpenClaw CLI."""
    cmd = [
        "openclaw",
        "llm",
        "--model", model,
        "--prompt", prompt
    ]
    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=120,
            cwd="/home/openclaw/.openclaw/workspace"
        )
        if result.returncode == 0:
            return result.stdout.strip()
        else:
            print(f"LLM error: {result.stderr}", file=sys.stderr)
            return None
    except Exception as e:
        print(f"LLM call failed: {e}", file=sys.stderr)
        return None
def summarize_newsletters(newsletters):
    """Use LLM to summarize newsletters into a proper digest."""
    # Prepare newsletter content
    content_parts = []
    for i, nl in enumerate(newsletters, 1):
        source = nl.get('from', 'Unknown').split('<')[0].strip()
        subject = nl.get('subject', 'No subject')
        content = nl.get('content', '')[:1500]  # Limit to ~1500 chars per newsletter
        content_parts.append(f"""
--- NEWSLETTER {i} ---
Source: {source}
Subject: {subject}
Content: {content}
""")
    combined = "\n".join(content_parts)
    prompt = f"""You are creating an AI newsletter digest for a tech-savvy reader.
Analyze these {len(newsletters)} AI newsletters and create a concise, informative digest.
{combined}
Create a digest with these sections:
1. **TOP STORIES** (3-5 most important items) - Each with: headline, source, 2-3 sentence summary, why it matters
2. **OTHER NOTABLE NEWS** - Brief mentions of other stories
3. **KEY TAKEAWAYS** - 2-3 bullet points on patterns/trends
Format rules:
- Use markdown
- Keep each story summary to 2-3 sentences max
- Include the source newsletter name
- Write for someone who follows AI but wants quick briefings
- Prioritize news with real-world impact
Today's date: {datetime.now().strftime('%A, %B %d, %Y')}
Begin your digest:"""
    # Try using the LLM
    print("🧠 Using LLM to synthesize digest...", file=sys.stderr)
    result = call_llm(prompt)
    if result:
        return result
    else:
        # Fallback: return None so the caller uses basic formatting
        print("⚠️ LLM unavailable, using basic formatting", file=sys.stderr)
        return None
def create_basic_digest(newsletters):
    """Create a basic digest without LLM (fallback)."""
    lines = [
        f"🤖 **AI NEWSLETTER DIGEST** — {datetime.now().strftime('%A, %B %d, %Y')}",
        "",
        f"*{len(newsletters)} newsletters analyzed*",
        "",
        "" * 50,
        "",
    ]
    for nl in newsletters:
        source = nl.get('from', 'Unknown').split('<')[0].strip()
        subject = nl.get('subject', 'No subject')
        content = nl.get('content', '')[:300]
        # Clean up content
        content = ' '.join(content.split())[:300]
        lines.append(f"📌 **{subject}**")
        lines.append(f" Source: {source}")
        if content:
            lines.append(f" {content}...")
        lines.append("")
    lines.append("" * 50)
    lines.append(f"🦀 Krilly the Crab | {datetime.now().strftime('%B %d, %Y')}")
    return "\n".join(lines)
def main():
    """Main entry point."""
    # Read newsletters from stdin or file
    if len(sys.argv) > 1:
        with open(sys.argv[1], 'r') as f:
            newsletters = json.load(f)
    else:
        newsletters = json.load(sys.stdin)
    if not newsletters:
        print("No newsletters to summarize")
        return
    print(f"📊 Processing {len(newsletters)} newsletters...", file=sys.stderr)
    # Try LLM summarization first
    digest = summarize_newsletters(newsletters)
    if not digest:
        # Fallback to basic
        digest = create_basic_digest(newsletters)
    print(digest)
if __name__ == '__main__':
    main()