AI Newsletter Digest improvements: fixed QP soft line break decoding, URL extraction, and content cleaning
70
automations/ai-newsletter-digest/TEMPLATE.md
Normal file
@@ -0,0 +1,70 @@
# AI Newsletter Digest Template

This is the required format for AI newsletter digests, matching the quality of the OpenClaw Daily Digest.

## Required Elements

### Header
- Date (written out in full, e.g. Wednesday, March 4, 2026)
- Count of newsletters analyzed
- List of sources

### Each Story Should Include:
1. **Headline**: Clear, descriptive title
2. **Source**: Which newsletter it came from
3. **What happened**: 2-3 sentences summarizing the key news
4. **Why it matters**: Context on significance
5. **Links**: ALL clickable URLs to read more (CRITICAL - always include these!)

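
The five required story elements can be modelled as a small record with a completeness check. This is a minimal sketch: the names `Story`, `missing_elements` and `render_story` are hypothetical and do not exist in the current scripts, and the sample values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Story:
    """One digest story; the fields mirror the required elements listed above."""
    headline: str
    sources: list[str]
    what_happened: str
    why_it_matters: str
    links: list[str] = field(default_factory=list)


def missing_elements(story: Story) -> list[str]:
    """Return the names of required elements that are empty."""
    checks = {
        "headline": story.headline.strip(),
        "sources": story.sources,
        "what_happened": story.what_happened.strip(),
        "why_it_matters": story.why_it_matters.strip(),
        "links": story.links,
    }
    return [name for name, value in checks.items() if not value]


def render_story(story: Story) -> str:
    """Render a story as plain text, one labelled line per element."""
    lines = [
        f"📌 {story.headline}",
        f"Source: {', '.join(story.sources)}",
        "",
        f"What happened: {story.what_happened}",
        "",
        f"Why it matters: {story.why_it_matters}",
        "",
    ]
    lines += [f"Link: {url}" for url in story.links]
    return "\n".join(lines)


example = Story(
    headline="Example headline",
    sources=["The Rundown AI"],
    what_happened="A placeholder summary.",
    why_it_matters="A placeholder explanation.",
)
print(missing_elements(example))  # the story has no links yet
```

A formatter could refuse to emit, or at least log, any story for which `missing_elements` returns a non-empty list.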
### Format Example:

```
🤖 AI NEWSLETTER DIGEST — Wednesday, March 4, 2026
Your synthesized briefing from 7 AI newsletters

═══════════════════════════════════════════════════════════
📊 TODAY'S OVERVIEW
═══════════════════════════════════════════════════════════
• 7 Newsletters Analyzed
• 3 Major Stories: Pentagon/AI politics, Copyright law, Model releases
• 4 Sources: The Rundown AI, AI Secret, AI Valley, The Deep View

═══════════════════════════════════════════════════════════
🔥 TOP STORIES
═══════════════════════════════════════════════════════════

📌 OpenAI's Pentagon Deal Backlash
Source: The Rundown AI, AI Valley, The Deep View

What happened: [2-3 sentence summary of the news]

Why it matters: [Context on significance]

Link: [URL]
```

## What NOT to do (Current broken format):

❌ List only newsletter names and headlines
❌ Omit summaries or context
❌ Omit links to click
❌ Omit "why it matters" explanations

## Implementation Notes

The current digest uses:
- `parse-emails.py`: Extracts content from emails
- `format-digest.py`: Formats into readable digest

Future enhancement: Use an LLM to:
1. Synthesize similar stories from multiple sources
2. Extract key quotes
3. Generate "why it matters" context
4. Group related stories together

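
Until an LLM is wired in, grouping related stories (step 4) can be approximated with plain word overlap between headlines. This is a rough sketch, assuming hypothetical helpers (`headline_words`, `jaccard`, `group_stories`), an arbitrary similarity threshold of 0.4, and made-up sample headlines. It is a stand-in for the LLM-based grouping described above, not an implementation of it.

```python
import re

# Small, hand-picked stopword list (an assumption; tune for real headlines).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "for", "and", "with"}


def headline_words(headline: str) -> set[str]:
    """Lowercase the headline and keep its non-stopword tokens."""
    words = re.findall(r"[a-z0-9]+", headline.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_stories(stories: list[dict], threshold: float = 0.4) -> list[list[dict]]:
    """Greedy grouping: a story joins the first group whose first headline is similar enough."""
    groups: list[list[dict]] = []
    for story in stories:
        words = headline_words(story["headline"])
        for group in groups:
            if jaccard(words, headline_words(group[0]["headline"])) >= threshold:
                group.append(story)
                break
        else:
            # No existing group matched, so this story starts a new one.
            groups.append([story])
    return groups


# Hypothetical sample data for illustration only.
stories = [
    {"source": "The Rundown AI", "headline": "OpenAI Pentagon deal draws backlash"},
    {"source": "AI Valley", "headline": "Backlash over OpenAI Pentagon deal"},
    {"source": "The Deep View", "headline": "New copyright ruling on AI training data"},
]
groups = group_stories(stories)
for group in groups:
    print([s["source"] for s in group])
```

Each resulting group maps to one story entry whose `Source:` line lists every newsletter in the group.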
## Files

- `/automations/ai-newsletter-digest/daily-digest.sh`: Main script
- `/automations/ai-newsletter-digest/parse-emails.py`: Email parser
- `/automations/ai-newsletter-digest/format-digest.py`: Digest formatter
- `/automations/ai-newsletter-digest/TEMPLATE.md`: This file
@@ -1,38 +1,40 @@
|
||||
#!/bin/bash
|
||||
# Daily AI Newsletter Digest - Fast Reliable Version
|
||||
set -e
|
||||
# Daily AI Newsletter Digest - Enhanced with LLM Summarization
|
||||
# Creates properly formatted digests with AI-powered summarization
|
||||
|
||||
EMAIL_SKILL="/home/openclaw/.openclaw/workspace/skills/imap-smtp-email"
|
||||
OUTPUT_FILE="/tmp/ai-newsletter-emails.json"
|
||||
SCRIPT_DIR="/home/openclaw/.openclaw/workspace/skills/imap-smtp-email"
|
||||
CHECK_SCRIPT="$SCRIPT_DIR/scripts/check-anthonymau-email.js"
|
||||
DIGEST_DIR="/home/openclaw/.openclaw/workspace/automations/ai-newsletter-digest"
|
||||
|
||||
echo "🤖 Daily AI Newsletter Digest" >&2
|
||||
echo "============================================================" >&2
|
||||
echo "🤖 Daily AI Newsletter Digest (Enhanced)" >&2
|
||||
echo "$(date)" >&2
|
||||
echo "" >&2
|
||||
|
||||
echo "🔍 Searching for AI newsletters from last 48 hours..." >&2
|
||||
cd "$SCRIPT_DIR"
|
||||
|
||||
# Single search for all recent emails, then filter locally
|
||||
cd "$EMAIL_SKILL"
|
||||
echo "🔍 Checking for AI newsletters..." >&2
|
||||
|
||||
# Get recent emails and filter for AI newsletters (expanded to 48h and more sources)
|
||||
ALL_EMAILS=$(node scripts/imap.js search --recent 48h --limit 100 2>/dev/null | jq '[.[] | select(.from | test("AI Valley|AI Secret|DeepView|Deep View|The Rundown|TLDR|Benedict|aivalley|aisecret|deepview|therundown|tldr|benedict"; "i"))]' 2>/dev/null || echo "[]")
|
||||
RESULT=$(NODE_TLS_REJECT_UNAUTHORIZED=0 timeout 60 node "$CHECK_SCRIPT" 2>&1)
|
||||
|
||||
# Save results
|
||||
echo "$ALL_EMAILS" > "$OUTPUT_FILE"
|
||||
AI_COUNT=$(echo "$RESULT" | grep "^AI_COUNT:" | cut -d: -f2)
|
||||
|
||||
EMAIL_COUNT=$(echo "$ALL_EMAILS" | jq '. | length')
|
||||
echo "" >&2
|
||||
echo "🎯 Found $EMAIL_COUNT AI-related emails" >&2
|
||||
|
||||
if [ "$EMAIL_COUNT" -eq 0 ]; then
|
||||
echo "No new AI newsletters in the last 24 hours." >&2
|
||||
echo "[]"
|
||||
if [ -z "$AI_COUNT" ] || [ "$AI_COUNT" = "0" ]; then
|
||||
echo "No AI newsletters found" >&2
|
||||
echo "🤖 No AI newsletters today. Check back tomorrow!"
|
||||
exit 0
|
||||
fi
|
||||
|
||||
echo "" >&2
|
||||
echo "📧 Ready to process $EMAIL_COUNT newsletters" >&2
|
||||
echo "Found $AI_COUNT newsletters" >&2
|
||||
|
||||
# Output the emails
|
||||
cat "$OUTPUT_FILE"
|
||||
# Write result to temp file for Python parsing
|
||||
TMPFILE=$(mktemp)
|
||||
echo "$RESULT" > "$TMPFILE"
|
||||
|
||||
# Parse AI_EMAIL / AI_CONTENT pairs with improved content extraction
|
||||
PARSED=$(python3 "$DIGEST_DIR/parse-emails.py" "$TMPFILE")
|
||||
|
||||
echo "🧠 Generating LLM-powered summary..." >&2
|
||||
|
||||
# Use LLM to summarize (or fallback to basic formatting)
|
||||
echo "$PARSED" | python3 "$DIGEST_DIR/summarize.py"
|
||||
|
||||
rm -f "$TMPFILE"
|
||||
@@ -0,0 +1,85 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>OpenClaw Daily Digest</title>
|
||||
</head>
|
||||
<body style="margin: 0; padding: 20px; background-color: #1a1a2e; font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif;">
|
||||
<table role="presentation" cellpadding="0" cellspacing="0" width="100%" style="max-width: 680px; margin: 0 auto; background-color: #0f0f1a; border-radius: 16px; box-shadow: 0 20px 60px rgba(0,0,0,0.4);">
|
||||
<!-- Header -->
|
||||
<tr>
|
||||
<td style="background: linear-gradient(135deg, #ff6b6b 0%, #ee5a24 50%, #ff9f43 100%); padding: 40px 30px; text-align: center; border-radius: 16px 16px 0 0;">
|
||||
<table role="presentation" cellpadding="0" cellspacing="0" width="100%">
|
||||
<tr>
|
||||
<td style="text-align: center;">
|
||||
<span style="font-size: 56px; display: block; margin-bottom: 12px;">🦀</span>
|
||||
<h1 style="font-size: 32px; font-weight: 800; color: #ffffff; margin: 0; text-shadow: 2px 2px 4px rgba(0,0,0,0.2); letter-spacing: -0.5px;">OpenClaw Daily</h1>
|
||||
<p style="color: rgba(255,255,255,0.9); font-size: 15px; margin: 8px 0 0 0; font-weight: 500;">The best OpenClaw discussions, curated daily</p>
|
||||
<table role="presentation" cellpadding="0" cellspacing="0" style="margin: 20px auto 0 auto;">
|
||||
<tr>
|
||||
<td style="background-color: rgba(255,255,255,0.2); padding: 8px 20px; border-radius: 30px; border: 1px solid rgba(255,255,255,0.3);">
|
||||
<span style="font-size: 14px; font-weight: 600; color: #ffffff;">{{DATE}}</span>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<!-- Stats Bar -->
|
||||
<tr>
|
||||
<td style="background-color: #1a1a2e; padding: 25px 30px; border-bottom: 1px solid #2a2a3e;">
|
||||
<table role="presentation" cellpadding="0" cellspacing="0" width="100%">
|
||||
<tr>
|
||||
<td width="33%" style="text-align: center;">
|
||||
<span style="font-size: 28px; font-weight: 800; color: #ff6b6b;">{{REDDIT_COUNT}}</span>
|
||||
<p style="font-size: 12px; color: #888; text-transform: uppercase; letter-spacing: 1px; margin: 4px 0 0 0;">Reddit</p>
|
||||
</td>
|
||||
<td width="33%" style="text-align: center;">
|
||||
<span style="font-size: 28px; font-weight: 800; color: #ff6b6b;">{{NEWS_COUNT}}</span>
|
||||
<p style="font-size: 12px; color: #888; text-transform: uppercase; letter-spacing: 1px; margin: 4px 0 0 0;">News</p>
|
||||
</td>
|
||||
<td width="33%" style="text-align: center;">
|
||||
<span style="font-size: 28px; font-weight: 800; color: #ff6b6b;">{{TWITTER_COUNT}}</span>
|
||||
<p style="font-size: 12px; color: #888; text-transform: uppercase; letter-spacing: 1px; margin: 4px 0 0 0;">X/Twitter</p>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<!-- Content -->
|
||||
<tr>
|
||||
<td style="padding: 35px 30px;">
|
||||
{{CONTENT}}
|
||||
</td>
|
||||
</tr>
|
||||
|
||||
<!-- Footer -->
|
||||
<tr>
|
||||
<td style="background-color: #0a0a12; padding: 30px; text-align: center; border-top: 1px solid #2a2a3e; border-radius: 0 0 16px 16px;">
|
||||
<table role="presentation" cellpadding="0" cellspacing="0" style="margin: 0 auto;">
|
||||
<tr>
|
||||
<td style="width: 60px; height: 60px; background: linear-gradient(135deg, #ff6b6b, #ff9f43); border-radius: 50%; text-align: center; vertical-align: middle;">
|
||||
<span style="font-size: 28px;">🦀</span>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
<p style="font-size: 16px; color: #ffffff; font-weight: 600; margin: 15px 0 5px 0;">Curated daily for Anthony Martin</p>
|
||||
<p style="font-size: 13px; color: #888; margin: 0;">by Krilly the Crab</p>
|
||||
<table role="presentation" cellpadding="0" cellspacing="0" style="margin: 20px auto 0 auto;">
|
||||
<tr>
|
||||
<td style="padding: 0 10px;"><a href="https://github.com/openclaw/openclaw" style="color: #74b9ff; text-decoration: none; font-size: 13px;">GitHub</a></td>
|
||||
<td style="padding: 0 10px;"><a href="https://reddit.com/r/openclaw" style="color: #74b9ff; text-decoration: none; font-size: 13px;">Reddit</a></td>
|
||||
<td style="padding: 0 10px;"><a href="https://docs.openclaw.ai" style="color: #74b9ff; text-decoration: none; font-size: 13px;">Docs</a></td>
|
||||
</tr>
|
||||
</table>
|
||||
<p style="font-size: 11px; color: #555; margin-top: 20px;">{{TIMESTAMP}} • Perth, Australia (AWST)</p>
|
||||
</td>
|
||||
</tr>
|
||||
</table>
|
||||
</body>
|
||||
</html>
|
||||
550
automations/ai-newsletter-digest/email-template.html
Normal file
@@ -0,0 +1,550 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en">
|
||||
<head>
|
||||
<meta charset="UTF-8">
|
||||
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
||||
<title>OpenClaw Daily Digest</title>
|
||||
<style>
|
||||
* { margin: 0; padding: 0; box-sizing: border-box; }
|
||||
body {
|
||||
font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Helvetica, Arial, sans-serif;
|
||||
background: linear-gradient(135deg, #1a1a2e 0%, #16213e 100%);
|
||||
color: #e4e4e4;
|
||||
line-height: 1.6;
|
||||
padding: 20px;
|
||||
}
|
||||
.container {
|
||||
max-width: 680px;
|
||||
margin: 0 auto;
|
||||
background: #0f0f1a;
|
||||
border-radius: 16px;
|
||||
overflow: hidden;
|
||||
box-shadow: 0 20px 60px rgba(0,0,0,0.4);
|
||||
}
|
||||
/* Header */
|
||||
.header {
|
||||
background: linear-gradient(135deg, #ff6b6b 0%, #ee5a24 50%, #ff9f43 100%);
|
||||
padding: 40px 30px;
|
||||
text-align: center;
|
||||
position: relative;
|
||||
overflow: hidden;
|
||||
}
|
||||
.header::before {
|
||||
content: '';
|
||||
position: absolute;
|
||||
top: -50%;
|
||||
left: -50%;
|
||||
width: 200%;
|
||||
height: 200%;
|
||||
background: radial-gradient(circle, rgba(255,255,255,0.1) 1px, transparent 1px);
|
||||
background-size: 20px 20px;
|
||||
opacity: 0.3;
|
||||
}
|
||||
.crab-emoji {
|
||||
font-size: 56px;
|
||||
margin-bottom: 12px;
|
||||
display: block;
|
||||
animation: float 3s ease-in-out infinite;
|
||||
}
|
||||
@keyframes float {
|
||||
0%, 100% { transform: translateY(0); }
|
||||
50% { transform: translateY(-8px); }
|
||||
}
|
||||
.header h1 {
|
||||
font-size: 32px;
|
||||
font-weight: 800;
|
||||
color: #fff;
|
||||
text-shadow: 2px 2px 4px rgba(0,0,0,0.2);
|
||||
letter-spacing: -0.5px;
|
||||
}
|
||||
.header .tagline {
|
||||
color: rgba(255,255,255,0.9);
|
||||
font-size: 15px;
|
||||
margin-top: 8px;
|
||||
font-weight: 500;
|
||||
}
|
||||
.date-pill {
|
||||
display: inline-block;
|
||||
background: rgba(255,255,255,0.2);
|
||||
backdrop-filter: blur(10px);
|
||||
padding: 8px 20px;
|
||||
border-radius: 30px;
|
||||
margin-top: 20px;
|
||||
font-size: 14px;
|
||||
font-weight: 600;
|
||||
color: #fff;
|
||||
border: 1px solid rgba(255,255,255,0.3);
|
||||
}
|
||||
/* Stats Bar */
|
||||
.stats-bar {
|
||||
display: flex;
|
||||
justify-content: center;
|
||||
gap: 40px;
|
||||
padding: 25px 30px;
|
||||
background: #1a1a2e;
|
||||
border-bottom: 1px solid #2a2a3e;
|
||||
}
|
||||
.stat {
|
||||
text-align: center;
|
||||
}
|
||||
.stat-number {
|
||||
font-size: 28px;
|
||||
font-weight: 800;
|
||||
background: linear-gradient(135deg, #ff6b6b, #ff9f43);
|
||||
-webkit-background-clip: text;
|
||||
-webkit-text-fill-color: transparent;
|
||||
background-clip: text;
|
||||
}
|
||||
.stat-label {
|
||||
font-size: 12px;
|
||||
color: #888;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: 1px;
|
||||
margin-top: 4px;
|
||||
}
|
||||
/* Content */
|
||||
.content {
|
||||
padding: 35px 30px;
|
||||
}
|
||||
/* Section Headers */
|
||||
.section {
|
||||
margin-bottom: 35px;
|
||||
}
|
||||
.section-header {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 12px;
|
||||
margin-bottom: 20px;
|
||||
padding-bottom: 12px;
|
||||
border-bottom: 2px solid #2a2a3e;
|
||||
}
|
||||
.section-icon {
|
||||
width: 40px;
|
||||
height: 40px;
|
||||
border-radius: 12px;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
font-size: 20px;
|
||||
}
|
||||
.reddit-icon { background: linear-gradient(135deg, #ff4500, #ff6347); }
|
||||
.hackernews-icon { background: linear-gradient(135deg, #ff6600, #ff8533); }
|
||||
.twitter-icon { background: linear-gradient(135deg, #1da1f2, #0d8bd9); }
|
||||
.section-title {
|
||||
font-size: 20px;
|
||||
font-weight: 700;
|
||||
color: #fff;
|
||||
}
|
||||
/* Cards */
|
||||
.card {
|
||||
background: linear-gradient(145deg, #1a1a2e 0%, #151525 100%);
|
||||
border-radius: 12px;
|
||||
padding: 20px;
|
||||
margin-bottom: 16px;
|
||||
border: 1px solid #2a2a3e;
|
||||
transition: all 0.2s ease;
|
||||
}
|
||||
.card:hover {
|
||||
border-color: #ff6b6b;
|
||||
transform: translateX(4px);
|
||||
}
|
||||
.card-title {
|
||||
font-size: 15px;
|
||||
font-weight: 600;
|
||||
color: #fff;
|
||||
margin-bottom: 10px;
|
||||
line-height: 1.5;
|
||||
}
|
||||
.card-title a {
|
||||
color: #74b9ff;
|
||||
text-decoration: none;
|
||||
transition: color 0.2s;
|
||||
}
|
||||
.card-title a:hover {
|
||||
color: #ff6b6b;
|
||||
}
|
||||
.card-meta {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 15px;
|
||||
font-size: 13px;
|
||||
color: #888;
|
||||
flex-wrap: wrap;
|
||||
}
|
||||
.author {
|
||||
color: #a29bfe;
|
||||
font-weight: 500;
|
||||
}
|
||||
.badge {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 4px;
|
||||
padding: 4px 10px;
|
||||
border-radius: 20px;
|
||||
font-size: 12px;
|
||||
font-weight: 600;
|
||||
}
|
||||
.badge-upvotes {
|
||||
background: rgba(255, 107, 107, 0.15);
|
||||
color: #ff6b6b;
|
||||
}
|
||||
.badge-comments {
|
||||
background: rgba(116, 185, 255, 0.15);
|
||||
color: #74b9ff;
|
||||
}
|
||||
.card-excerpt {
|
||||
font-size: 14px;
|
||||
color: #aaa;
|
||||
margin-top: 12px;
|
||||
line-height: 1.6;
|
||||
}
|
||||
/* Coming Soon Banner */
|
||||
.coming-soon {
|
||||
background: linear-gradient(135deg, #2d3436 0%, #1a1a2e 100%);
|
||||
border: 2px dashed #444;
|
||||
border-radius: 12px;
|
||||
padding: 30px;
|
||||
text-align: center;
|
||||
color: #888;
|
||||
}
|
||||
.coming-soon-icon {
|
||||
font-size: 36px;
|
||||
margin-bottom: 12px;
|
||||
}
|
||||
/* Footer */
|
||||
.footer {
|
||||
background: #0a0a12;
|
||||
padding: 30px;
|
||||
text-align: center;
|
||||
border-top: 1px solid #2a2a3e;
|
||||
}
|
||||
.footer-avatar {
|
||||
width: 60px;
|
||||
height: 60px;
|
||||
background: linear-gradient(135deg, #ff6b6b, #ff9f43);
|
||||
border-radius: 50%;
|
||||
margin: 0 auto 15px;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
font-size: 28px;
|
||||
}
|
||||
.footer-text {
|
||||
font-size: 16px;
|
||||
color: #fff;
|
||||
font-weight: 600;
|
||||
margin-bottom: 5px;
|
||||
}
|
||||
.footer-subtext {
|
||||
font-size: 13px;
|
||||
color: #888;
|
||||
}
|
||||
.footer-links {
|
||||
display: flex;
|
||||
justify-content: center;
|
||||
gap: 20px;
|
||||
margin-top: 20px;
|
||||
}
|
||||
.footer-links a {
|
||||
color: #74b9ff;
|
||||
text-decoration: none;
|
||||
font-size: 13px;
|
||||
transition: color 0.2s;
|
||||
}
|
||||
.footer-links a:hover {
|
||||
color: #ff6b6b;
|
||||
}
|
||||
.timestamp {
|
||||
font-size: 11px;
|
||||
color: #555;
|
||||
margin-top: 20px;
|
||||
}
|
||||
/* Mobile */
|
||||
@media (max-width: 640px) {
|
||||
body { padding: 10px; }
|
||||
.header { padding: 30px 20px; }
|
||||
.header h1 { font-size: 26px; }
|
||||
.stats-bar { gap: 25px; padding: 20px; }
|
||||
.content { padding: 25px 20px; }
|
||||
.card { padding: 16px; }
|
||||
}
|
||||
</style>
|
||||
</head>
|
||||
<body>
|
||||
<div class="container">
|
||||
<!-- Header -->
|
||||
<div class="header">
|
||||
<span class="crab-emoji">🦀</span>
|
||||
<h1>OpenClaw Daily</h1>
|
||||
<p class="tagline">The best OpenClaw discussions, curated daily</p>
|
||||
<div class="date-pill">Sunday, March 1, 2026</div>
|
||||
</div>
|
||||
|
||||
<!-- Stats -->
|
||||
<div class="stats-bar">
|
||||
<div class="stat">
|
||||
<div class="stat-number">24</div>
|
||||
<div class="stat-label">Reddit</div>
|
||||
</div>
|
||||
<div class="stat">
|
||||
<div class="stat-number">9</div>
|
||||
<div class="stat-label">News</div>
|
||||
</div>
|
||||
<div class="stat">
|
||||
<div class="stat-number">0</div>
|
||||
<div class="stat-label">X/Twitter</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Content -->
|
||||
<div class="content">
|
||||
<!-- Reddit Section -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-icon reddit-icon">🔥</div>
|
||||
<h2 class="section-title">Reddit Highlights</h2>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://reddit.com/r/openclaw/comments/1rhxaj1/openclaw_usecases_to_make_life_easier_11k_stars/">
|
||||
Openclaw Usecases to Make life easier. 11k+ Stars Github Repo
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="author">u/HuckleberryEntire699</span>
|
||||
<span class="badge badge-upvotes">↑ 74</span>
|
||||
<span class="badge badge-comments">💬 6</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://reddit.com/r/openclaw/comments/1rhrtr2/to_the_many_people_here_wondering_about_local_models/">
|
||||
To the many people here wondering about local models… just use an API
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="author">u/Valuable-Run2129</span>
|
||||
<span class="badge badge-upvotes">↑ 64</span>
|
||||
<span class="badge badge-comments">💬 51</span>
|
||||
</div>
|
||||
<div class="card-excerpt">I've read many posts in the past days of newbies asking about what computer they need to use Open Claw locally. Ironically some ask it as a solution...</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://reddit.com/r/openclaw/comments/1rhwu6h/openclaw_is_very_buggy/">
|
||||
Openclaw is very buggy
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="author">u/Ok-Profession-2143</span>
|
||||
<span class="badge badge-upvotes">↑ 48</span>
|
||||
<span class="badge badge-comments">💬 69</span>
|
||||
</div>
|
||||
<div class="card-excerpt">I don't understand why everyone is crazy about openclaw. Its super buggy. You cannot change models easily. When you update it gets stuck etc etc.</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://reddit.com/r/openclaw/comments/1ri9nt0/me_every_time_i_touch_the_openclawjson/">
|
||||
Me, every time I touch the openclaw.json
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="author">u/Patient_Lie_9310</span>
|
||||
<span class="badge badge-upvotes">↑ 33</span>
|
||||
<span class="badge badge-comments">💬 13</span>
|
||||
</div>
|
||||
<div class="card-excerpt">It's always followed by errors and hunting for the stuff I broke.</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://reddit.com/r/openclaw/comments/1rhzht7/i_built_a_voice_assistant_with_openclaw_alexa/">
|
||||
I built a voice assistant with OpenClaw + Alexa + Local LLM (Ollama) — here's how
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="author">u/cormazacl</span>
|
||||
<span class="badge badge-upvotes">↑ 29</span>
|
||||
<span class="badge badge-comments">💬 10</span>
|
||||
</div>
|
||||
<div class="card-excerpt">Hey everyone! I've been building a voice-first assistant using OpenClaw as the brain, and wanted to share what I've got working so far.</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://reddit.com/r/openclaw/comments/1rhwhv6/does_openclaw_make_sense_without_claude_max/">
|
||||
Does OpenClaw make sense without Claude Max?
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="author">u/btwiz</span>
|
||||
<span class="badge badge-upvotes">↑ 17</span>
|
||||
<span class="badge badge-comments">💬 26</span>
|
||||
</div>
|
||||
<div class="card-excerpt">I don't have any max accounts (no OAuth tokens), so I have to use API keys. Just setting up OpenClaw, I blew through $10 of OpenRouter credits...</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://reddit.com/r/openclaw/comments/1ri3i9p/openclaw_gogcli_google_account_suspension/">
|
||||
OpenClaw + GogCLI = Google Account Suspension
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="author">u/Admir-Rusidovic</span>
|
||||
<span class="badge badge-upvotes">↑ 11</span>
|
||||
<span class="badge badge-comments">💬 20</span>
|
||||
</div>
|
||||
<div class="card-excerpt">Over the last couple of days, I experimented with integrating Google Docs and Gmail into my OpenClaw instance. The goal was simple...</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://reddit.com/r/openclaw/comments/1ri2zh4/my_ai_agent_has_made_the_same_lie_12x_in_25_days/">
|
||||
My AI agent has made the same lie 12x in 25 days... all same root cause, rules don't fix it
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="author">u/fartpsychic</span>
|
||||
<span class="badge badge-upvotes">↑ 5</span>
|
||||
<span class="badge badge-comments">💬 23</span>
|
||||
</div>
|
||||
<div class="card-excerpt">I run a multi-agent setup on OpenClaw (Claude Opus). My orchestration agent, Bob, has a consistent failure mode: optimizing for appearing competent...</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Hacker News Section -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-icon hackernews-icon">🟧</div>
|
||||
<h2 class="section-title">News & Hacker News</h2>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://justaniceguy.ai/posts/001-building-jarvis">
|
||||
Building Jarvis – Parallel Tool-Calling Voice Agent Layer on Top of OpenClaw
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="badge badge-upvotes">↑ 3</span>
|
||||
<span class="badge badge-comments">💬 1</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://janhoon.com/blog/building-with-an-ai-that-remembers/">
|
||||
Building with an AI that remembers – A blog by my OpenClaw Assistant
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="badge badge-upvotes">↑ 2</span>
|
||||
<span class="badge badge-comments">💬 1</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://usplus.ai/">
|
||||
Show HN: Usplus.ai – Build an AI-Native Company with Agents in your Org Chart
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="badge badge-upvotes">↑ 2</span>
|
||||
<span class="badge badge-comments">💬 1</span>
|
||||
</div>
|
||||
<div class="card-excerpt">Hey HN, I'm the founder of usplus.ai, and I've been building this for a while now in San Diego. The core idea: What if you...</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://clawapis.com/">
|
||||
X402 based pay-as-you-go Twitter API and helius/solscan API for your OpenClaw
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="badge badge-upvotes">↑ 1</span>
|
||||
<span class="badge badge-comments">💬 1</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://github.com/swarmclawai/swarmclaw">
|
||||
Show HN: SwarmClaw – Orchestration dashboard for OpenClaw and AI agents
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="badge badge-upvotes">↑ 2</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://github.com/Enriquefft/openclaw-kapso-whatsapp">
|
||||
Show HN: OpenClaw-kapso, Give OpenClaw a stable WhatsApp number
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="badge badge-upvotes">↑ 2</span>
|
||||
</div>
|
||||
<div class="card-excerpt">Built an OpenClaw plugin that gives your agent a WhatsApp number through the official Cloud API via Kapso.</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://openclawdirectory.co.uk/">
|
||||
Show HN: OpenClaw Directory – Compare Deployers, Skills, and Tools for OpenClaw
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="badge badge-upvotes">↑ 1</span>
|
||||
</div>
|
||||
<div class="card-excerpt">Discover the essential OpenClaw tools, including deployers, skills, hosting, and plugins, along with direct links to test them out.</div>
|
||||
</div>
|
||||
|
||||
<div class="card">
|
||||
<div class="card-title">
|
||||
<a href="https://openclaw.ai/blog/virustotal-partnership">
|
||||
OpenClaw Partners with VirusTotal for Skill Security
|
||||
</a>
|
||||
</div>
|
||||
<div class="card-meta">
|
||||
<span class="badge badge-upvotes">↑ 1</span>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Twitter Section -->
|
||||
<div class="section">
|
||||
<div class="section-header">
|
||||
<div class="section-icon twitter-icon">𝕏</div>
|
||||
<h2 class="section-title">From X</h2>
|
||||
</div>
|
||||
<div class="coming-soon">
|
||||
<div class="coming-soon-icon">🚧</div>
|
||||
<p>X/Twitter integration coming soon</p>
|
||||
</div>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<!-- Footer -->
|
||||
<div class="footer">
|
||||
<div class="footer-avatar">🦀</div>
|
||||
<div class="footer-text">Curated daily for Anthony Martin</div>
|
||||
<div class="footer-subtext">by Krilly the Crab</div>
|
||||
<div class="footer-links">
|
||||
<a href="https://github.com/openclaw/openclaw">GitHub</a>
|
||||
<a href="https://reddit.com/r/openclaw">Reddit</a>
|
||||
<a href="https://docs.openclaw.ai">Docs</a>
|
||||
</div>
|
||||
<div class="timestamp">2026-03-01 • Perth, Australia (AWST)</div>
|
||||
</div>
|
||||
</div>
|
||||
</body>
|
||||
</html>
|
||||
107
automations/ai-newsletter-digest/format-digest.py
Normal file
@@ -0,0 +1,107 @@
#!/usr/bin/env python3
"""
AI Newsletter Digest - Enhanced Version
Creates properly summarized digests like the OpenClaw Daily Digest
"""
import json
import re
import sys
from datetime import datetime
from zoneinfo import ZoneInfo

# Separator used in the digest header and footer (U+2014)
DASH = "\u2014"
# The footer is labelled AWST, so timestamps use Perth time, not the host's local time
PERTH_TZ = ZoneInfo("Australia/Perth")
# Maximum number of links shown per newsletter
MAX_LINKS = 3


def format_digest(newsletters_json):
    """Format newsletters into a proper digest with summaries."""
    newsletters = json.loads(newsletters_json)
    now = datetime.now(PERTH_TZ)

    # Group by source (the display name before the "<address>" part)
    by_source = {}
    for nl in newsletters:
        source = nl['from'].split('<')[0].strip()
        by_source.setdefault(source, []).append(nl)

    # Build digest header
    lines = [
        "🤖 **AI NEWSLETTER DIGEST {dash} {date}**",
        "Your synthesized briefing from {count} AI newsletters",
        "",
        "═" * 60,
        "📊 TODAY'S OVERVIEW",
        "═" * 60,
        "• {count} Newsletters Analyzed",
        "• Sources: {sources}",
        "",
        "═" * 60,
        "🔥 TOP STORIES",
        "═" * 60,
        ""
    ]

    digest = "\n".join(lines).format(
        dash=DASH,
        date=now.strftime("%A, %B %d, %Y"),
        count=len(newsletters),
        sources=", ".join(by_source),
    )

    # Add each newsletter with proper formatting
    for nl in newsletters:
        source = nl['from'].split('<')[0].strip()
        subject = nl['subject']
        content = nl.get('content', '')[:800]  # First 800 chars
        urls = nl.get('urls', [])

        # Collapse whitespace, then strip leftover quoted-printable markers:
        # "= " is a soft line break whose newline became a space, "=20" is an encoded space
        content = re.sub(r'\s+', ' ', content)
        content = content.replace('= ', '').replace('=20', ' ')

        # Key sentence: the first of the first three sentences longer than 50 characters
        key_sentence = ""
        for s in content.split('.')[:3]:
            if len(s.strip()) > 50:
                key_sentence = s.strip() + "."
                break

        digest += f"\n📌 **{subject}**\n"
        digest += f" Source: {source}\n"
        if key_sentence:
            digest += f" \n {key_sentence}\n"

        # Include at most MAX_LINKS of the URLs found
        if urls:
            digest += " \n 🔗 Links:\n"
            for url in urls[:MAX_LINKS]:
                digest += f" • {url}\n"

        digest += "\n---\n"

    generated = now.strftime(f"%A, %B %d, %Y {DASH} %I:%M %p AWST")
    digest += "\n🦀 Krilly the Crab | AI Newsletter Digest\n"
    digest += f"Generated: {generated}\n"

    return digest


def main():
    """Read JSON from a file argument or stdin and print the formatted digest."""
    if len(sys.argv) > 1:
        # Read from file
        with open(sys.argv[1], 'r', encoding='utf-8') as f:
            data = f.read()
    else:
        # Read from stdin
        data = sys.stdin.read()

    try:
        digest = format_digest(data)
        print(digest)
    except Exception as e:
        print(f"Error formatting digest: {e}", file=sys.stderr)
        # Fallback: just print the raw data
        print(data)


if __name__ == '__main__':
    main()
225
automations/ai-newsletter-digest/parse-emails.py
Normal file
@@ -0,0 +1,225 @@
#!/usr/bin/env python3
"""Parse AI_EMAIL / AI_CONTENT pairs from imap check script output."""
import sys
import json
import re

# Senders to exclude
BLOCKLIST = [
    'googlenews-noreply@google.com',
]

# URL patterns to skip (ads, tracking, social, email)
URL_SKIP = re.compile(
    r'unsubscribe|mailto|twitter\.com|instagram\.com|facebook\.com|'
    r'youtube\.com/unsubscribe|youtube\.com/channel|'
    r'utm_source=dlvr|utm_medium=email|'
    r'genstore|typeform|pigment\.com|maton\.ai|'
    r'youtu\.be|medium\.com/@|linkedin\.com/posts|'
    r'cdn-cgi|imageproxy',
    re.IGNORECASE
)

def decode_subject(subject):
    try:
        from email.header import decode_header
        parts = decode_header(subject)
        decoded = ''
        for part, charset in parts:
            if isinstance(part, bytes):
                decoded += part.decode(charset or 'utf-8', errors='replace')
            else:
                decoded += part
        return decoded.strip()
    except Exception:
        return subject.strip()

|
||||
def decode_qp(content):
    """Decode quoted-printable content before URL extraction."""
    # Soft line breaks first: "=" at the end of a line joins it with the next one
    content = re.sub(r'=\r?\n', '', content)

    def qp_decode(m):
        # Decode a whole run of =XX escapes at once so that multi-byte UTF-8
        # sequences (e.g. =E2=80=99) come out as a single character
        try:
            return bytes.fromhex(m.group(0).replace('=', '')).decode('utf-8', errors='replace')
        except ValueError:
            return m.group(0)

    # Decode quoted-printable hex codes (=20 becomes a space, =3D becomes "=")
    content = re.sub(r'(?:=[0-9A-Fa-f]{2})+', qp_decode, content)

    # Drop a dangling "=" left at the end of a line by a truncated soft break.
    # Every other "=" is kept, because URL query strings need them.
    content = re.sub(r'=[ \t]*$', '', content, flags=re.MULTILINE)

    return content

def clean_url(url):
    """Strip tracking parameters and trailing punctuation from a URL."""
    had_query = '?' in url
    url = re.sub(r'[?&]utm_[^&]+', '', url)
    url = re.sub(r'[?&]_bhlid=\w+', '', url)
    url = re.sub(r'[?&]jwt_token=\w+', '', url)
    url = re.sub(r'[?&]ref=[^&]+', '', url)
    # If the first parameter was removed, promote the next "&" to "?"
    if had_query and '?' not in url and '&' in url:
        url = url.replace('&', '?', 1)
    return url.rstrip('.,;)\'"')

def extract_urls(content):
    """Extract clean article URLs from email content (after full QP decoding)."""
    # Decode QP FIRST - this is the key fix
    content = decode_qp(content)

    # Markdown-style links: [text](https://...)
    markdown_urls = re.findall(r'\[[^\]]+\]\((https?://[^\s"<>)\]\']+)\)', content)

    # Regular URLs
    urls = re.findall(r'https?://[^\s"<>)\]\']+', content)

    seen = set()
    clean = []

    # Markdown URLs go first (they tend to be cleaner), then regular URLs
    for url in markdown_urls + urls:
        url = url.rstrip('.,;)\'"')

        # Skip tracking-heavy URLs
        if URL_SKIP.search(url):
            continue

        # Clean up common tracking garbage
        url = clean_url(url)

        # Must be reasonably long to be a real article
        if len(url) <= 15 or url in seen:
            continue

        # Also skip image/video URLs
        if any(x in url.lower() for x in ['.jpg', '.png', '.gif', '.jpeg', '.svg', '/image/', '/cdn-cgi/', '/media/', '/assets/']):
            continue

        seen.add(url)
        clean.append(url)

    return clean[:5]

def clean_content(content):
    """Best-effort text extraction from messy email body."""
    # Decode QP
    content = decode_qp(content)

    # Remove HTML tags
    content = re.sub(r'<[^>]+>', ' ', content)

    # Remove email formatting artifacts
    content = re.sub(r'\[\[.*?\]\]', ' ', content)  # [[markup]]
    content = re.sub(r'\{\{.*?\}\}', ' ', content)  # {{markup}}
    content = re.sub(r'\{\|\|.*?\|\|\}', ' ', content)  # {||markup||}
    content = re.sub(r'\^[^\^]+\^', ' ', content)  # ^markup^
    content = re.sub(r'~~[^~]+~~', ' ', content)  # ~~markup~~
    content = re.sub(r'__[^_]+__', ' ', content)  # __markup__

    # Remove base64/encoded blocks
    content = re.sub(r'[A-Za-z0-9+/]{60,}', '', content)

    # Convert markdown links to just text
    content = re.sub(r'\[([^\]]+)\]\([^\)]+\)', r'\1', content)

    # Clean up whitespace
    content = re.sub(r'[ \t]+', ' ', content)
    content = re.sub(r'\n{3,}', '\n\n', content)

    # Split into lines and filter - be LESS aggressive
    lines = [line.strip() for line in content.split('\n')]

    # Filter out noise lines (lines are already stripped, so no leading spaces)
    noise_patterns = [
        r'^[-=_\*\^|>.@#]+$',  # Separator lines
        r'^read online$',
        r'^sign up$',
        r'^advertise$',
        r'^sponsor$',
        r'^view all$',
        r'^unsubscribe$',
        r'^\d+ more$',
    ]

    filtered = []
    for line in lines:
        if len(line) < 20:
            continue
        if re.match(r'^[\W_]+\s*[\W_]+$', line):
            continue
        if any(re.match(pattern, line, re.IGNORECASE) for pattern in noise_patterns):
            continue
        filtered.append(line[:300])

    result = '\n'.join(filtered)

    # Final cleanup
    result = re.sub(r'^\s*(by|from|to|subject|date):.*$', '', result, flags=re.IGNORECASE | re.MULTILINE)

    return result[:3000].strip()

def is_blocked(sender):
    return any(b in sender.lower() for b in BLOCKLIST)

if len(sys.argv) < 2:
    print("[]")
    sys.exit(0)

with open(sys.argv[1], 'r', errors='replace') as f:
    text = f.read()

# Split into structured records
emails = []
current_from = None
current_subject = None
content_lines = []
in_content = False

for line in text.split('\n'):
    if line.startswith('AI_EMAIL:'):
        if current_from and in_content and not is_blocked(current_from):
            raw = '\n'.join(content_lines)
            emails.append({
                'from': current_from,
                'subject': decode_subject(current_subject),
                'urls': extract_urls(raw),
                'content': clean_content(raw)
            })
        meta = line[9:]
        parts = meta.split(' | ', 1)
        current_from = parts[0].strip()
        current_subject = parts[1].strip() if len(parts) > 1 else ''
        content_lines = []
        in_content = False
    elif line.startswith('AI_CONTENT:') and current_from:
        content_lines = [line[12:]]  # strip prefix
        in_content = True
    elif in_content and not line.startswith(('STATUS:', 'TOTAL:', 'LAST_UID:', 'RECENT:', 'AI_COUNT:')):
        content_lines.append(line)

# Last one
if current_from and in_content and not is_blocked(current_from):
    raw = '\n'.join(content_lines)
    emails.append({
        'from': current_from,
        'subject': decode_subject(current_subject),
        'urls': extract_urls(raw),
        'content': clean_content(raw)
    })

print(json.dumps(emails, indent=2))
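For comparison with the hand-rolled `decode_qp` in `parse-emails.py`, the standard library's `quopri` module already handles soft line breaks and multi-byte escapes. The following is a minimal sketch, assuming the body really is quoted-printable encoded; the wrapper name `decode_qp_stdlib` is hypothetical and not part of this commit. On text that has already been decoded, a literal `=` followed by two hex digits would be misread as an escape.

```python
import quopri


def decode_qp_stdlib(content):
    """Decode a quoted-printable string with the standard library."""
    # quopri works on bytes, so encode first and decode the result as UTF-8.
    raw = content.encode('utf-8', errors='replace')
    return quopri.decodestring(raw).decode('utf-8', errors='replace')


if __name__ == '__main__':
    # "=\n" is a soft line break, "=3D" is a literal "=",
    # and "=E2=80=99" is a three-byte UTF-8 apostrophe.
    sample = "It=E2=80=99s live: https://example.com/a?id=3D4=\n2 now"
    print(decode_qp_stdlib(sample))
```

Because the whole byte string is decoded before it is turned back into text, the URL keeps its `=` and the apostrophe comes out as one character.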
225
automations/ai-newsletter-digest/parse-emails.py.bak
Normal file
@@ -0,0 +1,225 @@
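The tracking-parameter cleanup in `extract_urls` works with regular expressions on the raw URL string. A sketch of the same idea using `urllib.parse`, which rebuilds the query string from the parameters that remain, might look like the block below. The function name `strip_tracking` and the `TRACKING_KEYS` set are hypothetical; the keys themselves mirror the patterns the script removes (`utm_*`, `_bhlid`, `jwt_token`, `ref`).

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Exact parameter names to drop; utm_* is handled by prefix below.
TRACKING_KEYS = {'_bhlid', 'jwt_token', 'ref'}


def strip_tracking(url):
    """Return url without utm_* and other known tracking parameters."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.startswith('utm_') and key not in TRACKING_KEYS
    ]
    # urlencode rebuilds the query; an empty list yields no "?" at all.
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == '__main__':
    print(strip_tracking("https://example.com/post?utm_source=x&id=5&ref=nl"))
    print(strip_tracking("https://example.com/post?utm_source=x"))
```

One trade-off: `urlencode` re-encodes the remaining values, so a URL whose parameters used unusual escaping may not come back byte-for-byte identical.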
46
automations/ai-newsletter-digest/send-multi-channel.sh
Executable file
@@ -0,0 +1,46 @@
#!/bin/bash
# AI Newsletter Digest - Multi-Channel Sender
# Sends digest to both Telegram and Discord

SCRIPT_DIR="/home/openclaw/.openclaw/workspace/automations/ai-newsletter-digest"
DIGEST_SCRIPT="$SCRIPT_DIR/daily-digest.sh"

# Generate the digest
RESULT=$("$DIGEST_SCRIPT" 2>/dev/null)

# Check if we have newsletters
if echo "$RESULT" | grep -q '"count": 0' || echo "$RESULT" | grep -q 'No AI newsletters'; then
    echo "No newsletters to send"
    exit 0
fi

# Parse the digest data
COUNT=$(echo "$RESULT" | python3 -c "import json,sys; d=json.load(sys.stdin); print(d.get('count',0))")
SOURCES=$(echo "$RESULT" | python3 -c "import json,sys; d=json.load(sys.stdin); print(', '.join(d.get('sources',[])))")

# Treat a missing count (for example, after a JSON parse failure) as zero
if [ "${COUNT:-0}" -eq 0 ]; then
    exit 0
fi

# Format the message
MESSAGE="🤖 Daily AI Newsletter Digest

Found $COUNT AI newsletters:
• $SOURCES

$(echo "$RESULT" | python3 -c "
import json,sys
d=json.load(sys.stdin)
for n in d.get('newsletters',[]):
    print(f\"📧 {n.get('subject','')} - {n.get('from','')}\")
")

Full analysis available - reply for details!"

STATUS=0

# Send to Telegram
openclaw message send --channel telegram --message "$MESSAGE" 2>/dev/null || { echo "Telegram send failed" >&2; STATUS=1; }

# Send to Discord (#krilly channel)
openclaw message send --channel discord --target "#krilly" --message "$MESSAGE" 2>/dev/null || { echo "Discord send failed" >&2; STATUS=1; }

# Only report success when both sends succeeded
if [ "$STATUS" -eq 0 ]; then
    echo "Digest sent to Telegram and Discord!"
fi
exit "$STATUS"
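`send-multi-channel.sh` starts `python3` three times to pull `count`, `sources`, and `newsletters` out of the same JSON. As a rough sketch, the message could instead be assembled in one pass. The function name `build_message` is hypothetical, and the JSON shape (keys `count`, `sources`, `newsletters`) is assumed from what the shell script reads:

```python
import json


def build_message(raw_json):
    """Build the channel message from the digest JSON, or None if empty."""
    data = json.loads(raw_json)
    count = data.get('count', 0)
    if not count:
        return None

    lines = [
        "🤖 Daily AI Newsletter Digest",
        "",
        f"Found {count} AI newsletters:",
        "• " + ", ".join(data.get('sources', [])),
        "",
    ]
    # One line per newsletter, matching the format the shell script prints.
    for n in data.get('newsletters', []):
        lines.append(f"📧 {n.get('subject', '')} - {n.get('from', '')}")
    lines += ["", "Full analysis available - reply for details!"]
    return "\n".join(lines)


if __name__ == '__main__':
    demo = {
        "count": 1,
        "sources": ["The Rundown AI"],
        "newsletters": [{"subject": "Example subject", "from": "The Rundown AI"}],
    }
    print(build_message(json.dumps(demo)))
```

Returning `None` for an empty digest lets the caller skip sending without inspecting the message text.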
151
automations/ai-newsletter-digest/summarize.py
Normal file
@@ -0,0 +1,151 @@
#!/usr/bin/env python3
"""
AI Newsletter Summarizer
Uses LLM to synthesize and summarize newsletter content
"""
import json
import sys
import subprocess
from datetime import datetime

def call_llm(prompt, model="kilocode/kilo/auto-free"):
    """Call LLM via OpenClaw CLI."""

    cmd = [
        "openclaw",
        "llm",
        "--model", model,
        "--prompt", prompt
    ]

    try:
        result = subprocess.run(
            cmd,
            capture_output=True,
            text=True,
            timeout=120,
            cwd="/home/openclaw/.openclaw/workspace"
        )

        if result.returncode == 0:
            return result.stdout.strip()
        else:
            print(f"LLM error: {result.stderr}", file=sys.stderr)
            return None
    except Exception as e:
        print(f"LLM call failed: {e}", file=sys.stderr)
        return None

def summarize_newsletters(newsletters):
    """Use LLM to summarize newsletters into a proper digest.

    Returns None when the LLM is unavailable so the caller can fall back.
    """

    # Prepare newsletter content
    content_parts = []
    for i, nl in enumerate(newsletters, 1):
        source = nl.get('from', 'Unknown').split('<')[0].strip()
        subject = nl.get('subject', 'No subject')
        content = nl.get('content', '')[:1500]  # Limit to ~1500 chars per newsletter

        content_parts.append(f"""
--- NEWSLETTER {i} ---
Source: {source}
Subject: {subject}
Content: {content}
""")

    combined = "\n".join(content_parts)

    prompt = f"""You are creating an AI newsletter digest for a tech-savvy reader.

Analyze these {len(newsletters)} AI newsletters and create a concise, informative digest.

{combined}

Create a digest with these sections:
1. **TOP STORIES** (3-5 most important items) - Each with: headline, source, 2-3 sentence summary, why it matters
2. **OTHER NOTABLE NEWS** - Brief mentions of other stories
3. **KEY TAKEAWAYS** - 2-3 bullet points on patterns/trends

Format rules:
- Use markdown
- Keep each story summary to 2-3 sentences max
- Include the source newsletter name
- Write for someone who follows AI but wants quick briefings
- Prioritize news with real-world impact

Today's date: {datetime.now().strftime('%A, %B %d, %Y')}

Begin your digest:"""

    # Try using the LLM
    print("🧠 Using LLM to synthesize digest...", file=sys.stderr)

    result = call_llm(prompt)

    if result:
        return result
    else:
        # Signal the caller to fall back to basic formatting
        print("⚠️ LLM unavailable, using basic formatting", file=sys.stderr)
        return None

def create_basic_digest(newsletters):
    """Create a basic digest without LLM (fallback)."""

    lines = [
        f"🤖 **AI NEWSLETTER DIGEST** — {datetime.now().strftime('%A, %B %d, %Y')}",
        "",
        f"*{len(newsletters)} newsletters analyzed*",
        "",
        "═" * 50,
        "",
    ]

    for nl in newsletters:
        source = nl.get('from', 'Unknown').split('<')[0].strip()
        subject = nl.get('subject', 'No subject')

        # Collapse whitespace, then keep the first 300 characters
        content = ' '.join(nl.get('content', '').split())[:300]

        lines.append(f"📌 **{subject}**")
        lines.append(f" Source: {source}")
        if content:
            lines.append(f" {content}...")
        lines.append("")

    lines.append("═" * 50)
    lines.append(f"🦀 Krilly the Crab | {datetime.now().strftime('%B %d, %Y')}")

    return "\n".join(lines)

def main():
    """Main entry point."""

    # Read newsletters from stdin or file
    if len(sys.argv) > 1:
        with open(sys.argv[1], 'r') as f:
            newsletters = json.load(f)
    else:
        newsletters = json.load(sys.stdin)

    if not newsletters:
        print("No newsletters to summarize")
        return

    print(f"📊 Processing {len(newsletters)} newsletters...", file=sys.stderr)

    # Try LLM summarization first
    digest = summarize_newsletters(newsletters)

    if not digest:
        # Fallback to basic
        digest = create_basic_digest(newsletters)

    print(digest)

if __name__ == '__main__':
    main()
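`TEMPLATE.md` treats links as a required element of every story, but the prompt built in `summarize_newsletters` passes only source, subject, and content to the LLM. One possible way to carry the extracted links into each prompt section is sketched below. The helper `format_newsletter_block` and its `max_links` parameter are hypothetical; the `urls` field is assumed to be the list that `parse-emails.py` emits for each record.

```python
def format_newsletter_block(index, nl, max_chars=1500, max_links=3):
    """Format one newsletter record as a prompt section, including its links."""
    source = nl.get('from', 'Unknown').split('<')[0].strip()
    subject = nl.get('subject', 'No subject')
    content = nl.get('content', '')[:max_chars]
    links = nl.get('urls', [])[:max_links]

    lines = [
        f"--- NEWSLETTER {index} ---",
        f"Source: {source}",
        f"Subject: {subject}",
        f"Content: {content}",
    ]
    # Only add a Links section when the record actually has URLs.
    if links:
        lines.append("Links:")
        lines.extend(f"- {url}" for url in links)
    return "\n".join(lines)


if __name__ == '__main__':
    record = {
        'from': 'The Rundown AI <news@example.com>',
        'subject': 'Example subject',
        'content': 'Example body text.',
        'urls': ['https://example.com/story-1', 'https://example.com/story-2'],
    }
    print(format_newsletter_block(1, record))
```

Whether the model then reproduces those links in its output would still depend on the prompt's format rules, which this sketch does not change.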