This folder provides tools to download and process PostgreSQL mailing list threads.
- Fetch thread HTML from postgresql.org
- Convert HTML to Markdown (uses html2text if available)
- Download attachments (.patch, .txt, .no-cfbot files)
- Organize content by thread-id and date
- Cursor Agent integration for automated blog generation
# Using full URL (recommended)
python3 tools/fetch_data.py --thread-id "https://www.postgresql.org/message-id/flat/CACJufx..."
# Or just the thread ID
python3 tools/fetch_data.py --thread-id "CACJufxGn+bMNPyrMTe0-W4fLmkFVXSr..."
This will:
- Download the thread HTML
- Convert to Markdown
- Download all attachments (.patch, .txt, .no-cfbot)
- Save everything to data/threads/<date>/<thread-id>/
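The mapping from a thread ID or URL to its output directory can be sketched as below. This is an illustration, not the actual logic of fetch_data.py: the helper names, the character-replacement rule (suggested by the `+` in the example ID appearing as `_` in the directory listing), and the choice of which date to use are all assumptions.

```python
import re
from datetime import date
from pathlib import Path
from urllib.parse import unquote

FLAT_PREFIX = "https://www.postgresql.org/message-id/flat/"


def extract_thread_id(value: str) -> str:
    """Accept either a full flat-thread URL or a bare thread ID."""
    value = value.strip()
    if value.startswith(FLAT_PREFIX):
        value = value[len(FLAT_PREFIX):]
    # URLs may percent-encode characters such as '@'
    return unquote(value.strip("/"))


def safe_dir_name(thread_id: str) -> str:
    """Assumed rule: replace characters that are awkward in directory names."""
    return re.sub(r"[^A-Za-z0-9._@-]", "_", thread_id)


def thread_dir(value: str, day: date, root: str = "data/threads") -> Path:
    """Build <root>/<date>/<thread-id>/ for a thread (hypothetical helper)."""
    return Path(root) / day.isoformat() / safe_dir_name(extract_thread_id(value))


# Made-up message ID, for illustration only
print(thread_dir(FLAT_PREFIX + "CAexample+id%40mail.example.com", date(2026, 1, 18)))
```

Passing the date in explicitly keeps the sketch neutral on whether the real tool uses the fetch date or a date taken from the thread.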
⚡ Quick Method:
- First time setup: Copy the template
  cp QUICK_PROMPT.template QUICK_PROMPT.txt
  Note: QUICK_PROMPT.txt is gitignored for your personal use
- Open QUICK_PROMPT.txt in the project root
- Replace PASTE_YOUR_THREAD_ID_HERE with your thread ID/URL (in 2 places)
- Copy the entire prompt and paste it into Cursor Agent
📚 Detailed Method:
See BLOG_GENERATION_PROMPT.md for:
- Multiple prompt templates (basic, advanced, minimal)
- Customization options
- Batch processing instructions
- Example usage and tips
The agent will:
- ✅ Fetch thread data automatically
- ✅ Analyze content and patches
- ✅ Compare patch versions using diff (if applicable)
- ✅ Generate technical blogs as a PostgreSQL expert
- ✅ Create TWO versions: English and Chinese (中文)
- ✅ Save to appropriate directories (auto-determined):
  - English: src/en/{year}/{week}/{filename}.md
  - Chinese: src/cn/{year}/{week}/{filename}.md
- ✅ Update src/SUMMARY.md with both language versions in their respective sections
Or just tell Cursor Agent:
"Generate a blog from this PostgreSQL thread: [paste thread ID]"
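To make the {year}/{week}/{filename} save paths concrete, here is a minimal sketch of how the two output locations might be computed. It assumes that {week} means the ISO week number, zero-padded to two digits; the README does not specify this, and `blog_paths` is a hypothetical helper, not part of the tooling.

```python
from datetime import date
from pathlib import Path


def blog_paths(day: date, filename: str, root: str = "src") -> dict:
    """Return the English and Chinese blog paths for a given date.

    Assumption: {year} and {week} follow the ISO calendar.
    """
    iso = day.isocalendar()
    year, week = iso[0], iso[1]
    return {
        lang: Path(root) / lang / str(year) / f"{week:02d}" / f"{filename}.md"
        for lang in ("en", "cn")
    }


for lang, path in blog_paths(date(2026, 1, 18), "my-post").items():
    print(lang, path)
```

If the project numbers weeks differently, only the two lines that derive `year` and `week` would need to change.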
After running fetch_data.py, you'll get:
data/threads/
└── 2026-01-18/
    └── CACJufxGn_bMNPyr.../
        ├── thread.html       # Original HTML
        ├── thread.md         # Converted Markdown
        ├── metadata.txt      # Thread metadata
        ├── attachments.txt   # List of attachments
        └── attachments/      # Downloaded attachments
            ├── v1-patch.patch
            ├── v2-patch.patch
            └── ...
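Given this layout, comparing successive patch versions could look roughly like the sketch below, which uses Python's standard `difflib`. It assumes patch files are named with a `v<N>-` prefix as in the listing; the function names are hypothetical, and the agent may well use a different diff tool.

```python
import difflib
import re
from pathlib import Path


def patch_version(path: Path) -> int:
    """Extract N from a 'vN-...' file name; 0 if the name has no such prefix."""
    match = re.match(r"v(\d+)-", path.name)
    return int(match.group(1)) if match else 0


def diff_patch_versions(thread_dir: Path) -> list:
    """Return one unified diff per pair of consecutive patch versions."""
    patches = sorted(
        (thread_dir / "attachments").glob("*.patch"),
        key=lambda p: (patch_version(p), p.name),  # numeric sort, so v10 follows v2
    )
    diffs = []
    for old, new in zip(patches, patches[1:]):
        old_lines = old.read_text(encoding="utf-8", errors="replace").splitlines(keepends=True)
        new_lines = new.read_text(encoding="utf-8", errors="replace").splitlines(keepends=True)
        diffs.append("".join(
            difflib.unified_diff(old_lines, new_lines, fromfile=old.name, tofile=new.name)
        ))
    return diffs
```

Sorting on the parsed version number rather than the raw file name avoids placing v10 before v2.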
Optional (recommended):
- html2text: For better HTML to Markdown conversion
  pip install html2text
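The "use html2text if available" behavior is a common optional-dependency pattern. The sketch below shows one way it might be written; the fallback tag stripper is a deliberately crude stand-in, and nothing here is taken from fetch_data.py itself.

```python
import re
from html import unescape

try:
    import html2text  # optional dependency: pip install html2text
except ImportError:
    html2text = None


def html_to_markdown(html: str) -> str:
    """Convert HTML to Markdown, degrading to plain text without html2text."""
    if html2text is not None:
        return html2text.html2text(html)
    # Fallback (assumed behavior): drop scripts/styles, turn block ends into newlines
    text = re.sub(r"(?is)<(script|style)\b.*?</\1>", "", html)
    text = re.sub(r"(?i)<br\s*/?>|</p>|</div>|</h[1-6]>", "\n", text)
    text = re.sub(r"<[^>]+>", "", text)
    text = unescape(text)
    return re.sub(r"\n{3,}", "\n\n", text).strip() + "\n"


print(html_to_markdown("<p>Hello <b>world</b> &amp; all</p>"))
```

The fallback loses links, emphasis and list structure, which is why installing html2text is worth it.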
# Fetch several threads in a loop
for id in "thread1" "thread2" "thread3"; do
  python3 tools/fetch_data.py --thread-id "$id"
done

# Process a local HTML file
python3 tools/fetch_data.py --input "path/to/thread.html"

# Write output to a custom directory
python3 tools/fetch_data.py --thread-id <id> --output-dir "my-threads"

- Download threads using fetch_data.py
- Use prompt templates in QUICK_PROMPT.txt or BLOG_GENERATION_PROMPT.md
- Let Cursor Agent generate high-quality technical blogs
- Review and publish to your weekly digest
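For larger batches, the download step of this workflow could also be driven from Python. The sketch below only wraps the `--thread-id` and `--output-dir` flags shown earlier. It assumes it is run from the project root and that fetch_data.py exits with a non-zero status on failure, which the README does not confirm; `build_command` and `fetch_all` are hypothetical helpers.

```python
import subprocess
import sys


def build_command(thread_id, output_dir=None):
    """Assemble the fetch_data.py command line for one thread."""
    cmd = [sys.executable, "tools/fetch_data.py", "--thread-id", thread_id]
    if output_dir:
        cmd += ["--output-dir", output_dir]
    return cmd


def fetch_all(thread_ids, output_dir=None, dry_run=False):
    """Fetch each thread in turn and return the IDs whose fetch failed."""
    failed = []
    for thread_id in thread_ids:
        cmd = build_command(thread_id, output_dir)
        if dry_run:
            print(" ".join(cmd))  # show the command without running it
            continue
        # Assumption: a non-zero exit status signals a failed download
        if subprocess.run(cmd).returncode != 0:
            failed.append(thread_id)
    return failed


fetch_all(["thread1", "thread2", "thread3"], dry_run=True)
```

Collecting failures instead of stopping at the first one lets the remaining threads download, and the returned list can be retried later.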