Multi-Agent Document Generation Pipeline - Progress Tracker

Last Updated: 2026-01-16 (Unified fetch command documentation) Project: ghosted Kanban Project ID: b666852b-0ef9-4ee0-8d91-a7f341697897 GitHub Repo: celloopa/ghosted

Overview

Building a multi-agent pipeline that automates job application document generation:

[Job Posting] → [Parser] → [Resume Gen] → [Cover Letter Gen] → [Reviewer] → [Tracker]

When a user drops a job posting into local/postings/, agents will:

Parse the posting and extract structured data
Generate a tailored resume using Typst
Generate a tailored cover letter using Typst
Have a "hiring manager" agent review and score the documents
On approval, add to the ghosted tracker with document references

Task Progress

Phase 1: Foundation

Status	Task	ID	Notes
`[x]`	Design multi-agent pipeline architecture	`72b33fb7-8924-4b3f-8c38-11001f93436d`	✅ Created `internal/agent/config.go`, `pipeline.go`, and runtime config
`[x]`	Create agent prompt templates	`88ab2620-aeae-4842-a45c-e6453f3ee1b0`	Prompts for each agent role

Phase 2: Core Agents

Status	Task	ID	Notes
`[x]`	Implement Posting Parser Agent	`bcaee5cf-e414-4d4f-b3c4-61a6da2ed451`	✅ Created `parser.go` with tests
`[x]`	Implement Resume Generator Agent	`c4f2f83d-2f81-49d3-95d5-4af1dcc9f98c`	✅ `resume_generator.go` + 14 tests
`[x]`	Implement Cover Letter Generator Agent	`e28db1da-93ee-418f-a0a7-a88605f3c398`	✅ `coverletter_generator.go` + 13 tests
`[x]`	Implement Hiring Manager Review Agent	`05b75d49-34ad-47a8-b019-532d73b8469d`	✅ `reviewer.go` + 17 tests

Phase 3: Integration

Status	Task	ID	Notes
`[x]`	Implement Tracker Integration	`1dd2fb3f-6555-4baf-b7bb-d77056c1968d`	✅ `tracker.go` + 23 tests
`[x]`	Add `ghosted apply` CLI command	`b9615c5d-fd52-418c-89da-4bec8c724f83`	✅ Implemented with --dry-run, --auto-approve
`[ ]`	Add watch mode for automatic processing	`2cdc1317-ddc0-408b-ab3d-b6fb92e2887b`	Nice-to-have: monitor folder

Bug Fixes & Improvements

Status	Task	Issue	Notes
`[x]`	Add Microsoft Careers site fetcher	#18	✅ `extractMicrosoft()` parses NEXT_DATA JSON. 7 tests added.

Phase 4: Agent Automation & Training Data (After Phase 3)

These tasks depend on Phase 3 completion. Priority order within phase:

Status	Priority	Task	ID	Notes
`[x]`	P0	Add `ghosted cv fetch <website>` command	`715e1484`	✅ Fetch CV from remote URL
`[ ]`	P0	Add --non-interactive flag to ghosted apply	`bdfff0fc`	Blocker for AI agent usage
`[ ]`	P1	Persist intermediate outputs (parsed.json, review.json)	`225c8b1b`	Debugging + training data
`[ ]`	P2	Add session logging (optional, off by default)	`ae201fb8`	Privacy-first, user opt-in only
`[ ]`	P3	Add ghosted export-training command	`e1cdba88`	Export training pairs
`[ ]`	P4	Research and test 8B parameter models	`94f2594a`	Local model evaluation

GitHub Issues

All remaining tasks are tracked as GitHub issues. Each issue includes full implementation details and testing requirements.

Issue	Title	Phase
#18	Add Microsoft Careers site fetcher	Bug Fix (In Progress)
#1	Implement Resume Generator Agent	Core
#2	Implement Cover Letter Generator Agent	Core
#3	Implement Hiring Manager Review Agent	Core
#4	Implement Tracker Integration	Integration
#5	Add `ghosted apply` CLI command	Integration
#6	Add watch mode for automatic processing	Nice-to-have
#7	Create agent prompt templates	Foundation

Workflow

Pick an issue from the list above
Create a feature branch: git checkout -b feature/issue-N-short-description
Implement the feature with tests
Ensure all tests pass: go test ./...
Submit a pull request referencing the issue
Update task status in kanban and PROGRESS.md

File Structure (Planned)

ghosted/
├── main.go                          # Add 'apply' and 'watch' commands
├── internal/
│   ├── agent/
│   │   ├── pipeline.go              # ✅ Orchestrator - runs all agents in sequence
│   │   ├── config.go                # ✅ Agent configuration loader
│   │   ├── parser.go                # ✅ Posting Parser Agent
│   │   ├── parser_test.go           # ✅ Parser agent tests
│   │   ├── resume_generator.go      # ✅ Resume Generator Agent
│   │   ├── resume_generator_test.go # ✅ 14 tests
│   │   ├── coverletter_generator.go # ✅ Cover Letter Generator Agent
│   │   ├── coverletter_generator_test.go # ✅ 13 tests
│   │   ├── reviewer.go              # ✅ Hiring Manager Review Agent
│   │   ├── reviewer_test.go         # ✅ 17 tests
│   │   ├── tracker.go               # ✅ Tracker Integration
│   │   ├── tracker_test.go          # ✅ 23 tests
│   │   └── prompts/                 # ✅ Agent prompt templates
│   │       ├── parser.md            # ✅ Parser agent prompt
│   │       ├── resume.md            # ✅ Resume generator prompt
│   │       ├── cover.md             # ✅ Cover letter generator prompt
│   │       ├── reviewer.md          # ✅ Reviewer agent prompt
│   │       └── tracker.md           # ✅ Tracker integration prompt
│   └── ...
├── local/
│   ├── postings/                    # Drop job postings here
│   │   ├── processed/               # Successfully processed postings
│   │   └── needs-review/            # Rejected - needs manual review
│   ├── resumes/                     # Generated resume PDFs
│   ├── cover-letters/               # Generated cover letter PDFs
│   └── document-generation/
│       ├── .agent/
│       │   └── config.json          # Runtime configuration (references local files)
│       ├── cv.json                  # Source CV data
│       ├── resume.typ               # Base resume template
│       └── coverletter.typ          # Base cover letter template
└── PROGRESS.md                      # This file

Data Schemas

Parsed Job Posting (Parser → Generators)

{
  "company": "Twitch",
  "position": "Software Engineer",
  "team": "BITS Team",
  "location": "Seattle, WA",
  "remote": false,
  "salary_min": null,
  "salary_max": null,
  "job_url": null,
  "requirements": ["1+ years experience", "..."],
  "bonus_skills": ["Golang", "AWS", "..."],
  "keywords": ["live streaming", "interactive", "..."],
  "tech_stack": ["Go", "TypeScript", "React"],
  "company_values": ["community", "collaboration"]
}

Review Result (Reviewer → Tracker)

{
  "approved": true,
  "overall_score": 85,
  "resume_review": {
    "score": 88,
    "strengths": ["..."],
    "weaknesses": ["..."],
    "suggestions": ["..."]
  },
  "cover_letter_review": {
    "score": 82,
    "strengths": ["..."],
    "weaknesses": ["..."],
    "suggestions": ["..."]
  },
  "recommendation": "Approve with minor edits"
}

CLI Commands (Planned)

# Run full pipeline on a posting
ghosted apply local/postings/twitch-swe.md

# Options
ghosted apply --auto-approve posting.md    # Skip review, add directly
ghosted apply --dry-run posting.md         # Generate docs, don't add to tracker
ghosted apply --reviewer-only resume.pdf cover.pdf posting.md

# Watch mode (Phase 3)
ghosted watch                              # Monitor postings folder
ghosted watch --auto-approve               # Auto-approve all

Current Focus

Completed:

ghosted fetch <url|domain> command - unified fetch for job postings AND CVs (auto-detected)
ghosted apply <posting> command - runs full pipeline with --dry-run support
ghosted context command - outputs context for AI agents (postings, CV, applications)
ghosted compile <id|dir> command - compiles Typst to PDF and updates tracker
Documentation updated to reflect unified fetch (legacy ghosted cv fetch still works)

Next task:

Add --non-interactive flag to ghosted apply (bdfff0fc) - for AI agent usage

Blockers: None - Phase 3 complete!

Questions/Decisions:

Should agents be Claude Code subagents or Go code with LLM API calls? → Claude Code subagents
Approval threshold: 70/100 score? → Yes, implemented in reviewer.go
Keep rejected postings or delete them? → Save feedback to {company}-feedback.json, keep posting for revision
URL support for apply? → Fetch-first workflow: ghosted fetch then ghosted apply
Small model fine-tuning strategy? → Collect training data first, then test 8B models per step

Agent Interaction Logging (Optional, Privacy-First)

Design Principles

OFF by default - no logging unless user explicitly enables
No annoying prompts - user must seek out the option
Local-first - sharing requires deliberate action
Documented in help - not in dialogs or banners

Three Modes

Mode	Command	Behavior
Off (default)	n/a	No logging whatsoever
Local	`ghosted config set logging local`	Saves to `local/reports/`, never leaves machine
Share	`ghosted config set logging share`	Opt-in feedback to developer (requires explicit acknowledgment)

Report Location (when local logging enabled)

local/reports/
├── YYYY-MM-DD-{company}-{role}-session.json   # Machine-readable
└── YYYY-MM-DD-{company}-{role}-session.md     # Human-readable

Use Cases

Off: Default for all users, complete privacy
Local: Power users iterating on prompts, self-hosted fine-tuning
Share: Users who want to help improve ghosted (must actively enable)

Manual Session Log (Developer Reference)

2026-01-16: Microsoft UX Engineer II
- Score: 78/100 (Approved)
- Automation rate: 43% (3/7 steps)
- Key gap: ghosted apply requires TTY
- Reports: local/reports/2026-01-16-microsoft-ux_engineer-session.*
- Note: This was logged manually for framework development

Training Data Pipeline (Planned)

Phase 1: Collection

Collect high-quality input/output pairs from manual orchestrator runs:

Agent	Input	Output	Format
Parser	posting.md	parsed.json	JSON extraction
Resume	parsed.json + cv.json	resume.typ	Typst generation
Cover	parsed.json + cv.json + resume.typ	cover-letter.typ	Typst generation
Reviewer	all docs	review.json	Scoring + reasoning

Phase 2: Export

# Planned command
ghosted export-training local/applications/ux-design/microsoft-ux_engineer/

Output format:

{
  "step": "parser",
  "input": "...",
  "output": "...",
  "metadata": { "company", "position", "review_score" }
}

Phase 3: Fine-tuning

Target models (8B parameter class):

Qwen2.5-7B-Instruct - Strong at structured output
Llama-3.1-8B-Instruct - Good general reasoning
Mistral-7B-Instruct-v0.3 - Fast inference

Training strategy:

Start with Parser (most constrained task)
Add Resume/Cover (template-guided generation)
Reviewer last (requires most reasoning)

Phase 4: Integration

Replace Claude API calls with local model inference:

Use Ollama or llama.cpp for serving
Orchestrator (Claude) only handles error recovery and complex decisions
Target: 80%+ automation with local models

Session Notes (for context reset)

What's Done

All 4 core agents implemented with 80+ tests:
- resume_generator.go - CV loading, Typst output, skill matching
- coverletter_generator.go - Experience extraction, consistency with resume
- reviewer.go - Scoring (70+ approval), detailed feedback
- tracker.go - Store integration, file organization
ghosted fetch command implemented:
- internal/fetch/fetcher.go - URL fetching, HTML parsing, markdown conversion
- Supports: Lever, Greenhouse, Workday, LinkedIn, Ashby, generic
- Saves to local/postings/ with metadata header
- Usage: ghosted fetch https://jobs.lever.co/company/123

What's Next

Implement ghosted apply <file> - This will:
- Accept a posting file from local/postings/
- Run the full pipeline (Parser → Resume → Cover → Reviewer → Tracker)
- Invoke Claude Code for AI-powered document generation
- Wire up all the agents we built

Test the full workflow:

ghosted fetch https://jobs.lever.co/some-company/123
ghosted apply local/postings/some-company-swe-posting.md

Completed Work Log

2026-01-16: Unified Fetch Command Documentation

Updated documentation to reflect the unified ghosted fetch command that auto-detects job postings vs CVs.

Files modified:

README.md - Removed separate "CV Fetch Command" section, updated CLI examples
CHANGELOG.md - Updated CV Fetch entry to reference unified fetch command

Changes:

ghosted fetch cello.design now documented as the primary way to fetch CVs
Legacy ghosted cv fetch <website> still works for backwards compatibility
Single "Fetch Command" section in README covers both use cases

2026-01-16: TUI Clipboard Copy for Claude Prompt

Added ability to copy a complete Claude prompt to clipboard after fetching a job posting in the TUI.

Problem solved: After fetching a posting in the TUI, users had to manually copy content and construct a prompt for Claude. This was tedious.

Files modified:

internal/tui/fetch.go - Added PostingContent field, Result() getter, 'c' key handler, help text
internal/tui/keys.go - Added CopyContext key binding
internal/tui/app.go - Added handleCopyContext() and buildApplyPrompt() functions

Features:

Press 'c' after successful job posting fetch to copy prompt
Runs ghosted context to include full context (CV, applications, postings)
Includes the fetched job posting content
Includes instructions for generating tailored resume and cover letter
Shows success message "Claude prompt copied to clipboard"

2026-01-16: Filename Validation for Fetched Postings

Improved filename generation for fetched job postings to reject IDs and provide better error messages.

Problem solved: Some job board URLs (especially LinkedIn) generate filenames that are just numeric IDs, which aren't useful for organizing postings.

Files modified:

internal/fetch/fetcher.go - Added validation functions and improved hostname extraction
internal/fetch/fetcher_test.go - 8 new tests for filename validation
main.go - Better error messages with example usage

Functions added:

isHexString() - Checks if string is hexadecimal
looksLikeID() - Detects numeric, UUID, or hex hash patterns
ValidateFilename() - Validates generated filenames
extractCleanHostname() - Extracts company name from hostname

Test coverage:

TestLooksLikeID - ID pattern detection
TestValidateFilename - Filename validation
TestExtractCleanHostname - Hostname parsing
TestGenerateFilename_Validation - Integration tests

2026-01-16: Microsoft Careers Site Fetcher (#18)

Added specialized extractor for careers.microsoft.com and apply.careers.microsoft.com URLs.

Problem solved: Microsoft Careers uses React/Next.js with job data embedded in <script id="__NEXT_DATA__"> JSON. The generic fetcher was returning truncated content.

Files modified:

internal/fetch/fetcher.go - Added extractMicrosoft() and helper functions
internal/fetch/fetcher_test.go - Added 7 new tests

Implementation:

extractMicrosoft() - Main extractor, parses NEXT_DATA JSON
parseMicrosoftNextData() - Navigates JSON structure (job, jobDetail, data)
extractMicrosoftJobData() - Extracts title, description, qualifications, responsibilities
validateMicrosoftExtraction() - Rejects numeric company names, empty/numeric positions
isNumeric() - Helper for validation

Test coverage:

TestFetcher_ExtractMicrosoft_NextData - Full NEXT_DATA parsing
TestFetcher_ExtractMicrosoft_FallbackToMeta - Meta tag fallback
TestFetcher_ExtractMicrosoft_JobDetail - Alternate JSON structure
TestFetcher_ExtractMicrosoft_QualificationsArray - Array handling
TestIsNumeric - Numeric validation
TestValidateMicrosoftExtraction - Data validation
TestFetcher_ExtractMicrosoft_ApplySubdomain - Subdomain support

2026-01-16: Unified Fetch Command + TUI Fetch View

Merged job posting fetch and CV fetch into a single ghosted fetch command with auto-detection. Also added a TUI fetch view accessible via f key.

Files created/modified:

internal/fetch/fetcher.go - Added FetchType enum and DetectFetchType() function
internal/fetch/cv.go - New CV fetcher with JSON Resume support
internal/fetch/cv_test.go - 8 tests for CV fetching and detection
main.go - Unified cmdFetch() to handle both types
internal/tui/fetch.go - New FetchView with URL input and async fetch
internal/tui/keys.go - Updated key bindings
internal/tui/app.go - Added ViewFetch state
internal/tui/list.go - Added fetch action

Detection rules:

Bare domain (cello.design) → CV fetch to local/cv.json
Explicit /cv.json path → CV fetch
Any URL with path → Job posting fetch to local/postings/

Usage (CLI):

ghosted fetch https://jobs.lever.co/company/123  # Job posting
ghosted fetch cello.design                       # CV from domain/cv.json
ghosted fetch https://example.com/cv.json        # CV from explicit URL

Usage (TUI): Press f from the list view to open the fetch dialog.

Key binding changes:

s = filter by status (was f)
f = fetch URL (new)

2026-01-16: Context and Compile Commands

ghosted context command:

Outputs all context needed for AI-assisted workflows
Shows pending postings in local/postings/
Displays CV data from local/cv.json
Lists current applications organized by status
Shows available agent prompt templates
Usage: ghosted context

ghosted compile <id|dir> command:

Compiles .typ files (resume.typ, cover-letter.typ) to PDF
Accepts application ID or directory path
Automatically updates tracker with resume_version and cover_letter
Opens output folder after compilation (macOS: open, Linux: xdg-open)
Usage: ghosted compile abc123 or ghosted compile local/applications/swe/acme/

TUI Improvements:

Help view is now a centered overlay dialog
Expanded help with CLI commands and tips

2026-01-16: All Core Agents Implemented (Tasks 1-4)

Files created:

internal/agent/resume_generator.go - Resume Generator Agent
internal/agent/resume_generator_test.go - 14 tests
internal/agent/coverletter_generator.go - Cover Letter Generator Agent
internal/agent/coverletter_generator_test.go - 13 tests
internal/agent/reviewer.go - Hiring Manager Review Agent
internal/agent/reviewer_test.go - 17 tests
internal/agent/tracker.go - Tracker Integration Agent
internal/agent/tracker_test.go - 23 tests

Total: 67 new tests (80+ total in agent package)

Resume Generator Agent features:

CV loading from JSON Resume format
Typst template loading
Skill matching between CV and job requirements
Output path generation with job-type folders
PDF compilation via Typst CLI
Prompt generation for AI-based tailoring

Cover Letter Generator Agent features:

CV and resume loading for consistency
Relevant experience extraction (scored by job match)
Three-section structure (Hook, Bridge, Value)
Typst output with modern-cv template

Hiring Manager Review Agent features:

Document loading and validation
Weighted scoring (60% resume, 40% cover letter)
Requirement match analysis (met/missing/bonus)
Approval threshold: 70/100
Recommendation categories: Approve, Approve with edits, Revise, Not recommended
Detailed feedback with strengths, weaknesses, suggestions

Tracker Integration Agent features:

Application creation in ghosted store
Notes generation from tech stack, score, requirements
Job type auto-detection (fe-dev, ux-design, product-design, swe)
Application folder path generation
Posting archival to processed folder
Rejection feedback file creation

Run tests:

go test -v ./internal/agent/...

2026-01-16: Agent Prompt Templates (Task 7)

Files created:

internal/agent/prompts/parser.md - Job posting parser prompt
internal/agent/prompts/resume.md - Resume generator prompt with Typst format
internal/agent/prompts/cover.md - Cover letter generator prompt with Typst format
internal/agent/prompts/reviewer.md - Hiring manager review prompt with scoring criteria
internal/agent/prompts/tracker.md - Tracker integration prompt for ghosted CLI

Key features:

Parser: Structured JSON output with requirements, tech_stack, keywords, company_values
Resume: Typst template with tailoring principles, keyword optimization, skills highlighting
Cover Letter: Three-section structure (Hook, Bridge, Value) with tone guidelines
Reviewer: Scoring criteria (40% requirements, 30% experience, 20% communication, 10% culture)
Tracker: CLI command generation with file organization and validation

2026-01-16: Posting Parser Agent (Task 2)

Files created/modified:

internal/agent/parser.go - ParserAgent implementation with:
- File type detection (.md, .txt, .png, .jpg, .jpeg)
- Content reading for text and image markers
- System/user prompt generation for AI parsing
- JSON parsing and validation
- Output schema definition
internal/agent/parser_test.go - Test coverage for all parser methods
internal/agent/config.go - Updated ParsedPosting struct with new fields:
- Team, BonusSkills, Keywords, CompanyValues
internal/agent/pipeline.go - Updated runParserStep() to use ParserAgent
- Added extractBasicInfo() fallback for basic text extraction

Test coverage:

TestParserAgent_SupportedExtensions - File type detection
TestParserAgent_IsImageFile - Image identification
TestParserAgent_ReadPosting - File reading
TestParserAgent_ParseJSON - JSON parsing with validation
TestParserAgent_GetSystemPrompt - Prompt content verification
TestExtractBasicInfo - Basic info extraction fallback

2026-01-16: Pipeline Architecture (Task 1)

Files created:

internal/agent/config.go - Agent types, config structs, data models
internal/agent/pipeline.go - Pipeline orchestrator with state management
local/document-generation/.agent/config.json - Runtime configuration

Key decisions implemented:

Sequential pipeline: Parser → Resume → Cover → Reviewer → Tracker
JSON intermediate format between agents
State persistence for resume/retry capability
Typst for PDF generation

Data types defined:

ParsedPosting - Structured job posting data
GeneratedDocuments - Paths to generated .typ and .pdf files
ReviewResult - Approval status, score, and feedback

Status Legend

[ ] - Not started
[~] - In progress
[x] - Completed
[!] - Blocked

How to Update This File

When working on a task:

Change status from [ ] to [~]
Add notes about approach/blockers
Update "Current Focus" section

When completing a task:

Change status from [~] to [x]
Update "Current Focus" to next task
Add completion notes if relevant

Use vibe kanban MCP to sync:

mcp__vibe_kanban__update_task(task_id, status="inprogress"|"done")

FilesExpand file tree

PROGRESS.md

Latest commit

History

PROGRESS.md

File metadata and controls

Multi-Agent Document Generation Pipeline - Progress Tracker

Overview

Task Progress

Phase 1: Foundation

Phase 2: Core Agents

Phase 3: Integration

Bug Fixes & Improvements

Phase 4: Agent Automation & Training Data (After Phase 3)

GitHub Issues

Workflow

File Structure (Planned)

Data Schemas

Parsed Job Posting (Parser → Generators)

Review Result (Reviewer → Tracker)

CLI Commands (Planned)

Current Focus

Agent Interaction Logging (Optional, Privacy-First)

Design Principles

Three Modes

Report Location (when local logging enabled)

Use Cases

Manual Session Log (Developer Reference)

Training Data Pipeline (Planned)

Phase 1: Collection

Phase 2: Export

Phase 3: Fine-tuning

Phase 4: Integration

Session Notes (for context reset)

What's Done

What's Next

Completed Work Log

2026-01-16: Unified Fetch Command Documentation

2026-01-16: TUI Clipboard Copy for Claude Prompt

2026-01-16: Filename Validation for Fetched Postings

2026-01-16: Microsoft Careers Site Fetcher (#18)

2026-01-16: Unified Fetch Command + TUI Fetch View

2026-01-16: Context and Compile Commands

2026-01-16: All Core Agents Implemented (Tasks 1-4)

2026-01-16: Agent Prompt Templates (Task 7)

2026-01-16: Posting Parser Agent (Task 2)

2026-01-16: Pipeline Architecture (Task 1)

Status Legend

How to Update This File