Data enrichment pipeline for Federal Register executive orders. Uses OpenAI to generate summaries, themes, impact analysis, and potential concerns.
Live site: What Got Signed? (or run locally with the included Express server)
- Fetch: Download executive orders from the Federal Register API
- Enrich: Use OpenAI to analyze each order with a two-pass approach using a static taxonomy:
  - Pass 1 (gpt-4.1-mini): Plain-language summary and thematic categorization from static taxonomy
  - Pass 2 (gpt-4.1-mini): Population impact analysis and potential concerns, using themes from Pass 1
  - LLM selects from a comprehensive static taxonomy (280 themes, 158 populations)
  - Suggestions for new taxonomy items are logged for human review
- Aggregate: Generate term summaries and timeline data (fast, no API calls)
- Generate Narratives: LLM-generated summaries and impact analysis per presidential term, quarter, and theme (uses OpenAI API)
- Weekly Digest: LLM-generated weekly narrative of EOs signed that week — appears on the homepage, regenerated automatically each week with staleness detection
- Automated Daily Sync: GitHub Actions cron runs every day at 6am ET — fetches new EOs, enriches, aggregates, updates narratives, and commits data back to the repo
- Web Frontend: Express server with a clean UI to browse executive orders by term, quarter, or theme
npm install
npm run build
Copy the example environment file and add your OpenAI API key:
cp .env.example .env
Then edit .env with your API key from https://platform.openai.com/api-keys
flowchart LR
A[Fetch executive orders from Federal Register API] --> B[Enrich data using static taxonomy]
B --> C[Aggregate data for presidential term and quarterly timelines]
C --> D[Generate narratives for detailed EO reviews]
B --- E[Pass 1 - gpt 4.1 mini: summaries + themes]
E --- F[Pass 2 - gpt 4.1 mini: populations + concerns]
Download executive orders for a specific year:
npm run fetch -- --year 2025
Fetch a range of years:
npm run fetch -- --from 2020 --to 2025
Or fetch a single executive order by number:
npm run fetch -- --eo 14350
Raw data is saved to data/raw/executive-orders.json.
Process orders through OpenAI for enrichment. The enrichment uses a two-pass approach with a static taxonomy:
- Pass 1 (gpt-4.1-mini): Generates the summary and identifies themes from the taxonomy
- Pass 2 (gpt-4.1-mini): Identifies impacted populations and potential concerns, using the themes from Pass 1 for context
The LLM selects tags from a comprehensive static taxonomy (what-got-signed/data/taxonomy.json). If the LLM suggests a new tag, it's logged to metadata-config/executive_order_taxonomy_guide.md for human review rather than being automatically added.
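The select-or-suggest rule can be sketched as a small filter. This is a minimal illustration under stated assumptions, not the code in src/enrich.ts: the function name `partitionTags`, the `TagPartition` shape, and the case-insensitive matching are all hypothetical.

```typescript
// Hypothetical helper: split tags returned by the LLM into those that exist
// in the static taxonomy and those that should be logged for human review.
interface TagPartition {
  accepted: string[];  // tags found in the taxonomy (canonical spelling)
  suggested: string[]; // unknown tags, to be logged rather than added
}

function partitionTags(llmTags: string[], taxonomyTags: string[]): TagPartition {
  // Index the taxonomy by a normalized key so casing or whitespace
  // differences in the model output are not mistaken for new tags.
  const known: Record<string, string> = Object.create(null);
  for (const tag of taxonomyTags) {
    known[tag.trim().toLowerCase()] = tag;
  }

  const accepted: string[] = [];
  const suggested: string[] = [];
  for (const raw of llmTags) {
    const cleaned = raw.trim();
    const canonical = known[cleaned.toLowerCase()];
    if (canonical !== undefined) {
      if (accepted.indexOf(canonical) === -1) accepted.push(canonical);
    } else if (suggested.indexOf(cleaned) === -1) {
      suggested.push(cleaned);
    }
  }
  return { accepted, suggested };
}

// Example: two known tags and one tag the taxonomy does not contain.
const partition = partitionTags(
  ["border enforcement", "Visa Policy", "Space Mining"],
  ["Border Enforcement", "Visa Policy"],
);
console.log(partition.accepted);  // tags to store on the order
console.log(partition.suggested); // tags to log for review
```

Keeping unknown tags out of the stored result means the taxonomy only grows through human review, which matches the behavior described above.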
# Enrich all orders from a specific year
npm run enrich -- --year 2025
# Limit number of orders to process
npm run enrich -- --year 2025 --limit 5
# Re-process already enriched orders (includes new ones too)
npm run enrich -- --force
# Re-enrich only already-enriched orders (no new ones)
npm run enrich -- --existing-only
# Re-enrich a specific executive order
npm run enrich -- --eo 14350
Enriched data is saved to data/enriched/.
Generate term summaries and timeline data (no API calls required):
# Aggregate all data
npm run aggregate
# Aggregate for a specific president
npm run aggregate -- --president trump
Aggregated data is saved to data/aggregated/:
- term-summaries.json - Summary data per presidential term with top themes
- timeline.json - Quarterly timeline data with theme summaries
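Because aggregation needs no API calls, it reduces to grouping and counting over the enriched files. The sketch below shows one way quarterly theme counts could be computed. It is not the code in src/aggregate.ts: the `quarterKey` format ("2025-Q1"), the function names, and the tie-breaking rule are assumptions, and only the `signing_date` and `theme_ids` fields come from the enriched order format.

```typescript
// Minimal shape of an enriched order, reduced to the fields used here.
interface EnrichedOrder {
  signing_date: string; // "YYYY-MM-DD"
  enrichment: { theme_ids: string[] };
}

// "2025-01-20" -> "2025-Q1". The key format is an assumption for this sketch.
function quarterKey(signingDate: string): string {
  const year = signingDate.slice(0, 4);
  const month = parseInt(signingDate.slice(5, 7), 10);
  const quarter = Math.floor((month - 1) / 3) + 1;
  return year + "-Q" + quarter;
}

// Count theme occurrences per quarter and keep the `topN` most frequent.
function topThemesByQuarter(orders: EnrichedOrder[], topN: number): Record<string, string[]> {
  const counts: Record<string, Record<string, number>> = {};
  for (const order of orders) {
    const key = quarterKey(order.signing_date);
    const bucket = counts[key] || (counts[key] = {});
    for (const theme of order.enrichment.theme_ids) {
      bucket[theme] = (bucket[theme] || 0) + 1;
    }
  }

  const result: Record<string, string[]> = {};
  for (const key of Object.keys(counts)) {
    const bucket = counts[key];
    result[key] = Object.keys(bucket)
      // Sort by count descending, then alphabetically for a stable order.
      .sort((a, b) => bucket[b] - bucket[a] || a.localeCompare(b))
      .slice(0, topN);
  }
  return result;
}

const sampleOrders: EnrichedOrder[] = [
  { signing_date: "2025-01-20", enrichment: { theme_ids: ["immigration", "national-security"] } },
  { signing_date: "2025-02-03", enrichment: { theme_ids: ["immigration"] } },
  { signing_date: "2025-04-10", enrichment: { theme_ids: ["economy-trade"] } },
];
console.log(topThemesByQuarter(sampleOrders, 1));
// → { "2025-Q1": ["immigration"], "2025-Q2": ["economy-trade"] }
```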
Generate LLM-powered narrative summaries for presidential terms, quarterly periods, and themes. This step uses the OpenAI API and may incur costs:
# Generate all narratives (term + quarterly + theme): creates missing narratives and updates stale ones
npm run generate-narratives
# Generate term narratives only
npm run generate-narratives -- --type term
# Generate quarterly narratives only
npm run generate-narratives -- --type quarterly
# Generate theme narratives only
npm run generate-narratives -- --type theme
# Generate quarterly narratives for a specific year
npm run generate-narratives -- --type quarterly --year 2025
# Generate narrative for a specific quarter
npm run generate-narratives -- --type quarterly --year 2025 --quarter 1
# Filter by president (term narratives only)
npm run generate-narratives -- --president trump
# Filter by theme (theme narratives only)
npm run generate-narratives -- --type theme --theme immigration
# Force regeneration (skip staleness checks)
npm run generate-narratives -- --force
# Check what needs updating (no regeneration, no API calls)
npm run generate-narratives -- --check
Outputs:
- data/aggregated/narratives.json - Term narratives with summary and potential impact paragraphs
- data/aggregated/quarterly-narratives.json - Quarterly narratives with summary and potential impact paragraphs
- data/aggregated/theme-narratives.json - Theme narratives with summary and potential impact paragraphs
- data/aggregated/weekly-narrative.json - Current week's digest narrative with linked EO list
Smart incremental generation: Narratives automatically detect when they're stale based on the enriched_at timestamps of underlying orders. If you enrich new orders in Q3 2025 with 3 themes, running generate-narratives will automatically regenerate only:
- The Q3 2025 quarterly narrative
- The 3 affected theme narratives
- The relevant term narrative(s)
Use --check to preview what would be regenerated without making API calls.
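The staleness rule described above amounts to a timestamp comparison. The following is a minimal sketch of that idea, assuming each stored narrative records when it was generated. The `generated_at` field name and the `isStale` function are hypothetical; only `enriched_at` appears in the enriched order format.

```typescript
interface OrderStamp {
  enriched_at: string; // ISO 8601 timestamp from the enriched order
}

interface NarrativeStamp {
  generated_at: string; // hypothetical field: when the narrative was written
}

// A narrative is stale if it does not exist yet (and there are orders to
// describe), or if any underlying order was enriched after it was generated.
function isStale(narrative: NarrativeStamp | undefined, orders: OrderStamp[]): boolean {
  if (narrative === undefined) return orders.length > 0;
  const generatedAt = Date.parse(narrative.generated_at);
  return orders.some((order) => Date.parse(order.enriched_at) > generatedAt);
}

const q3Narrative: NarrativeStamp = { generated_at: "2025-09-01T00:00:00.000Z" };
console.log(isStale(q3Narrative, [{ enriched_at: "2025-09-15T10:30:00.000Z" }])); // → true
console.log(isStale(q3Narrative, [{ enriched_at: "2025-08-15T10:30:00.000Z" }])); // → false
```

Running the same check per quarter, per theme, and per term, each against only its own orders, gives the selective regeneration listed above.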
Run the entire pipeline (fetch, enrich, aggregate, generate narratives) for a given year:
# Run full pipeline for 2025
npm run pipeline -- --year 2025
# Skip fetching (use existing raw data)
npm run pipeline -- --year 2025 --skip-fetch
# Skip narrative generation
npm run pipeline -- --year 2025 --skip-narratives
# Force re-enrichment and narrative regeneration
npm run pipeline -- --year 2025 --force
Check for new executive orders and automatically process them:
# Check current year for new EOs and process them
npm run update
# Check a specific year
npm run update -- --year 2025
# Check only (don't process, just show what's new)
npm run update -- --check
# Skip narrative generation
npm run update -- --skip-narratives
This command:
- Queries the Federal Register API for the current year
- Compares against existing enriched files
- If new EOs are found: fetches, enriches, aggregates, and updates narratives
- Uses smart staleness detection to only regenerate affected narratives
- Generates (or skips if current) the weekly digest narrative
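The comparison step in this list is a set difference over executive order numbers. A minimal sketch, assuming both sides have already been reduced to arrays of EO numbers (the function name `findNewOrders` is hypothetical, and the real logic in src/sync.ts may differ):

```typescript
// Return EO numbers that the API reports but that have no enriched file yet,
// in ascending order so they are processed oldest first.
function findNewOrders(apiNumbers: number[], enrichedNumbers: number[]): number[] {
  const seen: Record<number, boolean> = {};
  for (const n of enrichedNumbers) seen[n] = true;
  return apiNumbers.filter((n) => !seen[n]).sort((a, b) => a - b);
}

console.log(findNewOrders([14350, 14348, 14349], [14348])); // → [14349, 14350]
```

An empty result corresponds to the case where nothing new was signed and no further processing is needed.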
Start the Express server to browse executive orders:
cd what-got-signed
npm install
node server.js
Then open http://localhost:3000 in your browser.
- This Week section on the homepage — weekly LLM digest of recent EOs with links to detail pages
- Quarterly timeline with horizontal scroll, filterable by year via multi-select dropdown
- Detail pages for presidential terms, quarters, and themes with LLM-generated narratives
- Definitions page with sticky category headers for browsing themes and populations
- Back-to-top button appears after scrolling past the viewport height
- Friendly messaging for themes with limited data (callout shown instead of empty sections)
President names are automatically styled with avatars throughout the site. The system derives avatar filenames from full names by lowercasing, removing periods, and replacing spaces with hyphens.
Current avatars (place in what-got-signed/public/avatars/):
- barack-obama.jpg
- donald-trump.jpg
- george-w-bush.jpg
- joseph-r-biden-jr.jpg
Adding a new president: When a new president takes office, run the data pipeline to fetch and enrich their executive orders. The president's name comes from the Federal Register API, and the avatar ID is derived from it. Add the avatar image with the matching filename; no other changes are needed.
To find the correct filename for a new president, convert their full name: lowercase, remove periods, replace spaces with hyphens. For example, "Jane A. Smith Jr." becomes jane-a-smith-jr.jpg.
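The naming rule is short enough to express directly. The sketch below follows the three steps stated above; the function name `avatarFilename` is hypothetical, and collapsing runs of whitespace into a single hyphen is an added assumption that the frontend code may not share.

```typescript
// Derive an avatar filename from a president's full name:
// lowercase, remove periods, replace spaces with hyphens.
function avatarFilename(fullName: string): string {
  const slug = fullName
    .toLowerCase()
    .replace(/\./g, "")    // "jane a. smith jr." -> "jane a smith jr"
    .trim()
    .replace(/\s+/g, "-"); // assumption: runs of whitespace become one hyphen
  return slug + ".jpg";
}

console.log(avatarFilename("Jane A. Smith Jr."));   // → jane-a-smith-jr.jpg
console.log(avatarFilename("Joseph R. Biden Jr.")); // → joseph-r-biden-jr.jpg
```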
The API accepts same-origin requests only. This means:
- Your frontend at whatgotsigned.com can call whatgotsigned.com/api/* ✓
- Your local dev at localhost:3000 can call localhost:3000/api/* ✓
- Another site at evil-site.com trying to call your API → blocked ✗
No configuration needed - it works automatically in both development and production.
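One way to implement such a check is to compare the request's Origin header against the Host the server was reached on. The sketch below illustrates that approach only; it is not taken from server.js. The function name `isAllowedOrigin`, the `extraOrigins` parameter (standing in for any additional domains you choose to allow), and the decision to accept requests with no Origin header are all assumptions.

```typescript
// Decide whether a request may use the API, given its Origin header (if any),
// the Host header of the server, and an optional list of extra origins.
function isAllowedOrigin(
  origin: string | undefined,
  host: string,
  extraOrigins: string[],
): boolean {
  // Assumption: a missing Origin header is treated as same-origin, since
  // browsers omit it on ordinary same-origin GET requests.
  if (origin === undefined) return true;
  if (extraOrigins.indexOf(origin) !== -1) return true;
  try {
    // Same origin when the Origin's host (including port) matches the Host header.
    return new URL(origin).host === host;
  } catch {
    return false; // malformed Origin header
  }
}

console.log(isAllowedOrigin("https://whatgotsigned.com", "whatgotsigned.com", [])); // → true
console.log(isAllowedOrigin("http://localhost:3000", "localhost:3000", []));        // → true
console.log(isAllowedOrigin("https://evil-site.com", "whatgotsigned.com", []));     // → false
```

Because the comparison uses the incoming Host header rather than a hard-coded domain, the same logic would work unchanged in development and production.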
Adding extra allowed origins (optional): If you need to allow additional domains (e.g., an admin dashboard on a different subdomain), set the ALLOWED_ORIGINS environment variable:
ALLOWED_ORIGINS=https://admin.whatgotsigned.com node server.js
The main taxonomy defines all available themes and populations. The LLM selects from this taxonomy during enrichment. The frontend generates flat registries from this hierarchical structure on-the-fly:
{
  "themes": {
    "national_security_defense": ["Military Readiness / Force Structure", "..."],
    "immigration": ["Border Enforcement", "Visa Policy", "..."],
    "economy_trade": ["Trade Policy / Agreements", "..."]
  },
  "impacted_populations": {
    "demographic_groups": {
      "racial_ethnic": ["African Americans / Black Americans", "..."]
    },
    "employment_sectors": {
      "government": ["Federal Employees / Civil Servants", "..."]
    }
  }
}
Each enriched order includes:
{
  "document_number": "2025-12345",
  "executive_order_number": 14350,
  "title": "Executive Order Title",
  "signing_date": "2025-01-20",
  "president": {
    "name": "Donald Trump",
    "identifier": "donald-trump"
  },
  "html_url": "https://www.federalregister.gov/...",
  "raw_text_url": "https://www.federalregister.gov/.../raw_text",
  "enrichment": {
    "summary": "Plain-language summary of the order...",
    "theme_ids": ["immigration-enforcement", "national-security"],
    "impacted_populations": {
      "positive_ids": ["border-patrol-agents"],
      "negative_ids": ["undocumented-immigrants"]
    },
    "potential_concerns": [
      "Implementation may strain agency resources.",
      "Could face legal challenges on constitutional grounds."
    ],
    "enriched_at": "2025-01-15T10:30:00.000Z",
    "model_used": "gpt-4.1-mini"
  }
}
federal-register-analytics/
├── src/
│   ├── types.ts              # TypeScript type definitions
│   ├── config.ts             # Configuration constants (retry settings, paths, models)
│   ├── utils.ts              # Utility functions (withRetry, requireJson, taxonomy loaders)
│   ├── fetch.ts              # Federal Register API fetching (with retry on 429)
│   ├── enrich.ts             # OpenAI enrichment logic (two-pass with static taxonomy, typed retries)
│   ├── taxonomy.ts           # Taxonomy loader and formatter
│   ├── aggregate.ts          # Data aggregation (term summaries, timeline)
│   ├── narratives.ts         # LLM-generated narratives (term, quarterly, theme, weekly)
│   ├── sync.ts               # Incremental sync logic (new EOs only)
│   ├── index.ts              # Main exports
│   └── cli/                  # CLI entry points
│       ├── fetch.ts
│       ├── enrich.ts
│       ├── aggregate.ts
│       ├── narratives.ts
│       ├── pipeline.ts
│       └── sync.ts           # Entrypoint for npm run update (guards OPENAI_API_KEY)
├── src/__tests__/
│   └── utils.test.ts         # Vitest tests for withRetry and requireJson
├── .github/
│   └── workflows/
│       └── daily-sync.yml    # GitHub Actions cron (daily at 6am ET)
├── metadata-config/          # Taxonomy documentation
│   └── executive_order_taxonomy_guide.md  # Usage guide + LLM suggestions
├── what-got-signed/          # Web frontend (deployable standalone)
│   ├── server.js             # Express server (generates registries from taxonomy)
│   ├── views/                # EJS templates
│   │   ├── partials/         # Reusable components (header, footer, back-to-top)
│   │   ├── index.ejs         # Homepage with timeline
│   │   ├── detail.ejs        # Detail pages (term, quarter, theme)
│   │   └── definitions.ejs   # Theme and population definitions
│   ├── public/               # Static CSS, JS, images
│   └── data/                 # All data files
│       ├── taxonomy.json     # Master taxonomy (themes + populations)
│       ├── enriched/         # Enriched data (committed)
│       ├── aggregated/       # Aggregated data (committed)
│       └── raw/              # Raw API data (gitignored)
└── dist/                     # Compiled JavaScript (gitignored)
All data files are stored in what-got-signed/data/ for deployment simplicity:
- what-got-signed/data/taxonomy.json - Master taxonomy (committed)
- what-got-signed/data/enriched/ - Enriched executive order data (committed)
- what-got-signed/data/aggregated/ - Aggregated data (committed)
- what-got-signed/data/raw/ - Raw API data (gitignored)
To regenerate data locally:
- npm run fetch -- --from 2017 --to 2025 (fetches raw data from Federal Register API)
- npm run enrich -- --force (re-enrich with OpenAI)
- npm run aggregate (regenerate timeline and term summaries)
- npm run generate-narratives -- --force (regenerate LLM narratives)
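A GitHub Actions workflow (.github/workflows/daily-sync.yml) runs npm run update every day at 10:00 UTC (6am ET during daylight saving time, 5am ET otherwise, because GitHub Actions cron schedules are evaluated in UTC). It: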
- Checks for new EOs published that day
- Enriches, aggregates, and updates narratives if any are found
- Updates the weekly digest narrative (skips if already current for this week)
- Commits changed data files back to main and triggers a Vercel redeploy
Setup required: Add your OpenAI API key as a repository secret named OPENAI_API_KEY in GitHub → Settings → Secrets and variables → Actions.
The workflow is a no-op when there are no new EOs (no empty commit is created).
The what-got-signed/ folder is self-contained and can be deployed standalone:
cd what-got-signed
npm install
node server.js
On platforms like Railway or Render, set the root directory to what-got-signed/.
MIT