Add end-to-end test for calibration database build pipeline #556

Merged
MaxGhenis merged 1 commit into main from add-database-build-test on Feb 26, 2026

@MaxGhenis (Contributor)

Summary

  • Adds test_database_build.py that runs every ETL script in sequence (matching make database order) against a HuggingFace-hosted stratified CPS dataset
  • Validates the resulting SQLite database has correct tables, national targets, state income tax coverage (42+ states, CA > $100B), 435+ congressional district strata, valid policyengine-us variables, and 1000+ total targets
  • Catches API mismatches, broken imports, and data-loading errors before they reach production: the type of issue seen with #492 (Add state income tax revenue as calibration target) and #500 (Fix etl_state_income_tax.py API mismatches)
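
The validation step described above could look roughly like the following. This is a minimal sketch, not the contents of test_database_build.py: the table names come from the PR, but the `variable` column, the toy schema, and the helper names `check_database` and `build_toy_database` are assumptions made for illustration. It also checks key variables across all targets rather than national targets only.

```python
import sqlite3

# Table and variable names are taken from the PR description.
EXPECTED_TABLES = {"strata", "stratum_constraints", "targets"}
KEY_VARIABLES = {"snap", "social_security", "ssi"}


def check_database(conn, min_targets=1000):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    tables = {
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    missing = EXPECTED_TABLES - tables
    if missing:
        problems.append(f"missing tables: {sorted(missing)}")
        return problems  # later queries would fail without the tables

    (n_targets,) = conn.execute("SELECT COUNT(*) FROM targets").fetchone()
    if n_targets <= min_targets:
        problems.append(
            f"only {n_targets} targets, expected more than {min_targets}"
        )

    # Assumption: targets has a `variable` column (hypothetical name).
    found = {v for (v,) in conn.execute("SELECT DISTINCT variable FROM targets")}
    absent = KEY_VARIABLES - found
    if absent:
        problems.append(f"missing key variables: {sorted(absent)}")
    return problems


def build_toy_database():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE strata (stratum_id INTEGER PRIMARY KEY);
        CREATE TABLE stratum_constraints (stratum_id INTEGER, constraint_variable TEXT);
        CREATE TABLE targets (stratum_id INTEGER, variable TEXT, value REAL);
        """
    )
    conn.executemany(
        "INSERT INTO targets VALUES (?, ?, ?)",
        [(1, "snap", 1.0), (1, "social_security", 2.0), (1, "ssi", 3.0)],
    )
    return conn


if __name__ == "__main__":
    # The toy database has only three targets, so lower the threshold here.
    print(check_database(build_toy_database(), min_targets=2))
```

Returning a list of problems instead of asserting on the first failure lets a single run report every failed check at once, which is helpful when one broken ETL script affects several of them.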

Test plan

  • CI passes (the test itself takes ~2 minutes to run all 9 ETL scripts)
  • Verify test catches intentional breakage (e.g. rename a function in db_metadata.py)
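
The intentional-breakage check can be illustrated with a small sketch of running scripts in sequence and stopping at the first failure. Everything here is hypothetical: `run_in_sequence`, `demo`, and the three throwaway script names stand in for the real ETL scripts, and the actual test may invoke them differently.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_sequence(scripts):
    """Run each script with the current interpreter; stop at the first failure.

    Returns (failed_script_name, stderr), or None if every script exited with 0.
    """
    for script in scripts:
        result = subprocess.run(
            [sys.executable, str(script)],
            capture_output=True,
            text=True,
        )
        if result.returncode != 0:
            return script.name, result.stderr
    return None


def demo():
    """Build three throwaway scripts; the second one simulates a renamed function."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp = Path(tmp)
        good = tmp / "etl_good.py"
        good.write_text("print('ok')\n")
        broken = tmp / "etl_broken.py"
        # The helper was "renamed" but the call site still uses the old name.
        broken.write_text("def get_metadata():\n    return {}\n\nget_meta()\n")
        later = tmp / "etl_later.py"
        later.write_text("print('should not run')\n")
        return run_in_sequence([good, broken, later])


if __name__ == "__main__":
    print(demo())
```

Here the stale call raises a NameError, the script exits non-zero, and the runner reports `etl_broken.py` without running anything after it. A renamed function in a shared module would surface the same way, as an ImportError or AttributeError in whichever script still uses the old name.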

🤖 Generated with Claude Code

@MaxGhenis force-pushed the add-database-build-test branch from c33c39f to 94bf74f on February 26, 2026 00:26
Runs all ETL scripts (create_database_tables, create_initial_strata,
etl_national_targets, etl_age, etl_medicaid, etl_snap,
etl_state_income_tax, etl_irs_soi, validate_database) in sequence
and validates the resulting SQLite database for:
- Expected tables (strata, stratum_constraints, targets)
- National targets include key variables (snap, social_security, ssi)
- State income tax targets cover 42+ states with CA > $100B
- Congressional district strata for 435+ districts
- All target variables exist in policyengine-us
- Total target count > 1000

This prevents API mismatches and import errors from going undetected
when ETL scripts are modified.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@MaxGhenis force-pushed the add-database-build-test branch from 94bf74f to be5b15f on February 26, 2026 00:29
@MaxGhenis merged commit b8e4e40 into main on Feb 26, 2026
6 checks passed