Context
Marina Gindelsky at BEA uses PolicyEngine to calculate taxes and transfers on CPS ASEC microdata that she's scaled to NIPA totals. Currently she goes through policyengine-taxsim's TAXSIM format, but this is limiting — TAXSIM's input format doesn't cover transfer variables (SNAP, Medicaid, etc.) and requires lossy translation from CPS's native structure.
More broadly, many researchers and agencies have their own microdata (CPS, ACS, SCF, SIPP, custom surveys) and want to run PolicyEngine calculations on it. The current options are:
- policyengine-taxsim — limited to TAXSIM's input/output format
- policyengine-us Simulation API — powerful but requires manually mapping every variable to PE's entity structure
- policyengine-us-data — builds our canonical dataset, not designed for user-supplied data
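To illustrate the manual mapping that the Simulation API route involves, here is a minimal sketch that nests one flat household record into a situation dictionary. The helper `build_situation` and the input keys `age` and `wages` are hypothetical, and the entity layout reflects the situation format as I understand it, so check it against the policyengine-us documentation. Families and marital units are left out for brevity and may be needed in practice.

```python
def build_situation(members, state, year=2024):
    """Nest flat person rows under the group entities PE expects.

    `members` is a list of dicts with hypothetical keys "age" and "wages".
    Assumes everyone shares one tax unit, SPM unit, and household,
    which real survey households often do not.
    """
    people = {}
    for i, row in enumerate(members, start=1):
        people[f"person_{i}"] = {
            "age": {year: row["age"]},
            "employment_income": {year: row["wages"]},
        }
    ids = list(people)
    return {
        "people": people,
        "tax_units": {"tax_unit": {"members": ids}},
        "spm_units": {"spm_unit": {"members": ids}},
        "households": {
            "household": {"members": ids, "state_name": {year: state}}
        },
    }


situation = build_situation(
    [{"age": 40, "wages": 30_000}, {"age": 8, "wages": 0}], state="CA"
)
# With policyengine-us installed, this dictionary would then be passed to
# policyengine_us.Simulation(situation=situation) before calling calculate().
```

Even in this two-person case the caller has to decide unit membership by hand, which is the step a survey connector would automate.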
Proposal: policyengine-survey-calculator (or similar name)
A new repo/package that:
- Accepts common survey microdata formats with prebuilt connectors:
  - CPS ASEC (Marina's immediate need)
  - ACS
  - SCF
  - SIPP
  - Flat CSV with a documented schema
- Maps survey variables to PolicyEngine's input variables: each connector knows how to translate (e.g., CPS WSAL_VAL -> PE employment_income, CPS household structure -> PE tax units/SPM units)
- Runs PolicyEngine calculations and returns results merged back onto the original microdata
- Exports results in the original survey format or flat files
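The translate, calculate, and merge-back steps could look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Connector` class, the `run` helper, and the `pe_` output prefix are hypothetical, and `fake_calculate` is a stand-in for the real PolicyEngine call, not an actual benefit rule. `A_AGE`, `WSAL_VAL`, and `MARSUPWT` are used as example CPS ASEC column names. Household-to-tax-unit assignment, the harder part of a real connector, is not shown.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, float]


@dataclass
class Connector:
    """Hypothetical connector: renames survey columns to PE input names."""

    name: str
    column_map: dict[str, str]  # survey column -> PE input variable

    def to_pe_inputs(self, record: Record) -> Record:
        # Keep only the mapped columns, renamed to PE variable names.
        return {pe: record[src] for src, pe in self.column_map.items()}


def run(
    connector: Connector,
    records: list[Record],
    calculate: Callable[[Record], Record],
) -> list[Record]:
    """Translate each row, calculate, and merge results onto the original row."""
    merged = []
    for record in records:
        outputs = calculate(connector.to_pe_inputs(record))
        # Prefix outputs so they cannot overwrite survey columns.
        merged.append({**record, **{f"pe_{k}": v for k, v in outputs.items()}})
    return merged


# Assumed mapping for illustration; a real CPS connector would cover far more.
cps = Connector("cps_asec", {"A_AGE": "age", "WSAL_VAL": "employment_income"})


def fake_calculate(inputs: Record) -> Record:
    # Placeholder for the PolicyEngine call; NOT a real benefit rule.
    eligible = inputs["employment_income"] < 20_000
    return {"example_benefit": 100.0 if eligible else 0.0}


rows = [{"A_AGE": 34, "WSAL_VAL": 15_000, "MARSUPWT": 1200.5}]
result = run(cps, rows, fake_calculate)
```

Unmapped columns such as the weight pass through untouched, so the output keeps the original survey layout plus the prefixed PE results.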
Separation of concerns
| Repo | Responsibility |
| --- | --- |
| policyengine-us | Tax/benefit rules only |
| policyengine-us-data | Building the best unified microdata file (Enhanced CPS) |
| policyengine-taxsim | TAXSIM format emulation specifically |
| New repo | "Bring your own survey data, get PE calculations back" |
The CPS connector code could be shared with policyengine-us-data (e.g., as a dependency or shared utility), since both need to parse CPS ASEC structure into PE entities.
Open questions
- Repo name: policyengine-survey-calculator? policyengine-surveys? policyengine-microdata?
- Should policyengine-taxsim eventually become a thin wrapper around this + a TAXSIM format adapter?
- How much connector logic can be shared with policyengine-us-data's CPS processing?
- Should this support UK surveys too (FRS, SPI) or stay US-focused initially?
Immediate motivation
Marina at BEA needs transfer variable calculations (SNAP, Medicaid, SSI, etc.) on her CPS-based microdata. The TAXSIM format can't express these inputs/outputs. A CPS connector that accepts her data directly would solve this cleanly.