Summary
Create variables for standard income concepts used by major inequality databases (WID, LIS, OECD), starting with WID's pre-tax national income. Also add standardized summary statistic computation (Gini, percentile shares, etc.) that matches the methodology of these external sources.
Motivation
PolicyEngine computes household_net_income and household_market_income, but these don't map directly to the income concepts used by the World Inequality Database (WID), OECD, or LIS. This makes it hard to:
- Validate against external benchmarks: e.g., Our World in Data (OWID) reports a US Gini of 0.587 (2023) using WID pre-tax national income, but PE's net income Gini is 0.565 and its market income Gini is 0.643. Neither matches.
- Compare across countries: WID's methodology is consistent across countries, so implementing it in PE would enable apples-to-apples comparisons.
- Forecast for prediction markets: there are active prediction markets on OWID Gini values that PE microsim could inform, but only if we compute the same income concept.
WID pre-tax national income definition
From WID methodology:
- Pre-tax national income = factor income (labor + capital) + pensions + unemployment insurance, before taxes and other transfers
- Unit: equal split among adults (20+) in the same household, i.e., total household pre-tax national income divided equally among adult members
- Population: Adults aged 20+
Key differences from PE's current variables:
- Includes pension/UI income (unlike pure market income)
- Excludes means-tested transfers (unlike net income)
- Equal-split among adults only (not per-capita, not equivalized)
Proposed implementation
New variables in policyengine-us
# Person-level
wid_pretax_national_income_person # = (labor + capital + pensions + UI) / n_adults_in_household
Components:
- employment_income + self_employment_income (labor)
- capital_gains + dividend_income + interest_income + rental_income (capital)
- social_security + pension_income (pensions)
- unemployment_compensation (UI)
- Divided by number of adults (20+) in the tax unit/household
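The equal-split step can be sketched with plain NumPy arrays before it is wired into an actual PolicyEngine variable. This is only an illustration: the function name, its arguments, and the treatment of edge cases (non-adults get 0, households with no adults get 0) are assumptions, not part of policyengine-us or of WID's published code.

```python
import numpy as np

ADULT_AGE = 20  # equal-split population from the definition above: adults aged 20+


def equal_split_pretax_income(household_id, age, labor, capital, pensions, ui):
    """Hypothetical helper: household pre-tax national income split equally among adults.

    All inputs are person-level arrays of the same length.
    Non-adults receive 0 and should be dropped from the population
    before computing any inequality statistic.
    """
    household_id = np.asarray(household_id)
    is_adult = np.asarray(age) >= ADULT_AGE
    person_income = (
        np.asarray(labor, dtype=float)
        + np.asarray(capital, dtype=float)
        + np.asarray(pensions, dtype=float)
        + np.asarray(ui, dtype=float)
    )

    # Map arbitrary household ids to 0..n_households-1.
    _, hh_index = np.unique(household_id, return_inverse=True)
    hh_total = np.bincount(hh_index, weights=person_income)
    hh_adults = np.bincount(hh_index, weights=is_adult.astype(float))

    # Assumption: a household with no adults gets a share of 0 instead of dividing by zero.
    share = np.divide(
        hh_total, hh_adults, out=np.zeros_like(hh_total), where=hh_adults > 0
    )
    return np.where(is_adult, share[hh_index], 0.0)


# Household 1: two adults and a child; household 2: a single pensioner.
income = equal_split_pretax_income(
    household_id=[1, 1, 1, 2],
    age=[40, 38, 10, 70],
    labor=[60_000, 20_000, 0, 0],
    capital=[1_000, 0, 0, 0],
    pensions=[0, 0, 0, 30_000],
    ui=[0, 0, 0, 0],
)
print(income.tolist())  # [40500.0, 40500.0, 0.0, 30000.0]
```

Whether the divisor should count adults in the household or in the tax unit is left open above; the sketch groups by whatever id array is passed in.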
Consider for policyengine-core or policyengine.py
If the income concept definition is common across countries (which it is for WID), the variable template or summary stat functions could live in a shared package:
- wid_pretax_national_income as a cross-country variable pattern
- Standardized gini(variable, population_filter, weighting) that matches WID's methodology
- Percentile share computation matching WID conventions
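As a starting point for the shared summary statistics, a weighted Gini and a top-share function could look like the sketch below. The names weighted_gini and top_share and their signatures are hypothetical, and the method is a plain trapezoid rule on the empirical Lorenz curve. It is not claimed to reproduce WID's own estimation procedure, so matching published WID figures would still need to be checked.

```python
import numpy as np


def _lorenz(values, weights, population_filter=None):
    """Cumulative population and income shares, each starting at 0."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if population_filter is not None:
        mask = np.asarray(population_filter, dtype=bool)
        values, weights = values[mask], weights[mask]
    order = np.argsort(values)
    v, w = values[order], weights[order]
    cum_w = np.cumsum(w)
    cum_vw = np.cumsum(v * w)
    # Assumes positive total weight and non-zero total income.
    p = np.concatenate([[0.0], cum_w / cum_w[-1]])
    lorenz = np.concatenate([[0.0], cum_vw / cum_vw[-1]])
    return p, lorenz


def weighted_gini(values, weights, population_filter=None):
    """Gini = 1 - 2 * (area under the Lorenz curve), via the trapezoid rule."""
    p, lorenz = _lorenz(values, weights, population_filter)
    area = np.sum((p[1:] - p[:-1]) * (lorenz[1:] + lorenz[:-1]) / 2)
    return 1 - 2 * area


def top_share(values, weights, fraction, population_filter=None):
    """Income share of the top `fraction` of the population (e.g. 0.1 for the top 10%)."""
    p, lorenz = _lorenz(values, weights, population_filter)
    # Linear interpolation of the Lorenz curve at the cutoff percentile.
    return 1 - np.interp(1 - fraction, p, lorenz)


# One person holds all income among four equally weighted people.
print(weighted_gini([0, 0, 0, 1], [1, 1, 1, 1]))  # 0.75
```

Passing the adult (20+) mask as population_filter keeps the statistic on the same population as the equal-split income concept.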
Acceptance criteria
- wid_pretax_national_income_person variable computes correctly
References