Populate CPS inputs for SPM childcare formula #705

Merged
MaxGhenis merged 4 commits into main from codex/childcare-spm-cap-formula
May 7, 2026

@MaxGhenis (Contributor) commented Apr 9, 2026

Summary

  • carry CPS WKSWORK through the microdata build as weeks_worked
  • carry CPS PERRP through as is_unmarried_partner_of_household_head
  • preserve weeks_worked in Extended CPS even after the paired model PR gives it a future-year formula
  • impute clone-half weeks_worked with the other CPS-only labor inputs instead of donor-copying it
  • stop rebuilding spm_unit_capped_work_childcare_expenses inside policyengine-us-data
  • leave capped childcare to the model formula in policyengine-us and add regression coverage for the new input plumbing

Validation

  • uv run pytest -q tests/unit/test_extended_cps.py tests/unit/test_weeks_worked.py tests/unit/test_reference_partner.py tests/unit/datasets/test_cps_income_variables.py tests/unit/datasets/test_cps_helpers.py::test_validate_raw_cps_schema_accepts_constructed_tax_unit_id_column tests/unit/datasets/test_cps_helpers.py::test_validate_raw_cps_schema_requires_reference_partner_column tests/unit/test_employer_sponsored_insurance_premiums.py::test_raw_cps_schema_requires_esi_source_columns
  • uv run ruff check policyengine_us_data/datasets/cps/census_cps.py policyengine_us_data/datasets/cps/cps.py policyengine_us_data/datasets/cps/extended_cps.py tests/unit/test_extended_cps.py tests/unit/test_weeks_worked.py tests/unit/test_reference_partner.py tests/unit/datasets/test_cps_income_variables.py tests/unit/datasets/test_cps_helpers.py tests/unit/test_employer_sponsored_insurance_premiums.py
  • uv run ruff format --check policyengine_us_data/datasets/cps/census_cps.py policyengine_us_data/datasets/cps/cps.py policyengine_us_data/datasets/cps/extended_cps.py tests/unit/test_extended_cps.py tests/unit/test_weeks_worked.py tests/unit/test_reference_partner.py tests/unit/datasets/test_cps_income_variables.py tests/unit/datasets/test_cps_helpers.py tests/unit/test_employer_sponsored_insurance_premiums.py
  • git diff --check

Notes

  • On raw CPS 2024, SPM_WKXPNS is highly reproducible from other CPS inputs: MAE is about $12, 97.9% of units are within $1, and 99.1% are within $5.
  • SPM_CAPWKCCXPNS is not reproducible nearly as cleanly from current public CPS inputs, so this PR intentionally stops short of reconstructing the capped value in us-data.
  • Paired model PR: Add Census SPM work expense formulas policyengine-us#8247.
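
The reproducibility figures in the notes above (MAE, share of units within $1 and within $5) can be computed along these lines. This is a minimal sketch, not the script behind the reported numbers: the function name, argument names, and toy values are hypothetical, and the sketch is unweighted, whereas the original check may have used survey weights.

```python
def reproducibility_summary(reported, reconstructed, tolerances=(1.0, 5.0)):
    """Compare reported values (e.g. SPM_WKXPNS) with a reconstruction.

    Returns the mean absolute error and, for each tolerance, the share of
    units whose absolute error is at or below that tolerance.
    """
    abs_errors = [abs(r - c) for r, c in zip(reported, reconstructed)]
    n = len(abs_errors)
    summary = {"mae": sum(abs_errors) / n}
    for tol in tolerances:
        summary[f"share_within_{tol:g}"] = sum(e <= tol for e in abs_errors) / n
    return summary


# Toy values for illustration only; not CPS data.
reported = [0.0, 1200.0, 2500.0, 800.0]
reconstructed = [0.0, 1200.5, 2503.0, 900.0]
summary = reproducibility_summary(reported, reconstructed)
print(summary)
```

In the toy data a single large miss dominates the MAE while most units match closely, which is one way an MAE of about $12 can coexist with 97.9% of units being within $1.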

@MaxGhenis changed the title from "Use Census childcare capping formula" to "Populate CPS inputs for SPM childcare formula" on Apr 9, 2026

@baogorek (Collaborator) left a comment

I realize this is an early review and we're waiting on policyengine-us#7960, but I wanted to get some thoughts in here. Interesting that the tests are failing on state-level calibration of ACA. (I need to add that to my scorecard.)

Comment thread tests/unit/test_reference_partner.py Outdated
Comment thread tests/unit/test_reference_partner.py Outdated
Comment thread policyengine_us_data/datasets/cps/cps.py Outdated

vercel Bot commented Apr 10, 2026

The latest updates on your projects.

Project: pipeline-diagrams
Deployment: Error
Updated (UTC): Apr 11, 2026 0:57am

@MaxGhenis (Contributor, Author) commented

Addressed the review comments in 27169f8:

  • Replaced the brittle source-text tests with behavioral tests that call the CPS extraction functions.
  • Added a named PERRP unmarried-partner code mapping with Census CPS ASEC 2024 technical documentation cited in code.
  • Kept the WKSWORK test focused on generated weeks_worked output and clipping behavior.

Local checks:

  • uv run pytest -q tests/unit/test_reference_partner.py tests/unit/test_weeks_worked.py
  • uv run pytest -q tests/unit/test_extended_cps.py
  • ruff check tests/unit/test_reference_partner.py tests/unit/test_weeks_worked.py

Note: ruff check policyengine_us_data/datasets/cps/cps.py still reports pre-existing lint debt in that module (star import/unused locals), not introduced by this patch.

@MaxGhenis (Contributor, Author) commented

Follow-up for the unit-test failure on the new run: 89b58ac extracts derive_weeks_worked and tests that helper directly, while production still assigns cps["weeks_worked"] = derive_weeks_worked(person.WKSWORK).

The CI failure came from the test calling the full add_personal_income_variables path after package-spec state had been changed elsewhere in the full unit suite. Local full unit verification now passes with CI-style env:

  • HUGGING_FACE_TOKEN=dummy uv run pytest -q tests/unit -> 589 passed, 9 skipped
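
For readers without the diff open, a helper like the derive_weeks_worked mentioned above might look roughly as follows. This is a sketch under assumptions: the PR only states that the helper exists and that its output is clipped, so the 0 to 52 bounds and the float conversion here are guesses rather than the merged implementation.

```python
import numpy as np


def derive_weeks_worked(wkswork):
    """Turn raw CPS WKSWORK values into a weeks_worked array.

    ASSUMPTION: values are clipped to the range [0, 52]; the actual bounds
    used in policyengine-us-data are not stated in this conversation.
    """
    weeks = np.asarray(wkswork, dtype=float)
    return np.clip(weeks, 0, 52)


# Out-of-range inputs (negative or above 52) are pulled back to the bounds.
weeks = derive_weeks_worked([-1, 0, 26, 52, 60]).tolist()
print(weeks)
```

Keeping the logic in a small pure function is what lets the test call it directly without running the full add_personal_income_variables path.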

@MaxGhenis (Contributor, Author) commented

Integration follow-up: the Modal run failed in tests/integration/test_cps.py::test_add_personal_variables_maps_current_health_coverage_flags because that synthetic fixture does not include PERRP.

I pushed a narrow fix in this commit:

  • default missing PERRP to all-false when building is_unmarried_partner_of_household_head
  • added a unit test covering the missing-column fallback
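
The fallback described above can be sketched as follows. The function name is hypothetical, and the PERRP codes are placeholders for illustration; the real named mapping lives in the repository and should be checked against the Census CPS ASEC 2024 technical documentation.

```python
import pandas as pd

# ASSUMPTION: placeholder codes for illustration only. Verify the actual
# unmarried-partner PERRP codes against the Census CPS ASEC documentation.
UNMARRIED_PARTNER_PERRP_CODES = (43, 44, 46, 47)


def derive_is_unmarried_partner(person: pd.DataFrame) -> pd.Series:
    """Flag unmarried partners of the household head from PERRP.

    If the PERRP column is absent (as in a synthetic test fixture),
    default every person to False instead of raising a KeyError.
    """
    if "PERRP" not in person.columns:
        return pd.Series(False, index=person.index, dtype=bool)
    return person["PERRP"].isin(UNMARRIED_PARTNER_PERRP_CODES)


# Hypothetical fixtures: one with PERRP, one without.
with_perrp = pd.DataFrame({"PERRP": [40, 43, 42, 47]})
without_perrp = pd.DataFrame({"A_AGE": [30, 31]})
flags_present = derive_is_unmarried_partner(with_perrp).tolist()
flags_missing = derive_is_unmarried_partner(without_perrp).tolist()
print(flags_present, flags_missing)
```

Defaulting to all-false keeps fixtures that predate the new column working, at the cost of silently hiding a genuinely missing column in real data; a schema check on the raw CPS file (as the validation tests in this PR exercise) guards against that case.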

Local verification:

  • uv run pytest -q tests/unit/test_reference_partner.py tests/integration/test_cps.py -k 'missing_perrp_defaults_to_false or add_personal_variables_maps_current_health_coverage_flags'

@MaxGhenis force-pushed the codex/childcare-spm-cap-formula branch from 7f64138 to fa23ce3 on May 7, 2026 20:34
@MaxGhenis marked this pull request as ready for review on May 7, 2026 21:29
@MaxGhenis merged commit 7cb8045 into main on May 7, 2026 (12 checks passed)
@MaxGhenis deleted the codex/childcare-spm-cap-formula branch on May 7, 2026 21:30