CI checkpoint cache never invalidates on code changes #583

@baogorek

Bug

The Modal checkpoint system introduced in 9428a6df (Feb 4, 2026) caches built H5 files on a persistent volume and restores them on subsequent runs if they exist and are non-empty. There is no invalidation when the code that produces those files changes.

This means any code change on main that affects H5 output (datasets, calibration, imputation) is silently masked — CI skips the rebuild and runs tests against stale data.

Impact

Since Feb 4, 2026, 27 commits on main have touched H5-producing code, including:

None of these regenerated data in CI. The current CI failures on main (employment_income = 0) are caused by stale enhanced_cps_2024.h5 from the checkpoint volume.

Suggested fix

Invalidate checkpoints when the producing script (or its dependencies) changes, e.g. by hashing the script content and comparing against a stored hash. As an immediate workaround, trigger a run with --clear-checkpoints.
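A minimal sketch of the hash-based check, assuming each cached H5 file gets a sidecar file holding the hash of the sources that produced it. The function names (`content_hash`, `checkpoint_is_valid`, `record_checkpoint`) and the `.sha256` sidecar convention are hypothetical and are not taken from the existing checkpoint code:

```python
import hashlib
from pathlib import Path


def content_hash(source_paths):
    """Return a SHA-256 hex digest over the names and bytes of the given files."""
    digest = hashlib.sha256()
    for path in sorted(Path(p) for p in source_paths):
        digest.update(path.name.encode())
        digest.update(b"\0")  # separator so name and content cannot run together
        digest.update(path.read_bytes())
    return digest.hexdigest()


def _hash_file_for(h5_path):
    # Hypothetical convention: store the hash next to the checkpoint,
    # e.g. enhanced_cps_2024.h5 -> enhanced_cps_2024.h5.sha256
    h5_path = Path(h5_path)
    return h5_path.with_name(h5_path.name + ".sha256")


def checkpoint_is_valid(h5_path, source_paths):
    """True only if the cached file exists, is non-empty, and its stored
    hash matches the current contents of the producing sources."""
    h5_path = Path(h5_path)
    if not h5_path.exists() or h5_path.stat().st_size == 0:
        return False
    hash_file = _hash_file_for(h5_path)
    if not hash_file.exists():
        # Checkpoints written before hashing existed are treated as stale.
        return False
    return hash_file.read_text().strip() == content_hash(source_paths)


def record_checkpoint(h5_path, source_paths):
    """Call after a successful build to store the hash of the sources used."""
    _hash_file_for(h5_path).write_text(content_hash(source_paths))
```

With something like this, the restore step would call `checkpoint_is_valid(...)` in place of the exists-and-non-empty test, and the build step would call `record_checkpoint(...)` after writing the H5 file. The check is only as good as the `source_paths` list: it would need to include the modules the producing script imports, not just the script itself, or changes in those dependencies would still be masked.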
