fix: route TripleDifference power to panel DGP when n_periods > 2 by igerber · Pull Request #544 · igerber/diff-diff

igerber · 2026-06-23T13:39:12Z

Summary

simulate_power / simulate_mde / simulate_sample_size now route TripleDifference (DDD) power to the panel DGP generate_ddd_panel_data when n_periods > 2 (previously the _EstimatorProfile hard-coded the cross-sectional 2×2×2 generate_ddd_data, silently ignoring n_periods). The panel path honors n_periods/treatment_period, sizes the panel by n_units directly, and switches the sample-size search from the multiple-of-8 grid to a continuous step-1 search.
Emit a UserWarning when the estimator lacks cluster="unit" on the panel path (within-unit serial correlation makes unclustered SEs anti-conservative → overstated power); reject the cross-sectional-only n_per_cell key with a clear ValueError. treatment_fraction stays inert (balanced 2×2×2); group_frac/partition_frac are overridable via data_generator_kwargs.
Add a split-aware viable-N floor (_ddd_panel_viable_min_n, mirroring the DGP's rounded stratified allocation) so simulate_sample_size never probes an infeasible (empty-cell) n under skewed splits; raise a clear error when an n_range upper bound is below that floor.
Since simulate_power defaults to n_periods=4, the default DDD power call now uses the panel DGP (intentional — removes the prior "n_periods ignored" wart).
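The routing rule in the bullets above can be sketched roughly as follows. This is an illustrative stand-in, not the code in diff_diff/power.py: `choose_ddd_dgp` and its parameters are hypothetical names. Only the two generator names, the `n_periods > 2` threshold, the `cluster="unit"` warning, and the `n_per_cell` rejection are taken from the description.

```python
import warnings


def choose_ddd_dgp(n_periods, cluster=None, data_generator_kwargs=None):
    """Return the name of the DGP a DDD power run would use.

    Hypothetical helper for illustration; the real routing lives in the
    library's estimator registry and may be structured differently.
    """
    kwargs = dict(data_generator_kwargs or {})

    if n_periods <= 2:
        # Cross-sectional 2x2x2 design: keep the original generator.
        return "generate_ddd_data"

    # Panel path: n_per_cell only has meaning for the cross-sectional DGP.
    if "n_per_cell" in kwargs:
        raise ValueError(
            "n_per_cell applies only to the cross-sectional DDD DGP; "
            "size the panel with n_units instead."
        )

    # Repeated observations on a unit are serially correlated, so standard
    # errors not clustered by unit tend to be too small (power too high).
    if cluster != "unit":
        warnings.warn(
            'Panel DDD power without cluster="unit" may be overstated.',
            UserWarning,
            stacklevel=2,
        )

    return "generate_ddd_panel_data"
```

With these assumptions, `choose_ddd_dgp(4, cluster="unit")` selects the panel generator silently, while `choose_ddd_dgp(4)` selects it and emits the warning.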

Resolves the deferred TODO.md row (Methodology/Correctness, "TripleDifference power auto-routing").

Methodology references (required if estimator / math changes)

Method name(s): PowerAnalysis (simulation path) — TripleDifference DGP routing
Paper / source link(s): Bloom (1995); Burlig, Preonas & Woerman (2020). DGP per generate_ddd_panel_data (DDD-CPT triple-interaction identification). No estimator math changed — this is DGP selection/wiring for the simulation power harness.
Any intentional deviations from the source (and why): None new. Documented in docs/methodology/REGISTRY.md §PowerAnalysis (**Note:** labels) — cross-sectional (n_periods ≤ 2) vs panel (> 2) routing + the cluster="unit" inference caveat.

Validation

Tests added/updated (tests/test_power.py): re-targeted two stale-warning tests + the n_per_cell collision test to the cross-sectional path; added panel-routing, missing-cluster warning, no-warning-with-cluster, n_per_cell-reject, and slow simulate_mde/simulate_sample_size panel tests (incl. unbalanced-split viable-floor + low-n_range guard). Full tests/test_power.py green (214 passed, incl. slow).
Backtest / simulation / notebook evidence: docs/tutorials/06_power_analysis.ipynb DDD section updated to teach the routing + showcase the panel path with cluster="unit"; executes end-to-end (all 32 code cells).

Security / privacy

Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

simulate_power/simulate_mde/simulate_sample_size silently ignored n_periods for DDD because the _EstimatorProfile hard-coded the cross-sectional generate_ddd_data. Route to generate_ddd_panel_data when n_periods > 2, honoring n_periods/treatment_period and sizing the panel by n_units directly. simulate_sample_size switches from the multiple-of-8 grid to a continuous step-1 search on the panel path.

Emit a UserWarning when the estimator lacks cluster="unit" (panel SEs are anti-conservative and overstate power); reject the cross-sectional-only n_per_cell key with a clear ValueError. treatment_fraction stays inert (balanced 2x2x2).

Docs: REGISTRY.md PowerAnalysis notes + tutorial 06 + CHANGELOG [Unreleased]; removes the resolved TODO row.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Addresses local codex P1/P2: simulate_sample_size advertised group_frac/partition_frac overrides for the panel DDD path but hard-coded the search floor at 16, which is infeasible for skewed splits (generate_ddd_panel_data requires every (group, partition) cell to be non-empty, e.g. 0.1/0.1 needs n>=55).

Add _ddd_panel_viable_min_n() (mirrors the DGP's rounded stratified allocation) and use it for min_n/abs_min on the panel path; raise a clear ValueError when an n_range upper bound is below that floor. Regression tests for the unbalanced split + low n_range guard.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
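To make the viable-N floor concrete, here is a minimal sketch under an assumed allocation rule: units are split by `round(n * group_frac)`, then each group is split by `round(n_group * partition_frac)`, using Python's built-in `round`. The function names are hypothetical and the real `_ddd_panel_viable_min_n` may allocate differently; the sketch only shows how a scan over n finds the first size at which every (group, partition) cell is populated.

```python
def ddd_panel_cell_counts(n_units, group_frac=0.5, partition_frac=0.5):
    """Units per (group, partition) cell under the ASSUMED rounded allocation."""
    n_g1 = round(n_units * group_frac)
    n_g0 = n_units - n_g1
    counts = {}
    for g, n_g in ((1, n_g1), (0, n_g0)):
        n_p1 = round(n_g * partition_frac)
        counts[(g, 1)] = n_p1
        counts[(g, 0)] = n_g - n_p1
    return counts


def viable_min_n(group_frac=0.5, partition_frac=0.5, search_max=10_000):
    """Smallest n_units for which all four cells are non-empty."""
    for n in range(1, search_max + 1):
        counts = ddd_panel_cell_counts(n, group_frac, partition_frac)
        if all(c > 0 for c in counts.values()):
            return n
    raise ValueError(
        f"no feasible n_units <= {search_max} for "
        f"group_frac={group_frac}, partition_frac={partition_frac}"
    )


print(viable_min_n(0.1, 0.1))  # 55 under this assumed rule
```

Under this assumed rule a 0.1/0.1 split first becomes feasible at n = 55, which agrees with the figure quoted in the commit message, though that agreement does not by itself confirm the library's actual allocation logic.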

github-actions · 2026-06-23T13:43:41Z

Overall Assessment
✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

Affected method: PowerAnalysis simulation path for TripleDifference in simulate_power, simulate_mde, and simulate_sample_size.
The panel routing for n_periods > 2 is documented in REGISTRY.md with **Note:** labels, so it is not a methodology defect.
The panel DGP/estimator contract is coherent: generate_ddd_panel_data emits post and unit, and the adapter fits TripleDifference(..., time="post") with a warning unless cluster="unit" is set.
No new inline inference or partial-NaN inference anti-patterns found.
Only minor P2/P3 cleanup items below.

Methodology

Severity: P3 informational
Impact: The PR intentionally changes DDD power routing: n_periods > 2 now uses generate_ddd_panel_data, while n_periods <= 2 keeps the cross-sectional generate_ddd_data path. This is explicitly documented in docs/methodology/REGISTRY.md:L3374-L3375; the panel DGP docstring also documents the within-unit clustering caveat at diff_diff/prep_dgp.py:L1181-L1192. The estimator variance path remains the documented IF/CR1 path at diff_diff/triple_diff.py:L1348-L1364.
Concrete fix: None required.

Code Quality

Severity: P2
Impact: _ddd_panel_viable_min_n() mirrors the DGP allocation but does not validate group_frac / partition_frac before the scan. Invalid values such as group_frac=0.0 can surface as a misleading “n_range upper bound is below the minimum panel-DDD sample size” error before the DGP’s clearer validation fires. See diff_diff/power.py:L689-L702 and diff_diff/power.py:L3094-L3104. This does not produce wrong estimates, but it weakens edge-case diagnostics.
Concrete fix: Validate group_frac and partition_frac are in (0, 1) before _ddd_panel_viable_min_n(), matching generate_ddd_panel_data at diff_diff/prep_dgp.py:L1284-L1287; have the helper raise clearly if no feasible n is found within search_max.
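The ordering this fix asks for can be sketched as below: range-check the split fractions first, and only then compare the `n_range` upper bound to the floor. All names here are hypothetical, and `floor_fn` stands in for whatever computes the viable minimum; the point is only that an out-of-range fraction fails with a message naming the offending argument rather than a bracketing error.

```python
def validate_fraction(name, value):
    """Raise if a split fraction is outside the open interval (0, 1)."""
    # The chained comparison is False for NaN as well, so NaN is rejected.
    if not (0.0 < value < 1.0):
        raise ValueError(f"{name} must be in (0, 1), got {value!r}")


def resolve_search_bounds(n_range, group_frac, partition_frac, floor_fn):
    """Return (lower, upper) search bounds after validating the inputs.

    floor_fn is a hypothetical callable (group_frac, partition_frac) -> int
    giving the smallest n_units with every cell populated.
    """
    # 1. Validate the fractions before anything depends on them.
    validate_fraction("group_frac", group_frac)
    validate_fraction("partition_frac", partition_frac)

    # 2. Only now is a "below the floor" message trustworthy.
    floor = floor_fn(group_frac, partition_frac)
    lower, upper = n_range
    if upper < floor:
        raise ValueError(
            f"n_range upper bound {upper} is below the minimum "
            f"panel-DDD sample size {floor}"
        )
    return max(lower, floor), upper
```

For example, `resolve_search_bounds((10, 200), 0.0, 0.5, floor_fn)` fails on `group_frac` before `floor_fn` is ever called.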

Performance

No findings.

Maintainability

No blocking findings. The allocation mirror in _ddd_panel_cells_populated() is acceptable for this PR, but if generate_ddd_panel_data allocation changes again, consider extracting the shared allocation/count helper to avoid drift.

Tech Debt

No findings. The removed TODO.md row corresponds to the implemented DDD power auto-routing work; no new untracked correctness debt found.

Security

No findings. The diff does not introduce secrets, network calls, file-system side effects, or unsafe deserialization paths.

Documentation/Tests

Severity: P3
Impact: The footnote to the DDD support table in docs/tutorials/06_power_analysis.ipynb says that cross-sectional DDD simulate_sample_size() “uses a search floor of 64”. The default registry start is 64, but an explicit n_range can still evaluate down to the 16-unit absolute floor in diff_diff/power.py:L3171-L3194.
Concrete fix: Rephrase to “default search starts at registry min_n=64; explicit n_range can go as low as the 16-unit realizability floor.”

I did not run the test suite in this read-only review environment.

…oor helper

Addresses CI codex P2: _ddd_panel_viable_min_n() now validates group_frac/partition_frac in (0, 1) up front (matching generate_ddd_panel_data) and raises clearly if no feasible n is found within search_max, so an out-of-range split surfaces the DGP's clear message instead of a misleading "n_range below the minimum" bracketing error. Adds a unit test for the validation.

Also clarifies the tutorial-06 DDD support-table footnote (P3): simulate_sample_size starts at the registry floor of 64, but an explicit n_range can reach the 16-unit floor.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-23T13:53:48Z

🔁 AI review rerun (requested by @igerber)

Head SHA: e059454157b27d0abe4e86e2a4142ac1e1ae8086

Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

Affected method: PowerAnalysis simulation routing for TripleDifference in simulate_power, simulate_mde, and simulate_sample_size.
The DDD n_periods > 2 panel routing is documented in docs/methodology/REGISTRY.md:L3374-L3375, so it is not an undocumented methodology deviation.
The panel DGP/estimator contract is consistent: generate_ddd_panel_data emits post and unit, and the power adapter fits TripleDifference(..., time="post") while warning unless cluster="unit" is set.
Prior P2 diagnostic issue is addressed: _ddd_panel_viable_min_n() now validates group_frac / partition_frac before sample-size bracketing.
No new inline inference or partial NaN guard anti-patterns found.
I could not run the targeted pytest checks because pytest is not installed in this environment.

Methodology

Severity: P3 informational
Impact: TripleDifference power now routes n_periods > 2 to generate_ddd_panel_data; this is explicitly documented with **Note:** labels in docs/methodology/REGISTRY.md:L3374-L3375. The DGP docstring documents the panel serial-correlation caveat at diff_diff/prep_dgp.py:L1151-L1192, and the estimator’s clustered IF/CR1 variance contract matches docs/methodology/REGISTRY.md:L2123-L2151 and diff_diff/triple_diff.py:L1348-L1364.
Concrete fix: None required.

Code Quality

No findings. The prior diagnostic concern is resolved by validation at diff_diff/power.py:L699-L705, with regression coverage at tests/test_power.py:L1396-L1402.

Performance

No findings.

Maintainability

No blocking findings. The DDD panel allocation mirror at diff_diff/power.py:L675-L708 matches generate_ddd_panel_data at diff_diff/prep_dgp.py:L1294-L1315; keep these in sync if the DGP allocation changes later.

Tech Debt

No findings. The prior TODO row for TripleDifference power auto-routing has been removed from TODO.md, consistent with this PR implementing it.

Security

No findings. The diff does not introduce secrets, network calls, unsafe deserialization, or new filesystem side effects.

Documentation/Tests

Severity: P3 informational
Impact: The tutorial support-table footnote now correctly distinguishes the cross-sectional DDD registry floor from the explicit n_range realizability floor at docs/tutorials/06_power_analysis.ipynb:L442. New tests cover panel routing, cluster warning behavior, n_per_cell rejection, panel MDE/sample-size routing, unbalanced split floors, and invalid split validation at tests/test_power.py:L1257-L1402.
Concrete fix: None required.

igerber and others added 2 commits June 23, 2026 09:37

igerber added the ready-for-ci label (Triggers CI test workflows) Jun 23, 2026

igerber merged commit 2293ba1 into main Jun 23, 2026
33 of 34 checks passed

igerber deleted the fix/ddd-power-panel-routing branch June 23, 2026 16:38

igerber merged 3 commits into main from fix/ddd-power-panel-routing