fix: route TripleDifference power to panel DGP when n_periods > 2 #544

Merged: igerber merged 3 commits into main from fix/ddd-power-panel-routing on Jun 23, 2026

@igerber (Owner) commented Jun 23, 2026

Summary

  • simulate_power / simulate_mde / simulate_sample_size now route TripleDifference (DDD) power to the panel DGP generate_ddd_panel_data when n_periods > 2 (previously the _EstimatorProfile hard-coded the cross-sectional 2×2×2 generate_ddd_data, silently ignoring n_periods). The panel path honors n_periods/treatment_period, sizes the panel by n_units directly, and switches the sample-size search from the multiple-of-8 grid to a continuous step-1 search.
  • Emit a UserWarning when the estimator lacks cluster="unit" on the panel path (within-unit serial correlation makes unclustered SEs anti-conservative → overstated power); reject the cross-sectional-only n_per_cell key with a clear ValueError. treatment_fraction stays inert (balanced 2×2×2); group_frac/partition_frac are overridable via data_generator_kwargs.
  • Add a split-aware viable-N floor (_ddd_panel_viable_min_n, mirroring the DGP's rounded stratified allocation) so simulate_sample_size never probes an infeasible (empty-cell) n under skewed splits; raise a clear error when an n_range upper bound is below that floor.
  • Since simulate_power defaults to n_periods=4, the default DDD power call now uses the panel DGP (intentional — removes the prior "n_periods ignored" wart).
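
The routing rule summarized above can be sketched roughly as follows. This is an illustration only, not the code in diff_diff/power.py: the helper `choose_ddd_dgp` and its arguments are hypothetical. Only the two DGP names, the `n_periods > 2` threshold, the `n_per_cell` rejection, and the `cluster="unit"` warning are taken from the summary.

```python
import warnings


def choose_ddd_dgp(n_periods, cluster=None, data_generator_kwargs=None):
    """Return the name of the DGP a DDD power run would use (hypothetical helper)."""
    kwargs = dict(data_generator_kwargs or {})
    if n_periods <= 2:
        # Cross-sectional 2x2x2 design: keep the original DGP.
        return "generate_ddd_data"
    # Panel path: n_per_cell only has meaning for the cross-sectional DGP.
    if "n_per_cell" in kwargs:
        raise ValueError(
            "n_per_cell is cross-sectional only; size the panel with n_units"
        )
    if cluster != "unit":
        # Serially correlated panel rows make unclustered SEs too small.
        warnings.warn(
            "panel DDD power without cluster='unit' may be overstated",
            UserWarning,
            stacklevel=2,
        )
    return "generate_ddd_panel_data"
```

With the `n_periods=4` default mentioned in the last bullet, a call that omits `cluster="unit"` would land on the panel branch and trigger the warning.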

Resolves the deferred TODO.md row (Methodology/Correctness, "TripleDifference power auto-routing").

Methodology references (required if estimator / math changes)

  • Method name(s): PowerAnalysis (simulation path) — TripleDifference DGP routing
  • Paper / source link(s): Bloom (1995); Burlig, Preonas & Woerman (2020). DGP per generate_ddd_panel_data (DDD-CPT triple-interaction identification). No estimator math changed — this is DGP selection/wiring for the simulation power harness.
  • Any intentional deviations from the source (and why): None new. Documented in docs/methodology/REGISTRY.md §PowerAnalysis (**Note:** labels) — cross-sectional (n_periods ≤ 2) vs panel (> 2) routing + the cluster="unit" inference caveat.

Validation

  • Tests added/updated (tests/test_power.py): re-targeted two stale-warning tests + the n_per_cell collision test to the cross-sectional path; added panel-routing, missing-cluster warning, no-warning-with-cluster, n_per_cell-reject, and slow simulate_mde/simulate_sample_size panel tests (incl. unbalanced-split viable-floor + low-n_range guard). Full tests/test_power.py green (214 passed, incl. slow).
  • Backtest / simulation / notebook evidence: docs/tutorials/06_power_analysis.ipynb DDD section updated to teach the routing + showcase the panel path with cluster="unit"; executes end-to-end (all 32 code cells).

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

igerber and others added 2 commits June 23, 2026 09:37
simulate_power/simulate_mde/simulate_sample_size silently ignored n_periods for
DDD because the _EstimatorProfile hard-coded the cross-sectional generate_ddd_data.
Route to generate_ddd_panel_data when n_periods > 2, honoring n_periods/
treatment_period and sizing the panel by n_units directly. simulate_sample_size
switches from the multiple-of-8 grid to a continuous step-1 search on the panel
path. Emit a UserWarning when the estimator lacks cluster="unit" (panel SEs are
anti-conservative and overstate power); reject the cross-sectional-only n_per_cell
key with a clear ValueError. treatment_fraction stays inert (balanced 2x2x2).
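
To make the grid change concrete: a balanced 2x2x2 cross-section has 8 cells, so its total n is always 8 * n_per_cell, while a panel sized by n_units can take any integer. The sketch below enumerates candidates for a range; `candidate_sizes` is a hypothetical name, and the real search strategy in simulate_sample_size is not shown in this thread.

```python
def candidate_sizes(lo, hi, panel):
    """List the sample sizes a search over [lo, hi] could probe."""
    if panel:
        # Panel path: every integer n_units is a valid candidate (step 1).
        return list(range(lo, hi + 1))
    # Cross-sectional 2x2x2: n = 8 * n_per_cell, so only multiples of 8.
    first = -(-lo // 8) * 8  # round lo up to the next multiple of 8
    return list(range(first, hi + 1, 8))


print(candidate_sizes(60, 80, panel=False))  # [64, 72, 80]
print(candidate_sizes(60, 63, panel=True))   # [60, 61, 62, 63]
```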

Docs: REGISTRY.md PowerAnalysis notes + tutorial 06 + CHANGELOG [Unreleased];
removes the resolved TODO row.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Addresses local codex P1/P2: simulate_sample_size advertised group_frac/
partition_frac overrides for the panel DDD path but hard-coded the search floor
at 16, which is infeasible for skewed splits (generate_ddd_panel_data requires
every (group,partition) cell non-empty — e.g. 0.1/0.1 needs n>=55). Add
_ddd_panel_viable_min_n() (mirrors the DGP's rounded stratified allocation) and
use it for min_n/abs_min on the panel path; raise a clear ValueError when an
n_range upper bound is below that floor. Regression tests for the unbalanced
split + low n_range guard.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
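
A rough sketch of the viable-floor idea from the commit message above. The allocation here is an assumption (round the group split first, then round the partition split within each group, using Python's built-in `round`); the actual allocation in generate_ddd_panel_data may differ, and all function names are hypothetical. Under this assumed allocation a 0.1/0.1 split first populates every cell at n = 55, which agrees with the figure quoted in the commit message.

```python
def cell_counts(n, group_frac=0.5, partition_frac=0.5):
    """Units per (group, partition) cell under an ASSUMED rounded allocation."""
    n_g1 = round(n * group_frac)
    counts = {}
    for g, n_g in ((1, n_g1), (0, n - n_g1)):
        n_p1 = round(n_g * partition_frac)
        counts[(g, 1)] = n_p1
        counts[(g, 0)] = n_g - n_p1
    return counts


def viable_min_n(group_frac=0.5, partition_frac=0.5, search_max=10_000):
    """Smallest n for which every cell is non-empty, found by a linear scan."""
    for n in range(1, search_max + 1):
        if min(cell_counts(n, group_frac, partition_frac).values()) >= 1:
            return n
    raise ValueError(f"no feasible n found up to {search_max}")


def check_n_range(n_range, group_frac, partition_frac):
    """Reject a search range whose upper bound sits below the viable floor."""
    floor = viable_min_n(group_frac, partition_frac)
    if n_range[1] < floor:
        raise ValueError(
            f"n_range upper bound {n_range[1]} is below the floor {floor}"
        )
    # Never start the search below the first feasible n.
    return max(n_range[0], floor), n_range[1]


print(viable_min_n(0.1, 0.1))  # 55 under the assumed allocation
```

Using the floor as the lower search bound keeps the sample-size search from probing an n whose panel would contain an empty cell.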
@github-actions

Overall Assessment
✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

  • Affected method: PowerAnalysis simulation path for TripleDifference in simulate_power, simulate_mde, and simulate_sample_size.
  • The panel routing for n_periods > 2 is documented in REGISTRY.md with **Note:** labels, so it is not a methodology defect.
  • The panel DGP/estimator contract is coherent: generate_ddd_panel_data emits post and unit, and the adapter fits TripleDifference(..., time="post") with a warning unless cluster="unit" is set.
  • No new inline inference or partial-NaN inference anti-patterns found.
  • Only minor P2/P3 cleanup items below.
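
For readers wondering why the warning matters, here is a toy illustration of the clustering caveat. It is not a DDD fit: the estimand is a plain grand mean, and the variance settings (unit-effect and noise standard deviations of 1, 200 units, 4 periods) are assumptions chosen for the example. With a persistent unit effect, treating every row as independent understates the standard error; in expectation the cluster-level SE here is about sqrt(2.5) ≈ 1.58 times the naive one.

```python
import numpy as np

rng = np.random.default_rng(0)
n_units, n_periods = 200, 4

# Each unit carries a persistent effect, so its rows are correlated over time.
unit_effect = rng.normal(0.0, 1.0, size=(n_units, 1))
y = unit_effect + rng.normal(0.0, 1.0, size=(n_units, n_periods))

# Naive SE of the grand mean: treats all n_units * n_periods rows as independent.
naive_se = y.std(ddof=1) / np.sqrt(y.size)

# Cluster-level SE: collapse to one mean per unit, then use n_units clusters.
cluster_se = y.mean(axis=1).std(ddof=1) / np.sqrt(n_units)

ratio = cluster_se / naive_se
print(f"naive SE={naive_se:.4f}  cluster SE={cluster_se:.4f}  ratio={ratio:.2f}")
```

A too-small SE inflates test statistics, which is how an unclustered panel run ends up reporting more power than the design really has.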

Methodology

  • Severity: P3 informational
    Impact: The PR intentionally changes DDD power routing: n_periods > 2 now uses generate_ddd_panel_data, while n_periods <= 2 keeps the cross-sectional generate_ddd_data path. This is explicitly documented in docs/methodology/REGISTRY.md:L3374-L3375; the panel DGP docstring also documents the within-unit clustering caveat at diff_diff/prep_dgp.py:L1181-L1192. The estimator variance path remains the documented IF/CR1 path at diff_diff/triple_diff.py:L1348-L1364.
    Concrete fix: None required.

Code Quality

  • Severity: P2
    Impact: _ddd_panel_viable_min_n() mirrors the DGP allocation but does not validate group_frac / partition_frac before the scan. Invalid values such as group_frac=0.0 can be converted into a misleading “n_range upper bound is below the minimum panel-DDD sample size” error before the DGP’s clearer validation would fire. See diff_diff/power.py:L689-L702 and diff_diff/power.py:L3094-L3104. This does not produce wrong estimates, but it weakens edge-case diagnostics.
    Concrete fix: Validate group_frac and partition_frac are in (0, 1) before _ddd_panel_viable_min_n(), matching generate_ddd_panel_data at diff_diff/prep_dgp.py:L1284-L1287; have the helper raise clearly if no feasible n is found within search_max.
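
The ordering suggested in this fix could look roughly like the sketch below. The names `validate_split` and `bracket_floor` are hypothetical, and `find_floor` stands in for whatever helper computes the viable floor; the point is only that the split check runs before any bracketing, so an out-of-range split reports the split rather than the range.

```python
def validate_split(group_frac, partition_frac):
    """Fail fast on splits that can never populate every cell."""
    for name, value in (
        ("group_frac", group_frac),
        ("partition_frac", partition_frac),
    ):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must be in (0, 1), got {value}")


def bracket_floor(group_frac, partition_frac, n_upper, find_floor):
    """Validate the split first, then compare the range against the floor."""
    validate_split(group_frac, partition_frac)  # clear diagnostic comes first
    floor = find_floor(group_frac, partition_frac)
    if n_upper < floor:
        raise ValueError(
            f"n_range upper bound {n_upper} is below the floor {floor}"
        )
    return floor
```

With `group_frac=0.0`, the call fails inside `validate_split` and names the offending argument, instead of reaching the range comparison.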

Performance

  • No findings.

Maintainability

  • No blocking findings. The allocation mirror in _ddd_panel_cells_populated() is acceptable for this PR, but if generate_ddd_panel_data allocation changes again, consider extracting the shared allocation/count helper to avoid drift.

Tech Debt

  • No findings. The removed TODO.md row corresponds to the implemented DDD power auto-routing work; no new untracked correctness debt found.

Security

  • No findings. The diff does not introduce secrets, network calls, file-system side effects, or unsafe deserialization paths.

Documentation/Tests

  • Severity: P3
    Impact: The DDD support-table footnote in docs/tutorials/06_power_analysis.ipynb says cross-sectional DDD simulate_sample_size() “uses a search floor of 64”. The default registry start is 64, but an explicit n_range can still evaluate down to the 16-unit absolute floor in diff_diff/power.py:L3171-L3194.
    Concrete fix: Rephrase to “default search starts at registry min_n=64; explicit n_range can go as low as the 16-unit realizability floor.”

I did not run the test suite in this read-only review environment.

…oor helper

Addresses CI codex P2: _ddd_panel_viable_min_n() now validates group_frac/
partition_frac in (0, 1) up front (matching generate_ddd_panel_data) and raises
clearly if no feasible n is found within search_max, so an out-of-range split
surfaces the DGP's clear message instead of a misleading "n_range below the
minimum" bracketing error. Adds a unit test for the validation. Also clarifies
the tutorial-06 DDD support-table footnote (P3): simulate_sample_size starts at
the registry floor of 64, but an explicit n_range can reach the 16-unit floor.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@github-actions

🔁 AI review rerun (requested by @igerber)

Head SHA: e059454157b27d0abe4e86e2a4142ac1e1ae8086


Overall Assessment

✅ Looks good. No unmitigated P0/P1 findings.

Executive Summary

  • Affected method: PowerAnalysis simulation routing for TripleDifference in simulate_power, simulate_mde, and simulate_sample_size.
  • The DDD n_periods > 2 panel routing is documented in docs/methodology/REGISTRY.md:L3374-L3375, so it is not an undocumented methodology deviation.
  • The panel DGP/estimator contract is consistent: generate_ddd_panel_data emits post and unit, and the power adapter fits TripleDifference(..., time="post") while warning unless cluster="unit" is set.
  • Prior P2 diagnostic issue is addressed: _ddd_panel_viable_min_n() now validates group_frac / partition_frac before sample-size bracketing.
  • No new inline inference or partial NaN guard anti-patterns found.
  • I could not run the targeted pytest checks because pytest is not installed in this environment.

Methodology

  • Severity: P3 informational
    Impact: TripleDifference power now routes n_periods > 2 to generate_ddd_panel_data; this is explicitly documented with **Note:** labels in docs/methodology/REGISTRY.md:L3374-L3375. The DGP docstring documents the panel serial-correlation caveat at diff_diff/prep_dgp.py:L1151-L1192, and the estimator’s clustered IF/CR1 variance contract matches docs/methodology/REGISTRY.md:L2123-L2151 and diff_diff/triple_diff.py:L1348-L1364.
    Concrete fix: None required.

Code Quality

  • No findings. The prior diagnostic concern is resolved by validation at diff_diff/power.py:L699-L705, with regression coverage at tests/test_power.py:L1396-L1402.

Performance

  • No findings.

Maintainability

  • No blocking findings. The DDD panel allocation mirror at diff_diff/power.py:L675-L708 matches generate_ddd_panel_data at diff_diff/prep_dgp.py:L1294-L1315; keep these in sync if the DGP allocation changes later.

Tech Debt

  • No findings. The prior TODO row for TripleDifference power auto-routing has been removed from TODO.md, consistent with this PR implementing it.

Security

  • No findings. The diff does not introduce secrets, network calls, unsafe deserialization, or new filesystem side effects.

Documentation/Tests

  • Severity: P3 informational
    Impact: The tutorial support-table footnote now correctly distinguishes the cross-sectional DDD registry floor from the explicit n_range realizability floor at docs/tutorials/06_power_analysis.ipynb:L442. New tests cover panel routing, cluster warning behavior, n_per_cell rejection, panel MDE/sample-size routing, unbalanced split floors, and invalid split validation at tests/test_power.py:L1257-L1402.
    Concrete fix: None required.

@igerber added the ready-for-ci label (Triggers CI test workflows) Jun 23, 2026
@igerber merged commit 2293ba1 into main Jun 23, 2026
33 of 34 checks passed
@igerber deleted the fix/ddd-power-panel-routing branch June 23, 2026 16:38