Skip to content

Conversation

@igerber
Copy link
Owner

@igerber igerber commented Jan 24, 2026

Summary

  • Add the reference period (e=-1-anticipation) to Callaway-Sant'Anna event study output when using base_period="universal"
  • This matches R did package behavior and provides visual clarity in event study plots
  • Reference period has effect=0, se=0 (by construction as normalization constraint)
  • t_stat and p_value are NaN (undefined for normalized reference)

Methodology references (required if estimator / math changes)

  • Method name(s): Callaway-Sant'Anna event study aggregation
  • Paper / source link(s): Callaway & Sant'Anna (2021), R did package
  • Any intentional deviations from the source (and why): None - this aligns with R did package behavior

Validation

  • Tests added/updated:
    • test_event_study_universal_includes_reference_period - verifies e=-1 included with effect=0
    • test_event_study_varying_excludes_reference_period - verifies varying mode unaffected
    • test_event_study_universal_with_anticipation - verifies e=-2 reference with anticipation=1
  • Full test suite: 921 passed

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

Generated with Claude Code

…period

Add the reference period to Callaway-Sant'Anna event study output when using
base_period="universal". This matches R `did` package behavior and provides
visual clarity in event study plots by including the normalization point.

Changes:
- Add base_period type hint to CallawaySantAnnaAggregationMixin
- Add reference period entry (effect=0, se=0, t_stat=NaN) in _aggregate_event_study
- Add 3 tests for reference period behavior (universal, varying, anticipation)
- Update REGISTRY.md documentation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@github-actions
Copy link

Overall assessment: ⛔ Blocker

Executive summary:

  • P0: Reference-period CI is set to (0.0, 0.0) despite se=0, which conflicts with the registry’s NaN-propagation guidance and can mislead inference.
  • P1: The reference period is injected even when no event-study effects exist, so downstream pretrend diagnostics treat a synthetic zero-variance point as real data.
  • P2: Tests encode the (0.0, 0.0) CI and don’t cover pretrend/empty-data interactions for the new reference period behavior.

Methodology

Affected method(s): Callaway–Sant’Anna event study aggregation (base_period="universal").

  • P0 Reference-period inference is inconsistent with the registry’s NaN propagation: se=0 but conf_int=(0.0, 0.0) implies infinite precision. Impact: violates stated inference policy and can bias downstream diagnostics/plots by treating the normalization period as estimated. Fix: keep effect=0/se=0 if desired, but set conf_int to (np.nan, np.nan) (or route through compute_confidence_interval with se=np.nan) and keep t_stat/p_value as NaN. diff_diff/staggered_aggregation.py:L420-L432 tests/test_staggered.py:L2848-L2853 docs/methodology/REGISTRY.md:L211-L226
  • P1 Reference period is added even when event_study_effects is empty (e.g., balance_e filters out all groups), so pretrend extraction no longer surfaces the “no pre-treatment periods” condition and instead uses a synthetic zero-variance point. Impact: hides empty-data errors and can distort pretrend tests (singular vcov / misleading power). Fix: only add the reference period when at least one real effect exists, and/or exclude n_groups==0 / rel_period==(-1-anticipation) in pretrend extraction. diff_diff/staggered_aggregation.py:L420-L432 diff_diff/pretrends.py:L658-L672 diff_diff/honest_did.py:L1263-L1268

Code Quality

  • No issues found.

Performance

  • No issues found.

Maintainability

  • No issues found.

Tech Debt

  • No issues found.

Security

  • No issues found.

Documentation/Tests

  • P2 Tests codify the (0.0, 0.0) CI expectation and there is no coverage for the new reference period’s interaction with pretrend/empty-data behavior. Impact: regressions in diagnostics or NaN handling won’t be caught. Fix: update the reference-period CI assertion to expect NaN and add tests ensuring pretrends/honest_did skip n_groups==0/reference-period entries and still raise on empty pre-periods. tests/test_staggered.py:L2840-L2901 diff_diff/pretrends.py:L658-L672 diff_diff/honest_did.py:L1263-L1268

…fields

- Change conf_int from (0.0, 0.0) to (np.nan, np.nan) for consistency with
  NaN-propagation policy (P0 fix)
- Change se from 0.0 to np.nan since reference period is a normalization
  constraint, not an estimated effect
- Add guard to only inject reference period when real effects exist (P1 fix)
- Update tests to expect NaN for se and conf_int
- Update REGISTRY.md to document NaN inference for reference period
- Add launch/ and launch-video/ to .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Owner Author

igerber commented Jan 24, 2026

/ai-review

@github-actions
Copy link

🔁 AI review rerun (requested by @igerber)

Head SHA: 3b0666d7a94b2525d906e4da9419ad8298d78d6a


Overall Assessment: ⚠️ Needs changes

Executive Summary

  • Adds a universal-base reference period to CS event study output and documents it in the methodology registry/tests.
  • The normalization-only reference period (SE NaN) is now treated as an estimated pre-period by pre-trends and HonestDiD, leading to invalid vcov/tests for base_period="universal".
  • Add filtering for non-estimated periods and regression tests for downstream inference paths to avoid NaN contamination.

Methodology

  • Severity: P1. Impact: The reference period is defined as a normalization constraint in the registry (docs/methodology/REGISTRY.md:L227-L234) and is injected into event-study output (diff_diff/staggered_aggregation.py:L420-L434), but downstream inference consumes all negative/relative periods as estimated (diff_diff/pretrends.py:L648-L673, diff_diff/honest_did.py:L575-L602), contaminating vcov/tests with NaNs or invalid inputs. Concrete fix: filter out entries with n_groups==0 or non-finite se (or add an explicit is_reference/reference_period flag) before building pre/post arrays in these consumers.

Code Quality

  • No issues found.

Performance

  • No issues found.

Maintainability

  • No issues found.

Tech Debt

  • No issues found.

Security

  • No issues found.

Documentation/Tests

  • Severity: P2. Impact: New tests cover event-study output but not the downstream consumers that now receive a reference-period entry; regressions in pre-trends/HonestDiD remain unguarded. Concrete fix: add tests exercising PreTrendsTest/HonestDiD with base_period="universal" (and anticipation>0) to assert the reference period is ignored (tests/test_staggered.py:L2825-L2895).

Address code review feedback: the reference period (e=-1) with NaN SE could
contaminate vcov matrices in pretrends and HonestDiD consumers.

Changes:
- pretrends.py: Filter out entries with n_groups=0 or non-finite SE when
  extracting pre-period effects for CallawaySantAnna and SunAbraham results
- honest_did.py: Filter out entries with n_groups=0 or non-finite SE when
  extracting event study effects for CallawaySantAnna results
- Add tests verifying base_period="universal" works correctly with both
  PreTrendsPower and HonestDiD (reference period is properly excluded)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Owner Author

igerber commented Jan 24, 2026

Addressed Code Review Feedback

Commit 1: 3b0666d - Reference period inference fields

  • Changed conf_int from (0.0, 0.0) to (np.nan, np.nan) (P0 fix)
  • Changed se from 0.0 to np.nan (consistency with undefined inference)
  • Added guard to only inject reference period when real effects exist (P1 fix)
  • Updated tests to expect NaN for se and conf_int
  • Updated REGISTRY.md to document NaN inference for reference period

Commit 2: a34445b - Downstream consumer protection

Addressed the methodology issue where downstream consumers (pretrends, HonestDiD) could have their vcov matrices contaminated by NaN values from the reference period.

Changes:

  • pretrends.py: Filter out entries with n_groups=0 or non-finite SE when extracting pre-period effects
  • honest_did.py: Filter out entries with n_groups=0 or non-finite SE when extracting event study effects
  • Added tests for both PreTrendsPower and HonestDiD with base_period="universal" to verify reference period is correctly excluded

All 206 related tests pass (93 staggered + 113 pretrends/honest_did).

@igerber
Copy link
Owner Author

igerber commented Jan 24, 2026

/ai-review

@github-actions
Copy link

🔁 AI review rerun (requested by @igerber)

Head SHA: a34445b1a122d1c97ef7f408223d0153bf82c748


Overall Assessment

  • ⚠️ Needs changes

Executive Summary

  • P1: HonestDiD’s max pre‑violation now ingests the synthetic reference period and can return zero when no real pre‑period estimates exist, tightening RM bounds incorrectly.
  • P2: The injected reference period has se=NaN, but plotting skips NaN‑SE entries, so the reference period won’t appear in event‑study plots by default.
  • Methodology registry alignment for the CS event‑study change looks consistent; tests cover universal/varying/anticipation plus filtering in HonestDiD/PreTrends.

Methodology

  • Severity: P1 | Impact: HonestDiD._estimate_max_pre_violation now includes the normalization-only reference period (n_groups=0, se=NaN) and can return 0 when no real pre‑period effects exist, collapsing RM bounds to the point estimate and understating uncertainty. | Concrete fix: filter event_study_effects the same way as _extract_event_study_params (drop n_groups==0 or non‑finite se) or compute max only from pre_periods. | Location: diff_diff/honest_did.py:L1264-L1276

Code Quality

  • Severity: P2 | Impact: Reference period entries have se=NaN, and plot_event_study drops NaN‑SE points, so the new reference period never appears in plots, undermining the visual cue for universal base period. | Concrete fix: allow plotting reference points with NaN SE (marker only, no CI) or set a plotting‑only se=0 for n_groups==0 while keeping inference fields NaN. | Location: diff_diff/staggered_aggregation.py:L420-L434, diff_diff/visualization.py:L195-L201

Performance

  • None.

Maintainability

  • None.

Tech Debt

  • None.

Security

  • None.

Documentation/Tests

  • Severity: P3 | Impact: No regression test covers the max‑pre‑violation path when the only negative period is the normalization reference, so this bug could reappear. | Concrete fix: add a test in tests/test_honest_did.py that constructs CS results with no real pre‑period effects and asserts the fallback uses overall_se (not 0). | Location: tests/test_honest_did.py

…otting

- Filter n_groups=0 entries in _estimate_max_pre_violation to prevent
  reference period (effect=0) from collapsing RM bounds
- Allow NaN SE entries in event study plots (show point without error bars)
- Add regression test for max_pre_violation exclusion behavior
- Add visualization tests for reference period with NaN SE

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Owner Author

igerber commented Jan 24, 2026

/ai-review

@github-actions
Copy link

🔁 AI review rerun (requested by @igerber)

Head SHA: d3da585b1dca093cd43dc3f2377a420d4751270a


Overall Assessment

  • ⚠️ Needs changes

Executive Summary

  • P1: With the new injected reference period, plot_event_study still defaults the reference marker/line to -1, so base_period="universal" + anticipation>0 plots can mislabel the normalization period.
  • Event-study output now adds the normalization period with NaN inference fields and n_groups=0, which aligns with the Methodology Registry and avoids misleading stats.
  • HonestDiD and PreTrendsPower filter out normalization constraints and non-finite SEs, protecting downstream inference.
  • Tests cover universal/varying modes and plotting with NaN SEs for the reference period.

Methodology

  • Severity: P1; Impact: The estimator now injects the normalization period at e=-1-anticipation (diff_diff/staggered_aggregation.py:423), but plotting still defaults reference_period=-1 (diff_diff/visualization.py:365), so for anticipation>0 the true normalization period is rendered as a regular estimate and the reference line/marker is wrong, conflicting with the registry’s definition (docs/methodology/REGISTRY.md:232); Concrete fix: when reference_period is None, infer it from results.event_study_effects (e.g., n_groups==0 or se NaN with effect 0), or store anticipation on CallawaySantAnnaResults and set reference_period = -1 - anticipation for universal base periods; add a plot test for anticipation>0.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • No findings.

…cipation

Fix _extract_plot_data() to detect reference period from n_groups=0 marker
instead of hardcoding -1. This correctly handles anticipation > 0 where the
reference period is at e = -1 - anticipation (e.g., e=-2 when anticipation=1).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@igerber
Copy link
Owner Author

igerber commented Jan 24, 2026

/ai-review

@github-actions
Copy link

🔁 AI review rerun (requested by @igerber)

Head SHA: 98ba6db004b98811bddef989ca43e8940286ebbc


Overall assessment: ✅ Looks good

Executive summary:

  • Callaway–Sant’Anna event-study output now includes the universal-base reference period and matches the Methodology Registry (effect=0, SE/CI NaN).
  • HonestDiD/PreTrends filtering excludes normalization constraints and non-finite SEs, avoiding NaN contamination in downstream inference.
  • Visualization changes correctly render NaN-SE reference points without error bars; tests cover universal/varying and anticipation.
  • Minor test gap: bootstrap interaction with the injected reference period isn’t exercised.

Methodology

  • No findings.

Code Quality

  • No findings.

Performance

  • No findings.

Maintainability

  • No findings.

Tech Debt

  • No findings.

Security

  • No findings.

Documentation/Tests

  • P3 — Impact: The base_period="universal" reference-period injection isn’t covered under bootstrap inference, so regressions in the update path could slip through. Concrete fix: add a test running CallawaySantAnna(..., base_period="universal", n_bootstrap>0, aggregate="event_study") and assert the reference period keeps NaN SE/t_stat/p_value while bootstrap updates succeed (diff_diff/staggered_aggregation.py:L420-L434).

@igerber igerber merged commit 622be06 into main Jan 24, 2026
4 checks passed
@igerber igerber deleted the feature/cs-event-study-reference-period branch January 24, 2026 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants