Skip to content

docs: Gardner (2022) paper review — TwoStageDiD (PR-A)#539

Merged
igerber merged 1 commit into
mainfrom
docs/gardner-2022-review
Jun 23, 2026
Merged

docs: Gardner (2022) paper review — TwoStageDiD (PR-A)#539
igerber merged 1 commit into
mainfrom
docs/gardner-2022-review

Conversation

@igerber

@igerber igerber commented Jun 22, 2026

Copy link
Copy Markdown
Owner

Summary

  • New docs/methodology/papers/gardner-2022-review.md — eq./section-numbered scholarly review of Gardner (2022), Two-stage differences in differences (arXiv:2207.05943), the primary source for TwoStageDiD (R did2s). Covers the two-stage procedure (§3), estimands eqs. (4)/(5), event-study eq. (6) + Step 2′, the Newey-McFadden (1994) Thm 6.1 GMM variance (§3.3), and the Appendix B reference Stata GMM.
  • Corrected variance misattributions in REGISTRY.md ## TwoStageDiD and the METHODOLOGY_REVIEW.md tracker that the source read surfaced:
    • (i) The "Equation 6 per-cluster inverse (D_c'D_c)^{-1} deviation" was fabricated — eq. (6) is the event-study regression spec; the variance (§3.3, unnumbered) uses the global Jacobian inverse (Newey-McFadden Thm 6.1), clustered at the unit (App. B vce(cluster id)). Our code already matches this and did2s — never a deviation.
    • (ii) "(Gardner 2022, Theorem 1)" → the paper has no numbered theorems; corrected to "§3.3; Newey-McFadden 1994, Thm 6.1".
    • Relabeled the cluster-summed meat (was mislabeled "Bread"). The accurate "no finite-sample multiplier" Deviation from R note is unchanged.
  • Corrected the did2s bootstrap-default claim (3 places: paper review, REGISTRY, and the one-line two_stage_results.py docstring): did2s defaults to analytical corrected clustered SEs (bootstrap = FALSE); block bootstrap is optional, not the default (verified against the did2s source).
  • doc-deps.yaml: mapped the review under two_stage.py. TODO.md: tracked the PR-B deliverables.

This is PR-A of the 2-PR TwoStageDiD methodology validation (the imputation-pair twin of the just-completed ImputationDiD). PR-B adds tests/test_methodology_two_stage.py + a did2s R-parity fixture, then flips the tracker row to Complete.

Methodology references (required if estimator / math changes)

  • Method name(s): TwoStageDiD (Gardner 2022 / R did2s)
  • Paper / source link(s): Gardner, J. (2022), Two-stage differences in differences, arXiv:2207.05943 — https://arxiv.org/abs/2207.05943
  • Any intentional deviations from the source: None — this PR removes a fabricated deviation. The implementation is faithful to the paper's global Newey-McFadden GMM sandwich. The only source change is a one-line docstring correction in two_stage_results.py (the did2s bootstrap-default claim); no logic changes.

Validation

  • Tests added/updated: None (no logic change). tests/test_doc_deps_integrity.py green (149 passed); scholarly references (references.rst) + catalog surfaces verified via /docs-check; two_stage_results imports clean.
  • Backtest / simulation / notebook evidence: N/A.
  • AI review (local + CI agentic codex / gpt-5.5): clean after addressing a bootstrap-attribution finding (Gardner uses analytical GMM SEs; did2s defaults to bootstrap = FALSE), a tracker-consistency finding (the METHODOLOGY_REVIEW.md Eq. 6 deviation claim), and a PDF page/table metadata fix (34 pages, tables 1–4).

Security / privacy

  • Confirm no secrets/PII in this PR: Yes

🤖 Generated with Claude Code

@github-actions

Copy link
Copy Markdown

Overall Assessment

⚠️ Needs changes — one unmitigated P1 methodology documentation mismatch remains.

Executive Summary

  • The REGISTRY.md correction is methodologically right: Gardner Eq. (6) is the event-study regression, and §3.3 uses the global Newey-McFadden GMM sandwich. (arxiv.org)
  • P1: METHODOLOGY_REVIEW.md still repeats the old false “paper Eq. 6 deviation” claim.
  • P2: the new paper review says R did2s block bootstrap is the default, but current did2s defaults bootstrap = FALSE. (raw.githubusercontent.com)
  • The deferred PR-B test/R-parity work is properly tracked in TODO.md, so it is not a blocker.
  • I could not run tests locally because pytest is not installed in this environment.

Methodology

Finding 1 — P1

Location: METHODOLOGY_REVIEW.md:L592-L597

Impact: The PR fixes the fabricated Eq. (6) variance deviation in docs/methodology/REGISTRY.md:L1398 and the new review, but the methodology tracker still says the implementation matches did2s “not paper Eq. 6” and lists “Newey-McFadden Theorem 6.1 sandwich vs paper’s Eq. 6” as an outstanding documented deviation. That directly contradicts the corrected source review and leaves a canonical tracker with a known-false methodology statement.

Concrete fix: Update the TwoStageDiD tracker entry to say Gardner Eq. (6) is the event-study spec, §3.3 uses the global Jacobian/GMM sandwich, and the implementation matches the paper and did2s. Add docs/methodology/papers/gardner-2022-review.md under “Documentation in place”; keep status In Progress if PR-B tests/R parity are still deferred.

Finding 2 — P2

Location: docs/methodology/papers/gardner-2022-review.md:L135

Impact: The new review states “Block bootstrap is the R did2s package default.” Current did2s documents bootstrap as optional with default FALSE, and the function signature is bootstrap = FALSE; block bootstrap is only used when explicitly enabled. This can mislead the planned R-parity fixture about which inference path is the default. (raw.githubusercontent.com)

Concrete fix: Change the sentence to: R did2s defaults to analytical corrected clustered SEs and supports optional block bootstrap via bootstrap = TRUE. For consistency, also correct the duplicated pre-existing claims at docs/methodology/REGISTRY.md:L1404 and diff_diff/two_stage_results.py:L27-L29.

Code Quality

Severity: N/A
Impact: No source code changes in this PR.
Concrete fix: None.

Performance

Severity: N/A
Impact: Documentation-only PR; no runtime path changed.
Concrete fix: None.

Maintainability

Severity: P2
Impact: The remaining contradiction between REGISTRY.md, the new paper review, and METHODOLOGY_REVIEW.md creates avoidable future review drift.
Concrete fix: Same as Methodology Finding 1.

Tech Debt

Severity: P3
Impact: PR-B methodology tests and did2s R-parity fixture are deferred, but the PR adds a TODO.md tracking row at TODO.md:L176, which is acceptable for test/documentation debt.
Concrete fix: No action required for this PR beyond keeping the tracker accurate.

Security

Severity: N/A
Impact: No secrets, credentials, or executable workflow changes observed in the diff.
Concrete fix: None.

Documentation/Tests

Severity: P3
Impact: No tests were added, but this is tracked in TODO.md:L176. I attempted tests/test_doc_deps_integrity.py; it could not run because /usr/bin/python has no pytest module.
Concrete fix: None required for the deferred test gap; CI should still run the doc-deps integrity suite.

Path To Approval

  1. Update METHODOLOGY_REVIEW.md:L592-L597 to remove the false Eq. (6) deviation and record the new Gardner paper review while leaving the PR-B tests/R fixture as the remaining promotion work.

@igerber igerber force-pushed the docs/gardner-2022-review branch from 4f14300 to 460e0f1 Compare June 22, 2026 15:39
@github-actions

Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 460e0f104479d33b7fde233e74f6497affa18101


Overall Assessment

⚠️ Needs changes — one unmitigated P1 methodology documentation mismatch remains.

Executive Summary

  • The prior P1 in METHODOLOGY_REVIEW.md:L590-L598 is resolved.
  • The prior did2s bootstrap-default issue is resolved in docs/methodology/REGISTRY.md:L1404, diff_diff/two_stage_results.py:L27-L31, and docs/methodology/papers/gardner-2022-review.md:L135; current did2s has bootstrap = FALSE as the default. (raw.githubusercontent.com)
  • [Newly identified] docs/methodology/REGISTRY.md:L1432 still contains the stale “matches R did2s, not paper Eq. 6” claim.
  • The deferred PR-B methodology tests and did2s fixture are tracked in TODO.md:L176, so they are not blockers.
  • I could not run pytest because /usr/bin/python has no pytest module; docs/doc-deps.yaml parses successfully.

Methodology

Finding 1 — P1 [Newly identified]

Location: docs/methodology/REGISTRY.md:L1431-L1433

Impact: The Verified Components checklist still says Global (D'D)^{-1} in variance (matches R did2s, not paper Eq. 6). That contradicts the corrected registry note at docs/methodology/REGISTRY.md:L1398 and the new paper review. Gardner Eq. (6) is the event-study regression specification, while §3.3 frames inference as a joint GMM/generated-regressand variance problem and Appendix B implements GMM with clustered SEs. (arxiv.org)

Concrete fix: Replace the checklist item with something like: Global (D'D)^{-1} in variance (faithful to Gardner §3.3 / Newey-McFadden GMM sandwich and matches R did2s). Then rerun a grep for not paper Eq. 6, Equation 6 discrepancy, and per-cluster inverse across the TwoStageDiD docs.

Code Quality

Severity: N/A
Impact: No estimator code path changed; the edited source file only corrects a docstring. Existing TwoStageDiD inference paths still use safe_inference.
Concrete fix: None.

Performance

Severity: N/A
Impact: Documentation-only methodology changes; no runtime behavior changed.
Concrete fix: None.

Maintainability

Severity: P2
Impact: METHODOLOGY_REVIEW.md:L1447-L1450 still says TwoStageDiD needs a Gardner paper review, even though this PR adds it. This is stale tracker state, not a methodology blocker.
Concrete fix: Update that line to say the Gardner review is now present and only tests/test_methodology_two_stage.py plus the did2s R fixture remain.

Tech Debt

Severity: P3
Impact: PR-B test and R-parity work is explicitly tracked in TODO.md:L176.
Concrete fix: No action required for this PR.

Security

Severity: N/A
Impact: No secrets or executable workflow changes observed in the changed files.
Concrete fix: None.

Documentation/Tests

Severity: P2
Impact: Same stale tracker issue as Maintainability: METHODOLOGY_REVIEW.md:L1447-L1450 should no longer list the Gardner paper review as missing.
Concrete fix: Update the promotion-priority text to reflect that only the methodology test file and did2s fixture remain.

Path To Approval

  1. Update docs/methodology/REGISTRY.md:L1432 to remove the false “not paper Eq. 6” wording and state that the global inverse is faithful to Gardner §3.3 / Newey-McFadden and matches did2s.

@igerber igerber force-pushed the docs/gardner-2022-review branch from 460e0f1 to 21ecef9 Compare June 22, 2026 15:58
@github-actions

Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 21ecef9d6f40b2a752663b71b3aa8a5ca98535d0


Overall Assessment

⚠️ Needs changes — one newly identified P1 methodology-documentation mismatch remains.

Executive Summary

  • The prior P1 in docs/methodology/REGISTRY.md:L1432 is resolved; the checklist now says the global inverse is faithful to Gardner §3.3 / Newey-McFadden and matches did2s.
  • The prior stale tracker issue is resolved in METHODOLOGY_REVIEW.md:L1450.
  • The did2s bootstrap-default correction is consistent with the R source: bootstrap = FALSE is the default. (raw.githubusercontent.com)
  • [Newly identified] The new Gardner review overstates implemented scope by marking Gardner Eq. (5) -average and the full-sample first-stage variant as checked paper→library components, but TwoStageDiD exposes untreated-only Stage 1 and no static estimand.
  • TODO.md:L176 tracks PR-B tests/R fixture work, so that deferred validation is not a blocker.
  • docs/doc-deps.yaml parses successfully; I did not run pytest in the read-only review sandbox.

Methodology

Finding 1 — P1 [Newly identified]

Location: docs/methodology/papers/gardner-2022-review.md:L99-L103, docs/methodology/papers/gardner-2022-review.md:L124-L125

Impact: The new review says its checklist is “paper → library” and marks Eq. (5) -average “via duration-restricted Stage-2 sample” plus the “first-stage flexibility/full-sample variant” as checked. Gardner does describe both as valid modifications: Eq. (5) is recovered by restricting the second-stage sample to untreated observations and treated observations with durations no greater than , and footnote 8 says full-sample first-stage variants can be more efficient. (ar5iv.labs.arxiv.org) But the Python implementation still documents and executes Stage 1 on untreated observations only (diff_diff/two_stage.py:L1219-L1223, diff_diff/two_stage.py:L1638-L1641, diff_diff/two_stage.py:L2255-L2281), and the public parameters/get_params have no /duration-restricted static estimand or first-stage-sample selector (diff_diff/two_stage.py:L1320-L1345, diff_diff/two_stage.py:L3311-L3324). This makes the methodology review overclaim verified implementation support.

Concrete fix: Reword the checklist/tuning table so only implemented components are checked: overall ATT Eq. (4), event-study Eq. (6), untreated-only Stage 1, and current variance behavior. Mark Eq. (5) and full-sample first-stage variants as “paper-permitted but not exposed by TwoStageDiD” or add explicit TODO tracking if planned. If the intent is to support them now, add the public parameters, get_params() propagation, stage-2 filtering/first-stage branching, bootstrap/inference handling, and tests.

Code Quality

Severity: N/A
Impact: No runtime estimator logic changed; the only Python change is the TwoStageBootstrapResults docstring correction at diff_diff/two_stage_results.py:L27-L31.
Concrete fix: None.

Performance

Severity: N/A
Impact: Documentation/docstring-only changes; no runtime path changed.
Concrete fix: None.

Maintainability

Severity: P3
Impact: The priority-list heading at METHODOLOGY_REVIEW.md:L1447 still says the section is missing “a methodology test file / R parity and a paper review,” while the TwoStageDiD row now correctly says the Gardner review landed. The row itself is clear, so this is minor wording drift.
Concrete fix: Optionally change the heading to “missing one or more methodology test files, R parity fixtures, or paper reviews.”

Tech Debt

Severity: P3
Impact: PR-B methodology tests and did2s R parity are explicitly tracked in TODO.md:L176.
Concrete fix: No action required for this PR.

Security

Severity: N/A
Impact: No secrets, workflow changes, or executable security-sensitive changes observed.
Concrete fix: None.

Documentation/Tests

Severity: P1
Impact: Same as Methodology Finding 1: the new documentation overstates implemented estimator scope.
Concrete fix: Correct gardner-2022-review.md:L99-L103 and L124-L125 to distinguish implemented features from paper-permitted but unimplemented variants.

Path to Approval

  1. Update docs/methodology/papers/gardner-2022-review.md:L101 so Eq. (5) is not marked as implemented unless a public /duration-restricted static estimand is actually added.
  2. Update docs/methodology/papers/gardner-2022-review.md:L103 and L125 so the full-sample first-stage variant is described as paper-permitted but not exposed by Python TwoStageDiD, or implement and test a first-stage-sample selector end to end.

@igerber igerber force-pushed the docs/gardner-2022-review branch from 21ecef9 to 6f66aba Compare June 22, 2026 16:09
@github-actions

Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 6f66aba9c44ec51e11d8f2656d42b534587ad9a1


Overall Assessment

Looks good — no unmitigated P0/P1 findings. The prior P1 overclaim about Eq. (5) / full-sample first-stage support is resolved.

Executive Summary

  • Prior P1 resolved: docs/methodology/papers/gardner-2022-review.md:L99-L104 now explicitly marks the estimand and full-sample first-stage variant as paper-permitted but not exposed by TwoStageDiD.
  • The REGISTRY variance correction is source-consistent: Gardner Eq. (6) is the event-study regression, while §3.3 gives the joint-GMM/Newey-McFadden sandwich; Appendix B uses onestep, winitial(identity), and vce(cluster id). (arxiv.org)
  • The did2s bootstrap-default correction is source-consistent: R did2s() has bootstrap = FALSE by default and uses the analytical corrected clustered-SE branch unless bootstrap is requested. (raw.githubusercontent.com)
  • One P2 documentation wording issue remains in the new paper review: it says Newey-McFadden is the only theorem invoked anywhere, but Gardner also cites de Chaisemartin and D’Haultfœuille Theorem 1 for the TWFE-weight representation.
  • TODO.md:L176 tracks the PR-B methodology test / R parity fixture work, so those gaps are not blockers.

Methodology

Finding 1 — P2

Location: docs/methodology/papers/gardner-2022-review.md:L10

Impact: The review correctly says Gardner has no numbered theorem of his own, and correctly replaces the nonexistent “Gardner 2022, Theorem 1” variance attribution with Gardner §3.3 / Newey-McFadden Theorem 6.1. But the sentence overstates this as “the only ‘Theorem’ invoked anywhere is Newey & McFadden,” while the paper also invokes de Chaisemartin and D’Haultfœuille Theorem 1 for the TWFE-weight representation. This does not change the TwoStageDiD implementation contract, but it makes the source-review prose inaccurate. (arxiv.org)

Concrete fix: Reword to: “The paper contains no Gardner-numbered theorems or propositions; the variance relies on Newey-McFadden (1994), Theorem 6.1. Section 2.2 separately cites de Chaisemartin and D’Haultfœuille (2020), Theorem 1, for the TWFE-weight representation.”

Code Quality

Severity: N/A
Impact: No runtime estimator logic changed; the only Python change is the TwoStageBootstrapResults docstring correction at diff_diff/two_stage_results.py:L27-L31.
Concrete fix: None.

Performance

Severity: N/A
Impact: Documentation/docstring-only changes; no runtime path changed.
Concrete fix: None.

Maintainability

Severity: N/A
Impact: The previous tracker wording drift is resolved at METHODOLOGY_REVIEW.md:L1447-L1450.
Concrete fix: None.

Tech Debt

Severity: P3
Impact: PR-B validation work is explicitly tracked in TODO.md:L176, including tests/test_methodology_two_stage.py and a did2s R parity fixture.
Concrete fix: No action required for this PR.

Security

Severity: N/A
Impact: No secrets, workflow changes, executable code paths, or security-sensitive changes observed.
Concrete fix: None.

Documentation/Tests

Severity: P2
Impact: Same as Methodology Finding 1: one source-review sentence overstates theorem references. Validation gaps are tracked in TODO.md:L176.
Concrete fix: Apply the wording correction above. I verified docs/doc-deps.yaml parses and git diff --check is clean; I did not run pytest in the read-only review sandbox.

@igerber igerber force-pushed the docs/gardner-2022-review branch from 6f66aba to aa29cec Compare June 22, 2026 16:14
@github-actions

Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: aa29cec8e53d476955717b68b274c52ed7086056


Overall Assessment

Looks good — no unmitigated P0/P1 findings.

Executive Summary

  • Prior P2 wording issue is resolved: docs/methodology/papers/gardner-2022-review.md:L10 now correctly says there is no Gardner-numbered theorem, while separately acknowledging de Chaisemartin & D’Haultfœuille Theorem 1.
  • The REGISTRY correction is source-consistent: Gardner Eq. (6) is the event-study regression, while §3.3 frames inference as joint GMM and Appendix B uses vce(cluster id) GMM syntax. (arxiv.org)
  • The did2s bootstrap correction is source-consistent: R did2s() declares bootstrap = FALSE by default and enters the analytical corrected-clustered-SE branch when bootstrap is false. (raw.githubusercontent.com)
  • PR-B methodology tests and R parity are explicitly tracked in TODO.md:L176, so the remaining validation work is not a blocker under the deferred-work rules.
  • No runtime estimator logic changed; the only Python change is a docstring correction in diff_diff/two_stage_results.py:L27-L31.

Methodology

Severity: N/A
Impact: No methodology defect found. docs/methodology/REGISTRY.md:L1390-L1404 now accurately labels the cluster outer-product as meat, removes the fabricated Eq. 6 variance discrepancy, and describes the global GMM bread. This aligns with Gardner’s two-stage setup and inference discussion: first stage on untreated observations, second stage on adjusted outcomes, Eq. (6) as event-study specification, §3.3 joint GMM inference, and Appendix B clustered GMM syntax. (arxiv.org)
Concrete fix: None.

Code Quality

Severity: N/A
Impact: No runtime code path changed. The docstring correction in diff_diff/two_stage_results.py:L27-L31 matches the R source default.
Concrete fix: None.

Performance

Severity: N/A
Impact: Documentation/docstring-only changes; no performance-sensitive path changed.
Concrete fix: None.

Maintainability

Severity: N/A
Impact: Tracker wording is now consistent across METHODOLOGY_REVIEW.md:L589-L598, METHODOLOGY_REVIEW.md:L1447-L1450, and docs/methodology/REGISTRY.md:L1427-L1436.
Concrete fix: None.

Tech Debt

Severity: P3 informational
Impact: Remaining TwoStageDiD methodology validation work is tracked in TODO.md:L176, including tests/test_methodology_two_stage.py and a did2s R parity fixture.
Concrete fix: No action required for this PR.

Security

Severity: N/A
Impact: No secrets, workflow changes, executable paths, or security-sensitive changes observed.
Concrete fix: None.

Documentation/Tests

Severity: P3 informational
Impact: docs/doc-deps.yaml:L249-L256 now maps the new Gardner review under two_stage.py, and the prior documentation wording issue is resolved. I parsed docs/doc-deps.yaml successfully and checked for stale “block bootstrap by default” / fabricated Eq. 6 wording in the affected areas. I did not run pytest in this read-only review sandbox.
Concrete fix: None for PR-A; PR-B test/R parity work is tracked in TODO.md:L176.

Methodology-review PR-A for TwoStageDiD (Gardner 2022, arXiv:2207.05943;
R `did2s`), the imputation-pair twin of the just-completed ImputationDiD.

- New `docs/methodology/papers/gardner-2022-review.md`: eq./section-numbered
  scholarly review of the primary source.
- REGISTRY `## TwoStageDiD` + `METHODOLOGY_REVIEW.md` tracker: corrected the
  variance misattributions the source read surfaced — (i) the "Equation 6
  per-cluster inverse (D_c'D_c)^{-1} deviation" was fabricated (eq. 6 is the
  event-study spec; the variance is the unnumbered GLOBAL Newey-McFadden
  Thm 6.1 Jacobian-inverse sandwich, which the code already matches — not a
  deviation); (ii) "(Gardner 2022, Theorem 1)" (the paper has no numbered
  theorems); relabeled the cluster-summed meat (was "Bread").
- Corrected the `did2s` bootstrap-default claim in 3 places (paper review,
  REGISTRY, two_stage_results.py docstring): did2s defaults to analytical
  corrected clustered SEs (`bootstrap = FALSE`); block bootstrap is optional,
  not the default (verified vs the did2s source).
- doc-deps.yaml: mapped the review under two_stage.py. TODO.md: tracked the
  PR-B deliverables (tests + did2s parity -> tracker flip).

doc-deps integrity green; references + catalog verified. Local + CI agentic
AI review clean after addressing the bootstrap-attribution and tracker-
consistency findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@igerber igerber force-pushed the docs/gardner-2022-review branch from aa29cec to 1f3b629 Compare June 22, 2026 22:03
@github-actions

Copy link
Copy Markdown

🔁 AI review rerun (requested by @igerber)

Head SHA: 1f3b6297c8a74b28dc7ceae4e23bdc05fb1c18bf


Overall Assessment

Looks good — no unmitigated P0/P1 findings.

Executive Summary

  • Re-review result is unchanged: the prior wording/methodology attribution issue is resolved.
  • Affected method: TwoStageDiD; this PR changes documentation and one results docstring only, with no estimator logic, weighting, variance, SE, or default-behavior changes.
  • The REGISTRY correction is source-consistent: Gardner Eq. (6) is the event-study regression, while §3.3 gives joint-GMM inference with a global Jacobian inverse; Appendix B uses GMM with clustered SEs. (arxiv.org)
  • The did2s bootstrap correction is accurate: bootstrap = FALSE is the documented default, and bootstrap SEs are optional. (kylebutts.github.io)
  • Remaining TwoStageDiD methodology tests/R parity are tracked in TODO.md:L176, so they are P3 informational under the deferred-work policy.

Methodology

Severity: N/A
Impact: No methodology defect found. docs/methodology/REGISTRY.md:L1393-L1411 now correctly labels the clustered score outer-product as meat, describes the global GMM bread/Jacobian inverse, removes the fabricated “paper Eq. 6 per-cluster inverse” deviation, and corrects the did2s bootstrap-default claim. This matches Gardner §3.2/§3.3 and Appendix B.
Concrete fix: None.

Severity: P3 informational
Impact: Dedicated TwoStageDiD methodology validation and did2s R parity remain outstanding, but the PR explicitly tracks them in TODO.md:L176 and keeps the tracker status as In Progress at METHODOLOGY_REVIEW.md:L586-L598.
Concrete fix: No action required for this PR; complete the tracked PR-B items before promotion to Complete.

Code Quality

Severity: N/A
Impact: The only Python change is a docstring correction in diff_diff/two_stage_results.py:L27-L31. The surrounding GMM variance docstring in diff_diff/two_stage.py:L2842-L2857 already states the global inverse/no finite-sample-adjustment contract, so no partial source-code correction is left behind.
Concrete fix: None.

Performance

Severity: N/A
Impact: Documentation/docstring-only changes; no runtime or performance-sensitive path changed.
Concrete fix: None.

Maintainability

Severity: N/A
Impact: The methodology tracker, REGISTRY, paper review, and doc dependency mapping are consistent: see METHODOLOGY_REVIEW.md:L589-L598, docs/methodology/REGISTRY.md:L1405-L1411, docs/methodology/papers/gardner-2022-review.md:L149-L155, and docs/doc-deps.yaml:L249-L256.
Concrete fix: None.

Tech Debt

Severity: P3 informational
Impact: PR-B follow-up work is properly tracked in TODO.md:L176, including tests/test_methodology_two_stage.py, the did2s golden fixture, and the tracker promotion step.
Concrete fix: No action required in this PR.

Security

Severity: N/A
Impact: No executable workflow, dependency, credential, or security-sensitive code changes observed in the diff.
Concrete fix: None.

Documentation/Tests

Severity: P3 informational
Impact: No tests were added, but this is acceptable for the documented PR-A scope because no estimator logic changed and the missing methodology/R-parity coverage is tracked. I checked docs/doc-deps.yaml parses, git diff --check is clean, and grepped affected files for stale “block bootstrap by default” / fabricated Eq. 6 wording. I did not run pytest in this read-only review sandbox.
Concrete fix: None for this PR.

@igerber igerber merged commit 6502f19 into main Jun 23, 2026
42 of 43 checks passed
@igerber igerber deleted the docs/gardner-2022-review branch June 23, 2026 10:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ready-for-ci Triggers CI test workflows

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant