fix(review): stop same-head AI re-review churn and add a maintainer-gated freeze#3461
Merged
Conversation
…ated freeze Removes baseSha from the AI review cache fingerprint -- it is the live tip of the base branch, which advances on every unrelated merge regardless of whether this PR's own diff changed, so on an active repo it defeated the cache on nearly every scheduled re-gate sweep. Adds a durable "published" marker to ai_review_cache so a review that has actually reached the PR is reused indefinitely instead of only for a bounded cooldown, and stops the reviewing placeholder from posting when nothing is about to change. Also adds a maintainer-gated freeze: once a PR carries the manual-review label, new pushes reuse the last published review instead of spending a fresh AI call, until a maintainer/collaborator explicitly retriggers it. Validated with npm run test:coverage (9078 passed), npm run test:workers, db:migrations:check, db:schema-drift:check, cf-typegen:check, ui:openapi:check, ui:openapi:settings-parity, actionlint, and npm audit --audit-level=moderate.
|
Superagent didn't find any vulnerabilities or security issues in this PR. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #3461 +/- ##
=======================================
Coverage 92.99% 92.99%
=======================================
Files 296 296
Lines 30972 30995 +23
Branches 11294 11303 +9
=======================================
+ Hits 28803 28825 +22
- Misses 1514 1515 +1
Partials 655 655
🚀 New features to boost your workflow:
|
17 tasks
JSONbored
added a commit
that referenced
this pull request
Jul 5, 2026
… path (#3477) * test(review): cover the markAiReviewPublished write-failure fail-open path #3461 added markAiReviewPublished alongside the existing markPullRequestSurfacePublished stamp but only tested the latter's failure path, leaving the new call's own catch/console.error branch without a regression test (codecov/patch flagged it after merge). Mirrors the existing "over-publish dedup" test for the sibling stamp. * test(review): assert the exact stamp identity in the fail-open regression Addresses the gate review's nit on #3477: the markAiReviewPublished spy assertion only checked that it was called, not with which repo/PR/head, so a wrong-identity call would have passed silently.
18 tasks
JSONbored
added a commit
that referenced
this pull request
Jul 5, 2026
…view freeze (#3490) The maintainer-gated freeze (#3461) reused the last published AI review for any PR carrying the manual-review label, with no author exemption. Confirmed live on PR #3476: pushing genuine follow-up commits to the owner's own held PR kept replaying the ORIGINAL, now-stale AI verdict instead of evaluating the new commits, because github_app.ai_review_frozen_reuse fired on every push. The anti-gaming concern the freeze exists for is specific to a contributor iterating pushes against the bot; it never applies to the repo owner, an ADMIN_GITHUB_LOGINS fleet operator, or a protected automation bot, matching the exemption this codebase already grants those authors everywhere else (auto-close, review-nag, contributor caps). Root-caused via the live audit_events trail on the self-host VPS, which showed ai_review_frozen_reuse firing on every one of #3476's own follow-up commits despite the PR being owner-authored.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes the wasteful/incorrect same-head PR re-review churn observed in production (PR #3379's repeated AI calls and comment edits, PR #3383's held-verdict flip). Three compounding bugs, all confirmed against live audit-event evidence before fixing:
baseShain the AI review cache fingerprint (src/review/ai-review-cache-input.ts). It hashed the PR's livebase.shaon top of the actual diff content.base.shais the live tip of the target branch — it advances on every unrelated merge, independent of whether this PR's own files changed. On an active repo that means the fingerprint differs on nearly every scheduled regate-sweep tick, defeating both the durable cache and the existing bounded non-cacheable-cooldown reuse (added for a prior incident, perf(review): cache the AI review by (repo, pr, head SHA) to skip redundant LLM calls #1462) — since that lookup is keyed on an exact fingerprint match. The diff content (reviewFiles' patches) already captures "did the reviewed content change"; hashingbaseShaon top was redundant when unchanged and actively harmful when the repo is busy.src/db/repositories.ts). A genuinely non-cacheable (dynamic-context: grounding/RAG/reputation) review was only reusable for a bounded 30-minute cooldown — past that, every sweep tick paid for a fresh LLM call forever. Since LLM output isn't perfectly deterministic, two calls against identical input could disagree, flipping the published verdict.src/queue/processors.ts), before checking whether anything had changed — guaranteeing two real comment edits per sweep tick (placeholder-in, final-out) forever, defeating the existing byte-identical no-op guard at the comment layer.What changed
baseShafromAiReviewCacheInput/aiReviewCacheInputFingerprint.ai_review_cache.published_atcolumn (migrations/0112_ai_review_cache_published.sql) +markAiReviewPublished()/getLatestPublishedAiReview(). Once published, a review is reused indefinitely for its exact head+fingerprint regardless of the cooldown; a fresh write resetspublished_attoNULL.manual-reviewlabel, further contributor pushes reuse the last published review instead of spending a fresh AI call — closing a gaming surface where a held PR could be pushed to repeatedly at real LLM cost while waiting for a maintainer to actually look. Only an explicit maintainer/collaborator retrigger (the existing PR-panel checkbox, unchanged authorization) unfreezes it. CI/mergeable-status rendering and label/assignee reconciliation are unaffected — both still run every pass.Two pre-existing tests had their assertions deliberately updated (not regressed) because they encoded the now-superseded policy: one asserted a bounded-but-still-unbounded-over-time retry cadence for non-cacheable reviews, the other asserted
baseShaalone should miss the cache.No linked issue — this is a direct fix for production incident evidence gathered from
audit_eventson the live deployment, not a pre-filed feature request.Scope
type(scope): short summaryConventional Commit format, for examplefix(api): restore profile access checks.CONTRIBUTING.mdand does not reintroduce GitHub Pages, VitePress,site/, orCNAME.Validation
git diff --checknpm run actionlintnpm run typechecknpm run test:coverage— full unsharded run, 9078 passed, 0 failednpm run test:workersnpm run build:mcp(not run — no MCP package changes)npm run test:mcp-pack(not run — no MCP package changes)npm run ui:openapi:checknpm run ui:lint/ui:typecheck/ui:build(not run — no UI files touched; confirmed viaui:openapi:settings-paritythatRepositorySettingsis unaffected)npm audit --audit-level=moderate— 0 vulnerabilitiesai-review-cache.test.ts/ai-review-cache-input.test.ts. Targeted branch coverage on every changed line range is 100% (one defensive?? nullfallback markedv8 ignore, consistent with 6 identical pre-existing instances in the same file).If any required check was skipped, explain why:
build:mcp/test:mcp-pack: no changes under the MCP package.ui:lint/ui:typecheck/ui:build: no changes underapps/gittensory-ui/**; also separately rannpm run cf-typegen:check(clean, no wrangler binding changes) even though it's not in the template's list.Safety
ui:openapi:check+ui:openapi:settings-parity.)UI Evidencesection below with screenshots. (N/A — backend-only change, no visible UI.)Notes
0112_ai_review_cache_published.sql(contiguous, verified viadb:migrations:check);src/db/schema.tsupdated to match (verified viadb:schema-drift:check).