test: delete-this-pr-never#2
Open
bborbe wants to merge 8 commits into
Open
Conversation
bborbe
added a commit
that referenced
this pull request
Apr 27, 2026
bborbe
added a commit
that referenced
this pull request
Apr 27, 2026
Caught during local smoke test against PR #2: Claude prefixed the ai_review verdict JSON with explanatory prose despite the prompt asking for raw JSON only. json.Unmarshal failed on "All three checks pass:..." and the step incorrectly routed to human_review. New extractVerdict walks the LLM response in three stages: 1. Direct unmarshal of the trimmed response 2. Strip ```json / ``` fences then unmarshal 3. Walk back from the last '}' to find the balanced opening '{', extract that substring, unmarshal Tested via 11 table-driven cases in pkg/steps/review_test.go covering raw JSON, whitespace, fenced code blocks, prose-before, prose-around, multiple JSON fragments (last wins), nested objects, and the failure modes (empty, prose-only, unbalanced braces). Verified end-to-end on PR #2: ai_review now correctly emits NextPhase=done with verdict=pass on the same prose+JSON output that previously escalated to human_review.
bborbe
added a commit
that referenced
this pull request
Apr 30, 2026
generate-dummy-task now emits the new agent contract frontmatter (clone_url, ref, base_ref, task_identifier) at phase=in_progress so the new checkoutExecutionStep can pick it up. Drops stale -allowed-tools flag (allowed-tools are now hardcoded per phase in factory.go). Auto-fetches PR head SHA via gh; defaults target bborbe/code-reviewer PR #2.
bborbe
added a commit
that referenced
this pull request
May 1, 2026
5/6 DoD criteria verified via local e2e: - 001: GitHub HTTPS clone+review (PR #2) — pass - 002: GitHub SCP/SSH clone+review (same PR) — pass - 003: Arbitrary ref review (master vs master~1) — pass Bitbucket scenario 004 deferred: needs accessible target. Adds gitignore for run-task scratch task files (test*.md).
bborbe
added a commit
that referenced
this pull request
May 15, 2026
…Rung-2 smoke Each bug was caught by running the local smoke test against PR #2 with the real bot PAT and stepping through the failure modes one by one. - pkg/factory/runner.go: inject real ReviewVerifier (was nil) so cmd/run-task exercises the ai_review post-verification step locally. Required by spec AC line 211 (local smoke test). Falls into the same nil-disables-critical- path pattern as the prior poster fix. - pkg/githubposter/poster.go (listBotReviews): filter out state=COMMENTED reviews from the dismissal list. GitHub's API rejects dismissal of comment-state reviews with HTTP 422 "Can not dismiss a commented pull request review". Comment reviews don't block merges anyway, so leaving them stacked is harmless; only APPROVED / CHANGES_REQUESTED need dismissal. - pkg/verdict.go (parseJSONVerdict): replace per-line regex requiring both "verdict" AND "reason" tokens on the same line with a brace-balanced JSON block extractor + json.Unmarshal. The pre-spec-025 schema had "reason"; the new schema has "summary" instead, so Claude's pretty-printed multi- line JSON output never matched the old per-line check. Result was every successful review falling to the heuristic fail-closed branch (treating the agent's own approve verdict as request-changes). New parser handles legacy single-line + new multi-line fenced format; falls back to heuristic on malformed JSON; respects the last-50-lines window to avoid matching JSON examples quoted earlier in the review text. Smoke test against #2 now produces: - in_progress: posts COMMENT review under pr-review-of-ben with autoApprove- disabled body prefix; prior CHANGES_REQUESTED reviews on same SHA are dismissed; verify-after-POST confirms persistence (diagnostic: success review_id 4300948227) - ai_review: 3 quality checks pass, post-verification finds the bot review, verdict: pass, NextPhase: done
bborbe
added a commit
that referenced
this pull request
May 15, 2026
The watcher writes task body as: # PR Review: <title> <PR URL> ## Plan agentlib's Markdown splits sections at BOTH "# " and "## ". So md.Preamble is empty (H1 starts immediately) and the URL sits inside the H1 section's Body — not in the preamble. The previous extractor scanned only md.Preamble and always failed with "no GitHub PR URL found in task preamble", routing every prod task to human_review. Caught by spec-027 Rung-2 smoke (commit 7585e3f to #2): diagnostic showed failure_step=pr_url_extraction, attempt=0, before any HTTP call could fire. Fix scans md.Preamble first, then every section before the first H2, which is the watcher-authored part. Stops at the first "## " so URLs Claude writes inside ## Review are never matched.
bborbe
added a commit
that referenced
this pull request
May 15, 2026
Spec 029 set both dev.env and prod.env to the wildcard `github.com/bborbe/*`. That makes dev and prod watch the same set, so every PR in a bborbe/* repo spawns TWO tasks (one per cluster) with identical task_identifier (UUID5 dedup works) but different `stage:` frontmatter. obsidian-git commits both to vault and produces an unresolved merge conflict in the task file every time — the controller can't parse the YAML and the task gets stuck. Caught during spec-027 Rung-2 verification on PR #2: task spawned, both clusters wrote, conflict committed, controller couldn't pick up the task until a human resolved the markers. Reverting dev to go-skeleton (its original isolation target) so dev and prod no longer overlap. Prod keeps the wildcard intentionally — operator's choice; reviewed and accepted.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Test PR for pr-reviewer agent. Don't close. Is for testing PR Review :)