Skip to content

test: delete-this-pr-never#2

Open
bborbe wants to merge 8 commits into
masterfrom
delete-this-pr-never
Open

test: delete-this-pr-never#2
bborbe wants to merge 8 commits into
masterfrom
delete-this-pr-never

Conversation

@bborbe
Copy link
Copy Markdown
Owner

@bborbe bborbe commented Apr 23, 2026

Test PR for pr-reviewer agent. Don't close. Is for testing PR Review :)

bborbe added a commit that referenced this pull request Apr 27, 2026
Caught during local smoke test against PR #2: Claude prefixed the
ai_review verdict JSON with explanatory prose despite the prompt
asking for raw JSON only. json.Unmarshal failed on "All three checks
pass:..." and the step incorrectly routed to human_review.

New extractVerdict walks the LLM response in three stages:
1. Direct unmarshal of the trimmed response
2. Strip ```json / ``` fences then unmarshal
3. Walk back from the last '}' to find the balanced opening '{',
   extract that substring, unmarshal

Tested via 11 table-driven cases in pkg/steps/review_test.go covering
raw JSON, whitespace, fenced code blocks, prose-before, prose-around,
multiple JSON fragments (last wins), nested objects, and the failure
modes (empty, prose-only, unbalanced braces).

Verified end-to-end on PR #2: ai_review now correctly emits
NextPhase=done with verdict=pass on the same prose+JSON output that
previously escalated to human_review.
bborbe added a commit that referenced this pull request Apr 30, 2026
generate-dummy-task now emits the new agent contract frontmatter
(clone_url, ref, base_ref, task_identifier) at phase=in_progress so
the new checkoutExecutionStep can pick it up. Drops stale
-allowed-tools flag (allowed-tools are now hardcoded per phase in
factory.go). Auto-fetches PR head SHA via gh; defaults target
bborbe/code-reviewer PR #2.
bborbe added a commit that referenced this pull request May 1, 2026
5/6 DoD criteria verified via local e2e:
- 001: GitHub HTTPS clone+review (PR #2) — pass
- 002: GitHub SCP/SSH clone+review (same PR) — pass
- 003: Arbitrary ref review (master vs master~1) — pass

Bitbucket scenario 004 deferred: needs accessible target.

Adds gitignore for run-task scratch task files (test*.md).
bborbe

This comment was marked as duplicate.

bborbe

This comment was marked as duplicate.

bborbe added a commit that referenced this pull request May 15, 2026
…Rung-2 smoke

Each bug was caught by running the local smoke test against PR #2 with the
real bot PAT and stepping through the failure modes one by one.

- pkg/factory/runner.go: inject real ReviewVerifier (was nil) so cmd/run-task
  exercises the ai_review post-verification step locally. Required by spec
  AC line 211 (local smoke test). Falls into the same nil-disables-critical-
  path pattern as the prior poster fix.

- pkg/githubposter/poster.go (listBotReviews): filter out state=COMMENTED
  reviews from the dismissal list. GitHub's API rejects dismissal of
  comment-state reviews with HTTP 422 "Can not dismiss a commented pull
  request review". Comment reviews don't block merges anyway, so leaving
  them stacked is harmless; only APPROVED / CHANGES_REQUESTED need dismissal.

- pkg/verdict.go (parseJSONVerdict): replace per-line regex requiring both
  "verdict" AND "reason" tokens on the same line with a brace-balanced JSON
  block extractor + json.Unmarshal. The pre-spec-025 schema had "reason";
  the new schema has "summary" instead, so Claude's pretty-printed multi-
  line JSON output never matched the old per-line check. Result was every
  successful review falling to the heuristic fail-closed branch (treating
  the agent's own approve verdict as request-changes). New parser handles
  legacy single-line + new multi-line fenced format; falls back to heuristic
  on malformed JSON; respects the last-50-lines window to avoid matching
  JSON examples quoted earlier in the review text.

Smoke test against #2 now produces:
- in_progress: posts COMMENT review under pr-review-of-ben with autoApprove-
  disabled body prefix; prior CHANGES_REQUESTED reviews on same SHA are
  dismissed; verify-after-POST confirms persistence (diagnostic: success
  review_id 4300948227)
- ai_review: 3 quality checks pass, post-verification finds the bot review,
  verdict: pass, NextPhase: done
bborbe added a commit that referenced this pull request May 15, 2026
The watcher writes task body as:
  # PR Review: <title>
  <PR URL>
  ## Plan

agentlib's Markdown splits sections at BOTH "# " and "## ". So md.Preamble
is empty (H1 starts immediately) and the URL sits inside the H1 section's
Body — not in the preamble. The previous extractor scanned only md.Preamble
and always failed with "no GitHub PR URL found in task preamble", routing
every prod task to human_review.

Caught by spec-027 Rung-2 smoke (commit 7585e3f to #2):
diagnostic showed failure_step=pr_url_extraction, attempt=0, before any
HTTP call could fire.

Fix scans md.Preamble first, then every section before the first H2,
which is the watcher-authored part. Stops at the first "## " so URLs
Claude writes inside ## Review are never matched.
bborbe added a commit that referenced this pull request May 15, 2026
Spec 029 set both dev.env and prod.env to the wildcard `github.com/bborbe/*`.
That makes dev and prod watch the same set, so every PR in a bborbe/* repo
spawns TWO tasks (one per cluster) with identical task_identifier (UUID5
dedup works) but different `stage:` frontmatter. obsidian-git commits both
to vault and produces an unresolved merge conflict in the task file every
time — the controller can't parse the YAML and the task gets stuck.

Caught during spec-027 Rung-2 verification on PR #2: task spawned, both
clusters wrote, conflict committed, controller couldn't pick up the task
until a human resolved the markers.

Reverting dev to go-skeleton (its original isolation target) so dev and
prod no longer overlap. Prod keeps the wildcard intentionally — operator's
choice; reviewed and accepted.
Copy link
Copy Markdown

@ben-s-pull-request-reviewer ben-s-pull-request-reviewer Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

smoke test from Ben's Pull Request Reviewer App — verifying REST /reviews visibility post-migration from pr-review-of-ben PAT

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant