refactor(review): add per-agent reviewers and orchestrator (2/3)#1106
refactor(review): add per-agent reviewers and orchestrator (2/3)#1106peyton-alt wants to merge 4 commits intofeat/entire-review-v2-a1from
Conversation
There was a problem hiding this comment.
Pull request overview
This PR (A2/3) implements the concrete building blocks for the upcoming entire review internal refactor: per-agent AgentReviewer adapters, a shared ReviewerTemplate, a single-agent Run orchestrator with sink fan-out, and shared prompt/scope composition utilities (still not wired into the user-facing command until PR-A3).
Changes:
- Added
ReviewerTemplatescaffolding inreview/typesand implemented per-agent reviewers forclaude-code,codex(with chrome output filtering), andgemini-cli, each with contract-style tests. - Introduced review run consumption primitives:
Sink,RunSummary/AgentRun,DumpSink, and the single-agentRun()orchestrator with tests. - Added shared prompt composition (
ComposeReviewPrompt) and git scope detection/statistics (detectScopeBaseRef,ComputeScopeStats) with thorough test coverage.
Reviewed changes
Copilot reviewed 26 out of 26 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| cmd/entire/cli/review/types/template.go | Adds shared ReviewerTemplate implementation for spawning/parsing agent processes. |
| cmd/entire/cli/review/types/template_test.go | Unit tests for event forwarding and wait behavior of ReviewerTemplate. |
| cmd/entire/cli/review/types/sink.go | Defines Sink, RunSummary, AgentRun, and AgentStatus. |
| cmd/entire/cli/review/types/sink_test.go | Tests AgentStatus.String and sink interface compilation/zero-value behavior. |
| cmd/entire/cli/review/run.go | Implements single-agent orchestrator Run() plus status classification. |
| cmd/entire/cli/review/run_test.go | Tests orchestrator success/cancel/failure/start-error paths and sink fan-out. |
| cmd/entire/cli/review/dump.go | Adds DumpSink post-run renderer reading AgentRun.Buffer. |
| cmd/entire/cli/review/dump_test.go | Tests dump output for success/failure/cancel/mixed/empty summaries. |
| cmd/entire/cli/review/prompt.go | Adds shared prompt composition including optional scope clause. |
| cmd/entire/cli/review/prompt_test.go | Tests prompt section joining and whitespace handling. |
| cmd/entire/cli/review/scope.go | Implements base-ref detection + scope stats via go-git + git CLI helpers. |
| cmd/entire/cli/review/scope_test.go | Integration-heavy tests for scope detection and counts in temp repos. |
| cmd/entire/cli/review/env.go | Adds AppendReviewEnv helper to set ENTIRE_REVIEW_* env vars on spawned agents. |
| cmd/entire/cli/agent/claudecode/reviewer.go | Claude per-agent reviewer (argv prompt) using shared template/prompt/env. |
| cmd/entire/cli/agent/claudecode/reviewer_test.go | Contract tests for Claude reviewer argv/env parsing and scanner error handling. |
| cmd/entire/cli/agent/claudecode/testdata/canned_session.txt | Golden fixture for Claude parser tests. |
| cmd/entire/cli/agent/codex/reviewer.go | Codex per-agent reviewer (stdin prompt) with filtered stdout parsing. |
| cmd/entire/cli/agent/codex/reviewer_test.go | Contract tests for Codex reviewer argv/env and filtered narrative output. |
| cmd/entire/cli/agent/codex/output_filter.go | Codex chrome stripping (banners/hooks/exec/CSI/timestamp logs). |
| cmd/entire/cli/agent/codex/output_filter_test.go | Unit tests for codex output filtering rules and regression cases. |
| cmd/entire/cli/agent/codex/testdata/canned_exec.txt | Golden fixture exercising codex chrome filtering + narrative retention. |
| cmd/entire/cli/agent/geminicli/reviewer.go | Gemini per-agent reviewer (stdin prompt) using shared template/prompt/env. |
| cmd/entire/cli/agent/geminicli/reviewer_test.go | Contract tests for Gemini reviewer argv/env parsing and scanner error handling. |
| cmd/entire/cli/agent/geminicli/testdata/canned_session.txt | Golden fixture for Gemini parser tests. |
| cmd/entire/cli/agent/architecture_test.go | Updates forbidden-import allowlist to permit importing the new review package. |
| .golangci.yaml | Allows returning review/types.Process for ireturn linter configuration. |
013d5f4 to
c6c5aef
Compare
c4b7bce to
37f5f0f
Compare
Each agent implements types.AgentReviewer in its own package, owning
its quirks: argv/stdin shape, env propagation, output adapter.
- claude-code: argv `claude -p <prompt>`. Plain-text assistant
output emitted as AssistantText events.
- codex: argv `codex exec --skip-git-repo-check -`, prompt via
stdin. output_filter.go strips banners, exec-block tool calls,
hook-firing notices, CSI sequences, and timestamped error logs
before AssistantText emission. Anchored regexes (^exec$ /
^exec <cmd> in /<path>) avoid eating narrative containing 'exec'.
- gemini-cli: argv `gemini -p ' '`, prompt via stdin. Clean stdout,
no chrome filtering.
All three set ENTIRE_REVIEW_{SESSION,AGENT,SKILLS,PROMPT,STARTING_SHA}
on the spawned process so the lifecycle hook (CU1) adopts the session
as a review.
Per-agent contract tests cover argv shape, stdin wiring, env vars set,
deferred binary lookup (TestNoBinaryRequiredAtConstruction), and a
golden-file event-stream test feeding canned stdout fixtures through
the parser.
Codex's chrome filter is exercised by output_filter_test.go separately
from reviewer_test.go: filter logic stays inside agent/codex/, never
in shared code.
Compile-time interface check (var _ AgentReviewer = ...) ensures all
three implementations satisfy the interface defined in CU2.
Also: add review/* to architecture_test.go allowedPrefixes (per-agent
reviewer.go files import review env consts and review/types), and add
review/types.Process to ireturn allow list in .golangci.yaml.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The orchestration layer consuming AgentReviewer from CU3.
- types/sink.go: Sink interface (AgentEvent during run, RunFinished
once at end), RunSummary, AgentRun (per-agent post-run data),
AgentStatus enum. Sinks MUST NOT block.
- review/dump.go: DumpSink — post-run terminal narrative renderer.
AgentEvent is a no-op; rendering reads from AgentRun.Buffer in
RunFinished. Handles succeeded/failed/cancelled per-agent
distinctly; closes with an N-agents-done counts line.
- review/run.go: Run() — single-agent orchestrator. Forwards events
to all sinks as they arrive, classifies terminal status, calls
RunFinished on each sink with the populated summary.
classifyStatus checks ctx.Err() before waitErr so signal-killed
processes (ExitError exit code -1) are correctly classified as
Cancelled rather than Failed (regression class from PR #1018
commit 8e82e40).
AgentRun.Buffer holds the full event log per agent for post-run
rendering. A TODO(memory) flags this for revisiting if very long
reviews put pressure on memory.
CU8 generalizes Run to RunMulti for N-agent fan-out, re-using the
same Sink and RunSummary types.
c6c5aef to
ce17205
Compare
|
@BugBot review |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Comment @cursor review or bugbot run to trigger another review on this PR
Reviewed by Cursor Bugbot for commit ce17205. Configure here.
review/prompt.go's ComposeReviewPrompt assembles the agent prompt from Skills + AlwaysPrompt + PerRunPrompt + scope clause. Empty sections are skipped (no triple-newline gaps — regression class from PR #1018 commit 13415f5). The scope clause pins agents to 'commits unique to this branch vs <closest-ancestor>' to prevent the divergent-default problem where codex used origin/main...HEAD and claude used working-tree-only on the same invocation (fixed in #1018 commit b9ed9c0, structurally enforced here). review/scope.go's detectScopeBaseRef finds the closest non-self ancestor branch by tip timestamp, with fallback chain origin/HEAD → origin/main → origin/master → main → master. ComputeScopeStats and formatScopeBanner produce the 'Reviewing feat/X vs main: 3 commits, 7 files changed, 2 uncommitted' output that CU6 will print before agent launch. Consolidates CU3's placeholder composePrompt and duplicated appendReviewEnv into shared helpers in review/. Each agent reviewer (claudecode, codex, geminicli) now calls review.ComposeReviewPrompt and review.AppendReviewEnv directly. Per-agent local copies removed. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
CU3's three reviewers (claudecode, codex, geminicli) shared ~50 lines
of identical Start/Process scaffolding. Only buildCmd (argv/env) and
the stdout parser genuinely differ per agent.
types.ReviewerTemplate now owns the shared lifecycle (spawn → pipe
stdout → run parser → forward events → close on exit). Each agent's
reviewer.go is reduced to:
- New() returning a *ReviewerTemplate populated with the agent's
name, BuildCmd, and Parser.
- buildReviewCmd: argv + ENTIRE_REVIEW_* env propagation.
- parseXxxOutput: agent-specific stdout-to-Event stream (with the
scanner.Err handling from the CU3 fix loop preserved unchanged).
Per-agent file: ~140 → ~80 lines. Total: ~150 LoC saved.
Behavior identical — all CU3 tests pass without modification.
Adding a 4th launchable agent now requires only its buildCmd and
parser; the lifecycle scaffolding is no longer per-agent code.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ce17205 to
aa8be02
Compare

https://entire.io/gh/entireio/cli/trails/287
Summary
Second of three stacked PRs rebuilding `entire review`'s internals. Stacked on PR-A1 (#1105). Implements the abstractions PR-A1 declared: per-agent `AgentReviewer` for the three launchable agents, the `Sink`/`Run` orchestrator that consumes them, and the prompt + scope composition layer. Still no user-visible behaviour change — `runReview` is rewired in PR-A3.
Stack:
What's in this PR
CU3 — Per-agent reviewers (`74133788e`)
CU4 — Sink + DumpSink + single-agent orchestrator (`a7b916143`)
CU5 — Prompt composition + scope detection (`2bc20b166`)
CU5b — Extract ReviewerTemplate refactor (`013d5f4a2`)
Test plan
Smoke test (package-level contract)
The per-agent reviewer tests exercise each reviewer end-to-end with golden-file fixtures (real argv construction, real env, real parser pipeline). Plus the orchestrator tests drive `Run()` against stub reviewers under success/cancel/error paths.
```bash
git checkout feat/entire-review-v2-a2
go build -o /tmp/entire-smoke-a2 ./cmd/entire # confirms compilation
go test ./cmd/entire/cli/agent/{claudecode,codex,geminicli}/ \
./cmd/entire/cli/review/ \
./cmd/entire/cli/review/types/ -count=1
```
Expected: all 5 packages green.
Actual output (verified locally on `feat/entire-review-v2-a2` @ 013d5f4):
```
ok cmd/entire/cli/agent/claudecode 3.999s
ok cmd/entire/cli/agent/codex 0.543s
ok cmd/entire/cli/agent/geminicli 0.833s
ok cmd/entire/cli/review 1.202s
ok cmd/entire/cli/review/types 1.101s
```
✓ All package-level smoke green.
A real-agent end-to-end smoke (orchestrator + DumpSink against live claude-code) is more meaningful at PR-A3 where `entire review` actually invokes this layer; deferred to that PR's smoke procedure.
Anti-features explicitly NOT recreated
Note
Medium Risk
Introduces new review execution plumbing (process spawning, event parsing, git scope detection, and status classification), which could affect correctness of review runs and cancellation/error reporting despite being well-tested.
Overview
Adds the next layer of the
entire reviewrewrite: per-agentAgentReviewerimplementations forclaude-code,codex, andgemini-cli, including codex stdout chrome stripping and consistent parser error reporting asRunError+Finished{Success:false}.Introduces a single-agent
Run()orchestrator withSink/RunSummary/AgentRun/AgentStatusand aDumpSinkrenderer, plus sharedComposeReviewPromptand git-based scope detection (detectScopeBaseRef/ComputeScopeStats) to standardize what commits the agents should review.Also tightens the env-var handshake with
AppendReviewEnv(deduping staleENTIRE_REVIEW_*entries), updates agent import-architecture allowlists, and expands unit/golden tests across agents, orchestrator, prompt, env, and scope.Reviewed by Cursor Bugbot for commit ce17205. Configure here.