feat(review): custom review profiles (backend) by backnotprop · Pull Request #913 · backnotprop/plannotator

backnotprop · 2026-06-13T21:25:11Z

What

Named, reusable review profiles — "Security", "Performance", "API Contract" — that run through the existing Claude/Codex background review pipeline. A developer drops a JSON file, picks it at launch, and findings come back as the same external diff annotations as today.

{ "instructions": "Focus only on security-impacting issues." }

That one-line file is a complete, valid profile.

This PR is backend only — no UI yet. It's fully drivable over HTTP, and the existing "Run Review" path is unchanged: an absent reviewProfileId resolves to builtin:default and produces a byte-identical prompt. Nothing regresses while this lands.

Authoritative spec: docs/custom-reviews.md. Current-pipeline reference map it builds on: docs/agent-jobs-code-review-handoff.md.

How it works

pick profile → resolve (builtin/user/repo) → sectioned prompt → Claude/Codex
  → parse → severity-normalize → in-diff filter → external annotations

packages/shared/review-profiles.ts (new, runtime-agnostic) — types, shape validation, filename → id inference, label inference, trivial name-clash resolution, and the sectioned prompt composer.
packages/server/review-profile-loader.ts (new) — discovery across built-in + user dir (${PLANNOTATOR_DATA_DIR}/reviews/) + repo dir (.plannotator/reviews/), skip-and-log on malformed files.
GET /api/agents/review-profiles — discovery endpoint (Bun + Pi).
reviewProfileId on POST /api/agents/jobs — parsed and forwarded; rejects unknown fields; no inline prompt text accepted. reviewProfileId/reviewProfileLabel added to AgentJobInfo.
Launch-time resolution at the existing launch-state snapshot boundary; repo profiles scoped to one unambiguous local repo (absent in remote-only PR / ambiguous workspace mode).
Severity normalization — Codex priority 0-3 → the shared important | nit | pre_existing scale, in one place, so badges read the same regardless of engine.
In-diff filter + completion semantics — drop findings not in the reviewed patch; fail the job if all findings are out-of-diff or output is missing/unparseable (the Code Tour precedent). Calm one-liner: 8 findings · 2 skipped (not in the reviewed diff).
Profile label as annotation metadata — author stays the engine name (Claude Code / Codex); the profile rides alongside.
Bun/Pi parity maintained via vendor.sh.

Scope discipline

The spec has an explicit "Deliberately not building" section, honored here: no precedence engine, no content hashing, no diagnostics channel, no ReviewEngineAdapter interface, no anchoring module, no inline reviewPrompt, no temp-profile machinery. Two new modules, the rest small edits.

Deferred (not in this PR)

Frontend: the engine/review picker in AgentsTab.tsx, sending reviewProfileId from useAgentJobs.ts, the repo-source tag, and the profile tag on annotations.

Verification

Typecheck: clean (packages/shared + packages/server tsconfigs)
New suites: review-profiles, loader, launch, prompt composition, ingestion — all green
Regression: full server + AI suites pass, 0 fail
Vendor: vendor.sh idempotent (no Pi drift on re-run)

Built via a phased pipeline with per-phase perspective-diverse adversarial review (correctness / spec-fidelity / Bun-Pi parity) and a final whole-feature audit. Verification re-run independently after handoff.

Review notes

The two-engine prompt/severity seams (claude-review.ts, codex-review.ts, and onJobComplete in review.ts) and the Bun↔Pi mirror in serverReview.ts / apps/pi-extension/server/agent-jobs.ts are the highest-value places to scrutinize. The byte-identical-default-prompt invariant has a dedicated test but is worth a human eye.

🤖 Implemented with Claude Code via a multi-agent workflow; verified before PR.

Named, reusable review profiles ("Security", "Performance", "API Contract") that run through the existing Claude/Codex background review pipeline. Drop a JSON file in ~/.plannotator/reviews/ or .plannotator/reviews/, pick it at launch, and findings come back as the same external annotations as today. Backend only — no UI yet. The existing "Run Review" path is untouched: an absent profile resolves to builtin:default and produces a byte-identical prompt. - packages/shared/review-profiles.ts: types, shape validation, filename->id inference, label inference, trivial name-clash resolution (builtin reserved, user beats repo), sectioned prompt composer - packages/server/review-profile-loader.ts: builtin + user-dir + repo-dir discovery (repo scoped to one unambiguous local repo) - GET /api/agents/review-profiles discovery endpoint (Bun + Pi) - reviewProfileId on POST /api/agents/jobs (rejects unknown fields; no inline prompts); reviewProfileId/Label added to AgentJobInfo - launch-time profile resolution at the existing launch-state snapshot boundary - Codex priority -> shared important|nit|pre_existing severity scale - in-diff finding filter + fail-on-all-out-of-diff completion semantics - profile label carried as annotation metadata (author stays the engine name) - Bun/Pi parity via vendor.sh Spec: docs/custom-reviews.md (with a "Deliberately not building" boundary — no precedence engine, hashing, diagnostics channel, adapter interface, or anchoring module). Tests: review-profiles, loader, launch, prompt composition, ingestion. Full server + ai suites green; typecheck clean; vendor idempotent.

Cuts agreed during review — the feature comes out smaller and sharper (net -392 lines). A profile is now just a name + instructions in your global folder, run with any engine/model, and every finding it produces appears in the panel. Cuts: - Remove the in-diff finding filter and all reviewChangedFiles plumbing (Bun + Pi). Findings flow straight to addAnnotations like main — nothing is filtered, dropped, or hidden. - Remove per-repo profiles. Global ~/.plannotator/reviews/ only; drops the repo source, repoCwd gating, and the PR/fork trust surface. - Remove the engines restriction. A profile is id/label/instructions; engine and model stay independent runtime choices. Also drops the dead reviewProfile params from the Claude/Codex command builders. Fixes & cleanups from review: - Fail Claude jobs on empty/unparseable output (was silently "done"; pre-existing gap, also helps the default review). Codex already did this. - void the dangling debugLog call in the Codex transform. - Remove the unreachable builtin:default reserved-id guard (namespacing already reserves it). - Move markJobReviewFailed + REVIEW_OUTPUT_FAILED into packages/shared so Bun and Pi stop duplicating them. - Document GET /api/agents/review-profiles in the endpoint table. - Update docs/custom-reviews.md to match what ships (no per-repo, no engines, no finding-dropping). Verified: typecheck clean; 765 server+ai+shared tests pass; vendor idempotent; whole-repo remnant sweep clean.

- Guard the completion-semantics fix with tests: empty/whitespace/garbage Claude output -> parseClaudeStreamOutput null, and markJobReviewFailed flips the job to failed with a leak-free reason. (The fix had no test after the filter tests were removed.) - Make Codex completion symmetric with Claude: handle the codex job unconditionally and treat a missing output path as REVIEW_OUTPUT_FAILED instead of silently leaving the job "done". - Stop stamping reviewProfileLabel on default reviews — only custom profiles get a tag, so the future UI won't paint "Default" on every ordinary finding. - Sort review-profile filenames so a bare-name clash resolves deterministically (first-seen wins) rather than by filesystem order. - Extract the shared annotation map+ingest block in onJobComplete (was duplicated across the codex/claude branches) into one local helper. Bun + Pi mirrored. Typecheck clean; 768 server+ai+shared tests pass; vendor idempotent.

backnotprop added 3 commits June 13, 2026 14:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(review): custom review profiles (backend)#913

feat(review): custom review profiles (backend)#913
backnotprop wants to merge 3 commits into
mainfrom
feat/custom-reviews

backnotprop commented Jun 13, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

backnotprop commented Jun 13, 2026

What

How it works

Scope discipline

Deferred (not in this PR)

Verification

Review notes

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant