stainless-code · SutuSebastian · Jun 11, 2026 · Jun 11, 2026 · Jun 11, 2026 · Jun 11, 2026
diff --git a/.agents/skills/harden-pr/LEDGER.md b/.agents/skills/harden-pr/LEDGER.md
@@ -0,0 +1,23 @@
+# Harden-pr ledger
+
+Single durable backlog for [`harden-pr`](./SKILL.md). Parent reads **§ Rejections** at vet step; **§ Deferred** on cap and on `/harden-pr reconcile`.
+
+## Rejections
+
+By-design or false-positive findings — do not re-raise.
+
+```markdown
+- **[category]** `file:line` — label: reason
+```
+
+<!-- Example:
+- **[security]** `src/cli/proxy.ts:42` — https_proxy env: by-design — standard CLI proxy convention.
+-->
+
+## Deferred
+
+Capped or out-of-scope-for-now — reconcile re-vets; remove lines when fixed.
+
+```markdown
+- **[severity]** `file:line` — finding (deferred: out of scope | cap | blocked)
+```
diff --git a/.agents/skills/harden-pr/SKILL.md b/.agents/skills/harden-pr/SKILL.md
@@ -4,8 +4,8 @@ description: >-
   Bring a branch to pristine, maximum production readiness without changing PR intent —
   spawn parallel Task subagents (never inline review), fix in-bounds findings, loop autonomously until
   clean or pass cap, then report once. Use after a tracer-bullet commit (lite), before PR
-  is done (full), or on "harden", "harden-pr", "pristine", "review until clean",
-  "production-ready pass". Invoking this skill authorizes one harden commit at cycle end.
+  is done (full), on "harden", "harden-pr", "pristine", "review until clean",
+  "production-ready pass", or "harden-pr reconcile". Invoking this skill authorizes one harden commit at cycle end.
   NEVER stop mid-loop to ask about commits, babysit, or the next pass. NEVER redesign the
   feature or change observable runtime behavior.
 ---
@@ -16,9 +16,9 @@ description: >-
 
 Local loop: parallel reviewer subagents → merge findings → fix in-bounds → re-verify → repeat until clean or cap → **one final report**.
 
-**Invoking this skill (`/harden-pr`, `harden-pr lite`, `harden-pr full`) is a run-to-completion command.** The agent executes the full loop before ending the turn.
+**Invoking this skill (`/harden-pr`, `harden-pr lite`, `harden-pr full`, `harden-pr quick`, `harden-pr reconcile`) is a run-to-completion command.** The agent executes the full loop before ending the turn.
 
-Sister skills: [`audit-pr-architecture`](../audit-pr-architecture/SKILL.md) (extended structural reviewer). Mention **`babysit`** only in the final report (full mode) — never mid-loop.
+Sister skills: [`audit-pr-architecture`](../audit-pr-architecture/SKILL.md) (extended structural reviewer). **Ledger:** [LEDGER.md](./LEDGER.md) (rejections + deferred — one file). Mention **`babysit`** only in the final report (full mode) — never mid-loop.
 
 ## Run-to-completion (read first)
 
@@ -42,12 +42,14 @@ Otherwise: resolve anchor → run all passes → fix → verify → next pass
 
 ## Modes
 
-| Mode     | When                                                                                                                                         | Scope                   | Max passes |
-| -------- | -------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------- | ---------- |
-| **Lite** | After each tracer-bullet slice commit ([`tracer-bullets`](../../rules/tracer-bullets.md) cadence)                                            | Files in the slice diff | 2          |
-| **Full** | User intent ("full harden", "PR done", "production-ready pass") **or** offer when an in-flight `docs/plans/<topic>.md` checklist is complete | `origin/main...HEAD`    | 3          |
+| Mode          | When                                                                                                                                         | Scope                     | Max passes |
+| ------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------- | ---------- |
+| **Lite**      | After each tracer-bullet slice commit ([`tracer-bullets`](../../rules/tracer-bullets.md) cadence)                                            | Files in the slice diff   | 2          |
+| **Quick**     | Cheap uncertainty pass ("quick harden")                                                                                                      | Last commit or slice diff | 1          |
+| **Full**      | User intent ("full harden", "PR done", "production-ready pass") **or** offer when an in-flight `docs/plans/<topic>.md` checklist is complete | `origin/main...HEAD`      | 3          |
+| **Reconcile** | `/harden-pr reconcile` — process [LEDGER.md § Deferred](./LEDGER.md#deferred), then run **full** if branch still open                        | `origin/main...HEAD`      | 3          |
 
-Default to **lite** when invoked immediately after a slice commit. Default to **full** when the user signals branch completion.
+Default to **lite** when invoked immediately after a slice commit. Default to **full** when the user signals branch completion. **Quick** = core 3 reviewers only (no extended roster).
 
 ## Production bar (what "pristine" means)
 
@@ -76,6 +78,27 @@ Resolve in order; stop at the first hit:
 
 Reviewers treat the anchor as contract. Findings that would violate it → **report, do not apply**.
 
+Record `HEAD` at loop start (`git rev-parse HEAD`) in the final report. If `HEAD` changes mid-loop from unrelated work, re-resolve the anchor before the next pass.
+
+## Vet step (parent, after merge — before fix)
+
+Subagents over-report. After merge + dedupe:
+
+1. Read [LEDGER.md § Rejections](./LEDGER.md#rejections) — drop findings matching a rejection entry.
+2. For each remaining finding: **re-read** `file` at `line` (or the cited region). Drop if the claim is false or by-design.
+3. New by-design drops → append one bullet to **§ Rejections** in [LEDGER.md](./LEDGER.md).
+4. Sort survivors by leverage: `severity` first, then `confidence` desc, then `effort` asc (`S` before `L`).
+
+**Anti-pattern:** applying a fix without re-reading the cited location.
+
+## Reconcile mode
+
+Run-to-completion like other modes:
+
+1. Read [LEDGER.md § Deferred](./LEDGER.md#deferred). Re-vet each row (same vet step). Fix in-bounds items; remove fixed lines.
+2. Run **full** harden on `origin/main...HEAD` (same loop as full mode).
+3. On cap: append still-deferred items to **§ Deferred** in [LEDGER.md](./LEDGER.md). Report what was reconciled vs still open.
+
 ## In-bounds vs out-of-bounds
 
 **Fix:** bugs, missing tests, docs/changeset drift, lint/type/format, error-handling gaps, edge cases, **behavior-preserving refactors in touched files**, in-scope nits (naming, comment hygiene, cheap lint fixes).
@@ -106,10 +129,16 @@ Each reviewer returns **only** a JSON array (no prose wrapper). Parent parses ar
   "finding": "One-sentence claim about a gap vs production bar",
   "severity": "blocker | major | minor | nit | info",
   "file": "repo-relative/path or \"multiple\"",
-  "fixable_in_bounds": true
+  "line": 42,
+  "confidence": "high | medium | low",
+  "effort": "S | M | L",
+  "fixable_in_bounds": true,
+  "production_bar": "Tests | Docs | Structure | …"
 }
 ```
 
+Use `line: null` when the gap is file-level (e.g. missing test file).
+
 **Severity → action**
 
 | Severity                   | Parent action                                                       |
@@ -124,8 +153,9 @@ Each reviewer returns **only** a JSON array (no prose wrapper). Parent parses ar
 1. Concatenate all reviewer arrays.
 2. Drop `info` unless it blocks ship shape.
 3. Dedupe: same `file` + same root cause → keep highest severity, merge `finding` text.
-4. Sort actionable: `blocker` → `major` → `minor` → `nit`.
-5. If merged list is empty → pass succeeds; skip fix phase.
+4. Sort actionable: `blocker` → `major` → `minor` → `nit`; within tier → `confidence` desc → `effort` asc.
+5. **Vet** (§ Vet step).
+6. If vetted list is empty → pass succeeds; skip fix phase.
 
 **Example merged queue (pass 1)**
 
@@ -135,19 +165,31 @@ Each reviewer returns **only** a JSON array (no prose wrapper). Parent parses ar
     "finding": "CLI --help documents summary counts but not per-row attribution on --base JSON rows.",
     "severity": "major",
     "file": "src/cli/cmd-audit.ts",
-    "fixable_in_bounds": true
+    "line": 120,
+    "confidence": "high",
+    "effort": "S",
+    "fixable_in_bounds": true,
+    "production_bar": "Docs"
   },
   {
     "finding": "Skill shard leaks requiredColumns when describing attribution.",
     "severity": "major",
     "file": "templates/agent-content/skill/10-recipes-context.md",
-    "fixable_in_bounds": true
+    "line": null,
+    "confidence": "high",
+    "effort": "M",
+    "fixable_in_bounds": true,
+    "production_bar": "Surfaces"
   },
   {
     "finding": "No e2e test for attribution: inherited on deprecated delta.",
     "severity": "nit",
     "file": "src/application/audit-worktree.test.ts",
-    "fixable_in_bounds": true
+    "line": null,
+    "confidence": "medium",
+    "effort": "S",
+    "fixable_in_bounds": true,
+    "production_bar": "Tests"
   }
 ]
 ```
@@ -170,7 +212,7 @@ You are the **{ROLE}** reviewer for `/harden-pr` on `{REPO}`.
 **Task:** {EXTRA}
 
 **Return ONLY** a JSON array of findings:
-[{ "finding": "...", "severity": "blocker|major|minor|nit|info", "file": "...", "fixable_in_bounds": true|false }]
+[{ "finding": "...", "severity": "blocker|major|minor|nit|info", "file": "...", "line": N|null, "confidence": "high|medium|low", "effort": "S|M|L", "fixable_in_bounds": true|false, "production_bar": "..." }]
 If clean: []
 
 Readonly — do not edit files.
@@ -216,23 +258,25 @@ Re-derive layer globs from `docs/architecture.md` § Layering — don't hardcode
 Execute **without pausing for user input** until exit condition:
 
 ```
-resolve intent anchor
+resolve intent anchor; stamp HEAD
 pass = 1
 loop:
   Task-batch all applicable reviewers (parallel, readonly)
   parent: merge + dedupe JSON findings (§ Finding schema)
+  parent: vet findings (§ Vet step)
   if none actionable → goto done
   fix in-bounds (pass 1: all; passes 2+: blockers first, then in-scope nits)
-  run project checks on touched files
+  per fix: run verification gate from verify-after-each-step on touched files
   if clean and no new findings → goto done
   if pass >= max_passes → goto capped
   pass += 1
   goto loop
 capped:
+  append deferred rows to LEDGER.md § Deferred
   emit deferred-nits list (each nit must cite plan Out of scope or cross-PR blocker — not "optional")
 done:
   if uncommitted fixes → git commit -m "harden: …"
-  emit final report (include babysit one-liner if full mode)
+  emit final report (include babysit one-liner if full mode; include anchor HEAD stamp)
 ```
 
 **Pass cap behavior:** after cap, stop auto-fixing; list deferred nits. Do not block the next tracer slice.
@@ -243,9 +287,11 @@ Skill invocation **is** the commit authorization. After the loop: if fixes exist
 
 ## Quick invoke
 
-| Intent      | Say                                                    |
-| ----------- | ------------------------------------------------------ |
-| Post-slice  | `/harden-pr lite` or `/harden-pr` after a slice commit |
-| Branch done | `/harden-pr full` or "production-ready pass"           |
+| Intent           | Say                                                    |
+| ---------------- | ------------------------------------------------------ |
+| Post-slice       | `/harden-pr lite` or `/harden-pr` after a slice commit |
+| Cheap pass       | `/harden-pr quick`                                     |
+| Branch done      | `/harden-pr full` or "production-ready pass"           |
+| Deferred backlog | `/harden-pr reconcile`                                 |
 
 Replaces the old copy-paste: _"spawn subagents → fix → loop until clean"_ — this skill **is** that loop.
diff --git a/.changeset/read-surface-hardening.md b/.changeset/read-surface-hardening.md
@@ -0,0 +1,5 @@
+---
+"@stainless-code/codemap": patch
+---
+
+Harden read surfaces: `codemap query --format …` blocks index mutations via the same read-only guard as `--json`; `codemap serve` requires `--token` when `--host` is not loopback (any `127.0.0.0/8` address counts as loopback, so `--token` stays optional on `127.0.0.2` and similar); `codemap validate` (and MCP/HTTP `validate`) can return `rejected` rows with optional `reason` (`path escapes project root` | `path escapes via symlink` | `path resolves outside project root`) — output `path` keys are always project-relative POSIX paths.
diff --git a/README.md b/README.md
@@ -50,7 +50,7 @@ codemap dead-code --json                                     # outcome alias →
 codemap query --json --recipe fan-out                        # recipe SQL by id (alias: -r)
 codemap query --json "SELECT name, file_path FROM symbols WHERE name = 'foo'"  # ad-hoc SQL
 codemap --files src/a.ts src/b.tsx                           # targeted re-index after edits
-codemap validate --json                                      # detect stale / missing / unindexed files
+codemap validate --json                                      # detect stale / missing / unindexed / rejected files
 codemap context --compact --for "refactor auth"              # JSON envelope + intent-matched recipes
 codemap ingest-coverage coverage/coverage-final.json --json  # Istanbul / LCOV (auto-detected) → coverage table; joins with symbols
 NODE_V8_COVERAGE=.cov bun test && codemap ingest-coverage .cov --runtime --json  # V8 protocol (per-process dumps); local-only
@@ -162,9 +162,9 @@ codemap query --format diff-json 'SELECT "README.md" AS file_path, 1 AS line_sta
 codemap --with-fts --full
 codemap query --recipe text-in-deprecated-functions    # demonstrates FTS5 ⨯ symbols ⨯ coverage JOIN
 # HTTP API — same tool taxonomy as `codemap mcp`, exposed over POST /tool/{name} for
-# non-MCP consumers (CI scripts, curl, IDE plugins). Loopback default; optional --token.
+# non-MCP consumers (CI scripts, curl, IDE plugins). Loopback default; --token required on non-loopback.
 TOKEN=$(openssl rand -hex 32)
-codemap serve --port 7878 --token "$TOKEN" &
+codemap serve --port 7878 --token "$TOKEN" &                 # --token required when --host is not loopback
 curl -s -X POST http://127.0.0.1:7878/tool/query \
   -H 'Content-Type: application/json' \
   -H "Authorization: Bearer $TOKEN" \