23 changes: 23 additions & 0 deletions .agents/skills/harden-pr/LEDGER.md
@@ -0,0 +1,23 @@
# Harden-pr ledger

Single durable backlog for [`harden-pr`](./SKILL.md). The parent reads **§ Rejections** at the vet step, appends to **§ Deferred** on cap, and re-vets **§ Deferred** on `/harden-pr reconcile`.

## Rejections

By-design or false-positive findings — do not re-raise.

```markdown
- **[category]** `file:line` — label: reason
```

<!-- Example:
- **[security]** `src/cli/proxy.ts:42` — https_proxy env: by-design — standard CLI proxy convention.
-->
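
As a rough illustration of how a parent agent might read these entries, the sketch below parses one rejection bullet into its fields. The `parseRejection` name, the `Rejection` shape, and the regular expression are assumptions made for illustration; the skill itself only fixes the bullet format.

```typescript
// Hypothetical helper: parse one "§ Rejections" bullet of the form
//   - **[category]** `file:line` <em dash> label: reason
// Names and regex are illustrative assumptions, not part of harden-pr.
interface Rejection {
  category: string;
  file: string;
  line: number;
  label: string;
  reason: string;
}

// \u2014 is the em dash separator used in the ledger format.
const REJECTION_RE =
  /^- \*\*\[([^\]]+)\]\*\* `([^`]+):(\d+)` \u2014 ([^:]+): (.+)$/;

function parseRejection(bullet: string): Rejection | null {
  const m = REJECTION_RE.exec(bullet.trim());
  if (m === null) return null; // not a ledger bullet; the caller decides what to do
  return {
    category: m[1],
    file: m[2],
    line: Number(m[3]),
    label: m[4].trim(),
    reason: m[5].trim(),
  };
}
```

A parsed entry could then be compared against a finding's `file` and `line` during the vet step; how loosely to match (exact line versus nearby region) is left open here.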

## Deferred

Capped or out-of-scope-for-now — reconcile re-vets; remove lines when fixed.

```markdown
- **[severity]** `file:line` — finding (deferred: out of scope | cap | blocked)
```
92 changes: 69 additions & 23 deletions .agents/skills/harden-pr/SKILL.md
@@ -4,8 +4,8 @@ description: >-
Bring a branch to pristine, maximum production readiness without changing PR intent —
spawn parallel Task subagents (never inline review), fix in-bounds findings, loop autonomously until
clean or pass cap, then report once. Use after a tracer-bullet commit (lite), before PR
is done (full), or on "harden", "harden-pr", "pristine", "review until clean",
"production-ready pass". Invoking this skill authorizes one harden commit at cycle end.
is done (full), on "harden", "harden-pr", "pristine", "review until clean",
"production-ready pass", or "harden-pr reconcile". Invoking this skill authorizes one harden commit at cycle end.
NEVER stop mid-loop to ask about commits, babysit, or the next pass. NEVER redesign the
feature or change observable runtime behavior.
---
@@ -16,9 +16,9 @@ description: >-

Local loop: parallel reviewer subagents → merge findings → fix in-bounds → re-verify → repeat until clean or cap → **one final report**.

**Invoking this skill (`/harden-pr`, `harden-pr lite`, `harden-pr full`) is a run-to-completion command.** The agent executes the full loop before ending the turn.
**Invoking this skill (`/harden-pr`, `harden-pr lite`, `harden-pr full`, `harden-pr quick`, `harden-pr reconcile`) is a run-to-completion command.** The agent executes the full loop before ending the turn.

Sister skills: [`audit-pr-architecture`](../audit-pr-architecture/SKILL.md) (extended structural reviewer). Mention **`babysit`** only in the final report (full mode) — never mid-loop.
Sister skills: [`audit-pr-architecture`](../audit-pr-architecture/SKILL.md) (extended structural reviewer). **Ledger:** [LEDGER.md](./LEDGER.md) (rejections + deferred — one file). Mention **`babysit`** only in the final report (full mode) — never mid-loop.

## Run-to-completion (read first)

@@ -42,12 +42,14 @@ Otherwise: resolve anchor → run all passes → fix → verify → next pass

## Modes

| Mode | When | Scope | Max passes |
| -------- | -------------------------------------------------------------------------------------------------------------------------------------------- | ----------------------- | ---------- |
| **Lite** | After each tracer-bullet slice commit ([`tracer-bullets`](../../rules/tracer-bullets.md) cadence) | Files in the slice diff | 2 |
| **Full** | User intent ("full harden", "PR done", "production-ready pass") **or** offer when an in-flight `docs/plans/<topic>.md` checklist is complete | `origin/main...HEAD` | 3 |
| Mode | When | Scope | Max passes |
| ------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------- | ---------- |
| **Lite** | After each tracer-bullet slice commit ([`tracer-bullets`](../../rules/tracer-bullets.md) cadence) | Files in the slice diff | 2 |
| **Quick** | Cheap uncertainty pass ("quick harden") | Last commit or slice diff | 1 |
| **Full** | User intent ("full harden", "PR done", "production-ready pass") **or** offer when an in-flight `docs/plans/<topic>.md` checklist is complete | `origin/main...HEAD` | 3 |
| **Reconcile** | `/harden-pr reconcile` — process [LEDGER.md § Deferred](./LEDGER.md#deferred), then run **full** if branch still open | `origin/main...HEAD` | 3 |

Default to **lite** when invoked immediately after a slice commit. Default to **full** when the user signals branch completion.
Default to **lite** when invoked immediately after a slice commit. Default to **full** when the user signals branch completion. **Quick** = core 3 reviewers only (no extended roster).

## Production bar (what "pristine" means)

@@ -76,6 +78,27 @@ Resolve in order; stop at the first hit:

Reviewers treat the anchor as contract. Findings that would violate it → **report, do not apply**.

Record `HEAD` at loop start (`git rev-parse HEAD`) and include it in the final report. If `HEAD` changes mid-loop because of unrelated work, re-resolve the anchor before the next pass.

## Vet step (parent, after merge — before fix)

Subagents over-report. After merge + dedupe:

1. Read [LEDGER.md § Rejections](./LEDGER.md#rejections) — drop findings matching a rejection entry.
2. For each remaining finding: **re-read** `file` at `line` (or the cited region). Drop if the claim is false or by-design.
3. New by-design drops → append one bullet to **§ Rejections** in [LEDGER.md](./LEDGER.md).
4. Sort survivors by leverage: `severity` first, then `confidence` desc, then `effort` asc (`S` before `L`).

**Anti-pattern:** applying a fix without re-reading the cited location.
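
The ordering in step 4 can be sketched as a comparator. This is a minimal illustration, assuming findings carry the `severity`, `confidence`, and `effort` fields used by this skill; the rank tables and the `sortByLeverage` name are hypothetical.

```typescript
// Illustrative sketch of the vet-step ordering; type and function names are assumptions.
type Severity = "blocker" | "major" | "minor" | "nit" | "info";
type Confidence = "high" | "medium" | "low";
type Effort = "S" | "M" | "L";

interface Finding {
  finding: string;
  severity: Severity;
  confidence: Confidence;
  effort: Effort;
}

// Lower rank sorts first.
const SEVERITY_RANK: Record<Severity, number> = { blocker: 0, major: 1, minor: 2, nit: 3, info: 4 };
const CONFIDENCE_RANK: Record<Confidence, number> = { high: 0, medium: 1, low: 2 };
const EFFORT_RANK: Record<Effort, number> = { S: 0, M: 1, L: 2 };

// Severity first, then confidence descending, then effort ascending (S before L).
function byLeverage(a: Finding, b: Finding): number {
  return (
    SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
    CONFIDENCE_RANK[a.confidence] - CONFIDENCE_RANK[b.confidence] ||
    EFFORT_RANK[a.effort] - EFFORT_RANK[b.effort]
  );
}

// Returns a sorted copy so the merged queue is not mutated.
function sortByLeverage(findings: Finding[]): Finding[] {
  return [...findings].sort(byLeverage);
}
```

Sorting a copy keeps the original merged queue available for the final report, which is a design preference of this sketch rather than a requirement of the skill.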

## Reconcile mode

Run-to-completion like other modes:

1. Read [LEDGER.md § Deferred](./LEDGER.md#deferred). Re-vet each row (same vet step). Fix in-bounds items; remove fixed lines.
2. Run **full** harden on `origin/main...HEAD` (same loop as full mode).
3. On cap: append still-deferred items to **§ Deferred** in [LEDGER.md](./LEDGER.md). Report what was reconciled vs still open.

## In-bounds vs out-of-bounds

**Fix:** bugs, missing tests, docs/changeset drift, lint/type/format, error-handling gaps, edge cases, **behavior-preserving refactors in touched files**, in-scope nits (naming, comment hygiene, cheap lint fixes).
@@ -106,10 +129,16 @@ Each reviewer returns **only** a JSON array (no prose wrapper). Parent parses ar
"finding": "One-sentence claim about a gap vs production bar",
"severity": "blocker | major | minor | nit | info",
"file": "repo-relative/path or \"multiple\"",
"fixable_in_bounds": true
"line": 42,
"confidence": "high | medium | low",
"effort": "S | M | L",
"fixable_in_bounds": true,
"production_bar": "Tests | Docs | Structure | …"
}
```

Use `line: null` when the gap is file-level (e.g. missing test file).
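
For parents implemented in TypeScript, the schema above could be mirrored as a type with a runtime guard. This is a sketch under assumptions: the `isFinding` and `parseReviewerOutput` names are hypothetical, `production_bar` is typed as an open string because the source elides its full value list, and silently dropping malformed entries is a choice made here, not a rule from the skill.

```typescript
// Sketch of the finding schema as a TypeScript type plus a runtime guard.
interface Finding {
  finding: string;
  severity: "blocker" | "major" | "minor" | "nit" | "info";
  file: string; // repo-relative path or "multiple"
  line: number | null; // null when the gap is file-level
  confidence: "high" | "medium" | "low";
  effort: "S" | "M" | "L";
  fixable_in_bounds: boolean;
  production_bar: string; // value list is elided in the source, so left open
}

const SEVERITIES = ["blocker", "major", "minor", "nit", "info"];
const CONFIDENCES = ["high", "medium", "low"];
const EFFORTS = ["S", "M", "L"];

// True when x is a string drawn from the allowed list.
function oneOf(x: unknown, allowed: string[]): boolean {
  return typeof x === "string" && allowed.includes(x);
}

function isFinding(value: unknown): value is Finding {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.finding === "string" &&
    oneOf(v.severity, SEVERITIES) &&
    typeof v.file === "string" &&
    (v.line === null || Number.isInteger(v.line)) &&
    oneOf(v.confidence, CONFIDENCES) &&
    oneOf(v.effort, EFFORTS) &&
    typeof v.fixable_in_bounds === "boolean" &&
    typeof v.production_bar === "string"
  );
}

// Parse one reviewer's raw output; throws when the payload is not a JSON array.
// Assumption: entries that fail the guard are dropped rather than repaired.
function parseReviewerOutput(raw: string): Finding[] {
  const parsed: unknown = JSON.parse(raw);
  if (!Array.isArray(parsed)) throw new Error("reviewer output is not a JSON array");
  return parsed.filter(isFinding);
}
```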

**Severity → action**

| Severity | Parent action |
@@ -124,8 +153,9 @@ Each reviewer returns **only** a JSON array (no prose wrapper). Parent parses ar
1. Concatenate all reviewer arrays.
2. Drop `info` unless it blocks ship shape.
3. Dedupe: same `file` + same root cause → keep highest severity, merge `finding` text.
4. Sort actionable: `blocker` → `major` → `minor` → `nit`.
5. If merged list is empty → pass succeeds; skip fix phase.
4. Sort actionable: `blocker` → `major` → `minor` → `nit`; within tier → `confidence` desc → `effort` asc.
5. **Vet** (§ Vet step).
6. If vetted list is empty → pass succeeds; skip fix phase.
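
Steps 1 and 3 above (concatenate, then dedupe) might look like the following. This is only a sketch: the skill leaves "same root cause" to the parent's judgment, so the grouping key is passed in as a hypothetical `rootCause` function, and joining finding text with `" / "` is an arbitrary choice.

```typescript
// Sketch of merge steps 1 and 3. Assumption: "same root cause" is decided by a
// caller-supplied key function, since the skill does not define it mechanically.
type Severity = "blocker" | "major" | "minor" | "nit" | "info";

interface Finding {
  finding: string;
  severity: Severity;
  file: string;
}

// Lower rank means higher severity.
const RANK: Record<Severity, number> = { blocker: 0, major: 1, minor: 2, nit: 3, info: 4 };

function mergeFindings(
  reviewerArrays: Finding[][],
  rootCause: (f: Finding) => string, // hypothetical grouping key
): Finding[] {
  const merged = new Map<string, Finding>();
  for (const findings of reviewerArrays) {
    for (const f of findings) {
      const key = JSON.stringify([f.file, rootCause(f)]);
      const prev = merged.get(key);
      if (prev === undefined) {
        merged.set(key, { ...f });
        continue;
      }
      // Same file and root cause: keep the highest severity, merge the text.
      const severity = RANK[f.severity] < RANK[prev.severity] ? f.severity : prev.severity;
      const text = prev.finding.includes(f.finding)
        ? prev.finding
        : prev.finding + " / " + f.finding;
      merged.set(key, { ...prev, severity, finding: text });
    }
  }
  return Array.from(merged.values());
}
```

The result still needs the sort in step 4 and the vet in step 5; this sketch stops at deduplication.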

**Example merged queue (pass 1)**

@@ -135,19 +165,31 @@ Each reviewer returns **only** a JSON array (no prose wrapper). Parent parses ar
"finding": "CLI --help documents summary counts but not per-row attribution on --base JSON rows.",
"severity": "major",
"file": "src/cli/cmd-audit.ts",
"fixable_in_bounds": true
"line": 120,
"confidence": "high",
"effort": "S",
"fixable_in_bounds": true,
"production_bar": "Docs"
},
{
"finding": "Skill shard leaks requiredColumns when describing attribution.",
"severity": "major",
"file": "templates/agent-content/skill/10-recipes-context.md",
"fixable_in_bounds": true
"line": null,
"confidence": "high",
"effort": "M",
"fixable_in_bounds": true,
"production_bar": "Surfaces"
},
{
"finding": "No e2e test for attribution: inherited on deprecated delta.",
"severity": "nit",
"file": "src/application/audit-worktree.test.ts",
"fixable_in_bounds": true
"line": null,
"confidence": "medium",
"effort": "S",
"fixable_in_bounds": true,
"production_bar": "Tests"
}
]
```
@@ -170,7 +212,7 @@ You are the **{ROLE}** reviewer for `/harden-pr` on `{REPO}`.
**Task:** {EXTRA}

**Return ONLY** a JSON array of findings:
[{ "finding": "...", "severity": "blocker|major|minor|nit|info", "file": "...", "fixable_in_bounds": true|false }]
[{ "finding": "...", "severity": "blocker|major|minor|nit|info", "file": "...", "line": N|null, "confidence": "high|medium|low", "effort": "S|M|L", "fixable_in_bounds": true|false, "production_bar": "..." }]
If clean: []

Readonly — do not edit files.
@@ -216,23 +258,25 @@ Re-derive layer globs from `docs/architecture.md` § Layering — don't hardcode
Execute **without pausing for user input** until exit condition:

```
resolve intent anchor
resolve intent anchor; stamp HEAD
pass = 1
loop:
Task-batch all applicable reviewers (parallel, readonly)
parent: merge + dedupe JSON findings (§ Finding schema)
parent: vet findings (§ Vet step)
if none actionable → goto done
fix in-bounds (pass 1: all; passes 2+: blockers first, then in-scope nits)
run project checks on touched files
per fix: run verification gate from verify-after-each-step on touched files
if clean and no new findings → goto done
if pass >= max_passes → goto capped
pass += 1
goto loop
capped:
append deferred rows to LEDGER.md § Deferred
emit deferred-nits list (each nit must cite plan Out of scope or cross-PR blocker — not "optional")
done:
if uncommitted fixes → git commit -m "harden: …"
emit final report (include babysit one-liner if full mode)
emit final report (include babysit one-liner if full mode; include anchor HEAD stamp)
```

**Pass cap behavior:** after cap, stop auto-fixing; list deferred nits. Do not block the next tracer slice.
@@ -243,9 +287,11 @@ Skill invocation **is** the commit authorization. After the loop: if fixes exist

## Quick invoke

| Intent | Say |
| ----------- | ------------------------------------------------------ |
| Post-slice | `/harden-pr lite` or `/harden-pr` after a slice commit |
| Branch done | `/harden-pr full` or "production-ready pass" |
| Intent | Say |
| ---------------- | ------------------------------------------------------ |
| Post-slice | `/harden-pr lite` or `/harden-pr` after a slice commit |
| Cheap pass | `/harden-pr quick` |
| Branch done | `/harden-pr full` or "production-ready pass" |
| Deferred backlog | `/harden-pr reconcile` |

Replaces the old copy-paste: _"spawn subagents → fix → loop until clean"_ — this skill **is** that loop.
5 changes: 5 additions & 0 deletions .changeset/read-surface-hardening.md
@@ -0,0 +1,5 @@
---
"@stainless-code/codemap": patch
---

Harden read surfaces: `codemap query --format …` blocks index mutations via the same read-only guard as `--json`; `codemap serve` requires `--token` when `--host` is not loopback (any `127.0.0.0/8` address counts as loopback, so `--token` stays optional on `127.0.0.2` and similar); `codemap validate` (and MCP/HTTP `validate`) can return `rejected` rows with optional `reason` (`path escapes project root` | `path escapes via symlink` | `path resolves outside project root`) — output `path` keys are always project-relative POSIX paths.
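
To make the loopback rule concrete, here is one way the "any `127.0.0.0/8` address counts as loopback" check could be expressed. This is not codemap's implementation: the function names are assumptions, and the sketch handles only dotted-quad IPv4 literals, conservatively treating everything else (hostnames, IPv6) as non-loopback.

```typescript
// Illustrative only: a 127.0.0.0/8 check for dotted-quad IPv4 literals.
// Function names are hypothetical; hostnames and IPv6 are out of scope here.
function isIpv4Loopback(host: string): boolean {
  const parts = host.split(".");
  if (parts.length !== 4) return false;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  if (octets.some((o) => Number.isNaN(o) || o > 255)) return false;
  return octets[0] === 127; // /8 prefix: first octet fixed, remaining 24 bits free
}

// In this sketch a token is required exactly when the bind host is not loopback.
function tokenRequired(host: string): boolean {
  return !isIpv4Loopback(host);
}
```

Under these assumptions `tokenRequired("127.0.0.2")` is false and `tokenRequired("0.0.0.0")` is true, matching the behavior the changeset describes for those two cases.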
6 changes: 3 additions & 3 deletions README.md
@@ -50,7 +50,7 @@ codemap dead-code --json # outcome alias →
codemap query --json --recipe fan-out # recipe SQL by id (alias: -r)
codemap query --json "SELECT name, file_path FROM symbols WHERE name = 'foo'" # ad-hoc SQL
codemap --files src/a.ts src/b.tsx # targeted re-index after edits
codemap validate --json # detect stale / missing / unindexed files
codemap validate --json # detect stale / missing / unindexed / rejected files
codemap context --compact --for "refactor auth" # JSON envelope + intent-matched recipes
codemap ingest-coverage coverage/coverage-final.json --json # Istanbul / LCOV (auto-detected) → coverage table; joins with symbols
NODE_V8_COVERAGE=.cov bun test && codemap ingest-coverage .cov --runtime --json # V8 protocol (per-process dumps); local-only
@@ -162,9 +162,9 @@ codemap query --format diff-json 'SELECT "README.md" AS file_path, 1 AS line_sta
codemap --with-fts --full
codemap query --recipe text-in-deprecated-functions # demonstrates FTS5 ⨯ symbols ⨯ coverage JOIN
# HTTP API — same tool taxonomy as `codemap mcp`, exposed over POST /tool/{name} for
# non-MCP consumers (CI scripts, curl, IDE plugins). Loopback default; optional --token.
# non-MCP consumers (CI scripts, curl, IDE plugins). Loopback default; --token required on non-loopback.
TOKEN=$(openssl rand -hex 32)
codemap serve --port 7878 --token "$TOKEN" &
codemap serve --port 7878 --token "$TOKEN" & # --token required when --host is not loopback
curl -s -X POST http://127.0.0.1:7878/tool/query \
-H 'Content-Type: application/json' \
-H "Authorization: Bearer $TOKEN" \