diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index 563dfcd8..a1b7be99 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -12,7 +12,7 @@
{
"name": "compound-engineering",
"description": "AI-powered development tools that get smarter with every use. Make each unit of engineering work easier than the last. Includes 29 specialized agents, 22 commands, and 19 skills.",
- "version": "2.33.0",
+ "version": "2.34.0",
"author": {
"name": "Kieran Klaassen",
"url": "https://github.com/kieranklaassen",
diff --git a/plugins/compound-engineering/.claude-plugin/plugin.json b/plugins/compound-engineering/.claude-plugin/plugin.json
index a74039ac..9b35c5a7 100644
--- a/plugins/compound-engineering/.claude-plugin/plugin.json
+++ b/plugins/compound-engineering/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
{
"name": "compound-engineering",
- "version": "2.33.0",
+ "version": "2.34.0",
"description": "AI-powered development tools. 29 agents, 22 commands, 19 skills, 1 MCP server for code review, research, design, and workflow automation.",
"author": {
"name": "Kieran Klaassen",
diff --git a/plugins/compound-engineering/CHANGELOG.md b/plugins/compound-engineering/CHANGELOG.md
index b80621c6..bce7f8d7 100644
--- a/plugins/compound-engineering/CHANGELOG.md
+++ b/plugins/compound-engineering/CHANGELOG.md
@@ -5,6 +5,59 @@ All notable changes to the compound-engineering plugin will be documented in thi
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+## [2.34.0] - 2026-02-13
+
+### Changed
+
+- **`/deepen-plan` command** — Complete rewrite with **context-managed map-reduce** architecture to prevent context overflow. Validated across multiple real-world runs.
+
+ **Architecture:**
+ - Sub-agents write full analysis JSON to `.deepen/` on disk, return only a ~200-char completion signal to parent
+ - Parent context stays under ~13k tokens regardless of agent count (vs unbounded in v1)
+  - 10-phase pipeline: Analyze → Discover → Research (batched) → Validate → Judge (parallel per-section + merge) → Judge Validation → Enhance → Quality Review → Preservation Check → Present
+
+ **Context Overflow Prevention (crash-tested):**
+ - **Batched agent launches** — Max 4 Task() agents pending simultaneously. Prevents session crash from simultaneous returns (anthropics/claude-code#11280, #8136)
+ - **200-char return cap** — Hard limit on agent return messages. All analysis lives in JSON files on disk
+ - **Task() failure recovery** — Retry once on silent infrastructure errors (`[Tool result missing due to internal error]`)
+
+ **Version Grounding:**
+ - Plan-analyzer reads lockfile → package.json → plan text (priority order) to resolve actual framework versions
+  - Prevents downstream agents from researching wrong library versions (e.g., MUI 5 when the project uses MUI 7)
+ - `version_mismatches` field flags discrepancies between plan text and actual dependencies
+
+ **Per-Section Judge Parallelization:**
+ - Replaces single monolithic judge with parallel per-section judges + merge judge
+ - Section judges run in parallel (batched max 4), each deduplicates and ranks within its section
+ - Merge judge resolves cross-section conflicts, identifies cross-section convergence
+ - Reduced judge time from ~21 min to ~8-10 min in testing
+
+ **Two-Part Output Structure:**
+ - **Decision Record** (reviewer-facing): Enhancement summary, agent consensus, research insights, strong signal markers, fast follow items, verification checklist
+ - **Implementation Spec** (developer-facing): Clean, linear implementation guidance with ready-to-copy code blocks — no `// ENHANCED:` annotations or `(Rec #X)` references
+
+ **Quality Review (CoVe Pattern):**
+  - Post-enhancement agent checks for self-contradictions, PR scope creep, defensive stacking, code completeness (undefined references), integration-test gaps, and deferred items needing bridge mitigations
+ - Runs in isolated context — does not inherit enhancer's perspective
+
+ **Enhancer Improvements:**
+ - **Resolve conditionals** — Reads codebase to determine which implementation path applies, eliminates "if X use A, if Y use B" forks
+ - **Version verification** — Checks `frameworks_with_versions` before suggesting APIs (prevents ES2023+ suggestions for ES2022 targets)
+ - **Accessibility verification** — Ensures `prefers-reduced-motion` fallbacks don't leave permanent visual artifacts
+ - **Convergence signals** — `[Strong Signal — N agents]` markers when 3+ agents independently flag same concern
+ - **`fast_follow` classification** — Fourth action bucket for items with real UX impact but out of PR scope (must be ticketed before merge)
+
+ **Other Improvements:**
+- `truncated_count` required field — Agents report how many recommendations they omitted beyond the 8-recommendation cap; the judge weights convergence accordingly
+ - `learnings-researcher` integration — Single dedicated agent replaces N per-file learning agents
+ - Pipeline checkpoint logging to `.deepen/PIPELINE_LOG.md` for diagnostics
+ - Cross-platform safe: project-relative `.deepen/`, Node.js validation (no Python3 dependency)
+ - Architectural Decision Challenge phase with `project-architecture-challenger` agent
+ - `agent-native-architecture-reviewer` with dedicated skill routing
+ - PROJECT ARCHITECTURE CONTEXT block for all review/research agents
+
+---
+
## [2.33.0] - 2026-02-12
### Added
diff --git a/plugins/compound-engineering/commands/deepen-plan.md b/plugins/compound-engineering/commands/deepen-plan.md
index a7054764..c8d3a6ed 100644
--- a/plugins/compound-engineering/commands/deepen-plan.md
+++ b/plugins/compound-engineering/commands/deepen-plan.md
@@ -4,543 +4,1188 @@ description: Enhance a plan with parallel research agents for each section to ad
argument-hint: "[path to plan file]"
---
-# Deepen Plan - Power Enhancement Mode
+# Deepen Plan (v3 — Context-Managed Map-Reduce)
+
+**Note: The current year is 2026.** Use this when searching for recent documentation and best practices.
+
+Take an existing plan (from `/workflows:plan`) and enhance each section with parallel research, skill application, and review agents — using file-based synthesis to prevent context overflow while maximizing depth.
## Introduction
-**Note: The current year is 2026.** Use this when searching for recent documentation and best practices.
+Act as a Senior Technical Research Lead with expertise in architecture, best practices, and production-ready implementation patterns.
-This command takes an existing plan (from `/workflows:plan`) and enhances each section with parallel research agents. Each major element gets its own dedicated research sub-agent to find:
-- Best practices and industry patterns
-- Performance optimizations
-- UI/UX improvements (if applicable)
-- Quality enhancements and edge cases
-- Real-world implementation examples
+## Architecture: Phased File-Based Map-Reduce
-The result is a deeply grounded, production-ready plan with concrete implementation details.
+1. **Analyze Phase** (sequential) — Parse plan into structured manifest. **Grounds versions in lockfile/package.json**, not plan text.
+2. **Discover Phase** (parent) — Find available skills, learnings, agents using Glob/Read. Match against manifest.
+3. **Research Phase** (batched parallel) — Matched agents write structured recommendations to `.deepen/`, return only a completion signal. Agents report `truncated_count` when capped.
+4. **Validate** — Verify all expected agent files exist, conform to schema (including required `truncated_count`), flag zero-tool-use hallucination risk.
+5. **Judge Phase** (parallel per-section + data prep + merge) — Per-section judges run in parallel (batched, max 4). Data prep agent (haiku) compiles all results into a single `MERGE_INPUT.json`. Merge judge reads one file and focuses on cross-section conflict/convergence reasoning.
+6. **Judge Validation** — Verify judge output references real manifest sections.
+7. **Enhance Phase** — Synthesis agent reads consolidated recommendations + original plan, writes enhanced version. **Verifies APIs exist in resolved versions before suggesting code.** Classifies items as implement/verify/fast_follow/defer. Two-part output: Decision Record + Implementation Spec.
+8. **Quality Review** — CoVe-pattern agent checks enhanced plan for self-contradictions, PR scope, defensive stacking, deferred items needing bridge mitigations.
+9. **Preservation Check** — Single-pass verification that enhanced plan contains every original section.
+10. **Present** — Parent reads enhancement summary + quality review and presents next steps.
+
+Parent context stays under ~15k tokens of agent output regardless of agent count.
+
+## Task() Failure Recovery
+
+
+If any Task() call returns an error, empty result, or `[Tool result missing due to internal error]`, retry ONCE with identical parameters before failing the phase. This is a known Claude Code infrastructure issue — the subprocess can silently fail due to timeout, OOM, or connection drop. The retry almost always succeeds. Log the failure and retry in the pipeline log.
+
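+A retry should also be visible in the pipeline log. A minimal sketch of such an entry (agent name and wording are illustrative):
+
+```bash
+# Log a Task() retry so the silent failure is diagnosable later
+echo "## Retry: security-sentinel (first call returned empty result)" >> .deepen/PIPELINE_LOG.md
+echo "- Retried at: $(date -u +%H:%M:%S), retry succeeded" >> .deepen/PIPELINE_LOG.md
+```
+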
## Plan File
#$ARGUMENTS
**If the plan path above is empty:**
-1. Check for recent plans: `ls -la docs/plans/`
-2. Ask the user: "Which plan would you like to deepen? Please provide the path (e.g., `docs/plans/2026-01-15-feat-my-feature-plan.md`)."
+1. Check for recent plans: `ls -la plans/`
+2. Ask the user: "Which plan would you like to deepen? Please provide the path."
Do not proceed until you have a valid plan file path.
-## Main Tasks
+## Checkpoint Logging
-### 1. Parse and Analyze Plan Structure
+
+After EVERY phase, write a checkpoint to `.deepen/PIPELINE_LOG.md`. The log is diagnostic; report its contents back to the user at the end of the run.
-
-First, read and parse the plan to identify each major section that can be enhanced with research.
-
-
-**Read the plan file and extract:**
-- [ ] Overview/Problem Statement
-- [ ] Proposed Solution sections
-- [ ] Technical Approach/Architecture
-- [ ] Implementation phases/steps
-- [ ] Code examples and file references
-- [ ] Acceptance criteria
-- [ ] Any UI/UX components mentioned
-- [ ] Technologies/frameworks mentioned (Rails, React, Python, TypeScript, etc.)
-- [ ] Domain areas (data models, APIs, UI, security, performance, etc.)
-
-**Create a section manifest:**
+Format each checkpoint as:
```
-Section 1: [Title] - [Brief description of what to research]
-Section 2: [Title] - [Brief description of what to research]
-...
+## Phase N: [Name] — [PASS/FAIL/PARTIAL]
+- Started: [timestamp from date command]
+- Completed: [timestamp]
+- Notes: [what happened, any issues]
+- Files created: [list]
```
+
-### 2. Discover and Apply Available Skills
+## Main Tasks
-
-Dynamically discover all available skills and match them to plan sections. Don't assume what skills exist - discover them at runtime.
-
+### 1. Prepare the Scratchpad Directory
-**Step 1: Discover ALL available skills from ALL sources**
+
+Use a project-relative path, NOT /tmp/. The /tmp/ path causes two problems:
+1. Claude Code's Read tool and MCP filesystem tools cannot access /tmp/ (outside allowed directories)
+2. On Windows, /tmp/ resolves to different locations depending on the subprocess
+
```bash
-# 1. Project-local skills (highest priority - project-specific)
-ls .claude/skills/
-
-# 2. User's global skills (~/.claude/)
-ls ~/.claude/skills/
+DEEPEN_DIR=".deepen"
+rm -rf "$DEEPEN_DIR"
+mkdir -p "$DEEPEN_DIR"
+grep -qxF '.deepen/' .gitignore 2>/dev/null || echo '.deepen/' >> .gitignore
+
+# Copy the plan file (path from the Plan File section above) into the scratchpad
+cp "[plan file path]" "$DEEPEN_DIR/original_plan.md"
+
+# Initialize pipeline log
+echo "# Deepen Plan Pipeline Log" > "$DEEPEN_DIR/PIPELINE_LOG.md"
+echo "" >> "$DEEPEN_DIR/PIPELINE_LOG.md"
+echo "## Phase 0: Setup — PASS" >> "$DEEPEN_DIR/PIPELINE_LOG.md"
+echo "- Started: $(date -u +%H:%M:%S)" >> "$DEEPEN_DIR/PIPELINE_LOG.md"
+echo "- Plan copied to .deepen/original_plan.md" >> "$DEEPEN_DIR/PIPELINE_LOG.md"
+echo "" >> "$DEEPEN_DIR/PIPELINE_LOG.md"
+```
-# 3. compound-engineering plugin skills
-ls ~/.claude/plugins/cache/*/compound-engineering/*/skills/
+### 2. Analyze Plan Structure (Phase 1 — Sequential)
-# 4. ALL other installed plugins - check every plugin for skills
-find ~/.claude/plugins/cache -type d -name "skills" 2>/dev/null
+
+Run this BEFORE discovering or launching any agents. This produces the structured manifest that drives intelligent agent selection.
+
-# 5. Also check installed_plugins.json for all plugin locations
-cat ~/.claude/plugins/installed_plugins.json
+```
+Task plan-analyzer("
+You are a Plan Structure Analyzer. Parse a development plan into a structured manifest.
+
+## Instructions:
+1. Read .deepen/original_plan.md
+
+2. **GROUND versions in actual dependency files — do NOT trust plan text for versions.**
+ Resolve framework/library versions using this priority order (highest trust first):
+ a. **Lockfile** (exact resolved versions): Glob for package-lock.json, yarn.lock, pnpm-lock.yaml, Gemfile.lock, poetry.lock. Read the relevant entries.
+ b. **Dependency file** (semver ranges): Read package.json, Gemfile, pyproject.toml, etc. Extract version ranges.
+ c. **Plan text** (lowest trust): Only use versions stated in the plan if no dependency file exists. Mark as unverified.
+
+ For each technology, record:
+ - The resolved version from the lockfile/dependency file
+ - Whether the plan text stated a different version (version mismatch)
+ - The source: \"lockfile\", \"dependency_file\", or \"plan_text_unverified\"
+
+3. Write your analysis to .deepen/PLAN_MANIFEST.json using this EXACT schema:
+
+{
+ \"plan_title\": \"\",
+ \"plan_path\": \"\",
+ \"technologies\": [\"Rails\", \"React\", \"TypeScript\", ...],
+ \"domains\": [\"authentication\", \"caching\", \"API design\", ...],
+ \"sections\": [
+ {
+ \"id\": 1,
+ \"title\": \"\",
+ \"summary\": \"<1-2 sentences>\",
+ \"technologies\": [\"subset\"],
+ \"domains\": [\"subset\"],
+ \"has_code_examples\": true|false,
+ \"has_ui_components\": true|false,
+ \"has_data_models\": true|false,
+ \"has_api_design\": true|false,
+ \"has_security_concerns\": true|false,
+ \"has_performance_concerns\": true|false,
+ \"has_testing_strategy\": true|false,
+ \"has_deployment_concerns\": true|false,
+ \"enhancement_opportunities\": \"\"
+ }
+ ],
+ \"frameworks_with_versions\": {
+ \"React\": {\"version\": \"19.1.0\", \"source\": \"lockfile\"},
+ \"MUI\": {\"version\": \"7.3.7\", \"source\": \"lockfile\"}
+ },
+ \"version_mismatches\": [
+ {
+ \"technology\": \"MUI\",
+ \"plan_stated\": \"5\",
+ \"actual_resolved\": \"7.3.7\",
+ \"source\": \"lockfile\",
+ \"impact\": \"All MUI API recommendations must target v7, not v5\"
+ }
+ ],
+ \"overall_risk_areas\": [\"\"],
+ \"acceptance_criteria_count\": ,
+ \"implementation_phases_count\":
+}
+
+4. Also write a human-readable summary to .deepen/PLAN_MANIFEST.md (max 300 words). If version mismatches were found, list them prominently at the top.
+
+5. Return to parent: 'Plan analysis complete. <N> sections identified across <M> technologies. [X version mismatches found.] Written to .deepen/PLAN_MANIFEST.json'
+")
```
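+
+As a concrete sketch of the lockfile grounding in step 2 (assumes an npm project; the package name is illustrative), the analyzer can read an exact resolved version like this:
+
+```bash
+# package-lock.json (npm v7+) keys installed packages by node_modules/<name>
+node -e "
+const lock = require('./package-lock.json');
+const entry = lock.packages && lock.packages['node_modules/@mui/material'];
+console.log(entry ? entry.version : 'not installed');
+"
+```
+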
-**Important:** Check EVERY source. Don't assume compound-engineering is the only plugin. Use skills from ANY installed plugin that's relevant.
-
-**Step 2: For each discovered skill, read its SKILL.md to understand what it does**
+Wait for completion. Then log checkpoint:
```bash
-# For each skill directory found, read its documentation
-cat [skill-path]/SKILL.md
+echo "## Phase 1: Plan Analysis — $([ -f .deepen/PLAN_MANIFEST.json ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md
+echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
+echo "- Files: $(ls .deepen/PLAN_MANIFEST.* 2>/dev/null)" >> .deepen/PIPELINE_LOG.md
+echo "" >> .deepen/PIPELINE_LOG.md
```
-**Step 3: Match skills to plan content**
+### 3. Discover Available Skills, Learnings, and Agents (Phase 2)
-For each skill discovered:
-- Read its SKILL.md description
-- Check if any plan sections match the skill's domain
-- If there's a match, spawn a sub-agent to apply that skill's knowledge
+
+This step runs in the PARENT context. Discovery only — read directory listings and frontmatter, NOT full file contents. Keep it lightweight.
+
-**Step 4: Spawn a sub-agent for EVERY matched skill**
+#### Step 3a: Discover Skills
-**CRITICAL: For EACH skill that matches, spawn a separate sub-agent and instruct it to USE that skill.**
+Use Claude Code's native tools:
-For each matched skill:
```
-Task general-purpose: "You have the [skill-name] skill available at [skill-path].
+Glob: .claude/skills/*/SKILL.md
+Glob: ~/.claude/skills/*/SKILL.md
+Glob: ~/.claude/plugins/cache/**/skills/*/SKILL.md
+```
-YOUR JOB: Use this skill on the plan.
+For each discovered SKILL.md, Read first 10 lines only (frontmatter/description).
-1. Read the skill: cat [skill-path]/SKILL.md
-2. Follow the skill's instructions exactly
-3. Apply the skill to this content:
+#### Step 3b: Discover Learnings
-[relevant plan section or full plan]
+**Preferred method:** If the compound-engineering plugin's `learnings-researcher` agent is available (check `~/.claude/plugins/cache/**/agents/research/learnings-researcher.md`), use it as a single dedicated agent in Step 4 instead of spawning per-file learning agents. It searches `docs/solutions/` by frontmatter metadata — one specialized agent replaces N generic ones with better quality.
-4. Return the skill's full output
+**Fallback (no learnings-researcher available):**
-The skill tells you what to do - follow it. Execute the skill completely."
+```
+Glob: docs/solutions/**/*.md
```
-**Spawn ALL skill sub-agents in PARALLEL:**
-- 1 sub-agent per matched skill
-- Each sub-agent reads and uses its assigned skill
-- All run simultaneously
-- 10, 20, 30 skill sub-agents is fine
+For each found file, Read first 15 lines (frontmatter only).
-**Each sub-agent:**
-1. Reads its skill's SKILL.md
-2. Follows the skill's workflow/instructions
-3. Applies the skill to the plan
-4. Returns whatever the skill produces (code, recommendations, patterns, reviews, etc.)
+#### Step 3c: Discover Review/Research Agents
-**Example spawns:**
```
-Task general-purpose: "Use the dhh-rails-style skill at ~/.claude/plugins/.../dhh-rails-style. Read SKILL.md and apply it to: [Rails sections of plan]"
+Glob: .claude/agents/*.md
+Glob: ~/.claude/agents/*.md
+Glob: ~/.claude/plugins/cache/**/agents/**/*.md
+```
+
+For compound-engineering plugin agents:
+- USE: `agents/review/*`, `agents/research/*`, `agents/design/*`, `agents/docs/*`
+- SKIP: `agents/workflow/*` (workflow orchestrators, not reviewers)
+
+#### Step 3d: Match Against Manifest
+
+Read `.deepen/PLAN_MANIFEST.json` and match discovered resources:
+
+**Skills** — Match if skill's domain overlaps with any plan technology or domain:
+- Rails plans -> `dhh-rails-style`
+- Ruby gem plans -> `andrew-kane-gem-writer`
+- Frontend/UI plans -> `frontend-design`
+- AI/agent plans -> `agent-native-architecture`
+- LLM integration plans -> `dspy-ruby`
+- Documentation plans -> `every-style-editor`, `compound-docs`
+- Skill creation plans -> `create-agent-skills`
+
+**Important:** Skills may have `references/` subdirectories. Instruct skill agents to also check `references/`, `assets/`, `templates/` directories within the skill path.
+
+**Special routing — `agent-native-architecture` skill:** This skill is interactive with a routing table. Do NOT use the generic skill template. Use the dedicated template in Step 4.
+
+**Learnings** — Match if tags, category, or module overlaps with plan technologies/domains.
+
+**Agents** — Two tiers:
+
+**Always run (cross-cutting):**
+- Security agents (security-sentinel)
+- Architecture agents (architecture-strategist)
+- Performance agents (performance-oracle)
+- Project Architecture Challenger (see Step 4)
+
+**Manifest-matched (run if domain overlap):**
+- Framework-specific reviewers (dhh-rails-reviewer, kieran-rails-reviewer, kieran-typescript-reviewer, kieran-python-reviewer)
+- Domain-specific agents (data-integrity-guardian, deployment-verification-agent)
+- Frontend agents (julik-frontend-races-reviewer, design agents)
+- Code quality agents (code-simplicity-reviewer, pattern-recognition-specialist)
+- Agent-native reviewer (for plans involving agent/tool features)
+
+#### Handling Sparse Discovery
-Task general-purpose: "Use the frontend-design skill at ~/.claude/plugins/.../frontend-design. Read SKILL.md and apply it to: [UI sections of plan]"
+If few/no matched skills/learnings found, acknowledge: "Limited institutional knowledge available. Enhancement based primarily on framework documentation and cross-cutting analysis."
-Task general-purpose: "Use the agent-native-architecture skill at ~/.claude/plugins/.../agent-native-architecture. Read SKILL.md and apply it to: [agent/tool sections of plan]"
+Write matched resources list to `.deepen/MATCHED_RESOURCES.md`.
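+
+One line per matched resource, prefixed with its type, keeps the file human-readable and countable by the greps below. A minimal sketch (entries are illustrative):
+
+```bash
+cat > .deepen/MATCHED_RESOURCES.md <<'EOF'
+skill: dhh-rails-style (matched on technology: Rails)
+learning: docs/solutions/performance-issues/n-plus-one-queries.md (tag overlap: activerecord)
+agent: security-sentinel (always-run, cross-cutting)
+EOF
+```
+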
-Task general-purpose: "Use the security-patterns skill at ~/.claude/skills/security-patterns. Read SKILL.md and apply it to: [full plan]"
+Log checkpoint:
+
+```bash
+echo "## Phase 2: Discovery — PASS" >> .deepen/PIPELINE_LOG.md
+echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
+echo "- Skills found: $(grep -c 'skill' .deepen/MATCHED_RESOURCES.md 2>/dev/null || echo 0)" >> .deepen/PIPELINE_LOG.md
+echo "- Learnings found: $(grep -c 'learning' .deepen/MATCHED_RESOURCES.md 2>/dev/null || echo 0)" >> .deepen/PIPELINE_LOG.md
+echo "- Agents found: $(grep -c 'agent' .deepen/MATCHED_RESOURCES.md 2>/dev/null || echo 0)" >> .deepen/PIPELINE_LOG.md
+echo "" >> .deepen/PIPELINE_LOG.md
```
-**No limit on skill sub-agents. Spawn one for every skill that could possibly be relevant.**
+### 4. Launch Research Agents (Phase 3 — Batched Parallel)
+
+
+KNOWN ISSUE: When 10+ Task() agents return simultaneously, they can dump ~100-200K tokens into the parent context at once. Claude Code's compaction triggers too late (~98-99% usage) and the session locks up (anthropics/claude-code#11280, #8136).
+
+MITIGATION: Launch agents in BATCHES of 3-4. Wait for each batch to complete before launching the next. This caps simultaneous returns and gives compaction room to fire between batches.
-### 3. Discover and Apply Learnings/Solutions
+Batch order:
+- **Batch 1:** Always-run cross-cutting agents (security-sentinel, architecture-strategist, performance-oracle, project-architecture-challenger)
+- **Batch 2:** Manifest-matched review agents (framework reviewers, domain agents, code quality)
+- **Batch 3:** Skill agents + learnings-researcher
+- **Batch 4:** Docs-researchers (one per technology)
-
-Check for documented learnings from /workflows:compound. These are solved problems stored as markdown files. Spawn a sub-agent for each learning to check if it's relevant.
-
+Wait for each batch to fully complete before starting the next. Between batches, log a checkpoint.
+
-**LEARNINGS LOCATION - Check these exact folders:**
+
+EVERY agent prompt MUST include these output constraints. This prevents context overflow.
+
+Append this SHARED_CONTEXT + OUTPUT_RULES block to EVERY agent spawn prompt:
```
-docs/solutions/ <-- PRIMARY: Project-level learnings (created by /workflows:compound)
-├── performance-issues/
-│ └── *.md
-├── debugging-patterns/
-│ └── *.md
-├── configuration-fixes/
-│ └── *.md
-├── integration-issues/
-│ └── *.md
-├── deployment-issues/
-│ └── *.md
-└── [other-categories]/
- └── *.md
+## SHARED CONTEXT
+Read .deepen/PLAN_MANIFEST.md first for plan overview, technologies, and risk areas.
+Read .deepen/original_plan.md for the full plan content.
+
+## OUTPUT RULES (MANDATORY — VIOLATION CAUSES SESSION CRASH)
+1. Write your FULL analysis as JSON to .deepen/{your_agent_name}.json
+2. Use this EXACT schema:
+ {
+ "agent_type": "skill|learning|research|review",
+ "agent_name": "",
+ "source_type": "skill|documented-learning|official-docs|community-web",
+ "summary": "<500 chars max>",
+ "tools_used": ["read_file:path", "web_search:query", ...],
+ "recommendations": [
+ {
+      "section_id": <number>,
+ "type": "best-practice|edge-case|anti-pattern|performance|security|code-example|architecture|ux|testing",
+ "title": "<100 chars>",
+ "recommendation": "<500 chars>",
+ "code_example": "",
+ "references": [""],
+ "priority": "high|medium|low",
+ "confidence": 0.0-1.0
+ }
+ ],
+ "truncated_count": 0
+ }
+3. Max 8 recommendations per agent. Prioritize by impact.
+4. Only include recommendations with confidence >= 0.6.
+5. Every recommendation MUST reference a NUMERIC section_id from the plan manifest (e.g., 1, 2, 3 — NOT "Phase-1" or "Phase-1-Types-Store"). String section IDs will be silently dropped by section judges.
+6. Code examples are ENCOURAGED.
+7. tools_used is MANDATORY. If empty, set confidence to 0.5 max.
+8. **truncated_count is REQUIRED (default 0).** If you had more recommendations beyond the 8 cap, set this to the number you omitted. Example: you found 12 relevant issues but only wrote the top 8 → truncated_count: 4. The judge uses this to weight convergence signals.
+9. **CRITICAL — YOUR RETURN MESSAGE TO PARENT MUST BE UNDER 200 CHARACTERS.**
+   Return ONLY this exact format:
+   "Done. <N> recs for <M> sections in .deepen/{agent_name}.json"
+ Do NOT return recommendations, analysis, code, or explanations to the parent.
+ Do NOT summarize your findings in the return message.
+ ALL analysis goes in the JSON file. The return message is just a completion signal.
+ If you return more than 200 characters, you risk crashing the parent session.
```
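+
+For concreteness, a filled-in agent file that satisfies these rules might look like this (all values illustrative; the agent writes this file itself, the heredoc is just to show the shape):
+
+```bash
+cat > .deepen/security-sentinel.json <<'EOF'
+{
+  "agent_type": "review",
+  "agent_name": "security-sentinel",
+  "source_type": "official-docs",
+  "summary": "Reviewed caching and auth sections; one high-priority finding.",
+  "tools_used": ["read_file:.deepen/original_plan.md", "web_search:rails cache key scoping"],
+  "recommendations": [
+    {
+      "section_id": 2,
+      "type": "security",
+      "title": "Scope cache keys per user",
+      "recommendation": "Derive cache keys from user id + resource id so one user's cached response is never served to another.",
+      "code_example": "",
+      "references": ["https://guides.rubyonrails.org/caching_with_rails.html"],
+      "priority": "high",
+      "confidence": 0.8
+    }
+  ],
+  "truncated_count": 0
+}
+EOF
+```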
+
-**Step 1: Find ALL learning markdown files**
+#### Batch Execution
-Run these commands to get every learning file:
+
+DO NOT launch all agents at once. Follow this batch sequence:
-```bash
-# PRIMARY LOCATION - Project learnings
-find docs/solutions -name "*.md" -type f 2>/dev/null
+**BATCH 1 — Cross-cutting (always-run):** Launch these four agents in parallel. Wait for ALL to complete.
+- security-sentinel
+- architecture-strategist
+- performance-oracle
+- project-architecture-challenger
+
+Log: `echo "## Phase 3a: Batch 1 (cross-cutting) — PASS" >> .deepen/PIPELINE_LOG.md`
+
+**BATCH 2 — Manifest-matched reviewers:** Launch matched review agents in parallel (max 4). Wait for ALL to complete.
+- Framework reviewers, domain agents, code quality agents, agent-native reviewer
+
+Log: `echo "## Phase 3b: Batch 2 (reviewers) — PASS" >> .deepen/PIPELINE_LOG.md`
+
+**BATCH 3 — Skills + Learnings:** Launch matched skill agents + learnings-researcher in parallel (max 4). Wait for ALL to complete.
+
+Log: `echo "## Phase 3c: Batch 3 (skills+learnings) — PASS" >> .deepen/PIPELINE_LOG.md`
+
+**BATCH 4 — Docs researchers:** Launch per-technology docs researchers in parallel (max 4). Wait for ALL to complete.
+
+Log: `echo "## Phase 3d: Batch 4 (docs) — PASS" >> .deepen/PIPELINE_LOG.md`
+
+If a batch has more than 4 agents, split it into sub-batches of 4. Never have more than 4 Task() calls pending simultaneously.
+
+
+#### Agent Templates
-# If docs/solutions doesn't exist, check alternate locations:
-find .claude/docs -name "*.md" -type f 2>/dev/null
-find ~/.claude/docs -name "*.md" -type f 2>/dev/null
+**For each matched SKILL:**
+```
+Task skill-agent("
+You have the [skill-name] skill at [skill-path].
+1. Read the skill: Read [skill-path]/SKILL.md
+2. Check for additional resources:
+ - Glob [skill-path]/references/*.md
+ - Glob [skill-path]/assets/*
+ - Glob [skill-path]/templates/*
+3. Read the plan context from .deepen/
+4. Apply the skill's expertise to the plan
+5. Write recommendations following the OUTPUT RULES
+" + SHARED_CONTEXT + OUTPUT_RULES)
```
-**Step 2: Read frontmatter of each learning to filter**
+**For each matched LEARNING:**
+```
+Task learning-agent("
+Read this learning file completely: [path]
+This documents a previously solved problem. Check if it applies to the plan.
+If relevant: write specific recommendations.
+If not relevant: write empty recommendations array with summary 'Not applicable: [reason]'
+" + SHARED_CONTEXT + OUTPUT_RULES)
+```
-Each learning file has YAML frontmatter with metadata. Read the first ~20 lines of each file to get:
+**For each matched REVIEW/RESEARCH AGENT:**
+```
+Task [agent-name]("
+Review this plan using your expertise. Focus on your domain.
-```yaml
----
-title: "N+1 Query Fix for Briefs"
-category: performance-issues
-tags: [activerecord, n-plus-one, includes, eager-loading]
-module: Briefs
-symptom: "Slow page load, multiple queries in logs"
-root_cause: "Missing includes on association"
----
+## PROJECT ARCHITECTURE CONTEXT
+Read the project's CLAUDE.md for project-specific architectural principles. Evaluate the plan against THESE principles.
+" + SHARED_CONTEXT + OUTPUT_RULES)
```
-**For each .md file, quickly scan its frontmatter:**
+**For each technology in the manifest, spawn a docs-researcher:**
+```
+Task docs-researcher-[technology]("
+Research current (2025-2026) best practices for [technology] [version if available].
+
+## Documentation Research Steps:
+1. Query Context7 MCP for official framework documentation:
+ - First: mcp__plugin_compound-engineering_context7__resolve-library-id for '[technology]'
+ - Then: mcp__plugin_compound-engineering_context7__query-docs with the resolved ID
+2. Web search for recent (2025-2026) articles, migration guides, changelog notes
+3. Search for version-specific changes if manifest includes a version
+4. Find concrete code patterns and configuration recommendations
+
+Budget: 3-5 searches per technology.
+" + SHARED_CONTEXT + OUTPUT_RULES)
+```
-```bash
-# Read first 20 lines of each learning (frontmatter + summary)
-head -20 docs/solutions/**/*.md
+**SPECIAL: `agent-native-architecture` skill (if matched):**
+```
+Task agent-native-architecture-reviewer("
+You are an Agent-Native Architecture Reviewer.
+
+## Instructions:
+1. Read [skill-path]/SKILL.md — focus on , ,
+2. Read these reference files:
+ - [skill-path]/references/from-primitives-to-domain-tools.md
+ - [skill-path]/references/mcp-tool-design.md
+ - [skill-path]/references/refactoring-to-prompt-native.md
+3. Read project's CLAUDE.md
+4. Read .deepen/ plan context
+
+## Apply these checks:
+- Does a new tool duplicate existing capability?
+- Does the tool encode business logic that should live in the agent prompt?
+- Are there two ways to accomplish the same outcome?
+- Is logic in the right layer?
+- Do hardcoded values belong in skills?
+- Are features truly needed now, or YAGNI?
+- Does the Architecture Review Checklist pass?
+
+Use type 'architecture' or 'anti-pattern' for findings.
+" + SHARED_CONTEXT + OUTPUT_RULES)
```
-**Step 3: Filter - only spawn sub-agents for LIKELY relevant learnings**
+**ALWAYS RUN: Project Architecture Challenger**
+```
+Task project-architecture-challenger("
+You are a Project Architecture Challenger. Your job is to CHALLENGE the plan's decisions against the project's own architectural principles.
+
+## Instructions:
+1. Read project's CLAUDE.md — extract architectural principles, patterns, conventions
+2. Read .deepen/original_plan.md
+3. Read .deepen/PLAN_MANIFEST.md
+
+## For each major decision, ask:
+- **Redundancy**: Does this duplicate something existing?
+- **Layer placement**: Is business logic in the right place?
+- **YAGNI enforcement**: Does the plan acknowledge YAGNI but build it anyway?
+- **Hardcoded vs emergent**: Are values hardcoded that could be discovered?
+- **Convention drift**: Does any decision contradict CLAUDE.md?
+- **Complexity budget**: Does each feature earn its complexity?
+
+High confidence (0.8+) when CLAUDE.md explicitly contradicts the plan.
+Medium confidence (0.6-0.7) for judgment calls.
+" + SHARED_CONTEXT + OUTPUT_RULES)
+```
-Compare each learning's frontmatter against the plan:
-- `tags:` - Do any tags match technologies/patterns in the plan?
-- `category:` - Is this category relevant? (e.g., skip deployment-issues if plan is UI-only)
-- `module:` - Does the plan touch this module?
-- `symptom:` / `root_cause:` - Could this problem occur with the plan?
+After ALL batches complete, log the overall checkpoint:
-**SKIP learnings that are clearly not applicable:**
-- Plan is frontend-only → skip `database-migrations/` learnings
-- Plan is Python → skip `rails-specific/` learnings
-- Plan has no auth → skip `authentication-issues/` learnings
+```bash
+AGENT_COUNT=$(ls .deepen/*.json 2>/dev/null | grep -v PLAN_MANIFEST | wc -l)
+echo "## Phase 3: Research Agents (All Batches) — $([ $AGENT_COUNT -gt 0 ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md
+echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
+echo "- Agent JSON files written: $AGENT_COUNT" >> .deepen/PIPELINE_LOG.md
+echo "- Files: $(ls .deepen/*.json 2>/dev/null | grep -v PLAN_MANIFEST)" >> .deepen/PIPELINE_LOG.md
+echo "" >> .deepen/PIPELINE_LOG.md
+```
-**SPAWN sub-agents for learnings that MIGHT apply:**
-- Any tag overlap with plan technologies
-- Same category as plan domain
-- Similar patterns or concerns
+
+Late agent completion notifications are expected and harmless. Because agents are batched, late notifications should be rare — but if you receive one after moving to Step 5+, ignore it. The agent's JSON file is already on disk.
+
-**Step 4: Spawn sub-agents for filtered learnings**
+### 5. Verify and Validate Agent Outputs (Phase 4)
-For each learning that passes the filter:
+#### Step 5a: Verify Expected Files Exist
+```bash
+# List every agent launched in Phase 3, space-separated (the names below are the always-run batch; add the rest)
+EXPECTED_AGENTS="security-sentinel architecture-strategist performance-oracle project-architecture-challenger"
+
+MISSING=""
+for agent in $EXPECTED_AGENTS; do
+ if ! ls .deepen/${agent}*.json 1>/dev/null 2>&1; then
+ MISSING="$MISSING $agent"
+ echo "MISSING: $agent"
+ fi
+done
+
+if [ -n "$MISSING" ]; then
+ echo "WARNING: Missing agent files:$MISSING"
+fi
```
-Task general-purpose: "
-LEARNING FILE: [full path to .md file]
-1. Read this learning file completely
-2. This learning documents a previously solved problem
+Re-launch missing agents before proceeding.
-Check if this learning applies to this plan:
+#### Step 5b: Validate JSON Schema and Flag Hallucination Risk
----
-[full plan content]
----
+
+Use Node.js for validation — Python3 may not be installed on Windows.
+
-If relevant:
-- Explain specifically how it applies
-- Quote the key insight or solution
-- Suggest where/how to incorporate it
-
-If NOT relevant after deeper analysis:
-- Say 'Not applicable: [reason]'
+```bash
+node -e "
+const fs = require('fs');
+const path = require('path');
+const files = fs.readdirSync('.deepen').filter(f => f.endsWith('.json') && f !== 'PLAN_MANIFEST.json');
+let valid = 0, invalid = 0, noTools = 0, totalTruncated = 0;
+for (const file of files) {
+ const fp = path.join('.deepen', file);
+ try {
+ const data = JSON.parse(fs.readFileSync(fp, 'utf8'));
+ if (Array.isArray(data.recommendations) === false) throw new Error('recommendations not an array');
+ if (data.recommendations.length > 8) throw new Error('too many recommendations: ' + data.recommendations.length);
+ if (typeof data.truncated_count !== 'number') throw new Error('missing required field: truncated_count');
+ for (let i = 0; i < data.recommendations.length; i++) {
+ const rec = data.recommendations[i];
+ if (rec.section_id == null) throw new Error('rec ' + i + ': missing section_id');
+ if (typeof rec.section_id !== 'number') {
+ console.log('WARNING: ' + file + ' rec ' + i + ': section_id is ' + JSON.stringify(rec.section_id) + ' (string) — must be numeric. Section judges may drop this rec.');
+ }
+ if (rec.type == null || rec.type === '') throw new Error('rec ' + i + ': missing type');
+ if (rec.recommendation == null || rec.recommendation === '') throw new Error('rec ' + i + ': missing recommendation');
+ }
+ const tools = data.tools_used || [];
+ const truncNote = data.truncated_count > 0 ? ' (truncated ' + data.truncated_count + ')' : '';
+ if (tools.length === 0) {
+ console.log('WARNING NO TOOLS: ' + file + truncNote);
+ noTools++;
+ } else {
+ console.log('VALID: ' + file + ' - ' + data.recommendations.length + ' recs, ' + tools.length + ' tools' + truncNote);
+ }
+ totalTruncated += data.truncated_count;
+ valid++;
+ } catch (e) {
+ console.log('INVALID: ' + file + ' -- ' + e.message + ' -- removing');
+ fs.unlinkSync(fp);
+ invalid++;
+ }
+}
+console.log('Summary: ' + valid + ' valid, ' + invalid + ' invalid, ' + noTools + ' no-tools-used, ' + totalTruncated + ' total truncated recs');
"
```
-**Example filtering:**
+Log checkpoint:
+
+```bash
+echo "## Phase 4: Validation — PASS" >> .deepen/PIPELINE_LOG.md
+echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
+echo "" >> .deepen/PIPELINE_LOG.md
```
-# Found 15 learning files, plan is about "Rails API caching"
-# SPAWN (likely relevant):
-docs/solutions/performance-issues/n-plus-one-queries.md # tags: [activerecord] ✓
-docs/solutions/performance-issues/redis-cache-stampede.md # tags: [caching, redis] ✓
-docs/solutions/configuration-fixes/redis-connection-pool.md # tags: [redis] ✓
+### 6. Judge Phase — Per-Section Parallel Judging + Merge (Phase 5)
-# SKIP (clearly not applicable):
-docs/solutions/deployment-issues/heroku-memory-quota.md # not about caching
-docs/solutions/frontend-issues/stimulus-race-condition.md # plan is API, not frontend
-docs/solutions/authentication-issues/jwt-expiry.md # plan has no auth
-```
+
+Do NOT read individual agent JSON files into parent context. Launch PARALLEL per-section JUDGE agents that each read them in their own context windows.
+
+The judge phase has two steps:
+1. **Section Judges** (parallel, batched) — One judge per manifest section. Each deduplicates, ranks, and assigns convergence signals for its section only.
+2. **Merge Judge** (sequential) — Reads all section judgments, resolves cross-section conflicts, identifies cross-section convergence, produces final consolidated output.
-**Spawn sub-agents in PARALLEL for all filtered learnings.**
+This replaces the single monolithic judge, cutting judge time from ~21 min to ~8-10 min.
+
-**These learnings are institutional knowledge - applying them prevents repeating past mistakes.**
+#### Step 6a: Read section count and plan batching
-### 4. Launch Per-Section Research Agents
+Read `.deepen/PLAN_MANIFEST.json` to get the section count. Calculate how many judge batches are needed (max 4 per batch).
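+
+A minimal sketch of that calculation:
+
+```bash
+# Judge batches = ceil(section count / 4)
+node -e "
+const m = require('./.deepen/PLAN_MANIFEST.json');
+console.log('sections:', m.sections.length, '| judge batches:', Math.ceil(m.sections.length / 4));
+"
+```
+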
-
-For each major section in the plan, spawn dedicated sub-agents to research improvements. Use the Explore agent type for open-ended research.
-
+#### Step 6b: Launch Per-Section Judges (batched)
-**For each identified section, launch parallel research:**
+For each section in the manifest, launch a section judge. Batch in groups of max 4, wait for each batch to complete.
```
-Task Explore: "Research best practices, patterns, and real-world examples for: [section topic].
-Find:
-- Industry standards and conventions
-- Performance considerations
-- Common pitfalls and how to avoid them
-- Documentation and tutorials
-Return concrete, actionable recommendations."
+Task judge-section-N("
+You are a Section Judge for section N: '[section_title]'. Consolidate recommendations targeting THIS section only.
+
+## Instructions:
+1. Read .deepen/PLAN_MANIFEST.json for section N's structure
+2. Read ALL JSON files in .deepen/*.json (skip PLAN_MANIFEST.json, skip JUDGED_*.json)
+3. Collect ONLY recommendations where section_id == N
+
+4. EVIDENCE CHECK: If tools_used is empty AND source_type is NOT 'skill', downweight confidence by 0.2.
+
+5. Within this section's recommendations:
+ a. DEDUPLICATE: Remove semantically similar recs (keep higher-confidence)
+ b. RESOLVE CONFLICTS: Prefer higher attribution priority source
+ c. RANK by: source_type priority FIRST, then priority, then confidence
+ d. SELECT top 8 maximum
+
+**Source Attribution Priority (highest to lowest):**
+- skill — Institutional knowledge
+- documented-learning — Previously solved problems
+- official-docs — Framework documentation
+- community-web — Blog posts, tutorials
+
+6. Preserve code_example fields
+
+7. Assign impact level:
+ - must_change — Plan has gap causing failures if not addressed
+ - should_change — Significant improvement
+ - consider — Valuable enhancement worth evaluating
+ - informational — Context or reference
+
+8. CONVERGENCE SIGNAL: If 3+ agents independently flagged the same concern, mark with convergence_count. TRUNCATION-AWARE: If an agent has truncated_count > 0, it may have had additional matching recommendations. If 2 agents converge AND both were truncated, treat as 3-agent strength.
+
+9. DEFENSIVE STACKING CHECK: If multiple recommendations add validation for the same data at different layers, flag as a cross-cutting concern.
+
+10. Write to .deepen/JUDGED_SECTION_N.json:
+
+{
+ \"section_id\": N,
+ \"section_title\": \"\",
+ \"raw_count\": ,
+ \"duplicates_removed\": ,
+ \"conflicts_resolved\": ,
+ \"recommendations\": [
+ {
+ \"id\": 1,
+ \"type\": \"best-practice|...\",
+ \"impact\": \"must_change|should_change|consider|informational\",
+ \"title\": \"<100 chars>\",
+ \"recommendation\": \"<500 chars>\",
+ \"code_example\": \"\",
+ \"references\": [\"...\"],
+ \"priority\": \"high|medium|low\",
+ \"confidence\": 0.0-1.0,
+ \"source_agents\": [\"agent1\", \"agent2\"],
+ \"convergence_count\":
+ }
+ ],
+ \"section_concerns\": [\"\"]
+}
+
+11. Return to parent: 'Section N judged. <raw> raw -> <kept> after dedup. Written to .deepen/JUDGED_SECTION_N.json'
+")
```
-**Also use Context7 MCP for framework documentation:**
+Log checkpoint per batch:
+```bash
+echo "## Phase 5a: Section Judges Batch [B] — PASS" >> .deepen/PIPELINE_LOG.md
+echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
+echo "" >> .deepen/PIPELINE_LOG.md
+```
+
+#### Step 6c: Data Prep Agent (mechanical — model: haiku)
+
+
+The merge judge previously failed due to OOM/timeout when reading 20+ files AND doing cross-section reasoning in one context. Split into two agents: a cheap data prep agent handles all I/O, then the merge judge focuses entirely on reasoning from a single pre-compiled input file.
+
-For any technologies/frameworks mentioned in the plan, query Context7:
```
-mcp__plugin_compound-engineering_context7__resolve-library-id: Find library ID for [framework]
-mcp__plugin_compound-engineering_context7__query-docs: Query documentation for specific patterns
+Task judge-data-prep("
+You are a Data Preparation Agent. Your job is purely mechanical — extract and compile data from multiple files into a single structured input for the merge judge. No judgment, no synthesis.
+
+## Instructions:
+1. Read .deepen/PLAN_MANIFEST.json — extract plan_title, section count
+2. Read ALL .deepen/JUDGED_SECTION_*.json files — extract each section's full recommendations array, raw_count, duplicates_removed, conflicts_resolved, section_concerns
+3. Read ALL agent JSON files in .deepen/*.json (skip PLAN_MANIFEST.json, JUDGED_*.json) — extract ONLY the agent_name and summary fields (ignore recommendations; those are already consolidated in the section judgments)
+
+4. Write to .deepen/MERGE_INPUT.json:
+
+{
+ \"plan_title\": \"\",
+ \"section_count\": ,
+ \"sections\": [
+ {
+ \"section_id\": ,
+ \"section_title\": \"\",
+ \"raw_count\": ,
+ \"duplicates_removed\": ,
+ \"conflicts_resolved\": ,
+ \"section_concerns\": [\"\"],
+ \"recommendations\": []
+ }
+ ],
+ \"agent_summaries\": [
+ {\"agent\": \"\", \"summary\": \"<500 chars>\"}
+ ],
+ \"totals\": {
+ \"total_raw\": ,
+ \"total_duplicates_removed\": ,
+ \"total_conflicts_resolved\":
+ }
+}
+
+5. Return to parent: 'Data prep complete. <N> sections, <M> agent summaries compiled to .deepen/MERGE_INPUT.json'
+", model: haiku)
```
-**Use WebSearch for current best practices:**
+#### Step 6d: Merge Judge (reasoning — reads one file)
-Search for recent (2024-2026) articles, blog posts, and documentation on topics in the plan.
+After data prep completes, the merge judge reads a single pre-compiled input and focuses entirely on cross-section analysis.
-### 5. Discover and Run ALL Review Agents
-
-
-Dynamically discover every available agent and run them ALL against the plan. Don't filter, don't skip, don't assume relevance. 40+ parallel agents is fine. Use everything available.
-
+```
+Task judge-merge("
+You are the Merge Judge. Your job is cross-section reasoning — conflict detection, convergence analysis, and final consolidation. All data has been pre-compiled for you in one file.
+
+## Instructions:
+1. Read .deepen/MERGE_INPUT.json — this contains ALL section judgments and agent summaries in one file. Do NOT read individual agent or section judge files.
+
+## Cross-Section Analysis (your unique job):
+2. CROSS-SECTION CONFLICTS: Check if any recommendation in Section A contradicts one in Section C (e.g., same file referenced with conflicting guidance on where logic should live). Flag conflicts with both section IDs and a resolution recommendation.
+
+3. CROSS-SECTION CONVERGENCE: Check if different sections independently recommend the same pattern (e.g., Section 1 recommends typed filterContext AND Section 3 recommends deriving from typed context). This strengthens both signals — note the cross-section reinforcement.
+
+4. RENUMBER recommendation IDs sequentially across all sections (1, 2, 3... not per-section).
+
+5. Write to .deepen/JUDGED_RECOMMENDATIONS.json:
+
+{
+ \"plan_title\": \"\",
+ \"total_raw_recommendations\": ,
+ \"duplicates_removed\": ,
+ \"conflicts_resolved\": ,
+ \"low_evidence_downweighted\": ,
+ \"sections\": [
+    <all section objects from MERGE_INPUT, recommendation IDs renumbered sequentially>
+ ],
+ \"cross_cutting_concerns\": [
+ {
+ \"title\": \"\",
+ \"description\": \"\",
+ \"affected_sections\": [1, 3, 5]
+ }
+ ],
+ \"agent_summaries\":
+}
+
+6. Return to parent: 'Merge complete. <N> total recs across <M> sections. <K> cross-section concerns. Written to .deepen/JUDGED_RECOMMENDATIONS.json'
+")
+```
-**Step 1: Discover ALL available agents from ALL sources**
+#### Step 6e: Validate Judge Output
```bash
-# 1. Project-local agents (highest priority - project-specific)
-find .claude/agents -name "*.md" 2>/dev/null
+node -e "
+const fs = require('fs');
+try {
+ const judged = JSON.parse(fs.readFileSync('.deepen/JUDGED_RECOMMENDATIONS.json', 'utf8'));
+ const manifest = JSON.parse(fs.readFileSync('.deepen/PLAN_MANIFEST.json', 'utf8'));
+ const manifestIds = new Set(manifest.sections.map(s => s.id));
+
+ if (Array.isArray(judged.sections) === false) throw new Error('sections not array');
+ if (judged.sections.length === 0) throw new Error('sections empty');
+
+ let totalRecs = 0;
+ for (const section of judged.sections) {
+ if (manifestIds.has(section.section_id) === false) {
+ console.log('WARNING: Section ID ' + section.section_id + ' not in manifest');
+ }
+ totalRecs += section.recommendations.length;
+ }
+ console.log('JUDGE VALID: ' + judged.sections.length + ' sections, ' + totalRecs + ' recommendations');
+} catch (e) {
+ console.log('JUDGE INVALID: ' + e.message);
+}
+"
+```
-# 2. User's global agents (~/.claude/)
-find ~/.claude/agents -name "*.md" 2>/dev/null
+Log checkpoint:
-# 3. compound-engineering plugin agents (all subdirectories)
-find ~/.claude/plugins/cache/*/compound-engineering/*/agents -name "*.md" 2>/dev/null
+```bash
+echo "## Phase 5: Judge (all sections + merge) — $([ -f .deepen/JUDGED_RECOMMENDATIONS.json ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md
+echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
+echo "" >> .deepen/PIPELINE_LOG.md
+```
-# 4. ALL other installed plugins - check every plugin for agents
-find ~/.claude/plugins/cache -path "*/agents/*.md" 2>/dev/null
+### 7. Enhance the Plan (Phase 6 — Synthesis)
-# 5. Check installed_plugins.json to find all plugin locations
-cat ~/.claude/plugins/installed_plugins.json
+
+Do NOT read judged recommendations into parent context. Launch a SYNTHESIS agent.
+
-# 6. For local plugins (isLocal: true), check their source directories
-# Parse installed_plugins.json and find local plugin paths
+```
+Task plan-enhancer("
+You are a Plan Enhancement Writer. Merge research recommendations into the original plan.
+
+## Instructions:
+1. Read .deepen/original_plan.md — source plan
+2. Read .deepen/JUDGED_RECOMMENDATIONS.json — consolidated findings
+3. Read .deepen/PLAN_MANIFEST.json — section structure
+
+## Enhancement Rules:
+
+### Output Structure — Two Audiences, Two Sections
+
+The enhanced plan MUST have two clearly separated parts:
+
+**PART 1: Decision Record** (top of file)
+This section is for reviewers and future-you. It explains WHAT changed from the original plan and WHY. It contains:
+- Enhancement Summary (counts, agents, dates)
+- Pre-Implementation Verification checklist
+- Key Improvements with agent consensus signals and [Strong Signal] markers
+- Research Insights (consolidated from all sections — NOT interleaved in the spec)
+- New Considerations Discovered
+- Fast Follow items
+- Cross-Cutting Concerns
+- Deferred items
+
+**PART 2: Implementation Spec** (rest of file)
+This section is for the developer implementing the plan. It is a clean, linear 'do this, then this, then this' document. It contains:
+- The original plan structure with enhancements merged seamlessly
+- Clean code blocks ready to copy — NO `// ENHANCED: ` annotations, NO `(Rec #X, Y agents)` references
+- No Research Insights blocks interrupting the flow
+- Clear marking of code snippets: label each code block as either final (ready to copy as-is) or pseudocode that depends on project-specific details and must be adapted
+
+Separate the two parts with:
+
+    ---
+    # Implementation Spec
+    ---
-**Important:** Check EVERY source. Include agents from:
-- Project `.claude/agents/`
-- User's `~/.claude/agents/`
-- compound-engineering plugin (but SKIP workflow/ agents - only use review/, research/, design/, docs/)
-- ALL other installed plugins (agent-sdk-dev, frontend-design, etc.)
-- Any local plugins
+### Preservation
-**For compound-engineering plugin specifically:**
-- USE: `agents/review/*` (all reviewers)
-- USE: `agents/research/*` (all researchers)
-- USE: `agents/design/*` (design agents)
-- USE: `agents/docs/*` (documentation agents)
-- SKIP: `agents/workflow/*` (these are workflow orchestrators, not reviewers)
+**All sections:** Preserve original section structure, ordering, and acceptance criteria.
-**Step 2: For each discovered agent, read its description**
+**Prose sections:** Preserve original text exactly. If a recommendation changes the guidance, rewrite the prose to incorporate the improvement naturally — do NOT append a separate 'Research Insights' block. The developer should read one coherent document, not an original + annotations.
-Read the first few lines of each agent file to understand what it reviews/analyzes.
+**Code blocks:** When must_change or should_change recommendations modify a code block, produce the FINAL corrected version. Do not annotate what changed — the Decision Record covers that. The developer should be able to copy the code block directly.
-**Step 3: Launch ALL agents in parallel**
+### Convergence Signals
-For EVERY agent discovered, launch a Task in parallel:
+When a recommendation has convergence_count >= 3, prefix it with **[Strong Signal — N agents]**. This means multiple independent agents flagged the same concern. Strong signals should:
+- Be given elevated visibility in the enhanced plan
+- Trigger a PR scope question: 'If this strong signal represents a standalone fix (e.g., type consolidation, performance fix), recommend it as a separate prerequisite PR rather than bundling into this feature PR.'
-```
-Task [agent-name]: "Review this plan using your expertise. Apply all your checks and patterns. Plan content: [full plan content]"
-```
+### Action Classification
-**CRITICAL RULES:**
-- Do NOT filter agents by "relevance" - run them ALL
-- Do NOT skip agents because they "might not apply" - let them decide
-- Launch ALL agents in a SINGLE message with multiple Task tool calls
-- 20, 30, 40 parallel agents is fine - use everything
-- Each agent may catch something others miss
-- The goal is MAXIMUM coverage, not efficiency
+Classify every recommendation into one of FOUR buckets:
-**Step 4: Also discover and run research agents**
+**implement** — Code changes for this PR. Merged directly into the Implementation Spec's prose and code blocks.
+**verify** — Checks before implementing. Go into Pre-Implementation Verification section.
+**fast_follow** — Out of scope for this PR but with real user-facing impact. These are NOT generic deferrals — they are specific, actionable items that should be ticketed before merge. Examples: type consolidation that multiple agents flagged, performance fixes unrelated to the feature, cleanup work that reduces technical debt. Go into Fast Follow section.
+**defer** — Lower-priority items or nice-to-haves. Go into Deferred section.
-Research agents (like `best-practices-researcher`, `framework-docs-researcher`, `git-history-analyzer`, `repo-research-analyst`) should also be run for relevant plan sections.
+The difference between fast_follow and defer: fast_follow items have real UX or reliability impact and MUST be ticketed. Deferred items are genuine nice-to-haves.
-### 6. Wait for ALL Agents and Synthesize Everything
+### Sequencing
-
-Wait for ALL parallel agents to complete - skills, research agents, review agents, everything. Then synthesize all findings into a comprehensive enhancement.
-
+State dependency relationships explicitly:
+- 'Fix X must be implemented before Fix Y because...'
+- 'Fix X and Fix Y are independent'
-**Collect outputs from ALL sources:**
+### Resolve Conditionals — Do Not Leave Forks for the Developer
-1. **Skill-based sub-agents** - Each skill's full output (code examples, patterns, recommendations)
-2. **Learnings/Solutions sub-agents** - Relevant documented learnings from /workflows:compound
-3. **Research agents** - Best practices, documentation, real-world examples
-4. **Review agents** - All feedback from every reviewer (architecture, security, performance, simplicity, etc.)
-5. **Context7 queries** - Framework documentation and patterns
-6. **Web searches** - Current best practices and articles
+If the plan provides alternative implementations contingent on codebase state (e.g., "if computeScopedFilterCounts is in-memory, use approach A; if DB-based, use approach B"), READ the actual codebase to determine which applies. Include ONLY the applicable approach in the Implementation Spec. Note the discarded alternative briefly in the Decision Record.
-**For each agent's findings, extract:**
-- [ ] Concrete recommendations (actionable items)
-- [ ] Code patterns and examples (copy-paste ready)
-- [ ] Anti-patterns to avoid (warnings)
-- [ ] Performance considerations (metrics, benchmarks)
-- [ ] Security considerations (vulnerabilities, mitigations)
-- [ ] Edge cases discovered (handling strategies)
-- [ ] Documentation links (references)
-- [ ] Skill-specific patterns (from matched skills)
-- [ ] Relevant learnings (past solutions that apply - prevent repeating mistakes)
+Do NOT leave "if X, do A; if Y, do B" in the Implementation Spec. The developer should never have to stop implementing to investigate which branch applies — that's the enhancer's job. If the codebase state genuinely cannot be determined (e.g., the file doesn't exist yet), state the assumption explicitly and pick one path.
-**Deduplicate and prioritize:**
-- Merge similar recommendations from multiple agents
-- Prioritize by impact (high-value improvements first)
-- Flag conflicting advice for human review
-- Group by plan section
+### Version Verification
-### 7. Enhance Plan Sections
+BEFORE suggesting any code change, check PLAN_MANIFEST.json's `frameworks_with_versions` for the resolved version. Do NOT suggest APIs that don't exist in the installed version:
+- If the manifest says React 19, verify the API exists in React 19 (not just React 18 or 20)
+- If the manifest says ES2022 target (check tsconfig.json if available), do NOT use ES2023+ APIs like Array.findLast
+- If the manifest has `version_mismatches`, use the ACTUAL resolved version, not what the plan text stated
+- When suggesting library APIs, verify they exist in the specific major version
-
-Merge research findings back into the plan, adding depth without changing the original structure.
-
+This single check prevents the most common category of enhancer-introduced bugs.
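+
+For example, a minimal sketch of checking the TypeScript compile target (assumes a tsconfig.json at the repo root; JSONC comments would need stripping first):
+
+    # Surface the compile target before suggesting newer APIs
+    grep -o '\"target\"[^,}]*' tsconfig.json
+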
-**Enhancement format for each section:**
+### Accessibility Verification
-```markdown
-## [Original Section Title]
+When suggesting CSS animations or transitions:
+- Verify `prefers-reduced-motion` fallbacks do NOT leave permanent visual artifacts (stuck opacity, stuck transforms, permanent overlays). Reduced-motion alternatives must be time-bounded or produce no visual change.
+- Verify `aria-live` regions are pre-mounted in the DOM, not conditionally rendered — screen readers silently drop announcements from newly mounted live regions.
-[Original content preserved]
+### Self-Consistency Check
-### Research Insights
+BEFORE writing the final output, review your own enhancement for internal contradictions:
+- If you say content should go in 'primacy position', verify it actually IS placed early in the file, not at the bottom
+- If you describe something as 'ephemeral', verify no other section assumes it persists
+- If you recommend a validation layer, check you haven't already recommended the same validation at another boundary
+- If two sections give conflicting guidance on where logic should live, resolve the conflict explicitly
-**Best Practices:**
-- [Concrete recommendation 1]
-- [Concrete recommendation 2]
+Flag any contradictions you catch as a note: '**Self-check:** [what was caught and resolved]'
-**Performance Considerations:**
-- [Optimization opportunity]
-- [Benchmark or metric to target]
+### Decision Record (PART 1)
-**Implementation Details:**
-```[language]
-// Concrete code example from research
-```
+Add this block at the TOP of the plan. This is the reviewer-facing section.
-**Edge Cases:**
-- [Edge case 1 and how to handle]
-- [Edge case 2 and how to handle]
+# Decision Record
-**References:**
-- [Documentation URL 1]
-- [Documentation URL 2]
-```
+**Deepened on:** [date]
+**Sections enhanced:** [count] of [total]
+**Research agents used:** [count]
+**Total recommendations applied:** [count] ([N] implement, [M] fast_follow, [P] defer)
-### 8. Add Enhancement Summary
+## Pre-Implementation Verification
+Run these checks BEFORE writing any code:
+1. [ ] [Verification task — e.g., confirm library version, check existing types]
-At the top of the plan, add a summary section:
+**IMPORTANT:** This is the ONLY location for the verification checklist. Do NOT repeat or duplicate this list in the Implementation Spec. The Implementation Spec should open with: "Run the Pre-Implementation Verification in the Decision Record above before starting."
-```markdown
-## Enhancement Summary
+## Implementation Sequence
+1. [Fix] — implement first because [reason]
-**Deepened on:** [Date]
-**Sections enhanced:** [Count]
-**Research agents used:** [List]
+## Key Improvements
+1. [Most impactful] — append [Strong Signal — N agents] if applicable
+2. [Second most impactful]
+3. [Third most impactful]
-### Key Improvements
-1. [Major improvement 1]
-2. [Major improvement 2]
-3. [Major improvement 3]
+## Research Insights
+Consolidated findings from all research agents. Organized by theme, not by plan section.
-### New Considerations Discovered
-- [Important finding 1]
-- [Important finding 2]
-```
+### [Theme 1 — e.g., State Management]
+- [Insight with source attribution]
+- [Insight with source attribution]
-### 9. Update Plan File
+### [Theme 2 — e.g., Accessibility]
+- [Insight with source attribution]
-**Write the enhanced plan:**
-- Preserve original filename
-- Add `-deepened` suffix if user prefers a new file
-- Update any timestamps or metadata
+## New Considerations Discovered
+- [Finding not in original plan]
-## Output Format
+## Fast Follow (ticket before merge)
+Items out of this PR's scope but with real user-facing impact:
+- [ ] [Item] — why it matters, suggested ticket scope
-Update the plan file in place (or if user requests a separate file, append `-deepened` after `-plan`, e.g., `2026-01-15-feat-auth-plan-deepened.md`).
+## Cross-Cutting Concerns
+- [Concern spanning multiple sections]
-## Quality Checks
+## Deferred to Future Work
+- [Item] — why deferred (low impact, speculative, or blocked)
-Before finalizing:
-- [ ] All original content preserved
-- [ ] Research insights clearly marked and attributed
-- [ ] Code examples are syntactically correct
-- [ ] Links are valid and relevant
-- [ ] No contradictions between sections
-- [ ] Enhancement summary accurately reflects changes
+---
+# Implementation Spec
+---
-## Post-Enhancement Options
+[The clean, implementation-ready plan follows here]
-After writing the enhanced plan, use the **AskUserQuestion tool** to present these options:
+### Content Rules
+- The Decision Record is for reviewers. The Implementation Spec is for developers. Do not mix audiences.
+- In the Implementation Spec: NO `// ENHANCED:` comments, NO `(Rec #X, Y agents)` references, NO `### Research Insights` blocks. Just clean, implementable guidance.
+- In the Decision Record: agent consensus signals, strong signal markers, and research attribution ARE appropriate.
+- Mark each code block to distinguish final, copy-ready code from pseudocode that depends on project-specific details.
+- Every must_change recommendation MUST appear in the Implementation Spec (merged naturally into the plan content).
+- Strong signal items (3+ agents) get **[Strong Signal]** prefix in the Decision Record and PR scope assessment.
+- When deferring an item that has UX consequences, add a bridge mitigation: a lightweight prompt-level or code-level workaround that partially addresses the gap until the full fix ships.
-**Question:** "Plan deepened at `[plan_path]`. What would you like to do next?"
+4. Write to .deepen/ENHANCED_PLAN.md
+5. Return to parent: 'Enhancement complete. Enhanced [N] of [M] sections with [count] recommendations ([count] implement, [count] fast_follow). Written to .deepen/ENHANCED_PLAN.md'
+")
+```
-**Options:**
-1. **View diff** - Show what was added/changed
-2. **Run `/technical_review`** - Get feedback from reviewers on enhanced plan
-3. **Start `/workflows:work`** - Begin implementing this enhanced plan
-4. **Deepen further** - Run another round of research on specific sections
-5. **Revert** - Restore original plan (if backup exists)
+Log checkpoint:
-Based on selection:
-- **View diff** → Run `git diff [plan_path]` or show before/after
-- **`/technical_review`** → Call the /technical_review command with the plan file path
-- **`/workflows:work`** → Call the /workflows:work command with the plan file path
-- **Deepen further** → Ask which sections need more research, then re-run those agents
-- **Revert** → Restore from git or backup
+```bash
+echo "## Phase 6: Enhancement — $([ -f .deepen/ENHANCED_PLAN.md ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md
+echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
+echo "" >> .deepen/PIPELINE_LOG.md
+```
-## Example Enhancement
+### 7b. Quality Review (Phase 6b — CoVe Pattern)
-**Before (from /workflows:plan):**
-```markdown
-## Technical Approach
+
+This is a POST-ENHANCEMENT verification agent. It reads ONLY the enhanced plan — NOT the intermediate recommendations. This context isolation prevents the reviewer from inheriting the enhancer's perspective.
+
-Use React Query for data fetching with optimistic updates.
```
-
-**After (from /workflows:deepen-plan):**
-```markdown
-## Technical Approach
-
-Use React Query for data fetching with optimistic updates.
-
-### Research Insights
-
-**Best Practices:**
-- Configure `staleTime` and `cacheTime` based on data freshness requirements
-- Use `queryKey` factories for consistent cache invalidation
-- Implement error boundaries around query-dependent components
-
-**Performance Considerations:**
-- Enable `refetchOnWindowFocus: false` for stable data to reduce unnecessary requests
-- Use `select` option to transform and memoize data at query level
-- Consider `placeholderData` for instant perceived loading
-
-**Implementation Details:**
-```typescript
-// Recommended query configuration
-const queryClient = new QueryClient({
- defaultOptions: {
- queries: {
- staleTime: 5 * 60 * 1000, // 5 minutes
- retry: 2,
- refetchOnWindowFocus: false,
- },
+Task quality-reviewer("
+You are a Plan Quality Reviewer using the Chain-of-Verification (CoVe) pattern. Your job is to find problems in the ENHANCED plan that the enhancement process may have introduced.
+
+## Instructions:
+1. Read .deepen/ENHANCED_PLAN.md — the enhanced plan to review
+2. Read .deepen/original_plan.md — the original for comparison
+3. Read .deepen/PLAN_MANIFEST.json — section structure
+
+## Step 1: Extract Claims
+List every concrete claim or instruction the enhanced plan makes. Focus on:
+- Where it says content should be placed (file, section, position)
+- What it describes as ephemeral vs persistent
+- What validation/checking layers it adds
+- What it says is in/out of scope
+- Sequencing dependencies between items
+
+## Step 2: Verification Questions
+For each claim, form a verification question:
+- 'The plan says X should go in primacy position — is it actually placed at the top of the file?'
+- 'The plan says suggestions are ephemeral — does any other section assume they persist?'
+- 'The plan adds validation at layer A — does it also add the same validation at layer B and C?'
+
+## Step 3: Code Block Completeness Check
+
+For every constant, type, function, or import referenced in the plan's code blocks:
+- Verify it is EITHER: (a) defined elsewhere in the plan, (b) listed in Pre-Implementation Verification as something to check/confirm, OR (c) a standard library/framework API
+- Flag any undefined references as 'undefined_references' in the output. Example: a code block uses `FILTER_KEY_TO_PRODUCT_FIELD[key]` but this constant is never defined in the plan and not in the verification checklist.
+
+## Step 4: Integration Test Coverage Check
+
+If the plan describes N interconnected layers or components of a feature (e.g., "three layers: delta counts + conversational repair + visual brushing"), verify there is at least ONE test that exercises all N layers end-to-end for the same user action. Flag missing cross-layer integration tests.
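+
+A passing cross-layer test might look like (illustrative sketch; element names and a testing-library setup are assumed):
+
+    // One user action asserted against every layer it touches:
+    test('applying a filter updates delta counts, repair prompt, and brushing', async () => {
+      await user.click(screen.getByRole('checkbox', { name: 'In stock' }));
+      expect(screen.getByTestId('delta-count')).toHaveTextContent('-8');  // layer 1: delta counts
+      expect(screen.getByRole('status')).toHaveTextContent(/removed/i);   // layer 2: conversational repair
+      expect(screen.getByTestId('brush-overlay')).toBeVisible();          // layer 3: visual brushing
+    });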
+
+## Step 5: Check and Report
+
+Write to .deepen/QUALITY_REVIEW.json:
+
+{
+ \"self_contradictions\": [
+ {
+ \"claim_a\": \"\",
+ \"claim_b\": \"\",
+ \"severity\": \"high|medium|low\",
+ \"suggested_resolution\": \"\"
+ }
+ ],
+ \"pr_scope_assessment\": {
+ \"recommended_split\": true|false,
+ \"reason\": \"\",
+ \"suggested_prs\": [
+ {
+ \"title\": \"\",
+ \"scope\": \"\",
+ \"rationale\": \"\"
+ }
+ ]
},
-});
+ \"defensive_stacking\": [
+ {
+ \"what\": \"\",
+ \"layers\": [\"schema\", \"backend\", \"frontend\"],
+ \"recommendation\": \"\"
+ }
+ ],
+ \"deferred_without_mitigation\": [
+ {
+ \"item\": \"\",
+ \"ux_consequence\": \"\",
+ \"bridge_mitigation\": \"\"
+ }
+ ],
+ \"undefined_references\": [
+ {
+ \"code_block_location\": \"\",
+ \"reference\": \"\",
+ \"suggestion\": \"\"
+ }
+ ],
+ \"missing_integration_tests\": [
+ {
+ \"layers\": [\"\", \"\", \"\"],
+ \"missing_test\": \"\",
+ \"user_action\": \"\"
+ }
+ ],
+ \"overall_quality\": \"good|needs_revision|major_issues\",
+ \"summary\": \"<200 chars — overall assessment>\"
+}
+
+4. Return to parent: 'Quality review complete. [overall_quality]. [count] self-contradictions, PR split: [yes/no], [count] defensive stacking issues. Written to .deepen/QUALITY_REVIEW.json'
+")
+```
+
+Log checkpoint:
+
+```bash
+echo "## Phase 6b: Quality Review — $([ -f .deepen/QUALITY_REVIEW.json ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md
+echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
+echo "" >> .deepen/PIPELINE_LOG.md
+```
+
+### 8. Verify Enhanced Plan Integrity (Phase 7)
+
+```bash
+node -e "
+const fs = require('fs');
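+// Normalize em/en dashes to ASCII so punctuation drift does not produce false missing-section results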
+const norm = s => s.replace(/\u2014/g, '--').replace(/\u2013/g, '-');
+const manifest = JSON.parse(fs.readFileSync('.deepen/PLAN_MANIFEST.json', 'utf8'));
+const enhanced = norm(fs.readFileSync('.deepen/ENHANCED_PLAN.md', 'utf8'));
+const enhancedLower = enhanced.toLowerCase();
+
+let found = 0, missing = [];
+for (const section of manifest.sections) {
+ const title = norm(section.title);
+ if (enhanced.includes(title)) {
+ found++;
+ } else if (enhancedLower.includes(title.toLowerCase())) {
+ found++;
+ console.log('FUZZY MATCH: ' + JSON.stringify(section.title) + ' (case mismatch but present)');
+ } else {
+ missing.push(section.title);
+ }
+}
+
+if (missing.length > 0) {
+ console.log('PRESERVATION FAILURE -- missing ' + missing.length + ' of ' + manifest.sections.length + ' sections:');
+ missing.forEach(t => console.log(' - ' + t));
+} else {
+ console.log('ALL ' + manifest.sections.length + ' sections preserved (' + found + ' found).');
+}
+"
+```
+
+Log checkpoint (single entry — the condensed script below only recomputes the PASS/PARTIAL flag for the log; do NOT log the preservation check twice):
+
+```bash
+PRES_RESULT=$(node -e "
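+// Condensed re-run of the check above; computes only the PASS/PARTIAL flag for the checkpoint log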
+const fs = require('fs');
+const norm = s => s.replace(/\u2014/g, '--').replace(/\u2013/g, '-');
+const m = JSON.parse(fs.readFileSync('.deepen/PLAN_MANIFEST.json', 'utf8'));
+const e = norm(fs.readFileSync('.deepen/ENHANCED_PLAN.md', 'utf8')).toLowerCase();
+const missing = m.sections.filter(s => e.includes(norm(s.title).toLowerCase()) === false);
+console.log(missing.length === 0 ? 'PASS' : 'PARTIAL');
+")
+echo "## Phase 7: Preservation Check — $PRES_RESULT" >> .deepen/PIPELINE_LOG.md
+echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
+echo "" >> .deepen/PIPELINE_LOG.md
+echo "## PIPELINE COMPLETE" >> .deepen/PIPELINE_LOG.md
+echo "- End: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md
```
-**Edge Cases:**
-- Handle race conditions with `cancelQueries` on component unmount
-- Implement retry logic for transient network failures
-- Consider offline support with `persistQueryClient`
+### 9. Present Enhanced Plan
-**References:**
-- https://tanstack.com/query/latest/docs/react/guides/optimistic-updates
-- https://tkdodo.eu/blog/practical-react-query
+#### Step 9a: Copy to Final Location
+
+```bash
+cp .deepen/ENHANCED_PLAN.md [plan_path]
+```
+
+#### Step 9b: Read Enhancement Summary and Quality Review
+
+Read ONLY the Enhancement Summary block from the top of the enhanced plan (first ~30 lines). Do NOT read the entire plan into parent context.
+
+Also read `.deepen/QUALITY_REVIEW.json` for the quality assessment. Present the quality findings alongside the enhancement summary.
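+
+A minimal sketch for pulling only the parent-facing fields (field names match the QUALITY_REVIEW.json template above):
+
+```bash
+node -e "
+const q = JSON.parse(require('fs').readFileSync('.deepen/QUALITY_REVIEW.json', 'utf8'));
+console.log('Overall: ' + q.overall_quality);
+console.log('Self-contradictions: ' + (q.self_contradictions || []).length);
+console.log('PR split recommended: ' + ((q.pr_scope_assessment || {}).recommended_split === true));
+console.log('Summary: ' + q.summary);
+"
+```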
+
+#### Step 9c: Present Summary
+
+```markdown
+## Plan Deepened
+
+**Plan:** [plan title]
+**File:** [path to enhanced plan]
+
+### Enhancement Summary:
+- **Sections Enhanced:** [N] of [M]
+- **Research Agents Used:** [count]
+- **Total Recommendations Applied:** [count]
+- **Duplicates Removed:** [count]
+
+### Key Improvements:
+1. [Most impactful]
+2. [Second most impactful]
+3. [Third most impactful]
+
+### New Considerations Discovered:
+- [Finding 1]
+- [Finding 2]
+
+### Quality Review:
+- **Overall:** [good/needs_revision/major_issues]
+- **Self-contradictions found:** [count] — [brief description if any]
+- **PR scope:** [single PR / recommend split into N PRs]
+ - [If split recommended: list suggested PRs]
+- **Defensive stacking:** [count] issues — [brief description if any]
+- **Deferred items needing bridge mitigation:** [count]
```
-NEVER CODE! Just research and enhance the plan.
+#### Step 9d: Present Pipeline Log
+
+Read and display the contents of `.deepen/PIPELINE_LOG.md` to the user so they can report diagnostics.
+
+#### Step 9e: Offer Next Steps
+
+**"Plan deepened. What would you like to do next?"**
+
+1. **View diff** — `git diff [plan_path]`
+2. **Run `/plan_review`** — Get review agents' feedback
+3. **Start `/workflows:work`** — Begin implementing
+4. **Deepen further** — Run another round on specific sections
+5. **Revert** — `git checkout [plan_path]`
+6. **Compound insights** — Run `/workflows:compound` to extract novel patterns
+
+## Appendix: Token Budget Reference
+
+| Component | Token Budget | Notes |
+|-----------|-------------|-------|
+| Plan manifest return | ~100 | One sentence + version mismatch count |
+| Discovery (listings) | ~1,000-2,000 | File lists, frontmatter |
+| Matched resources list | ~500 | Names and paths |
+| Per-agent summary (10-20) | ~100-150 each | One sentence + counts |
+| Validation script | ~0 | Bash (now reports truncated_count totals) |
+| Per-section judge returns (N) | ~100 each | One sentence per section |
+| Data prep agent return | ~100 | One sentence (compiles MERGE_INPUT.json) |
+| Merge judge return | ~100 | One sentence + cross-section count |
+| Enhancement return | ~100 | One sentence |
+| Quality review return | ~100 | One sentence |
+| Quality review JSON (parent reads) | ~500 | PR scope + contradictions |
+| Enhancement summary | ~500 | Top of plan |
+| Parent overhead | ~5,000 | Instructions, synthesis |
+| **Total parent from agents** | **~8,500-13,000** | **Slightly more returns but judge over 2x faster** |