diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index 563dfcd8..a1b7be99 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -12,7 +12,7 @@ { "name": "compound-engineering", "description": "AI-powered development tools that get smarter with every use. Make each unit of engineering work easier than the last. Includes 29 specialized agents, 22 commands, and 19 skills.", - "version": "2.33.0", + "version": "2.34.0", "author": { "name": "Kieran Klaassen", "url": "https://github.com/kieranklaassen", diff --git a/plugins/compound-engineering/.claude-plugin/plugin.json b/plugins/compound-engineering/.claude-plugin/plugin.json index a74039ac..9b35c5a7 100644 --- a/plugins/compound-engineering/.claude-plugin/plugin.json +++ b/plugins/compound-engineering/.claude-plugin/plugin.json @@ -1,6 +1,6 @@ { "name": "compound-engineering", - "version": "2.33.0", + "version": "2.34.0", "description": "AI-powered development tools. 29 agents, 22 commands, 19 skills, 1 MCP server for code review, research, design, and workflow automation.", "author": { "name": "Kieran Klaassen", diff --git a/plugins/compound-engineering/CHANGELOG.md b/plugins/compound-engineering/CHANGELOG.md index b80621c6..bce7f8d7 100644 --- a/plugins/compound-engineering/CHANGELOG.md +++ b/plugins/compound-engineering/CHANGELOG.md @@ -5,6 +5,59 @@ All notable changes to the compound-engineering plugin will be documented in thi The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [2.34.0] - 2026-02-13 + +### Changed + +- **`/deepen-plan` command** — Complete rewrite with **context-managed map-reduce** architecture to prevent context overflow. Validated across multiple real-world runs. + + **Architecture:** + - Sub-agents write full analysis JSON to `.deepen/` on disk, return only a ~200-char completion signal to parent + - Parent context stays under ~13k tokens regardless of agent count (vs unbounded in v1) + - 10-phase pipeline: Analyze → Discover → Research (batched) → Validate → Judge (parallel per-section + merge) → Enhance → Quality Review → Preservation Check → Present + + **Context Overflow Prevention (crash-tested):** + - **Batched agent launches** — Max 4 Task() agents pending simultaneously. Prevents session crash from simultaneous returns (anthropics/claude-code#11280, #8136) + - **200-char return cap** — Hard limit on agent return messages. 
All analysis lives in JSON files on disk + - **Task() failure recovery** — Retry once on silent infrastructure errors (`[Tool result missing due to internal error]`) + + **Version Grounding:** + - Plan-analyzer reads lockfile → package.json → plan text (priority order) to resolve actual framework versions + - Prevents downstream agents from researching wrong library versions (e.g., MUI 5 when project uses MUI 7) + - `version_mismatches` field flags discrepancies between plan text and actual dependencies + + **Per-Section Judge Parallelization:** + - Replaces single monolithic judge with parallel per-section judges + merge judge + - Section judges run in parallel (batched max 4), each deduplicates and ranks within its section + - Merge judge resolves cross-section conflicts, identifies cross-section convergence + - Reduced judge time from ~21 min to ~8-10 min in testing + + **Two-Part Output Structure:** + - **Decision Record** (reviewer-facing): Enhancement summary, agent consensus, research insights, strong signal markers, fast follow items, verification checklist + - **Implementation Spec** (developer-facing): Clean, linear implementation guidance with ready-to-copy code blocks — no `// ENHANCED:` annotations or `(Rec #X)` references + + **Quality Review (CoVe Pattern):** + - Post-enhancement agent checks for self-contradictions, PR scope assessment, defensive stacking, code completeness (undefined references), integration test gap detection, deferred items needing bridge mitigations + - Runs in isolated context — does not inherit enhancer's perspective + + **Enhancer Improvements:** + - **Resolve conditionals** — Reads codebase to determine which implementation path applies, eliminates "if X use A, if Y use B" forks + - **Version verification** — Checks `frameworks_with_versions` before suggesting APIs (prevents ES2023+ suggestions for ES2022 targets) + - **Accessibility verification** — Ensures `prefers-reduced-motion` fallbacks don't leave permanent visual artifacts + - **Convergence signals** — `[Strong Signal — N agents]` markers when 3+ agents independently flag same concern + - **`fast_follow` classification** — Fourth action bucket for items with real UX impact but out of PR scope (must be ticketed before merge) + + **Other Improvements:** + - `truncated_count` required field — Agents report omitted recommendations beyond 8-cap; judge weights convergence accordingly + - `learnings-researcher` integration — Single dedicated agent replaces N per-file learning agents + - Pipeline checkpoint logging to `.deepen/PIPELINE_LOG.md` for diagnostics + - Cross-platform safe: project-relative `.deepen/`, Node.js validation (no Python3 dependency) + - Architectural Decision Challenge phase with `project-architecture-challenger` agent + - `agent-native-architecture-reviewer` with dedicated skill routing + - PROJECT ARCHITECTURE CONTEXT block for all review/research agents + +--- + ## [2.33.0] - 2026-02-12 ### Added diff --git a/plugins/compound-engineering/commands/deepen-plan.md b/plugins/compound-engineering/commands/deepen-plan.md index a7054764..c8d3a6ed 100644 --- a/plugins/compound-engineering/commands/deepen-plan.md +++ b/plugins/compound-engineering/commands/deepen-plan.md @@ -4,543 +4,1188 @@ description: Enhance a plan with parallel research agents for each section to ad argument-hint: "[path to plan file]" --- -# Deepen Plan - Power Enhancement Mode +# Deepen Plan (v3 — Context-Managed Map-Reduce) + +**Note: The current year is 2026.** Use this when searching for recent 
documentation and best practices. + +Take an existing plan (from `/workflows:plan`) and enhance each section with parallel research, skill application, and review agents — using file-based synthesis to prevent context overflow while maximizing depth. ## Introduction -**Note: The current year is 2026.** Use this when searching for recent documentation and best practices. +Senior Technical Research Lead with expertise in architecture, best practices, and production-ready implementation patterns -This command takes an existing plan (from `/workflows:plan`) and enhances each section with parallel research agents. Each major element gets its own dedicated research sub-agent to find: -- Best practices and industry patterns -- Performance optimizations -- UI/UX improvements (if applicable) -- Quality enhancements and edge cases -- Real-world implementation examples +## Architecture: Phased File-Based Map-Reduce -The result is a deeply grounded, production-ready plan with concrete implementation details. +1. **Analyze Phase** (sequential) — Parse plan into structured manifest. **Grounds versions in lockfile/package.json**, not plan text. +2. **Discover Phase** (parent) — Find available skills, learnings, agents using Glob/Read. Match against manifest. +3. **Research Phase** (batched parallel) — Matched agents write structured recommendations to `.deepen/`, return only a completion signal. Agents report `truncated_count` when capped. +4. **Validate** — Verify all expected agent files exist, conform to schema (including required `truncated_count`), flag zero-tool-use hallucination risk. +5. **Judge Phase** (parallel per-section + data prep + merge) — Per-section judges run in parallel (batched, max 4). Data prep agent (haiku) compiles all results into a single `MERGE_INPUT.json`. Merge judge reads one file and focuses on cross-section conflict/convergence reasoning. +6. **Judge Validation** — Verify judge output references real manifest sections. +7. **Enhance Phase** — Synthesis agent reads consolidated recommendations + original plan, writes enhanced version. **Verifies APIs exist in resolved versions before suggesting code.** Classifies items as implement/verify/fast_follow/defer. Two-part output: Decision Record + Implementation Spec. +8. **Quality Review** — CoVe-pattern agent checks enhanced plan for self-contradictions, PR scope, defensive stacking, deferred items needing bridge mitigations. +9. **Preservation Check** — Single-pass verification that enhanced plan contains every original section. +10. **Present** — Parent reads enhancement summary + quality review and presents next steps. + +Parent context stays under ~15k tokens of agent output regardless of agent count. + +## Task() Failure Recovery + + +If any Task() call returns an error, empty result, or `[Tool result missing due to internal error]`, retry ONCE with identical parameters before failing the phase. This is a known Claude Code infrastructure issue — the subprocess can silently fail due to timeout, OOM, or connection drop. The retry almost always succeeds. Log the failure and retry in the pipeline log. + ## Plan File #$ARGUMENTS **If the plan path above is empty:** -1. Check for recent plans: `ls -la docs/plans/` -2. Ask the user: "Which plan would you like to deepen? Please provide the path (e.g., `docs/plans/2026-01-15-feat-my-feature-plan.md`)." +1. Check for recent plans: `ls -la plans/` +2. Ask the user: "Which plan would you like to deepen? Please provide the path." Do not proceed until you have a valid plan file path. 
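A minimal sketch of the fallback lookup, assuming plans live in `plans/` as above (adjust the glob if your project keeps them under `docs/plans/`):

```bash
# List candidate plan files, newest first, so the user can pick one
ls -t plans/*.md 2>/dev/null | head -5
```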
-## Main Tasks +## Checkpoint Logging -### 1. Parse and Analyze Plan Structure + +After EVERY phase, write a checkpoint to `.deepen/PIPELINE_LOG.md`. This is diagnostic — report these results back. - -First, read and parse the plan to identify each major section that can be enhanced with research. - - -**Read the plan file and extract:** -- [ ] Overview/Problem Statement -- [ ] Proposed Solution sections -- [ ] Technical Approach/Architecture -- [ ] Implementation phases/steps -- [ ] Code examples and file references -- [ ] Acceptance criteria -- [ ] Any UI/UX components mentioned -- [ ] Technologies/frameworks mentioned (Rails, React, Python, TypeScript, etc.) -- [ ] Domain areas (data models, APIs, UI, security, performance, etc.) - -**Create a section manifest:** +Format each checkpoint as: ``` -Section 1: [Title] - [Brief description of what to research] -Section 2: [Title] - [Brief description of what to research] -... +## Phase N: [Name] — [PASS/FAIL/PARTIAL] +- Started: [timestamp from date command] +- Completed: [timestamp] +- Notes: [what happened, any issues] +- Files created: [list] ``` + -### 2. Discover and Apply Available Skills +## Main Tasks - -Dynamically discover all available skills and match them to plan sections. Don't assume what skills exist - discover them at runtime. - +### 1. Prepare the Scratchpad Directory -**Step 1: Discover ALL available skills from ALL sources** + +Use a project-relative path, NOT /tmp/. The /tmp/ path causes two problems: +1. Claude Code's Read tool and MCP filesystem tools cannot access /tmp/ (outside allowed directories) +2. On Windows, /tmp/ resolves to different locations depending on the subprocess + ```bash -# 1. Project-local skills (highest priority - project-specific) -ls .claude/skills/ - -# 2. User's global skills (~/.claude/) -ls ~/.claude/skills/ +DEEPEN_DIR=".deepen" +rm -rf "$DEEPEN_DIR" +mkdir -p "$DEEPEN_DIR" +grep -qxF '.deepen/' .gitignore 2>/dev/null || echo '.deepen/' >> .gitignore + +cp "$DEEPEN_DIR/original_plan.md" + +# Initialize pipeline log +echo "# Deepen Plan Pipeline Log" > "$DEEPEN_DIR/PIPELINE_LOG.md" +echo "" >> "$DEEPEN_DIR/PIPELINE_LOG.md" +echo "## Phase 0: Setup — PASS" >> "$DEEPEN_DIR/PIPELINE_LOG.md" +echo "- Started: $(date -u +%H:%M:%S)" >> "$DEEPEN_DIR/PIPELINE_LOG.md" +echo "- Plan copied to .deepen/original_plan.md" >> "$DEEPEN_DIR/PIPELINE_LOG.md" +echo "" >> "$DEEPEN_DIR/PIPELINE_LOG.md" +``` -# 3. compound-engineering plugin skills -ls ~/.claude/plugins/cache/*/compound-engineering/*/skills/ +### 2. Analyze Plan Structure (Phase 1 — Sequential) -# 4. ALL other installed plugins - check every plugin for skills -find ~/.claude/plugins/cache -type d -name "skills" 2>/dev/null + +Run this BEFORE discovering or launching any agents. This produces the structured manifest that drives intelligent agent selection. + -# 5. Also check installed_plugins.json for all plugin locations -cat ~/.claude/plugins/installed_plugins.json +``` +Task plan-analyzer(" +You are a Plan Structure Analyzer. Parse a development plan into a structured manifest. + +## Instructions: +1. Read .deepen/original_plan.md + +2. **GROUND versions in actual dependency files — do NOT trust plan text for versions.** + Resolve framework/library versions using this priority order (highest trust first): + a. **Lockfile** (exact resolved versions): Glob for package-lock.json, yarn.lock, pnpm-lock.yaml, Gemfile.lock, poetry.lock. Read the relevant entries. + b. 
**Dependency file** (semver ranges): Read package.json, Gemfile, pyproject.toml, etc. Extract version ranges. + c. **Plan text** (lowest trust): Only use versions stated in the plan if no dependency file exists. Mark as unverified. + + For each technology, record: + - The resolved version from the lockfile/dependency file + - Whether the plan text stated a different version (version mismatch) + - The source: \"lockfile\", \"dependency_file\", or \"plan_text_unverified\" + +3. Write your analysis to .deepen/PLAN_MANIFEST.json using this EXACT schema: + +{ + \"plan_title\": \"\", + \"plan_path\": \"<original file path>\", + \"technologies\": [\"Rails\", \"React\", \"TypeScript\", ...], + \"domains\": [\"authentication\", \"caching\", \"API design\", ...], + \"sections\": [ + { + \"id\": 1, + \"title\": \"<section title>\", + \"summary\": \"<1-2 sentences>\", + \"technologies\": [\"subset\"], + \"domains\": [\"subset\"], + \"has_code_examples\": true|false, + \"has_ui_components\": true|false, + \"has_data_models\": true|false, + \"has_api_design\": true|false, + \"has_security_concerns\": true|false, + \"has_performance_concerns\": true|false, + \"has_testing_strategy\": true|false, + \"has_deployment_concerns\": true|false, + \"enhancement_opportunities\": \"<what research would improve this section>\" + } + ], + \"frameworks_with_versions\": { + \"React\": {\"version\": \"19.1.0\", \"source\": \"lockfile\"}, + \"MUI\": {\"version\": \"7.3.7\", \"source\": \"lockfile\"} + }, + \"version_mismatches\": [ + { + \"technology\": \"MUI\", + \"plan_stated\": \"5\", + \"actual_resolved\": \"7.3.7\", + \"source\": \"lockfile\", + \"impact\": \"All MUI API recommendations must target v7, not v5\" + } + ], + \"overall_risk_areas\": [\"<area>\"], + \"acceptance_criteria_count\": <number>, + \"implementation_phases_count\": <number> +} + +4. Also write a human-readable summary to .deepen/PLAN_MANIFEST.md (max 300 words). If version mismatches were found, list them prominently at the top. + +5. Return to parent: 'Plan analysis complete. <N> sections identified across <M> technologies. [X version mismatches found.] Written to .deepen/PLAN_MANIFEST.json' +") ``` -**Important:** Check EVERY source. Don't assume compound-engineering is the only plugin. Use skills from ANY installed plugin that's relevant. - -**Step 2: For each discovered skill, read its SKILL.md to understand what it does** +Wait for completion. Then log checkpoint: ```bash -# For each skill directory found, read its documentation -cat [skill-path]/SKILL.md +echo "## Phase 1: Plan Analysis — $([ -f .deepen/PLAN_MANIFEST.json ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md +echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md +echo "- Files: $(ls .deepen/PLAN_MANIFEST.* 2>/dev/null)" >> .deepen/PIPELINE_LOG.md +echo "" >> .deepen/PIPELINE_LOG.md ``` -**Step 3: Match skills to plan content** +### 3. Discover Available Skills, Learnings, and Agents (Phase 2) -For each skill discovered: -- Read its SKILL.md description -- Check if any plan sections match the skill's domain -- If there's a match, spawn a sub-agent to apply that skill's knowledge +<critical_instruction> +This step runs in the PARENT context. Discovery only — read directory listings and frontmatter, NOT full file contents. Keep lightweight. 
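For example, `head -10 .claude/skills/*/SKILL.md` (and the equivalent for the other skill locations) is usually enough to capture each skill's name and description without pulling full skill bodies into the parent context.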
+</critical_instruction> -**Step 4: Spawn a sub-agent for EVERY matched skill** +#### Step 3a: Discover Skills -**CRITICAL: For EACH skill that matches, spawn a separate sub-agent and instruct it to USE that skill.** +Use Claude Code's native tools: -For each matched skill: ``` -Task general-purpose: "You have the [skill-name] skill available at [skill-path]. +Glob: .claude/skills/*/SKILL.md +Glob: ~/.claude/skills/*/SKILL.md +Glob: ~/.claude/plugins/cache/**/skills/*/SKILL.md +``` -YOUR JOB: Use this skill on the plan. +For each discovered SKILL.md, Read first 10 lines only (frontmatter/description). -1. Read the skill: cat [skill-path]/SKILL.md -2. Follow the skill's instructions exactly -3. Apply the skill to this content: +#### Step 3b: Discover Learnings -[relevant plan section or full plan] +**Preferred method:** If the compound-engineering plugin's `learnings-researcher` agent is available (check `~/.claude/plugins/cache/**/agents/research/learnings-researcher.md`), use it as a single dedicated agent in Step 4 instead of spawning per-file learning agents. It searches `docs/solutions/` by frontmatter metadata — one specialized agent replaces N generic ones with better quality. -4. Return the skill's full output +**Fallback (no learnings-researcher available):** -The skill tells you what to do - follow it. Execute the skill completely." +``` +Glob: docs/solutions/**/*.md ``` -**Spawn ALL skill sub-agents in PARALLEL:** -- 1 sub-agent per matched skill -- Each sub-agent reads and uses its assigned skill -- All run simultaneously -- 10, 20, 30 skill sub-agents is fine +For each found file, Read first 15 lines (frontmatter only). -**Each sub-agent:** -1. Reads its skill's SKILL.md -2. Follows the skill's workflow/instructions -3. Applies the skill to the plan -4. Returns whatever the skill produces (code, recommendations, patterns, reviews, etc.) +#### Step 3c: Discover Review/Research Agents -**Example spawns:** ``` -Task general-purpose: "Use the dhh-rails-style skill at ~/.claude/plugins/.../dhh-rails-style. Read SKILL.md and apply it to: [Rails sections of plan]" +Glob: .claude/agents/*.md +Glob: ~/.claude/agents/*.md +Glob: ~/.claude/plugins/cache/**/agents/**/*.md +``` + +For compound-engineering plugin agents: +- USE: `agents/review/*`, `agents/research/*`, `agents/design/*`, `agents/docs/*` +- SKIP: `agents/workflow/*` (workflow orchestrators, not reviewers) + +#### Step 3d: Match Against Manifest + +Read `.deepen/PLAN_MANIFEST.json` and match discovered resources: + +**Skills** — Match if skill's domain overlaps with any plan technology or domain: +- Rails plans -> `dhh-rails-style` +- Ruby gem plans -> `andrew-kane-gem-writer` +- Frontend/UI plans -> `frontend-design` +- AI/agent plans -> `agent-native-architecture` +- LLM integration plans -> `dspy-ruby` +- Documentation plans -> `every-style-editor`, `compound-docs` +- Skill creation plans -> `create-agent-skills` + +**Important:** Skills may have `references/` subdirectories. Instruct skill agents to also check `references/`, `assets/`, `templates/` directories within the skill path. + +**Special routing — `agent-native-architecture` skill:** This skill is interactive with a routing table. Do NOT use the generic skill template. Use the dedicated template in Step 4. + +**Learnings** — Match if tags, category, or module overlaps with plan technologies/domains. 
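For example, a learning whose frontmatter looks like the sketch below — field names follow the `/workflows:compound` convention used in `docs/solutions/`; the values are illustrative — would match a Rails API caching plan on both `tags` and `category`:

```yaml
---
title: "Redis cache stampede fix"
category: performance-issues
tags: [caching, redis, activerecord]
module: Briefs
symptom: "Spike of identical queries when a hot cache key expires"
root_cause: "No lock or jitter around cache regeneration"
---
```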
+ +**Agents** — Two tiers: + +**Always run (cross-cutting):** +- Security agents (security-sentinel) +- Architecture agents (architecture-strategist) +- Performance agents (performance-oracle) +- Project Architecture Challenger (see Step 4) + +**Manifest-matched (run if domain overlap):** +- Framework-specific reviewers (dhh-rails-reviewer, kieran-rails-reviewer, kieran-typescript-reviewer, kieran-python-reviewer) +- Domain-specific agents (data-integrity-guardian, deployment-verification-agent) +- Frontend agents (julik-frontend-races-reviewer, design agents) +- Code quality agents (code-simplicity-reviewer, pattern-recognition-specialist) +- Agent-native reviewer (for plans involving agent/tool features) + +#### Handling Sparse Discovery -Task general-purpose: "Use the frontend-design skill at ~/.claude/plugins/.../frontend-design. Read SKILL.md and apply it to: [UI sections of plan]" +If few/no matched skills/learnings found, acknowledge: "Limited institutional knowledge available. Enhancement based primarily on framework documentation and cross-cutting analysis." -Task general-purpose: "Use the agent-native-architecture skill at ~/.claude/plugins/.../agent-native-architecture. Read SKILL.md and apply it to: [agent/tool sections of plan]" +Write matched resources list to `.deepen/MATCHED_RESOURCES.md`. -Task general-purpose: "Use the security-patterns skill at ~/.claude/skills/security-patterns. Read SKILL.md and apply it to: [full plan]" +Log checkpoint: + +```bash +echo "## Phase 2: Discovery — PASS" >> .deepen/PIPELINE_LOG.md +echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md +echo "- Skills found: $(grep -c 'skill' .deepen/MATCHED_RESOURCES.md 2>/dev/null || echo 0)" >> .deepen/PIPELINE_LOG.md +echo "- Learnings found: $(grep -c 'learning' .deepen/MATCHED_RESOURCES.md 2>/dev/null || echo 0)" >> .deepen/PIPELINE_LOG.md +echo "- Agents found: $(grep -c 'agent' .deepen/MATCHED_RESOURCES.md 2>/dev/null || echo 0)" >> .deepen/PIPELINE_LOG.md +echo "" >> .deepen/PIPELINE_LOG.md ``` -**No limit on skill sub-agents. Spawn one for every skill that could possibly be relevant.** +### 4. Launch Research Agents (Phase 3 — Batched Parallel) + +<critical_instruction> +KNOWN ISSUE: When 10+ Task() agents return simultaneously, they can dump ~100-200K tokens into the parent context at once. Claude Code's compaction triggers too late (~98-99% usage) and the session locks up (anthropics/claude-code#11280, #8136). + +MITIGATION: Launch agents in BATCHES of 3-4. Wait for each batch to complete before launching the next. This caps simultaneous returns and gives compaction room to fire between batches. -### 3. Discover and Apply Learnings/Solutions +Batch order: +- **Batch 1:** Always-run cross-cutting agents (security-sentinel, architecture-strategist, performance-oracle, project-architecture-challenger) +- **Batch 2:** Manifest-matched review agents (framework reviewers, domain agents, code quality) +- **Batch 3:** Skill agents + learnings-researcher +- **Batch 4:** Docs-researchers (one per technology) -<thinking> -Check for documented learnings from /workflows:compound. These are solved problems stored as markdown files. Spawn a sub-agent for each learning to check if it's relevant. -</thinking> +Wait for each batch to fully complete before starting the next. Between batches, log a checkpoint. +</critical_instruction> -**LEARNINGS LOCATION - Check these exact folders:** +<critical_instruction> +EVERY agent prompt MUST include these output constraints. This prevents context overflow. 
Append this SHARED_CONTEXT + OUTPUT_RULES block to EVERY agent spawn prompt:

```
## SHARED CONTEXT
Read .deepen/PLAN_MANIFEST.md first for plan overview, technologies, and risk areas.
Read .deepen/original_plan.md for the full plan content.

## OUTPUT RULES (MANDATORY — VIOLATION CAUSES SESSION CRASH)
1. Write your FULL analysis as JSON to .deepen/{your_agent_name}.json
2. Use this EXACT schema:
   {
     "agent_type": "skill|learning|research|review",
     "agent_name": "<your name>",
     "source_type": "skill|documented-learning|official-docs|community-web",
     "summary": "<500 chars max>",
     "tools_used": ["read_file:path", "web_search:query", ...],
     "recommendations": [
       {
         "section_id": <NUMBER from manifest — must be a numeric id like 1, 2, 3. NOT a string like "Phase-1">,
         "type": "best-practice|edge-case|anti-pattern|performance|security|code-example|architecture|ux|testing",
         "title": "<100 chars>",
         "recommendation": "<500 chars>",
         "code_example": "<optional, max 800 chars>",
         "references": ["<URL or doc>"],
         "priority": "high|medium|low",
         "confidence": 0.0-1.0
       }
     ],
     "truncated_count": 0
   }
3. Max 8 recommendations per agent. Prioritize by impact.
4. Only include recommendations with confidence >= 0.6.
5. Every recommendation MUST reference a NUMERIC section_id from the plan manifest (e.g., 1, 2, 3 — NOT "Phase-1" or "Phase-1-Types-Store"). String section IDs will be silently dropped by section judges.
6. Code examples are ENCOURAGED.
7. tools_used is MANDATORY. If empty, set confidence to 0.5 max.
8. **truncated_count is REQUIRED (default 0).** If you had more recommendations beyond the 8 cap, set this to the number you omitted. Example: you found 12 relevant issues but only wrote the top 8 → truncated_count: 4. The judge uses this to weight convergence signals.
9. **CRITICAL — YOUR RETURN MESSAGE TO PARENT MUST BE UNDER 200 CHARACTERS.**
   Return ONLY this exact format:
   "Done. <N> recs for <M> sections in .deepen/{agent_name}.json"
   Do NOT return recommendations, analysis, code, or explanations to the parent.
   Do NOT summarize your findings in the return message.
   ALL analysis goes in the JSON file. The return message is just a completion signal.
   If you return more than 200 characters, you risk crashing the parent session.
```
</critical_instruction>

#### Batch Execution

<critical_instruction>
DO NOT launch all agents at once. Follow this batch sequence:

**BATCH 1 — Cross-cutting (always-run):** Launch these four agents in parallel. Wait for ALL to complete.
- security-sentinel
- architecture-strategist
- performance-oracle
- project-architecture-challenger

Log: `echo "## Phase 3a: Batch 1 (cross-cutting) — PASS" >> .deepen/PIPELINE_LOG.md`

**BATCH 2 — Manifest-matched reviewers:** Launch matched review agents in parallel (max 4). Wait for ALL to complete.
+- Framework reviewers, domain agents, code quality agents, agent-native reviewer + +Log: `echo "## Phase 3b: Batch 2 (reviewers) — PASS" >> .deepen/PIPELINE_LOG.md` + +**BATCH 3 — Skills + Learnings:** Launch matched skill agents + learnings-researcher in parallel (max 4). Wait for ALL to complete. + +Log: `echo "## Phase 3c: Batch 3 (skills+learnings) — PASS" >> .deepen/PIPELINE_LOG.md` + +**BATCH 4 — Docs researchers:** Launch per-technology docs researchers in parallel (max 4). Wait for ALL to complete. + +Log: `echo "## Phase 3d: Batch 4 (docs) — PASS" >> .deepen/PIPELINE_LOG.md` + +If a batch has more than 4 agents, split it into sub-batches of 4. Never have more than 4 Task() calls pending simultaneously. +</critical_instruction> + +#### Agent Templates -# If docs/solutions doesn't exist, check alternate locations: -find .claude/docs -name "*.md" -type f 2>/dev/null -find ~/.claude/docs -name "*.md" -type f 2>/dev/null +**For each matched SKILL:** +``` +Task skill-agent(" +You have the [skill-name] skill at [skill-path]. +1. Read the skill: Read [skill-path]/SKILL.md +2. Check for additional resources: + - Glob [skill-path]/references/*.md + - Glob [skill-path]/assets/* + - Glob [skill-path]/templates/* +3. Read the plan context from .deepen/ +4. Apply the skill's expertise to the plan +5. Write recommendations following the OUTPUT RULES +" + SHARED_CONTEXT + OUTPUT_RULES) ``` -**Step 2: Read frontmatter of each learning to filter** +**For each matched LEARNING:** +``` +Task learning-agent(" +Read this learning file completely: [path] +This documents a previously solved problem. Check if it applies to the plan. +If relevant: write specific recommendations. +If not relevant: write empty recommendations array with summary 'Not applicable: [reason]' +" + SHARED_CONTEXT + OUTPUT_RULES) +``` -Each learning file has YAML frontmatter with metadata. Read the first ~20 lines of each file to get: +**For each matched REVIEW/RESEARCH AGENT:** +``` +Task [agent-name](" +Review this plan using your expertise. Focus on your domain. -```yaml ---- -title: "N+1 Query Fix for Briefs" -category: performance-issues -tags: [activerecord, n-plus-one, includes, eager-loading] -module: Briefs -symptom: "Slow page load, multiple queries in logs" -root_cause: "Missing includes on association" ---- +## PROJECT ARCHITECTURE CONTEXT +Read the project's CLAUDE.md for project-specific architectural principles. Evaluate the plan against THESE principles. +" + SHARED_CONTEXT + OUTPUT_RULES) ``` -**For each .md file, quickly scan its frontmatter:** +**For each technology in the manifest, spawn a docs-researcher:** +``` +Task docs-researcher-[technology](" +Research current (2025-2026) best practices for [technology] [version if available]. + +## Documentation Research Steps: +1. Query Context7 MCP for official framework documentation: + - First: mcp__plugin_compound-engineering_context7__resolve-library-id for '[technology]' + - Then: mcp__plugin_compound-engineering_context7__query-docs with the resolved ID +2. Web search for recent (2025-2026) articles, migration guides, changelog notes +3. Search for version-specific changes if manifest includes a version +4. Find concrete code patterns and configuration recommendations + +Budget: 3-5 searches per technology. 
+" + SHARED_CONTEXT + OUTPUT_RULES) +``` -```bash -# Read first 20 lines of each learning (frontmatter + summary) -head -20 docs/solutions/**/*.md +**SPECIAL: `agent-native-architecture` skill (if matched):** +``` +Task agent-native-architecture-reviewer(" +You are an Agent-Native Architecture Reviewer. + +## Instructions: +1. Read [skill-path]/SKILL.md — focus on <architecture_checklist>, <anti_patterns>, <core_principles> +2. Read these reference files: + - [skill-path]/references/from-primitives-to-domain-tools.md + - [skill-path]/references/mcp-tool-design.md + - [skill-path]/references/refactoring-to-prompt-native.md +3. Read project's CLAUDE.md +4. Read .deepen/ plan context + +## Apply these checks: +- Does a new tool duplicate existing capability? +- Does the tool encode business logic that should live in the agent prompt? +- Are there two ways to accomplish the same outcome? +- Is logic in the right layer? +- Do hardcoded values belong in skills? +- Are features truly needed now, or YAGNI? +- Does the Architecture Review Checklist pass? + +Use type 'architecture' or 'anti-pattern' for findings. +" + SHARED_CONTEXT + OUTPUT_RULES) ``` -**Step 3: Filter - only spawn sub-agents for LIKELY relevant learnings** +**ALWAYS RUN: Project Architecture Challenger** +``` +Task project-architecture-challenger(" +You are a Project Architecture Challenger. Your job is to CHALLENGE the plan's decisions against the project's own architectural principles. + +## Instructions: +1. Read project's CLAUDE.md — extract architectural principles, patterns, conventions +2. Read .deepen/original_plan.md +3. Read .deepen/PLAN_MANIFEST.md + +## For each major decision, ask: +- **Redundancy**: Does this duplicate something existing? +- **Layer placement**: Is business logic in the right place? +- **YAGNI enforcement**: Does the plan acknowledge YAGNI but build it anyway? +- **Hardcoded vs emergent**: Are values hardcoded that could be discovered? +- **Convention drift**: Does any decision contradict CLAUDE.md? +- **Complexity budget**: Does each feature earn its complexity? + +High confidence (0.8+) when CLAUDE.md explicitly contradicts the plan. +Medium confidence (0.6-0.7) for judgment calls. +" + SHARED_CONTEXT + OUTPUT_RULES) +``` -Compare each learning's frontmatter against the plan: -- `tags:` - Do any tags match technologies/patterns in the plan? -- `category:` - Is this category relevant? (e.g., skip deployment-issues if plan is UI-only) -- `module:` - Does the plan touch this module? -- `symptom:` / `root_cause:` - Could this problem occur with the plan? 
+After ALL batches complete, log the overall checkpoint: -**SKIP learnings that are clearly not applicable:** -- Plan is frontend-only → skip `database-migrations/` learnings -- Plan is Python → skip `rails-specific/` learnings -- Plan has no auth → skip `authentication-issues/` learnings +```bash +AGENT_COUNT=$(ls .deepen/*.json 2>/dev/null | grep -v PLAN_MANIFEST | wc -l) +echo "## Phase 3: Research Agents (All Batches) — $([ $AGENT_COUNT -gt 0 ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md +echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md +echo "- Agent JSON files written: $AGENT_COUNT" >> .deepen/PIPELINE_LOG.md +echo "- Files: $(ls .deepen/*.json 2>/dev/null | grep -v PLAN_MANIFEST)" >> .deepen/PIPELINE_LOG.md +echo "" >> .deepen/PIPELINE_LOG.md +``` -**SPAWN sub-agents for learnings that MIGHT apply:** -- Any tag overlap with plan technologies -- Same category as plan domain -- Similar patterns or concerns +<late_notification_handling> +Late agent completion notifications are expected and harmless. Because agents are batched, late notifications should be rare — but if you receive one after moving to Step 5+, ignore it. The agent's JSON file is already on disk. +</late_notification_handling> -**Step 4: Spawn sub-agents for filtered learnings** +### 5. Verify and Validate Agent Outputs (Phase 4) -For each learning that passes the filter: +#### Step 5a: Verify Expected Files Exist +```bash +EXPECTED_AGENTS="<list of agent names you launched>" + +MISSING="" +for agent in $EXPECTED_AGENTS; do + if ! ls .deepen/${agent}*.json 1>/dev/null 2>&1; then + MISSING="$MISSING $agent" + echo "MISSING: $agent" + fi +done + +if [ -n "$MISSING" ]; then + echo "WARNING: Missing agent files:$MISSING" +fi ``` -Task general-purpose: " -LEARNING FILE: [full path to .md file] -1. Read this learning file completely -2. This learning documents a previously solved problem +Re-launch missing agents before proceeding. -Check if this learning applies to this plan: +#### Step 5b: Validate JSON Schema and Flag Hallucination Risk ---- -[full plan content] ---- +<critical_instruction> +Use Node.js for validation — Python3 may not be installed on Windows. +</critical_instruction> -If relevant: -- Explain specifically how it applies -- Quote the key insight or solution -- Suggest where/how to incorporate it - -If NOT relevant after deeper analysis: -- Say 'Not applicable: [reason]' +```bash +node -e " +const fs = require('fs'); +const path = require('path'); +const files = fs.readdirSync('.deepen').filter(f => f.endsWith('.json') && f !== 'PLAN_MANIFEST.json'); +let valid = 0, invalid = 0, noTools = 0, totalTruncated = 0; +for (const file of files) { + const fp = path.join('.deepen', file); + try { + const data = JSON.parse(fs.readFileSync(fp, 'utf8')); + if (Array.isArray(data.recommendations) === false) throw new Error('recommendations not an array'); + if (data.recommendations.length > 8) throw new Error('too many recommendations: ' + data.recommendations.length); + if (typeof data.truncated_count !== 'number') throw new Error('missing required field: truncated_count'); + for (let i = 0; i < data.recommendations.length; i++) { + const rec = data.recommendations[i]; + if (rec.section_id == null) throw new Error('rec ' + i + ': missing section_id'); + if (typeof rec.section_id !== 'number') { + console.log('WARNING: ' + file + ' rec ' + i + ': section_id is ' + JSON.stringify(rec.section_id) + ' (string) — must be numeric. 
Section judges may drop this rec.'); + } + if (rec.type == null || rec.type === '') throw new Error('rec ' + i + ': missing type'); + if (rec.recommendation == null || rec.recommendation === '') throw new Error('rec ' + i + ': missing recommendation'); + } + const tools = data.tools_used || []; + const truncNote = data.truncated_count > 0 ? ' (truncated ' + data.truncated_count + ')' : ''; + if (tools.length === 0) { + console.log('WARNING NO TOOLS: ' + file + truncNote); + noTools++; + } else { + console.log('VALID: ' + file + ' - ' + data.recommendations.length + ' recs, ' + tools.length + ' tools' + truncNote); + } + totalTruncated += data.truncated_count; + valid++; + } catch (e) { + console.log('INVALID: ' + file + ' -- ' + e.message + ' -- removing'); + fs.unlinkSync(fp); + invalid++; + } +} +console.log('Summary: ' + valid + ' valid, ' + invalid + ' invalid, ' + noTools + ' no-tools-used, ' + totalTruncated + ' total truncated recs'); " ``` -**Example filtering:** +Log checkpoint: + +```bash +echo "## Phase 4: Validation — PASS" >> .deepen/PIPELINE_LOG.md +echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md +echo "" >> .deepen/PIPELINE_LOG.md ``` -# Found 15 learning files, plan is about "Rails API caching" -# SPAWN (likely relevant): -docs/solutions/performance-issues/n-plus-one-queries.md # tags: [activerecord] ✓ -docs/solutions/performance-issues/redis-cache-stampede.md # tags: [caching, redis] ✓ -docs/solutions/configuration-fixes/redis-connection-pool.md # tags: [redis] ✓ +### 6. Judge Phase — Per-Section Parallel Judging + Merge (Phase 5) -# SKIP (clearly not applicable): -docs/solutions/deployment-issues/heroku-memory-quota.md # not about caching -docs/solutions/frontend-issues/stimulus-race-condition.md # plan is API, not frontend -docs/solutions/authentication-issues/jwt-expiry.md # plan has no auth -``` +<critical_instruction> +Do NOT read individual agent JSON files into parent context. Launch PARALLEL per-section JUDGE agents that each read them in their own context windows. + +The judge phase has two steps: +1. **Section Judges** (parallel, batched) — One judge per manifest section. Each deduplicates, ranks, and assigns convergence signals for its section only. +2. **Merge Judge** (sequential) — Reads all section judgments, resolves cross-section conflicts, identifies cross-section convergence, produces final consolidated output. -**Spawn sub-agents in PARALLEL for all filtered learnings.** +This replaces the single monolithic judge, cutting judge time from ~21 min to ~8-10 min. +</critical_instruction> -**These learnings are institutional knowledge - applying them prevents repeating past mistakes.** +#### Step 6a: Read section count and plan batching -### 4. Launch Per-Section Research Agents +Read `.deepen/PLAN_MANIFEST.json` to get the section count. Calculate how many judge batches are needed (max 4 per batch). -<thinking> -For each major section in the plan, spawn dedicated sub-agents to research improvements. Use the Explore agent type for open-ended research. -</thinking> +#### Step 6b: Launch Per-Section Judges (batched) -**For each identified section, launch parallel research:** +For each section in the manifest, launch a section judge. Batch in groups of max 4, wait for each batch to complete. ``` -Task Explore: "Research best practices, patterns, and real-world examples for: [section topic]. 
-Find: -- Industry standards and conventions -- Performance considerations -- Common pitfalls and how to avoid them -- Documentation and tutorials -Return concrete, actionable recommendations." +Task judge-section-N(" +You are a Section Judge for section N: '[section_title]'. Consolidate recommendations targeting THIS section only. + +## Instructions: +1. Read .deepen/PLAN_MANIFEST.json for section N's structure +2. Read ALL JSON files in .deepen/*.json (skip PLAN_MANIFEST.json, skip JUDGED_*.json) +3. Collect ONLY recommendations where section_id == N + +4. EVIDENCE CHECK: If tools_used is empty AND source_type is NOT 'skill', downweight confidence by 0.2. + +5. Within this section's recommendations: + a. DEDUPLICATE: Remove semantically similar recs (keep higher-confidence) + b. RESOLVE CONFLICTS: Prefer higher attribution priority source + c. RANK by: source_type priority FIRST, then priority, then confidence + d. SELECT top 8 maximum + +**Source Attribution Priority (highest to lowest):** +- skill — Institutional knowledge +- documented-learning — Previously solved problems +- official-docs — Framework documentation +- community-web — Blog posts, tutorials + +6. Preserve code_example fields + +7. Assign impact level: + - must_change — Plan has gap causing failures if not addressed + - should_change — Significant improvement + - consider — Valuable enhancement worth evaluating + - informational — Context or reference + +8. CONVERGENCE SIGNAL: If 3+ agents independently flagged the same concern, mark with convergence_count. TRUNCATION-AWARE: If an agent has truncated_count > 0, it may have had additional matching recommendations. If 2 agents converge AND both were truncated, treat as 3-agent strength. + +9. DEFENSIVE STACKING CHECK: If multiple recommendations add validation for the same data at different layers, flag as a cross-cutting concern. + +10. Write to .deepen/JUDGED_SECTION_N.json: + +{ + \"section_id\": N, + \"section_title\": \"<from manifest>\", + \"raw_count\": <recs targeting this section>, + \"duplicates_removed\": <count>, + \"conflicts_resolved\": <count>, + \"recommendations\": [ + { + \"id\": 1, + \"type\": \"best-practice|...\", + \"impact\": \"must_change|should_change|consider|informational\", + \"title\": \"<100 chars>\", + \"recommendation\": \"<500 chars>\", + \"code_example\": \"<or null>\", + \"references\": [\"...\"], + \"priority\": \"high|medium|low\", + \"confidence\": 0.0-1.0, + \"source_agents\": [\"agent1\", \"agent2\"], + \"convergence_count\": <number> + } + ], + \"section_concerns\": [\"<any defensive stacking or within-section issues>\"] +} + +11. Return to parent: 'Section N judged. <X> raw -> <Y> after dedup. Written to .deepen/JUDGED_SECTION_N.json' +") ``` -**Also use Context7 MCP for framework documentation:** +Log checkpoint per batch: +```bash +echo "## Phase 5a: Section Judges Batch [B] — PASS" >> .deepen/PIPELINE_LOG.md +echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md +echo "" >> .deepen/PIPELINE_LOG.md +``` + +#### Step 6c: Data Prep Agent (mechanical — model: haiku) + +<critical_instruction> +The merge judge previously failed due to OOM/timeout when reading 20+ files AND doing cross-section reasoning in one context. Split into two agents: a cheap data prep agent handles all I/O, then the merge judge focuses entirely on reasoning from a single pre-compiled input file. 
+</critical_instruction> -For any technologies/frameworks mentioned in the plan, query Context7: ``` -mcp__plugin_compound-engineering_context7__resolve-library-id: Find library ID for [framework] -mcp__plugin_compound-engineering_context7__query-docs: Query documentation for specific patterns +Task judge-data-prep(" +You are a Data Preparation Agent. Your job is purely mechanical — extract and compile data from multiple files into a single structured input for the merge judge. No judgment, no synthesis. + +## Instructions: +1. Read .deepen/PLAN_MANIFEST.json — extract plan_title, section count +2. Read ALL .deepen/JUDGED_SECTION_*.json files — extract each section's full recommendations array, raw_count, duplicates_removed, conflicts_resolved, section_concerns +3. Read ALL agent JSON files in .deepen/*.json (skip PLAN_MANIFEST.json, JUDGED_*.json) — extract ONLY agent_name and summary fields (ignore recommendations — those are already in section judges) + +4. Write to .deepen/MERGE_INPUT.json: + +{ + \"plan_title\": \"<from manifest>\", + \"section_count\": <N>, + \"sections\": [ + { + \"section_id\": <id>, + \"section_title\": \"<title>\", + \"raw_count\": <from section judge>, + \"duplicates_removed\": <from section judge>, + \"conflicts_resolved\": <from section judge>, + \"section_concerns\": [\"<from section judge>\"], + \"recommendations\": [<full array from section judge>] + } + ], + \"agent_summaries\": [ + {\"agent\": \"<name>\", \"summary\": \"<500 chars>\"} + ], + \"totals\": { + \"total_raw\": <sum of all raw_count>, + \"total_duplicates_removed\": <sum>, + \"total_conflicts_resolved\": <sum> + } +} + +5. Return to parent: 'Data prep complete. <N> sections, <M> agent summaries compiled to .deepen/MERGE_INPUT.json' +", model: haiku) ``` -**Use WebSearch for current best practices:** +#### Step 6d: Merge Judge (reasoning — reads one file) -Search for recent (2024-2026) articles, blog posts, and documentation on topics in the plan. +After data prep completes, the merge judge reads a single pre-compiled input and focuses entirely on cross-section analysis. -### 5. Discover and Run ALL Review Agents - -<thinking> -Dynamically discover every available agent and run them ALL against the plan. Don't filter, don't skip, don't assume relevance. 40+ parallel agents is fine. Use everything available. -</thinking> +``` +Task judge-merge(" +You are the Merge Judge. Your job is cross-section reasoning — conflict detection, convergence analysis, and final consolidation. All data has been pre-compiled for you in one file. + +## Instructions: +1. Read .deepen/MERGE_INPUT.json — this contains ALL section judgments and agent summaries in one file. Do NOT read individual agent or section judge files. + +## Cross-Section Analysis (your unique job): +2. CROSS-SECTION CONFLICTS: Check if any recommendation in Section A contradicts one in Section C (e.g., same file referenced with conflicting guidance on where logic should live). Flag conflicts with both section IDs and a resolution recommendation. + +3. CROSS-SECTION CONVERGENCE: Check if different sections independently recommend the same pattern (e.g., Section 1 recommends typed filterContext AND Section 3 recommends deriving from typed context). This strengthens both signals — note the cross-section reinforcement. + +4. RENUMBER recommendation IDs sequentially across all sections (1, 2, 3... not per-section). + +5. 
Write to .deepen/JUDGED_RECOMMENDATIONS.json: + +{ + \"plan_title\": \"<from MERGE_INPUT>\", + \"total_raw_recommendations\": <from MERGE_INPUT totals>, + \"duplicates_removed\": <from MERGE_INPUT totals>, + \"conflicts_resolved\": <MERGE_INPUT totals + any new cross-section conflicts>, + \"low_evidence_downweighted\": <count>, + \"sections\": [ + <each section's recommendations from MERGE_INPUT, with renumbered IDs> + ], + \"cross_cutting_concerns\": [ + { + \"title\": \"<concern spanning multiple sections>\", + \"description\": \"<explanation including cross-section conflict/convergence analysis>\", + \"affected_sections\": [1, 3, 5] + } + ], + \"agent_summaries\": <from MERGE_INPUT> +} + +6. Return to parent: 'Merge complete. <X> total recs across <Y> sections. <Z> cross-section concerns. Written to .deepen/JUDGED_RECOMMENDATIONS.json' +") +``` -**Step 1: Discover ALL available agents from ALL sources** +#### Step 6e: Validate Judge Output ```bash -# 1. Project-local agents (highest priority - project-specific) -find .claude/agents -name "*.md" 2>/dev/null +node -e " +const fs = require('fs'); +try { + const judged = JSON.parse(fs.readFileSync('.deepen/JUDGED_RECOMMENDATIONS.json', 'utf8')); + const manifest = JSON.parse(fs.readFileSync('.deepen/PLAN_MANIFEST.json', 'utf8')); + const manifestIds = new Set(manifest.sections.map(s => s.id)); + + if (Array.isArray(judged.sections) === false) throw new Error('sections not array'); + if (judged.sections.length === 0) throw new Error('sections empty'); + + let totalRecs = 0; + for (const section of judged.sections) { + if (manifestIds.has(section.section_id) === false) { + console.log('WARNING: Section ID ' + section.section_id + ' not in manifest'); + } + totalRecs += section.recommendations.length; + } + console.log('JUDGE VALID: ' + judged.sections.length + ' sections, ' + totalRecs + ' recommendations'); +} catch (e) { + console.log('JUDGE INVALID: ' + e.message); +} +" +``` -# 2. User's global agents (~/.claude/) -find ~/.claude/agents -name "*.md" 2>/dev/null +Log checkpoint: -# 3. compound-engineering plugin agents (all subdirectories) -find ~/.claude/plugins/cache/*/compound-engineering/*/agents -name "*.md" 2>/dev/null +```bash +echo "## Phase 5: Judge (all sections + merge) — $([ -f .deepen/JUDGED_RECOMMENDATIONS.json ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md +echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md +echo "" >> .deepen/PIPELINE_LOG.md +``` -# 4. ALL other installed plugins - check every plugin for agents -find ~/.claude/plugins/cache -path "*/agents/*.md" 2>/dev/null +### 7. Enhance the Plan (Phase 6 — Synthesis) -# 5. Check installed_plugins.json to find all plugin locations -cat ~/.claude/plugins/installed_plugins.json +<critical_instruction> +Do NOT read judged recommendations into parent context. Launch a SYNTHESIS agent. +</critical_instruction> -# 6. For local plugins (isLocal: true), check their source directories -# Parse installed_plugins.json and find local plugin paths +``` +Task plan-enhancer(" +You are a Plan Enhancement Writer. Merge research recommendations into the original plan. + +## Instructions: +1. Read .deepen/original_plan.md — source plan +2. Read .deepen/JUDGED_RECOMMENDATIONS.json — consolidated findings +3. 
Read .deepen/PLAN_MANIFEST.json — section structure + +## Enhancement Rules: + +### Output Structure — Two Audiences, Two Sections + +The enhanced plan MUST have two clearly separated parts: + +**PART 1: Decision Record** (top of file) +This section is for reviewers and future-you. It explains WHAT changed from the original plan and WHY. It contains: +- Enhancement Summary (counts, agents, dates) +- Pre-Implementation Verification checklist +- Key Improvements with agent consensus signals and [Strong Signal] markers +- Research Insights (consolidated from all sections — NOT interleaved in the spec) +- New Considerations Discovered +- Fast Follow items +- Cross-Cutting Concerns +- Deferred items + +**PART 2: Implementation Spec** (rest of file) +This section is for the developer implementing the plan. It is a clean, linear 'do this, then this, then this' document. It contains: +- The original plan structure with enhancements merged seamlessly +- Clean code blocks ready to copy — NO `// ENHANCED: <reason>` annotations, NO `(Rec #X, Y agents)` references +- No Research Insights blocks interrupting the flow +- Clear marking of code snippets: add `<!-- ready-to-copy -->` before code blocks that are final, add `<!-- illustrative -->` before code blocks that are pseudocode or depend on project-specific details + +Separate the two parts with: +``` +--- +# Implementation Spec +--- ``` -**Important:** Check EVERY source. Include agents from: -- Project `.claude/agents/` -- User's `~/.claude/agents/` -- compound-engineering plugin (but SKIP workflow/ agents - only use review/, research/, design/, docs/) -- ALL other installed plugins (agent-sdk-dev, frontend-design, etc.) -- Any local plugins +### Preservation -**For compound-engineering plugin specifically:** -- USE: `agents/review/*` (all reviewers) -- USE: `agents/research/*` (all researchers) -- USE: `agents/design/*` (design agents) -- USE: `agents/docs/*` (documentation agents) -- SKIP: `agents/workflow/*` (these are workflow orchestrators, not reviewers) +**All sections:** Preserve original section structure, ordering, and acceptance criteria. -**Step 2: For each discovered agent, read its description** +**Prose sections:** Preserve original text exactly. If a recommendation changes the guidance, rewrite the prose to incorporate the improvement naturally — do NOT append a separate 'Research Insights' block. The developer should read one coherent document, not an original + annotations. -Read the first few lines of each agent file to understand what it reviews/analyzes. +**Code blocks:** When must_change or should_change recommendations modify a code block, produce the FINAL corrected version. Do not annotate what changed — the Decision Record covers that. The developer should be able to copy the code block directly. -**Step 3: Launch ALL agents in parallel** +### Convergence Signals -For EVERY agent discovered, launch a Task in parallel: +When a recommendation has convergence_count >= 3, prefix it with **[Strong Signal — N agents]**. This means multiple independent agents flagged the same concern. Strong signals should: +- Be given elevated visibility in the enhanced plan +- Trigger a PR scope question: 'If this strong signal represents a standalone fix (e.g., type consolidation, performance fix), recommend it as a separate prerequisite PR rather than bundling into this feature PR.' -``` -Task [agent-name]: "Review this plan using your expertise. Apply all your checks and patterns. 
Plan content: [full plan content]" -``` +### Action Classification -**CRITICAL RULES:** -- Do NOT filter agents by "relevance" - run them ALL -- Do NOT skip agents because they "might not apply" - let them decide -- Launch ALL agents in a SINGLE message with multiple Task tool calls -- 20, 30, 40 parallel agents is fine - use everything -- Each agent may catch something others miss -- The goal is MAXIMUM coverage, not efficiency +Classify every recommendation into one of FOUR buckets: -**Step 4: Also discover and run research agents** +**implement** — Code changes for this PR. Go into code blocks or Research Insights. +**verify** — Checks before implementing. Go into Pre-Implementation Verification section. +**fast_follow** — Out of scope for this PR but with real user-facing impact. These are NOT generic deferrals — they are specific, actionable items that should be ticketed before merge. Examples: type consolidation that multiple agents flagged, performance fixes unrelated to the feature, cleanup work that reduces technical debt. Go into Fast Follow section. +**defer** — Lower-priority items or nice-to-haves. Go into Deferred section. -Research agents (like `best-practices-researcher`, `framework-docs-researcher`, `git-history-analyzer`, `repo-research-analyst`) should also be run for relevant plan sections. +The difference between fast_follow and defer: fast_follow items have real UX or reliability impact and MUST be ticketed. Deferred items are genuine nice-to-haves. -### 6. Wait for ALL Agents and Synthesize Everything +### Sequencing -<thinking> -Wait for ALL parallel agents to complete - skills, research agents, review agents, everything. Then synthesize all findings into a comprehensive enhancement. -</thinking> +State dependency relationships explicitly: +- 'Fix X must be implemented before Fix Y because...' +- 'Fix X and Fix Y are independent' -**Collect outputs from ALL sources:** +### Resolve Conditionals — Do Not Leave Forks for the Developer -1. **Skill-based sub-agents** - Each skill's full output (code examples, patterns, recommendations) -2. **Learnings/Solutions sub-agents** - Relevant documented learnings from /workflows:compound -3. **Research agents** - Best practices, documentation, real-world examples -4. **Review agents** - All feedback from every reviewer (architecture, security, performance, simplicity, etc.) -5. **Context7 queries** - Framework documentation and patterns -6. **Web searches** - Current best practices and articles +If the plan provides alternative implementations contingent on codebase state (e.g., "if computeScopedFilterCounts is in-memory, use approach A; if DB-based, use approach B"), READ the actual codebase to determine which applies. Include ONLY the applicable approach in the Implementation Spec. Note the discarded alternative briefly in the Decision Record. -**For each agent's findings, extract:** -- [ ] Concrete recommendations (actionable items) -- [ ] Code patterns and examples (copy-paste ready) -- [ ] Anti-patterns to avoid (warnings) -- [ ] Performance considerations (metrics, benchmarks) -- [ ] Security considerations (vulnerabilities, mitigations) -- [ ] Edge cases discovered (handling strategies) -- [ ] Documentation links (references) -- [ ] Skill-specific patterns (from matched skills) -- [ ] Relevant learnings (past solutions that apply - prevent repeating mistakes) +Do NOT leave "if X, do A; if Y, do B" in the Implementation Spec. 
The developer should never have to stop implementing to investigate which branch applies — that's the enhancer's job. If the codebase state genuinely cannot be determined (e.g., the file doesn't exist yet), state the assumption explicitly and pick one path. -**Deduplicate and prioritize:** -- Merge similar recommendations from multiple agents -- Prioritize by impact (high-value improvements first) -- Flag conflicting advice for human review -- Group by plan section +### Version Verification -### 7. Enhance Plan Sections +BEFORE suggesting any code change, check PLAN_MANIFEST.json's `frameworks_with_versions` for the resolved version. Do NOT suggest APIs that don't exist in the installed version: +- If the manifest says React 19, verify the API exists in React 19 (not just React 18 or 20) +- If the manifest says ES2022 target (check tsconfig.json if available), do NOT use ES2023+ APIs like Array.findLast +- If the manifest has `version_mismatches`, use the ACTUAL resolved version, not what the plan text stated +- When suggesting library APIs, verify they exist in the specific major version -<thinking> -Merge research findings back into the plan, adding depth without changing the original structure. -</thinking> +This single check prevents the most common category of enhancer-introduced bugs. -**Enhancement format for each section:** +### Accessibility Verification -```markdown -## [Original Section Title] +When suggesting CSS animations or transitions: +- Verify `prefers-reduced-motion` fallbacks do NOT leave permanent visual artifacts (stuck opacity, stuck transforms, permanent overlays). Reduced-motion alternatives must be time-bounded or produce no visual change. +- Verify `aria-live` regions are pre-mounted in the DOM, not conditionally rendered — screen readers silently drop announcements from newly mounted live regions. -[Original content preserved] +### Self-Consistency Check -### Research Insights +BEFORE writing the final output, review your own enhancement for internal contradictions: +- If you say content should go in 'primacy position', verify it actually IS placed early in the file, not at the bottom +- If you describe something as 'ephemeral', verify no other section assumes it persists +- If you recommend a validation layer, check you haven't already recommended the same validation at another boundary +- If two sections give conflicting guidance on where logic should live, resolve the conflict explicitly -**Best Practices:** -- [Concrete recommendation 1] -- [Concrete recommendation 2] +Flag any contradictions you catch as a note: '**Self-check:** [what was caught and resolved]' -**Performance Considerations:** -- [Optimization opportunity] -- [Benchmark or metric to target] +### Decision Record (PART 1) -**Implementation Details:** -```[language] -// Concrete code example from research -``` +Add this block at the TOP of the plan. This is the reviewer-facing section. -**Edge Cases:** -- [Edge case 1 and how to handle] -- [Edge case 2 and how to handle] +# Decision Record -**References:** -- [Documentation URL 1] -- [Documentation URL 2] -``` +**Deepened on:** [date] +**Sections enhanced:** [count] of [total] +**Research agents used:** [count] +**Total recommendations applied:** [count] ([N] implement, [M] fast_follow, [P] defer) -### 8. Add Enhancement Summary +## Pre-Implementation Verification +Run these checks BEFORE writing any code: +1. 
[ ] [Verification task — e.g., confirm library version, check existing types] -At the top of the plan, add a summary section: +**IMPORTANT:** This is the ONLY location for the verification checklist. Do NOT repeat or duplicate this list in the Implementation Spec. The Implementation Spec should open with: "Run the Pre-Implementation Verification in the Decision Record above before starting." -```markdown -## Enhancement Summary +## Implementation Sequence +1. [Fix] — implement first because [reason] -**Deepened on:** [Date] -**Sections enhanced:** [Count] -**Research agents used:** [List] +## Key Improvements +1. [Most impactful] [Strong Signal — N agents] if applicable +2. [Second most impactful] +3. [Third most impactful] -### Key Improvements -1. [Major improvement 1] -2. [Major improvement 2] -3. [Major improvement 3] +## Research Insights +Consolidated findings from all research agents. Organized by theme, not by plan section. -### New Considerations Discovered -- [Important finding 1] -- [Important finding 2] -``` +### [Theme 1 — e.g., State Management] +- [Insight with source attribution] +- [Insight with source attribution] -### 9. Update Plan File +### [Theme 2 — e.g., Accessibility] +- [Insight with source attribution] -**Write the enhanced plan:** -- Preserve original filename -- Add `-deepened` suffix if user prefers a new file -- Update any timestamps or metadata +## New Considerations Discovered +- [Finding not in original plan] -## Output Format +## Fast Follow (ticket before merge) +Items out of this PR's scope but with real user-facing impact: +- [ ] [Item] — why it matters, suggested ticket scope -Update the plan file in place (or if user requests a separate file, append `-deepened` after `-plan`, e.g., `2026-01-15-feat-auth-plan-deepened.md`). +## Cross-Cutting Concerns +- [Concern spanning multiple sections] -## Quality Checks +## Deferred to Future Work +- [Item] — why deferred (low impact, speculative, or blocked) -Before finalizing: -- [ ] All original content preserved -- [ ] Research insights clearly marked and attributed -- [ ] Code examples are syntactically correct -- [ ] Links are valid and relevant -- [ ] No contradictions between sections -- [ ] Enhancement summary accurately reflects changes +--- +# Implementation Spec +--- -## Post-Enhancement Options +[The clean, implementation-ready plan follows here] -After writing the enhanced plan, use the **AskUserQuestion tool** to present these options: +### Content Rules +- The Decision Record is for reviewers. The Implementation Spec is for developers. Do not mix audiences. +- In the Implementation Spec: NO `// ENHANCED:` comments, NO `(Rec #X, Y agents)` references, NO `### Research Insights` blocks. Just clean, implementable guidance. +- In the Decision Record: agent consensus signals, strong signal markers, and research attribution ARE appropriate. +- Mark code blocks: `<!-- ready-to-copy -->` for final code, `<!-- illustrative -->` for pseudocode that depends on project-specific details. +- Every must_change recommendation MUST appear in the Implementation Spec (merged naturally into the plan content). +- Strong signal items (3+ agents) get **[Strong Signal]** prefix in the Decision Record and PR scope assessment. +- When deferring an item that has UX consequences, add a bridge mitigation: a lightweight prompt-level or code-level workaround that partially addresses the gap until the full fix ships. -**Question:** "Plan deepened at `[plan_path]`. What would you like to do next?" +4. 
Write to .deepen/ENHANCED_PLAN.md +5. Return to parent: 'Enhancement complete. Enhanced <N> of <M> sections with <X> recommendations (<Y> implement, <Z> fast_follow). Written to .deepen/ENHANCED_PLAN.md' +") +``` -**Options:** -1. **View diff** - Show what was added/changed -2. **Run `/technical_review`** - Get feedback from reviewers on enhanced plan -3. **Start `/workflows:work`** - Begin implementing this enhanced plan -4. **Deepen further** - Run another round of research on specific sections -5. **Revert** - Restore original plan (if backup exists) +Log checkpoint: -Based on selection: -- **View diff** → Run `git diff [plan_path]` or show before/after -- **`/technical_review`** → Call the /technical_review command with the plan file path -- **`/workflows:work`** → Call the /workflows:work command with the plan file path -- **Deepen further** → Ask which sections need more research, then re-run those agents -- **Revert** → Restore from git or backup +```bash +echo "## Phase 6: Enhancement — $([ -f .deepen/ENHANCED_PLAN.md ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md +echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md +echo "" >> .deepen/PIPELINE_LOG.md +``` -## Example Enhancement +### 7b. Quality Review (Phase 6b — CoVe Pattern) -**Before (from /workflows:plan):** -```markdown -## Technical Approach +<critical_instruction> +This is a POST-ENHANCEMENT verification agent. It reads ONLY the enhanced plan — NOT the intermediate recommendations. This context isolation prevents the reviewer from inheriting the enhancer's perspective. +</critical_instruction> -Use React Query for data fetching with optimistic updates. ``` - -**After (from /workflows:deepen-plan):** -```markdown -## Technical Approach - -Use React Query for data fetching with optimistic updates. - -### Research Insights - -**Best Practices:** -- Configure `staleTime` and `cacheTime` based on data freshness requirements -- Use `queryKey` factories for consistent cache invalidation -- Implement error boundaries around query-dependent components - -**Performance Considerations:** -- Enable `refetchOnWindowFocus: false` for stable data to reduce unnecessary requests -- Use `select` option to transform and memoize data at query level -- Consider `placeholderData` for instant perceived loading - -**Implementation Details:** -```typescript -// Recommended query configuration -const queryClient = new QueryClient({ - defaultOptions: { - queries: { - staleTime: 5 * 60 * 1000, // 5 minutes - retry: 2, - refetchOnWindowFocus: false, - }, +Task quality-reviewer(" +You are a Plan Quality Reviewer using the Chain-of-Verification (CoVe) pattern. Your job is to find problems in the ENHANCED plan that the enhancement process may have introduced. + +## Instructions: +1. Read .deepen/ENHANCED_PLAN.md — the enhanced plan to review +2. Read .deepen/original_plan.md — the original for comparison +3. Read .deepen/PLAN_MANIFEST.json — section structure + +## Step 1: Extract Claims +List every concrete claim or instruction the enhanced plan makes. Focus on: +- Where it says content should be placed (file, section, position) +- What it describes as ephemeral vs persistent +- What validation/checking layers it adds +- What it says is in/out of scope +- Sequencing dependencies between items + +## Step 2: Verification Questions +For each claim, form a verification question: +- 'The plan says X should go in primacy position — is it actually placed at the top of the file?' 
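+- 'The plan says Fix X must be implemented before Fix Y. Does the Implementation Sequence actually order them that way?'
+- 'The plan says an item is out of scope. Is it captured under Fast Follow or Deferred rather than silently dropped?'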
+- 'The plan says suggestions are ephemeral — does any other section assume they persist?' +- 'The plan adds validation at layer A — does it also add the same validation at layer B and C?' + +## Step 3: Code Block Completeness Check + +For every constant, type, function, or import referenced in `<!-- ready-to-copy -->` code blocks: +- Verify it is EITHER: (a) defined elsewhere in the plan, (b) listed in Pre-Implementation Verification as something to check/confirm, OR (c) a standard library/framework API +- Flag any undefined references as 'undefined_references' in the output. Example: a code block uses `FILTER_KEY_TO_PRODUCT_FIELD[key]` but this constant is never defined in the plan and not in the verification checklist. + +## Step 4: Integration Test Coverage Check + +If the plan describes N interconnected layers or components of a feature (e.g., "three layers: delta counts + conversational repair + visual brushing"), verify there is at least ONE test that exercises all N layers end-to-end for the same user action. Flag missing cross-layer integration tests. + +## Step 5: Check and Report + +Write to .deepen/QUALITY_REVIEW.json: + +{ + \"self_contradictions\": [ + { + \"claim_a\": \"<what the plan says in one place>\", + \"claim_b\": \"<what the plan says elsewhere that contradicts>\", + \"severity\": \"high|medium|low\", + \"suggested_resolution\": \"<which claim should win and why>\" + } + ], + \"pr_scope_assessment\": { + \"recommended_split\": true|false, + \"reason\": \"<why split or not>\", + \"suggested_prs\": [ + { + \"title\": \"<PR title>\", + \"scope\": \"<what it contains>\", + \"rationale\": \"<why separate>\" + } + ] }, -}); + \"defensive_stacking\": [ + { + \"what\": \"<data being validated>\", + \"layers\": [\"schema\", \"backend\", \"frontend\"], + \"recommendation\": \"<which layers to keep and which are redundant>\" + } + ], + \"deferred_without_mitigation\": [ + { + \"item\": \"<what was deferred>\", + \"ux_consequence\": \"<what users will experience>\", + \"bridge_mitigation\": \"<lightweight workaround to add now>\" + } + ], + \"undefined_references\": [ + { + \"code_block_location\": \"<which section/commit the code block is in>\", + \"reference\": \"<the constant/type/function used but not defined>\", + \"suggestion\": \"<define it, add to verification checklist, or confirm it exists in codebase>\" + } + ], + \"missing_integration_tests\": [ + { + \"layers\": [\"<layer 1>\", \"<layer 2>\", \"<layer 3>\"], + \"missing_test\": \"<description of the end-to-end test that should exist>\", + \"user_action\": \"<the user action that should trigger all layers>\" + } + ], + \"overall_quality\": \"good|needs_revision|major_issues\", + \"summary\": \"<200 chars — overall assessment>\" +} + +4. Return to parent: 'Quality review complete. [overall_quality]. [count] self-contradictions, PR split: [yes/no], [count] defensive stacking issues. Written to .deepen/QUALITY_REVIEW.json' +") +``` + +Log checkpoint: + +```bash +echo "## Phase 6b: Quality Review — $([ -f .deepen/QUALITY_REVIEW.json ] && echo 'PASS' || echo 'FAIL')" >> .deepen/PIPELINE_LOG.md +echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md +echo "" >> .deepen/PIPELINE_LOG.md +``` + +### 8. 
Verify Enhanced Plan Integrity (Phase 7) + +```bash +node -e " +const fs = require('fs'); +const norm = s => s.replace(/\u2014/g, '--').replace(/\u2013/g, '-'); +const manifest = JSON.parse(fs.readFileSync('.deepen/PLAN_MANIFEST.json', 'utf8')); +const enhanced = norm(fs.readFileSync('.deepen/ENHANCED_PLAN.md', 'utf8')); +const enhancedLower = enhanced.toLowerCase(); + +let found = 0, missing = []; +for (const section of manifest.sections) { + const title = norm(section.title); + if (enhanced.includes(title)) { + found++; + } else if (enhancedLower.includes(title.toLowerCase())) { + found++; + console.log('FUZZY MATCH: ' + JSON.stringify(section.title) + ' (case mismatch but present)'); + } else { + missing.push(section.title); + } +} + +if (missing.length > 0) { + console.log('PRESERVATION FAILURE -- missing ' + missing.length + ' of ' + manifest.sections.length + ' sections:'); + missing.forEach(t => console.log(' - ' + t)); +} else { + console.log('ALL ' + manifest.sections.length + ' sections preserved (' + found + ' found).'); +} +" +``` + +Log checkpoint (single entry — do NOT run preservation check twice): + +```bash +PRES_RESULT=$(node -e " +const fs = require('fs'); +const norm = s => s.replace(/\u2014/g, '--').replace(/\u2013/g, '-'); +const m = JSON.parse(fs.readFileSync('.deepen/PLAN_MANIFEST.json', 'utf8')); +const e = norm(fs.readFileSync('.deepen/ENHANCED_PLAN.md', 'utf8')).toLowerCase(); +const missing = m.sections.filter(s => e.includes(norm(s.title).toLowerCase()) === false); +console.log(missing.length === 0 ? 'PASS' : 'PARTIAL'); +") +echo "## Phase 7: Preservation Check — $PRES_RESULT" >> .deepen/PIPELINE_LOG.md +echo "- Completed: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md +echo "" >> .deepen/PIPELINE_LOG.md +echo "## PIPELINE COMPLETE" >> .deepen/PIPELINE_LOG.md +echo "- End: $(date -u +%H:%M:%S)" >> .deepen/PIPELINE_LOG.md ``` -**Edge Cases:** -- Handle race conditions with `cancelQueries` on component unmount -- Implement retry logic for transient network failures -- Consider offline support with `persistQueryClient` +### 9. Present Enhanced Plan -**References:** -- https://tanstack.com/query/latest/docs/react/guides/optimistic-updates -- https://tkdodo.eu/blog/practical-react-query +#### Step 9a: Copy to Final Location + +```bash +cp .deepen/ENHANCED_PLAN.md <original_plan_path> +``` + +#### Step 9b: Read Enhancement Summary and Quality Review + +Read ONLY the Enhancement Summary block from the top of the enhanced plan (first ~30 lines). Do NOT read the entire plan into parent context. + +Also read `.deepen/QUALITY_REVIEW.json` for the quality assessment. Present the quality findings alongside the enhancement summary. + +#### Step 9c: Present Summary + +```markdown +## Plan Deepened + +**Plan:** [plan title] +**File:** [path to enhanced plan] + +### Enhancement Summary: +- **Sections Enhanced:** [N] of [M] +- **Research Agents Used:** [count] +- **Total Recommendations Applied:** [count] +- **Duplicates Removed:** [count] + +### Key Improvements: +1. [Most impactful] +2. [Second most impactful] +3. 
[Third most impactful]

### New Considerations Discovered:
- [Finding 1]
- [Finding 2]

### Quality Review:
- **Overall:** [good/needs_revision/major_issues]
- **Self-contradictions found:** [count] — [brief description if any]
- **PR scope:** [single PR / recommend split into N PRs]
  - [If split recommended: list suggested PRs]
- **Defensive stacking:** [count] issues — [brief description if any]
- **Deferred items needing bridge mitigation:** [count]
```

NEVER CODE! Just research and enhance the plan.

#### Step 9d: Present Pipeline Log

Read and display the contents of `.deepen/PIPELINE_LOG.md` to the user so they can share diagnostics if anything looks off.

#### Step 9e: Offer Next Steps

**"Plan deepened. What would you like to do next?"**

1. **View diff** — `git diff <plan_path>`
2. **Run `/plan_review`** — Get review agents' feedback
3. **Start `/workflows:work`** — Begin implementing
4. **Deepen further** — Run another round on specific sections
5. **Revert** — `git checkout <plan_path>`
6. **Compound insights** — Run `/workflows:compound` to extract novel patterns

## Appendix: Token Budget Reference

| Component | Token Budget | Notes |
|-----------|-------------|-------|
| Plan manifest return | ~100 | One sentence + version mismatch count |
| Discovery (listings) | ~1,000-2,000 | File lists, frontmatter |
| Matched resources list | ~500 | Names and paths |
| Per-agent summary (10-20) | ~100-150 each | One sentence + counts |
| Validation script | ~0 | Bash (now reports truncated_count totals) |
| Per-section judge returns (N) | ~100 each | One sentence per section |
| Data prep agent return | ~100 | One sentence (compiles MERGE_INPUT.json) |
| Merge judge return | ~100 | One sentence + cross-section count |
| Enhancement return | ~100 | One sentence |
| Quality review return | ~100 | One sentence |
| Quality review JSON (parent reads) | ~500 | PR scope + contradictions |
| Enhancement summary | ~500 | Top of plan |
| Parent overhead | ~5,000 | Instructions, synthesis |
| **Total parent from agents** | **~8,500-13,000** | **Slightly more returns but judge ~60% faster** |
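As a sanity check on these budgets, a return message can be spot-checked against the ~200-char cap before the parent accepts it. A minimal sketch, illustrative only and not part of the pipeline (the sample message below is hypothetical, following the enhancer's return format):

```bash
node -e "
// Illustrative only: verify an agent return message respects the ~200-char cap.
// Anything longer should be truncated, with full detail kept in .deepen/ JSON.
const msg = process.argv[1] || '';
if (msg.length > 200) {
  console.log('FAIL: return is ' + msg.length + ' chars (cap 200). Move detail to .deepen/.');
  process.exit(1);
}
console.log('OK: ' + msg.length + ' chars');
" "Enhancement complete. Enhanced 12 of 12 sections with 34 recommendations (21 implement, 5 fast_follow). Written to .deepen/ENHANCED_PLAN.md"
```

Either way, the full analysis stays on disk in `.deepen/`; only the short signal ever enters the parent context.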