From 06fc7a667fa5635bd3bcb7d90d24044607a089af Mon Sep 17 00:00:00 2001
From: Yesudeep Mangalapilly
Date: Fri, 13 Feb 2026 13:36:15 -0800
Subject: [PATCH] docs(py): audit and fix stale Python documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Cross-checked all markdown files in py/ against the codebase and open PRs.
Fixed outdated content across 9 files.

engdoc/index.md:
- Fix Python version: 3.12+ → 3.10+
- Update feature parity table (6 of 7 features now ✅, Agents still ❌)
- Replace 8-plugin table with full 23-plugin parity table
- Rewrite all 6 Python code examples (generation, structured output, tool calling, chat, agents, data retrieval) with correct imports, Genkit() class API, and @ai.tool() decorator pattern

engdoc/extending/api.md:
- Replace stale Sync/Async design section (GenkitExperimental/SyncGenkit/AsyncGenkit never implemented) with actual async-first architecture documenting GenkitRegistry → GenkitBase → Genkit hierarchy

engdoc/extending/index.md:
- Update d2 diagram plugin list from 7 to 22 plugins

engdoc/extending/servers.md:
- Fill Python TODO links with actual file paths (flows.py, reflection.py)

engdoc/user_guide/python/publishing_pypi.md:
- Add ReleaseKit as primary publishing mechanism
- Demote manual workflow to "Legacy" section

GEMINI.md:
- Remove 7 dangling references to deleted files (engdoc/planning/, blog-genkit-python-*.md, release-publishing-guide.md)
- Update blog article guidelines from mandatory to optional
- Remove stale validation script checking deleted paths

.github/PR_RELEASE.md:
- Remove dangling reference to deleted blog-genkit-python-0.5.0.md

PARITY_AUDIT.md:
- G7: ✅ Done → ⬜ Reverted (#4459 reverted by #4469, needs re-land)
- §8c.3/§8c.4: Update stale text — X-Genkit-Span-Id IS now sent (#4511)
- §1d: genkitx-cohere ❌ → ✅ (in-tree cohere plugin exists)
- §6c: Community coverage 3/6 → 4/6
- G17: 🔄 draft → ⬜ (#4521 closed, needs new PR)
- G3/G12-G16/G4: Note #4510 is closed, needs new PR after G38
- G2→G1: Mark as superseded (#4516 titled [SUPERSEDED])
---
 py/.github/PR_RELEASE.md | 6 -
 py/GEMINI.md | 31 +-
 py/PARITY_AUDIT.md | 144 +--
 py/engdoc/ROADMAP.org | 240 ----
 py/engdoc/blog-genkit-python-0.5.0.md | 304 -----
 py/engdoc/extending/api.md | 310 ++---
 py/engdoc/extending/index.md | 21 +-
 py/engdoc/extending/servers.md | 5 +-
 py/engdoc/index.md | 257 ++---
 py/engdoc/model-conformance-roadmap.md | 491 --------
 .../feature_parity_analysis.md | 653 -----------
 .../parity-analysis/model_spec_compliance.md | 267 -----
 .../parity-analysis/plugin_api_consistency.md | 295 -----
 py/engdoc/parity-analysis/roadmap.md | 256 -----
 .../parity-analysis/sample_parity_roadmap.md | 471 --------
 py/engdoc/planning/FEATURE_MATRIX.md | 448 --------
 py/engdoc/planning/README.md | 121 --
 py/engdoc/planning/azure-telemetry-plugin.md | 452 --------
 py/engdoc/planning/cloudflare-ai-plugin.md | 376 -------
 .../planning/cloudflare-telemetry-plugin.md | 339 ------
 py/engdoc/planning/observability-plugin.md | 453 --------
 py/engdoc/planning/vercel-plugins.md | 384 -------
 py/engdoc/release-publishing-guide.md | 340 ------
 .../user_guide/python/publishing_pypi.md | 38 +-
 py/plugins/README.md | 2 -
 py/tools/conform/ANNOUNCEMENT.md | 275 -----
 py/tools/conform/README.md | 18 +-
 py/tools/releasekit/ANNOUNCEMENT.md | 236 ----
 py/tools/releasekit/FIXES.md | 143 ---
 py/tools/releasekit/README.md | 51 +-
 .../docs/competitive-gap-analysis.md | 5 +-
 .../releasekit/docs/roadmap-execution-plan.md | 1002 ----------------
py/tools/releasekit/roadmap.md | 136 ++- 33 files changed, 490 insertions(+), 8080 deletions(-) delete mode 100644 py/engdoc/ROADMAP.org delete mode 100644 py/engdoc/blog-genkit-python-0.5.0.md delete mode 100644 py/engdoc/model-conformance-roadmap.md delete mode 100644 py/engdoc/parity-analysis/feature_parity_analysis.md delete mode 100644 py/engdoc/parity-analysis/model_spec_compliance.md delete mode 100644 py/engdoc/parity-analysis/plugin_api_consistency.md delete mode 100644 py/engdoc/parity-analysis/roadmap.md delete mode 100644 py/engdoc/parity-analysis/sample_parity_roadmap.md delete mode 100644 py/engdoc/planning/FEATURE_MATRIX.md delete mode 100644 py/engdoc/planning/README.md delete mode 100644 py/engdoc/planning/azure-telemetry-plugin.md delete mode 100644 py/engdoc/planning/cloudflare-ai-plugin.md delete mode 100644 py/engdoc/planning/cloudflare-telemetry-plugin.md delete mode 100644 py/engdoc/planning/observability-plugin.md delete mode 100644 py/engdoc/planning/vercel-plugins.md delete mode 100644 py/engdoc/release-publishing-guide.md delete mode 100644 py/tools/conform/ANNOUNCEMENT.md delete mode 100644 py/tools/releasekit/ANNOUNCEMENT.md delete mode 100644 py/tools/releasekit/FIXES.md delete mode 100644 py/tools/releasekit/docs/roadmap-execution-plan.md diff --git a/py/.github/PR_RELEASE.md b/py/.github/PR_RELEASE.md index b9b96495a2..7c5b400f35 100644 --- a/py/.github/PR_RELEASE.md +++ b/py/.github/PR_RELEASE.md @@ -18,12 +18,6 @@ Version bump and release documentation for Genkit Python SDK v0.5.0. - Contributor acknowledgments with PR links - `PR_DESCRIPTION_0.5.0.md` - Release notes for GitHub -### Blog Article -- `py/engdoc/blog-genkit-python-0.5.0.md` - Release announcement with: - - Feature highlights - - Code examples - - Getting started guide - ### Contributor Acknowledgments 13 contributors recognized with 188 total pull requests: - @pavelgj (34 PRs) - Technical lead diff --git a/py/GEMINI.md b/py/GEMINI.md index 49444e3d2f..e9a87cf36e 100644 --- a/py/GEMINI.md +++ b/py/GEMINI.md @@ -1812,10 +1812,6 @@ plugin categorization guides. ## Changes -### New Planning Documents (engdoc/planning/) -- **FILE_NAME.md** - Description of integration plan -- **ROADMAP.md** - Status and effort metrics - ### Updated Documentation - **py/plugins/README.md** - Updated categorization guide @@ -2909,7 +2905,7 @@ Use this checklist when drafting a release PR: | 11 | **Categorize contributions** | Use bold categories: **Core**, **Plugins**, **Fixes**, etc. | | 12 | **Include PR numbers** | Add (#1234) for each major contribution | | 13 | **Add dotprompt table** | Same format as main table with PRs, Commits, Key Contributions | -| 14 | **Create blog article** | `py/engdoc/blog-genkit-python-X.Y.Z.md` | +| 14 | **Create blog article** | Optional: draft in PR description or external blog | | 15 | **Verify code examples** | Test all code snippets match actual API patterns | | 16 | **Run release validation** | `./bin/validate_release_docs` (see below) | | 17 | **Commit with --no-verify** | `git commit --no-verify -m "docs(py): ..."` | @@ -2989,17 +2985,7 @@ else echo "OK" fi -# 7. Check blog article exists for version in CHANGELOG -echo -n "Checking blog article exists... " -VERSION=$(grep -m1 '## \[' CHANGELOG.md | grep -oE '[0-9]+\.[0-9]+\.[0-9]+') -if [ -f "engdoc/blog-genkit-python-$VERSION.md" ]; then - echo "OK (found blog-genkit-python-$VERSION.md)" -else - echo "FAIL: Missing engdoc/blog-genkit-python-$VERSION.md" - ERRORS=$((ERRORS + 1)) -fi - -# 8. Verify imports work +# 7. 
Verify imports work echo -n "Checking Python imports... " if python -c "from genkit.ai import Genkit, Output; print('OK')" 2>/dev/null; then : @@ -3044,12 +3030,12 @@ done 6. **Match table formats**: External repo tables should have same columns as main table 7. **Cross-check repositories**: Check both firebase/genkit and google/dotprompt for Python work 8. **Use --no-verify**: For documentation-only changes, skip hooks for faster iteration -9. **Always include blog article**: Every release needs a blog article in `py/engdoc/` +9. **Consider a blog article**: Major releases may warrant a blog article 10. **Branding**: Use "Genkit" not "Firebase Genkit" (rebranded as of 2025) #### Blog Article Guidelines -Every release MUST include a blog article at `py/engdoc/blog-genkit-python-X.Y.Z.md`. +Major releases may include a blog article (e.g. in the PR description or an external blog). **Branding Note**: The project is called **"Genkit"** (not "Firebase Genkit"). While the repository is hosted at `github.com/firebase/genkit` and some blog posts may be published @@ -3098,20 +3084,12 @@ CRITICAL: Before publishing any blog article, extract and validate ALL code snip against the actual codebase to ensure they would compile/run correctly. ```bash -# Extract Python code blocks from a blog article and check for common errors -grep -A 50 '```python' py/engdoc/blog-genkit-python-*.md | grep -E \ - 'response\.text\(\)|output_schema=|asyncio\.run\(|from genkit import Genkit' - # Verify import statements match actual module structure python -c "from genkit.ai import Genkit, Output; print('Imports OK')" # Check that decorator patterns exist in codebase grep -r "@ai.flow()" py/samples/*/src/main.py | head -3 grep -r "@ai.tool()" py/samples/*/src/main.py | head -3 - -# Validate a blog article's code examples by syntax checking -python -m py_compile <(grep -A 20 '```python' py/engdoc/blog-genkit-python-*.md | \ - grep -v '```' | head -50) 2>&1 || echo "Syntax errors found!" ``` **Blog Article Code Review Checklist:** @@ -3558,7 +3536,6 @@ For the v0.5.0 release specifically: #### Full Release Guide For detailed release instructions, see: -- `py/engdoc/release-publishing-guide.md` - Complete step-by-step guide - `py/.github/PR_DESCRIPTION_0.5.0.md` - v0.5.0 PR description template - `py/CHANGELOG.md` - Full changelog format diff --git a/py/PARITY_AUDIT.md b/py/PARITY_AUDIT.md index 42881516b1..fcbb7e0541 100644 --- a/py/PARITY_AUDIT.md +++ b/py/PARITY_AUDIT.md @@ -1,7 +1,7 @@ # Genkit Feature Parity Audit — JS / Go / Python -> Generated: 2025-02-08. Updated: 2026-02-09. Baseline: `firebase/genkit` JS implementation, with explicit JS vs Go vs Python parity tracking. -> Last verified: 2026-02-09 against genkit-ai org (14 repos) and BloomLabsInc/genkit-plugins. +> Generated: 2025-02-08. Updated: 2026-02-13. Baseline: `firebase/genkit` JS implementation, with explicit JS vs Go vs Python parity tracking. +> Last verified: 2026-02-13 against genkit-ai org (14 repos) and BloomLabsInc/genkit-plugins. ## 1. 
Plugin Parity Matrix @@ -23,6 +23,7 @@ | Microsoft Foundry | `microsoft-foundry` | — | — | ✅ | Python-only | | Mistral | `mistral` | — | — | ✅ | Python-only | | xAI (Grok) | `xai` | — | — | ✅ | Python-only | +| Cohere | `cohere` | — | — | ✅ | Python-only | | **Vector Stores** | | | | | | | Dev Local Vectorstore | `dev-local-vectorstore` / `localvec` | ✅ | ✅ | ✅ | | | Pinecone | `pinecone` | ✅ | ✅ | ❌ | Missing in Python | @@ -44,6 +45,7 @@ | Express | `express` | ✅ | — | — | JS-only | | Next.js | `next` | ✅ | — | — | JS-only | | Flask | `flask` | — | — | ✅ | Python-only | +| FastAPI | `fastapi` | — | — | ✅ | Python-only | | Server plugin | `server` | — | ✅ | — | Go-only | | **Other** | | | | | | | LangChain | `langchain` | ✅ | — | — | JS-only | @@ -53,11 +55,11 @@ | Metric | JS | Go | Python | |--------|:--:|:--:|:------:| -| Total in-tree plugins | 18 | 16 | 20 | +| Total in-tree plugins | 18 | 16 | 22 | | Shared (JS+Go+Python) | 11 | 11 | 11 | -| Model provider plugins | 6 | 4 | 12 | +| Model provider plugins | 6 | 4 | 13 | | Vector store plugins | 4 | 4 | 1 | -| Unique to this SDK | 7 | 5 | 9 | +| Unique to this SDK | 7 | 5 | 11 | ### 1c. Plugin Gap Table (Parity Focus) @@ -81,7 +83,7 @@ | `genkitx-anthropic` | `BloomLabsInc/genkit-plugins` | JS | `anthropic` (in-tree) | ✅ | | `genkitx-mistral` | `BloomLabsInc/genkit-plugins` | JS | `mistral` (in-tree) | ✅ | | `genkitx-groq` | `BloomLabsInc/genkit-plugins` | JS | ❌ Not available | ❌ | -| `genkitx-cohere` | `BloomLabsInc/genkit-plugins` | JS | ❌ Not available | ❌ | +| `genkitx-cohere` | `BloomLabsInc/genkit-plugins` | JS | ✅ `cohere` (in-tree) | ✅ | | `genkitx-azure-openai` | `BloomLabsInc/genkit-plugins` | JS | `microsoft-foundry` (partial) | ⚠️ | | `genkitx-convex` | `BloomLabsInc/genkit-plugins` | JS | ❌ Not available | ❌ | | `genkitx-hnsw` | `BloomLabsInc/genkit-plugins` | JS | ❌ Not available | ❌ | @@ -96,9 +98,9 @@ | Sample Set | JS | Go | Python | Notes | |------------|:--:|:--:|:------:|-------| -| Canonical internal sample/testapp set | 32 (`js/testapps`) | 37 (`go/samples`) | 37 runnable (`py/samples`, excluding `shared`, `sample-test`) | Primary parity baseline | +| Canonical internal sample/testapp set | 32 (`js/testapps`) | 37 (`go/samples`) | 39 runnable (`py/samples`, excluding `shared`, `sample-test`) | Primary parity baseline | | Public showcase samples | 9 (`samples/js-*`) | — | — | Public docs/demo set | -| Total directories under samples root | — | 37 | 39 | Python includes utility dirs (`shared`, `sample-test`) | +| Total directories under samples root | — | 37 | 41 | Python includes utility dirs (`shared`, `sample-test`) | ### 2b. 
Sample Area Parity (JS vs Go vs Python) @@ -132,36 +134,39 @@ Per Google OSS guidelines: | Plugin | LICENSE | README | pyproject | CHANGELOG | py.typed | tests/ | Status | |--------|:------:|:------:|:---------:|:---------:|:--------:|:------:|:------:| -| amazon-bedrock | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| anthropic | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| checks | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (1) | ⚠️ | -| cloudflare-workers-ai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (1) | ⚠️ | -| compat-oai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (7) | ⚠️ | -| deepseek | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| dev-local-vectorstore | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (4) | ⚠️ | -| evaluators | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| firebase | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| flask | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (1) | ⚠️ | -| google-cloud | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| google-genai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| huggingface | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| mcp | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (5) | ⚠️ | -| microsoft-foundry | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| mistral | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| observability | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| ollama | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (4) | ⚠️ | -| vertex-ai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (4) | ⚠️ | -| xai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | - -**Legend**: ✅ = present, ❌ = missing, ⚠️ = mostly OK (only CHANGELOG missing) +| amazon-bedrock | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| anthropic | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| checks | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (1) | ✅ | +| cloudflare-workers-ai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| compat-oai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (7) | ✅ | +| deepseek | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| dev-local-vectorstore | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| evaluators | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| firebase | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| flask | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| google-cloud | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| google-genai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (9) | ✅ | +| huggingface | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| mcp | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (5) | ✅ | +| microsoft-foundry | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| mistral | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| observability | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| ollama | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (5) | ✅ | +| vertex-ai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| xai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| cohere | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (5) | ✅ | +| fastapi | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ (0) | ⚠️ | + +**Legend**: ✅ = present, ❌ = missing, ⚠️ = mostly OK ### 3c. Missing Files Summary | Issue | Count | Affected | |-------|:-----:|----------| | Missing `py.typed` | ~~9~~ **0** | All fixed ✅ | -| Missing `CHANGELOG.md` | 21 | ALL plugins + core package | +| Missing `CHANGELOG.md` | ~~21~~ **0** | All fixed ✅ (G11) | | Missing sample `LICENSE` | ~~1~~ **0** | `provider-checks-hello` fixed ✅ | +| Missing tests | 1 | `fastapi` plugin (0 test files) | ### 3d. Core Package (`packages/genkit`) @@ -170,14 +175,13 @@ Per Google OSS guidelines: | LICENSE | ✅ | | README.md | ✅ | | pyproject.toml | ✅ | -| CHANGELOG.md | ❌ | +| CHANGELOG.md | ✅ | | py.typed | ✅ | | tests/ | ✅ (44 test files) | ### 3e. Sample Compliance -All 37 samples have: `README.md` ✅, `run.sh` ✅, `pyproject.toml` ✅ -All samples except `provider-checks-hello` had `LICENSE` ✅ (now fixed). +All 39 samples have: `README.md` ✅, `run.sh` ✅, `pyproject.toml` ✅, `LICENSE` ✅. --- @@ -186,27 +190,30 @@ All samples except `provider-checks-hello` had `LICENSE` ✅ (now fixed). 
| Component | Test Files | Notes |
|-----------|:----------:|-------|
| **Core** (`packages/genkit`) | 44 | Comprehensive |
-| **compat-oai** | 7 | Best-covered plugin |
-| **google-genai** | 7 | Best-covered plugin |
+| **google-genai** | 9 | Best-covered plugin |
+| **compat-oai** | 7 | Well-covered |
+| **cohere** | 5 | Well-covered |
| **mcp** | 5 | Well-covered |
+| **ollama** | 5 | Well-covered |
+| **amazon-bedrock** | 4 | Good |
+| **cloudflare-workers-ai** | 4 | Good |
| **dev-local-vectorstore** | 4 | Good |
-| **ollama** | 4 | Good |
+| **microsoft-foundry** | 4 | Good |
+| **mistral** | 4 | Good |
| **vertex-ai** | 4 | Good |
-| **amazon-bedrock** | 3 | Good |
+| **xai** | 4 | Good |
| **anthropic** | 3 | Good |
-| **cloudflare-workers-ai** | 3 | Good |
| **deepseek** | 3 | Good |
| **evaluators** | 3 | Good |
| **firebase** | 3 | Good |
| **flask** | 3 | Good |
| **google-cloud** | 3 | Good |
| **huggingface** | 3 | Good |
-| **microsoft-foundry** | 3 | Good |
-| **mistral** | 3 | Good |
| **observability** | 3 | Good |
-| **xai** | 3 | Good |
-| **Total (plugins)** | 70 | All plugins ≥ 3 |
-| **Total (workspace)** | 136 | Including core + samples |
+| **checks** | 1 | Minimal |
+| **fastapi** | 0 | ❌ No tests |
+| **Total (plugins)** | 84 | 20 of 22 plugins have ≥ 3 test files |
+| **Total (workspace)** | 128+ | Including core + samples |

---

@@ -359,7 +366,7 @@ Python users typically use `httpx` or `requests` directly.
| LangChain integration plugin | ✅ | — | ❌ | Go + Python | P3 |
| **Community Ecosystem** (BloomLabs etc.) | | | | | |
| Groq provider (`genkitx-groq`) | ✅ (community) | — | ❌ | Python | P3 |
-| Cohere provider (`genkitx-cohere`) | ✅ (community) | — | ❌ | Python | P3 |
+| Cohere provider (`genkitx-cohere`) | ✅ (community) | — | ✅ `cohere` (in-tree) | Python | ✅ |
| Azure OpenAI (`genkitx-azure-openai`) | ✅ (community) | — | ✅ `microsoft-foundry` (superset) | Python | ✅ |
| Convex vector store (`genkitx-convex`) | ✅ (community) | — | ❌ | Python | P3 |
| HNSW vector store (`genkitx-hnsw`) | ✅ (community) | — | ❌ | Python | P3 |
@@ -370,8 +377,8 @@
| Feature | Notes |
|---------|-------|
-| 8 unique model providers | Bedrock, Cloudflare Workers AI, DeepSeek, HuggingFace, MS Foundry, Mistral, xAI, Observability |
-| Flask plugin | Python web framework integration |
+| 9 unique model providers | Bedrock, Cloudflare Workers AI, Cohere, DeepSeek, HuggingFace, MS Foundry, Mistral, xAI, Observability |
+| Flask + FastAPI plugins | Python web framework integrations |
| ASGI/gRPC production sample | `web-endpoints-hello` — production-ready template with security, resilience, multi-server |
| `check_consistency` tooling | Automated 25-check workspace hygiene script |
| `release_check` tooling | Automated 15-check pre-release validation |
@@ -434,7 +441,7 @@ Full plugin list from the repository README (10 plugins, 33 contributors, 54 rel
| `genkitx-anthropic` | Provider (Anthropic) | Covered via `anthropic` | ✅ |
| `genkitx-mistral` | Provider (Mistral) | Covered via `mistral` | ✅ |
| `genkitx-groq` | Provider (Groq) | ❌ Not available | ❌ |
-| `genkitx-cohere` | Provider (Cohere) | ❌ Not available | ❌ |
+| `genkitx-cohere` | Provider (Cohere) | ✅ `cohere` (in-tree) | ✅ |
| `genkitx-azure-openai` | Provider (Azure OpenAI) | `microsoft-foundry` (partial) | ⚠️ |

**Vector Store Plugins:**
@@ -455,13 +462,13 @@ Full plugin list from the repository README (10 plugins, 33 contributors, 54 rel
| External Category | Current Python Coverage | Gap Level |
|-------------------|-------------------------|:---------:|
-| Community model providers (6) | 3 of 6 covered | ⚠️ |
+| Community model providers (6) | 4 of 6 covered | ⚠️ |
| Community vector stores (3) | 0 of 3 covered | ❌ |
| Community other plugins (1) | 0 of 1 covered | ❌ |
| genkit-ai org plugins (5) | All covered via in-tree equivalents | ✅ |
| Priority relative to JS-canonical parity | Secondary | ⚠️ |

-**Note on community provider gaps**: The missing community providers (`genkitx-groq`, `genkitx-cohere`) could potentially be addressed via `compat-oai` since both Groq and Cohere offer OpenAI-compatible API endpoints. However, dedicated plugins would provide optimal model capability declarations and embedder support.
+**Note on community provider gaps**: The missing community provider `genkitx-groq` could potentially be addressed via `compat-oai` since Groq offers an OpenAI-compatible API endpoint. However, a dedicated plugin would provide optimal model capability declarations and embedder support. Cohere is now covered by the in-tree `cohere` plugin ([#4518](https://github.com/firebase/genkit/pull/4518)).

---

@@ -486,26 +493,26 @@ Full plugin list from the repository README (10 plugins, 33 contributors, 54 rel
### 7a. Python Roadmap (JS-Canonical Parity)

-> Updated: 2026-02-09. Status legend: ⬜ = not started, 🔄 = PR open, ✅ = merged, ⏳ = deferred, ⏸️ = paused (blocked on upstream), ~~struck~~ = superseded.
+> Updated: 2026-02-13. Status legend: ⬜ = not started, 🔄 = PR open, ✅ = merged, ⏳ = deferred, ⏸️ = paused (blocked on upstream), ~~struck~~ = superseded.
| Gap ID | SDK | Work Item | Reference | Status | PR | |--------|-----|-----------|-----------|:------:|:---| | **G38** | Python | **Generate-level middleware V2** — 3-tier hooks (`generate`/`model`/`tool`), `define_middleware`, registry | §8l | ⬜ Blocked | Upstream: JS [#4515](https://github.com/firebase/genkit/pull/4515), Go [#4422](https://github.com/firebase/genkit/pull/4422) | -| G2 → G1 | Python | Add `middleware` storage to `Action`, then add `use=` to `define_model` | §8b.1 | ⏸️ Paused | [#4516](https://github.com/firebase/genkit/pull/4516) — paused pending G38 | -| G7 | Python | Wire DAP action discovery into `GET /api/actions` | §8a, §8c.5 | ✅ Done | [#4459](https://github.com/firebase/genkit/pull/4459) | +| G2 → G1 | Python | Add `middleware` storage to `Action`, then add `use=` to `define_model` | §8b.1 | ⏸️ Superseded | [#4516](https://github.com/firebase/genkit/pull/4516) — open but superseded, pending G38 | +| G7 | Python | Wire DAP action discovery into `GET /api/actions` | §8a, §8c.5 | ⬜ Reverted | [#4459](https://github.com/firebase/genkit/pull/4459) merged then reverted by [#4469](https://github.com/firebase/genkit/pull/4469) — needs re-land | | G6 → G5 | Python | Pass `span_id` in `on_trace_start`, send `X-Genkit-Span-Id` | §8c.3, §8c.4 | ✅ Done | [#4511](https://github.com/firebase/genkit/pull/4511) | -| G3 | Python | Implement `simulate_constrained_generation` middleware | §8b.3, §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G12 | Python | Implement `retry` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G13 | Python | Implement `fallback` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G14 | Python | Implement `validate_support` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G15 | Python | Implement `download_request_media` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G16 | Python | Implement `simulate_system_prompt` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | +| G3 | Python | Implement `simulate_constrained_generation` middleware | §8b.3, §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G12 | Python | Implement `retry` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G13 | Python | Implement `fallback` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G14 | Python | Implement `validate_support` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G15 | Python | Implement `download_request_media` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G16 | Python | Implement `simulate_system_prompt` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | | G18 | Python | Add multipart tool support (`defineTool({multipart: true})`) | §8h | 🔄 | [#4513](https://github.com/firebase/genkit/pull/4513) | | ~~G19~~ | ~~Python~~ | ~~Add Model API V2 (`defineModel({apiVersion: 'v2'})`)~~ | ~~§8i~~ | 
~~Superseded~~ | Replaced by G38 (middleware V2) + G41 (bidi models) | | G20 | Python | Add `context` parameter to `Genkit()` constructor | §8j | 🔄 | [#4512](https://github.com/firebase/genkit/pull/4512) | | G21 | Python | Add `clientHeader` parameter to `Genkit()` constructor | §8j | 🔄 | [#4512](https://github.com/firebase/genkit/pull/4512) | | G22 | Python | Add `name` parameter to `Genkit()` constructor | §8j | 🔄 | [#4512](https://github.com/firebase/genkit/pull/4512) | -| G4 | Python | Move `augment_with_context` to define-model time | §8b.2 | 🔄 | [#4510](https://github.com/firebase/genkit/pull/4510) — logic valid, needs G38 interface | +| G4 | Python | Move `augment_with_context` to define-model time | §8b.2 | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — logic valid, needs new PR after G38 | | **G39** | Python | **Bidirectional Action** primitive (`define_bidi_action`) | §8m | ⬜ Blocked | Upstream: JS [#4288](https://github.com/firebase/genkit/pull/4288) | | **G40** | Python | **Bidirectional Flow** primitive (`define_bidi_flow`) | §8m | ⬜ Blocked | Upstream: JS [#4288](https://github.com/firebase/genkit/pull/4288) | | **G41** | Python | **Bidirectional Model** (`define_bidi_model`, `generate_bidi`) for real-time LLM APIs | §8m | ⬜ Blocked | Upstream: JS [#4210](https://github.com/firebase/genkit/pull/4210) | @@ -517,7 +524,7 @@ Full plugin list from the repository README (10 plugins, 33 contributors, 54 rel | G30 | Python | Add Cloud SQL PG vector store parity | §5g | ⏳ Deferred | — | | G31 | Python | Add dedicated Python MCP parity sample | §2b/§9 | 🔄 | [#4248](https://github.com/firebase/genkit/pull/4248) | | G8 | Python | Implement `genkit.client` (`run_flow` / `stream_flow`) | §5c/§9 | ⏳ Deferred | — | -| G17 | Python | Add built-in `api_key()` context provider | §8g | 🔄 | [#4521](https://github.com/firebase/genkit/pull/4521) (draft) | +| G17 | Python | Add built-in `api_key()` context provider | §8g | ⬜ | [#4521](https://github.com/firebase/genkit/pull/4521) (closed) — needs new PR | | G11 | Python | Add `CHANGELOG.md` to plugins + core | §3c | ✅ Done | [#4507](https://github.com/firebase/genkit/pull/4507), [#4508](https://github.com/firebase/genkit/pull/4508) | | G33 | Python | Consider LangChain integration parity | §1c/§9 | ⏳ Deferred | — | | G34 | Python | Track BloomLabs vector stores (Convex, HNSW, Milvus) | §6b/§9 | ⏳ Deferred | — | @@ -721,7 +728,7 @@ Both JS and Python use the **same core protocol**: Newline-delimited JSON (NDJSO | Header | JS | Python | Gap | |--------|-----|--------|-----| | `X-Genkit-Trace-Id` | ✅ Set in `onTraceStart` callback. Both streaming and non-streaming. | ✅ Set when trace ID is available. Both streaming and non-streaming. | ✅ Identical | -| **`X-Genkit-Span-Id`** | ✅ Set in `onTraceStart` callback (`reflection.ts:247`). | ❌ **Not sent**. Only listed in CORS `expose_headers`. | **Gap**: Python never sends this header. | +| **`X-Genkit-Span-Id`** | ✅ Set in `onTraceStart` callback (`reflection.ts:247`). | ✅ Set in `wrapped_on_trace_start` callback. Both streaming and non-streaming. | ✅ Fixed by [#4511](https://github.com/firebase/genkit/pull/4511) | | `X-Genkit-Version` / `x-genkit-version` | ✅ Set as `X-Genkit-Version` in `onTraceStart` callback AND as `x-genkit-version` in non-streaming list endpoints. | ✅ Set as `x-genkit-version` in all responses. | ✅ Functionally equivalent (case-insensitive HTTP headers). | | CORS `expose_headers` | Not explicitly shown (uses express CORS). 
| `['X-Genkit-Trace-Id', 'X-Genkit-Span-Id', 'x-genkit-version']` | ✅ Python is more explicit. | @@ -729,7 +736,7 @@ Both JS and Python use the **same core protocol**: Newline-delimited JSON (NDJSO | Aspect | JS | Python | Gap | |--------|-----|--------|-----| -| Callback arguments | `({traceId, spanId})` — receives **both** trace ID and span ID as a destructured object. | `(tid: str)` — receives **only** trace ID as a string. | **Gap**: Python cannot send `X-Genkit-Span-Id` because it doesn't receive the span ID. | +| Callback arguments | `({traceId, spanId})` — receives **both** trace ID and span ID as a destructured object. | `(tid: str, sid: str)` — receives **both** trace ID and span ID. | ✅ Fixed by [#4511](https://github.com/firebase/genkit/pull/4511) | **JS** (`js/core/src/reflection.ts:234-258`): ```js @@ -743,16 +750,17 @@ const onTraceStartCallback = ({ traceId: tid, spanId }) => { }; ``` -**Python** (`py/.../core/reflection.py:395-399`): +**Python** (`py/.../core/reflection.py`): ```python -def wrapped_on_trace_start(tid: str) -> None: - nonlocal run_trace_id +def wrapped_on_trace_start(tid: str, sid: str) -> None: + nonlocal run_trace_id, run_span_id run_trace_id = tid - on_trace_start(tid) + run_span_id = sid + on_trace_start(tid, sid) trace_id_event.set() ``` -**Fix required**: Update `on_trace_start` callback signature throughout the Python action system to pass both `trace_id` and `span_id`, then include `X-Genkit-Span-Id` in reflection response headers. +**Fixed**: `on_trace_start` now receives both `trace_id` and `span_id`, and `X-Genkit-Span-Id` is included in reflection response headers ([#4511](https://github.com/firebase/genkit/pull/4511)). #### 8c.5 Action Discovery Endpoint (`GET /api/actions`) diff --git a/py/engdoc/ROADMAP.org b/py/engdoc/ROADMAP.org deleted file mode 100644 index 3052eecc07..0000000000 --- a/py/engdoc/ROADMAP.org +++ /dev/null @@ -1,240 +0,0 @@ -#+title: SDK Roadmap -#+description: An org document that enlists the milestones and objectives of our SDK roadmap. - -* SDK Roadmap [0/0] -** Objectives [0/4] -- [ ] The Python SDK needs to be at feature parity with the JavaScript SDK. -- [ ] The Go SDK needs to be at feature parity with the JavaScript SDK. -- [ ] The Python Dotprompt library needs to be at feature parity with the JavaScript Dotprompt implementation. -- [ ] The Go Dotprompt library needs to be at feature parity with the JavaScript Dotprompt implementation. 
-** Specifications and Schemas [0/4] -- [ ] dotprompt - - [ ] helpers (based on yaml spec) - - [ ] json - - [ ] Go - - [ ] Python - - [ ] media - - [ ] Go - - [ ] Python - - [ ] role - - [ ] Go - - [ ] Python - - [ ] history - - [ ] Create the spec yaml - - [ ] Go - - [ ] Python - - [ ] section - - [ ] Create the spec yaml - - [ ] Go - - [ ] Python - - [ ] metadata.yaml - - [ ] Go - - [ ] Python - - [ ] partials.yaml - - [ ] Go - - [ ] Python - - [ ] picoschema.yaml - - [ ] Go - - [ ] Python - - [ ] variables.yaml - - [ ] Go - - [ ] Python -- [ ] genkit-schema converter - - [ ] schema.py and tests - - [ ] Candidate - - [ ] CandidateError - - [ ] DataPart - - [ ] DocumentData - - [ ] FinishReason - - [ ] GenerateActionOptions - - [ ] GenerateCommonConfig - - [ ] GenerateRequest - - [ ] GenerateResponse - - [ ] GenerateResponseChunk - - [ ] GenerationUsage - - [ ] InstrmentationLibrary - - [ ] Link - - [ ] MediaPart - - [ ] Message - - [ ] ModelInfo - - [ ] ModelRequest - - [ ] ModelResponse - - [ ] ModelResponseChunk - - [ ] Part - - [ ] Role - - [ ] SpanContext - - [ ] SpanData - - [ ] SpanMetadata - - [ ] SpanStatus - - [ ] TextPart - - [ ] TimeEvent - - [ ] ToolDefinition - - [ ] ToolRequest - - [ ] ToolRequestPart - - [ ] ToolResponse - - [ ] ToolResponsePart - - [ ] TraceData -- [ ] reflection API [0/7] - - See: `reflectionApi.yaml` - - - [ ] GET /api/actions: Retrieves all runnable actions. - - [ ] POST /api/runAction: Runs an action and returns the result. - - [ ] GET /api/envs/{env}/traces: Retrieves all traces for a given environment (e.g. dev or prod) - - [ ] GET /api/envs/{env}/traces/{traceId}: Retrieves traces for the given environment - - [ ] GET /api/envs/{env}/flowStates: Retrieves all flow states for a given environment (e.g. dev or prod) - - [ ] GET /api/envs/{env}/flowStates/{flowId}: Retrieves a flow state for the given ID - - [ ] GET /api/__health: health check -- [ ] generate API - - [ ] -** Plugins [0/9] -- [ ] Design [0/2] - - [ ] Proposal with example API - - [ ] Design review -- [ ] Chroma [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] Dotprompt [0/0] -- [ ] Firebase [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] GoogleAI [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] Ollama [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] OpenAI [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] Pinecone [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] VertexAI [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -** Samples -- [ ] Hello world -- [ ] Basic Gemini -- [ ] Context caching -- [ ] Context caching2 -- [ ] Custom evaluators -- [ ] Docs Menu Basic -- [ ] Docs Menu RAG -- [ ] Flow sample 1 -- [ ] Flow sample 2 -- [ ] Prompt file -- [ ] RAG -- [ ] Vertex AI model garden -- [ ] Vertex AI reranker -- [ ] Vertex AI Vector Search -** Server implementations [/] -- [ ] multiprocessing server cluster [0/2] - - [ ] reflection server in dev mode - - [ ] production flows server -** CI/CD/Dev workflow [2/6] -- [-] Unit testing library - - [ ] Go testify - - [X] Python pytest -- [X] Unit testing watcher - - [X] pytest-watcher -- [-] Coverage analysis - - [X] pytest-cov - - [ ] Go test coverage tool -- [ ] Vulnerability analysis - - [ ] Python - - [ ] Go -- [ ] License compatibility checks - - [ ] Python - - [ ] Go -- [X] Automated license header check - - [X] Python - - [X] Go -** Git 
Hooks [0/1] -- [-] Pre-commit and pre-push hooks - - [-] Build Code - - [-] go build - - [X] Genkit - - [ ] Dotprompt - - [X] build python distribution - - [X] Genkit - - [X] Dotprompt - - [X] Distribution - - [X] Genkit - - [X] Dotprompt - - [-] Documentation - - [-] godoc - - [X] Genkit - - [ ] Dotprompt - - [X] engdoc using mkdocs - - [X] Genkit - - [X] Dotprompt - - [ ] Python API doc using mkdocstrings - - [ ] Genkit - - [ ] Dotprompt - - [-] Test - - [-] go test - - [X] Genkit - - [ ] Dotprompt - - [X] pytest with coverage threshold - - [X] Genkit - - [X] Dotprompt - - [X] Format - - [-] Lint - - [-] Python - - [-] mypy static type checks - - [X] dotprompt - - [ ] genkit -** Dependencies -- [X] Handlebars - - [X] handlebars-py (MIT License; feasibility test done) - - [X] pybars3 (LGPL 3.0 License; cannot use) -- [ ] JSON Schema - - [ ] Go: - - [ ] https://github.com/swaggest/jsonschema-go - - [ ] https://github.com/xeipuuv/gojsonschema - - [ ] https://github.com/santhosh-tekuri/jsonschema - - [ ] https://github.com/qri-io/jsonschema -- [ ] Picoschema - - [ ] Go - - [ ] https://github.com/jumonapp/picoschema -** Release management -- [X] Semantic Versioning and tagging -- [X] PyPi project for dotprompt https://pypi.org/project/dotprompt/ -- [X] PyPi project for genkit https://pypi.org/project/genkit/ -- [X] Version consistency check script (bin/check_versions) -- [X] Dynamic plugin matrix for publish workflow -- [X] Post-publish verification job -- [X] Shell script linting (shellcheck) - -** API Documentation [0/4] -- [X] MkDocs configuration with Material theme -- [X] mkdocstrings for all 22 plugins (mkdocs.yml) -- [ ] Docs publish workflow (deploy to GitHub Pages on release) - - [ ] Create publish_docs.yml workflow - - [ ] Configure GitHub Pages source - - [ ] Add version selector for multiple SDK versions - - [ ] Add search functionality -- [ ] API reference pages for each plugin - - [ ] Auto-generate from docstrings - - [ ] Add usage examples - - [ ] Add configuration reference - -** Integration Tests [/] - - [ ] Go - - [ ] Python - - [X] JS diff --git a/py/engdoc/blog-genkit-python-0.5.0.md b/py/engdoc/blog-genkit-python-0.5.0.md deleted file mode 100644 index 0edf69ce73..0000000000 --- a/py/engdoc/blog-genkit-python-0.5.0.md +++ /dev/null @@ -1,304 +0,0 @@ -# Genkit Python SDK 0.5.0: A Major Leap Forward - -Building intelligent AI-powered applications in Python just got significantly better. Today, we're thrilled to announce the release of **Genkit Python SDK 0.5.0**—our most significant update yet, with **178 commits**, **680+ files changed**, and contributions from **13 developers** across **188 PRs** over the past 8 months. - -This release transforms Genkit for Python from an experimental SDK into a production-ready framework with comprehensive plugin coverage, enterprise-grade security, and first-class developer experience. 
- -## What's New in 0.5.0 - -### Massive Plugin Ecosystem Expansion - -We've added **7 new model provider plugins** and **3 telemetry plugins**, giving you access to virtually every major AI provider: - -**New Model Providers:** -- **AWS Bedrock**: Access Claude, Titan, Llama, and more through AWS -- **Azure OpenAI (Microsoft Foundry)**: Enterprise Azure OpenAI integration -- **Cloudflare Workers AI**: Edge AI with Cloudflare's global network -- **Mistral AI**: Mistral Large, Small, Codestral, and Pixtral models -- **Hugging Face**: 17+ inference providers through one plugin -- **Anthropic**: Full Claude model support -- **DeepSeek**: DeepSeek models with structured output - -**New Telemetry Plugins:** -- **AWS X-Ray**: Production observability with SigV4 signing -- **Observability**: Third-party backends (Sentry, Honeycomb, Datadog) -- **Google Cloud Telemetry**: Full parity with JS/Go SDKs - -### Async-First Architecture - -The Python SDK now embraces async-first design throughout. Here's how clean your code can be: - -```python -from genkit.ai import Genkit -from genkit.plugins.google_genai import GoogleAI - -ai = Genkit( - plugins=[GoogleAI()], - model='googleai/gemini-2.5-flash', -) - -@ai.flow() -async def analyze_sentiment(text: str) -> str: - """Analyzes sentiment of the given text.""" - response = await ai.generate( - prompt=f'Analyze the sentiment of this text: {text}' - ) - return response.text # Property, not method -``` - -### Agentive Tool Calling - -Genkit makes it easy to give your AI agents the ability to call functions. Define tools with the `@ai.tool()` decorator: - -```python -from pydantic import BaseModel, Field - -class WeatherInput(BaseModel): - location: str = Field(description='City and state, e.g. San Francisco, CA') - -@ai.tool() -def get_weather(input: WeatherInput) -> dict: - """Get the current weather for a location.""" - return { - 'location': input.location, - 'temperature': 21.5, - 'conditions': 'sunny', - } - -@ai.flow() -async def weather_agent(location: str) -> str: - """AI agent that can check the weather.""" - response = await ai.generate( - prompt=f"What's the weather in {location}?", - tools=[get_weather], - ) - return response.text -``` - -### Enhanced Dotprompt Integration - -We've deeply integrated with [Dotprompt](https://github.com/google/dotprompt), our prompt templating engine, bringing: - -- **Directory/file prompt loading**: Automatic prompt discovery matching the JS SDK -- **Handlebars partials**: Template reuse with `define_partial` -- **Python 3.14 support**: Full compatibility via our Rust-based Handlebars engine -- **Cycle detection**: Prevents infinite recursion in partial resolution -- **Path traversal hardening**: Security fix for CWE-22 vulnerability - -```python -from genkit.ai import Genkit -from genkit.plugins.google_genai import GoogleAI - -ai = Genkit(plugins=[GoogleAI()], model='googleai/gemini-2.5-flash') - -# Define partials for reusable template components -ai.define_partial('greeting', 'Hello, {{name}}!') -ai.define_partial('signature', '\n\nBest regards,\n{{sender}}') - -# Use partials in prompts ({{> greeting}} and {{> signature}}) -@ai.flow() -async def send_email(name: str, sender: str) -> str: - response = await ai.generate( - prompt=f'Write an email to {name} from {sender}' - ) - return response.text -``` - -### Comprehensive Type Safety - -We now run **three type checkers** on every commit: - -| Type Checker | Provider | Purpose | -|-------------|----------|---------| -| **ty** | Astral (Ruff) | Fast, strict 
checking | -| **pyrefly** | Meta | Additional coverage | -| **pyright** | Microsoft | Full type analysis | - -This means you get better IDE support, fewer runtime errors, and more confident refactoring. - -### Pydantic Output Instances - -Generate structured data directly into Pydantic models: - -```python -from pydantic import BaseModel, Field -from genkit.ai import Genkit, Output -from genkit.plugins.google_genai import GoogleAI - -ai = Genkit(plugins=[GoogleAI()], model='googleai/gemini-2.5-flash') - -class RpgCharacter(BaseModel): - name: str = Field(description='name of the character') - backstory: str = Field(description='character backstory') - abilities: list[str] = Field(description='list of abilities (3-4)') - -@ai.flow() -async def generate_character(name: str) -> RpgCharacter: - result = await ai.generate( - prompt=f'Generate an RPG character named {name}', - output=Output(schema=RpgCharacter), # Use Output wrapper - ) - # Returns an RpgCharacter instance, not dict! - return result.output -``` - -## Critical Fixes & Security - -This release addresses several important issues: - -- **Race Condition Fix**: Dev server startup race condition resolved (#4225) -- **Thread Safety**: Per-event-loop HTTP client caching prevents event loop binding errors -- **Security Audit**: Full Ruff security rules (S) audit completed -- **SigV4 Signing**: AWS X-Ray OTLP exporter now uses proper AWS signatures - -## Developer Experience Improvements - -### Hot Reloading - -All samples now support hot reloading via [Watchdog](https://github.com/gorakhargosh/watchdog): - -```bash -# Start with hot reloading -genkit start -- python main.py -``` - -### CI Consolidation - -Every commit is now release-worthy. Our consolidated CI runs: -- All three type checkers -- Full test suite across Python 3.10-3.14 -- Security scanning -- License compliance -- Package builds - -### Rich Tracebacks - -Better error output with [Rich](https://github.com/Textualize/rich) tracebacks in all samples. 
- -## Available Plugins - -### Model Providers - -| Plugin | Models | Status | -|--------|--------|--------| -| `genkit-plugin-google-genai` | Gemini 2.5, Imagen, embeddings | ✅ Stable | -| `genkit-plugin-ollama` | Gemma, Llama, Mistral (local) | ✅ Stable | -| `genkit-plugin-anthropic` | Claude 3.5 Sonnet, Opus | ✅ New | -| `genkit-plugin-amazon-bedrock` | Claude, Titan, Llama via AWS | ✅ New | -| `genkit-plugin-microsoft-foundry` | Azure OpenAI | ✅ New | -| `genkit-plugin-cloudflare-workers-ai` | Cloudflare Workers AI | ✅ New | -| `genkit-plugin-mistral` | Mistral Large, Codestral | ✅ New | -| `genkit-plugin-huggingface` | 17+ providers | ✅ New | -| `genkit-plugin-deepseek` | DeepSeek models | ✅ New | -| `genkit-plugin-xai` | Grok models | ✅ New | - -### Telemetry & Observability - -| Plugin | Destination | Status | -|--------|-------------|--------| -| `genkit-plugin-google-cloud` | Cloud Trace/Logging | ✅ Stable | -| `genkit-plugin-aws` | AWS X-Ray | ✅ New | -| `genkit-plugin-observability` | Sentry, Honeycomb, Datadog | ✅ New | -| `genkit-plugin-firebase` | Firebase/Firestore | ✅ Stable | - -### Data & Retrieval - -| Plugin | Purpose | Status | -|--------|---------|--------| -| `genkit-plugin-firebase` | Vector search with Firestore | ✅ Stable | -| `genkit-plugin-dev-local-vectorstore` | Local vector store for dev | ✅ Stable | - -## Get Started - -Install Genkit Python SDK 0.5.0: - -```bash -pip install genkit==0.5.0 -``` - -Or with specific plugins: - -```bash -pip install genkit[google-genai,anthropic,amazon-bedrock]==0.5.0 -``` - -### Quick Start Example - -```python -import asyncio -from genkit.ai import Genkit -from genkit.plugins.google_genai import GoogleAI - -# Initialize at module level (best practice) -ai = Genkit( - plugins=[GoogleAI()], - model='googleai/gemini-2.5-flash', -) - -@ai.flow() -async def greeting_flow(name: str) -> str: - """Generates a personalized greeting.""" - response = await ai.generate( - prompt=f'Write a creative greeting for {name}' - ) - return response.text # Property, not method - -async def main(): - result = await greeting_flow('World') - print(result) - -if __name__ == '__main__': - ai.run_main(main()) # Use ai.run_main() for proper lifecycle -``` - -Run with the Developer UI: - -```bash -genkit start -- python main.py -``` - -## Contributors - -This release was made possible by an incredible team effort. 
Thank you to all **13 contributors** who made this release possible: - -| Contributor | Contributions | -|-------------|---------------| -| [@yesudeep](https://github.com/yesudeep) | Core architecture, 7 plugins, type safety, security | -| [@MengqinShen](https://github.com/MengqinShen) | Resources, samples, model configs | -| [@AbeJLazaro](https://github.com/AbeJLazaro) | Model Garden, Ollama, Gemini | -| [@pavelgj](https://github.com/pavelgj) | Reflection API, embedders | -| [@zarinn3pal](https://github.com/zarinn3pal) | Anthropic, DeepSeek, xAI, GCP telemetry | -| [@huangjeff5](https://github.com/huangjeff5) | PluginV2, type safety, telemetry | -| [@hendrixmar](https://github.com/hendrixmar) | Evaluators, OpenAI compat, Dotprompt | -| [@ssbushi](https://github.com/ssbushi) | Evaluator plugins | -| [@shrutip90](https://github.com/shrutip90) | ResourcePartSchema | -| [@schlich](https://github.com/schlich) | Type annotations | -| [@ktsmadhav](https://github.com/ktsmadhav) | Windows support | -| [@junhyukhan](https://github.com/junhyukhan) | Documentation | -| [@CorieW](https://github.com/CorieW) | Community contribution | - -Special thanks to the [google/dotprompt](https://github.com/google/dotprompt) team for the deep integration work. - -## What's Next? - -We're committed to continuously evolving Genkit Python. Coming soon: - -- **Session/Chat API**: Multi-turn conversation management -- **Reflection API v2**: WebSocket and JSON-RPC 2.0 support -- **More plugins**: Checks, Chroma, Pinecone, Cloud SQL PostgreSQL -- **Feature parity**: Continued alignment with JS/Go SDKs - -## Get Involved - -Got questions or feedback? Join us on: -- [Discord](https://discord.gg/qXt5zzQKpc) -- [Stack Overflow](https://stackoverflow.com/questions/tagged/genkit) -- [GitHub Issues](https://github.com/firebase/genkit/issues) - -Explore the [full documentation](https://python.api.genkit.dev) and start building! - -Happy coding, and we look forward to seeing what you create with Genkit Python 0.5.0! - ---- - -*Tags: Launch | Genkit | AI | Python* diff --git a/py/engdoc/extending/api.md b/py/engdoc/extending/api.md index 7b8eafb937..5fad3dd83c 100644 --- a/py/engdoc/extending/api.md +++ b/py/engdoc/extending/api.md @@ -139,210 +139,110 @@ differences here for now. * **Content negotiation**: Different response formats based on accept headers or query parameters. -# Sync vs Async Design +# Async-First Design + Genkit is a library that allows application developers to create AI flows for their applications using an API that abstracts over various components such as -indexers, retiervers, models, embedders, etc. - -Ideally, as a user, one would like the API to be async-first because this -single-threaded model of dealing with concurrency is the direction that Python -frameworks are taking and Genkit naturally lives in an async world. Genkit is -majorly I/O-bound not as much computationally-bound since it's primary purpose -is composing various AI foundational components and setting up typed -communication patterns between them. - -### Shape of the API - -Before we begin, let's study `structlog`, a structured logging library that has -had to deal with this problem as well and exposes a well-defined set of APIs -that is familiar to the Python world: - -```python -import asyncio -import structlog - -logger = structlog.get_logger(__name__) - -async def foo() -> str: - """Foo. - - Returns: - The name of this function. 
- """ - await logger.ainfo('Returning foo from function', fn=foo.__name__) - - return foo.__name__ - - -if __name__ == '__main__': - asyncio.run(foo()) - -``` - -Running the program displays the following on the console: - -```shell -zsh❯ uv run foo.py -2025-03-30 14:23:13 [info ] Returning foo from function fn=foo - -``` - -`structlog` exposes the async equivalent (`await logger.ainfo()`) functionality -of their `logger.info()` calls using the minimally-invasive `a*` prefix, without -resorting to any sort of magic. - -We propose to do the same: - -```python -ai = Genkit() - -@ai.flow() -async def async_flow(...): - response = await ai.generate(f"Answer this: {query}") - return {"answer": response.text} - -@ai.flow() -def sync_flow(...): - response = ai.generate(f"Answer this: {query}") - return {"answer": response.text} - -async def main() -> None: - """Main entry-point.""" - ... - -if __name__ == '__main__': - asyncio.run(main()) - -``` - -!!! note - - In an initial iteration of this design, we were considering using decorators - to detect whether the callable is a coroutine and change the meaning of the - `ai` treating it as a special variable inside it, but this increases the - complexity of the implementation and adds very little value. +indexers, retrievers, models, embedders, etc. - We have, therefore, decided to favor simplicity and add the `a*` prefix to - every asynchronous method made available by the API. +The API is **async-first** because this single-threaded model of dealing with +concurrency is the direction that Python frameworks are taking and Genkit +naturally lives in an async world. Genkit is majorly I/O-bound, not as much +computationally-bound, since its primary purpose is composing various AI +foundational components and setting up typed communication patterns between them. -To make this work, we could have a user-facing veneer -`genkit.ai.GenkitExperimental` class that composes 2 implementations of Genkit: +### Class Hierarchy -- `genkit.ai.AsyncGenkit` -- `genkit.ai.SyncGenkit` - -#### ASCII Diagram +The implementation uses a three-level class hierarchy: ```ascii -+---------------------+ +-------------------+ -| RegistrarMixin | | Registry | -|---------------------| |-------------------| -| - _registry |<>----|(placeholder type) | (Composition: RegistrarMixin has a Registry) -|---------------------| +-------------------+ -| + __init__(registry)| -| + flow() | -| + tool() | ++---------------------+ +| GenkitRegistry | (in _registry.py) +|---------------------| +| + flow() | Decorator to register flows +| + tool() | Decorator to register tools +| + define_model() | Register model actions +| + define_embedder() | Register embedder actions | + registry (prop) | +--------^------------+ - | (Inheritance: GenkitExperimental is-a RegistrarMixin) -+--------|-----------------+ +----------------------+ +----------------------+ -| GenkitExperimental |----->| AsyncGenkit | | SyncGenkit | -| (in _veneer.py) |<>-- | (in _async.py) | | (in _sync.py) | -|--------------------------| | |----------------------| |----------------------| -| - _registry (inherited) | | | + generate() | | + generate() | -| - _async_ai : AsyncGenkit| | | + generate_stream() | | + generate_stream() | -| - _sync_ai : SyncGenkit | *-->+----------------------+ *-->+----------------------+ -|--------------------------| (Async Implementation) (Independent Sync Impl.) 
-| + __init__(registry) | -| + flow() (inherited) | -| + tool() (inherited) | -| | -| + generate() ----------> calls _sync_ai.generate() -| + generate_stream() ---> calls _sync_ai.generate_stream() -| | -| + agenerate() ---------> calls _async_ai.generate() -| + agenerate_stream() --> calls _async_ai.generate_stream() -| | -| + aio (prop) ---------> returns _async_ai instance -| + io (prop) ----------> returns _sync_ai instance -+--------------------------+ + | ++--------|-----------+ +| GenkitBase | (in _base_async.py) +|--------------------| +| + __init__( | +| plugins, | +| model, | +| reflection_ | +| server_spec) | ++--------^-----------+ + | ++--------|-----------+ +| Genkit | (in _aio.py) +|--------------------| +| + generate() | async — text generation +| + generate_stream()| streaming generation +| + embed() | async — create embeddings +| + retrieve() | async — fetch documents +| + rerank() | async — reorder documents +| + evaluate() | async — evaluate outputs +| + chat() | session-based chat ++--------------------+ ``` -#### Mermaid Diagram - ```mermaid classDiagram - class RegistrarMixin { - -Registry _registry - +__init__(registry: Registry | None) - +flow(name: str | None, description: str | None) Callable - +tool(name: str | None, description: str | None) Callable + class GenkitRegistry { + <<_registry.py>> + +flow(name, description) Callable + +tool(name, description) Callable + +define_model(config, fn) Action + +define_embedder(config, fn) Action +registry() Registry } - class Registry { - %% Placeholder for Registry type %% + class GenkitBase { + <<_base_async.py>> + +__init__(plugins, model, reflection_server_spec) } - class AsyncGenkit { - <<_async.py>> - +generate(prompt: str) str - +generate_stream(prompt: str) AsyncGenerator + class Genkit { + <<_aio.py>> + +generate(model, prompt, system, ...) GenerateResponseWrapper + +generate_stream(model, prompt, ...) tuple + +embed(embedder, content) EmbedResponse + +retrieve(retriever, query) list + +rerank(reranker, query, documents) list + +evaluate(evaluator, dataset) EvalResponse } - class SyncGenkit { - <<_sync.py>> - +generate(prompt: str) str - +generate_stream(prompt: str) Generator - } - - class GenkitExperimental { - <<_veneer.py>> - -AsyncGenkit _async_ai - -SyncGenkit _sync_ai - +__init__(registry: Registry | None) - +generate(prompt: str) str - +generate_stream(prompt: str) Generator - +agenerate(prompt: str) str - +agenerate_stream(prompt: str) AsyncGenerator - +aio() AsyncGenkit - +io() SyncGenkit - } - - RegistrarMixin *-- Registry : has a > - GenkitExperimental --|> RegistrarMixin : inherits - GenkitExperimental *-- AsyncGenkit : has _async_ai > - GenkitExperimental *-- SyncGenkit : has _sync_ai > - - GenkitExperimental --> AsyncGenkit : calls agenerate() - GenkitExperimental --> AsyncGenkit : calls agenerate_stream() - GenkitExperimental --> SyncGenkit : calls generate() - GenkitExperimental --> SyncGenkit : calls generate_stream() + GenkitBase --|> GenkitRegistry : inherits + Genkit --|> GenkitBase : inherits ``` -An instance of each of these would be exposed as a property on the veneer class. -The veneer class should use a mixin called `RegistrarMixin` to manage the -registration of AI blocks such as tools, flows, actions, etc - -### Maintaining parity +All methods on the `Genkit` class are `async`. Synchronously-defined flows and +tools are executed using a thread-pool executor internally. -This would imply we'd have 2 implementations of Genkit. 
There's 2 ways that -occur to me in which we could maintain parity: +### Usage -1. Maintain two separate implementations one for async and another for sync. +```python +from genkit.ai import Genkit +from genkit.plugins.google_genai import GoogleAI -2. Implement one in terms of the other. +ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', +) -We recommend option 1 for simplicity and easier maintenance. +@ai.flow() +async def my_flow(query: str) -> str: + response = await ai.generate(prompt=f"Answer this: {query}") + return response.text +``` ## Implementation -Currently, the Veneer API contains an implementation that uses threads to start -a reflection server when Genkit is in use in an environment where the -`GENKIT_ENV` environment variable has been set to `'dev'`. - -There are a few ways to set that environment variable, and running the development -server using `genkit start` also sets it. +The `Genkit` class starts a reflection server when the `GENKIT_ENV` environment +variable has been set to `'dev'`. Running the following command: @@ -350,60 +250,28 @@ Running the following command: genkit start -- uv run sample.py ``` -would set `GENKIT_ENV='dev'` within a running instance of `sample.py`. +sets `GENKIT_ENV='dev'` within a running instance of `sample.py`. `genkit start` exposes a developer UI (usually called dev UI for short) that is used for debugging and that talks to a reflection API server implemented by the -veneer `Genkit` class instance. The reflection API server provides a way for -the dev UI to allow users to debug their custom flows, test features such as -models and plugins, and also observe traces emitted by these components. +`Genkit` class instance. The reflection API server provides a way for the dev UI +to allow users to debug their custom flows, test features such as models and +plugins, and also observe traces emitted by these components. ### Concurrency handling -We would like to avoid using threads since asyncio is primarily a -single-threaded design and threading complicates the internals of the API. -Synchronously-defined flows, tools, and other actions would execute using a -thread-pool executor used by the `SyncGenkit` implementation. +The implementation avoids using threads for server infrastructure since asyncio +is primarily a single-threaded design. The reflection server runs as a coroutine +on the same event loop. #### Scenarios -- For simple short lived applications, when we don't have the dev server we'd - want the program to exit since that shouldn't start the reflection server. - -- For simple short lived applications, when we have the dev server (meaning the - `GENKIT_ENV=dev` environment variable has been set), we should start the - reflection server and prevent the application's main thread from exiting and - shutting down the process to enable debugging. - -- For servers, we'd want the user to be able to add the reflection server to a - manager object such as that used in @multi_server.py passed into the - arguments of the Genkit veneer class instance so that it attaches to the - server manager alongside any application servers written by the end user. - -The end user should not need to expliclity add code to their main thread to wait -for the reflection server when dev mode is enabled. Since we're building an -asyncio-first solution it should naturally do that since we'd be running the -reflection server on the same event loop. 
- -```pseudocode -if short lived app: - if dev mode enabled: - add reflection server coroutine to the event loop so main thread waits for dev UI debugging - else: - complete all flows and exit normally -elif long-lived server: - if dev mode enabled: - add reflection server coroutine to the server manager to enable debuggging using dev UI - else: - run user-defined servers using server manager - -``` +- For simple short-lived applications without dev mode, the program exits + normally after completing all flows. -Each of these can be demonstrated using individual entry-points sharing a common -set of flows and tools. For example, the sample would define all the flows in -`flows.py` and use them in both `server_example.py` and `short_lived_example.py` -as a demonstration: +- For simple short-lived applications with dev mode (`GENKIT_ENV=dev`), the + reflection server starts and prevents the main thread from exiting to enable + debugging. -- `flows.py` -- `server_example.py` -- `short_lived_example.py` +- For long-lived servers, the reflection server attaches to the server manager + alongside any application servers written by the end user. diff --git a/py/engdoc/extending/index.md b/py/engdoc/extending/index.md index 99adf0f124..c51fe449d4 100644 --- a/py/engdoc/extending/index.md +++ b/py/engdoc/extending/index.md @@ -43,13 +43,28 @@ genkit: { } plugins: { style: {fill: "#FCE4EC"} - chroma - pinecone google_genai google_cloud - openai + vertex_ai firebase ollama + anthropic + amazon_bedrock + cloudflare_workers_ai + cohere + compat_oai + deepseek + huggingface + microsoft_foundry + mistral + xai + observability + checks + evaluators + mcp + fastapi + flask + dev_local_vectorstore } } diff --git a/py/engdoc/extending/servers.md b/py/engdoc/extending/servers.md index a32ecbc2cf..2153ad4489 100644 --- a/py/engdoc/extending/servers.md +++ b/py/engdoc/extending/servers.md @@ -38,10 +38,10 @@ the runtime. The initialization process deals with: | Server | Sources | |------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Flows | [JS](https://github.com/firebase/genkit/blob/main/js/plugins/express/src/index.ts), [Go](TODO), [Python](TODO) | +| Flows | [JS](https://github.com/firebase/genkit/blob/main/js/plugins/express/src/index.ts), [Go](TODO), [Python](https://github.com/firebase/genkit/blob/main/py/packages/genkit/src/genkit/core/flows.py) | | Telemetry | [JS](https://github.com/firebase/genkit/blob/main/genkit-tools/telemetry-server/src/index.ts) | | Dev UI/Tools API | [JS](https://github.com/firebase/genkit/blob/main/genkit-tools/common/src/server/server.ts) | -| Reflection | [JS](https://github.com/firebase/genkit/blob/main/js/core/src/reflection.ts), [Go](https://github.com/firebase/genkit/blob/main/go/genkit/reflection.go), [Python](TODO) | +| Reflection | [JS](https://github.com/firebase/genkit/blob/main/js/core/src/reflection.ts), [Go](https://github.com/firebase/genkit/blob/main/go/genkit/reflection.go), [Python](https://github.com/firebase/genkit/blob/main/py/packages/genkit/src/genkit/core/reflection.py) | ## Environment Variables @@ -104,4 +104,3 @@ Many of these servers handle signals to handle graceful termination and clean up | `SIGTSTP` | 20 or 18 | Terminal stop signal. Sent when the user presses `Ctrl+Z`. | Stop | Used for job control. | | `SIGTTIN` | 21 | Terminal input. Sent to a background process that attempts to read from the terminal. 
| Stop | Used for job control. | | `SIGTTOU` | 22 | Terminal output. Sent to a background process that attempts to write to the terminal. | Stop | Used for job control. | - diff --git a/py/engdoc/index.md b/py/engdoc/index.md index 57909edcdb..d62a62376a 100644 --- a/py/engdoc/index.md +++ b/py/engdoc/index.md @@ -19,7 +19,7 @@ tools for testing and debugging. The following language runtimes are supported: |------------------|---------|--------------| | Node.js | 22.0+ | 1 | | Go | 1.22+ | 1 | -| Python | 3.12+ | 1 | +| Python | 3.10+ | 1 | It is designed to work with any generative AI model API or vector database. While we offer integrations for Firebase and Google Cloud, you can use Genkit @@ -52,25 +52,40 @@ capabilities in code: | Feature | Python | JavaScript | Go | |-------------------|--------|------------|----| | Agents | ❌ | ✅ | ✅ | -| Chat | ❌ | ✅ | ✅ | -| Data retrieval | ❌ | ✅ | ✅ | -| Generation | ❌ | ✅ | ✅ | -| Prompt templating | ❌ | ✅ | ✅ | -| Structured output | ❌ | ✅ | ✅ | -| Tool calling | ❌ | ✅ | ✅ | +| Chat | ✅ | ✅ | ✅ | +| Data retrieval | ✅ | ✅ | ✅ | +| Generation | ✅ | ✅ | ✅ | +| Prompt templating | ✅ | ✅ | ✅ | +| Structured output | ✅ | ✅ | ✅ | +| Tool calling | ✅ | ✅ | ✅ | ### Plugin Parity -| Plugins | Python | JavaScript | Go | -|--------------|--------|------------|----| -| Chroma DB | ❌ | ✅ | ✅ | -| Dotprompt | ❌ | ✅ | ✅ | -| Firebase | ❌ | ✅ | ✅ | -| Google AI | ❌ | ✅ | ✅ | -| Google Cloud | ❌ | ✅ | ✅ | -| Ollama | ❌ | ✅ | ✅ | -| Pinecone | ❌ | ✅ | ✅ | -| Vertex AI | ❌ | ✅ | ✅ | +| Plugins | Python | JavaScript | Go | +|------------------------|--------|------------|----| +| Amazon Bedrock | ✅ | — | — | +| Anthropic | ✅ | — | — | +| Checks | ✅ | ✅ | — | +| Cloudflare Workers AI | ✅ | — | — | +| Cohere | ✅ | — | — | +| Compat-OAI | ✅ | — | — | +| DeepSeek | ✅ | — | — | +| Dev Local Vectorstore | ✅ | ✅ | — | +| Dotprompt | ✅ | ✅ | ✅ | +| Evaluators | ✅ | ✅ | — | +| FastAPI | ✅ | — | — | +| Firebase | ✅ | ✅ | ✅ | +| Flask | ✅ | — | — | +| Google Cloud | ✅ | ✅ | ✅ | +| Google GenAI | ✅ | ✅ | ✅ | +| Hugging Face | ✅ | — | — | +| MCP | ✅ | — | — | +| Microsoft Foundry | ✅ | — | — | +| Mistral | ✅ | — | — | +| Observability | ✅ | — | — | +| Ollama | ✅ | ✅ | — | +| Vertex AI | ✅ | ✅ | ✅ | +| xAI | ✅ | — | — | ## Examples @@ -78,38 +93,31 @@ capabilities in code: === "Python" - ```python hl_lines="12 13 14 15 17 20 21 22" linenums="1" + ```python linenums="1" import asyncio - import structlog - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash + from genkit.ai import Genkit + from genkit.plugins.google_genai import GoogleAI - logger = structlog.get_logger() + ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', + ) async def main() -> None: - ai = genkit({ # (1)! - plugins: [googleAI()], - model: gemini15Flash, - }) + response = await ai.generate(prompt='Why is AI awesome?') + print(response.text) - response = await ai.generate('Why is AI awesome?') - await logger.adebug(response.text) - - stream, _ = ai.generate_stream("Tell me a story") + stream, _ = ai.generate_stream(prompt='Tell me a story') async for chunk in stream: - await logger.adebug("Received chunk", text=chunk.text) - await logger.adebug("Finished generating text stream") + print(chunk.text, end='') if __name__ == '__main__': - asyncio.run(content_generation()) + asyncio.run(main()) ``` -1. :man_raising_hand: Basic example of annotation. 
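+    To inspect the resulting traces in the Genkit developer UI, the script can
+    be launched through the Genkit CLI (assuming the code above is saved as
+    `sample.py`):
+
+    ```bash
+    genkit start -- uv run sample.py
+    ```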
- === "JavaScript" ```javascript @@ -148,42 +156,37 @@ capabilities in code: ```python import asyncio - import structlog + from enum import Enum - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash + from pydantic import BaseModel - logger = structlog.get_logger() + from genkit.ai import Genkit, Output + from genkit.plugins.google_genai import GoogleAI + + ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', + ) - from pydantic import BaseModel, Field, validator - from enum import Enum class Role(str, Enum): KNIGHT = "knight" MAGE = "mage" ARCHER = "archer" + class CharacterProfile(BaseModel): name: str role: Role backstory: str - async def main() -> None: - ai = genkit({ - plugins: [googleAI()], - model: gemini15Flash, - }) - await logger.adebug("Generating structured output", prompt="Create a brief profile for a character in a fantasy video game.") + async def main() -> None: response = await ai.generate( prompt="Create a brief profile for a character in a fantasy video game.", - output={ - "format": "json", - "schema": CharacterProfile, - }, + output=Output(schema=CharacterProfile), ) - await logger.ainfo("Generated output", output=response.output) + print(response.output) if __name__ == "__main__": @@ -224,51 +227,30 @@ capabilities in code: ```python import asyncio - import structlog - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash from pydantic import BaseModel, Field - logger = structlog.get_logger() - - - class GetWeatherInput(BaseModel): - location: str = Field(description="The location to get the current weather for") - + from genkit.ai import Genkit + from genkit.plugins.google_genai import GoogleAI - class GetWeatherOutput(BaseModel): - weather: str + ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', + ) - async def get_weather(input: GetWeatherInput) -> GetWeatherOutput: - await logger.adebug("Calling get_weather tool", location=input.location) - # Replace this with an actual API call to a weather service - weather_info = f"The current weather in {input.location} is 63°F and sunny." - return GetWeatherOutput(weather=weather_info) + @ai.tool() + async def get_weather(location: str = Field(description="The location to get the current weather for")) -> str: + """Gets the current weather in a given location.""" + return f"The current weather in {location} is 63°F and sunny." 
async def main() -> None: - ai = genkit({ - plugins: [googleAI()], - model: gemini15Flash, - }) - - get_weather_tool = ai.define_tool( - name="getWeather", - description="Gets the current weather in a given location", - input_schema=GetWeatherInput, - output_schema=GetWeatherOutput, - func=get_weather, - ) - - await logger.adebug("Generating text with tool", prompt="What is the weather like in New York?") response = await ai.generate( prompt="What is the weather like in New York?", - tools=[get_weather_tool], + tools=['get_weather'], ) - await logger.ainfo("Generated text", text=response.text) + print(response.text) if __name__ == "__main__": @@ -317,43 +299,30 @@ capabilities in code: ```python import asyncio - import structlog - - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash - from pydantic import BaseModel, Field - logger = structlog.get_logger() + from genkit.ai import Genkit + from genkit.plugins.google_genai import GoogleAI - - class ChatResponse(BaseModel): - text: str - - - async def chat(input: str) -> ChatResponse: - await logger.adebug("Calling chat tool", input=input) - # Replace this with an actual API call to a language model, - # providing the user query and the conversation history. - response_text = "Ahoy there! Your name is Pavel, you scurvy dog!" - return ChatResponse(text=response_text) + ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', + ) async def main() -> None: - ai = genkit({ - plugins: [googleAI()], - model: gemini15Flash, - }) - - chat_tool = ai.chat({system: 'Talk like a pirate'}) - - await logger.adebug("Calling chat tool", input="Hi, my name is Pavel") - response = await chat_tool.send("Hi, my name is Pavel") - - await logger.adebug("Calling chat tool", input="What is my name?") - response = await chat_tool.send("What is my name?") + response = await ai.generate( + prompt='Hi, my name is Pavel', + system='Talk like a pirate', + ) + print(response.text) - await logger.ainfo("Chat response", text=response.text) + response = await ai.generate( + prompt='What is my name?', + system='Talk like a pirate', + messages=response.messages, + ) + print(response.text) + # Ahoy there! Your name is Pavel, you scurvy dog! if __name__ == "__main__": @@ -385,7 +354,8 @@ capabilities in code: === "Python" ```python - + # Not yet implemented in Python. + # See: https://github.com/firebase/genkit/pull/4212 ``` === "JavaScript" @@ -438,46 +408,39 @@ capabilities in code: ```python import asyncio - import structlog - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash, textEmbedding004 - from genkit.plugins.dev_local_vectorstore import devLocalVectorstore, devLocalRetrieverRef + from genkit.ai import Genkit + from genkit.plugins.google_genai import GoogleAI + from genkit.plugins.dev_local_vectorstore import DevLocalVectorstore - logger = structlog.get_logger() + ai = Genkit( + plugins=[ + GoogleAI(), + DevLocalVectorstore( + indexes=[{ + 'index_name': 'BobFacts', + 'embedder': 'googleai/text-embedding-004', + }], + ), + ], + model='googleai/gemini-2.0-flash', + ) async def main() -> None: - ai = genkit( - plugins=[ - googleAI(), - devLocalVectorstore( - [ - { - "index_name": "BobFacts", - "embedder": textEmbedding004, - } - ] - ), - ], - model=gemini15Flash, - ) - - retriever = devLocalRetrieverRef("BobFacts") - query = "How old is Bob?" 
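+        # The query drives both retrieval and the final generate() call below.
+        query = "How old is Bob?"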
- await logger.adebug("Retrieving documents", query=query) - docs = await ai.retrieve(retriever=retriever, query=query) + docs = await ai.retrieve( + retriever='devLocalVectorstore/BobFacts', + query=query, + ) - await logger.adebug("Generating answer", query=query) response = await ai.generate( - prompt=f"Use the provided context from the BobFacts database to answer this query: {query}", + prompt=f"Use the provided context to answer: {query}", docs=docs, ) + print(response.text) - await logger.ainfo("Generated answer", answer=response.text) if __name__ == "__main__": asyncio.run(main()) diff --git a/py/engdoc/model-conformance-roadmap.md b/py/engdoc/model-conformance-roadmap.md deleted file mode 100644 index 9f182de450..0000000000 --- a/py/engdoc/model-conformance-roadmap.md +++ /dev/null @@ -1,491 +0,0 @@ -# Model Conformance Testing Plan for Python Plugins - -> **Status:** Infrastructure + Native Runner Complete (P0–P3 done, P4 pending manual validation) -> **Date:** 2026-02-11 (updated) -> **Owner:** Python Genkit Team -> **Scope:** Phase 1 covers google-genai, anthropic, and compat-oai (OpenAI). -> All 13 plugins have entry points and specs. Native test runner replaces -> genkit CLI dependency. Unified multi-runtime table. - ---- - -## Problem Statement - -The Genkit CLI provides a `genkit dev:test-model` command -([genkit-tools/cli/src/commands/dev-test-model.ts][dev-test-model]) that runs -standardized conformance tests against model providers. This command already -works cross-runtime (JS and Python) via the reflection API, but we have no -Python-side conformance test specs, entry points, or automation to exercise it. - -We need to: - -1. Verify that Python model provider plugins produce correct responses for the - same test cases used by JS plugins. -2. Establish a repeatable, per-plugin conformance testing workflow. -3. Identify and close feature parity gaps between Python and JS plugins. - -[dev-test-model]: https://github.com/firebase/genkit/blob/main/genkit-tools/cli/src/commands/dev-test-model.ts - ---- - -## Architecture - -The `conform` tool supports two execution modes: - -``` - py/bin/conform check-model [PLUGIN...] - | - +--------+--------+ - | | - default (native) --use-cli (legacy) - | | - +---------+---------+ | - | | | v - python js go genkit dev:test-model - | | | | - InProcess Reflection Reflection - Runner Runner Runner - | | | | - import subprocess subprocess subprocess - entry.py entry.ts entry.go genkit CLI - | | | | - action. async HTTP async HTTP | - arun_raw reflection reflection | - | | | | - +----+----+----+----+ | - | | | - 10 Validators | | - (1:1 with JS) | | - | | | - Unified Results Table | - (Runtime column when v - multiple runtimes) Legacy per-runtime - tables -``` - -**Native runner (default):** - -1. For Python: imports `conformance_entry.py` in-process, calls - `action.arun_raw()` directly (no subprocess, no HTTP, no genkit CLI). -2. For JS/Go: starts the entry point subprocess, discovers the reflection - server via `.genkit/runtimes/*.json`, communicates via async HTTP. -3. 10 validators ported 1:1 from the canonical JS source. -4. Results displayed in a unified table with Runtime column. - -**Legacy CLI runner (`--use-cli`):** - -1. Delegates to `genkit dev:test-model` via subprocess. -2. Discovers the running Python runtime via `.genkit/runtimes/*.json`. -3. Sends standardized test requests via `POST /api/runAction`. -4. Validates responses using built-in validators. 
- ---- - -## Cross-Runtime Feature Parity Analysis - -### Plugins with JS Counterparts - -| Plugin | JS Location | JS Models | Python Models | Parity | Gaps in Python | Python Extras | -|--------|-------------|-----------|---------------|--------|----------------|---------------| -| **google-genai** | In-repo `js/plugins/google-genai/` | 24 (Gemini, TTS, Gemini-Image, Gemma, Imagen, Veo) | 23+ (same families) | **Partial** | Imagen under `googleai/` prefix (only registered under `vertexai/`) | More legacy Gemini preview versions | -| **anthropic** | In-repo `js/plugins/anthropic/` | 8 (Claude 3-haiku through opus-4-5) | 8 (identical list and capabilities) | **Full** | None | None | -| **compat-oai** | In-repo `js/plugins/compat-oai/` | 49 (30 chat, 2 image gen, 3 TTS, 3 STT, 3 embed, 2 DeepSeek, 6 xAI) | 30+ (22+ chat, 2 image gen, 3 TTS, 3 STT, 3 embed) | **Full** | Vision (gpt-4-vision*), gpt-4-32k (older models) | DeepSeek/xAI split into dedicated plugins | -| **ollama** | In-repo `js/plugins/ollama/` | Dynamic discovery | Dynamic discovery | **Full** | Cosmetic: JS declares `media=true`, `toolChoice=true`; Python omits | Python declares `output=['text','json']` | -| **amazon-bedrock** | External [aws-bedrock-js-plugin][bedrock-js] | ~35 (Amazon, Claude 2-3.7, Cohere, Mistral, AI21, Llama) | 50+ (all JS models included) | **Python superset** | None | DeepSeek, Gemma, NVIDIA, Qwen, Writer, Moonshot, newer Claude 4.x | -| **microsoft-foundry** | External [azure-foundry-js-plugin][foundry-js] | ~32 chat + DALL-E + TTS + Whisper + embed | 30+ chat + embed + dynamic catalog | **Partial** | DALL-E image gen, TTS, Whisper STT | Claude, DeepSeek, Grok, Llama, Mistral; dynamic Azure catalog (11k+ models) | -| **deepseek** | JS: in `compat-oai` as `deepseek/` prefix | 2 (deepseek-chat, deepseek-reasoner) | 4 (+ deepseek-v3, deepseek-r1) | **Python superset** | None | 2 additional models | -| **xai** | JS: in `compat-oai` as `xai/` prefix | 6 (grok-3 family, grok-2-vision, grok-2-image) | 6 (grok-3 family, grok-4, grok-2-vision) | **Partial** | Image gen (grok-2-image-1212) | grok-4 (newer model) | - -[bedrock-js]: https://github.com/genkit-ai/aws-bedrock-js-plugin -[foundry-js]: https://github.com/genkit-ai/azure-foundry-js-plugin - -### Python-Only Plugins (no JS counterpart) - -| Plugin | Models | Notes | -|--------|--------|-------| -| **mistral** | 30+ (Large 3, Medium 3.1, Small 3.2, Ministral 3, Magistral, Codestral, Devstral, Voxtral, Pixtral, Embed) | No JS plugin exists. PR #4485: embeddings + streaming fix. PR #4486: full capability update. 
| -| **huggingface** | 10+ popular models + any HF model ID | No JS plugin exists | -| **cloudflare-workers-ai** | 15+ (Llama, Mistral, Qwen, Gemma, Phi, DeepSeek) | No JS plugin exists | - -### Gaps Summary (Ordered by Priority) - -| Priority | Plugin | Gap | Impact | Fix Effort | -|----------|--------|-----|--------|------------| -| **HIGH** | google-genai | Imagen under `googleai/` prefix | Blocks spec symlink for conformance tests | Low (~20 lines in `google.py`) | -| ~~MEDIUM~~ | compat-oai | ~~Image gen (dall-e-3, gpt-image-1)~~ | ✅ Done (PR #4477) | -- | -| ~~MEDIUM~~ | compat-oai | ~~TTS (tts-1, tts-1-hd, gpt-4o-mini-tts)~~ | ✅ Done (PR #4477) | -- | -| ~~MEDIUM~~ | compat-oai | ~~STT (whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe)~~ | ✅ Done (PR #4477) | -- | -| **MEDIUM** | microsoft-foundry | DALL-E, TTS, Whisper | Mirrors compat-oai gaps | Medium | -| **LOW** | xai | Image gen (grok-2-image-1212) | Single model missing | Medium (new handler) | -| **LOW** | compat-oai | Vision models (gpt-4-vision*), gpt-4-32k | Older models, multimodal works via gpt-4o | Low (add model defs) | -| **LOW** | ollama | `media`, `toolChoice` metadata | Cosmetic only, no functional impact | Trivial | - ---- - -## Dependency Graph - -All tasks for Phase 1 and their dependency relationships: - -``` -DEPENDENCY GRAPH -================ - - +-----------------+ +-----------------+ - | fix-imagen-gap | | setup-dir | - | (P0) | | (P0) | - +----+-------+----+ +--+---------+--+-+ - | | | | | - | +----+---------------+ | | - | | | | | - +----v--v-+ +----v-----------+ +-----v--+ +-----v--------+ - | symlink | | entry- | | spec- | | spec- | - | gemini | | google-genai | | anthr. | | compat-oai | - | (P1) | | (P1) | | (P1) | | (P1) | - +----+----+ +-------+--------+ +---+----+ +-----+--------+ - | | | | - +-------+-------+-------+-------+--------------+ - | - +----v-----------+ - | runner-script | - | (P2) | - +----+-----------+ - | - +----v-----------+ - | validate- | - | google-genai | - | (P3) | - +----------------+ -``` - -**Edge list (A -> B means "A must complete before B can start"):** - -- `fix-imagen-gap` -> `symlink-gemini-spec` -- `fix-imagen-gap` -> `entry-google-genai` -- `setup-dir` -> `symlink-gemini-spec` -- `setup-dir` -> `entry-google-genai` -- `setup-dir` -> `spec-anthropic` -- `setup-dir` -> `spec-compat-oai` -- `symlink-gemini-spec` -> `runner-script` -- `entry-google-genai` -> `runner-script` -- `spec-anthropic` -> `runner-script` -- `spec-compat-oai` -> `runner-script` -- `runner-script` -> `validate-google-genai` - ---- - -## Phased Execution Plan (Reverse Topological Order) - -Execute each phase to completion before starting the next. **All tasks within a -phase are independent and should run in parallel** for fastest completion. - -**Critical path:** `fix-imagen-gap` -> `symlink-gemini-spec` -> `runner-script` --> `validate-google-genai` - -### Phase 0: Leaves ✅ COMPLETE - -| Task | Description | File(s) | Effort | Status | -|------|-------------|---------|--------|--------| -| `fix-imagen-gap` | GoogleAI already registers Imagen under `googleai/` (verified in code) | `google.py` lines 378-380, 523-527, 596-601 | N/A | ✅ Already done | -| `setup-dir` | Created `py/tests/conformance/` with dirs for all 10 plugins | `py/tests/conformance/{google-genai,anthropic,compat-oai,...}/` | Trivial | ✅ Done | - -**Parallelizable:** Yes, both tasks are independent. 
- -### Phase 1: Specs + Entry Points ✅ COMPLETE - -| Task | Description | Depends On | File(s) | Status | -|------|-------------|------------|---------|--------| -| `symlink-gemini-spec` | Symlinked JS spec into conformance dir | P0 | `google-genai/model-conformance.yaml` → JS spec | ✅ Done | -| `entry-google-genai` | Minimal google-genai entry point | P0 | `google-genai/conformance_entry.py` | ✅ Done | -| `spec-anthropic` | Anthropic entry point + YAML spec | P0 | `anthropic/{conformance_entry.py,model-conformance.yaml}` | ✅ Done | -| `spec-compat-oai` | compat-oai entry point + YAML spec (gpt-4o, gpt-4o-mini, dall-e-3, tts-1) | P0 | `compat-oai/{conformance_entry.py,model-conformance.yaml}` | ✅ Done (updated with multimodal, PR #4477) | - -**Note:** All 10 plugins (including Phase 2 plugins) have entry points and specs. - -### Phase 2: Orchestration ✅ COMPLETE - -| Task | Description | Depends On | File(s) | Status | -|------|-------------|------------|---------|--------| -| `runner-script` | Shell script to orchestrate per-plugin conformance test runs | All Phase 1 tasks | `py/bin/test-model-conformance` | ✅ Done | - -### Phase 2.5: Spec Audit + Model Updates ✅ COMPLETE - -| Task | Description | File(s) | Status | -|------|-------------|---------|--------| -| `audit-specs` | Verified all 11 plugin specs against official provider documentation (Feb 11, 2026). Fixed model names, corrected Supports flags, added missing models. Total: 24 models across 11 plugins. | All `model-conformance.yaml` files | ✅ Done | - -**Changes made during audit:** - -| Plugin | Before | After | Changes | -|--------|--------|-------|---------| -| **anthropic** | 2 models | 4 models | Added claude-sonnet-4-5, claude-opus-4-6 | -| **deepseek** | 1 model (no structured-output) | 2 models | Added structured-output to chat, added deepseek-reasoner (no tools) | -| **xai** | 1 model (grok-3, legacy) | 2 models | Replaced grok-3 → grok-4-fast-non-reasoning, added grok-2-vision-1212 | -| **mistral** | 1 model (no vision) | 2 models | Added vision tests, added mistral-large-latest | -| **amazon-bedrock** | Missing structured-output | Fixed | Added structured-output, streaming-structured-output | -| **cloudflare** | Missing tool-request | Fixed | Added tool-request, streaming-multiturn | -| **ollama** | Missing tool-request, vision | Fixed | Added tool-request, input-image-base64 | - -### Phase 3: Validation ⏳ PENDING - -| Task | Description | Depends On | File(s) | Status | -|------|-------------|------------|---------|--------| -| `validate-google-genai` | Manual end-to-end validation with live API via `genkit dev:test-model` | `runner-script` | -- (manual run) | ⏳ Not yet run | - -### Execution Timeline - -``` -TIME --> -========================================================================== - -P0: [fix-imagen-gap ~~~~~~~~~~~~] [setup-dir ~~~] - (parallel) (parallel) - | - --- all P0 complete ----------------+-------- - | -P1: [symlink-gemini-spec ~] [entry-google-genai ~] - [spec-anthropic ~~~~~~] [spec-compat-oai ~~~~] - (all 4 in parallel) - | - --- all P1 complete --- - | -P2: [runner-script ~~~~~~~~~~~~] - | -P2.5:[audit-specs ~~~~~~~~~] - | -P3: [conform tool ~~~~~~~~~~~~~~~] ← native runner, unified table - | -P4: [validate-google-genai ~~~~] - | - === PHASE 1 SCOPE COMPLETE === -``` - -### Phase 3: Conform CLI Tool + Native Runner ✅ COMPLETE - -| Task | Description | File(s) | Status | -|------|-------------|---------|--------| -| `conform-cli` | Multi-runtime CLI tool (`py/tools/conform/`) | `cli.py`, 
`config.py`, `runner.py`, etc. | ✅ Done (PR #4593) | -| `native-runner` | In-process runner for Python, reflection runner for JS/Go | `test_model.py`, `reflection.py` | ✅ Done | -| `validators` | 10 validators ported 1:1 from JS canonical source | `validators/*.py` | ✅ Done | -| `unified-table` | Single table with Runtime column across runtimes | `display.py`, `types.py` | ✅ Done | -| `global-flags` | `--runtime` accepts matrix (e.g., `python go`), shown in subcommand help | `cli.py` | ✅ Done | -| `remove-test-model` | Merged into `check-model` (native runner is default, `--use-cli` for legacy) | `cli.py` | ✅ Done | - -### Phase 4: Validation ⏳ PENDING - ---- - -## What To Build - -### Prerequisite: Fix Imagen Gap in Python google-genai Plugin - -The JS plugin supports Imagen under the `googleai/` prefix but the Python plugin -only registers it under `vertexai/`. The `ImagenModel` class is already -client-agnostic (uses `client.aio.models.generate_images()` which works for -both); only the registration code needs updating. - -**File:** `py/plugins/google-genai/src/genkit/plugins/google_genai/google.py` - -**Changes (~20 lines):** - -1. **`GoogleAI.init()`** -- Add Imagen model loop after Gemini registration: - ```python - for name in genai_models.imagen: - actions.append(self._resolve_model(googleai_name(name))) - ``` -2. **`GoogleAI._resolve_model()`** -- Add Imagen detection branch (mirror - VertexAI logic): - ```python - if clean_name.lower().startswith('imagen'): - model_ref = vertexai_image_model_info(clean_name) - model = ImagenModel(clean_name, self._client) - IMAGE_SUPPORTED_MODELS[clean_name] = model_ref - config_schema = ImagenConfigSchema - # ... create and return Action - ``` -3. **`GoogleAI.list_actions()`** -- Include Imagen in discovered actions list: - ```python - for name in genai_models.imagen: - actions_list.append( - model_action_metadata( - name=googleai_name(name), - info=vertexai_image_model_info(name).model_dump(by_alias=True), - config_schema=ImagenConfigSchema, - ) - ) - ``` - -### Directory Layout - -All conformance testing files live under `py/tests/conform/`: - -``` -py/tests/conform/ - google-genai/ - conformance_entry.py # minimal Genkit entry point - model-conformance.yaml -> symlink # -> js/plugins/google-genai/tests/model-tests-tts.yaml - anthropic/ - conformance_entry.py - model-conformance.yaml # anthropic-specific spec - compat-oai/ - conformance_entry.py - model-conformance.yaml # openai-specific spec - ...13 plugins total... -py/tools/conform/ # conform CLI tool - src/conform/ - cli.py # arg parsing + dispatch - config.py # TOML config loader - runner.py # legacy genkit CLI runner - test_model.py # native runner + ActionRunner Protocol - reflection.py # async HTTP client for reflection API - validators/ # 10 validators (1:1 with JS) -py/bin/conform # wrapper script -``` - -### Entry Point Template - -Each plugin gets a minimal Python script that initializes Genkit with just that -plugin. The reflection server starts automatically in dev mode (`GENKIT_ENV=dev`, -set by `genkit start`). 
- -```python -"""Minimal entry point for model conformance testing via genkit dev:test-model.""" -import asyncio -from genkit.ai import Genkit -from genkit.plugins.google_genai import GoogleAI # varies per plugin - -ai = Genkit(plugins=[GoogleAI()]) - -async def main(): - while True: - await asyncio.sleep(3600) - -if __name__ == '__main__': - ai.run_main(main()) -``` - -### Spec Files - -**google-genai:** Symlink to the JS spec file so both runtimes test the same -models with the same expectations: - -```bash -# From py/tests/conformance/google-genai/ -ln -s "$(git rev-parse --show-toplevel)/js/plugins/google-genai/tests/model-tests-tts.yaml" model-conformance.yaml -``` - -The JS spec tests: -- `googleai/imagen-4.0-generate-001` (output-image) -- `googleai/gemini-2.5-flash-preview-tts` (custom TTS test) -- `googleai/gemini-2.5-pro` (tool-request, structured-output, multiturn, system-role, image-base64, image-url, video-youtube) -- `googleai/gemini-3-pro-preview` (same + reasoning, streaming, tool-response custom tests) -- `googleai/gemini-2.5-flash` (same as gemini-2.5-pro) - -Env: `GEMINI_API_KEY` - -**anthropic:** New spec. Models: `anthropic/claude-sonnet-4` and -`anthropic/claude-haiku-4-5`. Tests: tool-request, multiturn, system-role, -input-image-base64, input-image-url, streaming-multiturn, streaming-tool-request. -Haiku-4-5 adds structured-output and streaming-structured-output. - -Env: `ANTHROPIC_API_KEY` - -**compat-oai (OpenAI):** New spec. Models: `openai/gpt-4o` and -`openai/gpt-4o-mini`. Tests: tool-request, structured-output, multiturn, -system-role, input-image-base64, input-image-url, streaming-multiturn, -streaming-tool-request, streaming-structured-output. - -Env: `OPENAI_API_KEY` - -### Conform CLI Tool - -**Location:** `py/bin/conform` (wrapper) → `py/tools/conform/` - -```bash -# Usage: -conform check-model # test all plugins, all runtimes -conform check-model anthropic xai # test specific plugins -conform --runtime python go check-model # matrix: python + go only -conform check-model --use-cli # legacy genkit CLI fallback -conform list # show readiness table -conform check-plugin # lint-time file check -``` - -The tool: -- Uses the native runner by default (in-process for Python, async HTTP for JS/Go) -- Falls back to `genkit dev:test-model` subprocess with `--use-cli` -- Runs across all configured runtimes by default (`--runtime` for matrix) -- Shows a unified table with Runtime column across runtimes -- Reports aggregate pass/fail and exits non-zero on failure - -> **Note:** [`uv`](https://docs.astral.sh/uv/) is the project's standard Python -> package manager and task runner, already used throughout the repository (see -> `py/pyproject.toml` workspace configuration and `py/bin/` scripts). It is -> installed as part of the developer setup via `bin/setup`. 
- -### Built-in Test Capabilities - -The following test types are available from `dev:test-model` (from -[dev-test-model.ts lines 254-476][dev-test-model]): - -| Test | Description | -|------|-------------| -| `tool-request` | Tool/function calling conformance | -| `structured-output` | JSON schema output | -| `multiturn` | Multi-turn conversation | -| `streaming-multiturn` | Streaming + multiturn | -| `streaming-tool-request` | Streaming tool calls | -| `streaming-structured-output` | Streaming structured output | -| `system-role` | System message handling | -| `input-image-base64` | Base64 image input | -| `input-image-url` | URL image input | -| `input-video-youtube` | YouTube video input | -| `output-audio` | TTS/audio output | -| `output-image` | Image generation | - -### Built-in Validators - -`has-tool-request[:toolName]`, `valid-json`, `text-includes:expected`, -`text-starts-with:prefix`, `text-not-empty`, `valid-media:type`, `reasoning`, -plus streaming variants (`stream-text-includes`, `stream-has-tool-request`, -`stream-valid-json`). - ---- - -## Phase 2 (Future -- after Phase 1 validated) - -Add conformance specs for remaining plugins. The parity analysis above informs -which capabilities to test per plugin: - -| Plugin | Test Capabilities | Notes | -|--------|-------------------|-------| -| **mistral** | tool-request, structured-output, multiturn, system-role, streaming-multiturn, input-image-base64, input-image-url | All Large 3/Medium 3.1/Small 3.2/Ministral 3/Magistral support vision. Voxtral adds audio input. | -| **deepseek** | tool-request, structured-output, multiturn, system-role, streaming-multiturn | | -| **xai** | tool-request, structured-output, multiturn, system-role, streaming-multiturn | grok-2-vision adds input-image | -| **ollama** | tool-request, structured-output, multiturn, system-role | Depends on locally installed model | -| **amazon-bedrock** | tool-request, structured-output, multiturn, system-role, streaming-multiturn, input-image-base64 | Model-dependent | -| **huggingface** | tool-request, structured-output, multiturn, system-role | Model-dependent | -| **microsoft-foundry** | tool-request, structured-output, multiturn, system-role, streaming-multiturn, input-image-base64 | Model-dependent | -| **cloudflare-workers-ai** | tool-request, structured-output, multiturn, system-role | Model-dependent | - ---- - -## CI Integration Notes - -- These are **live API tests** -- they call real model endpoints. Do NOT run in - standard CI. -- Gate behind manual trigger or CI label (e.g., `run-conformance-tests`). -- Each plugin requires its own API key/credentials. -- Consider a `--dry-run` mode in the runner script that validates spec files - parse correctly without making API calls. 
- ---- - -## Effort Estimates - -| Phase | Tasks | Effort | Parallelizable | -|-------|-------|--------|----------------| -| **P0** | 2 tasks (fix-imagen-gap, setup-dir) | ~1 hour | Yes | -| **P1** | 4 tasks (symlink, entry, 2 specs) | ~2 hours | Yes | -| **P2** | 1 task (runner script) | ~1 hour | No | -| **P3** | 1 task (E2E validation) | ~1 hour | No | -| **Total** | 8 tasks | ~3-5 hours (with parallelism) | | diff --git a/py/engdoc/parity-analysis/feature_parity_analysis.md b/py/engdoc/parity-analysis/feature_parity_analysis.md deleted file mode 100644 index ba2961a780..0000000000 --- a/py/engdoc/parity-analysis/feature_parity_analysis.md +++ /dev/null @@ -1,653 +0,0 @@ -# Genkit Feature Parity Analysis: JS vs Python - -This document analyzes feature gaps and behavioral differences between the JavaScript (canonical) implementation and the Python implementation of Genkit. - ---- - -## Executive Summary - -| Category | JS | Python | Gap | -|----------|-----|--------|-----| -| Plugins | 18 | 13 | 5 missing | -| Core API Methods | ~45 | ~35 | 10+ missing | -| Session/Chat | ✅ | ❌ | **Critical Gap** | -| Background Actions | ✅ | ❌ | **Critical Gap** | -| Dynamic Action Provider | ✅ | ❌ | Significant Gap | - ---- - -## 1. Missing Core Features - -### 1.1 Session & Chat (Critical Gap) - -> [!CAUTION] -> Python lacks stateful conversation management entirely. - -**JS has:** -- [Session](/js/ai/src/session.ts) class with: - - `updateState(data)` - Update session state - - `updateMessages(thread, messages)` - Manage thread history - - `chat()` - Create chat sessions with thread support - - `run(fn)` - Execute within session context - - `toJSON()` - Serialize session -- [Chat](/js/ai/src/chat.ts) class with: - - `send(options)` - Send message with history - - `sendStream(options)` - Streaming with history - - `messages()` - Get conversation history - - Thread management (multiple conversations per session) -- `SessionStore` interface for persistence -- `ai.createSession()` and `ai.chat()` veneer methods - -**Python has:** Nothing equivalent. No way to maintain conversation history across multiple `generate()` calls without manual message management. - ---- - -### 1.2 Background Actions & Background Models (Critical Gap) - -> [!CAUTION] -> Python lacks long-running operation support. - -**JS has:** -- [BackgroundAction](/js/core/src/background-action.ts): - - `start(input, options)` - Start background operation - - `check(operation)` - Check operation status - - `cancel(operation)` - Cancel running operation -- `Operation` type with `id`, `done`, `output`, `error`, `metadata` -- `defineBackgroundAction()` - Register background actions -- `defineBackgroundModel()` - Register models that return operations (e.g., video generation) -- [ai.checkOperation()](/js/genkit/src/genkit.ts#L866-L886) - Veneer method - -**Python has:** Nothing. Cannot use models like Veo that return operations for later retrieval. - ---- - -### 1.3 Dynamic Action Provider (Significant Gap) - -**JS has:** -- [DynamicActionProvider](/js/core/src/dynamic-action-provider.ts): - - Caching with configurable TTL - - `invalidateCache()` - Force refresh - - `getAction(type, name)` - Resolve action dynamically - - `listActionMetadata(type, name)` - List available actions - - Used by MCP plugin for dynamic tool discovery - -**Python has:** Nothing. The MCP plugin must pre-register all actions. 
- ---- - -### 1.4 Missing Veneer Methods - -| JS Method | Description | Python Status | -|-----------|-------------|---------------| -| `ai.createSession()` | Create stateful session | ❌ Missing | -| `ai.chat()` | Quick chat session | ❌ Missing | -| `ai.currentSession()` | Get active session | ❌ Missing | -| `ai.checkOperation()` | Check background op | ❌ Missing | -| `ai.defineSimpleRetriever()` | Simplified retriever | ❌ Missing | -| `ai.defineBackgroundModel()` | Background model | ❌ Missing | -| `ai.defineDynamicActionProvider()` | DAP registration | ❌ Missing | -| `ai.defineJsonSchema()` | JSON Schema registration | ❌ Missing | -| `ai.dynamicTool()` | Unregistered tools | ❌ Missing | -| `ai.run()` | Named trace step | ❌ Missing | -| `ai.embedMany()` | Bulk embedding | ❌ Missing | -| `ai.index()` | Indexing veneer | ❌ Missing | - ---- - -## 2. Plugin Gaps - -### 2.1 Missing Plugins (5) - -| JS Plugin | Description | Priority | -|-----------|-------------|----------| -| `@genkit-ai/checks` | Google Checks for safety | Medium | -| `@genkit-ai/chroma` | Chroma vector store | Low | -| `@genkit-ai/cloud-sql-pg` | Cloud SQL PostgreSQL | Medium | -| `@genkit-ai/pinecone` | Pinecone vector store | Medium | -| `@genkit-ai/langchain` | LangChain integration | Low | -| `@genkit-ai/next` | Next.js integration | N/A (Python irrelevant) | -| `@genkit-ai/express` | Express integration | N/A (Flask exists) | -| `@genkit-ai/googleai` | Legacy Google AI | Being deprecated | - -### 2.2 Plugin Feature Gaps - -#### Vertex AI Plugin - -| Feature | JS | Python | -|---------|-----|--------| -| Gemini Models | ✅ | ✅ | -| Imagen Models | ✅ | ✅ (via google-genai) | -| Embedders | ✅ | Limited | -| Rerankers | ✅ | ❌ Missing | -| Context Caching | ✅ | ❌ Missing | -| Vector Search | ✅ | ✅ | -| Evaluation | ✅ | ❌ Missing | -| Model Garden | ✅ | ✅ | - -#### Google GenAI Plugin - -| Feature | JS | Python | -|---------|-----|--------| -| Gemini Models | ✅ | ✅ | -| Imagen Models | ✅ | ✅ | -| Embedders | ✅ | ✅ | -| Context Caching | ✅ | ❌ Missing | -| Live/Realtime | ✅ | ❌ Missing | - ---- - -## 2.3 Prompt API (`ai.prompt()` / `ai.definePrompt()`) - -### JS API - -```typescript -// Lookup prompt by name -ai.prompt(name: string, options?: { variant?: string }) - : ExecutablePrompt - -// Define prompt with config + template/function -ai.definePrompt({ - name: string, - model?: string, - input?: { schema: z.ZodSchema }, - output?: { schema: z.ZodSchema }, - config?: GenerationConfig, - messages?: string | ((input) => Message[]), // Template string or function - tools?: ToolRef[], -}, templateOrFn?) 
-``` - -**Key JS Features:** -- Generic type parameters `` for type-safe input/output -- `messages` can be a Dotprompt template string -- Returns `ExecutablePrompt` with `()` call and `.stream()` method -- Automatic `.prompt` file loading from `promptDir` - -### Python API - -```python -# Lookup prompt by name -await ai.prompt(name: str, variant: str | None = None) - -> ExecutablePrompt - -# Define prompt with explicit kwargs -ai.define_prompt( - name: str | None = None, - variant: str | None = None, - model: str | None = None, - config: GenerationCommonConfig | dict | None = None, - description: str | None = None, - input_schema: type | dict | str | None = None, - system: str | Part | list[Part] | Callable | None = None, - prompt: str | Part | list[Part] | Callable | None = None, - messages: str | list[Message] | Callable | None = None, - output_format: str | None = None, - output_content_type: str | None = None, - output_instructions: bool | str | None = None, - output_schema: type | dict | str | None = None, - output_constrained: bool | None = None, - max_turns: int | None = None, - return_tool_requests: bool | None = None, - tools: list[str] | None = None, - tool_choice: ToolChoice | None = None, - use: list[ModelMiddleware] | None = None, - docs: list[DocumentData] | Callable | None = None, - metadata: dict | None = None, -) -``` - -**Key Differences:** - -| Feature | JS | Python | Notes | -|---------|-----|--------|-------| -| Type generics | ✅ `` | ❌ | No typed input/output | -| Sync lookup | ✅ Sync | ❌ `async` | Python requires `await` | -| Separate `system` param | ❌ | ✅ | Python has dedicated system param | -| Separate `prompt` param | ❌ | ✅ | Python can pass prompt separately | -| `output_*` params | Combined in `output` | ✅ Explicit | More granular in Python | -| `docs` param | ❌ | ✅ | Python has docs for RAG | -| `max_turns` | In config | ✅ Direct | Easier access in Python | -| Template strings | ✅ Dotprompt | ✅ Handlebars | Both support templates | -| `.prompt` file loading | ✅ Auto | ✅ Auto | Both support file loading | - -### ExecutablePrompt Comparison - -**JS:** -```typescript -const result = await myPrompt({ name: 'value' }); -const { stream, response } = await myPrompt.stream({ name: 'value' }); -``` - -**Python:** -```python -result = await my_prompt(name='value') -stream, response = await my_prompt.stream(name='value') -``` - -> [!NOTE] -> Python's prompt API is **more complete** in some ways (explicit `system`, `docs`, `max_turns`), but lacks the type safety of JS generics. 
- ---- - -## 2.4 Complete API Surface Comparison - -### Method Parity Matrix - -| Method | JS | Python | Notes | -|--------|-----|--------|-------| -| **Core Generation** | -| `generate()` | ✅ | ✅ | Both support | -| `generateStream()` | ✅ | ✅ `generate_stream()` | Name differs | -| `checkOperation()` | ✅ | ❌ | Python missing (needed for Veo) | -| **Prompts** | -| `prompt()` | ✅ Sync | ✅ `async` | Python requires await | -| `definePrompt()` | ✅ | ✅ `define_prompt()` | Both support | -| **Flows** | -| `defineFlow()` | ✅ | ✅ `@ai.flow()` decorator | Python uses decorator | -| `run()` | ✅ | ❌ | Step tracing missing in Python | -| `currentContext()` | ✅ | ❌ | Python missing | -| **Tools** | -| `defineTool()` | ✅ | ✅ `@ai.tool()` decorator | Python uses decorator | -| `dynamicTool()` | ✅ | ❌ | Python missing | -| **Models** | -| `defineModel()` | ✅ | ✅ `define_model()` | Both support | -| `defineBackgroundModel()` | ✅ | ❌ | Python missing | -| **RAG** | -| `retrieve()` | ✅ | ✅ | Both support | -| `index()` | ✅ | ✅ | Both support | -| `defineRetriever()` | ✅ | ✅ `define_retriever()` | Both support | -| `defineSimpleRetriever()` | ✅ | ❌ | Python missing | -| `defineIndexer()` | ✅ | ✅ `define_indexer()` | Both support | -| **Embeddings** | -| `embed()` | ✅ | ✅ | Both support | -| `embedMany()` | ✅ | ❌ | Python missing | -| `defineEmbedder()` | ✅ | ✅ `define_embedder()` | Both support | -| **Reranking** | -| `rerank()` | ✅ | ✅ | Both support | -| `defineReranker()` | ✅ | ✅ `define_reranker()` | Both support | -| **Evaluation** | -| `evaluate()` | ✅ | ✅ | Both support | -| `defineEvaluator()` | ✅ | ✅ `define_evaluator()` | Both support | -| — | — | ✅ `define_batch_evaluator()` | Python extra | -| **Schemas** | -| `defineSchema()` | ✅ | ✅ `define_schema()` | Both support | -| `defineJsonSchema()` | ✅ | ❌ | Python missing | -| **Templates** | -| `defineHelper()` | ✅ | ✅ `define_helper()` | Both support | -| `definePartial()` | ✅ | ✅ `define_partial()` | Both support | -| **Dynamic Actions** | -| `defineDynamicActionProvider()` | ✅ | ❌ | Python missing (MCP) | -| **Formats** | -| — | — | ✅ `define_format()` | Python extra | -| **Resources** | -| — | — | ✅ `define_resource()` | Python extra | -| **Session/Chat** | -| `createSession()` | ✅ | ❌ | Python missing | -| `loadSession()` | ✅ | ❌ | Python missing | -| `chat()` | ✅ | ❌ | Python missing | -| **Lifecycle** | -| `configure()` | ✅ | ❌ | Python uses constructor | -| `stopServers()` | ✅ | ❌ | Python missing | -| `run_main()` | ❌ | ✅ | Python extra | - -### Critical Missing APIs (Python) - -| API | Use Case | Priority | -|-----|----------|----------| -| `checkOperation()` | Poll long-running ops (Veo, Imagen) | P0 | -| `createSession()`/`loadSession()` | Stateful multi-turn | P0 | -| `chat()` | Simple chat interface | P0 | -| `run()` | Step tracing in flows | P1 | -| `dynamicTool()` | Runtime tool creation | P1 | -| `defineBackgroundModel()` | Long-running models | P1 | -| `defineDynamicActionProvider()` | MCP host support | P2 | -| `currentContext()` | Auth/context access | P2 | -| `embedMany()` | Batch embedding | P2 | -| `defineSimpleRetriever()` | Quick retriever setup | P3 | -| `defineJsonSchema()` | Register JSON schemas | P3 | - -### Python Extras (Not in JS) - -| API | Use Case | -|-----|----------| -| `define_batch_evaluator()` | Evaluate entire dataset at once | -| `define_format()` | Register custom output formats | -| `define_resource()` | Register MCP resources | -| `run_main()` | Dev server entry point | - ---- - -## 2.5 Telemetry and 
Tracing Comparison - -### Tracing Infrastructure - -| Feature | JS | Python | Notes | -|---------|-----|--------|-------| -| **OpenTelemetry SDK** | ✅ `@opentelemetry/sdk-node` | ✅ `opentelemetry-sdk` | Both use OTel | -| **TracerProvider** | ✅ | ✅ | Both configure | -| **SimpleSpanProcessor** | ✅ (dev) | ✅ (dev) | Same pattern | -| **BatchSpanProcessor** | ✅ (prod) | ✅ (prod) | Same pattern | -| **RealtimeSpanProcessor** | ✅ | ✅ | Parity achieved | -| **Configurable via env** | ✅ `GENKIT_ENABLE_REALTIME_TELEMETRY` | ✅ | Parity achieved | - -### Realtime Tracing - -> [!NOTE] -> Both JS and Python now have `RealtimeSpanProcessor` that exports spans on **both start AND end**, enabling live trace visualization during development. - -**JS RealtimeSpanProcessor:** -```typescript -class RealtimeSpanProcessor implements SpanProcessor { - onStart(span: Span): void { - // Export immediately for real-time updates - this.exporter.export([span], () => {}); - } - onEnd(span: ReadableSpan): void { - // Export completed span - this.exporter.export([span], () => {}); - } -} -``` - -**Python:** Equivalent implementation in `genkit.core.trace.realtime_processor`: -```python -class RealtimeSpanProcessor(SpanProcessor): - def on_start(self, span: Span, parent_context: Context | None = None) -> None: - # Export immediately for real-time updates - self._exporter.export([span]) - - def on_end(self, span: ReadableSpan) -> None: - # Export completed span - self._exporter.export([span]) -``` - -### Span Exporters - -| Exporter | JS | Python | Notes | -|----------|-----|--------|-------| -| **TelemetryServerExporter** | ✅ `TraceServerExporter` | ✅ `TelemetryServerSpanExporter` | Both have | -| **GCP Cloud Trace** | ✅ | ✅ | Both via plugins | -| **AdjustingTraceExporter** | ✅ (redacts content) | ✅ | Parity achieved | -| **Custom exporter API** | ✅ | ✅ `add_custom_exporter()` | Both support | - -### Telemetry Configuration API - -| API | JS | Python | Notes | -|-----|-----|--------|-------| -| `enableTelemetry(config)` | ✅ | ❌ | Python auto-configures | -| `flushTracing()` | ✅ | ✅ `ai.flush_tracing()` | Parity achieved | -| `cleanUpTracing()` | ✅ | ❌ | Python no cleanup | -| `TelemetryConfig` type | ✅ `Partial` | ❌ | Python untyped | - -### Metrics - -| Metric | JS (GCP Plugin) | Python (GCP Plugin) | -|--------|----------------|---------------------| -| `genkit/ai/generate/requests` | ✅ | ✅ | -| `genkit/ai/generate/failures` | ✅ | ✅ | -| `genkit/ai/generate/latency` | ✅ | ✅ | -| `genkit/ai/generate/input/tokens` | ✅ | ✅ | -| `genkit/ai/generate/output/tokens` | ✅ | ✅ | -| `genkit/ai/generate/input/characters` | ✅ | ✅ | -| `genkit/ai/generate/output/characters` | ✅ | ✅ | -| `genkit/ai/generate/input/images` | ✅ | ✅ | -| `genkit/ai/generate/output/images` | ✅ | ✅ | -| `genkit/ai/generate/input/videos` | ✅ | ✅ | -| `genkit/ai/generate/output/videos` | ✅ | ✅ | -| `genkit/ai/generate/input/audio` | ✅ | ✅ | -| `genkit/ai/generate/output/audio` | ✅ | ✅ | - -> [!NOTE] -> **Metrics parity is good!** Both JS and Python google-cloud plugins record the same AI monitoring metrics. 
- -### GCP Plugin Comparison - -| Feature | JS `@genkit-ai/google-cloud` | Python `google-cloud` | -|---------|------------------------------|----------------------| -| **Cloud Trace export** | ✅ | ✅ | -| **Cloud Metrics export** | ✅ | ✅ | -| **Automatic instrumentation** | ✅ (Pino, Winston) | ❌ | -| **Span adjustment/redaction** | ✅ `AdjustingTraceExporter` | ✅ | Parity achieved | -| **Feature markers** | ✅ (marks genkit spans) | ❌ | - -### Telemetry Gaps Summary - -| Gap | Priority | Status | -|-----|----------|--------| -| ~~**RealtimeSpanProcessor**~~ | ~~P1~~ | ✅ Implemented - live tracing now works | -| ~~**Span redaction**~~ | ~~P2~~ | ✅ Implemented - `AdjustingTraceExporter` | -| ~~**flushTracing() API**~~ | ~~P2~~ | ✅ Implemented - `ai.flush_tracing()` | -| **Logging instrumentation** | P3 | Logs not auto-correlated | -| **enableTelemetry() config** | P3 | Less flexibility | - ---- - -## 3. Model Configuration Not Showing in DevUI - -> [!CAUTION] -> **Critical Bug**: Model configuration options don't appear in DevUI for Python Google GenAI models. - -### Root Cause - -Python's `google.py` has `config_schema` **commented out** when calling `model_action_metadata()`: - -```python -# google.py line 484-490 -actions_list.append( - model_action_metadata( - name=vertexai_name(name), - info=google_model_info(name).model_dump(), - # config_schema=GeminiConfigSchema, # <-- COMMENTED OUT! - ), -) -``` - -Compare to JS which **always passes `configSchema`**: - -```typescript -// gemini.ts line 547-551 -return modelActionMetadata({ - name: ref.name, - info: ref.info, - configSchema: ref.configSchema, // <-- Always included! -}); -``` - -### Impact - -- DevUI cannot display model configuration options (temperature, topP, safety settings, etc.) -- Users cannot adjust model parameters through the UI -- Affects both GoogleAI and VertexAI plugins - -### Fix Required - -1. Uncomment `config_schema=GeminiConfigSchema` in `_resolve_model()` and `list_actions()` -2. Ensure `GeminiConfigSchema` is properly exported and includes all options -3. Verify JSON schema serialization works for Pydantic models - -### Files to Fix -- [google.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/google.py#L243) - `list_actions()` -- [google.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/google.py#L455) - VertexAI `list_actions()` -- [gemini.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/models/gemini.py#L179-L187) - `GeminiConfigSchema` definition - ---- - -## 4. Behavioral Differences - -> [!IMPORTANT] -> These are differences in how features behave, not missing features. - -### 4.1 Streaming Response Handling - -**JS:** Returns `GenerateStreamResponse` with both `stream` (async iterable) and `response` (promise) accessible simultaneously. Stream chunks are also available via `onChunk` callback. - -```typescript -const { response, stream } = ai.generateStream({...}); -for await (const chunk of stream) { - console.log(chunk.text); -} -const final = await response; -``` - -**Python:** Returns GenerateResponseWrapper that requires different access patterns. The `stream` attribute returns an async generator. - -```python -response = await ai.generate_stream(...) -async for chunk in response.stream: - print(chunk.text) -final = await response.response -``` - -**Action Needed:** Verify streaming API ergonomics match JS patterns. 
- ---- - -### 4.2 Tool Call Response Handling - -**JS:** Tools can return `Part[]` when configured with `multipart: true`, allowing rich responses with multiple content types. - -```typescript -ai.defineTool({ multipart: true }, async (input) => { - return [ - { text: "Analysis:" }, - { media: { url: "data:..." } } - ]; -}); -``` - -**Python:** Tools return a single value that gets wrapped. No explicit multipart support. - -**Action Needed:** Add multipart tool support to Python. - ---- - -### 4.3 Output Schema Validation Behavior - -**JS:** Uses Zod schemas. When `output.schema` is specified, attempts to parse and validate response. Returns typed `response.output`. - -**Python:** Uses Pydantic models. Schema is converted to JSON Schema for the model request, but response parsing may differ. - -**Action Needed:** Verify output parsing behavior matches, especially for: -- Partial JSON handling -- Array extraction -- Nested object validation - ---- - -### 4.4 Prompt Resolution Order - -**JS:** -1. Looks in prompt cache (already loaded) -2. Loads from `promptDir` (default: `./prompts`) -3. Checks registered prompts via `definePrompt()` - -**Python:** -1. Checks registry for registered prompts -2. Loads from prompt directory if configured - -**Action Needed:** Verify prompt loading order and caching behavior. - ---- - -### 4.5 Model Middleware Execution - -**JS:** Middleware wraps the generate call with before/after hooks, can modify request and response. - -**Python:** Similar structure but async/await patterns differ. - -**Action Needed:** Verify middleware execution order and error handling. - ---- - -## 5. Type System Differences - -### 5.1 Part Type Construction - -**JS:** -```typescript -{ text: "hello" } // Direct construction -{ media: { url: "..." } } -``` - -**Python:** (After recent fixes) -```python -Part(root=TextPart(text="hello")) # Must use root= -Part(root=MediaPart(media=Media(url="..."))) -``` - -**Status:** Being addressed - needs consistent patterns documented. - ---- - -### 5.2 Schema Registration - -**JS:** -```typescript -ai.defineSchema('Recipe', RecipeSchema); // Zod schema -ai.defineJsonSchema('Recipe', {...}); // JSON Schema -``` - -**Python:** -```python -ai.define_schema('Recipe', Recipe) # Pydantic model only -``` - -**Action Needed:** Add `define_json_schema()` to Python. - ---- - -## 6. Testing & Tooling Gaps - -| Feature | JS | Python | -|---------|-----|--------| -| Echo Model | ✅ | ✅ | -| Programmable Model | ✅ | ❌ | -| Test Action | ✅ | Limited | -| Trace Viewer | ✅ | ✅ | - ---- - -## 7. Priority Recommendations - -### P0 - Critical (Block key use cases) - -1. **Model Config Schema Bug** - DevUI cannot show model parameters (commented out in Python) -2. **Session/Chat** - Required for conversational AI applications -3. **Background Actions** - Required for video/image generation models - -### P1 - High (Significant functionality gaps) - -4. **Context Caching** - Cost optimization for long contexts -5. **Dynamic Action Provider** - MCP and plugin extensibility -6. **Vertex Rerankers** - RAG quality improvement -7. **Vertex Evaluation** - Built-in quality assessment - -### P2 - Medium (Completeness) - -8. **`defineSimpleRetriever()`** - Developer convenience -9. **`ai.run()`** - Trace step naming -10. **`embedMany()`** - Batch embedding efficiency -11. **Multipart tools** - Rich tool responses - -### P3 - Low (Nice to have) - -12. **Chroma plugin** -13. **Pinecone plugin** -14. **Cloud SQL plugin** - ---- - -## 8. 
Files Reference - -### JS Core -- [genkit.ts](/js/genkit/src/genkit.ts) - Main Genkit class -- [session.ts](/js/ai/src/session.ts) - Session management -- [chat.ts](/js/ai/src/chat.ts) - Chat implementation -- [background-action.ts](/js/core/src/background-action.ts) - Background ops -- [dynamic-action-provider.ts](/js/core/src/dynamic-action-provider.ts) - DAP - -### Python Core -- [_registry.py](/py/packages/genkit/src/genkit/ai/_registry.py) - GenkitRegistry -- [_base_async.py](/py/packages/genkit/src/genkit/ai/_base_async.py) - Genkit base -- [generate.py](/py/packages/genkit/src/genkit/blocks/generate.py) - Generation -- [prompt.py](/py/packages/genkit/src/genkit/blocks/prompt.py) - Prompts diff --git a/py/engdoc/parity-analysis/model_spec_compliance.md b/py/engdoc/parity-analysis/model_spec_compliance.md deleted file mode 100644 index c2efc1aa73..0000000000 --- a/py/engdoc/parity-analysis/model_spec_compliance.md +++ /dev/null @@ -1,267 +0,0 @@ -# Model Spec Compliance Analysis - -This document cross-checks Python model plugin implementations against the Genkit Model Action Specification ([model-spec.md](/docs/model-spec.md)). - ---- - -## Executive Summary - -| Area | Types Defined | Plugin Implementation | Gap Level | -|------|--------------|----------------------|-----------| -| Core Types | ✅ Complete | ⚠️ Partial use | Medium | -| Metadata | ✅ Complete | ⚠️ Missing fields | Medium | -| Docs Context | ✅ Type exists | ❌ Not implemented | High | -| Latency Tracking | ✅ Type exists | ❌ Not tracked | Medium | -| Partial Tool Streaming | ✅ Type exists | ❌ Not implemented | Low | - ---- - -## 1. Type Compliance (Python Core Types) - -### 1.1 GenerateRequest ✅ - -| Field | Spec | Python Type | Status | -|-------|------|-------------|--------| -| `messages` | Required | ✅ `list[Message]` | Complete | -| `config` | Any | ✅ `Any \| None` | Complete | -| `tools` | ToolDefinition[] | ✅ `list[ToolDefinition] \| None` | Complete | -| `toolChoice` | enum | ✅ `ToolChoice \| None` | Complete | -| `output` | OutputConfig | ✅ `OutputConfig \| None` | Complete | -| `docs` | DocumentData[] | ✅ `list[DocumentData] \| None` | Complete | - -### 1.2 GenerateResponse ✅ - -| Field | Spec | Python Type | Status | -|-------|------|-------------|--------| -| `message` | Message | ✅ `Message \| None` | Complete | -| `finishReason` | enum | ✅ `FinishReason \| None` | Complete | -| `finishMessage` | string | ✅ `str \| None` | Complete | -| `usage` | GenerationUsage | ✅ `GenerationUsage \| None` | Complete | -| `latencyMs` | number | ✅ `float \| None` | Complete | -| `custom` | any | ✅ `Any \| None` | Complete | -| `request` | GenerateRequest | ✅ `GenerateRequest \| None` | Complete | - -### 1.3 GenerateResponseChunk ✅ - -| Field | Spec | Python Type | Status | -|-------|------|-------------|--------| -| `role` | Role | ✅ `Role \| None` | Complete | -| `index` | number | ✅ `float \| None` | Complete | -| `content` | Part[] | ✅ `list[Part]` | Complete | -| `aggregated` | boolean | ✅ `bool \| None` | Complete | -| `custom` | any | ✅ `Any \| None` | Complete | - -### 1.4 Part Types ✅ - -| Part Type | Spec | Python Type | Status | -|-----------|------|-------------|--------| -| TextPart | ✅ | ✅ `TextPart` | Complete | -| MediaPart | ✅ | ✅ `MediaPart` | Complete | -| ToolRequestPart | ✅ | ✅ `ToolRequestPart` | Complete | -| ToolResponsePart | ✅ | ✅ `ToolResponsePart` | Complete | -| CustomPart | ✅ | ✅ `CustomPart` | Complete | -| ReasoningPart | ✅ | ✅ `ReasoningPart` | Complete | -| DataPart | Reserved | ✅ `DataPart` 
| Complete | - -### 1.5 ToolRequest (Partial Streaming) ✅ - -| Field | Spec | Python Type | Status | -|-------|------|-------------|--------| -| `partial` | boolean | ✅ `bool \| None` | Complete | - ---- - -## 2. Model Action Metadata Gaps - -> [!WARNING] -> Plugin implementations don't fully populate action metadata per spec. - -### 2.1 Spec Requirements - -```json -{ - "model": { - "label": "Human-readable name", - "versions": ["version1", "version2"], - "supports": { - "multiturn": true, - "media": true, - "tools": true, - "systemRole": true, - "output": ["json", "text"], - "contentType": ["application/json"], - "context": false, - "constrained": "no-tools", - "toolChoice": true, - "longRunning": false - }, - "stage": "stable", - "customOptions": { /* JSON Schema */ } - } -} -``` - -### 2.2 Gap Analysis - -| Field | Google GenAI | Anthropic | Ollama | -|-------|-------------|-----------|--------| -| `label` | ✅ | ⚠️ Missing | ⚠️ Missing | -| `versions` | ⚠️ Some models | ✅ | ❌ | -| `multiturn` | ✅ | ✅ | ✅ | -| `media` | ✅ | ✅ | ✅ | -| `tools` | ✅ | ✅ | ✅ | -| `systemRole` | ✅ | ✅ | ✅ | -| `output` | ⚠️ Some | ❌ Missing | ❌ Missing | -| `contentType` | ❌ Missing | ❌ Missing | ❌ Missing | -| `context` | ❌ Missing | ❌ Missing | ❌ Missing | -| `constrained` | ✅ | ❌ Missing | ⚠️ Hardcoded | -| `toolChoice` | ✅ | ❌ Missing | ⚠️ Missing | -| `longRunning` | ❌ Missing | ❌ Missing | ❌ Missing | -| `stage` | ✅ | ❌ Missing | ❌ Missing | -| `customOptions` | ❌ Not exposed | ❌ | ❌ | - ---- - -## 3. Plugin Implementation Gaps - -### 3.1 Docs Context Handling ❌ - -> [!CAUTION] -> **Critical**: No Python plugin implements `docs` context augmentation. - -**Spec Requirement:** -> If `docs` are provided, the model action should incorporate them into the context, typically by augmenting the message history. - -**Current State:** -- Types: `GenerateRequest.docs` exists -- Google GenAI: Does not process `docs` field -- Anthropic: Does not process `docs` field -- Ollama: Does not process `docs` field - -### 3.2 Latency Tracking ❌ - -**Spec Requirement:** -> `latencyMs`: Time taken for generation in milliseconds. - -**Current State:** -- Types: `GenerateResponse.latency_ms` exists -- Google GenAI: Not populating latency_ms -- Anthropic: Not populating latency_ms -- Ollama: Not populating latency_ms - -### 3.3 Request Echo ⚠️ - -**Spec Requirement:** -> `request`: The request that triggered this response. - -**Current State:** -- Types: `GenerateResponse.request` exists -- Plugins: Not consistently populating this field - -### 3.4 Partial Tool Streaming ❌ - -**Spec Requirement:** -> Some models support streaming tool calls with `partial: true`. The final chunk should have `partial: false`. - -**Current State:** -- Types: `ToolRequest.partial` exists -- Google GenAI: Not implemented -- Anthropic: Not implemented -- Ollama: Not implemented - -### 3.5 Server-Side Tools Configuration ⚠️ - -**Spec Requirement:** -> Features like Web Search, Code Execution, or URL Context are configured in `config`, not `tools`. - -**Current State:** -- Google GenAI: ✅ Supports `url_context`, `file_search` in config -- Anthropic: ❌ Not implemented -- Ollama: ❌ Not applicable - -### 3.6 Config Passthrough ⚠️ - -**Spec Requirement:** -> Pass all remaining unknown keys directly to the underlying model API. - -**Current State:** -- Google GenAI: ✅ Inherits from SDK type, passes through -- Anthropic: ⚠️ Only extracts known keys, doesn't pass through -- Ollama: ✅ Passes through via `ollama_api.Options(**config)` - ---- - -## 4. 
Behavior Compliance - -### 4.1 System Message Handling ✅ - -| Plugin | Extracts System | Separate Field | Status | -|--------|----------------|----------------|--------| -| Google GenAI | ✅ | ✅ `systemInstruction` | Compliant | -| Anthropic | ✅ | ✅ `system` | Compliant | -| Ollama | ✅ | ⚠️ Varies by model | Mostly | - -### 4.2 Tool Definition Conversion ✅ - -| Plugin | Name Sanitization | Schema Convert | Description | -|--------|-------------------|----------------|-------------| -| Google GenAI | ✅ | ✅ | ✅ | -| Anthropic | ✅ | ✅ | ✅ | -| Ollama | ✅ | ✅ | ✅ | - -### 4.3 Finish Reason Mapping ✅ - -| Plugin | Maps Provider Reasons | Standard Enum | -|--------|----------------------|---------------| -| Google GenAI | ✅ | ✅ | -| Anthropic | ✅ | ✅ | -| Ollama | ⚠️ Limited | ✅ | - -### 4.4 Structured Output ⚠️ - -| Plugin | Schema Passed | Constrained Support | Status | -|--------|--------------|---------------------|--------| -| Google GenAI | ✅ | ✅ `no-tools` | Good | -| Anthropic | ⚠️ | ❌ | Limited | -| Ollama | ⚠️ | ⚠️ | Limited | - ---- - -## 5. Priority Recommendations - -### P0 - Critical - -1. **Implement `docs` context handling** - RAG use case is broken without it -2. **Add latency tracking** - Important for monitoring - -### P1 - High - -3. **Complete metadata fields** - `contentType`, `context`, `longRunning` -4. **Config passthrough** for Anthropic/Ollama - Future-proofing -5. **customOptions JSON Schema** - DevUI config display - -### P2 - Medium - -6. **Partial tool streaming** - Advanced feature -7. **Request echo** in response - Debugging support -8. **Stage field** for all plugins - Model lifecycle - -### P3 - Low - -9. **Versions array** for dynamic models -10. **Output contentType** support - ---- - -## 6. Files Reference - -### Spec -- [model-spec.md](/docs/model-spec.md) - -### Python Types -- [typing.py](/py/packages/genkit/src/genkit/core/typing.py) - Core types - -### Plugin Implementations -- [gemini.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/models/gemini.py) -- [anthropic/models.py](/py/plugins/anthropic/src/genkit/plugins/anthropic/models.py) -- [ollama/models.py](/py/plugins/ollama/src/genkit/plugins/ollama/models.py) diff --git a/py/engdoc/parity-analysis/plugin_api_consistency.md b/py/engdoc/parity-analysis/plugin_api_consistency.md deleted file mode 100644 index 24f33c4ba0..0000000000 --- a/py/engdoc/parity-analysis/plugin_api_consistency.md +++ /dev/null @@ -1,295 +0,0 @@ -# Plugin API Consistency Report - -This document analyzes model provider API consistency across JS and Python Genkit plugins, comparing initialization parameters, config schemas, and feature support. - ---- - -## Executive Summary - -| Plugin | JS Config Schema | Python Config Schema | Gap Level | -|--------|-----------------|---------------------|-----------| -| Google GenAI | Full Zod Schema (25+ fields) | Pydantic (inherits from SDK) | **Medium** | -| Anthropic | AnthropicConfigSchema (10+ fields) | GenerationCommonConfig only | **Critical** | -| Ollama | OllamaConfigSchema (6 fields) | GenerationCommonConfig | **Medium** | - ---- - -## 1. 
Google GenAI / Vertex AI Plugin - -### 1.1 Plugin Initialization Options - -| Parameter | JS | Python | Notes | -|-----------|-----|--------|-------| -| `apiKey` | ✅ | ✅ | Both support | -| `apiVersion` | ✅ | ❌ | Python missing | -| `baseUrl` | ✅ | ❌ | Python missing | -| `customHeaders` | ✅ | ❌ (internal only) | Python injects headers internally | -| `legacyResponseSchema` | ✅ | ❌ | Python missing | -| `experimental_debugTraces` | ✅ | ❌ | Python missing | -| `credentials` | ✅ | ✅ | Both support | -| `project` | ✅ | ✅ | VertexAI only | -| `location` | ✅ | ✅ | VertexAI only | -| `debug_config` | ❌ | ✅ | Python has SDK debug | -| `http_options` | ❌ | ✅ | Python SDK-specific | - -### 1.2 GeminiConfigSchema Comparison - -**JS (Zod Schema):** -```typescript -GeminiConfigSchema = GenerationCommonConfigSchema.extend({ - apiKey: z.string().optional(), // Override plugin apiKey - baseUrl: z.string().optional(), // Override baseUrl - apiVersion: z.string().optional(), // Override apiVersion - safetySettings: z.array(...), // Safety filters - codeExecution: z.boolean().optional(), // Enable code execution - contextCache: z.boolean().optional(), // Enable context caching - functionCallingConfig: z.object({...}), // Tool control - responseModalities: z.array(...), // TEXT/IMAGE/AUDIO - googleSearchRetrieval: z.boolean(), // Grounding with Google Search - fileSearch: z.object({...}), // File search stores - urlContext: z.boolean(), // URL context grounding - temperature: z.number().min(0).max(2), // With descriptions - topP: z.number().min(0).max(1), - thinkingConfig: z.object({ - includeThoughts: z.boolean(), - thinkingBudget: z.number().min(0).max(24576), - thinkingLevel: z.enum(['MINIMAL', 'LOW', 'MEDIUM', 'HIGH']), - }), -}); -``` - -**Python (Pydantic Model):** -```python -class GeminiConfigSchema(genai_types.GenerateContentConfig): - code_execution: bool | None = None - response_modalities: list[str] | None = None - thinking_config: dict[str, Any] | None = None - file_search: dict[str, Any] | None = None - url_context: dict[str, Any] | None = None - api_version: str | None = None -``` - -### 1.3 Config Schema Gaps - -| Field | JS | Python | Priority | -|-------|-----|--------|----------| -| `safetySettings` | ✅ Typed array | Inherits from SDK | P1 | -| `contextCache` | ✅ boolean | ❌ Missing | P1 | -| `functionCallingConfig` | ✅ Typed object | Inherits from SDK | P2 | -| `googleSearchRetrieval` | ✅ boolean/object | Inherits from SDK | P2 | -| Per-field descriptions | ✅ All fields | ❌ None | P2 | -| Type validation bounds | ✅ min/max | ❌ None | P2 | - -### 1.4 API Surface Gap - -> [!WARNING] -> Python plugin does not expose `googleAI.model()` or `vertexAI.model()` convenience methods for creating model references with typed configs. - -**JS Pattern:** -```typescript -const model = googleAI.model('gemini-2.5-flash', { - temperature: 0.8, - thinkingConfig: { includeThoughts: true } -}); -``` - -**Python Pattern:** -```python -# No equivalent - must use string reference -response = await ai.generate( - model='googleai/gemini-2.5-flash', - config={'temperature': 0.8} # Untyped dict -) -``` - ---- - -## 2. 
Anthropic Plugin - -### 2.1 Plugin Initialization Options - -| Parameter | JS | Python | Notes | -|-----------|-----|--------|-------| -| `apiKey` | ✅ | ✅ | Both support | -| `apiVersion` | ✅ ('stable'/'beta') | ❌ | Python missing | -| `models` | ❌ | ✅ | Python-specific | -| `**anthropic_params` | ❌ | ✅ | Python passes to SDK | - -### 2.2 Config Schema Comparison - -> [!CAUTION] -> **Critical Gap**: Python Anthropic uses only `GenerationCommonConfig`, missing all Claude-specific features. - -**JS AnthropicConfigSchema:** -```typescript -AnthropicConfigSchema = GenerationCommonConfigSchema.extend({ - tool_choice: z.union([ - z.object({ type: z.literal('auto') }), - z.object({ type: z.literal('any') }), - z.object({ type: z.literal('tool'), name: z.string() }), - ]), - metadata: z.object({ user_id: z.string() }).optional(), - apiVersion: z.enum(['stable', 'beta']).optional(), - thinking: z.object({ - enabled: z.boolean().optional(), - budgetTokens: z.number().min(1024).optional(), - }).optional(), -}); -``` - -**Python (Uses GenerationCommonConfig only):** -```python -# No Claude-specific config! -# Just: temperature, max_output_tokens, top_p, stop_sequences, top_k -``` - -### 2.3 Missing Python Features - -| Feature | Description | Impact | -|---------|-------------|--------| -| **Thinking Config** | Extended thinking with budget tokens | Users cannot enable Claude thinking | -| **API Version** | Switch between stable/beta APIs | No access to beta features | -| **Tool Choice** | Force specific tool use | Less control over tool calling | -| **Metadata** | User ID tracking | No usage tracking | -| **Citations** | Document citation support | `anthropicDocument()` missing | -| **Cache Control** | `cacheControl()` helper | No prompt caching | - -### 2.4 Model Mapping Gap - -**JS:** -```typescript -KNOWN_CLAUDE_MODELS = { - 'claude-3-haiku': AnthropicBaseConfigSchema, - 'claude-3-5-haiku': AnthropicBaseConfigSchema, - 'claude-sonnet-4': AnthropicThinkingConfigSchema, // Separate schema! - 'claude-opus-4': AnthropicThinkingConfigSchema, - 'claude-sonnet-4-5': AnthropicThinkingConfigSchema, - // ... -}; -``` - -**Python:** -```python -# All models use same GenerationCommonConfig -# No model-specific config schemas -``` - ---- - -## 3. Ollama Plugin - -### 3.1 Plugin Initialization Options - -| Parameter | JS | Python | Notes | -|-----------|-----|--------|-------| -| `serverAddress` | ✅ | ✅ | Both support | -| `requestHeaders` | ✅ | ✅ | Both support | -| `models` | ✅ | ✅ | Pre-register models | -| `embedders` | ✅ | ✅ | Pre-register embedders | - -### 3.2 Config Schema Comparison - -**JS OllamaConfigSchema:** -```typescript -OllamaConfigSchema = GenerationCommonConfigSchema.extend({ - temperature: z.number().min(0.0).max(1.0) - .describe('...defaults value is 0.8'), - topP: z.number().min(0).max(1.0) - .describe('...defaults value is 0.9'), -}); -``` - -**Python (Uses GenerationCommonConfig):** -```python -# Untyped config - relies on Ollama SDK defaults -``` - -### 3.3 Gaps - -| Gap | Impact | -|-----|--------| -| No per-field descriptions | Less IDE help | -| No type validation | Invalid values sent to Ollama | - ---- - -## 4. 
Common API Pattern Gaps - -### 4.1 Model Reference Factory - -**JS Pattern (All Plugins):** -```typescript -// Type-safe model reference with IDE autocomplete -const model = googleAI.model('gemini-2.5-flash'); -const model = anthropic.model('claude-sonnet-4'); -const model = ollama.model('llama3'); -``` - -**Python Pattern:** -```python -# Only string-based references - no type safety -model='googleai/gemini-2.5-flash' -model='anthropic/claude-sonnet-4' -model='ollama/llama3' -``` - -### 4.2 Embedder Reference Factory - -**JS Pattern:** -```typescript -const embedder = googleAI.embedder('gemini-embedding-001'); -``` - -**Python:** ❌ No equivalent - -### 4.3 Config Schema in DevUI - -| Plugin | JS DevUI | Python DevUI | -|--------|----------|--------------| -| Google GenAI | ✅ Full config | ❌ Empty (commented out) | -| Anthropic | ✅ Full config | ⚠️ Basic only | -| Ollama | ✅ Full config | ⚠️ Basic only | - ---- - -## 5. Priority Recommendations - -### P0 - Critical - -1. **Add Python `config_schema` to model metadata** - Fix the commented out code -2. **Anthropic ThinkingConfig** - Required for Claude 4.x models - -### P1 - High - -3. **Anthropic-specific config schema** - tool_choice, metadata, apiVersion -4. **Google GenAI plugin options** - apiVersion, baseUrl, customHeaders -5. **Model reference factories** - `plugin.model()` pattern for Python - -### P2 - Medium - -6. **Config field descriptions** - Match JS documentation -7. **Type validation** - min/max bounds on numeric fields -8. **Ollama config schema** - Match JS validation - -### P3 - Low - -9. **Embedder reference factories** - `plugin.embedder()` pattern -10. **Debug trace options** - Match JS tracing options - ---- - -## 6. Files Reference - -### JS Plugins -- [googleai/types.ts](/js/plugins/google-genai/src/googleai/types.ts) - Plugin options -- [googleai/gemini.ts](/js/plugins/google-genai/src/googleai/gemini.ts) - Config schema -- [anthropic/types.ts](/js/plugins/anthropic/src/types.ts) - Anthropic config -- [anthropic/models.ts](/js/plugins/anthropic/src/models.ts) - Model definitions -- [ollama/index.ts](/js/plugins/ollama/src/index.ts) - Ollama config - -### Python Plugins -- [google_genai/google.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/google.py) -- [google_genai/models/gemini.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/models/gemini.py) -- [anthropic/plugin.py](/py/plugins/anthropic/src/genkit/plugins/anthropic/plugin.py) -- [anthropic/models.py](/py/plugins/anthropic/src/genkit/plugins/anthropic/models.py) -- [ollama/plugin_api.py](/py/plugins/ollama/src/genkit/plugins/ollama/plugin_api.py) diff --git a/py/engdoc/parity-analysis/roadmap.md b/py/engdoc/parity-analysis/roadmap.md deleted file mode 100644 index 23e0c0586a..0000000000 --- a/py/engdoc/parity-analysis/roadmap.md +++ /dev/null @@ -1,256 +0,0 @@ -# Parity Analysis & Roadmap - -> [!NOTE] -> This document tracks the feature parity of Genkit Python plugins against the -> Genkit Node.js reference implementation. Use this to identify gaps and plan work. - ---- - -## Current Status (Updated 2026-02-06) - -> [!IMPORTANT] -> **Overall Parity: ~99% Complete** - Nearly all milestones done! -> -> Legacy formatting and type checking issues fixed throughout the repo. -> Remaining work is focused on resolving specific model quirks (e.g. DeepSeek R1 reasoning). 
- -| Plugin | API Conformance | Missing Features | Security Issues | Test Coverage | Priority | -|--------|-----------------|------------------|-----------------|---------------|----------| -| google-genai | ✅ Verified | Minor | None | Good | - | -| anthropic | ✅ Mostly Conformant (PR #4482) | Citations | None | ✅ Good | Low | -| amazon-bedrock | ✅ Verified | Guardrails | None | Good | Low | -| ollama | ✅ Verified | Vision via chat API | None | Fair | Low | -| mistral | ✅ Mostly Conformant (PR #4481) | Agents API, Codestral FIM | None | ✅ Good | Low | -| xai | ⚠️ Gaps | Agent Tools API (server/client-side) | None | Fair | Medium | -| deepseek | ✅ Mostly Conformant (PR #4480) | Multi-round reasoning | None | ✅ Good | Low | -| cloudflare-workers-ai | ✅ Verified | Async Batch API | None | Good | Low | -| huggingface | ⚠️ Gaps | Inference Endpoints, TGI | None | Fair | Medium | -| azure | ⚠️ Gaps | Azure AI Studio | None | Fair | Medium | - -### Priority Actions - -| Priority | Task | Plugin | Effort | Description | -|----------|------|--------|--------|-------------| -| ~~P0~~ | ~~Fix `reasoning_content` extraction~~ | ~~deepseek~~ | ~~M~~ | ✅ Done (PR #4480) - Extracted via `MessageAdapter` in compat-oai, emits `ReasoningPart` | -| ~~P0~~ | ~~Add parameter validation warnings~~ | ~~deepseek~~ | ~~S~~ | ✅ Done (PR #4480) - `_warn_reasoning_params()` logs warnings for ignored params | -| ~~P1~~ | ~~Add cache control support~~ | ~~anthropic~~ | ~~M~~ | ✅ Done (PR #4482) - `cache_control` with TTL for cost savings | -| ~~P1~~ | ~~Add PDF/Document support~~ | ~~anthropic~~ | ~~M~~ | ✅ Done (PR #4482) - `DocumentBlockParam` for common use case | -| ~~P1~~ | ~~Add embeddings support~~ | ~~mistral~~ | ~~S~~ | ✅ Done (PR #4481) - `mistral-embed` model | -| **P2** | Add Agent Tools API | xai | M | Server/client-side tool calling (Jan 2026) | -| **P2** | Add Agents API | mistral | L | Mistral Agents endpoint | -| **P2** | Add Inference Endpoints | huggingface | M | Dedicated endpoints for production | -| **P3** | Add Guardrails | amazon-bedrock | M | Bedrock Guardrails integration | -| **P3** | Add Azure AI Studio | azure | L | New unified API | - -### Detailed Gap Analysis - -#### 1. google-genai (Gemini/Vertex AI) - -**Status**: ✅ Mostly Conformant - -**Verified Features**: -- Text generation (streaming/non-streaming) ✓ -- Embeddings ✓ -- Image generation (Imagen) ✓ -- Video generation (Veo) ✓ -- Function/tool calling ✓ -- Context caching ✓ -- Safety settings ✓ -- Evaluators (Vertex AI) ✓ -- Rerankers (Vertex AI Discovery Engine) ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Grounding with Google Search | Not implemented for Gemini API | Medium - useful for RAG | Medium | -| Code execution tool | Built-in code execution not exposed | Low | Low | -| Audio generation (Lyria) | Partial - helpers only, no full model | Low | Low | - ---- - -#### 2. 
anthropic (Claude) - -**Status**: ✅ Mostly Conformant (PR #4482) - -**Verified Features**: -- Messages API ✓ -- Tool/function calling ✓ -- Streaming ✓ -- Vision (images) ✓ -- Thinking mode (extended thinking) ✓ -- ✅ Cache control (ephemeral) ✓ (PR #4482) -- ✅ PDF/Document support (`DocumentBlockParam`) ✓ (PR #4482) -- ✅ URL image source ✓ (PR #4482) - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| ~~Cache control (ephemeral)~~ | ~~`cache_control` with TTL not supported~~ | ~~High~~ | ✅ Done (PR #4482) | -| ~~PDF/Document support~~ | ~~`DocumentBlockParam` not implemented~~ | ~~High~~ | ✅ Done (PR #4482) | -| Citations | Citation extraction not supported | Medium | P2 | -| Web search tool | Server-side `web_search` tool not supported | Medium | P2 | -| Batch API | Message batches not supported | Low - async processing | P3 | - ---- - -#### 3. amazon-bedrock - -**Status**: ✅ Mostly Conformant - -**Verified Features**: -- Converse API ✓ -- ConverseStream API ✓ -- Tool calling ✓ -- Multi-provider support (Claude, Nova, Llama, etc.) ✓ -- Inference profiles for cross-region ✓ -- Embeddings ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Guardrails | Bedrock Guardrails not integrated | Medium - content filtering | P3 | -| Knowledge bases | RAG via Bedrock KB not supported | Medium | P3 | -| Model invocation logging | CloudWatch logging config | Low | P4 | - ---- - -#### 4. ollama - -**Status**: ✅ Conformant - -**Verified Features**: -- Chat API (/api/chat) ✓ -- Generate API (/api/generate) ✓ -- Embeddings API (/api/embeddings) ✓ -- Tool calling ✓ -- Streaming ✓ -- Model discovery ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Vision in chat | Images via chat API need testing | Low - works via generate | P4 | -| Pull models | Model download/management | Low - user manages | P4 | - ---- - -#### 5. mistral - -**Status**: ✅ Mostly Conformant (PR #4481) - -**Verified Features**: -- Chat completions ✓ -- Streaming ✓ -- Tool/function calling ✓ -- JSON mode ✓ -- Vision models (Pixtral) ✓ -- ✅ Embeddings (`mistral-embed`) ✓ (PR #4481) - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| ~~Embeddings~~ | ~~`mistral-embed` model not supported~~ | ~~Medium~~ | ✅ Done (PR #4481) | -| Agents API | Mistral Agents endpoint not supported | High - agentic workflows | P2 | -| FIM (Fill-in-Middle) | Codestral FIM for code completion | Medium - code use cases | P2 | -| Built-in tools | websearch, code_interpreter, image_generation | Medium | P3 | - ---- - -#### 6. xai (Grok) - -**Status**: ⚠️ Has Gaps - -**Verified Features**: -- Chat completions ✓ -- Streaming ✓ -- Tool/function calling ✓ -- Vision (grok-2-vision) ✓ -- Reasoning effort parameter ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Agent Tools API | Server-side and client-side tool calling (Jan 2026) | High - new feature | P2 | -| Web search options | Built-in web search configuration | Medium | P3 | -| New models | grok-4-1-fast-reasoning, grok-4-1-fast-non-reasoning | Medium | P2 | - ---- - -#### 7. 
deepseek - -**Status**: ✅ Mostly Conformant (PR #4480) - -**Verified Features**: -- Chat completions (OpenAI-compatible) ✓ -- Streaming ✓ -- Uses compat-oai for implementation ✓ -- `reasoning_content` extraction for R1/reasoner models ✓ (PR #4480) -- Parameter validation warnings for R1 (temp, top_p, tools) ✓ (PR #4480) -- Chat vs. reasoning model capability split ✓ (PR #4480) -- `is_reasoning_model()` helper ✓ (PR #4480) - -**Implementation Details (PR #4480)**: -- **compat-oai layer**: `MessageAdapter` wraps raw Pydantic `ChatCompletionMessage` for safe `reasoning_content` access (Pydantic raises `AttributeError` for unknown fields). `MessageConverter.to_genkit()` emits `ReasoningPart` before `TextPart` (matching JS order). -- **Streaming**: `MessageAdapter(delta).reasoning_content` in `_generate_stream()` replaces unsafe `getattr()` pattern. -- **deepseek plugin**: `_warn_reasoning_params()` logs warnings when `temperature`, `top_p`, or `tools` are passed to R1 models. Model capabilities split into chat (`tools=True`) vs. reasoning (`tools=False`). - -**Remaining Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| ~~`reasoning_content`~~ | ~~CoT output not extracted/exposed~~ | ~~**Critical**~~ | ✅ Done (PR #4480) | -| ~~Parameter validation~~ | ~~R1 ignores temp/top_p but no warning~~ | ~~High~~ | ✅ Done (PR #4480) | -| ~~Multi-round reasoning~~ | ~~Must strip reasoning_content from context~~ | ~~High~~ | ✅ Done — `ReasoningPart` skipped in `MessageConverter.to_openai()` | -| Tool calling in R1 | Not supported in reasoner mode | Medium - documented limitation | P2 | - ---- - -#### 8. cloudflare-workers-ai (Cloudflare Workers AI) - -**Status**: ✅ Mostly Conformant - -**Verified Features**: -- Text generation ✓ -- Streaming (SSE) ✓ -- Tool calling (via CF specific implementation) ✓ -- Embeddings ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Async Batch API | Not implemented | Low | Low | -| Function calling standardization | Uses custom impl instead of OpenAI compat | Medium | Low | - ---- - -#### 9. huggingface - -**Status**: ⚠️ Has Gaps - -**Verified Features**: -- Text generation (inference API) ✓ -- Streaming ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Inference Endpoints | Dedicated endpoints not supported | Medium - production use | P2 | -| TGI Integration | Text Generation Inference specific features | Medium | P3 | -| Chat templating | Better reliance on tokenizer chat templates | Low | P3 | - ---- - -#### 10. azure (Azure OpenAI) - -**Status**: ⚠️ Has Gaps - -**Verified Features**: -- Chat completions ✓ -- Streaming ✓ -- Tool calling ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Azure AI Studio | New unified API not supported | Medium | P3 | -| Entra ID Auth | Managed identity support | Medium - enterprise | P2 | -| On Your Data | Azure Search integration | Medium | P3 | diff --git a/py/engdoc/parity-analysis/sample_parity_roadmap.md b/py/engdoc/parity-analysis/sample_parity_roadmap.md deleted file mode 100644 index 04b99b2c06..0000000000 --- a/py/engdoc/parity-analysis/sample_parity_roadmap.md +++ /dev/null @@ -1,471 +0,0 @@ -# Sample Parity Analysis: JS vs Python - -> **Updated:** 2026-02-07 -> **Scope:** Every JS code sample on genkit.dev docs -> Python `py/samples/` counterpart. 
-> -> **Exclusions (per team decision):** -> - **Chat/Session API** -- Deprecated, skip -> - **Agents / Multi-Agent** -- Not yet in Python SDK, skip -> - **MCP** -- Will come later, skip -> - **Durable Streaming** -- Not yet in Python SDK, skip -> - **Client SDK** -- JS client-side only, not applicable to Python backend SDK - ---- - -## Summary - -**JS Sample Locations:** -- `/samples/` - 9 polished demo samples (js-angular, js-chatbot, js-menu, etc.) -- `/js/testapps/` - 32 internal test/demo apps (advanced scenarios) - -**Python Sample Location:** -- `/py/samples/` - 36 samples (including shared, sample-test) - -| Metric | JS (`samples/` + `testapps/`) | Python (`py/samples/`) | Gap | -|--------|-------------------------------|------------------------|-----| -| Plugin hello demos | 8 | 14 | **Python superset** | -| Advanced feature demos | 15 | 10 | **-5** | -| RAG samples | 5 | 4 | -1 | -| Evaluation | 2 | 2 | Parity | -| Media generation | 1 | 3 | **Python superset** | -| Observability | 0 | 2 | **Python superset** | - ---- - -## 1. genkit.dev Docs -> Python Sample Coverage - -This is the authoritative mapping from every JS code feature demonstrated in -the genkit.dev documentation to its Python sample coverage. Only in-scope -features are listed (exclusions above apply). - -### `/docs/models` -- Generating Content with AI Models - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Basic generation | `ai.generate('prompt')` | All hello samples (`generate_greeting`) | Covered | -| Model reference | `googleAI.model('gemini-2.5-flash')` | All hello samples | Covered | -| Model string ID | `model: 'googleai/gemini-2.5-flash'` | All hello samples | Covered | -| System prompts | `system: "..."` | `provider-google-genai-hello` + most hello samples | Covered | -| Multi-turn (messages) | `messages: [{role, content}]` | `provider-google-genai-hello` + most hello samples | Covered | -| Model parameters | `config: {maxOutputTokens, temperature, ...}` | Most hello samples (`generate_with_config`) | Covered | -| Structured output | `output: { schema: ZodSchema }` | Most samples (`generate_character`) | Covered | -| Streaming text | `ai.generateStream()` | Most samples (`generate_streaming_story`) | Covered | -| Streaming + structured | `generateStream() + output schema` | `provider-google-genai-hello` | Covered | -| Multimodal input (image URL) | `prompt: [{media: {url}}, {text}]` | `provider-google-genai-hello`, `provider-anthropic-hello`, `provider-xai-hello`, etc. | Covered | -| Multimodal input (base64) | `data:image/jpeg;base64,...` | `provider-google-genai-hello` describe_image | Covered | -| Generating media (images) | `output: {format: 'media'}` (Imagen) | `provider-google-genai-media-models-demo` | Covered | -| Generating media (TTS) | Text-to-speech | `provider-google-genai-media-models-demo`, `provider-compat-oai-hello` | Covered | -| Middleware (retry) | `use: [retry({...})]` | `framework-middleware-demo` | Covered | -| Middleware (fallback) | `use: [fallback({...})]` | `framework-middleware-demo` | Covered | - -> **SDK Status:** Python has `use=` middleware infrastructure in `generate()`. -> `framework-middleware-demo` demonstrates custom retry and logging middleware. 
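As a rough illustration of that `use=` hook, a custom retry middleware might look like the sketch below. The middleware signature shown here is an assumption (the canonical pattern lives in `framework-middleware-demo`); the sketch only conveys the general shape of a wrapper that forwards to a `next` continuation.

```python
async def retry_on_error(request, next_call):
    """Hypothetical middleware: retry the downstream model call up to 3 times."""
    last_error: Exception | None = None
    for _ in range(3):
        try:
            return await next_call(request)
        except Exception as err:  # sketch only; real code should narrow this
            last_error = err
    raise last_error


# Usage sketch, assuming `use=` accepts a list of middleware callables:
# response = await ai.generate(
#     prompt='Summarize the incident report.',
#     use=[retry_on_error],
# )
```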
- -### `/docs/tool-calling` -- Tool Calling - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Define tools | `ai.defineTool()` | All samples with tools | Covered | -| Use tools in generate | `tools: [getWeather]` | Most samples (`generate_weather`) | Covered | -| `maxTurns` | `maxTurns: 8` | 3 samples use `max_turns=2` | Covered | -| `returnToolRequests` | `returnToolRequests: true` | `provider-google-genai-context-caching` | Covered | -| Interrupts (tool-based) | `ctx.interrupt()` | `framework-tool-interrupts`, `provider-google-genai-hello` | Covered | -| Dynamic tools at runtime | Tool defined inline at generate() | `framework-dynamic-tools-demo` uses `ai.dynamic_tool()` | Covered | -| Streaming + tool calling | Stream with tools | All provider hello samples (`generate_streaming_with_tools`) | Covered | - -> **SDK Status:** `ai.dynamic_tool()` exists. Streaming + tools is demonstrated -> in all 12 provider hello samples via `generate_streaming_with_tools` flow. - -### `/docs/interrupts` -- Interrupts - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Tool-based interrupt | `@ai.tool(interrupt=True)` + `ctx.interrupt()` | `framework-tool-interrupts`, `provider-google-genai-hello` | Covered | -| Check response.interrupts | Loop checking for interrupts | `framework-tool-interrupts` | Covered | -| Resume with respond | `resume: { respond: [...] }` | `framework-tool-interrupts`, `provider-google-genai-hello` | Covered | -| `defineInterrupt()` | Standalone interrupt API | Not in Python SDK | N/A (SDK gap) | -| Restartable interrupts | `restart` option | Not in Python SDK | N/A (SDK gap) | - -> **SDK Status:** Python only supports tool-based interrupts via -> `@ai.tool(interrupt=True)`. No standalone `define_interrupt()` API exists. -> This is a SDK feature gap, not a sample gap. - -### `/docs/context` -- Context - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Context in generate() | `context: { auth: {...} }` | `framework-context-demo` (`context_in_generate`) | Covered | -| Context in flow | `{context}` destructured | `framework-context-demo` (`context_in_flow`) | Covered | -| Context in tool | `{context}` in tool handler | `framework-context-demo` | Covered | -| Context propagation | Auto-propagation to sub-actions | `framework-context-demo` (`context_propagation_chain`) | Covered | -| `ai.current_context()` | Access current context | `framework-context-demo` (`context_current_context`) | Covered | - -> **SDK Status:** Full context support exists: `context=` on `generate()` and -> flows, `ActionRunContext`, `ai.current_context()`, and auto-propagation. -> `framework-context-demo` provides comprehensive coverage with 4 dedicated flows. 
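To make the mapping above concrete, here is a minimal sketch of the context flow, assuming the `Genkit()` / `@ai.tool()` API and the `context=` keyword described above, and assuming `current_context()` hands back the dict passed via `context=`. The tool name and the shape of the context dict are illustrative, not taken from the sample.

```python
from genkit.ai import Genkit

ai = Genkit()  # plugin and default model configuration omitted for brevity


@ai.tool()
def list_orders() -> list[str]:
    """Illustrative tool that reads the caller identity from the propagated context."""
    ctx = ai.current_context() or {}
    uid = ctx.get('auth', {}).get('uid', 'anonymous')
    return [f'order-1001 (owner: {uid})']


async def answer_for_user(question: str, uid: str) -> str:
    # context= set here is auto-propagated to the tool call.
    response = await ai.generate(
        prompt=question,
        tools=['list_orders'],
        context={'auth': {'uid': uid}},
    )
    return response.text
```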
- -### `/docs/dotprompt` -- Managing Prompts with Dotprompt - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| .prompt files | YAML frontmatter + template | `framework-prompt-demo` | Covered (bug: see below) | -| Running prompts from code | `ai.prompt('name')` | `framework-prompt-demo` | Covered (bug: see below) | -| Streaming prompts | `prompt.stream()` | `framework-prompt-demo` | Covered (bug: see below) | -| Input/Output schemas (Picoschema) | `schema:` in frontmatter | `framework-prompt-demo` | Covered (bug: see below) | -| Schema references | `ai.defineSchema()` + name ref | `framework-prompt-demo` | Covered (bug: see below) | -| Model configuration | `config:` in frontmatter | `framework-prompt-demo` | Covered (bug: see below) | -| Handlebars templates | `{{variable}}`, `{{#if}}` | `framework-prompt-demo` | Covered (bug: see below) | -| Multi-message prompts | `{{role "system"}}` | `framework-prompt-demo` (in partial) | Covered (bug: see below) | -| Partials | `{{>partialName}}` | `framework-prompt-demo` (`_style.prompt`) | Covered (bug: see below) | -| Custom helpers | `ai.defineHelper()` | `framework-prompt-demo` (`list` helper) | Covered (bug: see below) | -| Prompt variants | `.variant.prompt` files | Blocked by SDK bug | **BUG** | -| **Tool calling in prompts** | `tools: [...]` in frontmatter | Not in framework-prompt-demo | **GAP** | -| **Multimodal prompts** | `{{media url=photoUrl}}` | Not in framework-prompt-demo | **GAP** | -| **Defining prompts in code** | `ai.definePrompt()` | Not in framework-prompt-demo | **GAP** | -| **Default input values** | `default:` in frontmatter | Not in framework-prompt-demo | **GAP** | - -> **SDK Bug (B1b):** `framework-prompt-demo` had a P0 bug: `Failed to load lazy -> action recipe.robot: maximum recursion depth exceeded`. Root cause is a -> **self-referential lazy loading loop** in the SDK's `create_prompt_from_file()` -> at `py/packages/genkit/src/genkit/blocks/prompt.py` -- when loading a variant -> prompt, `resolve_action_by_key()` is called with the action's own key before -> `_cached_prompt` is set, which triggers `_trigger_lazy_loading()` to re-invoke -> `create_prompt_from_file()` for the same action, causing infinite recursion. -> This is NOT a dotprompt library bug. Only Python is affected (JS uses a -> `lazy()` wrapper guaranteeing single evaluation). -> -> **Workaround (B1a):** `recipe.robot.prompt` was removed to unblock the sample. -> **Fix:** Tracked at [firebase/genkit#4491](https://github.com/firebase/genkit/issues/4491). -> Once fixed, variant demo should be re-added. - -### `/docs/flows` -- Flows - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Define flows | `@ai.flow()` decorator | All samples | Covered | -| Input/output schemas | Pydantic models | All samples | Covered | -| Streaming flows | `ctx.send_chunk()` | Several samples | Covered | -| Deploy with Flask | Flask integration | `web-flask-hello` | Covered | -| Flow steps (`ai.run()`) | Named trace spans | `provider-google-genai-hello` (line 434), `framework-realtime-tracing-demo` | Covered | - -> All flow features documented on genkit.dev are covered. 
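For reference, the rows above compose roughly as follows. This is a sketch assuming the `Genkit()` API, a Pydantic input model, and an `ActionRunContext`-style second parameter exposing `send_chunk` (as the samples cited above suggest); the flow body is illustrative, and named steps via `ai.run()` are omitted because their exact Python signature is not shown here.

```python
from pydantic import BaseModel

from genkit.ai import Genkit

ai = Genkit()  # plugin and default model configuration omitted for brevity


class StoryInput(BaseModel):
    topic: str


@ai.flow()
async def short_story(input: StoryInput, ctx) -> str:
    """Streaming flow: emits progress chunks and returns the final text."""
    ctx.send_chunk(f'Working on a story about {input.topic}...')
    response = await ai.generate(prompt=f'Write a two-sentence story about {input.topic}.')
    ctx.send_chunk(response.text)
    return response.text
```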
- -### `/docs/rag` -- Retrieval-Augmented Generation - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Basic RAG flow | Retriever + generate | `framework-restaurant-demo` (case_04/05), `provider-firestore-retriever` | Covered | -| Embedders | `ai.embed()` | `provider-google-genai-hello`, `provider-ollama-hello` | Covered | -| Custom retriever | `ai.defineRetriever()` | `provider-firestore-retriever` | Covered | -| Simple retriever | `ai.defineSimpleRetriever()` | No equivalent | **GAP** (minor) | -| Vector search (Firestore) | Firestore vector store | `provider-vertex-ai-vector-search-firestore` | Covered | -| Vector search (BigQuery) | BigQuery vector store | `provider-vertex-ai-vector-search-bigquery` | Covered | -| Reranker | `ai.rerank()` | `provider-vertex-ai-rerank-eval` | Covered | -| Custom reranker | `ai.defineReranker()` | No sample | **GAP** (minor) | -| **Indexer** | `ai.index()` + flow | **No indexer sample** | **GAP** | - -> **SDK Status:** Python SDK does not have a built-in local dev vector store -> plugin (like JS `@genkit-ai/dev-local-vectorstore`). Indexing is done via -> external SDKs (Firestore, etc.). The RAG Python tab on genkit.dev shows -> Firestore-based retrieval only. - -### `/docs/evaluation` -- Evaluation - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Custom evaluator | `ai.defineEvaluator()` | `framework-evaluator-demo` | Covered | -| Built-in metrics | `GenkitMetric.MALICIOUSNESS` | `provider-vertex-ai-rerank-eval` (BLEU, ROUGE, etc.) | Covered | -| **Full eval pipeline** | Dataset -> inference -> metrics -> results | **No end-to-end pipeline sample** | **GAP** | -| **Data synthesis** | Generate test questions from docs | **No sample** | **GAP** | - -> The JS `evals` testapp demonstrates dataset creation, flow evaluation, and -> result analysis as a complete pipeline. Python needs an equivalent. - ---- - -## 2. Plugin Hello World Demos - -| Plugin | JS | Python | Notes | -|--------|-----|--------|-------| -| Google GenAI | Yes | `provider-google-genai-hello` | Parity | -| Vertex AI | Yes (in basic-gemini) | `provider-google-genai-vertexai-hello` | Parity | -| Anthropic | Yes | `provider-anthropic-hello` | Parity | -| Ollama | Yes | `provider-ollama-hello` | Parity | -| OpenAI Compat | Yes | `provider-compat-oai-hello` | Parity | -| xAI (Grok) | No | `provider-xai-hello` | Python extra | -| DeepSeek | No | `provider-deepseek-hello` | Python extra | -| Model Garden | Yes | `provider-vertex-ai-model-garden` | Parity | -| Mistral | No | `provider-mistral-hello` | Python extra | -| HuggingFace | No | `provider-huggingface-hello` | Python extra | -| Amazon Bedrock | No | `provider-amazon-bedrock-hello` | Python extra | -| Cloudflare Workers AI | No | `provider-cloudflare-workers-ai-hello` | Python extra | -| Microsoft Foundry | No | `provider-microsoft-foundry-hello` | Python extra | - ---- - -## 3. Incomplete Hello Samples - -Several hello samples are missing `generate_with_system_prompt` and/or -`generate_multi_turn_chat` flows that other hello samples already have. 
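For the samples still pending below, the missing flows are small. The sketch that follows shows the general shape they take in the hello samples that already have them; the wording is illustrative, model selection is omitted, and the typed message construction follows the `Part(root=TextPart(...))` pattern noted for the Python SDK (import path assumed from the core `typing.py` module).

```python
from genkit.ai import Genkit
from genkit.core.typing import Message, Part, Role, TextPart

# In the real samples, `ai` is the existing, fully configured Genkit() instance.
ai = Genkit()


@ai.flow()
async def generate_with_system_prompt(topic: str) -> str:
    """System-prompt flow: fixed persona plus a user question."""
    response = await ai.generate(
        system='You are a terse assistant. Answer in one sentence.',
        prompt=f'Tell me about {topic}.',
    )
    return response.text


@ai.flow()
async def generate_multi_turn_chat(question: str) -> str:
    """Multi-turn flow: replays a short history, then asks a follow-up."""
    response = await ai.generate(
        messages=[
            Message(role=Role.USER, content=[Part(root=TextPart(text='My name is Ada.'))]),
            Message(role=Role.MODEL, content=[Part(root=TextPart(text='Nice to meet you, Ada!'))]),
            Message(role=Role.USER, content=[Part(root=TextPart(text=question))]),
        ],
    )
    return response.text
```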
- -### `generate_with_system_prompt` flow - -- [x] `provider-microsoft-foundry-hello` -- DONE -- [x] `provider-mistral-hello` -- DONE -- [x] `provider-huggingface-hello` -- DONE -- [x] `provider-google-genai-vertexai-hello` -- DONE -- [ ] `web-short-n-long` (still uses old name `system_prompt`) -- [ ] `provider-vertex-ai-model-garden` (still uses old name `system_prompt`) - -### `generate_multi_turn_chat` flow - -- [x] `provider-microsoft-foundry-hello` -- DONE -- [x] `provider-google-genai-vertexai-hello` -- DONE -- [ ] `web-short-n-long` (still uses old name `multi_turn_chat`) -- [ ] `provider-vertex-ai-model-garden` (still uses old name `multi_turn_chat`) - ---- - -## 4. Items Already Covered (verified) - -These were previously flagged as gaps but are now confirmed covered: - -| Feature | Sample | Notes | -|---------|--------|-------| -| Streaming + structured output | `provider-google-genai-hello` | Has streaming structured output flow | -| Media generation (images) | `provider-google-genai-media-models-demo` | Imagen, Gemini Image, image editing | -| Media generation (TTS) | `provider-google-genai-media-models-demo`, `provider-compat-oai-hello` | Google TTS, OpenAI TTS | -| Reranker | `provider-vertex-ai-rerank-eval` | Vertex AI semantic reranker + eval metrics | -| Dynamic tools | `framework-dynamic-tools-demo` | Standalone sample with `ai.dynamic_tool()` | -| Flow steps (`ai.run()`) | `provider-google-genai-hello`, `framework-realtime-tracing-demo` | Named trace spans | -| Multimodal input | Multiple hello samples | Image, video, audio input | -| Tool interrupts | `framework-tool-interrupts`, `provider-google-genai-hello` | Full interrupt + resume flow | -| Context propagation | `framework-context-demo` | 4 flows covering generate, flow, tool, and current_context | -| Custom middleware | `framework-middleware-demo` | Retry, logging, and chained middleware | -| Streaming + tool calling | All provider hello samples | `generate_streaming_with_tools` flow in all 12 | - ---- - -## 5. Items Out of Scope (not in Python SDK) - -| Feature | Doc Page | Reason | -|---------|----------|--------| -| Chat/Session API | `chat.mdx` | Deprecated | -| Agents / Multi-Agent | `agentic-patterns.mdx`, `multi-agent.mdx` | Not yet in Python SDK | -| MCP | `mcp-server.mdx`, `model-context-protocol.mdx` | Will come later | -| Durable Streaming | `durable-streaming.mdx` | Not in Python SDK | -| `defineInterrupt()` | `interrupts.mdx` | Only tool-based interrupts in Python | -| Client SDK | `client.mdx` | JS client-side only | - ---- - -## 6. 
Execution Roadmap - -### Dependency Graph - -```mermaid -flowchart TD - subgraph phase0 [Phase 0 - Leaves] - B1a["B1a: Remove recipe.robot.prompt DONE"] - B1b["B1b: Fix SDK lazy loading bug firebase/genkit#4491"] - G1["G1: Context demo DONE"] - G6["G6: Streaming + tools DONE"] - G7["G7: Custom middleware DONE"] - N6["N6: Fix typo firestore-retriever DONE"] - end - - subgraph phase1 [Phase 1 - Dotprompt + Eval + Hello] - G2["G2: Dotprompt tool calling"] - G3["G3: Dotprompt define in code"] - G4["G4: Dotprompt multimodal"] - G5["G5: Dotprompt defaults"] - G8["G8: Eval pipeline"] - H1["H1: generate_with_system_prompt 2 remaining"] - H2["H2: generate_multi_turn_chat 2 remaining"] - end - - subgraph phase2 [Phase 2 - RAG + Eval Extras] - N1["N1: Simple retriever"] - N2["N2: Custom reranker"] - N3["N3: Data synthesis"] - N4["N4: Indexer sample"] - N7["N7: Firebase Functions"] - end - - subgraph phase3 [Phase 3 - Polish] - N5["N5: DevUI gallery"] - end - - B1a --> B1b - B1b --> G2 - B1b --> G3 - B1b --> G4 - B1b --> G5 - G8 --> N3 - G1 --> N5 - G2 --> N5 - G3 --> N5 - G6 --> N5 - G7 --> N5 - G8 --> N5 - H1 --> N5 -``` - -### Edge List - -`A -> B` means "A must complete before B can start": - -- `B1a -> B1b` (removing the bad variant file makes the sample usable; SDK fix restores variant support) -- `B1b -> G2` (SDK fix unblocks dotprompt tool calling) -- `B1b -> G3` (SDK fix unblocks dotprompt define-in-code) -- `B1b -> G4` (SDK fix unblocks dotprompt multimodal) -- `B1b -> G5` (SDK fix unblocks dotprompt defaults) -- `G8 -> N3` (eval pipeline design informs data synthesis) -- `{G1, G2, G3, G6, G7, G8, H1} -> N5` (DevUI gallery showcases all features) - -**Critical path:** `B1a -> B1b -> G2/G3/G4/G5 -> N5` - ---- - -### Phase 0: Leaves (no dependencies, all parallel) -- MOSTLY DONE - -All tasks in this phase are independent (except B1a -> B1b which are sequential). - -| Task | Description | Status | Notes | -|------|-------------|--------|-------| -| **B1a** | Remove `recipe.robot.prompt` from framework-prompt-demo to unblock the sample. | **DONE** | Variant file and code removed | -| **B1b** | Fix the SDK lazy loading bug in `create_prompt_from_file()` that causes infinite recursion when loading `.variant.prompt` files. Root cause: self-referential loop where `resolve_action_by_key()` is called with own key before `_cached_prompt` is set. Once fixed, re-add variant demo. | **BLOCKED** | Tracked at [firebase/genkit#4491](https://github.com/firebase/genkit/issues/4491). Only Python affected. | -| **G1** | Context demo -- `framework-context-demo` with flows for `context=` in generate, context in flows, context in tools, auto-propagation, `ai.current_context()`. | **DONE** | 4 flows: `context_in_generate`, `context_in_flow`, `context_current_context`, `context_propagation_chain` | -| **G6** | Streaming + tool calling -- `generate_streaming_with_tools` flow added to all 12 provider hello samples. | **DONE** | Uses shared `generate_streaming_with_tools_logic` | -| **G7** | Custom middleware demo -- `framework-middleware-demo` with retry, logging, and chained middleware. | **DONE** | 3 flows: `logging_demo`, `request_modifier_demo`, `chained_middleware_demo` | -| **N6** | Rename `firestore-retreiver` to `firestore-retriever` (typo fix). Now `provider-firestore-retriever`. | **DONE** | Directory renamed | - ---- - -### Phase 1: Dotprompt Completion + Eval + Hello Consistency - -G2-G5 are all unblocked by B1b. G8, H1, H2 are independent leaves placed here -for workload balancing. 
- -| Task | Description | Depends On | Status | -|------|-------------|------------|--------| -| **G2** | Dotprompt: tool calling in prompts -- add a `.prompt` file with `tools: [search, calculate]` in frontmatter, plus a flow that loads and runs it. | B1b | Pending | -| **G3** | Dotprompt: define prompts in code -- add `ai.define_prompt()` usage (no `.prompt` file, purely programmatic). | B1b | Pending | -| **G4** | Dotprompt: multimodal prompts -- add a `.prompt` file using `{{media url=photoUrl}}` helper with image input schema. | B1b | Pending | -| **G5** | Dotprompt: default input values -- add `default:` section to an existing or new `.prompt` file. | B1b | Pending | -| **G8** | Eval pipeline sample -- end-to-end evaluation: define a custom evaluator, prepare a dataset, run inference-based eval, report results. | -- | Pending | -| **H1** | Add `generate_with_system_prompt` flow to 2 remaining samples: `web-short-n-long`, `provider-vertex-ai-model-garden`. | -- | 4/6 Done | -| **H2** | Add `generate_multi_turn_chat` flow to 2 remaining samples: `web-short-n-long`, `provider-vertex-ai-model-garden`. | -- | 4/6 Done | - -**Parallelizable:** G2-G5 are independent of each other (all just need B1b). -G8, H1, H2 are independent of everything. - ---- - -### Phase 2: RAG and Eval Extras - -Lower-priority items that round out coverage for `rag.mdx` and `evaluation.mdx`. -N3 depends on G8. All others are independent leaves. - -| Task | Description | Depends On | Status | -|------|-------------|------------|--------| -| **N1** | Simple retriever -- `ai.define_simple_retriever()` equivalent if SDK supports it, or a minimal custom retriever pattern. | -- | Pending | -| **N2** | Custom reranker -- `ai.define_reranker()` with custom scoring logic. | -- | Pending | -| **N3** | Data synthesis -- generate test questions from documents using an LLM. | G8 | Pending | -| **N4** | Indexer sample -- document ingestion pipeline: chunk PDFs, generate embeddings, store in vector DB. | -- | Pending | -| **N7** | Firebase Functions sample -- Python Cloud Functions deployment with Genkit. | -- | Pending | - ---- - -### Phase 3: Polish - -DevUI gallery depends on most features being in place so it can showcase them all. - -| Task | Description | Depends On | Status | -|------|-------------|------------|--------| -| **N5** | DevUI gallery -- a single sample that showcases all DevUI features: prompts, flows, tools, evaluators, structured output, streaming, context, middleware. 
| G1, G2, G3, G6, G7, G8, H1 | Pending | - ---- - -### Execution Timeline - -``` -TIME --> -========================================================================== - -P0: [B1a: remove recipe.robot.prompt] DONE - [B1b: fix SDK lazy loading bug] BLOCKED (firebase/genkit#4491) - [G1: context demo] DONE - [G6: streaming+tools] DONE - [G7: custom middleware] DONE - [N6: typo fix] DONE - (5 of 6 P0 tasks complete; B1b awaits SDK fix) - | - --- P0 partially complete (B1b on critical path) --- - | -SDK: [S1: fix plugin structlog blowaway ~~~~] (HIGH - 5 plugins) - [S2: fix awarn protocol gap ~~~~~~~~~~~] (LOW) - [S3: fix ToolRunContext sole param ~~~~] (MEDIUM - #4492) - [S4: fix lazy loading recursion ~~~~~~~] (MEDIUM - #4491, same as B1b) - (all independent, each a separate PR) - | -P1: [G2: dotprompt tools ~~] [G8: eval pipeline ~~~~~~] - [G3: dotprompt code ~~~] [H1: system_prompt x2 ~~~] - [G4: dotprompt media ~~] [H2: multi_turn x2 ~~~~~~] - [G5: dotprompt defaults] - (G2-G5 blocked by B1b/S4; G8/H1/H2 ready now) - | - --- all P1 complete --- - | -P2: [N1: simple retriever ~~~] [N4: indexer ~~~~~~~~] - [N2: custom reranker ~~~~] [N7: firebase funcs ~] - [N3: data synthesis ~~~~~~~~~~] - (N1/N2/N4/N7 parallel, N3 after G8) - | - --- all P2 complete --- - | -P3: [N5: DevUI gallery ~~~~~~~~~~~~~~] - | - === SAMPLE PARITY COMPLETE === -``` - ---- - -### Progress Summary - -| Phase | Tasks | Done | Remaining | Blockers | -|-------|-------|------|-----------|----------| -| **P0** | B1a, B1b, G1, G6, G7, N6 | 5/6 | B1b | firebase/genkit#4491 | -| **P1** | G2, G3, G4, G5, G8, H1, H2 | 0/7 (H1 4/6, H2 4/6) | All | G2-G5 blocked by B1b | -| **P2** | N1, N2, N3, N4, N7 | 0/5 | All | N3 blocked by G8 | -| **P3** | N5 | 0/1 | All | Broad P0-P2 deps | -| **SDK** | S1, S2, S3, S4 | 0/4 | All | Separate PRs needed | -| **Total** | 23 tasks | ~5.5 | ~17.5 | 1 SDK bug + 4 SDK fixes | - ---- - -## 7. Pending SDK / Infrastructure Fixes (Separate PRs) - -Issues discovered during the sample consolidation and logging refactoring. -These should NOT be fixed in the samples PR -- each needs its own PR touching -core SDK or plugin code. - -### SDK Bugs - -| ID | Severity | Description | Affected Code | Notes | -|----|----------|-------------|---------------|-------| -| ~~**S1**~~ | ~~**HIGH**~~ | ~~**Observability plugins blow away structlog config.** Five plugins call `structlog.configure(processors=new_processors)` with *only* the `processors` kwarg. Since `structlog.configure()` is a full-replace (not partial-update), this resets `wrapper_class`, `logger_factory`, `cache_logger_on_first_use` back to defaults -- silently destroying any custom structlog setup (e.g. the `setup_sample()` stdlib integration). **Fix:** Use `structlog.configure(**{**structlog.get_config(), 'processors': new_processors})` to preserve the full config.~~ | ~~All 5 plugins~~ | ✅ Done — all 5 plugins fixed | -| **S2** | LOW | **`awarn` gap in `Logger` protocol.** `genkit.core.logging.Logger` declares `awarn()` but `structlog.stdlib.BoundLogger` only has `awarning` (no `awarn` alias). Calling `logger.awarn(...)` would raise `AttributeError`. Previously masked because `make_filtering_bound_logger` dynamically creates all method names. **Fix:** Either remove `awarn`/`warn` from the protocol, or add runtime aliases. 
| `py/packages/genkit/src/genkit/core/logging.py` | Only matters if `awarn` is actually called somewhere | -| **S3** | MEDIUM | **`ToolRunContext` as sole parameter crashes with `PydanticSchemaGenerationError`.** When a `@ai.tool()` has `ToolRunContext` as its only parameter, the SDK tries to create a `TypeAdapter` for it (which fails) and would also dispatch the tool input instead of the context at runtime. **Workaround:** Use `Genkit.current_context()` with zero-arg tools. | `py/packages/genkit/src/genkit/core/action/_action.py` (lines 493-494, 592-598), `py/packages/genkit/src/genkit/ai/_registry.py` (lines 555-565) | Tracked at [firebase/genkit#4492](https://github.com/firebase/genkit/issues/4492) | -| **S4** | MEDIUM | **SDK lazy loading infinite recursion for `.variant.prompt` files.** `create_prompt_from_file()` self-references via `resolve_action_by_key()` before caching, causing `RecursionError`. | `py/packages/genkit/src/genkit/blocks/prompt.py` | Tracked at [firebase/genkit#4491](https://github.com/firebase/genkit/issues/4491) | - -### Sample Naming Convention - -All samples follow a consistent prefix scheme: - -| Prefix | Category | Examples | -|--------|----------|----------| -| `provider-` | Model provider-specific | `provider-google-genai-hello`, `provider-anthropic-hello`, `provider-vertex-ai-model-garden` | -| `framework-` | Genkit framework features | `framework-context-demo`, `framework-middleware-demo`, `framework-prompt-demo` | -| `web-` | Web framework integration | `web-flask-hello`, `web-multi-server`, `web-short-n-long` | -| (none) | Other | `dev-local-vectorstore-hello` | diff --git a/py/engdoc/planning/FEATURE_MATRIX.md b/py/engdoc/planning/FEATURE_MATRIX.md deleted file mode 100644 index 0a3ee15239..0000000000 --- a/py/engdoc/planning/FEATURE_MATRIX.md +++ /dev/null @@ -1,448 +0,0 @@ -# Plugin Feasibility & Feature Matrix - -This document provides a comprehensive comparison of all proposed plugins to help -prioritize implementation efforts. 
- -## Executive Summary - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ PLUGIN PRIORITY RECOMMENDATION │ -├─────────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ PHASE 1 (Build Now) PHASE 2 (Consider) PHASE 3 (If Demanded) │ -│ ────────────────── ───────────────── ──────────────────── │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ azure │ │ cloudflare │ │ vercel │ │ -│ │ (telemetry) │ │ (telemetry) │ │ (helpers) │ │ -│ │ Score: 92/100 │ │ Score: 75/100 │ │ Score: 55/100 │ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -│ │ -│ ┌─────────────────┐ │ -│ │ cloudflare-ai │ │ -│ │ (models) │ │ -│ │ Score: 88/100 │ │ -│ └─────────────────┘ │ -│ │ -│ ┌─────────────────┐ │ -│ │ observability │ ← NEW │ -│ │ (3rd party) │ │ -│ │ Score: 89/100 │ │ -│ └─────────────────┘ │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - ---- - -## Part 1: Model/AI Plugins - -### Feature Comparison - -| Feature | amazon-bedrock ❶ | google-genai ❶ | microsoft-foundry ❶ | cloudflare-ai ❷ | -|---------|---------------|----------------|-------------|-----------------| -| **Text Generation** | ✅ | ✅ | ✅ | ✅ | -| **Streaming (SSE)** | ✅ | ✅ | ✅ | ✅ | -| **Tool/Function Calling** | ✅ | ✅ | ✅ | ✅ (Llama 3+) | -| **Embeddings** | ✅ | ✅ | ✅ | ✅ (BGE) | -| **Image Generation** | ✅ (Nova) | ✅ (Imagen) | ✅ (DALL-E) | ✅ (Flux, SD) | -| **Image Understanding** | ✅ | ✅ | ✅ | ✅ (Llama 4) | -| **Speech-to-Text** | ✅ | ✅ | ✅ (Whisper) | ✅ (Whisper) | -| **Text-to-Speech** | ✅ | ✅ | ✅ | ❌ | -| **Video Generation** | ❌ | ✅ (Veo) | ❌ | ❌ | -| **Audio Generation** | ❌ | ✅ (Lyria) | ❌ | ❌ | - -❶ = Already implemented -❷ = Proposed - -### Model Availability - -| Provider | Models | Notable Models | -|----------|--------|----------------| -| **AWS Bedrock** | 20+ | Claude 3.5, Llama 3, Nova, Titan | -| **Google GenAI** | 10+ | Gemini 2, Imagen, Veo, Lyria | -| **MS Foundry** | 11,000+ | GPT-4o, Claude, Llama, Mistral | -| **Cloudflare AI** | 50+ | Llama 4, Mistral, Flux, Whisper | -| **Vercel AI Gateway** | Pass-through | Any via OpenAI/Anthropic API | - -### Implementation Complexity - -| Plugin | API Type | Auth | SDK | Complexity | -|--------|----------|------|-----|------------| -| **amazon-bedrock** ❶ | Converse API | IAM/Keys | boto3 | Medium | -| **google-genai** ❶ | REST/gRPC | API Key/ADC | google-genai | Medium | -| **microsoft-foundry** ❶ | OpenAI-compat | API Key | openai | Low | -| **cloudflare-ai** ❷ | REST | API Token | httpx | Low-Medium | - -### Cloudflare AI Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 9/10 | Excellent, clear examples | -| **Python Support** | 7/10 | REST API, no official SDK | -| **Model Variety** | 9/10 | 50+ models across categories | -| **Streaming Support** | 9/10 | Native SSE for all LLMs | -| **Tool Calling** | 8/10 | Supported on Llama 3+ | -| **Community Demand** | 7/10 | Growing edge AI market | -| **Maintenance Burden** | 8/10 | Simple REST, few breaking changes | -| **Strategic Value** | 9/10 | Edge computing differentiator | -| **TOTAL** | **88/100** | ✅ **BUILD** | - ---- - -## Part 2: Telemetry Plugins - -### Architecture Overview - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ TELEMETRY PLUGIN ARCHITECTURE │ -├─────────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ NATIVE PLATFORM 
BACKENDS THIRD-PARTY BACKENDS │ -│ ──────────────────────── ──────────────────── │ -│ │ -│ ┌─────────┐ ┌─────────┐ ┌─────────────────────────┐ │ -│ │ aws │ │ google- │ │ observability │ │ -│ │ │ │ cloud │ │ │ │ -│ │ • SigV4 │ │ • ADC │ │ • Sentry │ │ -│ │ • X-Ray │ │ • Trace │ │ • Honeycomb │ │ -│ │ • CW │ │ • Logs │ │ • Datadog │ │ -│ └────┬────┘ └────┬────┘ │ • Grafana │ │ -│ │ │ │ • Axiom │ │ -│ ▼ ▼ │ • Custom OTLP │ │ -│ ┌─────────┐ ┌─────────┐ └───────────┬─────────────┘ │ -│ │ X-Ray │ │ Cloud │ │ │ -│ │ Console │ │ Trace │ ▼ │ -│ └─────────┘ └─────────┘ ┌─────────────────────────┐ │ -│ │ Any OTLP Backend │ │ -│ ┌─────────┐ │ (Sentry, Honeycomb, │ │ -│ │ azure │ │ Datadog, etc.) │ │ -│ │ │ └─────────────────────────┘ │ -│ │ • Distro│ │ -│ │ • Live │ CAN'T BE REPLICATED CAN BE REPLICATED │ -│ │ • Map │ WITH GENERIC OTLP WITH GENERIC OTLP │ -│ └────┬────┘ │ -│ │ │ -│ ▼ │ -│ ┌─────────┐ │ -│ │ App │ │ -│ │Insights │ │ -│ └─────────┘ │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - -### When to Use What - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ TELEMETRY PLUGIN DECISION GUIDE │ -├─────────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ "I'm on AWS and want X-Ray" → aws plugin (SigV4, X-Ray format) │ -│ "I'm on GCP and want Cloud Trace" → google-cloud plugin (ADC) │ -│ "I'm on Azure and want App Insights" → azure plugin (Live Metrics, Map) │ -│ │ -│ "I'm on AWS but want Honeycomb" → observability plugin (just OTLP) │ -│ "I'm on GCP but want Sentry" → observability plugin (just OTLP) │ -│ "I'm multi-cloud, want Datadog" → observability plugin (just OTLP) │ -│ "I don't care, just give me traces" → observability plugin (just OTLP) │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - -### Feature Comparison - -| Feature | aws ❶ | google-cloud ❶ | azure ❷ | observability ❷ | cloudflare ❷ | vercel ❷ | -|---------|-------|----------------|---------|-----------------|--------------|----------| -| **Distributed Tracing** | ✅ X-Ray | ✅ Cloud Trace | ✅ App Insights | ✅ Any OTLP | ⚠️ 3rd party | ⚠️ 3rd party | -| **Structured Logging** | ✅ CloudWatch | ✅ Cloud Logging | ✅ App Insights | ⚠️ Via backend | ✅ Logpush | ⚠️ 3rd party | -| **Metrics** | ✅ CloudWatch | ✅ Cloud Monitoring | ✅ App Insights | ⚠️ Via backend | ⚠️ Workers Analytics | ⚠️ 3rd party | -| **Live Metrics** | ❌ | ❌ | ✅ Built-in | ❌ | ❌ | ❌ | -| **Application Map** | ❌ | ❌ | ✅ Built-in | ❌ | ❌ | ❌ | -| **Log-Trace Correlation** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| **Auto-Instrumentation** | ⚠️ Manual | ⚠️ Manual | ✅ Distro | ⚠️ Manual | ⚠️ AI Gateway | ⚠️ Manual | -| **Sentry Support** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ Native | ✅ OTLP | -| **Honeycomb Support** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ Native | ✅ OTLP | -| **Datadog Support** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ Native | ✅ OTLP | - -❶ = Already implemented -❷ = Proposed - -### Third-Party Backend Support - -| Backend | aws | google-cloud | azure | cloudflare | vercel | -|---------|-----|--------------|-------|------------|--------| -| **Sentry** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | -| **Honeycomb** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | -| **Datadog** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | -| **Grafana Cloud** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | -| **Axiom** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | - -### Implementation Approach - -| 
Plugin | Approach | SDK | Setup Complexity | -|--------|----------|-----|------------------| -| **aws** ❶ | Custom OTLP + SigV4 | opentelemetry-* | Medium | -| **google-cloud** ❶ | Custom OTLP | opentelemetry-* | Medium | -| **azure** ❷ | Official Distro | azure-monitor-opentelemetry | **Very Low** | -| **cloudflare** ❷ | Presets for 3rd party | opentelemetry-* | Low | -| **vercel** ❷ | Standard OTLP | opentelemetry-* | Low | - -### Azure Telemetry Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 10/10 | Microsoft official docs | -| **Python Support** | 10/10 | Official SDK with distro | -| **Setup Simplicity** | 10/10 | One-liner `configure_azure_monitor()` | -| **Feature Richness** | 9/10 | Live Metrics, App Map included | -| **Community Demand** | 9/10 | Enterprise Azure users | -| **Maintenance Burden** | 9/10 | Microsoft maintains SDK | -| **Strategic Value** | 9/10 | Pairs with microsoft-foundry plugin | -| **TOTAL** | **92/100** | ✅ **BUILD NOW** | - -### Cloudflare Telemetry Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 8/10 | Good Workers OTEL docs | -| **Python Support** | 6/10 | REST API, standard OTEL | -| **Setup Simplicity** | 7/10 | Dashboard config + code | -| **Feature Richness** | 8/10 | AI Gateway auto-traces | -| **Community Demand** | 7/10 | Growing edge users | -| **Maintenance Burden** | 8/10 | Standard OTEL patterns | -| **Strategic Value** | 8/10 | Pairs with cloudflare-ai | -| **TOTAL** | **75/100** | ⚠️ **CONSIDER** | - -### Observability Plugin Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 9/10 | Standard OTLP, well-documented | -| **Python Support** | 10/10 | Official opentelemetry-python | -| **Setup Simplicity** | 9/10 | One function call with preset | -| **Feature Coverage** | 8/10 | Traces + basic metrics | -| **Community Demand** | 9/10 | Common request for 3rd party | -| **Maintenance Burden** | 9/10 | Stable OTLP protocol | -| **Strategic Value** | 8/10 | Platform-agnostic option | -| **TOTAL** | **89/100** | ✅ **BUILD** | - -### Vercel Telemetry Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 6/10 | Node.js focused | -| **Python Support** | 5/10 | No Vercel-specific SDK | -| **Setup Simplicity** | 7/10 | Standard OTEL works | -| **Feature Richness** | 5/10 | No unique features | -| **Community Demand** | 6/10 | Python on Vercel growing | -| **Maintenance Burden** | 8/10 | Standard OTEL patterns | -| **Strategic Value** | 5/10 | No Vercel AI plugin needed | -| **TOTAL** | **55/100** | ⚠️ **IF DEMANDED** | - ---- - -## Part 3: Effort vs Impact Matrix - -``` - IMPACT - Low High - ┌───────────┬───────────┐ - Low │ │ azure │ ← Quick wins - │ vercel │cloudflare │ - EFFORT │ │ (both) │ - ├───────────┼───────────┤ - High │ │ │ - │ │ │ - │ │ │ - └───────────┴───────────┘ -``` - -### Effort Estimates - -| Plugin | Estimated Days | Dependencies | -|--------|---------------|--------------| -| **azure** | 5-7 days | azure-monitor-opentelemetry (official) | -| **observability** | 5-7 days | opentelemetry-* | -| **cloudflare-ai** | 10-15 days | httpx, pydantic | -| **cloudflare** (telemetry) | 5-7 days | opentelemetry-* | -| **vercel** | 3-5 days | opentelemetry-* | - -### Impact Factors - -| Plugin | New Users | Ecosystem Fit | Differentiation | -|--------|-----------|---------------|-----------------| -| **azure** | High (enterprise) | Pairs with 
microsoft-foundry | Good | -| **cloudflare-ai** | Medium (edge) | New market | High | -| **cloudflare** | Medium | Pairs with cloudflare-ai | Medium | -| **vercel** | Low | Standalone | Low | - ---- - -## Part 4: Dependencies & Risk Analysis - -### External Dependencies - -| Plugin | Key Dependencies | Risk Level | -|--------|-----------------|------------| -| **azure** | azure-monitor-opentelemetry | ✅ Low (Microsoft maintained) | -| **cloudflare-ai** | httpx | ✅ Low (stable library) | -| **cloudflare** | opentelemetry-* | ✅ Low (CNCF standard) | -| **vercel** | opentelemetry-* | ✅ Low (CNCF standard) | - -### API Stability Risk - -| Plugin | API Stability | Breaking Change Risk | -|--------|--------------|---------------------| -| **azure** | ✅ Stable | Low - versioned SDK | -| **cloudflare-ai** | ⚠️ Evolving | Medium - new models added | -| **cloudflare** | ✅ Stable | Low - standard OTEL | -| **vercel** | ✅ Stable | Low - standard OTEL | - -### Maintenance Burden - -| Plugin | Ongoing Maintenance | Reason | -|--------|-------------------|--------| -| **azure** | Low | Microsoft maintains SDK | -| **cloudflare-ai** | Medium | New models, config updates | -| **cloudflare** | Low | Standard OTEL patterns | -| **vercel** | Very Low | Just URL helpers | - ---- - -## Part 5: Final Recommendations - -### Priority Order - -| Priority | Plugin | Score | Action | Timeline | -|----------|--------|-------|--------|----------| -| **1** | azure (telemetry) | 92/100 | ✅ Build Now | 1 week | -| **2** | observability (3rd party) | 89/100 | ✅ Build Now | 1 week | -| **3** | cloudflare-ai (models) | 88/100 | ✅ Build Now | 2-3 weeks | -| **4** | cloudflare (telemetry) | 75/100 | ⚠️ Consider | 1 week | -| **5** | vercel (helpers) | 55/100 | ⚠️ If Demanded | 3-5 days | - -### Rationale - -**1. Azure Telemetry (Priority 1)** -- Official Microsoft SDK with one-liner setup -- Pairs naturally with existing `microsoft-foundry` plugin -- High enterprise demand -- Very low implementation effort - -**2. Observability Plugin (Priority 2)** -- Platform-agnostic third-party backend support -- One plugin for Sentry, Honeycomb, Datadog, Grafana, Axiom -- Common user request -- Uses stable OTLP protocol - -**3. Cloudflare AI (Priority 3)** -- Growing edge AI market -- 50+ models including latest Llama 4 -- Clear REST API -- Differentiates Genkit in edge computing space - -**4. Cloudflare Telemetry (Priority 4)** -- Pairs with cloudflare-ai plugin -- Good third-party backend support (via observability plugin) -- AI Gateway auto-traces are valuable -- Lower priority since observability plugin covers 3rd party - -**5. 
Vercel (Priority 5)** -- Python works on Vercel, but no unique features -- AI Gateway = just URL change -- Standard OTEL works fine -- Build only if users explicitly request - -### What NOT to Build - -| Plugin | Reason | -|--------|--------| -| Vercel AI SDK wrapper | JS/TS only, use existing plugins | -| Vercel OTEL package | Node.js only, standard OTEL works | -| Generic OTEL presets | Too generic, not Genkit-specific value | - ---- - -## Appendix: Complete Feature Matrix - -### AI/Model Capabilities - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ AI/MODEL FEATURE MATRIX │ -├──────────────────────┬─────────┬──────────┬─────────┬─────────────┬────────────┤ -│ Feature │ AWS │ Google │ Azure │ Cloudflare │ Vercel │ -│ │ Bedrock │ GenAI │ Foundry │ Workers AI │ AI Gateway │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼────────────┤ -│ Text Generation │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ proxy │ -│ Streaming │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ proxy │ -│ Tool Calling │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ proxy │ -│ Structured Output │ ✅ │ ✅ │ ✅ │ ⚠️ partial │ ✅ proxy │ -│ Embeddings │ ✅ │ ✅ │ ✅ │ ✅ │ ❌ │ -│ Image Generation │ ✅ │ ✅ │ ✅ │ ✅ │ ❌ │ -│ Image Understanding │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ proxy │ -│ Speech-to-Text │ ✅ │ ✅ │ ✅ │ ✅ │ ❌ │ -│ Text-to-Speech │ ✅ │ ✅ │ ✅ │ ❌ │ ❌ │ -│ Video Generation │ ❌ │ ✅ │ ❌ │ ❌ │ ❌ │ -│ Audio Generation │ ❌ │ ✅ │ ❌ │ ❌ │ ❌ │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼────────────┤ -│ Python SDK │ ✅ boto3│ ✅ google│ ✅ openai│ ❌ REST │ ✅ openai │ -│ Auth Method │ IAM/Key │ Key/ADC │ API Key │ API Token │ API Key │ -│ Regional │ ✅ │ ✅ │ ✅ │ ❌ Global │ ❌ Global │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼────────────┤ -│ STATUS │ ✅ DONE │ ✅ DONE │ ✅ DONE │ 📋 PLANNED │ ❌ SKIP │ -└──────────────────────┴─────────┴──────────┴─────────┴─────────────┴────────────┘ -``` - -### Telemetry Capabilities - -``` -┌──────────────────────────────────────────────────────────────────────────────────────────────┐ -│ TELEMETRY FEATURE MATRIX │ -├──────────────────────┬─────────┬──────────┬─────────┬─────────────┬─────────────┬───────────┤ -│ Feature │ AWS │ GCP │ Azure │ observ. 
│ Cloudflare │ Vercel │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼─────────────┼───────────┤ -│ Native Trace Backend │ ✅ X-Ray│ ✅ Trace │ ✅ Insght│ ❌ (3rd pty)│ ❌ (3rd pty)│ ❌ (3rd) │ -│ Distributed Tracing │ ✅ │ ✅ │ ✅ │ ✅ any OTLP │ ✅ export │ ✅ export │ -│ Structured Logging │ ✅ │ ✅ │ ✅ │ ⚠️ backend │ ✅ Logpush │ ⚠️ backend│ -│ Metrics │ ✅ │ ✅ │ ✅ │ ⚠️ backend │ ⚠️ basic │ ❌ │ -│ Live Metrics │ ❌ │ ❌ │ ✅ │ ❌ │ ❌ │ ❌ │ -│ Application Map │ ❌ │ ❌ │ ✅ │ ❌ │ ❌ │ ❌ │ -│ Log-Trace Correlation│ ✅ │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ │ -│ Auto-Instrumentation │ ⚠️ │ ⚠️ │ ✅ │ ⚠️ manual │ ⚠️ AI only │ ❌ │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼─────────────┼───────────┤ -│ Sentry Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -│ Honeycomb Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -│ Datadog Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -│ Grafana Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -│ Axiom Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼─────────────┼───────────┤ -│ Official Python SDK │ ❌ manual│ ❌ manual│ ✅ distro│ ✅ OTEL │ ❌ REST │ ❌ manual │ -│ Setup Complexity │ Medium │ Medium │ Very Low│ Very Low │ Low │ Low │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼─────────────┼───────────┤ -│ STATUS │ ✅ DONE │ ✅ DONE │ 📋 PLAN │ 📋 PLAN │ 📋 CONSIDER │ ⚠️ DEFER │ -└──────────────────────┴─────────┴──────────┴─────────┴─────────────┴─────────────┴───────────┘ -``` - ---- - -## Decision: What to Attack - -Based on this analysis: - -### ✅ Build Now (Q1 2026) - -1. **azure** - 92/100 score, 1 week effort, high enterprise value -2. **observability** - 89/100 score, 1 week effort, platform-agnostic 3rd party -3. **cloudflare-ai** - 88/100 score, 2-3 weeks effort, edge differentiator - -### ⚠️ Consider (Q2 2026) - -4. **cloudflare** (telemetry) - 75/100 score, pairs with cloudflare-ai - -### ⏸️ Defer - -5. **vercel** - 55/100 score, build only if explicitly requested diff --git a/py/engdoc/planning/README.md b/py/engdoc/planning/README.md deleted file mode 100644 index 34ae8141ef..0000000000 --- a/py/engdoc/planning/README.md +++ /dev/null @@ -1,121 +0,0 @@ -# Plugin Implementation Plans - -This directory contains detailed implementation plans for proposed Genkit plugins. - -## Summary Table - -| Plugin | Type | Feasibility | Effort | Priority | Status | -|--------|------|-------------|--------|----------|--------| -| **azure** | Telemetry | ✅ HIGH | 1 week | High | Ready | -| **observability** | Telemetry | ✅ HIGH | 1 week | High | Ready | -| **cloudflare-ai** | Model | ✅ HIGH | 2-3 weeks | High | Ready | -| **cloudflare** | Telemetry | ⚠️ MEDIUM-HIGH | 1-2 weeks | Medium | Consider | -| **vercel** | Combined | ⚠️ MEDIUM | 1 week | Low | If demanded | - -> **Note:** The `observability` plugin provides presets for Sentry, Honeycomb, Datadog, -> Grafana, and Axiom. It complements platform plugins (aws, google-cloud, azure) for -> users who prefer third-party backends. - -## Detailed Plans - -### Ready for Implementation - -1. **[azure-telemetry-plugin.md](./azure-telemetry-plugin.md)** - Azure Application Insights - - Official Microsoft OTEL distro - - One-liner setup with `configure_azure_monitor()` - - Live metrics, application map, log correlation - -2. 
**[observability-plugin.md](./observability-plugin.md)** - Third-Party Backends - - Presets for Sentry, Honeycomb, Datadog, Grafana, Axiom - - Platform-agnostic OTLP export - - One function call setup - -3. **[cloudflare-ai-plugin.md](./cloudflare-ai-plugin.md)** - Cloudflare Workers AI - - 50+ models at the edge (Llama, Mistral, Flux, etc.) - - Streaming, tool calling, embeddings - - REST API with simple auth - -### Consider Building - -4. **[cloudflare-telemetry-plugin.md](./cloudflare-telemetry-plugin.md)** - Cloudflare Telemetry - - No native backend, but exports to Sentry, Honeycomb, Datadog, etc. - - AI Gateway auto-exports AI traces to third-party backends - - Recommend: Plugin with presets for common backends - -5. **[vercel-plugins.md](./vercel-plugins.md)** - Vercel AI & Telemetry - - Python DOES work on Vercel (FastAPI, Flask) - - AI SDK and @vercel/otel are JS-only, but AI Gateway + standard OTEL work - - Recommend: Build simple helper plugin if user demand exists - -## Feasibility Criteria - -### ✅ HIGH Feasibility -- Official SDK/library available -- Clear API documentation -- Python support confirmed -- Similar patterns to existing plugins - -### ⚠️ MEDIUM Feasibility -- REST API available but no SDK -- Limited Python-specific documentation -- Workarounds required -- May have feature gaps - -### ❌ LOW Feasibility -- No Python support -- Platform-specific (JS/Node only) -- Would duplicate existing functionality -- Not worth maintenance overhead - -## Implementation Priority - -### Phase 1 (Immediate) -1. **azure** - Strong enterprise demand, official OTEL support -2. **observability** - Platform-agnostic, Sentry/Honeycomb/Datadog presets -3. **cloudflare-ai** - Growing edge AI market, good REST API - -### Phase 2 (Consider) -4. **cloudflare** (telemetry) - AI Gateway integration, pairs with cloudflare-ai - -### Phase 3 (If Demanded) -5. **vercel** - Simple helper plugin if users request it - -## Architecture Patterns - -All plugins should follow these patterns from existing implementations: - -### Model Plugins (like `amazon-bedrock`, `microsoft-foundry`) -``` -plugins/{name}/ -├── src/genkit/plugins/{name}/ -│ ├── __init__.py # ELI5 docs, exports -│ ├── typing.py # Config schemas per model -│ ├── models/ -│ │ ├── model.py # Base model implementation -│ │ └── {family}.py # Model-specific configs -│ └── embedders/ # If applicable -└── tests/ -``` - -### Telemetry Plugins (like `aws`, `google-cloud`) -``` -plugins/{name}/ -├── src/genkit/plugins/{name}/ -│ ├── __init__.py # ELI5 docs, exports -│ ├── telemetry/ -│ │ ├── __init__.py -│ │ └── tracing.py # Manager class -│ └── typing.py # Config schemas -└── tests/ -``` - -## Documentation Requirements - -All plugins must include: - -1. **ELI5 Concepts Table** - In module docstring -2. **Data Flow Diagram** - ASCII art showing architecture -3. **README.md** - Setup instructions, examples -4. **Sample Application** - In `samples/{name}-hello/` - -See [GEMINI.md](../../GEMINI.md) for full documentation requirements. 
diff --git a/py/engdoc/planning/azure-telemetry-plugin.md b/py/engdoc/planning/azure-telemetry-plugin.md deleted file mode 100644 index 748a5229c2..0000000000 --- a/py/engdoc/planning/azure-telemetry-plugin.md +++ /dev/null @@ -1,452 +0,0 @@ -# Azure Telemetry Plugin Implementation Plan - -**Status:** Ready for Implementation -**Feasibility:** ✅ HIGH -**Estimated Effort:** Medium (1-2 weeks) -**Dependencies:** `azure-monitor-opentelemetry`, `opentelemetry-sdk` - -## Overview - -The `azure` plugin exports Genkit telemetry to Azure Monitor (Application Insights), -providing distributed tracing, logging, and metrics for Azure-hosted applications. - -``` -┌─────────────────────────────────────────────────────────────────────────┐ -│ AZURE TELEMETRY PLUGIN ARCHITECTURE │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────┐ │ -│ │ Azure Monitor │ Microsoft's observability platform. See │ │ -│ │ │ traces, logs, metrics in Azure Portal. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Application Insights│ The part of Azure Monitor for apps. │ │ -│ │ │ Tracks requests, dependencies, exceptions. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Connection String │ Your key to send data. Found in Azure │ │ -│ │ │ Portal > App Insights > Connection String. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Azure Monitor Distro│ Microsoft's "batteries included" OTEL │ │ -│ │ │ package. One line to enable everything. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Live Metrics │ Real-time view of your app's health. │ │ -│ │ │ See requests as they happen! │ │ -│ └─────────────────────┴────────────────────────────────────────────┘ │ -│ │ -│ Data Flow: │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ Azure Monitor │────▶│ Application │ │ -│ │ (Your Code) │ │ OTEL Distro │ │ Insights │ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -│ │ │ │ │ -│ │ │ │ │ -│ │ ▼ ▼ │ -│ │ ┌─────────────────────────────────────────┐ │ -│ │ │ Azure Portal │ │ -│ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │ -│ │ │ │ Traces │ │ Logs │ │ Metrics │ │ │ -│ │ │ │ (E2E) │ │ (Query) │ │ (Charts)│ │ │ -│ │ │ └─────────┘ └─────────┘ └─────────┘ │ │ -│ │ └─────────────────────────────────────────┘ │ -│ │ │ -│ │ ┌─────────────────┐ │ -│ └───▶│ structlog │──── Logs with trace correlation │ -│ │ integration │ │ -│ └─────────────────┘ │ -└─────────────────────────────────────────────────────────────────────────┘ -``` - -## Azure Monitor OpenTelemetry Distro - -Microsoft provides an official "batteries included" package that handles everything: - -```bash -pip install azure-monitor-opentelemetry -``` - -**Version:** 1.8.5 (January 2026) -**Python Support:** 3.9 - 3.14 - -### What It Includes - -- **Azure Monitor exporters** - Send data to Application Insights -- **Auto-instrumentation** - HTTP, database, and framework libraries -- **Trace correlation** - Links traces across services -- **Live Metrics** - Real-time monitoring stream - -## Implementation - -### Core Plugin Class - -```python -"""Azure telemetry plugin for Genkit. - -Exports traces, logs, and metrics to Azure Monitor (Application Insights). 
- -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Azure Monitor │ Microsoft's observability platform. Like a │ - │ │ dashboard showing your app's vital signs. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Application Insights│ The part that tracks your app specifically. │ - │ │ Requests, errors, dependencies, performance. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Connection String │ Your unique key to send telemetry. Like an │ - │ │ address where your data should be delivered. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Live Metrics │ Real-time view of requests as they happen. │ - │ │ Like watching a live scoreboard. │ - └─────────────────────┴────────────────────────────────────────────────┘ - -Data Flow:: - - ┌─────────────────────────────────────────────────────────────────────┐ - │ HOW AZURE TELEMETRY WORKS │ - │ │ - │ Your Genkit App │ - │ │ │ - │ │ (1) Initialize AzureTelemetry │ - │ ▼ │ - │ ┌─────────────────┐ │ - │ │ AzureTelemetry │ Configures OTEL with Azure exporters │ - │ │ (Manager) │ │ - │ └────────┬────────┘ │ - │ │ │ - │ │ (2) Auto-instruments your code │ - │ ▼ │ - │ ┌─────────────────┐ ┌─────────────────┐ │ - │ │ TracerProvider │────▶│ AzureMonitor │ │ - │ │ (OTEL) │ │ Exporter │ │ - │ └─────────────────┘ └────────┬────────┘ │ - │ │ │ - │ ┌───────────────────────┼───────────────────────┐ │ - │ │ │ │ │ - │ ▼ ▼ ▼ │ - │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │ - │ │ Traces │ │ Logs │ │ Metrics │ │ - │ │ (Distributed) │ │ (Structured) │ │ (Counters) │ │ - │ └─────────────────┘ └─────────────────┘ └─────────────┘ │ - │ │ │ │ │ - │ └───────────────────────┼───────────────────────┘ │ - │ │ │ - │ ▼ │ - │ ┌─────────────────────┐ │ - │ │ Application │ │ - │ │ Insights Portal │ │ - │ └─────────────────────┘ │ - └─────────────────────────────────────────────────────────────────────┘ - -Example:: - - from genkit.ai import Genkit - from genkit.plugins.microsoft_foundry import AzureTelemetry - from genkit.plugins.microsoft_foundry import MicrosoftFoundry - - # Initialize Azure telemetry - AzureTelemetry().initialize() - - ai = Genkit( - plugins=[MicrosoftFoundry()], - model='microsoft-foundry/gpt-4o', - ) -""" - -import os -import logging -from typing import Any, MutableMapping, Mapping - -import structlog -from azure.monitor.opentelemetry import configure_azure_monitor -from opentelemetry import trace -from opentelemetry.sdk.trace import TracerProvider - -from genkit.core.logging import get_logger - - -logger = get_logger(__name__) - - -class AzureTelemetry: - """Azure Monitor telemetry manager for Genkit applications. - - This class provides a centralized way to configure Azure Application Insights - telemetry, including distributed tracing, structured logging, and metrics. - - Args: - connection_string: Application Insights connection string. - Falls back to APPLICATIONINSIGHTS_CONNECTION_STRING env var. - service_name: Name of your service (appears in traces). - service_version: Version of your service. - enable_live_metrics: Enable real-time metrics stream. - log_level: Minimum log level to export. 
- - Example: - >>> telemetry = AzureTelemetry(service_name="my-genkit-app") - >>> telemetry.initialize() - """ - - def __init__( - self, - connection_string: str | None = None, - service_name: str = "genkit-app", - service_version: str = "1.0.0", - enable_live_metrics: bool = True, - log_level: int = logging.INFO, - ): - self.connection_string = ( - connection_string - or os.environ.get('APPLICATIONINSIGHTS_CONNECTION_STRING') - ) - self.service_name = service_name - self.service_version = service_version - self.enable_live_metrics = enable_live_metrics - self.log_level = log_level - - if not self.connection_string: - raise ValueError( - "Connection string required. Set APPLICATIONINSIGHTS_CONNECTION_STRING " - "or pass connection_string parameter." - ) - - def initialize(self) -> None: - """Initialize Azure Monitor telemetry. - - This method: - 1. Configures the Azure Monitor OpenTelemetry distro - 2. Sets up structured logging with trace correlation - 3. Enables live metrics if configured - """ - # Configure Azure Monitor (one-liner!) - configure_azure_monitor( - connection_string=self.connection_string, - service_name=self.service_name, - service_version=self.service_version, - enable_live_metrics=self.enable_live_metrics, - logger_name="", # Capture all loggers - ) - - # Configure structlog for trace correlation - self._configure_logging() - - logger.info( - "Azure telemetry initialized", - service_name=self.service_name, - live_metrics=self.enable_live_metrics, - ) - - def _configure_logging(self) -> None: - """Configure structlog to include Azure trace context.""" - processors = list(structlog.get_config().get("processors", [])) - - # Check if already configured - if any( - getattr(p, '__name__', '') == 'inject_azure_trace_context' - for p in processors - ): - return - - def inject_azure_trace_context( - _logger: Any, - method_name: str, - event_dict: MutableMapping[str, Any], - ) -> Mapping[str, Any]: - """Inject Azure trace context into log events.""" - span = trace.get_current_span() - if span and span.is_recording(): - ctx = span.get_span_context() - # Azure uses operation_Id and operation_ParentId - event_dict['operation_Id'] = format(ctx.trace_id, '032x') - event_dict['operation_ParentId'] = format(ctx.span_id, '016x') - return event_dict - - new_processors = list(processors) - new_processors.insert(max(0, len(new_processors) - 1), inject_azure_trace_context) - structlog.configure(processors=new_processors) -``` - -### Directory Structure - -``` -py/plugins/microsoft-foundry/ -├── pyproject.toml -├── README.md -├── LICENSE -├── src/genkit/plugins/microsoft-foundry/ -│ ├── __init__.py # Plugin entry, ELI5 docs, exports -│ ├── telemetry/ -│ │ ├── __init__.py -│ │ └── tracing.py # AzureTelemetry class -│ ├── typing.py # Configuration schemas -│ └── py.typed -└── tests/ - ├── conftest.py - └── azure_telemetry_test.py -``` - -### pyproject.toml - -```toml -[project] -name = "genkit-azure-plugin" -version = "0.1.0" -description = "Azure Monitor telemetry plugin for Genkit" -requires-python = ">=3.10" -dependencies = [ - "genkit", - "azure-monitor-opentelemetry>=1.8.0", - "structlog>=24.0.0", -] - -[project.optional-dependencies] -dev = [ - "pytest>=8.0.0", - "pytest-asyncio>=0.24.0", -] -``` - -## Configuration Options - -### Connection String - -Get from Azure Portal: -1. Go to your Application Insights resource -2. Click "Overview" or "Properties" -3. 
Copy the "Connection String" - -Format: -``` -InstrumentationKey=xxx;IngestionEndpoint=https://xxx.in.applicationinsights.azure.com/;LiveEndpoint=https://xxx.livediagnostics.monitor.azure.com/;ApplicationId=xxx -``` - -### Environment Variables - -| Variable | Required | Description | -|----------|----------|-------------| -| `APPLICATIONINSIGHTS_CONNECTION_STRING` | Yes | App Insights connection string | -| `AZURE_SDK_TRACING_IMPLEMENTATION` | No | Set to "opentelemetry" for SDK tracing | - -## Features - -### 1. Distributed Tracing - -Automatic tracing for: -- HTTP requests (incoming and outgoing) -- Database calls (via auto-instrumentation) -- Genkit flows, models, and tools -- Cross-service correlation - -### 2. Structured Logging - -Logs automatically include: -- `operation_Id` - Links logs to traces -- `operation_ParentId` - Parent span context -- Custom properties from structlog - -### 3. Live Metrics - -Real-time stream showing: -- Request rate -- Failure rate -- Response time -- Server health - -### 4. Application Map - -Visual diagram of: -- Service dependencies -- Call flows -- Performance bottlenecks - -## Sample Application - -```python -# py/samples/provider-microsoft-foundry-hello/src/main.py -"""Azure telemetry hello sample - Monitor Genkit with Application Insights. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Application Insights│ Microsoft's app monitoring. See traces, logs, │ - │ │ and metrics in Azure Portal. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Connection String │ Your key to send data. Find it in Azure │ - │ │ Portal > App Insights > Properties. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Live Metrics │ Real-time view of requests as they happen. │ - │ │ Great for debugging production issues! │ - └─────────────────────┴────────────────────────────────────────────────┘ -""" - -from genkit.ai import Genkit -from genkit.plugins.microsoft_foundry import AzureTelemetry -from genkit.plugins.microsoft_foundry import MicrosoftFoundry - -# Initialize Azure telemetry FIRST -AzureTelemetry( - service_name="microsoft-foundry-hello-sample", - enable_live_metrics=True, -).initialize() - -ai = Genkit( - plugins=[MicrosoftFoundry()], - model='microsoft-foundry/gpt-4o', -) - -@ai.flow() -async def say_hi(name: str) -> str: - """Say hello - traced in Application Insights.""" - response = await ai.generate(prompt=f"Say hi to {name}!") - return response.text -``` - -## Comparison with AWS/GCP Telemetry - -| Feature | AWS (`aws`) | GCP (`google-cloud`) | Azure (`azure`) | -|---------|-------------|---------------------|-----------------| -| Native Backend | X-Ray | Cloud Trace | Application Insights | -| OTEL Distro | Manual setup | Manual setup | ✅ Official distro | -| One-liner Setup | ❌ | ❌ | ✅ `configure_azure_monitor()` | -| Live Metrics | ❌ | ❌ | ✅ Built-in | -| Application Map | ❌ | ❌ | ✅ Built-in | -| Log Correlation | ✅ | ✅ | ✅ | -| Auto-instrumentation | Manual | Manual | ✅ Automatic | - -## Implementation Phases - -### Phase 1: Core Telemetry (3-4 days) - -1. Plugin skeleton with `AzureTelemetry` class -2. Integration with `azure-monitor-opentelemetry` -3. Structlog trace correlation -4. Basic tests - -### Phase 2: Sample & Docs (2-3 days) - -1. `microsoft-foundry-hello` sample application -2. README with setup instructions -3. 
Integration with `microsoft-foundry` plugin - -### Phase 3: Advanced Features (Optional) - -1. Custom metrics support -2. Exception tracking -3. Availability tests integration - -## Risks and Mitigations - -| Risk | Impact | Mitigation | -|------|--------|------------| -| Connection string exposure | High | Document secure storage practices | -| High telemetry volume | Medium | Configure sampling | -| SDK version conflicts | Low | Pin compatible versions | - -## References - -- [Azure Monitor OpenTelemetry Distro](https://learn.microsoft.com/en-us/python/api/overview/azure/monitor-opentelemetry-readme) -- [Configure Azure Monitor OpenTelemetry](https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-configuration) -- [Application Insights Overview](https://learn.microsoft.com/en-us/azure/azure-monitor/app/app-insights-overview) -- [PyPI Package](https://pypi.org/project/azure-monitor-opentelemetry/) diff --git a/py/engdoc/planning/cloudflare-ai-plugin.md b/py/engdoc/planning/cloudflare-ai-plugin.md deleted file mode 100644 index cb9e4d64ea..0000000000 --- a/py/engdoc/planning/cloudflare-ai-plugin.md +++ /dev/null @@ -1,376 +0,0 @@ -# Cloudflare Workers AI Plugin Implementation Plan - -**Status:** Ready for Implementation -**Feasibility:** ✅ HIGH -**Estimated Effort:** Medium (2-3 weeks) -**Dependencies:** `httpx`, `pydantic` - -## Overview - -The `cloudflare-ai` plugin provides access to Cloudflare Workers AI, enabling Genkit -applications to use 50+ open-source AI models running at the edge across 200+ data centers. - -``` -┌─────────────────────────────────────────────────────────────────────────┐ -│ CLOUDFLARE WORKERS AI PLUGIN ARCHITECTURE │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────┐ │ -│ │ Workers AI │ Cloudflare's AI at the edge. Models run │ │ -│ │ │ close to users (200+ data centers). │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Account ID │ Your Cloudflare account identifier. │ │ -│ │ │ Found in dashboard URL or API settings. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ API Token │ Auth token with Workers AI permissions. │ │ -│ │ │ Create at dash.cloudflare.com/profile/api │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ @cf/ Models │ Model names start with @cf/ prefix. │ │ -│ │ │ @cf/meta/llama-3.1-8b-instruct │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Edge Computing │ Processing close to users. Lower latency │ │ -│ │ │ than centralized cloud data centers. 
│ │ -│ └─────────────────────┴────────────────────────────────────────────┘ │ -│ │ -│ Data Flow: │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ CF Workers AI │────▶│ Edge Location │ │ -│ │ (Your Code) │ │ REST API │ │ (Nearest DC) │ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -│ │ │ │ -│ │ ┌─────────────────────────────────────┘ │ -│ │ │ │ -│ │ ▼ │ -│ │ ┌─────────────────┐ │ -│ │ │ AI Models │ │ -│ │ │ • Llama 3/4 │ │ -│ │ │ • Mistral │ │ -│ │ │ • Flux (Image) │ │ -│ │ │ • Whisper │ │ -│ │ └─────────────────┘ │ -└─────────────────────────────────────────────────────────────────────────┘ -``` - -## API Details - -### Authentication - -```python -# Environment variables -CLOUDFLARE_ACCOUNT_ID=your_account_id -CLOUDFLARE_API_TOKEN=your_api_token -``` - -### Base URL Pattern - -``` -https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL} -``` - -### Request Format - -```python -# Text Generation -{ - "messages": [ - {"role": "system", "content": "You are a helpful assistant"}, - {"role": "user", "content": "Hello!"} - ], - "stream": true, # Optional: Enable SSE streaming - "max_tokens": 256, - "temperature": 0.6 -} - -# Image Generation -{ - "prompt": "A sunset over mountains", - "num_steps": 20, - "guidance": 7.5 -} - -# Embeddings -{ - "text": ["Hello world", "Goodbye world"] -} -``` - -### Response Format - -```python -# Non-streaming -{ - "result": { - "response": "Hello! How can I help you today?" - }, - "success": true, - "errors": [], - "messages": [] -} - -# Streaming (SSE) -data: {"response": "Hello"} -data: {"response": "!"} -data: {"response": " How"} -data: [DONE] -``` - -## Model Support Matrix - -| Model | Type | Streaming | Tools | Status | -|-------|------|-----------|-------|--------| -| `@cf/meta/llama-3.3-70b-instruct-fp8-fast` | Text | ✅ | ✅ | Priority | -| `@cf/meta/llama-3.1-8b-instruct-fast` | Text | ✅ | ✅ | Priority | -| `@cf/meta/llama-4-scout-17b-16e-instruct` | Multimodal | ✅ | ✅ | Priority | -| `@cf/mistral/mistral-7b-instruct-v0.2` | Text | ✅ | ❌ | Phase 1 | -| `@cf/qwen/qwen1.5-14b-chat-awq` | Text | ✅ | ❌ | Phase 2 | -| `@cf/black-forest-labs/flux-2-klein-9b` | Image | ❌ | ❌ | Phase 2 | -| `@cf/stabilityai/stable-diffusion-xl-base-1.0` | Image | ❌ | ❌ | Phase 2 | -| `@cf/openai/whisper` | Speech→Text | ❌ | ❌ | Phase 3 | -| `@cf/baai/bge-base-en-v1.5` | Embedding | ❌ | ❌ | Phase 1 | -| `@cf/baai/bge-large-en-v1.5` | Embedding | ❌ | ❌ | Phase 1 | - -## Model Configuration Parameters - -### Text Generation (Llama/Mistral) - -```python -class CloudflareLlamaConfig(BaseModel): - """Configuration for Llama models on Workers AI.""" - - # Core parameters - temperature: float | None = Field(default=0.6, ge=0.0, le=5.0) - max_tokens: int | None = Field(default=256, ge=1, le=4096) - top_p: float | None = Field(default=0.9, ge=0.0, le=1.0) - top_k: int | None = Field(default=40, ge=1, le=100) - - # Repetition control - repetition_penalty: float | None = Field(default=1.0, ge=0.0, le=2.0) - presence_penalty: float | None = None # Llama 3.1+ - frequency_penalty: float | None = None # Llama 3.1+ - - # Output control - seed: int | None = None # For reproducibility - raw: bool | None = None # Return raw tokens -``` - -### Image Generation (Flux/Stable Diffusion) - -```python -class CloudflareImageConfig(BaseModel): - """Configuration for image generation models.""" - - num_steps: int | None = Field(default=20, ge=1, le=50) - guidance: float | None = Field(default=7.5, ge=0.0, le=20.0) - strength: 
float | None = Field(default=1.0, ge=0.0, le=1.0) # For img2img - width: int | None = Field(default=1024) - height: int | None = Field(default=1024) - seed: int | None = None -``` - -## Directory Structure - -``` -py/plugins/cloudflare-ai/ -├── pyproject.toml -├── README.md -├── LICENSE -├── src/genkit/plugins/cloudflare_ai/ -│ ├── __init__.py # Plugin entry, ELI5 docs, exports -│ ├── typing.py # All Pydantic config schemas -│ ├── constants.py # Model names, URLs, defaults -│ ├── models/ -│ │ ├── __init__.py -│ │ ├── model.py # CloudflareModel base implementation -│ │ ├── text.py # Text generation (Llama, Mistral, Qwen) -│ │ ├── image.py # Image generation (Flux, SD) -│ │ ├── speech.py # Speech-to-text (Whisper) -│ │ └── utils.py # Response parsing, error handling -│ ├── embedders/ -│ │ ├── __init__.py -│ │ └── embedder.py # BGE embeddings implementation -│ └── py.typed -└── tests/ - ├── conftest.py - ├── cloudflare_model_test.py - ├── cloudflare_embedder_test.py - └── integration_test.py -``` - -## Implementation Phases - -### Phase 1: Core Plugin (Week 1) - -1. **Plugin skeleton** - - `CloudflareAI` plugin class - - Authentication handling (API token, Account ID) - - HTTP client setup with `httpx` - -2. **Text generation models** - - Llama 3.1/3.3 support - - Streaming via SSE - - Tool/function calling - -3. **Embeddings** - - BGE embedder implementation - - Batch embedding support - -### Phase 2: Extended Models (Week 2) - -1. **Image generation** - - Flux models - - Stable Diffusion XL - - Base64 image handling - -2. **Additional text models** - - Mistral family - - Qwen models - -3. **Model configuration** - - Full parameter support per model family - - DevUI integration - -### Phase 3: Advanced Features (Week 3) - -1. **Speech models** - - Whisper integration - - Audio input handling - -2. **Multimodal** - - Llama 4 Scout vision support - - Image + text inputs - -3. **Sample application** - - `cloudflare-workers-ai-hello` sample - - README with setup instructions - -## Key Implementation Details - -### Plugin Class - -```python -class CloudflareAI(Plugin): - """Cloudflare Workers AI plugin for Genkit. - - Provides access to 50+ AI models running at the edge. - - Example: - >>> from genkit.ai import Genkit - >>> from genkit.plugins.cloudflare_ai import CloudflareAI, cloudflare_model - >>> - >>> ai = Genkit( - ... plugins=[CloudflareAI()], - ... model=cloudflare_model("@cf/meta/llama-3.1-8b-instruct-fast"), - ... 
) - """ - - def __init__( - self, - account_id: str | None = None, - api_token: str | None = None, - models: list[str] | None = None, # Subset of models to register - ): - self.account_id = account_id or os.environ.get('CLOUDFLARE_ACCOUNT_ID') - self.api_token = api_token or os.environ.get('CLOUDFLARE_API_TOKEN') - - if not self.account_id: - raise ValueError("CLOUDFLARE_ACCOUNT_ID required") - if not self.api_token: - raise ValueError("CLOUDFLARE_API_TOKEN required") -``` - -### Streaming Implementation - -```python -async def _generate_stream( - self, - model: str, - messages: list[dict], - config: CloudflareLlamaConfig, -) -> AsyncIterator[GenerateResponseChunk]: - """Generate streaming response using SSE.""" - - url = f"{BASE_URL.format(account_id=self.account_id)}/{model}" - - async with httpx.AsyncClient() as client: - async with client.stream( - "POST", - url, - headers={ - "Authorization": f"Bearer {self.api_token}", - "Content-Type": "application/json", - }, - json={ - "messages": messages, - "stream": True, - **config.model_dump(exclude_none=True), - }, - ) as response: - async for line in response.aiter_lines(): - if line.startswith("data: "): - data = line[6:] - if data == "[DONE]": - break - chunk = json.loads(data) - yield GenerateResponseChunk( - content=[TextPart(text=chunk.get("response", ""))], - ) -``` - -## Testing Strategy - -1. **Unit tests** - Mock HTTP responses, test config validation -2. **Integration tests** - Live API calls (requires credentials) -3. **Model-specific tests** - Verify each model family works correctly - -## Environment Variables - -| Variable | Required | Description | -|----------|----------|-------------| -| `CLOUDFLARE_ACCOUNT_ID` | Yes | Your Cloudflare account ID | -| `CLOUDFLARE_API_TOKEN` | Yes | API token with Workers AI permissions | - -## Sample Application - -```python -# py/samples/provider-cloudflare-workers-ai-hello/src/main.py -"""Cloudflare Workers AI hello sample - Edge AI with Genkit.""" - -from genkit.ai import Genkit -from genkit.plugins.cloudflare_ai import CloudflareAI, cloudflare_model - -ai = Genkit( - plugins=[CloudflareAI()], - model=cloudflare_model("@cf/meta/llama-3.1-8b-instruct-fast"), -) - -@ai.flow() -async def say_hi(name: str) -> str: - """Say hello using Llama at the edge.""" - response = await ai.generate(prompt=f"Say hi to {name} in a friendly way!") - return response.text - -@ai.flow() -async def generate_image(prompt: str) -> str: - """Generate an image using Flux.""" - response = await ai.generate( - model=cloudflare_model("@cf/black-forest-labs/flux-2-klein-9b"), - prompt=prompt, - ) - return response.media[0].url # Base64 data URL -``` - -## Risks and Mitigations - -| Risk | Impact | Mitigation | -|------|--------|------------| -| Rate limiting | Medium | Implement exponential backoff | -| Model availability | Low | Graceful fallback to alternative models | -| SSE parsing edge cases | Medium | Comprehensive error handling | -| Tool calling variations | Medium | Test with multiple model families | - -## References - -- [Workers AI Documentation](https://developers.cloudflare.com/workers-ai/) -- [Workers AI Models Catalog](https://developers.cloudflare.com/workers-ai/models/) -- [REST API Guide](https://developers.cloudflare.com/workers-ai/get-started/rest-api) -- [Cloudflare API Reference](https://developers.cloudflare.com/api/resources/ai/) diff --git a/py/engdoc/planning/cloudflare-telemetry-plugin.md b/py/engdoc/planning/cloudflare-telemetry-plugin.md deleted file mode 100644 index 
3d533ad919..0000000000 --- a/py/engdoc/planning/cloudflare-telemetry-plugin.md +++ /dev/null @@ -1,339 +0,0 @@ -# Cloudflare Telemetry Plugin Implementation Plan - -**Status:** Research Complete - Limited Native Support -**Feasibility:** ⚠️ MEDIUM (requires workarounds) -**Estimated Effort:** Low-Medium (1-2 weeks) -**Dependencies:** `httpx`, `opentelemetry-sdk` - -## Overview - -Cloudflare does not have a native tracing backend like AWS X-Ray or GCP Cloud Trace. -However, they support **exporting OpenTelemetry data** to third-party observability platforms -and have recently adopted OpenTelemetry internally for their logging pipeline. - -``` -┌─────────────────────────────────────────────────────────────────────────┐ -│ CLOUDFLARE TELEMETRY OPTIONS ARCHITECTURE │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────┐ │ -│ │ Logpush │ Exports logs to external services. Like a │ │ -│ │ │ pipe sending data to S3, Datadog, etc. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Workers Analytics │ Built-in metrics for Workers. Basic │ │ -│ │ │ request counts, CPU time, errors. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ AI Gateway OTEL │ Auto-exports AI request traces! Includes │ │ -│ │ │ model, tokens, cost, latency. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Workers OTEL Export │ Export traces from Workers to Honeycomb, │ │ -│ │ │ Grafana, Axiom, Datadog, etc. │ │ -│ └─────────────────────┴────────────────────────────────────────────┘ │ -│ │ -│ OPTION A: AI Gateway Integration (Recommended for AI apps) │ -│ ────────────────────────────────────────────────────────── │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ CF AI Gateway │────▶│ Workers AI │ │ -│ │ (Your Code) │ │ (Proxy) │ │ or OpenAI etc │ │ -│ └─────────────────┘ └────────┬────────┘ └─────────────────┘ │ -│ │ │ -│ Auto-export OTEL traces │ -│ │ │ -│ ▼ │ -│ ┌─────────────────┐ │ -│ │ Your OTEL │ │ -│ │ Backend │ │ -│ │ (Honeycomb, │ │ -│ │ Grafana, etc) │ │ -│ └─────────────────┘ │ -│ │ -│ OPTION B: Direct OTLP Export (For non-Workers apps) │ -│ ─────────────────────────────────────────────────── │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ OTLP Exporter │────▶│ Any OTEL │ │ -│ │ (Your Code) │ │ (Standard) │ │ Backend │ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -└─────────────────────────────────────────────────────────────────────────┘ -``` - -## Cloudflare's OTEL Support - -### Supported Third-Party Backends - -Cloudflare Workers and AI Gateway support exporting OTEL data to: - -| Provider | Traces | Logs | Notes | -|----------|--------|------|-------| -| **Sentry** | ✅ | ✅ | Error tracking + traces | -| **Honeycomb** | ✅ | ✅ | Full OTEL support | -| **Grafana Cloud** | ✅ | ✅ | Tempo + Loki | -| **Axiom** | ✅ | ✅ | Log + trace ingestion | -| **Datadog** | ✅ | ✅ | Full APM integration | - -### 1. 
AI Gateway OpenTelemetry (Best for AI Apps) - -Cloudflare AI Gateway automatically exports traces with: - -- **Model information** - Which model was called -- **Token usage** - Input/output tokens -- **Cost estimates** - Approximate cost per request -- **Prompts & completions** - Full request/response content -- **Latency metrics** - Time to first token, total time - -Configuration via dashboard or API: -```json -{ - "otel": { - "endpoint": "https://your-sentry-endpoint/v1/traces", - "headers": { - "Authorization": "Bearer your-sentry-dsn" - } - } -} -``` - -Or for Honeycomb: -```json -{ - "otel": { - "endpoint": "https://api.honeycomb.io/v1/traces", - "headers": { - "x-honeycomb-team": "your-api-key" - } - } -} -``` - -### 2. Workers OTEL Export - -For Workers-deployed apps, traces can be exported to any of the supported backends. -Configure in `wrangler.toml` or via the Cloudflare dashboard. - -### 3. Logpush - -Exports logs (not traces) to: -- AWS S3 -- Google Cloud Storage -- Azure Blob Storage -- Elastic -- Datadog -- Splunk -- Sentry -- And more... - -## Implementation Options - -### Option A: AI Gateway Proxy Plugin (Recommended) - -Route AI requests through Cloudflare AI Gateway to get automatic telemetry. - -```python -class CloudflareAIGateway(Plugin): - """Route AI requests through Cloudflare AI Gateway for telemetry. - - Works with ANY model provider (OpenAI, Anthropic, etc.) while adding: - - Automatic tracing to your OTEL backend - - Request caching - - Rate limiting - - Cost tracking - """ - - def __init__( - self, - gateway_id: str, # Your AI Gateway ID - account_id: str | None = None, - # The underlying provider (OpenAI, Anthropic, etc.) - provider: Literal["openai", "anthropic", "workers-ai"] = "openai", - ): - self.base_url = f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}" -``` - -**Pros:** -- Automatic OTEL traces for all AI requests -- Works with any model provider -- Includes cost tracking, caching, rate limiting -- No code changes needed for telemetry - -**Cons:** -- Adds network hop (slight latency) -- Only covers AI requests, not all app traces -- Requires AI Gateway setup in CF dashboard - -### Option B: Generic OTLP Export (Standard Approach) - -Use standard OpenTelemetry exporters to any backend. - -```python -class CloudflareTelemetry(Plugin): - """Export Genkit telemetry to any OTEL-compatible backend. - - This is a thin wrapper around standard OTLP export, but with - preset configurations for popular Cloudflare-compatible backends. - """ - - def __init__( - self, - backend: Literal["honeycomb", "grafana", "axiom", "datadog", "custom"], - endpoint: str | None = None, - api_key: str | None = None, - headers: dict[str, str] | None = None, - ): - ... -``` - -**Pros:** -- Works for any Python app (not just CF Workers) -- Full control over what's traced -- Standard OTEL approach - -**Cons:** -- No Cloudflare-specific features -- Basically same as any OTLP export -- Less "Cloudflare native" - -### Option C: Hybrid (Both) - -Combine AI Gateway for AI telemetry + standard OTEL for app telemetry. - -## Recommended Implementation - -Given Cloudflare's limited native tracing, I recommend **Option A (AI Gateway)** as the -primary implementation with a simple helper for configuration. 
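For contrast, Option B boils down to standard OTLP export with the stock `opentelemetry-*` packages. A minimal sketch of that path, reusing the Honeycomb endpoint and header shown in the AI Gateway config above (the service name and `HONEYCOMB_API_KEY` env var are illustrative placeholders, not part of any committed API):

```python
# Option B sketch: plain OTLP/HTTP export of spans to a third-party backend.
# Honeycomb shown; endpoint/header mirror the AI Gateway example above.
import os

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

provider = TracerProvider(resource=Resource.create({"service.name": "genkit-app"}))
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(
            endpoint="https://api.honeycomb.io/v1/traces",
            # Illustrative env var; any secret-management approach works.
            headers={"x-honeycomb-team": os.environ["HONEYCOMB_API_KEY"]},
        )
    )
)
# Register globally; spans recorded through the global tracer provider are
# then batched and exported to the configured backend.
trace.set_tracer_provider(provider)
```

A preset-style `CloudflareTelemetry` wrapper (Option B above) would essentially hard-code these endpoint/header pairs for each supported backend.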
- -### Directory Structure - -``` -py/plugins/cloudflare/ -├── pyproject.toml -├── README.md -├── LICENSE -├── src/genkit/plugins/cloudflare/ -│ ├── __init__.py # Plugin entry, ELI5 docs -│ ├── ai_gateway.py # AI Gateway proxy configuration -│ ├── typing.py # Configuration schemas -│ └── py.typed -└── tests/ - ├── conftest.py - └── ai_gateway_test.py -``` - -### Implementation - -```python -# __init__.py -"""Cloudflare plugin for Genkit - AI Gateway integration. - -This plugin configures Genkit to route AI requests through Cloudflare's -AI Gateway, which provides automatic OpenTelemetry trace export. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ AI Gateway │ A proxy that sits between you and AI models. │ - │ │ Adds caching, rate limits, and tracing. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Gateway ID │ Your gateway's unique name. Create in the │ - │ │ Cloudflare dashboard under AI > AI Gateway. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Automatic OTEL │ AI Gateway exports traces automatically. │ - │ │ Configure destination in CF dashboard. │ - └─────────────────────┴────────────────────────────────────────────────┘ - -Example:: - - from genkit.ai import Genkit - from genkit.plugins.cloudflare import configure_ai_gateway - from genkit.plugins.compat_oai import OpenAI - - # Route OpenAI requests through AI Gateway - ai = Genkit( - plugins=[ - OpenAI( - base_url=configure_ai_gateway( - account_id="your-account-id", - gateway_id="your-gateway-id", - provider="openai", - ), - ), - ], - ) -""" - -def configure_ai_gateway( - account_id: str | None = None, - gateway_id: str | None = None, - provider: str = "openai", -) -> str: - """Get the AI Gateway base URL for a provider. - - Args: - account_id: Cloudflare account ID (or CLOUDFLARE_ACCOUNT_ID env var) - gateway_id: AI Gateway ID (or CLOUDFLARE_GATEWAY_ID env var) - provider: Provider name ("openai", "anthropic", "workers-ai", etc.) - - Returns: - Base URL to use with the provider's SDK/plugin - """ - account_id = account_id or os.environ.get('CLOUDFLARE_ACCOUNT_ID') - gateway_id = gateway_id or os.environ.get('CLOUDFLARE_GATEWAY_ID') - - if not account_id or not gateway_id: - raise ValueError( - "CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_GATEWAY_ID required" - ) - - return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}" -``` - -## Alternative: Don't Create a Separate Plugin - -Given the limited scope, this functionality could also be: - -1. **Documented in cloudflare-ai plugin** - Show how to use AI Gateway with Workers AI -2. **Added as a utility function** - Simple helper in `genkit.plugins.cloudflare_ai` -3. **Left to user configuration** - Just document how to set `base_url` - -## Environment Variables - -| Variable | Required | Description | -|----------|----------|-------------| -| `CLOUDFLARE_ACCOUNT_ID` | Yes | Your Cloudflare account ID | -| `CLOUDFLARE_GATEWAY_ID` | Yes (for AI Gateway) | Your AI Gateway ID | - -## Comparison with Other Telemetry Plugins - -| Feature | AWS (`aws`) | GCP (`google-cloud`) | Cloudflare | -|---------|-------------|---------------------|------------| -| Native Tracing Backend | ✅ X-Ray | ✅ Cloud Trace | ❌ None (use 3rd party) | -| Third-Party Export | ✅ | ✅ | ✅ Sentry, Honeycomb, etc. 
| -| OTLP Export | ✅ | ✅ | ✅ (Workers + AI Gateway) | -| Log Correlation | ✅ | ✅ | ✅ Logpush to many backends | -| Metrics | ✅ CloudWatch | ✅ Cloud Monitoring | ⚠️ Workers Analytics | -| Auto-instrumentation | ✅ | ✅ | ✅ AI Gateway auto-traces | -| Python SDK | ✅ Official | ✅ Official | ❌ REST API only | - -## Feasibility Assessment - -**Feasibility: ⚠️ MEDIUM-HIGH** - -**Reasons:** -1. No native Cloudflare tracing backend, BUT excellent third-party support -2. AI Gateway auto-exports traces to Sentry, Honeycomb, Datadog, etc. -3. Workers OTEL export supports all major observability platforms -4. Implementation would provide presets for common backends - -**Recommendation:** -- **Consider implementing** a `cloudflare` telemetry plugin that: - - Provides presets for Sentry, Honeycomb, Datadog, Grafana, Axiom - - Helps configure AI Gateway OTEL export - - Documents the integration patterns -- Could be combined with `cloudflare-ai` plugin or kept separate - -## References - -- [AI Gateway OTEL Integration](https://developers.cloudflare.com/ai-gateway/observability/otel-integration/) -- [Workers OTEL Export](https://developers.cloudflare.com/workers/observability/exporting-opentelemetry-data/) -- [Logpush Documentation](https://developers.cloudflare.com/logs/logpush/) -- [Workers Analytics](https://developers.cloudflare.com/workers/observability/) diff --git a/py/engdoc/planning/observability-plugin.md b/py/engdoc/planning/observability-plugin.md deleted file mode 100644 index 0856922bb6..0000000000 --- a/py/engdoc/planning/observability-plugin.md +++ /dev/null @@ -1,453 +0,0 @@ -# Observability Plugin Implementation Plan - -**Status:** Ready for Implementation -**Feasibility:** ✅ HIGH -**Estimated Effort:** 1 week -**Dependencies:** `opentelemetry-sdk`, `opentelemetry-exporter-otlp-proto-http` - -## Overview - -The `observability` plugin provides a unified way to export Genkit telemetry to any -OTLP-compatible backend (Sentry, Honeycomb, Datadog, Grafana, Axiom, etc.) with -simple presets for popular services. - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ OBSERVABILITY PLUGIN ARCHITECTURE │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────────────┐ │ -│ │ OTLP │ OpenTelemetry Protocol. The universal language │ │ -│ │ │ for sending traces. Sentry, Honeycomb, all speak it.│ │ -│ ├─────────────────────┼────────────────────────────────────────────────────┤ │ -│ │ Backend Preset │ Pre-configured settings for a service. Just add │ │ -│ │ │ your API key and you're done! │ │ -│ ├─────────────────────┼────────────────────────────────────────────────────┤ │ -│ │ Sentry │ Error tracking + tracing. Great for debugging │ │ -│ │ │ crashes and performance issues. │ │ -│ ├─────────────────────┼────────────────────────────────────────────────────┤ │ -│ │ Honeycomb │ Observability platform built for debugging. │ │ -│ │ │ Query your traces like a database. │ │ -│ ├─────────────────────┼────────────────────────────────────────────────────┤ │ -│ │ Datadog │ Full-stack monitoring. Traces, metrics, logs, │ │ -│ │ │ all in one place. │ │ -│ └─────────────────────┴────────────────────────────────────────────────────┘ │ -│ │ -│ Data Flow: │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ OTLP Exporter │────▶│ Your Backend │ │ -│ │ (Your Code) │ │ (HTTP/gRPC) │ │ (Sentry, etc.) 
│ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -│ │ -│ Supported Backends: │ -│ ┌────────────┬────────────┬────────────┬────────────┬────────────┐ │ -│ │ Sentry │ Honeycomb │ Datadog │ Grafana │ Axiom │ │ -│ │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ │ │ -│ └────────────┴────────────┴────────────┴────────────┴────────────┘ │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - -## When to Use This Plugin - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ WHEN TO USE WHAT │ -├─────────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ "I'm on AWS and want X-Ray" → aws plugin (SigV4, X-Ray format) │ -│ "I'm on GCP and want Cloud Trace" → google-cloud plugin (ADC) │ -│ "I'm on Azure and want App Insights" → azure plugin (Live Metrics, Map) │ -│ │ -│ "I'm on AWS but want Honeycomb" → observability plugin ← THIS ONE │ -│ "I'm on GCP but want Sentry" → observability plugin ← THIS ONE │ -│ "I'm multi-cloud, want Datadog" → observability plugin ← THIS ONE │ -│ "I don't care, just give me traces" → observability plugin ← THIS ONE │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - -## Why Not Just Use Platform Plugins? - -Platform plugins (aws, google-cloud, azure) provide: -- Platform-specific authentication (SigV4, ADC) -- Native backend features (Live Metrics, X-Ray Service Map) -- Platform log correlation (CloudWatch, Cloud Logging) - -**These CAN'T be replicated with generic OTLP.** - -The observability plugin is for users who: -- Don't want platform-native tools -- Need the same backend across multiple clouds -- Prefer third-party tools like Sentry or Honeycomb -- Want simple setup without platform-specific auth - -## Supported Backends - -| Backend | Endpoint | Auth | Features | -|---------|----------|------|----------| -| **Sentry** | `https://{org}.ingest.sentry.io/api/{project}/envelope/` | DSN | Error tracking, performance | -| **Honeycomb** | `https://api.honeycomb.io/v1/traces` | API Key | Query-based debugging | -| **Datadog** | `https://trace.agent.datadoghq.com/v0.4/traces` | API Key | Full-stack APM | -| **Grafana Cloud** | `https://{stack}.grafana.net/otlp` | API Key | Tempo traces | -| **Axiom** | `https://api.axiom.co/v1/traces` | API Token | Log + trace ingestion | -| **Custom** | Any OTLP endpoint | Headers | Bring your own | - -## API Design - -```python -"""Observability plugin for Genkit - Third-party telemetry backends. - -This plugin provides simple presets for popular observability platforms, -all using standard OpenTelemetry Protocol (OTLP) export. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ OTLP │ The universal language for traces. Like USB but │ - │ │ for observability data. │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Backend │ Where your traces go. Sentry, Honeycomb, etc. │ - │ │ Pick one, add your API key, done! │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Preset │ Pre-configured settings for a backend. Knows │ - │ │ the right URLs, headers, and formats. │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Span │ A single operation's timing. Like a stopwatch │ - │ │ for one function call. 
│ - └─────────────────────┴────────────────────────────────────────────────────┘ - -Data Flow:: - - ┌─────────────────────────────────────────────────────────────────────────┐ - │ HOW OBSERVABILITY EXPORT WORKS │ - │ │ - │ Your Genkit App │ - │ │ │ - │ │ (1) Flows, models, tools create spans │ - │ ▼ │ - │ ┌─────────────────┐ │ - │ │ TracerProvider │ Collects all spans from your app │ - │ │ (OpenTelemetry)│ │ - │ └────────┬────────┘ │ - │ │ │ - │ │ (2) Batch and export via OTLP │ - │ ▼ │ - │ ┌─────────────────┐ │ - │ │ OTLP Exporter │ Sends to your chosen backend │ - │ │ (HTTP POST) │ │ - │ └────────┬────────┘ │ - │ │ │ - │ │ (3) View in your dashboard │ - │ ▼ │ - │ ┌─────────────────┐ │ - │ │ Sentry / │ Query, alert, debug your traces │ - │ │ Honeycomb / │ │ - │ │ Datadog / etc │ │ - │ └─────────────────┘ │ - └─────────────────────────────────────────────────────────────────────────┘ - -Example:: - - from genkit.plugins.observability import configure_telemetry - - # Sentry - configure_telemetry(backend="sentry", sentry_dsn="https://...") - - # Honeycomb - configure_telemetry(backend="honeycomb", honeycomb_api_key="...") - - # Datadog - configure_telemetry(backend="datadog", datadog_api_key="...") - - # Custom OTLP endpoint - configure_telemetry( - backend="custom", - endpoint="https://my-collector/v1/traces", - headers={"Authorization": "Bearer ..."}, - ) -""" - -from enum import Enum -from typing import Literal - - -class Backend(str, Enum): - """Supported observability backends.""" - - SENTRY = "sentry" - HONEYCOMB = "honeycomb" - DATADOG = "datadog" - GRAFANA = "grafana" - AXIOM = "axiom" - CUSTOM = "custom" - - -def configure_telemetry( - backend: Backend | Literal["sentry", "honeycomb", "datadog", "grafana", "axiom", "custom"], - *, - # Common options - service_name: str = "genkit-app", - service_version: str = "1.0.0", - environment: str | None = None, - - # Sentry - sentry_dsn: str | None = None, - - # Honeycomb - honeycomb_api_key: str | None = None, - honeycomb_dataset: str | None = None, - - # Datadog - datadog_api_key: str | None = None, - datadog_site: str = "datadoghq.com", - - # Grafana Cloud - grafana_endpoint: str | None = None, - grafana_api_key: str | None = None, - - # Axiom - axiom_api_token: str | None = None, - axiom_dataset: str | None = None, - - # Custom OTLP - endpoint: str | None = None, - headers: dict[str, str] | None = None, -) -> None: - """Configure telemetry export to a third-party backend. - - Args: - backend: Which backend to use (sentry, honeycomb, datadog, etc.) - service_name: Name of your service (appears in traces) - service_version: Version of your service - environment: Environment name (production, staging, etc.) - - # Backend-specific (provide based on chosen backend): - sentry_dsn: Sentry DSN (for backend="sentry") - honeycomb_api_key: Honeycomb API key (for backend="honeycomb") - datadog_api_key: Datadog API key (for backend="datadog") - grafana_endpoint: Grafana Cloud OTLP endpoint (for backend="grafana") - axiom_api_token: Axiom API token (for backend="axiom") - - # Custom OTLP: - endpoint: Custom OTLP endpoint URL (for backend="custom") - headers: Custom headers for authentication (for backend="custom") - - Example: - >>> # Sentry - >>> configure_telemetry(backend="sentry", sentry_dsn="https://...") - >>> - >>> # Honeycomb - >>> configure_telemetry(backend="honeycomb", honeycomb_api_key="...") - """ - ... 
-``` - -## Directory Structure - -``` -py/plugins/observability/ -├── pyproject.toml -├── README.md -├── LICENSE -├── src/genkit/plugins/observability/ -│ ├── __init__.py # Main API, configure_telemetry() -│ ├── backends/ -│ │ ├── __init__.py -│ │ ├── base.py # Base backend configuration -│ │ ├── sentry.py # Sentry preset -│ │ ├── honeycomb.py # Honeycomb preset -│ │ ├── datadog.py # Datadog preset -│ │ ├── grafana.py # Grafana Cloud preset -│ │ ├── axiom.py # Axiom preset -│ │ └── custom.py # Custom OTLP -│ ├── typing.py # Configuration schemas -│ └── py.typed -└── tests/ - ├── conftest.py - ├── sentry_test.py - ├── honeycomb_test.py - └── integration_test.py -``` - -## Implementation - -### Core Configuration - -```python -# src/genkit/plugins/observability/__init__.py - -import os -from typing import Any - -from opentelemetry import trace -from opentelemetry.sdk.trace import TracerProvider -from opentelemetry.sdk.trace.export import BatchSpanProcessor -from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter -from opentelemetry.sdk.resources import Resource, SERVICE_NAME, SERVICE_VERSION - -from .backends import get_backend_config - - -def configure_telemetry( - backend: str, - *, - service_name: str = "genkit-app", - service_version: str = "1.0.0", - **kwargs: Any, -) -> None: - """Configure telemetry export to a third-party backend.""" - - # Get backend-specific configuration - config = get_backend_config(backend, **kwargs) - - # Create resource with service info - resource = Resource.create({ - SERVICE_NAME: service_name, - SERVICE_VERSION: service_version, - }) - - # Create OTLP exporter with backend config - exporter = OTLPSpanExporter( - endpoint=config.endpoint, - headers=config.headers, - ) - - # Configure tracer provider - provider = TracerProvider(resource=resource) - provider.add_span_processor(BatchSpanProcessor(exporter)) - trace.set_tracer_provider(provider) -``` - -### Backend Presets - -```python -# src/genkit/plugins/observability/backends/sentry.py - -from dataclasses import dataclass - - -@dataclass -class SentryConfig: - """Sentry OTLP configuration.""" - - endpoint: str - headers: dict[str, str] - - -def get_sentry_config(dsn: str) -> SentryConfig: - """Create Sentry configuration from DSN. - - DSN format: https://{key}@{org}.ingest.sentry.io/{project} - """ - # Parse DSN and construct OTLP endpoint - # Sentry accepts OTLP at: https://{org}.ingest.sentry.io/api/{project}/envelope/ - - return SentryConfig( - endpoint=f"https://sentry.io/api/0/envelope/", - headers={ - "X-Sentry-Auth": f"Sentry sentry_key={dsn}", - }, - ) -``` - -## pyproject.toml - -```toml -[project] -name = "genkit-observability-plugin" -version = "0.1.0" -description = "Third-party observability backends for Genkit" -requires-python = ">=3.10" -dependencies = [ - "genkit", - "opentelemetry-sdk>=1.20.0", - "opentelemetry-exporter-otlp-proto-http>=1.20.0", -] - -[project.optional-dependencies] -dev = [ - "pytest>=8.0.0", - "pytest-asyncio>=0.24.0", -] -``` - -## Sample Application - -```python -# py/samples/provider-observability-hello/src/main.py -"""Observability hello sample - Third-party telemetry with Genkit. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Observability │ Seeing what your app is doing. Like X-ray │ - │ │ vision for your code! 
│ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Traces │ The journey of a request through your app. │ - │ │ Shows timing, errors, everything. │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Backend │ Where traces are stored and visualized. │ - │ │ Sentry, Honeycomb, Datadog, etc. │ - └─────────────────────┴────────────────────────────────────────────────────┘ -""" - -import os -from genkit.ai import Genkit -from genkit.plugins.observability import configure_telemetry -from genkit.plugins.google_genai import GoogleAI - -# Configure telemetry FIRST (before any Genkit operations) -configure_telemetry( - backend="honeycomb", # or "sentry", "datadog", etc. - honeycomb_api_key=os.environ["HONEYCOMB_API_KEY"], - service_name="provider-observability-hello", -) - -ai = Genkit( - plugins=[GoogleAI()], - model="googleai/gemini-2.0-flash", -) - -@ai.flow() -async def say_hi(name: str) -> str: - """Say hello - traced to your observability backend.""" - response = await ai.generate(prompt=f"Say hi to {name}!") - return response.text -``` - -## Environment Variables - -| Variable | Backend | Description | -|----------|---------|-------------| -| `SENTRY_DSN` | Sentry | Your Sentry DSN | -| `HONEYCOMB_API_KEY` | Honeycomb | Honeycomb API key | -| `DD_API_KEY` | Datadog | Datadog API key | -| `GRAFANA_OTLP_ENDPOINT` | Grafana | Grafana Cloud OTLP endpoint | -| `AXIOM_TOKEN` | Axiom | Axiom API token | - -## Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 9/10 | Standard OTLP, well-documented | -| **Python Support** | 10/10 | Official opentelemetry-python | -| **Setup Simplicity** | 9/10 | One function call with preset | -| **Feature Coverage** | 8/10 | Traces + basic metrics | -| **Community Demand** | 9/10 | Common request | -| **Maintenance Burden** | 9/10 | Stable OTLP protocol | -| **Strategic Value** | 8/10 | Platform-agnostic option | -| **TOTAL** | **89/100** | ✅ **BUILD** | - -## References - -- [OpenTelemetry Python](https://opentelemetry.io/docs/languages/python/) -- [Sentry OTLP](https://docs.sentry.io/platforms/python/tracing/) -- [Honeycomb OpenTelemetry](https://docs.honeycomb.io/send-data/opentelemetry/) -- [Datadog OTLP](https://docs.datadoghq.com/tracing/trace_collection/open_standards/otlp_ingest_in_the_agent/) -- [Grafana Cloud OTLP](https://grafana.com/docs/grafana-cloud/send-data/otlp/) -- [Axiom OpenTelemetry](https://axiom.co/docs/send-data/opentelemetry) diff --git a/py/engdoc/planning/vercel-plugins.md b/py/engdoc/planning/vercel-plugins.md deleted file mode 100644 index 26d9523395..0000000000 --- a/py/engdoc/planning/vercel-plugins.md +++ /dev/null @@ -1,384 +0,0 @@ -# Vercel Plugins Implementation Plan - -**Status:** Research Complete -**AI Plugin Feasibility:** ⚠️ LOW-MEDIUM (AI SDK is JS/TS only, but AI Gateway works) -**Telemetry Plugin Feasibility:** ⚠️ MEDIUM (standard OTEL works, @vercel/otel is Node.js only) -**Estimated Effort:** Low (if implemented) -**Dependencies:** `httpx`, `openai` or `anthropic`, `opentelemetry-sdk` - -## Overview - -**Important Clarification:** Python IS fully supported on Vercel as a runtime platform. -FastAPI, Flask, and other Python frameworks work great as Vercel Functions. - -However, Vercel's **AI-specific SDKs** and **@vercel/otel** are JavaScript/TypeScript only. 
- -``` -┌─────────────────────────────────────────────────────────────────────────┐ -│ VERCEL PYTHON SUPPORT MATRIX │ -│ │ -│ ┌─────────────────────────────────────────────────────────────────┐ │ -│ │ Vercel + Python │ │ -│ ├─────────────────────────────────────────────────────────────────┤ │ -│ │ Feature │ Python Support │ Notes │ │ -│ ├─────────────────────────────────────────────────────────────────┤ │ -│ │ Vercel Platform │ ✅ YES │ FastAPI, Flask work great │ │ -│ │ Vercel Functions │ ✅ YES │ Python serverless │ │ -│ │ AI Gateway │ ✅ YES │ HTTP API, any language │ │ -│ │ AI SDK │ ❌ JS/TS only │ No Python package │ │ -│ │ @vercel/otel │ ❌ Node.js │ No Python package │ │ -│ │ Standard OTEL │ ✅ YES │ Works from Python apps │ │ -│ └─────────────────────────────────────────────────────────────────┘ │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────┐ │ -│ │ Vercel Functions │ Run Python (FastAPI/Flask) as serverless. │ │ -│ │ │ Auto-scales, 250MB limit per function. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ AI Gateway │ A proxy that adds caching, rate limiting, │ │ -│ │ │ and routing to AI API calls. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Vercel AI SDK │ JavaScript library for building AI apps. │ │ -│ │ │ NOT available for Python (use AI Gateway).│ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ @vercel/otel │ Vercel's OTEL package for Node.js only. │ │ -│ │ │ Python apps use standard OTEL instead. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ OIDC Token │ Auto-generated auth token on Vercel. │ │ -│ │ │ Available to Python apps too! │ │ -│ └─────────────────────┴────────────────────────────────────────────┘ │ -└─────────────────────────────────────────────────────────────────────────┘ - -Reference: https://vercel.com/docs/frameworks/backend/fastapi -``` - -## Part 1: Vercel AI Gateway Plugin - -### What AI Gateway Provides - -The AI Gateway is an HTTP proxy that: -- Routes requests to multiple AI providers -- Adds request caching -- Provides rate limiting -- Offers fallback routing -- Works with ANY language via HTTP - -### Python Integration - -Since AI Gateway uses OpenAI-compatible and Anthropic-compatible APIs, Python apps can -use it by pointing existing SDKs at the gateway URL. - -```python -# Using OpenAI SDK with Vercel AI Gateway -from openai import OpenAI - -client = OpenAI( - api_key=os.getenv('AI_GATEWAY_API_KEY'), - base_url='https://ai-gateway.vercel.sh/v1' -) - -response = client.chat.completions.create( - model='anthropic/claude-sonnet-4.5', # Can use any provider! - messages=[{'role': 'user', 'content': 'Hello!'}] -) -``` - -### Implementation Option: Simple Helper - -Rather than a full plugin, provide a helper function: - -```python -# py/plugins/vercel/__init__.py -"""Vercel AI Gateway integration for Genkit. - -Vercel's AI Gateway is a proxy that works with any AI provider, adding -caching, rate limiting, and fallback routing. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ AI Gateway │ A middleman between you and AI providers. │ - │ │ Adds caching and rate limiting automatically. 
│ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Universal Provider │ Use any model (OpenAI, Anthropic, etc.) │ - │ │ through one consistent API. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ API Key │ Your AI_GATEWAY_API_KEY for authentication. │ - │ │ Get it from Vercel dashboard. │ - └─────────────────────┴────────────────────────────────────────────────┘ - -Example:: - - from genkit.ai import Genkit - from genkit.plugins.vercel import vercel_gateway_url - from genkit.plugins.compat_oai import OpenAI - - # Route OpenAI requests through Vercel AI Gateway - ai = Genkit( - plugins=[ - OpenAI( - base_url=vercel_gateway_url(), - api_key=os.environ['AI_GATEWAY_API_KEY'], - ), - ], - ) -""" - -import os - -AI_GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1" - - -def vercel_gateway_url() -> str: - """Get the Vercel AI Gateway base URL. - - Returns: - The AI Gateway URL to use with OpenAI-compatible SDKs. - - Example: - >>> from genkit.plugins.compat_oai import OpenAI - >>> OpenAI(base_url=vercel_gateway_url()) - """ - return AI_GATEWAY_BASE_URL - - -def get_vercel_auth() -> str | None: - """Get the appropriate auth token for Vercel. - - On Vercel deployments, uses OIDC token. Otherwise, uses API key. - - Returns: - Auth token string or None if not configured. - """ - # On Vercel, OIDC token is auto-generated - if oidc := os.environ.get('VERCEL_OIDC_TOKEN'): - return oidc - # For local development, use API key - return os.environ.get('AI_GATEWAY_API_KEY') -``` - -### Feasibility Assessment - -**Feasibility: ⚠️ LOW-MEDIUM** - -**Reasons:** -1. AI Gateway works fine with Python via existing SDKs -2. No Vercel-specific AI functionality beyond the gateway -3. Implementation would be trivial (just URL helper) -4. Users can already do this without a plugin - -**Recommendation:** -- Document how to use AI Gateway with existing plugins (`compat-oai`, `anthropic`) -- Don't create a separate plugin unless there's strong user demand -- Could add as a simple utility function in documentation - ---- - -## Part 2: Vercel Telemetry Plugin - -### Current State - -Vercel's `@vercel/otel` package is **Node.js only**, but Python apps on Vercel CAN use -standard OpenTelemetry to export traces to any OTEL-compatible backend. - -```typescript -// @vercel/otel - JavaScript/TypeScript only -import { registerOTel } from '@vercel/otel'; - -export function register() { - registerOTel({ serviceName: 'your-project-name' }); -} -``` - -### What @vercel/otel Provides (Node.js only) - -- Auto-configuration for Vercel's OTEL collector -- Node.js and Edge runtime support -- W3C Trace Context propagation -- Fetch API instrumentation - -### Python Options on Vercel - -Python apps deployed on Vercel (FastAPI, Flask, etc.) can use standard OTEL: - -```python -# Standard OTLP export from Python on Vercel -from opentelemetry import trace -from opentelemetry.sdk.trace import TracerProvider -from opentelemetry.sdk.trace.export import BatchSpanProcessor -from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter - -# Works from Vercel Python Functions! 
-provider = TracerProvider() -provider.add_span_processor( - BatchSpanProcessor( - OTLPSpanExporter( - endpoint="https://api.honeycomb.io/v1/traces", # Or any OTEL backend - headers={"x-honeycomb-team": os.environ["HONEYCOMB_API_KEY"]}, - ) - ) -) -trace.set_tracer_provider(provider) -``` - -### Potential Vercel Telemetry Plugin - -A simple plugin could provide preset configs for common backends: - -```python -class VercelTelemetry: - """Telemetry for Python apps on Vercel. - - Since @vercel/otel is Node.js only, this provides standard OTEL - configuration with presets for popular backends. - """ - - def __init__( - self, - backend: Literal["honeycomb", "datadog", "grafana", "axiom"], - service_name: str = "genkit-vercel-app", - api_key: str | None = None, - ): - ... -``` - -### Feasibility Assessment - -**Feasibility: ⚠️ MEDIUM** - -**Reasons:** -1. Python DOES work on Vercel (FastAPI, Flask are first-class) -2. Standard OTEL works from Python Vercel Functions -3. No Vercel-specific OTEL package, but standard approach works -4. Could provide convenience presets for common backends - -**Recommendation:** -- **Consider a simple helper** for common OTEL backends -- Not Vercel-specific, but useful for Vercel Python users -- Lower priority than `azure` and `cloudflare-ai` - ---- - -## Summary: Should We Build Vercel Plugins? - -### Vercel AI Plugin - -| Aspect | Assessment | -|--------|------------| -| **Need** | Low - existing plugins work fine with AI Gateway | -| **Effort** | Very low - just URL helper | -| **Value** | Convenience for Vercel users | -| **Recommendation** | **Low priority** - document AI Gateway usage | - -### Vercel Telemetry Plugin - -| Aspect | Assessment | -|--------|------------| -| **Need** | Medium - Python on Vercel is growing | -| **Effort** | Low - standard OTEL with presets | -| **Value** | Convenience for common backends | -| **Recommendation** | **Consider** if user demand exists | - ---- - -## Alternative: Documentation Only - -Instead of plugins, provide documentation showing how to: - -### Using AI Gateway with Genkit - -```markdown -# Using Vercel AI Gateway with Genkit - -Vercel AI Gateway can be used with Genkit's `compat-oai` plugin by setting -the base URL: - -```python -from genkit.ai import Genkit -from genkit.plugins.compat_oai import OpenAI -import os - -ai = Genkit( - plugins=[ - OpenAI( - base_url="https://ai-gateway.vercel.sh/v1", - api_key=os.environ['AI_GATEWAY_API_KEY'], - ), - ], - model="openai/gpt-4o", # or "anthropic/claude-sonnet-4.5" -) -``` - -### Telemetry for Python on Vercel - -For Python serverless functions on Vercel, use standard OpenTelemetry -with your preferred backend (Honeycomb, Datadog, etc.): - -```python -from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter -# ... standard OTLP setup -``` -``` - ---- - -## Comparison with Other Platforms - -| Platform | AI Plugin | Telemetry Plugin | Python Runtime | -|----------|-----------|------------------|----------------| -| **AWS** | ✅ amazon-bedrock | ✅ aws | ✅ Full | -| **GCP** | ✅ google-genai | ✅ google-cloud | ✅ Full | -| **Azure** | ✅ microsoft-foundry | ✅ azure (planned) | ✅ Full | -| **Cloudflare** | ✅ cloudflare-ai (planned) | ⚠️ AI Gateway only | ✅ Workers | -| **Vercel** | ⚠️ AI Gateway helper | ⚠️ Standard OTEL | ✅ Functions | - ---- - -## Final Recommendation - -**Low priority, but feasible if user demand exists.** Python works great on Vercel! - -### Recommended Approach - -1. 
**Document AI Gateway usage** with existing `compat-oai` and `anthropic` plugins -2. **Document standard OTEL** for telemetry from Python Vercel Functions -3. **Consider a simple `vercel` plugin** if users request it, containing: - - `vercel_gateway_url()` helper for AI Gateway - - `VercelTelemetry` class with presets for Honeycomb, Datadog, etc. - -### Priority - -| Plugin | Priority | Reason | -|--------|----------|--------| -| `azure` | High | Official OTEL distro, pairs with microsoft-foundry | -| `cloudflare-ai` | High | Growing edge AI market | -| `vercel` | Low | Works without plugin, add if demanded | - -### If We Build It - -A minimal `vercel` plugin would look like: - -```python -# py/plugins/vercel/src/genkit/plugins/vercel/__init__.py -"""Vercel integration helpers for Genkit. - -Provides utilities for Python apps deployed on Vercel Functions. -""" - -def vercel_gateway_url() -> str: - """Get Vercel AI Gateway URL.""" - return "https://ai-gateway.vercel.sh/v1" - -class VercelTelemetry: - """Standard OTEL with presets for common backends.""" - ... -``` - -## References - -- [Vercel AI Gateway - Python](https://vercel.com/docs/ai-gateway/python) -- [Vercel AI SDK](https://sdk.vercel.ai/docs) (JS/TS only) -- [@vercel/otel](https://www.npmjs.com/package/@vercel/otel) (Node.js only) -- [Python AI SDK (Community)](https://github.com/python-ai-sdk/sdk) (unofficial) diff --git a/py/engdoc/release-publishing-guide.md b/py/engdoc/release-publishing-guide.md deleted file mode 100644 index 6a1941d1b3..0000000000 --- a/py/engdoc/release-publishing-guide.md +++ /dev/null @@ -1,340 +0,0 @@ -# Python SDK Release and Publishing Guide - -This guide documents the complete process for releasing and publishing the Genkit Python SDK. - -## Pre-Release Requirements - -### 1. Version Verification - -All packages must have the same version (`0.5.0` for this release): - -```bash -# Check all package versions -grep "^version = " packages/*/pyproject.toml plugins/*/pyproject.toml | sort -``` - -### 2. Documentation Requirements - -| Requirement | Location | Status | -|-------------|----------|--------| -| CHANGELOG.md updated | `py/CHANGELOG.md` | ✅ | -| PR description created | `py/.github/PR_DESCRIPTION_0.5.0.md` | ✅ | -| Blog article written | `py/engdoc/blog-genkit-python-0.5.0.md` | ✅ | -| Release validation passed | `./bin/validate_release_docs` | ✅ | - -### 3. Code Quality Requirements - -```bash -# All checks must pass -cd py && ./bin/lint # Linting and type checks -cd py && uv run pytest . # All tests pass -cd py && ./bin/validate_release_docs # Release doc validation -``` - -### 4. PyPI Package Status - -**Existing packages (update from v0.4.0 to v0.5.0):** -- genkit -- genkit-plugin-compat-oai -- genkit-plugin-dev-local-vectorstore -- genkit-plugin-firebase -- genkit-plugin-flask -- genkit-plugin-google-cloud -- genkit-plugin-google-genai -- genkit-plugin-ollama -- genkit-plugin-vertex-ai - -**New packages (first publish at v0.5.0):** -- genkit-plugin-anthropic -- genkit-plugin-aws -- genkit-plugin-amazon-bedrock -- genkit-plugin-cloudflare-workers-ai -- genkit-plugin-deepseek -- genkit-plugin-evaluators -- genkit-plugin-huggingface -- genkit-plugin-mcp -- genkit-plugin-mistral -- genkit-plugin-microsoft-foundry -- genkit-plugin-observability -- genkit-plugin-xai - -### 5. 
GitHub Environment Configuration - -Ensure the `pypi_github_publishing` environment is configured in GitHub repository settings with: -- PyPI trusted publishing enabled -- Required reviewers (if applicable) - -## Release Process - -### Step 1: Merge the Release PR - -After PR approval: -```bash -# Merge the PR (use squash or merge commit as appropriate) -gh pr merge 4417 --squash -``` - -### Step 2: Create a GitHub Release - -```bash -# Create and push a tag -git checkout main -git pull origin main -git tag -a py/v0.5.0 -m "Genkit Python SDK v0.5.0" -git push origin py/v0.5.0 -``` - -Then create a GitHub release: -1. Go to https://github.com/firebase/genkit/releases/new -2. Select tag: `py/v0.5.0` -3. Title: `Genkit Python SDK v0.5.0` -4. Copy release notes from `py/CHANGELOG.md` -5. Publish release - -### Step 3: Publish to PyPI - -1. Go to **Actions** → **Publish Python Package** -2. Click **Run workflow** -3. Select: - - `publish_scope: all` (to publish all 21 packages) -4. Click **Run workflow** -5. Monitor the workflow - it will build and publish all packages in parallel - -### Step 4: Verify Publication - -After workflow completes: -```bash -# Verify packages on PyPI -for pkg in genkit genkit-plugin-google-genai genkit-plugin-anthropic; do - pip index versions $pkg | head -1 -done -``` - -Or check on PyPI directly: -- https://pypi.org/project/genkit/ -- https://pypi.org/project/genkit-plugin-google-genai/ - -### Step 5: Post-Release Verification - -```bash -# Test installation in a fresh environment -python -m venv /tmp/genkit-test -source /tmp/genkit-test/bin/activate -pip install genkit genkit-plugin-google-genai -python -c "from genkit.ai import Genkit; print('Success!')" -``` - -## Troubleshooting - -### Package Already Exists at This Version - -If a package was partially published: -```bash -# Check the version on PyPI -curl -s "https://pypi.org/pypi/genkit/json" | jq -r '.info.version' -``` - -You cannot re-upload the same version. Either: -1. Bump the version (e.g., 0.5.1) -2. Delete the release on PyPI (only within 24 hours) - -### Trusted Publishing Fails - -Ensure the GitHub environment `pypi_github_publishing` is configured with: -1. Go to repository Settings → Environments -2. Create/edit `pypi_github_publishing` -3. Configure trusted publisher on PyPI for each package - -### Individual Package Publish - -To publish a single package: - -**Via GitHub UI:** -1. Go to **Actions** → **Publish Python Package** -2. Select `publish_scope: single` -3. Select `project_type: plugins` (or `packages` for genkit core) -4. 
Select the specific `project_name` - -**Via CLI:** -```bash -# Publish all packages -gh workflow run publish_python.yml -f publish_scope=all - -# Publish just the core genkit package -gh workflow run publish_python.yml \ - -f publish_scope=single \ - -f project_type=packages \ - -f project_name=genkit - -# Publish a specific plugin (3 parameters required) -gh workflow run publish_python.yml \ - -f publish_scope=single \ - -f project_type=plugins \ - -f project_name=anthropic - -gh workflow run publish_python.yml \ - -f publish_scope=single \ - -f project_type=plugins \ - -f project_name=google-genai - -gh workflow run publish_python.yml \ - -f publish_scope=single \ - -f project_type=plugins \ - -f project_name=vertex-ai -``` - -### Available Plugin Names for project_name - -| Plugin Name | PyPI Package Name | -|-------------|-------------------| -| `anthropic` | genkit-plugin-anthropic | -| `aws` | genkit-plugin-aws | -| `amazon-bedrock` | genkit-plugin-amazon-bedrock | -| `cloudflare-workers-ai` | genkit-plugin-cloudflare-workers-ai | -| `compat-oai` | genkit-plugin-compat-oai | -| `deepseek` | genkit-plugin-deepseek | -| `dev-local-vectorstore` | genkit-plugin-dev-local-vectorstore | -| `evaluators` | genkit-plugin-evaluators | -| `firebase` | genkit-plugin-firebase | -| `flask` | genkit-plugin-flask | -| `google-cloud` | genkit-plugin-google-cloud | -| `google-genai` | genkit-plugin-google-genai | -| `huggingface` | genkit-plugin-huggingface | -| `mcp` | genkit-plugin-mcp | -| `mistral` | genkit-plugin-mistral | -| `microsoft-foundry` | genkit-plugin-microsoft-foundry | -| `observability` | genkit-plugin-observability | -| `ollama` | genkit-plugin-ollama | -| `vertex-ai` | genkit-plugin-vertex-ai | -| `xai` | genkit-plugin-xai | - -### Monitoring Workflow Progress - -```bash -# List recent publish workflow runs -gh run list --workflow=publish_python.yml --limit=5 - -# Watch a specific run in real-time -gh run watch - -# View detailed job status -gh run view --json status,conclusion,jobs - -# View failed job logs -gh run view --log-failed | head -100 -``` - -### Retrying Failed Jobs - -```bash -# Re-run all failed jobs from a specific run -gh run rerun --failed - -# Or trigger a fresh workflow run -gh workflow run publish_python.yml -f publish_scope= -``` - -## Package Installation Reference - -After release, users install packages with: - -```bash -# Core package (required) -pip install genkit - -# Model providers -pip install genkit-plugin-google-genai # Google AI (Gemini) -pip install genkit-plugin-anthropic # Anthropic (Claude) -pip install genkit-plugin-ollama # Ollama (local models) -pip install genkit-plugin-vertex-ai # Vertex AI -pip install genkit-plugin-amazon-bedrock # AWS Bedrock -pip install genkit-plugin-mistral # Mistral AI -pip install genkit-plugin-deepseek # DeepSeek -pip install genkit-plugin-xai # xAI (Grok) -pip install genkit-plugin-huggingface # Hugging Face -pip install genkit-plugin-cloudflare-workers-ai # Cloudflare Workers AI + OTLP telemetry -pip install genkit-plugin-microsoft-foundry # Azure AI Foundry + Azure Application Insights telemetry - -# Telemetry -pip install genkit-plugin-google-cloud # GCP Cloud Trace -pip install genkit-plugin-aws # AWS X-Ray -# Azure telemetry is included in genkit-plugin-microsoft-foundry -pip install genkit-plugin-observability # Sentry, Honeycomb, Datadog - -# Other -pip install genkit-plugin-firebase # Firebase/Firestore -pip install genkit-plugin-evaluators # Evaluation metrics -pip install genkit-plugin-flask # Flask 
integration -pip install genkit-plugin-compat-oai # OpenAI compatibility -pip install genkit-plugin-mcp # Model Context Protocol -``` - -## Troubleshooting - -### PyPI 500 Error: Trusted Publishing Exchange Failure - -**Error:** `Trusted publishing exchange failure: Token request failed: the index produced an unexpected 500 response.` - -**Cause:** This is a transient PyPI server error, not a configuration issue. - -**Solution:** -1. Check PyPI status: https://status.python.org/ -2. Wait 5-10 minutes -3. Retry the failed jobs: - ```bash - gh run rerun --failed - ``` - -### PyPI 400 Error: Non-user Identities Cannot Create New Projects - -**Error:** `400 Non-user identities cannot create new projects. This was probably caused by successfully using a pending publisher but specifying the project name incorrectly.` - -**Cause:** The package doesn't exist on PyPI yet and needs Trusted Publisher setup. - -**Solution for new packages:** -1. Go to https://pypi.org/manage/account/publishing/ -2. Add a **Pending Publisher** for each new package: - - **PyPI Project Name:** `genkit-plugin-` (exact package name from pyproject.toml) - - **Owner:** `firebase` - - **Repository:** `genkit` - - **Workflow name:** `publish_python.yml` - - **Environment:** `pypi` -3. Retry the workflow - -### Package Already Exists at This Version - -**Error:** `File already exists` - -**Cause:** The exact version was already uploaded to PyPI. - -**Solution:** -- You cannot re-upload the same version to PyPI -- Either bump the version (e.g., 0.5.1) or verify the existing package is correct -- Use `--skip-existing` flag if publishing multiple packages and some already exist - -### Authentication Failure - -**Error:** `403 Forbidden` or `401 Unauthorized` - -**Cause:** OIDC token exchange failed between GitHub and PyPI. - -**Solution:** -1. Verify the GitHub environment `pypi` exists in repository settings -2. Verify Trusted Publisher is configured correctly on PyPI -3. Ensure workflow file path matches PyPI configuration exactly - -### Manual Fallback: API Token Upload - -If Trusted Publishing continues to fail: - -```bash -cd py - -# Build the package -uv build --package genkit-plugin- - -# Upload with API token (get token from https://pypi.org/manage/account/token/) -TWINE_USERNAME=__token__ TWINE_PASSWORD= twine upload dist/* -``` - -**Note:** This is a fallback for emergencies. Prefer Trusted Publishing for security. diff --git a/py/engdoc/user_guide/python/publishing_pypi.md b/py/engdoc/user_guide/python/publishing_pypi.md index d712df4679..98e3ae3fb4 100644 --- a/py/engdoc/user_guide/python/publishing_pypi.md +++ b/py/engdoc/user_guide/python/publishing_pypi.md @@ -1,24 +1,30 @@ # Overview -The Genkit Python AI SDK publishes some packages to PYPI in order to be able to -use them as python packages using any python package manager. +The Genkit Python AI SDK publishes packages to PyPI so they can be installed +with any Python package manager (`pip`, `uv`, etc.). -In order to generate a new version of any package or plugin, a CI with github -actions has been created. +## Publishing with ReleaseKit (current) -## CI to release new PYPI versions +The primary publishing mechanism is **ReleaseKit**, an internal release +orchestration tool. 
It automates the full release lifecycle: -The github action located on `.github/workflows/publish_python.yml` has two -inputs: +* Version bumping across all packages (core + 22 plugins + samples) +* Changelog generation from conventional commits +* Dependency-graph-aware publish ordering with retries +* SBOM generation -* Type of project to build. E.g. Package or Plugin -* Name of project. E.g. genkit +The automated workflow is at `.github/workflows/releasekit-uv.yml`. See +[`py/tools/releasekit/README.md`](../../../tools/releasekit/README.md) for +full documentation. -The process is separated in two steps. The first one make validations over the -project to build. Mainly, the project's new version to publish must be greater -that the current one. This step also builds with uv the package and validates -the wheel with twine. +## Legacy manual workflow -The last step uses an action `pypa/gh-action-pypi-publish@release/v1` to publish -the package with trusted publishers. See -(gh-action-pypi-publish)\[https://github.com/pypa/gh-action-pypi-publish] +The older manual workflow at `.github/workflows/publish_python.yml` is still +available as a fallback. It accepts two inputs: + +* Type of project to build (Package or Plugin) +* Name of project (e.g. `genkit`) + +It validates that the new version is greater than the current PyPI version, +builds with `uv`, validates the wheel with `twine`, and publishes using +`pypa/gh-action-pypi-publish@release/v1` with trusted publishers. diff --git a/py/plugins/README.md b/py/plugins/README.md index 6ac25cc9c3..29f0199398 100644 --- a/py/plugins/README.md +++ b/py/plugins/README.md @@ -664,6 +664,4 @@ model provider diversity. ## Further Reading -- [Plugin Planning & Roadmap](../engdoc/planning/) -- [Feature Matrix](../engdoc/planning/FEATURE_MATRIX.md) - [Contributing Guide](../engdoc/contributing/) diff --git a/py/tools/conform/ANNOUNCEMENT.md b/py/tools/conform/ANNOUNCEMENT.md deleted file mode 100644 index f275c3ea1b..0000000000 --- a/py/tools/conform/ANNOUNCEMENT.md +++ /dev/null @@ -1,275 +0,0 @@ -# Announcing Conform: Cross-Runtime Model Conformance Testing for Genkit - -## TL;DR - -**Conform** is a purpose-built model conformance test runner for the -Genkit SDK. It validates that every model plugin — across Python, JS, -and Go runtimes — behaves correctly and consistently. The Python -runtime runs **in-process** with zero subprocess overhead; JS and Go -runtimes communicate via async HTTP to their reflection servers. One -command tests **13 plugins**, runs **150+ test cases**, and reports -results in **under 4 minutes**. - ---- - -## The Problem - -The Genkit SDK supports **13+ model plugins** (Anthropic, Google GenAI, -Amazon Bedrock, Mistral, DeepSeek, Cohere, xAI, Ollama, …) across -**3 runtimes** (Python, JS, Go). Each plugin must correctly: - -1. **Generate text** — simple prompts, system messages, multi-turn -2. **Handle structured output** — JSON mode, schema conformance -3. **Support tool calling** — tool requests, tool responses, multi-step -4. **Stream responses** — text chunks, streamed JSON, streamed tool calls -5. **Process media** — image inputs, media outputs -6. 
**Expose reasoning** — thinking / reasoning content from supported models - -Previously, conformance was tested ad hoc: - -- Manual spot-checks against live APIs -- Plugin-specific unit tests with mocked responses -- No cross-runtime consistency verification -- No shared test suite between Python, JS, and Go -- Failures discovered in production, not at PR time - ---- - -## The Solution - -Conform provides a unified test framework with a single CLI: - -```bash -conform list # Show all plugins, runtimes, and env-var readiness -conform check-model # Run model conformance tests across all plugins -conform check-plugin # Verify every model plugin has conformance specs -``` - ---- - -## Features - -### Live Conformance Results - -Conform runs all plugin tests concurrently (bounded by a configurable -semaphore) and displays a live Rich progress table. Log lines scroll -above while the summary table stays pinned at the bottom: - -![conform check-model results](https://raw.githubusercontent.com/firebase/genkit/main/py/tools/conform/docs/images/conform_results.png) - -13 plugins. 150+ tests. Under 4 minutes wall time. - -### In-Process Python Runner - -The Python runtime uses an **InProcessRunner** that imports the -plugin's entry point directly — no subprocess, no HTTP server, no -genkit CLI dependency: - -```python -class ActionRunner(Protocol): - async def run_action( - self, key: str, input_data: dict, *, stream: bool = False, - ) -> tuple[dict, list[dict]]: ... - async def close(self) -> None: ... -``` - -| Runner | When | How | -|--------|------|-----| -| **InProcessRunner** | Python (default) | Imports entry point, calls `action.arun_raw()` directly | -| **ReflectionRunner** | JS / Go | Subprocess → async HTTP to reflection server | -| **genkit CLI** | `--use-cli` flag | Delegates to `genkit dev:test-model` | - -### 10 Validators — 1:1 Parity with JS - -Every validator is ported from the canonical JS implementation: - -| Validator | What it checks | -|-----------|----------------| -| `text-includes` | Response text contains expected substring | -| `text-starts-with` | Response text starts with expected prefix | -| `text-not-empty` | Response text is non-empty | -| `valid-json` | Response text is valid JSON | -| `has-tool-request` | Response contains a tool request part | -| `valid-media` | Response contains a media part with valid URL | -| `reasoning` | Response contains a reasoning / thinking part | -| `stream-text-includes` | Streamed chunks contain expected text | -| `stream-has-tool-request` | Streamed chunks contain a tool request | -| `stream-valid-json` | Final streamed chunk is valid JSON | - -New validators: decorate a function with `@register('name')`. 
- -### YAML-Driven Test Specs - -Each plugin defines its tests in a declarative YAML file: - -```yaml -models: - - name: "anthropic/claude-sonnet-4" - supported_features: [text, json, tools, streaming, reasoning] - tests: - - name: "basic text generation" - prompt: "Say 'hello' and nothing else" - assertions: - - type: text-includes - value: hello - - - name: "streaming structured output" - prompt: "Output a JSON object with a 'name' field" - stream: true - output: - format: json - schema: { "type": "object" } - assertions: - - type: stream-valid-json -``` - -### Full Feature Matrix - -| Feature | Description | -|---------|-------------| -| **In-process Python runner** | Zero-overhead native execution — no subprocess, no HTTP | -| **Reflection runner** | Cross-runtime support via async HTTP (JS, Go) | -| **10 validators** | Ported 1:1 from canonical JS source | -| **YAML-driven specs** | Declarative test definitions per plugin | -| **Live progress table** | Rich terminal UI with real-time updates | -| **Inline progress bars** | Per-row colored bars (green/red/dim) with pre-calculated totals | -| **Log redaction** | Data URIs auto-truncated in debug logs for readability | -| **Concurrent execution** | Semaphore-bounded parallelism (default: 8 plugins, 3 tests/model) | -| **Retry with backoff** | Exponential backoff + full jitter on failure; serial fallback | -| **Human-readable details** | Details column shows `8 std + 0 custom` instead of cryptic `8s+0c` | -| **Per-plugin overrides** | `[conform.plugin-overrides.]` for rate-sensitive plugins | -| **Pre-flight checks** | Validates specs, entry points, and env vars before running | -| **CI integration** | `check-plugin` runs in `bin/lint` on every PR | -| **Multi-runtime** | Python, JS, Go from a single command | -| **Rust-style diagnostics** | Unique error codes with actionable help messages | -| **TOML configuration** | `conform.toml` alongside specs — concurrency, env vars, runtime paths | -| **Legacy CLI fallback** | `--use-cli` delegates to `genkit dev:test-model` | - ---- - -## Architecture - -``` -conform check-model google-genai - │ - ├── Auto-detect runtimes with entry points - │ ├── python? ──→ InProcessRunner - │ │ Import conformance_entry.py - │ │ Call action.arun_raw() directly - │ │ No subprocess · No HTTP · No reflection server - │ │ - │ ├── js? ──→ ReflectionRunner - │ │ Start conformance_entry.ts subprocess - │ │ Async HTTP (httpx) → reflection API - │ │ - │ └── go? 
──→ ReflectionRunner (same as JS) - │ - All runners share: - ├── ActionRunner Protocol ← common interface - ├── Validators ← 10 validators, Protocol + @register - ├── Test cases ← 12 built-in, 1:1 with JS - └── Rich console output ← live progress + summary table -``` - -### Layout - -``` -py/ -├── tools/conform/ ← The CLI tool -│ ├── pyproject.toml ← Private package metadata -│ └── src/conform/ -│ ├── cli.py ← Argument parsing + subcommand dispatch -│ ├── config.py ← TOML config loader -│ ├── checker.py ← check-plugin: verify conformance files -│ ├── display.py ← Rich tables, inline progress bars, Rust-style errors -│ ├── log_redact.py ← Structlog processor to truncate data URIs -│ ├── plugins.py ← Plugin discovery + env-var checking -│ ├── reflection.py ← Async HTTP client for reflection API -│ ├── util_test_model.py ← Native test runner (ActionRunner) -│ ├── util_test_cases.py ← 12 built-in test cases -│ ├── types.py ← Shared types (PluginResult, Status) -│ └── validators/ ← Protocol-based validator registry -│ ├── __init__.py ← Validator Protocol + @register -│ ├── json.py ← valid-json -│ ├── streaming.py ← stream-* validators -│ ├── text.py ← text-* validators -│ └── tool.py ← has-tool-request -│ -└── tests/conform/ ← Per-plugin conformance specs - ├── conform.toml ← All repo-specific config (auto-discovered) - ├── anthropic/ - │ ├── model-conformance.yaml - │ ├── conformance_entry.py - │ ├── conformance_entry.ts - │ └── conformance_entry.go - ├── google-genai/ - ├── amazon-bedrock/ - ├── vertex-ai/ - └── ... (13 plugins total) -``` - ---- - -## Impact - -| Metric | Before | After | -|--------|--------|-------| -| **Cross-plugin testing** | Manual spot-checks | 150+ automated tests | -| **Cross-runtime parity** | Not verified | Unified test suite | -| **Time to run all plugins** | Hours (manual) | < 4 minutes | -| **New plugin onboarding** | Write custom tests | Add YAML spec + entry point | -| **CI coverage** | Unit tests only | Unit + conformance on every PR | -| **Failure diagnosis** | Dig through logs | Rust-style errors with codes | -| **Validator extensibility** | N/A | `@register` decorator | - -### CI Integration - -1. **PR checks** (`bin/lint` → `conform check-plugin`) — verifies every - model plugin has conformance specs and entry points. -2. **Conformance runs** (`conform check-model`) — full test suite - against live APIs with real model calls. 
- ---- - -## Try It - -```bash -# List all plugins and their readiness -py/bin/conform list - -# Run conformance tests for a single plugin -py/bin/conform check-model google-genai - -# Run all plugins (Python runtime) -py/bin/conform check-model - -# Run with verbose output -py/bin/conform check-model -v - -# Control concurrency: 4 plugins, 1 test/model (safe for free tiers) -py/bin/conform check-model -j 4 -t 1 - -# Disable retries (default: 2 retries with exponential backoff) -py/bin/conform check-model --max-retries 0 - -# Custom retry settings -py/bin/conform check-model --max-retries 3 --retry-base-delay 2.0 - -# Filter to a specific runtime -py/bin/conform check-model --runtime python - -# Specify config explicitly (flags are per-subcommand) -py/bin/conform list --config py/tests/conform/conform.toml - -# Verify all plugins have conformance specs (used by bin/lint) -py/bin/conform check-plugin -``` - ---- - -## Links - -- **Source**: `py/tools/conform/` -- **Specs + config**: `py/tests/conform/` (includes `conform.toml`) -- **Documentation**: `py/tools/conform/README.md` -- **Validators**: `py/tools/conform/src/conform/validators/` diff --git a/py/tools/conform/README.md b/py/tools/conform/README.md index a5078a0238..3ba1a9db83 100644 --- a/py/tools/conform/README.md +++ b/py/tools/conform/README.md @@ -329,8 +329,9 @@ py/ │ ├── pyproject.toml ← Private package + [tool.conform] config │ ├── README.md │ └── src/conform/ +│ ├── __main__.py ← Entry point for `python -m conform` │ ├── cli.py ← Argument parsing + subcommand dispatch -│ ├── config.py ← TOML config loader +│ ├── config.py ← TOML config loader (auto-discovers conform.toml) │ ├── checker.py ← check-plugin: verify conformance files exist │ ├── display.py ← Rich tables, inline progress bars, Rust-style errors │ ├── log_redact.py ← Structlog processor to truncate data URIs in logs @@ -338,8 +339,8 @@ py/ │ ├── plugins.py ← Plugin discovery and env-var checking │ ├── reflection.py ← Async HTTP client for reflection API (httpx) │ ├── runner.py ← Legacy parallel runner (genkit CLI subprocess) -│ ├── test_cases.py ← 12 built-in test cases (1:1 parity with JS) -│ ├── test_model.py ← Native test runner with ActionRunner Protocol +│ ├── util_test_cases.py ← 12 built-in test cases (1:1 parity with JS) +│ ├── util_test_model.py ← Native test runner with ActionRunner Protocol │ ├── types.py ← Shared types (PluginResult, Status, Runtime) │ └── validators/ ← Protocol-based validator registry │ ├── __init__.py ← Validator Protocol + @register decorator @@ -488,6 +489,9 @@ The wrapper script `py/bin/conform` passes `--config` automatically. [conform] concurrency = 8 test-concurrency = 3 +action-timeout = 120 # seconds per LLM action call +health-timeout = 5 # seconds per health check +startup-timeout = 30 # seconds to wait for reflection server additional-model-plugins = ["google-genai", "vertex-ai", "ollama"] [conform.env] @@ -499,6 +503,11 @@ google-genai = ["GEMINI_API_KEY"] [conform.plugin-overrides.cloudflare-workers-ai] test-concurrency = 1 +# Per-model overrides (e.g. longer timeout for slow models). +# 3-level resolution: model → plugin → global. +[conform.model-overrides."gemini-2.0-flash"] +action-timeout = 180 + # Paths are relative to the conform.toml file. [conform.runtimes.python] cwd = "../../.." 
@@ -530,6 +539,9 @@ CLI flags override TOML values: | `-j N` | `concurrency` | Max concurrent plugins | | `-t N` | `test-concurrency` | Max concurrent tests per model spec | | `--verbose` | — | Print full output for failures | +| — | `action-timeout` | Timeout in seconds for a single LLM action call (default: 120) | +| — | `health-timeout` | Timeout in seconds for health checks (default: 5) | +| — | `startup-timeout` | Timeout in seconds for reflection server startup (default: 30) | ## Adding a New Plugin diff --git a/py/tools/releasekit/ANNOUNCEMENT.md b/py/tools/releasekit/ANNOUNCEMENT.md deleted file mode 100644 index e6162ab2db..0000000000 --- a/py/tools/releasekit/ANNOUNCEMENT.md +++ /dev/null @@ -1,236 +0,0 @@ -# Announcing ReleaseKit: Automated Release Orchestration for the Genkit Python SDK - -## TL;DR - -**ReleaseKit** is a purpose-built release orchestration tool for the Genkit -Python SDK. It automates the end-to-end process of publishing 60+ Python -packages to PyPI in the correct dependency order — a process that was -previously manual, error-prone, and took hours. With ReleaseKit, a full -release takes **one command** and completes in minutes. - ---- - -## The Problem - -The Genkit Python SDK is a [uv](https://docs.astral.sh/uv/) workspace -with **62 interdependent packages**: 1 core framework, 22 plugins, and -39 samples. These packages form a 4-level dependency graph with 121 -dependency edges. - -Publishing them to PyPI requires: - -1. **Correct ordering** — `genkit` (core) must be published *before* any - plugin that depends on it, and plugins *before* samples that use them. -2. **Ephemeral version pinning** — during build, workspace-sourced - dependencies (`genkit = { workspace = true }`) must be temporarily - rewritten to concrete versions (`genkit>=0.5.0`), then restored. -3. **Transitive bump propagation** — if `genkit` bumps from 0.5.0 → 0.6.0, - every plugin and sample that depends on it must also be bumped. -4. **Crash safety** — if the process fails mid-way through package #37, - we need to resume from that point, not restart from scratch. - -**No existing tool does this.** `uv publish` is a single-package command. -`release-please` doesn't understand Python workspaces. PyPI-specific tools -like `twine` and `flit` have no concept of dependency ordering. - -Our previous release process was: -- Manual `uv publish` for each package, one at a time -- Copy-paste version numbers into pyproject.toml files -- Hope we didn't miss a dependency or publish in the wrong order -- If something failed mid-release, start over - ---- - -## The Solution - -ReleaseKit automates the entire release lifecycle: - -``` -releasekit prepare → Opens a Release PR with computed version bumps - and generated changelogs - -releasekit publish → Builds and publishes all packages to PyPI - in topological dependency order - -releasekit release → Tags the merge commit and creates a GitHub Release -``` - ---- - -## Features - -### Dependency Graph Visualization - -ReleaseKit discovers all 62 workspace packages and builds a dependency -graph, which can be visualized in 8 output formats (ASCII art, Mermaid, -Graphviz DOT, CSV, JSON, Markdown table, D2, and plain text levels): - -![releasekit graph --format ascii](https://raw.githubusercontent.com/firebase/genkit/main/py/tools/releasekit/docs/docs/images/releasekit_graph_ascii.png) - -The topological sort guarantees that every package is published only after -all its dependencies are available on PyPI. 
- -### Workspace Health Checks - -19 automated health checks run on every PR via `bin/lint`, catching -issues before they reach PyPI: - -![releasekit check](https://raw.githubusercontent.com/firebase/genkit/main/py/tools/releasekit/docs/docs/images/releasekit_check.png) - -Checks include: circular dependency detection, missing LICENSE/README -files, version consistency across all plugins, PEP 561 type markers, -lockfile staleness, naming conventions, and PyPI metadata completeness. - -### Architecture Overview - -The publish pipeline processes each package through 8 stages, with a -dependency-triggered scheduler that maximizes parallelism: - -![releasekit architecture](https://raw.githubusercontent.com/firebase/genkit/main/py/tools/releasekit/docs/docs/images/releasekit_overview.png) - -### Full Feature Matrix - -| Feature | Description | -|---------|-------------| -| **Dependency-triggered publishing** | Packages publish as soon as their dependencies complete, maximizing parallelism | -| **Conventional commits → semver** | Automatic version bump computation from git history | -| **Transitive propagation** | A change in `genkit` triggers patch bumps for all 61 dependents | -| **Crash-safe resume** | State persistence after each package; resume from failure point | -| **19 pre-publish health checks** | Catch issues at PR time, not after a broken release | -| **Ephemeral pinning** | Workspace deps temporarily pinned to concrete versions for build | -| **Post-publish verification** | SHA-256 checksums verified against PyPI | -| **Smoke testing** | `python -c 'import ...'` after publish to verify installability | -| **Changelog generation** | Per-package changelogs from conventional commits | -| **Git tagging** | Per-package tags (`genkit-v0.5.0`) + umbrella tag (`v0.5.0`) | -| **8 graph formats** | ASCII, CSV, DOT, D2, JSON, Mermaid, Markdown table, levels | -| **Rust-style diagnostics** | Every error has a unique code (e.g. `RK-GRAPH-CYCLE-DETECTED`) | -| **SIGUSR1/SIGUSR2 controls** | Pause/resume the scheduler from another terminal | -| **Release groups** | Publish a subset of packages (e.g. `--group core`) | -| **Rollback** | Delete a tag and its GitHub release with one command | - ---- - -## Architecture - -ReleaseKit is built on a **protocol-based backend architecture** that -makes it fully testable with in-memory fakes — no subprocess calls, no -network I/O, no file system side effects in tests: - -``` -releasekit -├── Backends (DI / Protocol-based) -│ ├── VCS git operations (tag, commit, push) -│ ├── PackageManager build, publish, lock (uv) -│ ├── Workspace package discovery (uv) -│ ├── Registry package registry queries (PyPI) -│ └── Forge release / PR management (GitHub CLI + API) -│ -├── Core Pipeline -│ ├── workspace.py discover packages from pyproject.toml -│ ├── graph.py build & topo-sort dependency graph -│ ├── versioning.py conventional commits → semver bumps -│ ├── scheduler.py dependency-triggered queue dispatcher -│ ├── publisher.py async publish orchestration -│ ├── preflight.py pre-publish safety checks -│ └── checks/ standalone workspace health checks (subpackage) -│ -├── Formatters 8 output formats (ASCII, CSV, DOT, Mermaid, ...) 
-├── UX Rust-style errors, structured logging, CLI -└── UI Rich live progress (TTY) / structured logs (CI) -``` - -### Publish Pipeline - -Each package goes through an 8-stage pipeline: - -``` -pin → build → checksum → publish → poll → verify_checksum → smoke_test → restore -``` - -The **dependency-triggered scheduler** is more efficient than level-based -lockstep — each package starts as soon as all its dependencies complete, -not when the entire level finishes. - ---- - -## Impact - -| Metric | Before | After | -|--------|--------|-------| -| **Release time** | Hours (manual) | Minutes (automated) | -| **Risk of wrong ordering** | High | Zero (topological sort) | -| **Crash recovery** | Start over | Resume from failure point | -| **Version consistency** | Error-prone | Enforced by 19 checks | -| **Missing metadata** | Found after publish | Caught at PR time | -| **Changelog** | Manual | Auto-generated from commits | -| **PyPI verification** | Manual spot-check | Automated checksum + smoke test | - -### CI Integration - -ReleaseKit is integrated into CI at two levels: - -1. **PR checks** (`bin/lint` → `releasekit check`) — runs 19 health - checks on every PR touching `py/`. Catches issues before merge. -2. **Publish workflow** (`.github/workflows/publish_python.yml`) — - orchestrates the full publish pipeline on release tags. - ---- - -## Dependency Graph (62 packages, 121 edges, 4 levels) - -``` -┌────────────────────────────────────────────────────────┐ -│ Level 0 │ -│ genkit (0.5.0) │ -├────────────────────────────────────────────────────────┤ -│ Level 1 (19 plugins + 1 sample) │ -│ genkit-plugin-anthropic (0.5.0) │ -│ genkit-plugin-google-genai (0.5.0) │ -│ genkit-plugin-firebase (0.5.0) │ -│ genkit-plugin-vertex-ai (0.5.0) │ -│ genkit-plugin-ollama (0.5.0) │ -│ ... 15 more │ -├────────────────────────────────────────────────────────┤ -│ Level 2 (35 packages) │ -│ genkit-plugin-deepseek (0.5.0) │ -│ genkit-plugin-flask (0.5.0) │ -│ provider-google-genai-hello (0.1.0) │ -│ web-endpoints-hello (0.1.0) │ -│ ... 31 more │ -├────────────────────────────────────────────────────────┤ -│ Level 3 (6 packages) │ -│ framework-restaurant-demo (0.1.0) │ -│ provider-vertex-ai-model-garden (0.1.0) │ -│ ... 4 more │ -└────────────────────────────────────────────────────────┘ -``` - ---- - -## Try It - -```bash -# From the genkit repo root -cd py/tools/releasekit - -# Discover all workspace packages -uv run releasekit discover - -# View the dependency graph -uv run releasekit graph --format ascii - -# Run workspace health checks -uv run releasekit check - -# Preview what a release would look like -uv run releasekit plan -``` - ---- - -## Links - -- **Source**: `py/tools/releasekit/` in the genkit repo -- **CI PR**: [#4590](https://github.com/firebase/genkit/pull/4590) — enables `releasekit check` in CI -- **Documentation PR**: [#4589](https://github.com/firebase/genkit/pull/4589) — MkDocs engineering docs -- **Publish Workflow**: `.github/workflows/publish_python.yml` diff --git a/py/tools/releasekit/FIXES.md b/py/tools/releasekit/FIXES.md deleted file mode 100644 index 325eb8aacb..0000000000 --- a/py/tools/releasekit/FIXES.md +++ /dev/null @@ -1,143 +0,0 @@ -# Releasekit: Audit Fixes Roadmap - -Findings from an exhaustive audit cross-referencing known pain points from -[release-please](https://github.com/googleapis/release-please/issues) and -[python-semantic-release](https://github.com/python-semantic-release/python-semantic-release/issues) -against the releasekit codebase. 
- -## Dependency Graph - -```text - F1: Label new PRs ──────────┐ - │ - F2: Filter by head branch ──┼──▶ F5: Auto-prepare on push to main - │ │ - F3: checkout@v5 → v4 ───────┘ │ - ▼ - F4: --first-parent dedup ──────▶ F6: Write CHANGELOG.md to disk - (per-package files, prepend new - entries, commit in release branch) -``` - -## Reverse Topological Order (phases) - -### Phase 1 — Foundations (no dependencies, land first) - -These are prerequisites for the auto-prepare feature and fix real bugs. - -| ID | Severity | File | Fix | -|----|----------|------|-----| -| **F1** | Medium | `prepare.py` | Add `autorelease: pending` label to **newly created** PRs, not just updated ones. Without this, `tag_release` can't find the merged PR. | -| **F2** | Medium | `release.py` | Filter `list_prs()` by `head='releasekit--release'` in addition to label. Prevents race where a stale PR with the same label is picked up. | -| **F3** | Medium | `publish_python.yml` | Change `actions/checkout@v5` → `@v4`. v5 doesn't exist; workflow will fail. | - -### Phase 2 — Changelog Quality (independent) - -| ID | Severity | File | Fix | -|----|----------|------|-----| -| **F4** | Critical | `vcs/git.py` + `vcs/__init__.py` | Add `--first-parent` to `git log` in the `log()` method. Prevents duplicate changelog entries when merge commits repeat the same conventional commit message as the squashed commit. See [release-please#2476](https://github.com/googleapis/release-please/issues/2476). | - -### Phase 3 — Auto-Prepare + Changelog Files (depends on F1 + F2 + F4) - -| ID | Severity | File | Fix | -|----|----------|------|-----| -| **F5** | Feature | `release.yml` | Add `push` trigger on `main` so `prepare` runs automatically on every merge. The Release PR stays up-to-date with accumulated changelogs. Publish remains manual or merge-triggered. | -| **F6** | Feature | `prepare.py` | Write per-package `CHANGELOG.md` files to disk during `prepare`. Prepend new entries to existing file (or create it). Commit alongside version bumps on the release branch. Depends on F4 so written changelogs are dedup-clean. | - -## Detailed Fix Descriptions - -### F1: Label new PRs with `autorelease: pending` - -**Problem**: In `prepare.py:334-349`, the label is only added when an -existing PR is found and updated. When a brand-new PR is created, it -never gets the label. The `release` step searches for merged PRs by -this label, so it will miss PRs that were created fresh and then merged. - -**Fix**: After `forge.create_pr()`, extract the PR number from the -result and call `forge.add_labels(pr_number, [_AUTORELEASE_PENDING])`. - -### F2: Filter merged PR lookup by head branch - -**Problem**: In `release.py:236-240`, `list_prs(label=..., state='merged')` -could return a stale PR from a previous release cycle if the label wasn't -cleaned up. Adding `head='releasekit--release'` narrows the search to -only the correct branch. - -**Fix**: Add `head=_RELEASE_BRANCH` to the `list_prs` call. Import the -branch constant or define it locally. - -### F3: Fix `actions/checkout` version - -**Problem**: `publish_python.yml:84` uses `actions/checkout@v5` which -does not exist. The latest stable is `v4`. - -**Fix**: One-line change: `@v5` → `@v4`. - -### F4: Deduplicate changelog entries with `--first-parent` - -**Problem**: When a PR is merged (not squashed), both the merge commit -and the original feature commit appear in `git log`. 
If both have the -same conventional commit message (common with GitHub's default merge -commit format), the changelog gets duplicate entries. - -**Fix**: Add `--first-parent` flag to the git log command in -`GitCLIBackend.log()`. This follows only the first parent of merge -commits, which is the mainline. Also add the parameter to the `VCS` -protocol so all backends are aware of it. - -### F5: Auto-prepare on push to main - -**Problem**: Currently `release.yml` only triggers on `workflow_dispatch`, -requiring manual intervention to create/update the Release PR. - -**Fix**: Add a `push` trigger filtered to `main` (and scoped to `py/` -paths). The `prepare` job's `if` condition is updated to also run on -push events. The `publish` job remains gated on manual dispatch or -PR merge with the `autorelease: pending` label. - -Result: - -```text -push to main ──▶ prepare runs ──▶ Release PR created/updated - │ - (human reviews, merges when ready) - │ - ▼ - publish triggers automatically -``` - -### F6: Write per-package CHANGELOG.md files to disk - -**Problem**: Currently, changelogs are only rendered into the Release PR -body. There are no `CHANGELOG.md` files in any package directory. Users -cannot view the changelog locally, and published PyPI packages have no -changelog file included. - -**Where changelogs go today**: -- PR body only (via `_build_pr_body` in `prepare.py`) -- `releasekit plan` shows version bumps but not changelog text - -**Fix**: After generating changelogs in `prepare_release` (step 6), -write each package's rendered changelog to `{pkg.path}/CHANGELOG.md`: - -1. If the file exists, prepend the new version's section above the - existing content (below the `# Changelog` heading). -2. If the file doesn't exist, create it with a `# Changelog` heading - followed by the new section. -3. Include the changelog files in the release branch commit (step 8). - -This ensures: -- Changelog is visible in the repo alongside each package -- Changelog is included in the sdist/wheel (via `pyproject.toml` - `[tool.setuptools]` or default inclusion) -- The Release PR diff shows the changelog additions for review -- `releasekit plan --format table` can optionally preview changelog text - -## Low-Priority Notes (not blocking, document only) - -| Issue | Notes | -|-------|-------| -| Signal handler in `pin.py` not async-safe | Acceptable — crash recovery only, atexit + finally handle normal cases | -| Prerelease format (`0.6.0rc1` vs `0.6.0a1`) | Valid PEP 440, but unconventional for `alpha`/`beta` labels | -| Pre-1.0 major bump convention | Design choice — `0.x` + breaking → `1.0.0`. Some prefer `0.x+1`. Document. | -| Advisory lock not atomic | Acceptable — CI concurrency group prevents races in practice | diff --git a/py/tools/releasekit/README.md b/py/tools/releasekit/README.md index 712252b1d8..33fe6a8161 100644 --- a/py/tools/releasekit/README.md +++ b/py/tools/releasekit/README.md @@ -64,18 +64,18 @@ implementation plan. 
| 🔄 Retry with backoff | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | | 🔒 Release lock | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | | ✍️ Signing / provenance | 🔜 | ❌ | ⚠️ npm | ❌ | ❌ | ❌ | ✅ GPG/Cosign | -| 📋 SBOM | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | +| 📋 SBOM | ✅ CycloneDX+SPDX | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | | 📢 Announcements | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | -| 📊 Plan profiling | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | -| 🔭 OpenTelemetry tracing | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | +| 📊 Plan profiling | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | +| 🔭 OpenTelemetry tracing | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | | 🔄 Migrate from alternatives | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | **Legend:** ✅ = supported, ⚠️ = partial, ❌ = not supported, 🔜 = planned See [docs/competitive-gap-analysis.md](docs/competitive-gap-analysis.md) for the full analysis with issue tracker references, and -[docs/roadmap-execution-plan.md](docs/roadmap-execution-plan.md) for the -dependency-graphed, topo-sorted execution plan. +[roadmap.md](roadmap.md) for the detailed roadmap with dependency graphs +and execution phases. ## Getting Started @@ -121,6 +121,7 @@ uvx releasekit check | `explain` | Look up any error code (e.g. `releasekit explain RK-GRAPH-CYCLE-DETECTED`) | | `version` | Show the releasekit version | | `migrate` | Migrate from another release tool (release-please, semantic-release, changesets, etc.) | +| `doctor` | Diagnose inconsistent state between workspace, git tags, and platform releases | | `completion` | Generate shell completion scripts (bash/zsh/fish) | ## Features @@ -399,17 +400,27 @@ in the workspace root. Use `releasekit init` to scaffold one: ```toml # releasekit.toml -changelog = true -smoke_test = true -tag_format = "{name}-v{version}" -umbrella_tag = "v{version}" +forge = "github" +repo_owner = "firebase" +repo_name = "genkit" +default_branch = "main" +pr_title_template = "chore(release): v{version}" + +[workspace.py] +ecosystem = "python" +tool = "uv" # defaults from ecosystem if omitted +root = "py" +tag_format = "{name}@{version}" +umbrella_tag = "py/v{version}" +changelog = true +smoke_test = true +major_on_zero = false +max_commits = 500 # limit git log depth for large repos +extra_files = [] exclude_publish = ["group:samples"] -major_on_zero = false -pr_title_template = "chore(release): v{version}" -extra_files = [] -[groups] +[workspace.py.groups] core = ["genkit"] samples = ["*-hello", "*-demo", "web-*"] ``` @@ -432,6 +443,7 @@ samples = ["*-hello", "*-demo", "web-*"] | `major_on_zero` | `false` | Allow `0.x → 1.0.0` on breaking changes (default: downgrade to minor) | | `pr_title_template` | `"chore(release): v{version}"` | Template for the Release PR title. 
Placeholder: `{version}` | | `extra_files` | `[]` | Extra files with version strings to bump (path or `path:regex` pairs) | +| `max_commits` | `0` | Limit git log depth (0 = unlimited; useful for large repos) | ### Exclusion Hierarchy @@ -516,7 +528,14 @@ releasekit │ ├── release_notes.py release notes generation │ ├── commitback.py commit-back version bumps │ ├── detection.py multi-ecosystem auto-detection -│ └── groups.py release group filtering +│ ├── groups.py release group filtering +│ ├── sbom.py CycloneDX + SPDX SBOM generation +│ ├── profiling.py pipeline step timing + bottleneck analysis +│ ├── tracing.py optional OpenTelemetry tracing (graceful no-op) +│ ├── doctor.py release state consistency checker +│ ├── distro.py distro packaging dep sync (Debian/Fedora/Homebrew) +│ ├── branch.py default branch resolution +│ └── commit_parsing/ conventional commit parser (subpackage) │ ├── Formatters │ ├── ascii_art.py box-drawing terminal art @@ -532,7 +551,7 @@ releasekit ├── UX │ ├── errors.py error catalog + Rust-style render_error/render_warning │ ├── logging.py structured logging setup -│ ├── config.py TOML config loading + validation +│ ├── config.py TOML config loading + validation (workspace-aware) │ ├── init.py workspace config scaffolding │ └── cli.py argparse + rich-argparse + shell completion │ @@ -616,6 +635,8 @@ enables multi-ecosystem support: ## Testing +The test suite has **1,274 tests** across 19k+ lines: + ```bash # Run all tests uv run pytest tests/ diff --git a/py/tools/releasekit/docs/competitive-gap-analysis.md b/py/tools/releasekit/docs/competitive-gap-analysis.md index d4e374d7e9..01c834974b 100644 --- a/py/tools/releasekit/docs/competitive-gap-analysis.md +++ b/py/tools/releasekit/docs/competitive-gap-analysis.md @@ -595,9 +595,8 @@ signing, and publishing. 27. **Dart/Pub workspace backend** (`pubspec.yaml`, `dart pub publish`). 28. **Rustification** — Rewrite core in Rust with PyO3/maturin (see roadmap §12). -> **See [roadmap-execution-plan.md](roadmap-execution-plan.md)** for the -> dependency-graphed, topo-sorted parallel execution plan with Gantt chart -> and critical path analysis. +> **See [../roadmap.md](../roadmap.md)** for the detailed roadmap with +> dependency graphs and execution phases. --- diff --git a/py/tools/releasekit/docs/roadmap-execution-plan.md b/py/tools/releasekit/docs/roadmap-execution-plan.md deleted file mode 100644 index 3bc750fce9..0000000000 --- a/py/tools/releasekit/docs/roadmap-execution-plan.md +++ /dev/null @@ -1,1002 +0,0 @@ -# Releasekit Roadmap — Dependency Graph & Parallel Execution Plan - -**Date:** 2026-02-13 - -This document models every roadmap item as a node in a dependency graph, -reverse-topologically sorts it, and partitions it into **parallel execution -phases** (levels) so that independent work streams can proceed simultaneously. - ---- - -## 0. Genkit Python Release — Status - -The full roadmap (§1–§9) covers releasekit's long-term vision across all -ecosystems. This section tracks items **immediately relevant to shipping -Genkit Python**, ordered by release-blocking priority. - -Context: [PR #4586](https://github.com/firebase/genkit/pull/4586) migrates -`publish_python.yml` to use `releasekit publish`. The -[FIXES.md](../FIXES.md) audit identified 6 fixes (F1–F6). The -`releasekit.toml` config defines groups (core, google_plugins, -community_plugins), tag format, and publish exclusions. 
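-
-As a rough sketch of how a `group:`-prefixed exclusion entry can expand
-into concrete package names, consider the following. The group names and
-glob patterns are illustrative (borrowed from the sample config), and the
-matching rules are an assumption; the real behaviour is releasekit's
-group filtering (`groups.py`) and may differ in detail.
-
-```python
-# Illustrative only: expand exclude_publish entries against glob-style
-# group patterns. Not the actual groups.py implementation.
-from fnmatch import fnmatch
-
-groups = {
-    'core': ['genkit'],
-    'samples': ['*-hello', '*-demo', 'web-*'],
-}
-exclude_publish = ['group:samples']
-
-def excluded(package: str) -> bool:
-    """Return True if the package matches any exclusion entry."""
-    for entry in exclude_publish:
-        if entry.startswith('group:'):
-            patterns = groups[entry.removeprefix('group:')]
-        else:
-            patterns = [entry]
-        if any(fnmatch(package, pattern) for pattern in patterns):
-            return True
-    return False
-
-print(excluded('provider-google-genai-hello'))  # True: matches '*-hello'
-print(excluded('genkit-plugin-firebase'))       # False: gets published
-```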
- -### Tier 0 — Release Blockers — ✅ ALL DONE - -| ID | Item | Status | Notes | -|----|------|--------|-------| -| **F4** | `--first-parent` in `git log` | ✅ Done | `versioning.py:316`, `changelog.py:320` already pass `first_parent=True` | -| **F1** | Label new PRs with `autorelease: pending` | ✅ Done | `prepare.py:376-390` labels both new and existing PRs | -| **F2** | Filter merged PR lookup by head branch | ✅ Done | `release.py:237-241` filters by `head=_RELEASE_BRANCH` | -| **F3** | Fix `actions/checkout@v5` → `@v4` | ✅ N/A | `actions/checkout@v5` exists (released 2024). Not a bug. | - -### Tier 1 — High Value — ✅ ALL DONE - -| ID | Item | Status | Notes | -|----|------|--------|-------| -| **F6** | Write per-package `CHANGELOG.md` to disk | ✅ Done | `prepare.py:313-321` + `changelog.py:write_changelog()` | -| **F5** | Auto-prepare on push to main | ✅ Done | `releasekit-uv.yml:46-50` triggers on push to `py/packages/**`, `py/plugins/**` | -| **R07** | Internal dep version propagation | ✅ Done | `versioning.py:386-400` BFS propagation via `graph.reverse_edges` | -| **R32** | Parallel `vcs.log()` in `compute_bumps` | ✅ Done | Replaced sequential loop with `asyncio.gather` (2026-02-12) | -| **R04** | Revert commit handling | ✅ Done | `parse_conventional_commit` detects `Revert "..."` and `revert:` formats; bump counter cancellation (2026-02-12) | -| **R27** | `--ignore-unknown-tags` flag | ✅ Done | `compute_bumps(ignore_unknown_tags=True)` falls back to full history on bad tags; CLI flag on publish/plan/version (2026-02-12) | -| — | `--no-merges` in VCS protocol | ✅ Done | `VCS.log(no_merges=True)` filters accidental merge commits from bump computation and changelogs (2026-02-12) | -| — | Default branch auto-detection | ✅ Done | `VCS.default_branch()` + `branch.py:resolve_default_branch()` + `config.default_branch` override. Git: `symbolic-ref` → probe → fallback. Mercurial: `"default"` (2026-02-12) | -| — | Distro packaging dep sync | ✅ Done | `distro.py`: auto-syncs Debian/Ubuntu `control`, Fedora/RHEL `.spec`, and Homebrew formula deps from `pyproject.toml`. Check via `releasekit check`, fix via `releasekit check --fix` (2026-02-12, Homebrew added 2026-02-13) | -| — | Non-conventional commit warnings | ✅ Done | `versioning.py` and `changelog.py` now log `non_conventional_commit` warnings for improperly formatted commit messages (2026-02-12) | -| — | Debian/Ubuntu + Fedora/RHEL + Homebrew packaging | ✅ Done | `packaging/debian/` (control, changelog, copyright, rules) + `packaging/fedora/*.spec` + `packaging/homebrew/*.rb` + `packaging/README.md` (2026-02-12, Homebrew added 2026-02-13) | -| — | pnpm publish params (`dist_tag`, `publish_branch`, `provenance`) | ✅ Done | Threaded through `PackageManager` protocol → `PnpmBackend.publish()` → `PublishConfig` → `WorkspaceConfig` → CLI `--dist-tag` flag (2026-02-13) | -| — | Ecosystem-aware `discover_packages` | ✅ Done | `discover_packages(ecosystem=)` dispatches to `PnpmWorkspace` for JS, `uv` for Python. Async bridge via `_discover_js_packages` (2026-02-13) | -| — | `pyproject_path` → `manifest_path` rename | ✅ Done | Renamed across all source and test files for ecosystem-agnostic naming (2026-02-13) | - -### Tier 2 — Important but Not Blocking — ✅ ALL DONE - -| ID | Item | Effort | Status | Why Important | -|----|------|--------|--------|---------------| -| **R25** | `--commit-depth` / `--max-commits` | S | ✅ Done | `max_commits` param on VCS protocol, `compute_bumps`, `WorkspaceConfig`. 
| -| **R05** | `releasekit doctor` | M | ✅ Done | `run_doctor` in `doctor.py` with 6 checks (config, tag alignment, orphaned tags, VCS state, forge, default branch). CLI `releasekit doctor` subcommand wired (2026-02-13). | -| **R26** | `bootstrap-sha` config | S | ✅ Done | `bootstrap_sha` on `WorkspaceConfig`, threaded through `compute_bumps`, `prepare_release`, and all CLI call sites. Falls back to full history when no tag exists (2026-02-13). | -| **R08** | Contributor attribution in changelogs | S | ✅ Done | `ChangelogEntry.author` field, git log format `%H\x00%an\x00%s`, rendered as `— @author` in changelog entries (2026-02-13). | -| **R28** | Lockfile update after version bump | S | ✅ Done | `prepare.py` step 5 calls `pm.lock(upgrade_package=ver.name)` after each `bump_pyproject` (2026-02-13). | -| **R17** | Auto-merge release PRs | S | ✅ Done | `auto_merge` config on `WorkspaceConfig`. `prepare.py` step 10 calls `forge.merge_pr()` after labeling. All 4 forge backends implement `merge_pr` (2026-02-13). | - -### Genkit JS Release — Parity Analysis & Migration Plan - -**Goal:** Migrate Genkit JS from its current shell-script-based release -process to releasekit, achieving full parity before switching over. - -#### Current Genkit JS Release Process (as-is) - -The JS release pipeline is spread across 6 GitHub Actions workflows and -4 shell scripts: - -| Workflow / Script | What It Does | -|-------------------|-------------| -| `bump-js-version.yml` | Manual dispatch → runs `bump_and_tag_js.sh` to bump **all** JS packages in lockstep, commit, tag, push. | -| `bump-cli-version.yml` | Manual dispatch → runs `bump_and_tag_cli.sh` to bump CLI packages (`tools-common`, `telemetry-server`, `genkit-cli`) separately. | -| `bump-package-version.yml` | Manual dispatch → bumps a **single** package by dir + name. | -| `release_js_main.yml` | Manual dispatch → `pnpm install && pnpm build && pnpm test:js`, then runs `scripts/release_main.sh` which publishes ~20 packages **sequentially** to Wombat Dressing Room (Google's npm proxy). | -| `release_js_package.yml` | Manual dispatch → publishes a **single** package to Wombat. | -| `build-cli-binaries.yml` | Manual dispatch → cross-compiles CLI binaries via Bun for 5 platforms (linux-x64, linux-arm64, darwin-x64, darwin-arm64, win32-x64), uploads artifacts, runs smoke tests. | - -**Key characteristics:** -- **Manual version bumps** — operator picks `patch`/`minor`/`major`/`prerelease` via workflow dispatch; no Conventional Commits automation. -- **Synchronized versions** — `bump_and_tag_js.sh` bumps all JS packages to the same version (lockstep mode). -- **Separate CLI versioning** — CLI packages (`genkit-tools/*`) are versioned independently from `js/*` packages. -- **Tag format** — dual tags per package: `{tag_prefix}{version}` (e.g. `core-v1.2.3`) **and** `{package_name}@{version}` (e.g. `@genkit-ai/core@1.2.3`). -- **npm dist-tag** — publishes with `--tag next` or `--tag latest` (operator choice). -- **Wombat Dressing Room** — all publishes go through `https://wombat-dressing-room.appspot.com/` (Google's npm proxy that adds provenance). -- **No changelogs** — no automated CHANGELOG generation. -- **No Release PR** — version bumps are committed directly to the branch. -- **No dependency graph awareness** — publish order is hardcoded in `release_main.sh`. -- **Sequential publish** — one package at a time, no parallelism. -- **Clean worktree check** — `ensure-clean-working-tree.sh` runs after build, before publish. 
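-
-For contrast with the manual bump selection noted above, here is a
-simplified sketch of the Conventional Commits to bump-level mapping that
-releasekit automates. The regex and rules are an approximation for
-illustration, not the actual `versioning.py` implementation, which also
-handles revert commits that cancel the bump they revert.
-
-```python
-# Approximate sketch of conventional-commit classification; the real
-# parse_conventional_commit in versioning.py is more complete.
-import re
-
-_HEADER = re.compile(r'^(?P<type>\w+)(?:\([^)]*\))?(?P<bang>!)?:\s')
-
-def bump_level(message: str) -> str:
-    """Classify a commit message as a major, minor, patch, or no-op bump."""
-    match = _HEADER.match(message)
-    if match is None:
-        return 'none'  # not a conventional commit
-    if match.group('bang') or 'BREAKING CHANGE:' in message:
-        return 'major'
-    if match.group('type') == 'feat':
-        return 'minor'
-    if match.group('type') == 'fix':
-        return 'patch'
-    return 'none'  # chore, docs, refactor, ...
-
-print(bump_level('feat(scheduler): dependency-triggered dispatch'))  # minor
-print(bump_level('fix: restore pyproject after a failed publish'))   # patch
-```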
- -#### Releasekit Parity Gap Analysis - -| JS Capability | Releasekit Status | Gap / Work Needed | -|--------------|-------------------|-------------------| -| pnpm workspace discovery | ✅ Done | `PnpmWorkspaceBackend` reads `pnpm-workspace.yaml`, discovers packages from `package.json`. | -| `npm version` bump | ✅ Done | `PnpmBackend.version_bump()` uses `npm version --no-git-tag-version`. | -| Synchronized (lockstep) versions | ✅ Done | `synchronize = true` in `WorkspaceConfig`. | -| Independent per-package bump | ✅ Done | Default mode. | -| Separate release groups (JS vs CLI) | ✅ Done | `groups` config in `WorkspaceConfig`. | -| Dual tag format (`prefix-v` + `name@`) | ✅ Done | `tag_format` with `{label}` placeholder. Per-workspace config. | -| npm dist-tag (`next` / `latest`) | ✅ Done | `--dist-tag` CLI flag → `WorkspaceConfig` → `PublishConfig` → `PnpmBackend.publish(--tag)`. | -| Wombat Dressing Room registry | ✅ Done | `PnpmBackend.publish(index_url=...)` maps to `--registry`. | -| `pnpm publish` | ✅ Done | `PnpmBackend.publish()` with `--access=public`, `--registry`, `--tag`, `--publish-branch`, `--provenance`. | -| `pnpm install` / `pnpm build` / `pnpm test` | ✅ Done | `PnpmBackend.build()` (`pnpm pack`), `lock()`, `smoke_test()`. | -| pnpm lockfile update | ✅ Done | `PnpmBackend.lock()` — `pnpm install --lockfile-only` / `--frozen-lockfile`. | -| Cross-compiled CLI binaries | ❌ Out of scope | **R23**: Cross-compilation orchestration. Separate concern from release. | -| Conventional Commits automation | ✅ Done | JS currently lacks this; releasekit adds it. **Upgrade.** | -| Changelog generation | ✅ Done | JS currently lacks this; releasekit adds it. **Upgrade.** | -| Release PR workflow | ✅ Done | JS currently lacks this; releasekit adds it. **Upgrade.** | -| Dependency-aware publish order | ✅ Done | JS currently hardcodes order; releasekit computes it from the graph. **Upgrade.** | -| Parallel publish | ✅ Done | JS publishes sequentially; releasekit parallelizes by dependency level. **Upgrade.** | -| Clean worktree preflight | ✅ Done | `preflight.py` checks this. | -| Prerelease support (`--preid rc`) | ⚠️ Partial | **R03**: Full prerelease workflow (rollup vs separate). Basic `prerelease` param exists in `compute_bumps`. | - -#### Migration Workflow - -**Phase 1 — pnpm Backend (R11)** - -Implement the pnpm workspace backend so releasekit can discover, build, -test, version-bump, and publish JS packages: - -1. `PnpmWorkspaceBackend` — discover packages from `pnpm-workspace.yaml` -2. `PnpmBackend.publish()` — `pnpm publish` with `--tag`, `--registry`, - `--publish-branch`, `--access=public`, `--provenance=false` -3. `PnpmBackend.lock()` — `pnpm install --lockfile-only` -4. `PnpmBackend.version_bump()` — `npm version` or `pnpm version` -5. `PnpmBackend.build()` / `test()` — `pnpm build`, `pnpm test` - -**Phase 2 — npm Registry Backend (R12, R37)** - -1. `NpmRegistryBackend` — check if a version is already published - (`npm view @`) -2. Wombat Dressing Room support — custom `--registry` URL for publish -3. npm dist-tag support — `--tag next` / `--tag latest` - -**Phase 3 — JS Workspace Config** - -Add a `[workspace.js]` section to `releasekit.toml`: - -```toml -[workspace.js] -ecosystem = "js" -tool = "pnpm" -root = "." # JS packages span root + js/ -tag_format = "{name}@{version}" -umbrella_tag = "js/v{version}" -synchronize = true # lockstep versions for js/* -bootstrap_sha = "abc123..." 
# starting point for adoption - -[workspace.js-cli] -ecosystem = "js" -tool = "pnpm" -root = "genkit-tools" -tag_format = "{name}@{version}" -synchronize = true -``` - -**Phase 4 — Parallel Cutover** - -1. Run releasekit in `--dry-run` mode alongside existing scripts for - 1–2 release cycles to validate parity. -2. Verify tag format, version bumps, and publish output match. -3. Switch `release_js_main.yml` to call `releasekit publish`. -4. Archive `scripts/release_main.sh` and `js/scripts/bump_*.sh`. - -#### What Releasekit Gains Over Current JS Process - -- **Automated version bumps** from Conventional Commits (no manual - `patch`/`minor`/`major` selection). -- **Changelogs** generated automatically per package. -- **Release PR workflow** with review gate before publish. -- **Dependency-aware parallel publish** instead of hardcoded sequential. -- **Unified tooling** across Python and JS ecosystems. -- **Rollback support** (`releasekit rollback `). -- **Preflight checks** (cycles, lockfile, shallow clone, forge). -- **Doctor diagnostics** for state consistency. - -### Tier 3 — Extended Features (partially done) - -| ID | Item | Status | Notes | -|----|------|--------|-------| -| ★ **R11** | pnpm workspace publish pipeline | ✅ Done | `PnpmBackend` + `PnpmWorkspace` fully implemented (2026-02-13). | -| ★ **R12** | npm registry backend | ✅ Done | `NpmRegistry` with `npm view` version check (2026-02-13). | -| ★ **R37** | Custom registry URL / Wombat Dressing Room | ✅ Done | `index_url` wired through `PnpmBackend.publish(--registry)` (2026-02-13). | -| **R13** | Scoped tag format | ✅ Done | `parse_tag()` reverse-parses scoped npm tags (`@scope/name@version`). `secondary_tag_format` config for dual-tagging in `create_tags`. (2026-02-13). | -| **R30** | Plan profiling | ✅ Done | `profiling.py`: `StepTimer` context manager, `PipelineProfile` with summary stats, JSON export, and ASCII table rendering (2026-02-13). | -| **R31** | OpenTelemetry tracing | ✅ Done | `tracing.py`: optional OTel spans with zero-overhead no-op fallback when `opentelemetry-api` is not installed. `@span` decorator for sync/async. `pip install releasekit[tracing]` (2026-02-13). | -| **R02** | Standalone repo packaging | Pending | PyPI-publishable wheel + entry point. | -| **R03** | Full prerelease workflow | Pending | Rollup vs separate prerelease modes. | -| **R06** | Hotfix / maintenance branches | Pending | `--base-branch` for non-default branch releases. | -| **R10** | Snapshot releases | Pending | `--snapshot` for CI testing with ephemeral versions. | -| **R14** | npm provenance (Sigstore) | Pending | `--provenance` attestation for npm publishes. | -| **R15** | GPG / Sigstore signing | Pending | Sign tags and release artifacts. | -| **R16** | SBOM generation | ✅ Done | `sbom.py`: CycloneDX 1.5 + SPDX 2.3 JSON generation from release manifest. Package URLs (purl), license IDs, supplier metadata. `generate_sbom()` + `write_sbom()` (2026-02-13). | -| **R23** | Cross-compilation orchestration | Pending | CLI binary builds for multiple platforms. | -| **R24** | PEP 440 scheme | Pending | Full PEP 440 version scheme support. | -| **R29** | `releasekit migrate` | Pending | Protocol-based migration from alternatives. | -| **R38** | Cherry-pick for release branches | Pending | `releasekit cherry-pick` subcommand. 
| -| **R18–R22** | Changelog templates, announcements, changesets, plugins, programmatic API | Pending | | -| **R33–R36** | Bazel, Rust, Java, Dart ecosystem backends | Pending | | - -### Implementation Summary (2026-02-13) - -All Tier 0, Tier 1, and Tier 2 items are complete. The release pipeline -is production-ready for Genkit Python. JS parity backends (pnpm, npm) -are implemented and ecosystem-aware. - -**2026-02-13 additions:** `releasekit doctor` CLI wired, `bootstrap_sha` -confirmed wired, contributor attribution in changelogs (`@author`), -lockfile update after bump confirmed wired, auto-merge release PRs -(`auto_merge` config + `forge.merge_pr()`). - -**Codebase stats:** 73 source modules (~23,400 LOC), 51 test files -(~20,700 LOC), 1293 tests passing (86% coverage), 14 CLI subcommands. - -**Protocols:** 5 backend protocols (VCS, PackageManager, Workspace, -Registry, Forge) + 1 check protocol (CheckBackend). - -**Backends implemented:** - -| Protocol | Backends | -|----------|----------| -| VCS | Git (full), Mercurial (full) | -| PackageManager | uv, pnpm | -| Workspace | uv, pnpm | -| Registry | PyPI, npm | -| Forge | GitHub (CLI + API), GitLab (CLI), Bitbucket (API) | -| CheckBackend | PythonCheckBackend (34 checks + 14 auto-fixers) | - -**Key changes (2026-02-12):** - -- **R32** — `versioning.py`: `compute_bumps` Phase 1 now uses - `asyncio.gather` to run per-package `vcs.log()` + `tag_exists()` - concurrently (~10× speedup for 60+ packages). -- **R04** — `versioning.py`: `parse_conventional_commit` handles - `Revert "feat: ..."` (GitHub format) and `revert: feat: ...` - (conventional format). Bump computation uses per-level counters - where reverts decrement, so a reverted `feat:` cancels the MINOR bump. -- **R27** — `versioning.py` + `cli.py`: New `ignore_unknown_tags` - parameter on `compute_bumps`. When `True`, a failed `git log {tag}..HEAD` - falls back to `since_tag=None` (full history) with a warning. - CLI flag `--ignore-unknown-tags` added to `publish`, `plan`, `version`. -- **`--no-merges`** — VCS protocol + Git/Mercurial backends filter - accidental merge commits from bump computation and changelogs. -- **Default branch detection** — `VCS.default_branch()` auto-detects - via `git symbolic-ref` (Git) or returns `"default"` (Mercurial). - Config override via `default_branch` in `releasekit.toml`. - `prepare.py` uses `resolve_default_branch()` for PR base. -- **Distro dep sync** — New `distro.py` module parses `pyproject.toml` - deps and generates/validates Debian/Ubuntu `control` and Fedora/RHEL - `.spec` dependency lists. Integrated as check (`distro_deps`) and - auto-fixer (`releasekit check --fix`). -- **Non-conventional commit warnings** — `versioning.py` and - `changelog.py` now log structured warnings for commit messages that - don't follow Conventional Commits format. -- **Distro packaging** — Added `packaging/debian/` and - `packaging/fedora/` with full Debian and RPM packaging files. - -**Key changes (2026-02-13):** - -- **Config at repo root** — `releasekit.toml` moved to repo root. - `_find_workspace_root()` updated. New `_effective_workspace_root()` - resolves per-workspace root from `config_root / ws_config.root`. -- **Ecosystem-aware backends** — `_create_backends()` selects - `PnpmBackend`/`NpmRegistry` for JS workspaces, `UvBackend`/`PyPIBackend` - for Python, based on `ws_config.tool`. -- **R25** — `max_commits` param added to VCS protocol, `compute_bumps`, - and `WorkspaceConfig`. Bounds changelog generation for large repos. 
-- **Tag `{label}` placeholder** — `format_tag()`, `create_tags()`, - `delete_tags()` accept `label` param. Tag format `{name}@{version}`, - umbrella tag `py/v{version}`. -- **VCS `list_tags` + `current_branch`** — Added to VCS protocol, - `GitCLIBackend`, `MercurialCLIBackend`, and all 8 FakeVCS test classes. - Enables `releasekit doctor` orphan tag and branch checks. -- **R05 partial** — `run_doctor()` in `doctor.py` with 7 diagnostic - checks (config, VCS, forge, registry, orphaned tags, branch, packages). - CLI wiring still pending. -- **CI matrix expansion** — `tool-tests` and `conform-tests` now run - on Python 3.10–3.14 (5 versions). Path-filtered via - `dorny/paths-filter` so tests only run when relevant files change. -- **Repo portability** — Audited: zero genkit imports, zero hardcoded - paths, self-contained deps. Ready for standalone repo extraction. -- **pnpm publish params** — `dist_tag`, `publish_branch`, `provenance` - added to `PackageManager` protocol, `PnpmBackend.publish()` (maps to - `--tag`, `--publish-branch`, `--provenance`), `UvBackend.publish()` - (accepts and ignores for protocol compat), `PublishConfig`, - `WorkspaceConfig`, `_WORKSPACE_TYPE_MAP`. CLI `--dist-tag` flag on - `publish` subcommand. All `FakePackageManager.publish()` signatures - updated across 3 test files. -- **Ecosystem-aware `discover_packages`** — New `ecosystem` parameter - dispatches to `PnpmWorkspace.discover()` for JS workspaces via - `_discover_js_packages` async bridge. All `discover_packages` call - sites in `cli.py` updated to pass `ws_config.ecosystem`. -- **`pyproject_path` → `manifest_path`** — Renamed across all source - and test files for ecosystem-agnostic naming (`Package` dataclass, - `discover_packages`, `ephemeral_pin`, CLI, tests). -- **Homebrew packaging** — `packaging/homebrew/releasekit.rb` formula - with `virtualenv_install_with_resources` and 10 dependency resource - blocks. `distro.py`: `_brew_resource_name()`, `expected_brew_resources()`, - `_parse_brew_resources()`, `check_brew_deps()`, `fix_brew_formula()`. - Wired into `check_distro_deps()` and `fix_distro_deps()`. 19 new tests. - `packaging/README.md` updated with Homebrew section. -- **R38** — Cherry-pick for release branches added to roadmap (depends - on R06). Added to items table, gap traceability, Mermaid graph, topo - sort, parallel execution phases, and Gantt chart. -- **FAQ edge cases** — Added 19 edge case entries to `docs/guides/faq.md` - covering dependency graph topologies (diamond, disconnected, chain, - cycle, self-dep) and version bump edge cases (revert cancellation, - mixed levels, `major_on_zero`, `synchronize`, `propagate_bumps`, - `force_unchanged`, `exclude_bump` vs `exclude_publish`, `max_commits`, - unreachable tags). - -All 1293 tests pass (86% coverage). - ---- - -## 1. Roadmap Items (Nodes) - -Each item has an ID, description, estimated effort, and list of dependencies. 
- -| ID | Item | Effort | Depends On | -|----|------|--------|------------| -| `R01` | Core protocol audit — ensure all 6 protocols are fully agnostic | S | — | -| `R02` | Standalone repo scaffolding (CI, pyproject.toml, LICENSE, docs) | S | `R01` | -| `R03` | Pre-release workflow (`--prerelease` flag, PEP 440 / SemVer) | M | `R01` | -| `R04` | Revert commit handling (cancel bumps for reverted commits) | S | — | -| `R05` | `releasekit doctor` (state consistency checker) | M | — | -| `R06` | Hotfix / maintenance branch support (`--base-branch`) | M | `R03` | -| `R07` | Internal dep version propagation (`fix_internal_dep_versions`) | M | — | -| `R08` | Contributor attribution in changelogs | S | — | -| `R09` | Incremental changelog generation (perf for large repos) | M | — | -| `R10` | Snapshot releases (`--snapshot` for CI testing) | S | `R03` | -| `R11` | pnpm workspace publish pipeline (end-to-end JS support) | L | `R01` | -| `R12` | npm registry backend (wire up `NpmRegistry` for publish) | M | `R11` | -| `R13` | Wombat proxy auth support (Google internal npm proxy) | S | `R12` | -| `R14` | `@scope/name@version` tag format support | S | `R11` | -| `R15` | Sigstore / GPG signing + provenance | M | `R02` | -| `R16` | SBOM generation (CycloneDX / SPDX) | M | `R15` | -| `R17` | Auto-merge release PRs | S | — | -| `R18` | Custom changelog templates (Jinja2) | S | — | -| `R19` | Announcement integrations (Slack, Discord) | S | — | -| `R20` | Optional changeset file support (hybrid with conv. commits) | M | — | -| `R21` | Plugin system for custom steps (entry-point discovery) | L | `R01` | -| `R22` | Programmatic Python API | L | `R01`, `R21` | -| `R23` | Cross-compilation orchestration (CLI binaries) | M | `R02` | -| `R24` | PEP 440 version scheme (`version_scheme = "pep440"`) | S | `R03` | -| `R25` | `--commit-depth` / `--max-commits` for large repos | S | — | -| `R26` | `bootstrap-sha` config for mid-stream adoption | S | `R05` | -| `R27` | `--ignore-unknown-tags` flag | S | — | -| `R28` | Lockfile update after version bump | S | `R07` | -| `R29` | `releasekit migrate` — protocol-based migration from alternatives | M | `R01`, `R02` | -| `R30` | `releasekit plan --analyze` — critical path & bottleneck profiling | S | — | -| `R31` | OpenTelemetry tracing backend (spans for publish stages, HTTP, git) | M | `R01` | -| `R32` | Parallel `vcs.log()` in `compute_bumps` via `asyncio.gather` | S | — | -| `R33` | Bazel workspace backend (BUILD files, `bazel run //pkg:publish`) | L | `R01` | -| `R34` | Rust/Cargo workspace backend (`Cargo.toml`, `cargo publish`) | M | `R01` | -| `R35` | Java backend (Maven `pom.xml` / Gradle `build.gradle`, `mvn deploy`) | L | `R01` | -| `R36` | Dart/Pub workspace backend (`pubspec.yaml`, `dart pub publish`) | M | `R01` | -| `R37` | pyx package registry backend | M | `R01` | -| `R38` | Cherry-pick for release branches (`releasekit cherry-pick`) | M | `R06` | - -**Effort key:** S = Small (1–3 days), M = Medium (3–7 days), L = Large (1–2 weeks) - -### Gap → Roadmap Traceability - -Every gap identified in the [competitive analysis](competitive-gap-analysis.md) -maps to one or more roadmap nodes: - -| Severity | Gap | Roadmap Node(s) | Alternative Tool Issues | -|----------|-----|-----------------|-------------------| -| 🔴 Critical | Pre-release workflow | `R03`, `R24` | release-please [#510](https://github.com/googleapis/release-please/issues/510), semantic-release [#563](https://github.com/semantic-release/semantic-release/issues/563) | -| 🔴 Critical | Revert commit 
handling | `R04` | release-please [#296](https://github.com/googleapis/release-please/issues/296) | -| 🔴 Critical | Hotfix / maintenance branches | `R06` | release-please [#2475](https://github.com/googleapis/release-please/issues/2475), semantic-release [#1038](https://github.com/semantic-release/semantic-release/issues/1038) | -| 🟠 High | Dep version propagation | `R07`, `R28` | release-please [#1032](https://github.com/googleapis/release-please/issues/1032) | -| 🟠 High | Contributor attribution | `R08` | release-please [#292](https://github.com/googleapis/release-please/issues/292) | -| 🟠 High | PEP 440 version scheme | `R24` | python-semantic-release [#455](https://github.com/python-semantic-release/python-semantic-release/issues/455) | -| 🟠 High | Performance on large repos | `R09`, `R25`, `R26` | python-semantic-release [#722](https://github.com/python-semantic-release/python-semantic-release/issues/722) | -| 🟠 High | `releasekit doctor` | `R05`, `R26` | release-please [#1946](https://github.com/googleapis/release-please/issues/1946) | -| 🟡 Nice | GPG / Sigstore signing | `R15`, `R16` | release-please [#1314](https://github.com/googleapis/release-please/issues/1314) | -| 🟡 Nice | Auto-merge release PRs | `R17` | release-please [#2299](https://github.com/googleapis/release-please/issues/2299) | -| 🟡 Nice | Custom changelog templates | `R18` | release-please [#2007](https://github.com/googleapis/release-please/issues/2007) | -| 🟡 Nice | Plugin / extension system | `R21`, `R22` | python-semantic-release [#321](https://github.com/python-semantic-release/python-semantic-release/issues/321) | -| 🟡 Nice | Snapshot releases | `R10` | changesets (built-in feature) | -| 🟡 Nice | Changeset file support | `R20` | changesets [#862](https://github.com/changesets/changesets/issues/862) | -| 🟡 Nice | Announcement integrations | `R19` | goreleaser (built-in feature) | -| 🟢 Growth | `releasekit migrate` command | `R29` | Users of all alternatives | -| 🟠 High | Plan profiling / bottleneck analysis | `R30` | python-semantic-release [#722](https://github.com/python-semantic-release/python-semantic-release/issues/722) | -| 🟠 High | OpenTelemetry tracing | `R31` | No alternative has this | -| 🟠 High | Parallel commit log fetching | `R32` | python-semantic-release [#722](https://github.com/python-semantic-release/python-semantic-release/issues/722) | -| 🟢 Growth | Bazel workspace support | `R33` | No alternative supports Bazel monorepos | -| 🟢 Growth | Rust/Cargo workspace support | `R34` | cargo-release is single-crate only | -| 🟢 Growth | Java (Maven/Gradle) support | `R35` | jreleaser covers Java but no monorepo graph | -| 🟢 Growth | Dart/Pub workspace support | `R36` | No alternative supports Dart workspaces | -| 🟢 Growth | pyx package registry support | `R37` | No alternative supports pyx | -| 🟠 High | Cherry-pick for release branches | `R38` | release-please [#2475](https://github.com/googleapis/release-please/issues/2475), semantic-release [#1038](https://github.com/semantic-release/semantic-release/issues/1038) | - ---- - -## 2. 
Dependency Graph (Mermaid) - -```mermaid -graph TD - R01[R01: Protocol audit] - R02[R02: Standalone repo] - R03[R03: Pre-release workflow] - R04[R04: Revert handling] - R05[R05: Doctor command] - R06[R06: Hotfix branches] - R07[R07: Dep version propagation] - R08[R08: Contributor changelogs] - R09[R09: Incremental changelog] - R10[R10: Snapshot releases] - R11[R11: pnpm publish pipeline] - R12[R12: npm registry backend] - R13[R13: Wombat proxy auth] - R14[R14: Scoped tag format] - R15[R15: Signing + provenance] - R16[R16: SBOM generation] - R17[R17: Auto-merge PRs] - R18[R18: Changelog templates] - R19[R19: Announcements] - R20[R20: Changeset file support] - R21[R21: Plugin system] - R22[R22: Programmatic API] - R23[R23: Cross-compilation] - R24[R24: PEP 440 scheme] - R25[R25: Commit depth limit] - R26[R26: Bootstrap SHA] - R27[R27: Ignore unknown tags] - R28[R28: Lockfile update] - R29[R29: Migrate command] - R30[R30: Plan profiling] - R31[R31: OTel tracing] - R32[R32: Parallel vcs.log] - R33[R33: Bazel backend] - R34[R34: Rust/Cargo backend] - R35[R35: Java Maven/Gradle] - R36[R36: Dart/Pub backend] - R37[R37: pyx registry] - R38[R38: Cherry-pick for release branches] - - R01 --> R37 - R01 --> R33 - R01 --> R34 - R01 --> R35 - R01 --> R36 - R01 --> R02 - R01 --> R29 - R02 --> R29 - R01 --> R31 - R01 --> R03 - R01 --> R11 - R01 --> R21 - R01 --> R22 - R03 --> R06 - R03 --> R10 - R03 --> R24 - R05 --> R26 - R07 --> R28 - R11 --> R12 - R11 --> R14 - R12 --> R13 - R02 --> R15 - R15 --> R16 - R02 --> R23 - R21 --> R22 - R06 --> R38 - - classDef small fill:#d4edda,stroke:#28a745 - classDef medium fill:#fff3cd,stroke:#ffc107 - classDef large fill:#f8d7da,stroke:#dc3545 - - class R01,R04,R08,R10,R13,R14,R17,R18,R19,R24,R25,R26,R27,R28,R02 small - class R03,R05,R06,R07,R09,R12,R15,R16,R20,R23,R29,R31 medium - class R11,R21,R22,R33,R35 large - class R30,R32 small - class R34,R36,R37,R38 medium -``` - ---- - -## 3. Reverse Topological Sort - -Reverse topological order (leaves first, roots last): - -``` -Level 0 (no deps): R01, R04, R05, R07, R08, R09, R17, R18, R19, R20, R25, R27, R30, R32 -Level 1 (deps on L0): R02, R03, R11, R21, R26, R28 -Level 2 (deps on L1): R06, R10, R12, R14, R15, R22, R23, R24, R29, R31, R33, R34, R35, R36, R37 -Level 3 (deps on L2): R13, R16, R38 -``` - ---- - -## 4. Parallel Execution Phases - -Items within each phase can execute **simultaneously**. A phase starts only -after all items in the previous phase are complete. 
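-
-One way to derive such phases mechanically is to set an item's level to
-one more than the highest level among its dependencies. A minimal sketch
-over a handful of the IDs above, with dependencies copied from the
-"Depends On" column in §1:
-
-```python
-# Toy sketch: compute dependency levels (phases) for a few roadmap items.
-# level(item) = 1 + max(level of its dependencies), or 0 with no deps.
-from functools import cache
-
-deps = {
-    'R01': [],
-    'R03': ['R01'],
-    'R11': ['R01'],
-    'R12': ['R11'],
-    'R13': ['R12'],
-    'R06': ['R03'],
-    'R38': ['R06'],
-}
-
-@cache
-def level(item: str) -> int:
-    return 1 + max((level(dep) for dep in deps[item]), default=-1)
-
-phases: dict[int, list[str]] = {}
-for item in deps:
-    phases.setdefault(level(item), []).append(item)
-print(phases)
-# {0: ['R01'], 1: ['R03', 'R11'], 2: ['R12', 'R06'], 3: ['R13', 'R38']}
-```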
- -### Phase 0 — Foundation (all independent, max parallelism) - -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ R01 Protocol audit [S] │ -│ R04 Revert commit handling [S] │ -│ R05 Doctor command [M] │ -│ R07 Internal dep version propagation [M] │ -│ R08 Contributor attribution in changelogs [S] │ -│ R09 Incremental changelog generation [M] │ -│ R17 Auto-merge release PRs [S] │ -│ R18 Custom changelog templates [S] │ -│ R19 Announcement integrations [S] │ -│ R20 Optional changeset file support [M] │ -│ R25 Commit depth limit [S] │ -│ R27 Ignore unknown tags [S] │ -│ R30 Plan profiling / bottleneck analysis [S] │ -│ R32 Parallel vcs.log in compute_bumps [S] │ -├─────────────────────────────────────────────────────────────────────┤ -│ 14 items │ ~7 days wall-clock (limited by M items) │ -│ Critical path: R01 (gates Phase 1) │ -└─────────────────────────────────────────────────────────────────────┘ -``` - -### Phase 1 — Core Features (depends on Phase 0) - -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ R02 Standalone repo scaffolding [S] ← R01 │ -│ R03 Pre-release workflow [M] ← R01 │ -│ R11 pnpm workspace publish pipeline [L] ← R01 │ -│ R21 Plugin system [L] ← R01 │ -│ R26 Bootstrap SHA config [S] ← R05 │ -│ R28 Lockfile update after bump [S] ← R07 │ -├─────────────────────────────────────────────────────────────────────┤ -│ 6 items │ ~10 days wall-clock (limited by L items: R11, R21) │ -│ Critical path: R11 (gates JS publish in Phase 2) │ -└─────────────────────────────────────────────────────────────────────┘ -``` - -### Phase 2 — Ecosystem & Extensions (depends on Phase 1) - -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ R06 Hotfix branch support [M] ← R03 │ -│ R10 Snapshot releases [S] ← R03 │ -│ R12 npm registry backend [M] ← R11 │ -│ R14 Scoped tag format [S] ← R11 │ -│ R15 Sigstore / GPG signing [M] ← R02 │ -│ R22 Programmatic Python API [L] ← R01, R21 │ -│ R23 Cross-compilation orchestration [M] ← R02 │ -│ R24 PEP 440 version scheme [S] ← R03 │ -│ R29 Migrate command [M] ← R01, R02 │ -│ R31 OpenTelemetry tracing [M] ← R01 │ -│ R33 Bazel workspace backend [L] ← R01 │ -│ R34 Rust/Cargo workspace backend [M] ← R01 │ -│ R35 Java (Maven/Gradle) backend [L] ← R01 │ -│ R36 Dart/Pub workspace backend [M] ← R01 │ -│ R37 pyx package registry backend [M] ← R01 │ -├─────────────────────────────────────────────────────────────────────┤ -│ 15 items │ ~10 days wall-clock (limited by L items: R22, R33, R35)│ -│ Critical path: R12 (gates Wombat proxy in Phase 3) │ -└─────────────────────────────────────────────────────────────────────┘ -``` - -### Phase 3 — Polish (depends on Phase 2) - -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ R13 Wombat proxy auth [S] ← R12 │ -│ R16 SBOM generation [M] ← R15 │ -│ R38 Cherry-pick for release branches [M] ← R06 │ -├─────────────────────────────────────────────────────────────────────┤ -│ 3 items │ ~7 days wall-clock │ -└─────────────────────────────────────────────────────────────────────┘ -``` - ---- - -## 5. 
Critical Path Analysis - -The **longest path** through the dependency graph determines the minimum -total wall-clock time: - -``` -R01 (S:3d) → R11 (L:10d) → R12 (M:5d) → R13 (S:2d) -Total critical path: ~20 working days -``` - -Alternative critical path (for plugin system): -``` -R01 (S:3d) → R21 (L:10d) → R22 (L:10d) -Total: ~23 working days -``` - -**Optimization:** R22 (Programmatic API) can start as soon as R21 reaches -a stable internal API, even before R21 is fully complete. With this overlap, -effective critical path is ~20 days. - ---- - -## 6. Gantt Chart (Mermaid) - -```mermaid -gantt - title Releasekit Roadmap Execution - dateFormat YYYY-MM-DD - axisFormat %b %d - - section Phase 0 — Foundation - R01 Protocol audit :r01, 2026-02-17, 3d - R04 Revert handling :r04, 2026-02-17, 2d - R05 Doctor command :r05, 2026-02-17, 5d - R07 Dep propagation :r07, 2026-02-17, 5d - R08 Contributor changelogs :r08, 2026-02-17, 2d - R09 Incremental changelog :r09, 2026-02-17, 5d - R17 Auto-merge PRs :r17, 2026-02-17, 2d - R18 Changelog templates :r18, 2026-02-17, 2d - R19 Announcements :r19, 2026-02-17, 2d - R20 Changeset support :r20, 2026-02-17, 5d - R25 Commit depth limit :r25, 2026-02-17, 1d - R27 Ignore unknown tags :r27, 2026-02-17, 1d - - section Phase 1 — Core Features - R02 Standalone repo :r02, after r01, 3d - R03 Pre-release workflow :r03, after r01, 5d - R11 pnpm publish pipeline :crit, r11, after r01, 10d - R21 Plugin system :r21, after r01, 10d - R26 Bootstrap SHA :r26, after r05, 2d - R28 Lockfile update :r28, after r07, 2d - - section Phase 2 — Ecosystem - R06 Hotfix branches :r06, after r03, 5d - R10 Snapshot releases :r10, after r03, 2d - R12 npm registry backend :crit, r12, after r11, 5d - R14 Scoped tag format :r14, after r11, 2d - R15 Signing + provenance :r15, after r02, 5d - R22 Programmatic API :r22, after r21, 10d - R23 Cross-compilation :r23, after r02, 5d - R24 PEP 440 scheme :r24, after r03, 2d - R29 Migrate command :r29, after r02, 5d - R31 OTel tracing :r31, after r01, 5d - R33 Bazel backend :r33, after r01, 10d - R34 Rust/Cargo backend :r34, after r01, 5d - R35 Java Maven/Gradle :r35, after r01, 10d - R36 Dart/Pub backend :r36, after r01, 5d - R37 pyx registry :r37, after r01, 5d - R30 Plan profiling :r30, 2026-02-17, 2d - R32 Parallel vcs.log :r32, 2026-02-17, 2d - - section Phase 3 — Polish - R13 Wombat proxy auth :r13, after r12, 2d - R16 SBOM generation :r16, after r15, 5d - R38 Cherry-pick release br :r38, after r06, 5d -``` - ---- - -## 7. Standalone Repo Readiness Checklist - -Releasekit is already architecturally independent. These items ensure it -can live in its own repository: - -- [x] **No hardcoded paths** — All paths are relative to workspace root - (discovered at runtime via `releasekit.toml` location). -- [x] **Protocol-based backends** — 6 protocols (VCS, PackageManager, - Workspace, Registry, Forge, Telemetry) with no concrete coupling in core. -- [x] **Ecosystem-agnostic core** — `graph.py`, `scheduler.py`, - `versioning.py`, `changelog.py` operate on abstract `Package` objects. -- [x] **Config-driven** — All repo-specific settings in `releasekit.toml`. -- [x] **No imports from parent packages** — `releasekit` has zero imports - from the genkit monorepo. -- [x] **Own pyproject.toml** — Complete with build system, dependencies, - entry point. -- [x] **Own test suite** — `tests/` directory with full coverage. -- [ ] **LICENSE file** — Currently references `../../LICENSE`; needs own copy. 
-- [ ] **CI workflows** — Needs own `.github/workflows/` for testing and - publishing. -- [ ] **PyPI publishing** — Needs Trusted Publisher setup. -- [ ] **Documentation site** — `docs/mkdocs.yml` exists; needs deployment. - -### Abstraction Layers (6 Protocols) - -``` -┌──────────────────────────────────────────────────────────────────┐ -│ releasekit core │ -│ │ -│ graph.py scheduler.py versioning.py changelog.py plan.py │ -│ preflight.py state.py lock.py tags.py groups.py │ -│ │ -│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────┐│ -│ │ VCS │ │ Package │ │Workspace │ │ Registry │ │ Forge ││ -│ │ Protocol │ │ Manager │ │ Protocol │ │ Protocol │ │Protocol││ -│ │ │ │ Protocol │ │ │ │ │ │ ││ -│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └───┬────┘│ -└───────┼────────────┼────────────┼────────────┼────────────┼─────┘ - │ │ │ │ │ - ┌────┴────┐ ┌────┴────┐ ┌───┴────┐ ┌───┴────┐ ┌───┴─────┐ - │ git │ │ uv │ │ uv │ │ PyPI │ │ GitHub │ - │ hg │ │ pnpm │ │ pnpm │ │ npm │ │ GitLab │ - │ │ │ cargo │ │ cargo │ │crates.io│ │Bitbucket│ - │ │ │ maven │ │ bazel │ │ Maven │ │ Gitea │ - │ │ │ gradle │ │ dart │ │ Pub │ │ │ - │ │ │ dart │ │ maven │ │ │ │ │ - └─────────┘ └─────────┘ └────────┘ └────────┘ └─────────┘ -``` - -Each protocol is a `typing.Protocol` (structural subtyping) — no base -class inheritance required. New backends are added by implementing the -protocol and registering in `detection.py`. - ---- - -## 8. Algorithm & Data Structure Audit - -An audit of the current codebase confirms optimal choices across all -performance-critical paths: - -### Algorithms - -| Module | Algorithm | Complexity | Status | -|--------|-----------|------------|--------| -| `graph.py` `topo_sort` | Kahn's algorithm (BFS-based) | O(V+E) | ✅ Optimal | -| `graph.py` `detect_cycles` | DFS with 3-color marking | O(V+E) | ✅ Optimal | -| `graph.py` `forward_deps` / `reverse_deps` | BFS with `deque` | O(V+E) | ✅ Optimal | -| `versioning.py` transitive propagation | BFS via `deque` over reverse edges | O(V+E) | ✅ Optimal | -| `scheduler.py` dispatch | Dependency-triggered queue (not level-lockstep) | O(1) per completion | ✅ Optimal | -| `scheduler.py` retry | Exponential backoff + full jitter (capped 60s) | — | ✅ Best practice | -| `net.py` HTTP retry | Exponential backoff on 429/5xx + connection errors | — | ✅ Best practice | - -### Data Structures - -| Structure | Where Used | Why | -|-----------|-----------|-----| -| `dict[str, Package]` | `DependencyGraph.packages` | O(1) lookup by name | -| `dict[str, list[str]]` | `edges`, `reverse_edges` | O(1) adjacency lookup | -| `dict[str, int]` | `in_degree` in Kahn's | O(1) decrement | -| `set[str]` | `_done`, `_enqueued`, `_cancelled` in Scheduler | O(1) membership test | -| `deque[str]` | BFS queues in topo sort, forward/reverse deps | O(1) append + popleft | -| `asyncio.Queue` | Scheduler work queue | Thread-safe async FIFO | -| `asyncio.Semaphore` | Concurrency limiter | Cooperative async gating | -| `frozenset[int]` | `RETRYABLE_STATUS_CODES` | O(1) membership, immutable | -| `frozen dataclass` | `Package`, `SchedulerResult` | Hashable, safe to share | - -### Async Runtime - -| Component | Implementation | Notes | -|-----------|---------------|-------| -| Event loop | `asyncio.run()` (stdlib) | Single-loop, no thread contention | -| Concurrency | `asyncio.Semaphore(N)` | Cooperative, no OS thread overhead | -| Worker pool | `asyncio.create_task()` × N | Lightweight coroutines, not threads | -| HTTP | `httpx.AsyncClient` with connection pooling | Reuses 
TCP connections | -| Subprocess | `asyncio.create_subprocess_exec` (via `_run.py`) | Non-blocking process I/O | -| File I/O | `aiofiles` | Non-blocking disk I/O | -| Pause/resume | `asyncio.Event` gate | Zero-cost when not paused | -| Signals | `loop.add_signal_handler` (SIGUSR1/2) | OS-level, no polling | - -### Identified Optimization: R32 — Parallel `vcs.log()` - -**Current:** `compute_bumps` calls `vcs.log()` sequentially for each -package (N serial git subprocess calls for N packages). - -**Fix:** Use `asyncio.gather()` to fetch all commit logs in parallel, -bounded by a semaphore to avoid fork-bombing: - -```python -# Before (sequential): -for pkg in packages: - log_lines = await vcs.log(since_tag=tag, paths=[str(pkg.path)]) - -# After (parallel): -sem = asyncio.Semaphore(10) -async def _fetch(pkg): - async with sem: - return await vcs.log(since_tag=tag, paths=[str(pkg.path)]) -results = await asyncio.gather(*[_fetch(p) for p in packages]) -``` - -For a 60-package workspace, this reduces commit log fetching from -~60 × 0.1s = 6s to ~0.6s (10× speedup). - ---- - -## 9. OpenTelemetry Tracing Design (R31) - -### Why - -No alternative has built-in observability. For large workspaces (60+ -packages), understanding where time is spent is critical: - -- Which packages are on the critical path? -- Is the bottleneck git, the registry, or the build? -- How long does each publish stage take? - -### Architecture - -``` -┌──────────────────────────────────────────────────────────┐ -│ releasekit core │ -│ │ -│ scheduler.py ──┐ │ -│ publisher.py ──┤── @traced decorator ──► TracerProvider │ -│ versioning.py ─┤ │ │ -│ net.py ────────┘ ▼ │ -│ ┌────────────────┐ │ -│ │ SpanExporter │ │ -│ │ (pluggable) │ │ -│ └───┬────┬───┬───┘ │ -└─────────────────────────────────────────┼────┼───┼───────┘ - │ │ │ - ┌───────────┘ │ └──────────┐ - ▼ ▼ ▼ - OTLP/gRPC Console JSON file - (Jaeger, (--verbose) (CI artifact) - Grafana) -``` - -### Span Hierarchy - -``` -releasekit.publish -├── releasekit.discover (workspace discovery) -├── releasekit.graph.build (graph construction) -├── releasekit.graph.topo_sort (topological sort) -├── releasekit.compute_bumps (version computation) -│ ├── releasekit.vcs.log [pkg=genkit] -│ ├── releasekit.vcs.log [pkg=genkit-plugin-foo] -│ └── ... -├── releasekit.preflight (preflight checks) -└── releasekit.scheduler.run (publish orchestration) - ├── releasekit.publish_one [pkg=genkit] - │ ├── releasekit.pin - │ ├── releasekit.build - │ ├── releasekit.checksum - │ ├── releasekit.upload (registry publish) - │ ├── releasekit.poll (availability check) - │ ├── releasekit.verify (checksum verify) - │ └── releasekit.smoke_test - ├── releasekit.publish_one [pkg=genkit-plugin-foo] - └── ... -``` - -### Implementation Plan - -1. **Optional dependency** — `opentelemetry-api` + `opentelemetry-sdk` as - extras: `pip install releasekit[telemetry]`. -2. **`Telemetry` protocol** — New 6th protocol in `backends/`: - ```python - class Telemetry(Protocol): - def start_span(self, name: str, **attrs) -> Span: ... - def record_metric(self, name: str, value: float, **attrs) -> None: ... - ``` -3. **`NullTelemetry`** — Default no-op backend (zero overhead when tracing - is not configured). -4. **`OTelTelemetry`** — OpenTelemetry backend that creates real spans. -5. **`@traced` decorator** — Wraps async functions to auto-create spans: - ```python - @traced('releasekit.vcs.log') - async def log(self, *, since_tag=None, paths=None, ...): ... - ``` -6. **`--trace` CLI flag** — Enables tracing with console exporter. 
- `--trace-endpoint` sends to OTLP collector. -7. **`plan --analyze`** (R30) — Uses trace data to compute: - - Critical path through the dependency graph - - Estimated wall-clock time per phase - - Bottleneck packages (longest build/publish time) - - Parallelism efficiency (actual vs. theoretical speedup) - -### Metrics to Track - -| Metric | Type | Description | -|--------|------|-------------| -| `releasekit.publish.duration` | Histogram | Total publish time | -| `releasekit.package.duration` | Histogram | Per-package publish time | -| `releasekit.stage.duration` | Histogram | Per-stage time (pin, build, upload, ...) | -| `releasekit.vcs.log.duration` | Histogram | Git log fetch time | -| `releasekit.http.duration` | Histogram | HTTP request time | -| `releasekit.scheduler.queue_wait` | Histogram | Time waiting in queue | -| `releasekit.scheduler.concurrency` | Gauge | Active workers | -| `releasekit.retry.count` | Counter | Total retries | - -### Plan Profiling Output (R30) - -```bash -$ releasekit plan --analyze - -Critical Path: genkit → genkit-plugin-firebase → genkit-plugin-google-cloud - Estimated: 45s (build: 20s, publish: 15s, poll: 10s) - -Bottleneck Packages: - 1. genkit-plugin-firebase — 18s build (heaviest) - 2. genkit — 15s build (most dependents: 42) - 3. genkit-plugin-ollama — 12s build - -Parallelism: - Theoretical speedup: 8.2× (60 packages, 5 workers) - Estimated speedup: 5.1× (critical path limits parallelism) - Utilization: 62% - -Phase Breakdown: - Phase 0 (12 pkgs): ~8s ████████░░░░░░░░ - Phase 1 (18 pkgs): ~12s ████████████░░░░ - Phase 2 (20 pkgs): ~15s ███████████████░ - Phase 3 (10 pkgs): ~10s ██████████░░░░░░ -``` - ---- - -## 10. Branding - -**Logo:** 🚀 Rocketship. - -The releasekit logo is a rocketship — representing launches, velocity, -and shipping releases. Use it in CLI banners, docs, and README headers. - -Deliverables: - -- SVG logo (rocketship silhouette, monochrome + color variants) -- ASCII art banner for `releasekit --version` output -- Favicon for docs site - ---- - -## 11. Repo Portability - -Releasekit is designed to be extractable to a standalone repository. - -Current state (audited): - -- **Zero imports** from any genkit package -- **Zero hardcoded paths** — all paths are config-driven via `releasekit.toml` -- **Self-contained deps** — `pyproject.toml` has no workspace-internal dependencies -- **Own build system** — hatchling with `[project.scripts]` entry point -- **Own test suite** — 1293+ tests with FakeVCS/FakeForge mocks, no genkit fixtures -- All `genkit`/`firebase` references in source are docstring examples or test fixtures - -To extract to a standalone repo: - -1. Copy `py/tools/releasekit/` to a new repo root -2. Move `releasekit.toml` into the consuming repo (it stays there) -3. Publish to PyPI: `pip install releasekit` -4. No code changes required in releasekit itself - -Post-extraction, update docs (README, getting-started guide) to reflect -standalone installation and remove genkit-specific examples from docstrings. - ---- - -## 12. Rustification - -Long-term, rewrite the performance-critical core of releasekit in Rust -and expose it to Python via PyO3/maturin. Python becomes a thin CLI -driver and configuration layer; Rust handles the heavy lifting. - -### Motivation - -- **Speed** — Commit log parsing, dependency graph resolution, and - topological sorting are CPU-bound. Rust eliminates the GIL bottleneck - and enables true parallelism. 
-- **Single binary** — A Rust core can be compiled to a standalone CLI - (`releasekit`) with zero runtime dependencies, usable from any - language ecosystem (JS, Go, etc.) without requiring Python. -- **Memory safety** — Rust's ownership model prevents the class of bugs - that arise in concurrent subprocess orchestration. - -### Architecture - -``` -┌─────────────────────────────────────────┐ -│ Python CLI driver (click/argparse) │ -│ - Config loading (releasekit.toml) │ -│ - User interaction (prompts, UI) │ -│ - Plugin system (custom backends) │ -└────────────────┬────────────────────────┘ - │ PyO3 FFI -┌────────────────▼────────────────────────┐ -│ Rust core (releasekit-core) │ -│ - Commit parsing (conventional) │ -│ - Version computation (semver) │ -│ - Dependency graph + topo sort │ -│ - Tag formatting + validation │ -│ - Changelog generation │ -│ - Parallel subprocess orchestration │ -│ - Registry polling (async reqwest) │ -└─────────────────────────────────────────┘ -``` - -### Migration phases - -1. **Phase 1 — Rust library crate** (`releasekit-core`): Implement - commit parsing, semver computation, and graph resolution in Rust. - Expose via PyO3 as a native Python extension module. -2. **Phase 2 — Hybrid mode**: Python calls into Rust for hot paths - (versioning, graph, changelog). Backends (VCS, PM, Registry, Forge) - remain in Python for flexibility. -3. **Phase 3 — Standalone binary**: Compile the Rust core into a - standalone `releasekit` CLI binary. Python driver becomes optional. -4. **Phase 4 — Full Rust**: Migrate remaining backends to Rust. - Python package becomes a thin wrapper (`releasekit-py`) for users - who prefer `pip install`. diff --git a/py/tools/releasekit/roadmap.md b/py/tools/releasekit/roadmap.md index 3e93eb1305..d299cfc1c8 100644 --- a/py/tools/releasekit/roadmap.md +++ b/py/tools/releasekit/roadmap.md @@ -144,20 +144,34 @@ Flat top-level keys, no `[tool.*]` nesting: ```toml # releasekit.toml (at the monorepo root) -synchronize = true # all packages share one version number -tag_format = "v{version}" -publish_from = "ci" - -# Ecosystems: declare which workspace roots to scan. -# Each ecosystem maps to a (Workspace, PackageManager, Registry) triple. -[ecosystems.python] -workspace_root = "py/" # contains pyproject.toml with [tool.uv.workspace] - -[ecosystems.js] -workspace_root = "js/" # contains pnpm-workspace.yaml - -[ecosystems.go] -workspace_root = "go/" # contains go.work +forge = "github" +repo_owner = "firebase" +repo_name = "genkit" +default_branch = "main" +pr_title_template = "chore(release): v{version}" + +[workspace.py] +ecosystem = "python" +tool = "uv" +root = "py" +tag_format = "{name}@{version}" +umbrella_tag = "py/v{version}" +changelog = true +smoke_test = true +max_commits = 500 + +[workspace.js] +ecosystem = "js" +tool = "pnpm" +root = "." +tag_format = "{name}@{version}" +umbrella_tag = "js/v{version}" +synchronize = true + +# Go workspace (future) +# [workspace.go] +# ecosystem = "go" +# root = "go" ``` ##### Per-package config (`releasekit.toml`) @@ -422,9 +436,9 @@ own the transport and format details. 
|---|----------|---------------|---------|--------| | 1 | **`VCS`** | Commit, tag, push, log, branch operations | `GitCLIBackend`, `MercurialBackend` | — | | 2 | **`Forge`** | PRs, releases, labels, availability check | `GitHubCLIBackend`, `GitLabBackend`, `BitbucketAPIBackend` | — | -| 3 | **`Workspace`** | Discover members, classify deps, rewrite versions | `UvWorkspace`, `PnpmWorkspace` | `GoWorkspace`, `CargoWorkspace`, `PubWorkspace`, `MavenWorkspace`, `GradleWorkspace` | -| 4 | **`PackageManager`** | Lock, build, publish | `UvBackend`, `PnpmBackend` | `GoBackend`, `CargoBackend`, `PubBackend`, `MavenBackend`, `GradleBackend` | -| 5 | **`Registry`** | Check published versions, checksums | `PyPIBackend`, `NpmRegistry` | `GolangProxy`, `CratesIoRegistry`, `PubDevRegistry`, `MavenCentralRegistry` | +| 3 | **`Workspace`** | Discover members, classify deps, rewrite versions | ✅ `UvWorkspace`, ✅ `PnpmWorkspace` | `GoWorkspace`, `CargoWorkspace`, `PubWorkspace`, `MavenWorkspace`, `GradleWorkspace` | +| 4 | **`PackageManager`** | Lock, build, publish | ✅ `UvBackend`, ✅ `PnpmBackend` | `GoBackend`, `CargoBackend`, `PubBackend`, `MavenBackend`, `GradleBackend` | +| 5 | **`Registry`** | Check published versions, checksums | ✅ `PyPIBackend`, ✅ `NpmRegistry` | `GolangProxy`, `CratesIoRegistry`, `PubDevRegistry`, `MavenCentralRegistry` | > **Design note:** `ManifestParser` and `VersionRewriter` were folded > into the `Workspace` protocol as `rewrite_version()` and @@ -436,7 +450,7 @@ own the transport and format details. | Ecosystem | Workspace Config | Source Mechanism | Manifest File | Registry | Status | |-----------|-----------------|-----------------|---------------|----------|--------| | **Python (uv)** | `[tool.uv.workspace]` | `[tool.uv.sources]` `workspace = true` | `pyproject.toml` | PyPI | ✅ Implemented | -| **TypeScript (pnpm)** | `pnpm-workspace.yaml` | `"workspace:*"` protocol in `package.json` | `package.json` | npm | 🔧 Backend done | +| **TypeScript (pnpm)** | `pnpm-workspace.yaml` | `"workspace:*"` protocol in `package.json` | `package.json` | npm | ✅ Implemented | | **Go** | `go.work` | `use ./pkg` directives | `go.mod` | proxy.golang.org | ⬜ Designed (see §7) | | **Java (Maven)** | reactor POM `` | `${project.version}` | `pom.xml` | Maven Central | ⬜ Future | | **Java (Gradle)** | `settings.gradle` `include` | `project(':sub')` deps | `build.gradle(.kts)` | Maven Central | ⬜ Future | @@ -480,7 +494,7 @@ Remaining migration steps: | 4c: UI States | ✅ Complete | observer.py, sliding window, keyboard shortcuts, signal handlers | | 5: Release-Please | ✅ Complete | Orchestrators, CI workflow, workspace-sourced deps | | 6: UX Polish | ✅ Complete | init, formatters (9), rollback, completion, diagnostics, granular flags, TOML config migration | -| 7: Quality + Ship | 🔶 In progress | 706 tests pass, 16.8K src lines, 12.1K test lines | +| 7: Quality + Ship | 🔶 In progress | 1,293 tests pass, 76 source modules, 51 test files (~19.3K test LOC) | ### Phase 5 completion status @@ -865,10 +879,17 @@ Phase 6: UX Polish ▼ ✅ COMPLETE │ Phase 7: Quality + Ship ▼ 🔶 IN PROGRESS ┌─────────────────────────────────────────────────────────┐ -│ tests (706 tests, 12.1K lines) │ +│ tests (1,293 tests, 51 files, ~19.3K lines) │ │ type checking (ty, pyright, pyrefly -- zero errors) │ │ README.md (21 sections, mermaid diagrams) │ │ workspace config (releasekit init on genkit repo) │ +│ sbom.py (CycloneDX + SPDX SBOM generation) │ +│ profiling.py (pipeline step timing + bottleneck) │ +│ tracing.py (optional 
OpenTelemetry, graceful no-op) │ +│ doctor.py (release state consistency checker) │ +│ distro.py (Debian/Fedora/Homebrew dep sync) │ +│ branch.py (default branch resolution) │ +│ commit_parsing/ (conventional commit parser) │ │ │ │ ✓ Ship v0.1.0 to PyPI │ └─────────────────────────────────────────────────────────┘ @@ -1374,12 +1395,86 @@ deletion. Shell completion works in bash/zsh/fish. | Type checking | Zero errors from `ty`, `pyright`, and `pyrefly` in strict mode. | config | | `README.md` | 21 sections with Mermaid workflow diagrams, CLI reference, config reference, testing workflow, vulnerability scanning, migration guide. | ~800 | | Workspace config | Run `releasekit init` on the genkit repo. Review auto-detected groups. Commit generated config. | config | +| `migrate.py` | `releasekit migrate` subcommand for mid-stream adoption. See details below. | ~200 | **Done when**: `pytest --cov-fail-under=90` passes, all three type checkers report zero errors, README is complete. **Milestone**: Ship `releasekit` v0.1.0 to PyPI. +#### `releasekit migrate` — Automatic Tag Detection and Bootstrap + +When adopting releasekit on a repo that already has releases, the user +currently needs to manually find the last release tag and set +`bootstrap_sha` in `releasekit.toml`. The `migrate` subcommand automates +this entirely. + +**What it does:** + +1. **Scan all git tags** in the repo (`git tag -l`). +2. **Classify each tag** by matching against known tag patterns: + - Umbrella tags: `py/v0.5.0`, `js/v1.2.3`, `go/v0.1.0` + - Per-package tags: `py/genkit-v0.5.0`, `@genkit-ai/core@1.2.3` + - Legacy tags: `genkit-python@0.4.0`, `genkit@1.0.0-rc.5` + - Unrecognized tags are reported but not associated. +3. **Associate tags with workspaces** by matching the tag prefix/format + against each `[workspace.*]` section's `tag_format`, `umbrella_tag`, + and `root` fields in `releasekit.toml`. +4. **Associate tags with packages** by matching the `{name}` component + of the tag against discovered workspace members (from + `Workspace.discover()`). +5. **Determine the latest release per workspace** by sorting associated + tags by semver and picking the highest. +6. **Auto-set `bootstrap_sha`** to the commit the latest tag points to + (via `git rev-list -1 `). +7. **Generate a migration report** showing: + - Tags found per workspace (with version, commit SHA, date). + - Tags that could not be associated (orphaned/legacy). + - The `bootstrap_sha` that will be written. + - Per-package tag status (present / missing / legacy format). +8. **Write `bootstrap_sha`** into `releasekit.toml` (using tomlkit for + comment-preserving edits), or print the diff in `--dry-run` mode. + +**CLI interface:** + +``` +releasekit migrate [--dry-run] [--workspace LABEL] + +Options: + --dry-run Show what would be written without modifying files. + --workspace Migrate a specific workspace (default: all). +``` + +**Example output:** + +``` +Scanning tags... + Found 4 tags: + py/v0.5.0 → workspace: py (commit b71a3d20c, 2026-02-05) + genkit-python@0.4.0 → workspace: py (legacy format, commit a1b2c3d) + genkit-python@0.3.2 → workspace: py (legacy format, commit e4f5g6h) + genkit-python@0.3.1 → workspace: py (legacy format, commit i7j8k9l) + + Latest release for workspace 'py': py/v0.5.0 (0.5.0) + + Per-package tag status (workspace: py): + genkit — no per-package tag (will use bootstrap_sha) + genkit-plugin-google-genai — no per-package tag (will use bootstrap_sha) + genkit-plugin-vertex-ai — no per-package tag (will use bootstrap_sha) + ... 
(22 packages total) + +Writing bootstrap_sha = "b71a3d20c74b71583edbc652e5b26117caad43f4" to releasekit.toml + ✅ Migration complete. Run 'releasekit plan' to preview the next release. +``` + +**Why this matters:** + +- Eliminates manual SHA lookup when adopting releasekit. +- Handles repos with mixed tag formats (legacy + new) gracefully. +- Works across multiple workspaces (e.g. `py` + `js` in the same repo). +- The classification logic reuses `tag_format` parsing from + `versioning.py`, ensuring consistency with how releasekit creates tags. + --- ## Critical Path @@ -1552,6 +1647,7 @@ py/tools/releasekit/ mermaid.py ← Mermaid syntax d2.py ← D2 syntax init.py ← workspace config scaffolding + migrate.py ← mid-stream adoption: tag detection + bootstrap_sha versioning.py ← Conventional Commits -> semver pin.py ← ephemeral version pinning bump.py ← version string rewriting
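
The classification described in steps 2–5 of the `releasekit migrate` design above can be illustrated with a short, self-contained sketch. This is not releasekit's implementation: `TagMatch`, `_pattern`, `classify_tag`, and `semver_key` are hypothetical names introduced here for illustration, and the real code would reuse the `tag_format` parsing that already lives in `versioning.py` plus the member list from `Workspace.discover()` rather than ad-hoc regexes.

```python
"""Sketch of the `releasekit migrate` tag-classification idea (steps 2-5).

Illustrative only: TagMatch, _pattern, classify_tag, and semver_key are
hypothetical names; the real implementation would reuse the tag_format
parsing in versioning.py and validate names against Workspace.discover().
"""

from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TagMatch:
    workspace: str       # workspace label, e.g. 'py'
    package: str | None  # None for umbrella tags
    version: str         # e.g. '0.5.0'


def _pattern(fmt: str) -> re.Pattern[str]:
    """Turn a tag format like '{name}@{version}' into an anchored regex."""
    escaped = re.escape(fmt)
    escaped = escaped.replace(r'\{name\}', r'(?P<name>[A-Za-z0-9._@/-]+)')
    escaped = escaped.replace(r'\{version\}', r'(?P<version>\d+\.\d+\.\d+\S*)')
    return re.compile(rf'^{escaped}$')


def classify_tag(tag: str, workspaces: dict[str, dict[str, str]]) -> TagMatch | None:
    """Associate a git tag with a workspace via umbrella_tag / tag_format.

    A real implementation would also check the captured {name} against
    the discovered workspace members (step 4) before accepting a match.
    """
    for label, cfg in workspaces.items():
        if m := _pattern(cfg['umbrella_tag']).match(tag):
            return TagMatch(label, None, m.group('version'))
        if m := _pattern(cfg['tag_format']).match(tag):
            return TagMatch(label, m.group('name'), m.group('version'))
    return None  # orphaned / unrecognized tag: reported, not associated


def semver_key(version: str) -> tuple[int, int, int]:
    """Coarse semver sort key (ignores pre-release ordering)."""
    major, minor, patch = version.split('-')[0].split('.')[:3]
    return (int(major), int(minor), int(patch))


if __name__ == '__main__':
    workspaces = {
        'py': {'tag_format': '{name}@{version}', 'umbrella_tag': 'py/v{version}'},
        'js': {'tag_format': '{name}@{version}', 'umbrella_tag': 'js/v{version}'},
    }
    tags = ['py/v0.5.0', 'genkit-python@0.4.0', 'genkit@1.0.0-rc.5', 'v1']
    matches = [(t, classify_tag(t, workspaces)) for t in tags]
    for tag, match in matches:
        print(f'{tag!r:28} -> {match}')

    # Step 5: latest umbrella release for workspace 'py' by semver.
    py_releases = [m.version for _, m in matches
                   if m and m.workspace == 'py' and m.package is None]
    print('latest py release:', max(py_releases, key=semver_key, default=None))
```

Run on the tags from the example output above, this sketch associates `py/v0.5.0` as the latest umbrella release for the `py` workspace, classifies the `genkit-python@*` tags as per-package/legacy matches, and leaves `v1` unassociated — the same behavior the migration report is meant to surface.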