From 06fc7a667fa5635bd3bcb7d90d24044607a089af Mon Sep 17 00:00:00 2001
From: Yesudeep Mangalapilly
Date: Fri, 13 Feb 2026 13:36:15 -0800
Subject: [PATCH] docs(py): audit and fix stale Python documentation
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Cross-checked all markdown files in py/ against the codebase and open PRs.
Fixed outdated content across 9 files.

engdoc/index.md:
- Fix Python version: 3.12+ → 3.10+
- Update feature parity table (6 of 7 features now ✅, Agents still ❌)
- Replace 8-plugin table with full 23-plugin parity table
- Rewrite all 6 Python code examples (generation, structured output, tool calling, chat, agents, data retrieval) with correct imports, Genkit() class API, and @ai.tool() decorator pattern

engdoc/extending/api.md:
- Replace stale Sync/Async design section (GenkitExperimental/SyncGenkit/AsyncGenkit never implemented) with actual async-first architecture documenting GenkitRegistry → GenkitBase → Genkit hierarchy

engdoc/extending/index.md:
- Update d2 diagram plugin list from 7 to 22 plugins

engdoc/extending/servers.md:
- Fill Python TODO links with actual file paths (flows.py, reflection.py)

engdoc/user_guide/python/publishing_pypi.md:
- Add ReleaseKit as primary publishing mechanism
- Demote manual workflow to "Legacy" section

GEMINI.md:
- Remove 7 dangling references to deleted files (engdoc/planning/, blog-genkit-python-*.md, release-publishing-guide.md)
- Update blog article guidelines from mandatory to optional
- Remove stale validation script checking deleted paths

.github/PR_RELEASE.md:
- Remove dangling reference to deleted blog-genkit-python-0.5.0.md

PARITY_AUDIT.md:
- G7: ✅ Done → ⬜ Reverted (#4459 reverted by #4469, needs re-land)
- §8c.3/§8c.4: Update stale text — X-Genkit-Span-Id IS now sent (#4511)
- §1d: genkitx-cohere ❌ → ✅ (in-tree cohere plugin exists)
- §6c: Community coverage 3/6 → 4/6
- G17: 🔄 draft → ⬜ (#4521 closed, needs new PR)
- G3/G12-G16/G4: Note #4510 is closed, needs new PR after G38
- G2→G1: Mark as superseded (#4516 titled [SUPERSEDED])
---
 py/.github/PR_RELEASE.md | 6 -
 py/GEMINI.md | 31 +-
 py/PARITY_AUDIT.md | 144 +--
 py/engdoc/ROADMAP.org | 240 ----
 py/engdoc/blog-genkit-python-0.5.0.md | 304 -----
 py/engdoc/extending/api.md | 310 ++---
 py/engdoc/extending/index.md | 21 +-
 py/engdoc/extending/servers.md | 5 +-
 py/engdoc/index.md | 257 ++---
 py/engdoc/model-conformance-roadmap.md | 491 --------
 .../feature_parity_analysis.md | 653 -----------
 .../parity-analysis/model_spec_compliance.md | 267 -----
 .../parity-analysis/plugin_api_consistency.md | 295 -----
 py/engdoc/parity-analysis/roadmap.md | 256 -----
 .../parity-analysis/sample_parity_roadmap.md | 471 --------
 py/engdoc/planning/FEATURE_MATRIX.md | 448 --------
 py/engdoc/planning/README.md | 121 --
 py/engdoc/planning/azure-telemetry-plugin.md | 452 --------
 py/engdoc/planning/cloudflare-ai-plugin.md | 376 -------
 .../planning/cloudflare-telemetry-plugin.md | 339 ------
 py/engdoc/planning/observability-plugin.md | 453 --------
 py/engdoc/planning/vercel-plugins.md | 384 -------
 py/engdoc/release-publishing-guide.md | 340 ------
 .../user_guide/python/publishing_pypi.md | 38 +-
 py/plugins/README.md | 2 -
 py/tools/conform/ANNOUNCEMENT.md | 275 -----
 py/tools/conform/README.md | 18 +-
 py/tools/releasekit/ANNOUNCEMENT.md | 236 ----
 py/tools/releasekit/FIXES.md | 143 ---
 py/tools/releasekit/README.md | 51 +-
 .../docs/competitive-gap-analysis.md | 5 +-
 .../releasekit/docs/roadmap-execution-plan.md | 1002 ----------------
py/tools/releasekit/roadmap.md | 136 ++- 33 files changed, 490 insertions(+), 8080 deletions(-) delete mode 100644 py/engdoc/ROADMAP.org delete mode 100644 py/engdoc/blog-genkit-python-0.5.0.md delete mode 100644 py/engdoc/model-conformance-roadmap.md delete mode 100644 py/engdoc/parity-analysis/feature_parity_analysis.md delete mode 100644 py/engdoc/parity-analysis/model_spec_compliance.md delete mode 100644 py/engdoc/parity-analysis/plugin_api_consistency.md delete mode 100644 py/engdoc/parity-analysis/roadmap.md delete mode 100644 py/engdoc/parity-analysis/sample_parity_roadmap.md delete mode 100644 py/engdoc/planning/FEATURE_MATRIX.md delete mode 100644 py/engdoc/planning/README.md delete mode 100644 py/engdoc/planning/azure-telemetry-plugin.md delete mode 100644 py/engdoc/planning/cloudflare-ai-plugin.md delete mode 100644 py/engdoc/planning/cloudflare-telemetry-plugin.md delete mode 100644 py/engdoc/planning/observability-plugin.md delete mode 100644 py/engdoc/planning/vercel-plugins.md delete mode 100644 py/engdoc/release-publishing-guide.md delete mode 100644 py/tools/conform/ANNOUNCEMENT.md delete mode 100644 py/tools/releasekit/ANNOUNCEMENT.md delete mode 100644 py/tools/releasekit/FIXES.md delete mode 100644 py/tools/releasekit/docs/roadmap-execution-plan.md diff --git a/py/.github/PR_RELEASE.md b/py/.github/PR_RELEASE.md index b9b96495a2..7c5b400f35 100644 --- a/py/.github/PR_RELEASE.md +++ b/py/.github/PR_RELEASE.md @@ -18,12 +18,6 @@ Version bump and release documentation for Genkit Python SDK v0.5.0. - Contributor acknowledgments with PR links - `PR_DESCRIPTION_0.5.0.md` - Release notes for GitHub -### Blog Article -- `py/engdoc/blog-genkit-python-0.5.0.md` - Release announcement with: - - Feature highlights - - Code examples - - Getting started guide - ### Contributor Acknowledgments 13 contributors recognized with 188 total pull requests: - @pavelgj (34 PRs) - Technical lead diff --git a/py/GEMINI.md b/py/GEMINI.md index 49444e3d2f..e9a87cf36e 100644 --- a/py/GEMINI.md +++ b/py/GEMINI.md @@ -1812,10 +1812,6 @@ plugin categorization guides. ## Changes -### New Planning Documents (engdoc/planning/) -- **FILE_NAME.md** - Description of integration plan -- **ROADMAP.md** - Status and effort metrics - ### Updated Documentation - **py/plugins/README.md** - Updated categorization guide @@ -2909,7 +2905,7 @@ Use this checklist when drafting a release PR: | 11 | **Categorize contributions** | Use bold categories: **Core**, **Plugins**, **Fixes**, etc. | | 12 | **Include PR numbers** | Add (#1234) for each major contribution | | 13 | **Add dotprompt table** | Same format as main table with PRs, Commits, Key Contributions | -| 14 | **Create blog article** | `py/engdoc/blog-genkit-python-X.Y.Z.md` | +| 14 | **Create blog article** | Optional: draft in PR description or external blog | | 15 | **Verify code examples** | Test all code snippets match actual API patterns | | 16 | **Run release validation** | `./bin/validate_release_docs` (see below) | | 17 | **Commit with --no-verify** | `git commit --no-verify -m "docs(py): ..."` | @@ -2989,17 +2985,7 @@ else echo "OK" fi -# 7. Check blog article exists for version in CHANGELOG -echo -n "Checking blog article exists... " -VERSION=$(grep -m1 '## \[' CHANGELOG.md | grep -oE '[0-9]+\.[0-9]+\.[0-9]+') -if [ -f "engdoc/blog-genkit-python-$VERSION.md" ]; then - echo "OK (found blog-genkit-python-$VERSION.md)" -else - echo "FAIL: Missing engdoc/blog-genkit-python-$VERSION.md" - ERRORS=$((ERRORS + 1)) -fi - -# 8. Verify imports work +# 7. 
Verify imports work echo -n "Checking Python imports... " if python -c "from genkit.ai import Genkit, Output; print('OK')" 2>/dev/null; then : @@ -3044,12 +3030,12 @@ done 6. **Match table formats**: External repo tables should have same columns as main table 7. **Cross-check repositories**: Check both firebase/genkit and google/dotprompt for Python work 8. **Use --no-verify**: For documentation-only changes, skip hooks for faster iteration -9. **Always include blog article**: Every release needs a blog article in `py/engdoc/` +9. **Consider a blog article**: Major releases may warrant a blog article 10. **Branding**: Use "Genkit" not "Firebase Genkit" (rebranded as of 2025) #### Blog Article Guidelines -Every release MUST include a blog article at `py/engdoc/blog-genkit-python-X.Y.Z.md`. +Major releases may include a blog article (e.g. in the PR description or an external blog). **Branding Note**: The project is called **"Genkit"** (not "Firebase Genkit"). While the repository is hosted at `github.com/firebase/genkit` and some blog posts may be published @@ -3098,20 +3084,12 @@ CRITICAL: Before publishing any blog article, extract and validate ALL code snip against the actual codebase to ensure they would compile/run correctly. ```bash -# Extract Python code blocks from a blog article and check for common errors -grep -A 50 '```python' py/engdoc/blog-genkit-python-*.md | grep -E \ - 'response\.text\(\)|output_schema=|asyncio\.run\(|from genkit import Genkit' - # Verify import statements match actual module structure python -c "from genkit.ai import Genkit, Output; print('Imports OK')" # Check that decorator patterns exist in codebase grep -r "@ai.flow()" py/samples/*/src/main.py | head -3 grep -r "@ai.tool()" py/samples/*/src/main.py | head -3 - -# Validate a blog article's code examples by syntax checking -python -m py_compile <(grep -A 20 '```python' py/engdoc/blog-genkit-python-*.md | \ - grep -v '```' | head -50) 2>&1 || echo "Syntax errors found!" ``` **Blog Article Code Review Checklist:** @@ -3558,7 +3536,6 @@ For the v0.5.0 release specifically: #### Full Release Guide For detailed release instructions, see: -- `py/engdoc/release-publishing-guide.md` - Complete step-by-step guide - `py/.github/PR_DESCRIPTION_0.5.0.md` - v0.5.0 PR description template - `py/CHANGELOG.md` - Full changelog format diff --git a/py/PARITY_AUDIT.md b/py/PARITY_AUDIT.md index 42881516b1..fcbb7e0541 100644 --- a/py/PARITY_AUDIT.md +++ b/py/PARITY_AUDIT.md @@ -1,7 +1,7 @@ # Genkit Feature Parity Audit — JS / Go / Python -> Generated: 2025-02-08. Updated: 2026-02-09. Baseline: `firebase/genkit` JS implementation, with explicit JS vs Go vs Python parity tracking. -> Last verified: 2026-02-09 against genkit-ai org (14 repos) and BloomLabsInc/genkit-plugins. +> Generated: 2025-02-08. Updated: 2026-02-13. Baseline: `firebase/genkit` JS implementation, with explicit JS vs Go vs Python parity tracking. +> Last verified: 2026-02-13 against genkit-ai org (14 repos) and BloomLabsInc/genkit-plugins. ## 1. 
Plugin Parity Matrix @@ -23,6 +23,7 @@ | Microsoft Foundry | `microsoft-foundry` | — | — | ✅ | Python-only | | Mistral | `mistral` | — | — | ✅ | Python-only | | xAI (Grok) | `xai` | — | — | ✅ | Python-only | +| Cohere | `cohere` | — | — | ✅ | Python-only | | **Vector Stores** | | | | | | | Dev Local Vectorstore | `dev-local-vectorstore` / `localvec` | ✅ | ✅ | ✅ | | | Pinecone | `pinecone` | ✅ | ✅ | ❌ | Missing in Python | @@ -44,6 +45,7 @@ | Express | `express` | ✅ | — | — | JS-only | | Next.js | `next` | ✅ | — | — | JS-only | | Flask | `flask` | — | — | ✅ | Python-only | +| FastAPI | `fastapi` | — | — | ✅ | Python-only | | Server plugin | `server` | — | ✅ | — | Go-only | | **Other** | | | | | | | LangChain | `langchain` | ✅ | — | — | JS-only | @@ -53,11 +55,11 @@ | Metric | JS | Go | Python | |--------|:--:|:--:|:------:| -| Total in-tree plugins | 18 | 16 | 20 | +| Total in-tree plugins | 18 | 16 | 22 | | Shared (JS+Go+Python) | 11 | 11 | 11 | -| Model provider plugins | 6 | 4 | 12 | +| Model provider plugins | 6 | 4 | 13 | | Vector store plugins | 4 | 4 | 1 | -| Unique to this SDK | 7 | 5 | 9 | +| Unique to this SDK | 7 | 5 | 11 | ### 1c. Plugin Gap Table (Parity Focus) @@ -81,7 +83,7 @@ | `genkitx-anthropic` | `BloomLabsInc/genkit-plugins` | JS | `anthropic` (in-tree) | ✅ | | `genkitx-mistral` | `BloomLabsInc/genkit-plugins` | JS | `mistral` (in-tree) | ✅ | | `genkitx-groq` | `BloomLabsInc/genkit-plugins` | JS | ❌ Not available | ❌ | -| `genkitx-cohere` | `BloomLabsInc/genkit-plugins` | JS | ❌ Not available | ❌ | +| `genkitx-cohere` | `BloomLabsInc/genkit-plugins` | JS | ✅ `cohere` (in-tree) | ✅ | | `genkitx-azure-openai` | `BloomLabsInc/genkit-plugins` | JS | `microsoft-foundry` (partial) | ⚠️ | | `genkitx-convex` | `BloomLabsInc/genkit-plugins` | JS | ❌ Not available | ❌ | | `genkitx-hnsw` | `BloomLabsInc/genkit-plugins` | JS | ❌ Not available | ❌ | @@ -96,9 +98,9 @@ | Sample Set | JS | Go | Python | Notes | |------------|:--:|:--:|:------:|-------| -| Canonical internal sample/testapp set | 32 (`js/testapps`) | 37 (`go/samples`) | 37 runnable (`py/samples`, excluding `shared`, `sample-test`) | Primary parity baseline | +| Canonical internal sample/testapp set | 32 (`js/testapps`) | 37 (`go/samples`) | 39 runnable (`py/samples`, excluding `shared`, `sample-test`) | Primary parity baseline | | Public showcase samples | 9 (`samples/js-*`) | — | — | Public docs/demo set | -| Total directories under samples root | — | 37 | 39 | Python includes utility dirs (`shared`, `sample-test`) | +| Total directories under samples root | — | 37 | 41 | Python includes utility dirs (`shared`, `sample-test`) | ### 2b. 
Sample Area Parity (JS vs Go vs Python) @@ -132,36 +134,39 @@ Per Google OSS guidelines: | Plugin | LICENSE | README | pyproject | CHANGELOG | py.typed | tests/ | Status | |--------|:------:|:------:|:---------:|:---------:|:--------:|:------:|:------:| -| amazon-bedrock | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| anthropic | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| checks | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (1) | ⚠️ | -| cloudflare-workers-ai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (1) | ⚠️ | -| compat-oai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (7) | ⚠️ | -| deepseek | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| dev-local-vectorstore | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (4) | ⚠️ | -| evaluators | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| firebase | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| flask | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (1) | ⚠️ | -| google-cloud | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| google-genai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| huggingface | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| mcp | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (5) | ⚠️ | -| microsoft-foundry | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| mistral | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (3) | ⚠️ | -| observability | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | -| ollama | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (4) | ⚠️ | -| vertex-ai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (4) | ⚠️ | -| xai | ✅ | ✅ | ✅ | ❌ | ✅ | ✅ (2) | ⚠️ | - -**Legend**: ✅ = present, ❌ = missing, ⚠️ = mostly OK (only CHANGELOG missing) +| amazon-bedrock | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| anthropic | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| checks | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (1) | ✅ | +| cloudflare-workers-ai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| compat-oai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (7) | ✅ | +| deepseek | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| dev-local-vectorstore | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| evaluators | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| firebase | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| flask | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| google-cloud | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| google-genai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (9) | ✅ | +| huggingface | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| mcp | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (5) | ✅ | +| microsoft-foundry | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| mistral | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| observability | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (3) | ✅ | +| ollama | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (5) | ✅ | +| vertex-ai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| xai | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (4) | ✅ | +| cohere | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ (5) | ✅ | +| fastapi | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ (0) | ⚠️ | + +**Legend**: ✅ = present, ❌ = missing, ⚠️ = mostly OK ### 3c. Missing Files Summary | Issue | Count | Affected | |-------|:-----:|----------| | Missing `py.typed` | ~~9~~ **0** | All fixed ✅ | -| Missing `CHANGELOG.md` | 21 | ALL plugins + core package | +| Missing `CHANGELOG.md` | ~~21~~ **0** | All fixed ✅ (G11) | | Missing sample `LICENSE` | ~~1~~ **0** | `provider-checks-hello` fixed ✅ | +| Missing tests | 1 | `fastapi` plugin (0 test files) | ### 3d. Core Package (`packages/genkit`) @@ -170,14 +175,13 @@ Per Google OSS guidelines: | LICENSE | ✅ | | README.md | ✅ | | pyproject.toml | ✅ | -| CHANGELOG.md | ❌ | +| CHANGELOG.md | ✅ | | py.typed | ✅ | | tests/ | ✅ (44 test files) | ### 3e. Sample Compliance -All 37 samples have: `README.md` ✅, `run.sh` ✅, `pyproject.toml` ✅ -All samples except `provider-checks-hello` had `LICENSE` ✅ (now fixed). +All 39 samples have: `README.md` ✅, `run.sh` ✅, `pyproject.toml` ✅, `LICENSE` ✅. --- @@ -186,27 +190,30 @@ All samples except `provider-checks-hello` had `LICENSE` ✅ (now fixed). 
| Component | Test Files | Notes |
|-----------|:----------:|-------|
| **Core** (`packages/genkit`) | 44 | Comprehensive |
-| **compat-oai** | 7 | Best-covered plugin |
-| **google-genai** | 7 | Best-covered plugin |
+| **google-genai** | 9 | Best-covered plugin |
+| **compat-oai** | 7 | Well-covered |
+| **cohere** | 5 | Well-covered |
| **mcp** | 5 | Well-covered |
+| **ollama** | 5 | Well-covered |
+| **amazon-bedrock** | 4 | Good |
+| **cloudflare-workers-ai** | 4 | Good |
| **dev-local-vectorstore** | 4 | Good |
-| **ollama** | 4 | Good |
+| **microsoft-foundry** | 4 | Good |
+| **mistral** | 4 | Good |
| **vertex-ai** | 4 | Good |
-| **amazon-bedrock** | 3 | Good |
+| **xai** | 4 | Good |
| **anthropic** | 3 | Good |
-| **cloudflare-workers-ai** | 3 | Good |
| **deepseek** | 3 | Good |
| **evaluators** | 3 | Good |
| **firebase** | 3 | Good |
| **flask** | 3 | Good |
| **google-cloud** | 3 | Good |
| **huggingface** | 3 | Good |
-| **microsoft-foundry** | 3 | Good |
-| **mistral** | 3 | Good |
| **observability** | 3 | Good |
-| **xai** | 3 | Good |
-| **Total (plugins)** | 70 | All plugins ≥ 3 |
-| **Total (workspace)** | 136 | Including core + samples |
+| **checks** | 1 | Minimal |
+| **fastapi** | 0 | ❌ No tests |
+| **Total (plugins)** | 84 | 20 of 22 plugins have ≥ 3 test files |
+| **Total (workspace)** | 128+ | Including core + samples |

---

@@ -359,7 +366,7 @@ Python users typically use `httpx` or `requests` directly.
| LangChain integration plugin | ✅ | — | ❌ | Go + Python | P3 |
| **Community Ecosystem** (BloomLabs etc.) | | | | | |
| Groq provider (`genkitx-groq`) | ✅ (community) | — | ❌ | Python | P3 |
-| Cohere provider (`genkitx-cohere`) | ✅ (community) | — | ❌ | Python | P3 |
+| Cohere provider (`genkitx-cohere`) | ✅ (community) | — | ✅ `cohere` (in-tree) | Python | ✅ |
| Azure OpenAI (`genkitx-azure-openai`) | ✅ (community) | — | ✅ `microsoft-foundry` (superset) | Python | ✅ |
| Convex vector store (`genkitx-convex`) | ✅ (community) | — | ❌ | Python | P3 |
| HNSW vector store (`genkitx-hnsw`) | ✅ (community) | — | ❌ | Python | P3 |
@@ -370,8 +377,8 @@
| Feature | Notes |
|---------|-------|
-| 8 unique model providers | Bedrock, Cloudflare Workers AI, DeepSeek, HuggingFace, MS Foundry, Mistral, xAI, Observability |
-| Flask plugin | Python web framework integration |
+| 9 unique model providers | Bedrock, Cloudflare Workers AI, Cohere, DeepSeek, HuggingFace, MS Foundry, Mistral, xAI, Observability |
+| Flask + FastAPI plugins | Python web framework integrations |
| ASGI/gRPC production sample | `web-endpoints-hello` — production-ready template with security, resilience, multi-server |
| `check_consistency` tooling | Automated 25-check workspace hygiene script |
| `release_check` tooling | Automated 15-check pre-release validation |
@@ -434,7 +441,7 @@ Full plugin list from the repository README (10 plugins, 33 contributors, 54 rel
| `genkitx-anthropic` | Provider (Anthropic) | Covered via `anthropic` | ✅ |
| `genkitx-mistral` | Provider (Mistral) | Covered via `mistral` | ✅ |
| `genkitx-groq` | Provider (Groq) | ❌ Not available | ❌ |
-| `genkitx-cohere` | Provider (Cohere) | ❌ Not available | ❌ |
+| `genkitx-cohere` | Provider (Cohere) | ✅ `cohere` (in-tree) | ✅ |
| `genkitx-azure-openai` | Provider (Azure OpenAI) | `microsoft-foundry` (partial) | ⚠️ |

**Vector Store Plugins:**
@@ -455,13 +462,13 @@ Full plugin list from the repository README (10 plugins, 33 contributors, 54 rel
| External Category | Current Python Coverage | Gap Level |
|-------------------|-------------------------|:---------:|
-| Community model providers (6) | 3 of 6 covered | ⚠️ |
+| Community model providers (6) | 4 of 6 covered | ⚠️ |
| Community vector stores (3) | 0 of 3 covered | ❌ |
| Community other plugins (1) | 0 of 1 covered | ❌ |
| genkit-ai org plugins (5) | All covered via in-tree equivalents | ✅ |
| Priority relative to JS-canonical parity | Secondary | ⚠️ |

-**Note on community provider gaps**: The missing community providers (`genkitx-groq`, `genkitx-cohere`) could potentially be addressed via `compat-oai` since both Groq and Cohere offer OpenAI-compatible API endpoints. However, dedicated plugins would provide optimal model capability declarations and embedder support.
+**Note on community provider gaps**: The missing community provider `genkitx-groq` could potentially be addressed via `compat-oai` since Groq offers an OpenAI-compatible API endpoint. However, a dedicated plugin would provide optimal model capability declarations and embedder support. Cohere is now covered by the in-tree `cohere` plugin ([#4518](https://github.com/firebase/genkit/pull/4518)).

---

@@ -486,26 +493,26 @@ Full plugin list from the repository README (10 plugins, 33 contributors, 54 rel
### 7a. Python Roadmap (JS-Canonical Parity)

-> Updated: 2026-02-09. Status legend: ⬜ = not started, 🔄 = PR open, ✅ = merged, ⏳ = deferred, ⏸️ = paused (blocked on upstream), ~~struck~~ = superseded.
+> Updated: 2026-02-13. Status legend: ⬜ = not started, 🔄 = PR open, ✅ = merged, ⏳ = deferred, ⏸️ = paused (blocked on upstream), ~~struck~~ = superseded.
| Gap ID | SDK | Work Item | Reference | Status | PR | |--------|-----|-----------|-----------|:------:|:---| | **G38** | Python | **Generate-level middleware V2** — 3-tier hooks (`generate`/`model`/`tool`), `define_middleware`, registry | §8l | ⬜ Blocked | Upstream: JS [#4515](https://github.com/firebase/genkit/pull/4515), Go [#4422](https://github.com/firebase/genkit/pull/4422) | -| G2 → G1 | Python | Add `middleware` storage to `Action`, then add `use=` to `define_model` | §8b.1 | ⏸️ Paused | [#4516](https://github.com/firebase/genkit/pull/4516) — paused pending G38 | -| G7 | Python | Wire DAP action discovery into `GET /api/actions` | §8a, §8c.5 | ✅ Done | [#4459](https://github.com/firebase/genkit/pull/4459) | +| G2 → G1 | Python | Add `middleware` storage to `Action`, then add `use=` to `define_model` | §8b.1 | ⏸️ Superseded | [#4516](https://github.com/firebase/genkit/pull/4516) — open but superseded, pending G38 | +| G7 | Python | Wire DAP action discovery into `GET /api/actions` | §8a, §8c.5 | ⬜ Reverted | [#4459](https://github.com/firebase/genkit/pull/4459) merged then reverted by [#4469](https://github.com/firebase/genkit/pull/4469) — needs re-land | | G6 → G5 | Python | Pass `span_id` in `on_trace_start`, send `X-Genkit-Span-Id` | §8c.3, §8c.4 | ✅ Done | [#4511](https://github.com/firebase/genkit/pull/4511) | -| G3 | Python | Implement `simulate_constrained_generation` middleware | §8b.3, §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G12 | Python | Implement `retry` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G13 | Python | Implement `fallback` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G14 | Python | Implement `validate_support` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G15 | Python | Implement `download_request_media` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | -| G16 | Python | Implement `simulate_system_prompt` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) — paused pending G38 | +| G3 | Python | Implement `simulate_constrained_generation` middleware | §8b.3, §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G12 | Python | Implement `retry` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G13 | Python | Implement `fallback` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G14 | Python | Implement `validate_support` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G15 | Python | Implement `download_request_media` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | +| G16 | Python | Implement `simulate_system_prompt` middleware | §8f | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — needs new PR after G38 | | G18 | Python | Add multipart tool support (`defineTool({multipart: true})`) | §8h | 🔄 | [#4513](https://github.com/firebase/genkit/pull/4513) | | ~~G19~~ | ~~Python~~ | ~~Add Model API V2 (`defineModel({apiVersion: 'v2'})`)~~ | ~~§8i~~ | 
~~Superseded~~ | Replaced by G38 (middleware V2) + G41 (bidi models) | | G20 | Python | Add `context` parameter to `Genkit()` constructor | §8j | 🔄 | [#4512](https://github.com/firebase/genkit/pull/4512) | | G21 | Python | Add `clientHeader` parameter to `Genkit()` constructor | §8j | 🔄 | [#4512](https://github.com/firebase/genkit/pull/4512) | | G22 | Python | Add `name` parameter to `Genkit()` constructor | §8j | 🔄 | [#4512](https://github.com/firebase/genkit/pull/4512) | -| G4 | Python | Move `augment_with_context` to define-model time | §8b.2 | 🔄 | [#4510](https://github.com/firebase/genkit/pull/4510) — logic valid, needs G38 interface | +| G4 | Python | Move `augment_with_context` to define-model time | §8b.2 | ⏸️ Paused | [#4510](https://github.com/firebase/genkit/pull/4510) (closed) — logic valid, needs new PR after G38 | | **G39** | Python | **Bidirectional Action** primitive (`define_bidi_action`) | §8m | ⬜ Blocked | Upstream: JS [#4288](https://github.com/firebase/genkit/pull/4288) | | **G40** | Python | **Bidirectional Flow** primitive (`define_bidi_flow`) | §8m | ⬜ Blocked | Upstream: JS [#4288](https://github.com/firebase/genkit/pull/4288) | | **G41** | Python | **Bidirectional Model** (`define_bidi_model`, `generate_bidi`) for real-time LLM APIs | §8m | ⬜ Blocked | Upstream: JS [#4210](https://github.com/firebase/genkit/pull/4210) | @@ -517,7 +524,7 @@ Full plugin list from the repository README (10 plugins, 33 contributors, 54 rel | G30 | Python | Add Cloud SQL PG vector store parity | §5g | ⏳ Deferred | — | | G31 | Python | Add dedicated Python MCP parity sample | §2b/§9 | 🔄 | [#4248](https://github.com/firebase/genkit/pull/4248) | | G8 | Python | Implement `genkit.client` (`run_flow` / `stream_flow`) | §5c/§9 | ⏳ Deferred | — | -| G17 | Python | Add built-in `api_key()` context provider | §8g | 🔄 | [#4521](https://github.com/firebase/genkit/pull/4521) (draft) | +| G17 | Python | Add built-in `api_key()` context provider | §8g | ⬜ | [#4521](https://github.com/firebase/genkit/pull/4521) (closed) — needs new PR | | G11 | Python | Add `CHANGELOG.md` to plugins + core | §3c | ✅ Done | [#4507](https://github.com/firebase/genkit/pull/4507), [#4508](https://github.com/firebase/genkit/pull/4508) | | G33 | Python | Consider LangChain integration parity | §1c/§9 | ⏳ Deferred | — | | G34 | Python | Track BloomLabs vector stores (Convex, HNSW, Milvus) | §6b/§9 | ⏳ Deferred | — | @@ -721,7 +728,7 @@ Both JS and Python use the **same core protocol**: Newline-delimited JSON (NDJSO | Header | JS | Python | Gap | |--------|-----|--------|-----| | `X-Genkit-Trace-Id` | ✅ Set in `onTraceStart` callback. Both streaming and non-streaming. | ✅ Set when trace ID is available. Both streaming and non-streaming. | ✅ Identical | -| **`X-Genkit-Span-Id`** | ✅ Set in `onTraceStart` callback (`reflection.ts:247`). | ❌ **Not sent**. Only listed in CORS `expose_headers`. | **Gap**: Python never sends this header. | +| **`X-Genkit-Span-Id`** | ✅ Set in `onTraceStart` callback (`reflection.ts:247`). | ✅ Set in `wrapped_on_trace_start` callback. Both streaming and non-streaming. | ✅ Fixed by [#4511](https://github.com/firebase/genkit/pull/4511) | | `X-Genkit-Version` / `x-genkit-version` | ✅ Set as `X-Genkit-Version` in `onTraceStart` callback AND as `x-genkit-version` in non-streaming list endpoints. | ✅ Set as `x-genkit-version` in all responses. | ✅ Functionally equivalent (case-insensitive HTTP headers). | | CORS `expose_headers` | Not explicitly shown (uses express CORS). 
| `['X-Genkit-Trace-Id', 'X-Genkit-Span-Id', 'x-genkit-version']` | ✅ Python is more explicit. | @@ -729,7 +736,7 @@ Both JS and Python use the **same core protocol**: Newline-delimited JSON (NDJSO | Aspect | JS | Python | Gap | |--------|-----|--------|-----| -| Callback arguments | `({traceId, spanId})` — receives **both** trace ID and span ID as a destructured object. | `(tid: str)` — receives **only** trace ID as a string. | **Gap**: Python cannot send `X-Genkit-Span-Id` because it doesn't receive the span ID. | +| Callback arguments | `({traceId, spanId})` — receives **both** trace ID and span ID as a destructured object. | `(tid: str, sid: str)` — receives **both** trace ID and span ID. | ✅ Fixed by [#4511](https://github.com/firebase/genkit/pull/4511) | **JS** (`js/core/src/reflection.ts:234-258`): ```js @@ -743,16 +750,17 @@ const onTraceStartCallback = ({ traceId: tid, spanId }) => { }; ``` -**Python** (`py/.../core/reflection.py:395-399`): +**Python** (`py/.../core/reflection.py`): ```python -def wrapped_on_trace_start(tid: str) -> None: - nonlocal run_trace_id +def wrapped_on_trace_start(tid: str, sid: str) -> None: + nonlocal run_trace_id, run_span_id run_trace_id = tid - on_trace_start(tid) + run_span_id = sid + on_trace_start(tid, sid) trace_id_event.set() ``` -**Fix required**: Update `on_trace_start` callback signature throughout the Python action system to pass both `trace_id` and `span_id`, then include `X-Genkit-Span-Id` in reflection response headers. +**Fixed**: `on_trace_start` now receives both `trace_id` and `span_id`, and `X-Genkit-Span-Id` is included in reflection response headers ([#4511](https://github.com/firebase/genkit/pull/4511)). #### 8c.5 Action Discovery Endpoint (`GET /api/actions`) diff --git a/py/engdoc/ROADMAP.org b/py/engdoc/ROADMAP.org deleted file mode 100644 index 3052eecc07..0000000000 --- a/py/engdoc/ROADMAP.org +++ /dev/null @@ -1,240 +0,0 @@ -#+title: SDK Roadmap -#+description: An org document that enlists the milestones and objectives of our SDK roadmap. - -* SDK Roadmap [0/0] -** Objectives [0/4] -- [ ] The Python SDK needs to be at feature parity with the JavaScript SDK. -- [ ] The Go SDK needs to be at feature parity with the JavaScript SDK. -- [ ] The Python Dotprompt library needs to be at feature parity with the JavaScript Dotprompt implementation. -- [ ] The Go Dotprompt library needs to be at feature parity with the JavaScript Dotprompt implementation. 
-** Specifications and Schemas [0/4] -- [ ] dotprompt - - [ ] helpers (based on yaml spec) - - [ ] json - - [ ] Go - - [ ] Python - - [ ] media - - [ ] Go - - [ ] Python - - [ ] role - - [ ] Go - - [ ] Python - - [ ] history - - [ ] Create the spec yaml - - [ ] Go - - [ ] Python - - [ ] section - - [ ] Create the spec yaml - - [ ] Go - - [ ] Python - - [ ] metadata.yaml - - [ ] Go - - [ ] Python - - [ ] partials.yaml - - [ ] Go - - [ ] Python - - [ ] picoschema.yaml - - [ ] Go - - [ ] Python - - [ ] variables.yaml - - [ ] Go - - [ ] Python -- [ ] genkit-schema converter - - [ ] schema.py and tests - - [ ] Candidate - - [ ] CandidateError - - [ ] DataPart - - [ ] DocumentData - - [ ] FinishReason - - [ ] GenerateActionOptions - - [ ] GenerateCommonConfig - - [ ] GenerateRequest - - [ ] GenerateResponse - - [ ] GenerateResponseChunk - - [ ] GenerationUsage - - [ ] InstrmentationLibrary - - [ ] Link - - [ ] MediaPart - - [ ] Message - - [ ] ModelInfo - - [ ] ModelRequest - - [ ] ModelResponse - - [ ] ModelResponseChunk - - [ ] Part - - [ ] Role - - [ ] SpanContext - - [ ] SpanData - - [ ] SpanMetadata - - [ ] SpanStatus - - [ ] TextPart - - [ ] TimeEvent - - [ ] ToolDefinition - - [ ] ToolRequest - - [ ] ToolRequestPart - - [ ] ToolResponse - - [ ] ToolResponsePart - - [ ] TraceData -- [ ] reflection API [0/7] - - See: `reflectionApi.yaml` - - - [ ] GET /api/actions: Retrieves all runnable actions. - - [ ] POST /api/runAction: Runs an action and returns the result. - - [ ] GET /api/envs/{env}/traces: Retrieves all traces for a given environment (e.g. dev or prod) - - [ ] GET /api/envs/{env}/traces/{traceId}: Retrieves traces for the given environment - - [ ] GET /api/envs/{env}/flowStates: Retrieves all flow states for a given environment (e.g. dev or prod) - - [ ] GET /api/envs/{env}/flowStates/{flowId}: Retrieves a flow state for the given ID - - [ ] GET /api/__health: health check -- [ ] generate API - - [ ] -** Plugins [0/9] -- [ ] Design [0/2] - - [ ] Proposal with example API - - [ ] Design review -- [ ] Chroma [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] Dotprompt [0/0] -- [ ] Firebase [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] GoogleAI [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] Ollama [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] OpenAI [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] Pinecone [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -- [ ] VertexAI [0/4] - - [ ] Plugin - - [ ] Documentation - - [ ] Tests - - [ ] Sample -** Samples -- [ ] Hello world -- [ ] Basic Gemini -- [ ] Context caching -- [ ] Context caching2 -- [ ] Custom evaluators -- [ ] Docs Menu Basic -- [ ] Docs Menu RAG -- [ ] Flow sample 1 -- [ ] Flow sample 2 -- [ ] Prompt file -- [ ] RAG -- [ ] Vertex AI model garden -- [ ] Vertex AI reranker -- [ ] Vertex AI Vector Search -** Server implementations [/] -- [ ] multiprocessing server cluster [0/2] - - [ ] reflection server in dev mode - - [ ] production flows server -** CI/CD/Dev workflow [2/6] -- [-] Unit testing library - - [ ] Go testify - - [X] Python pytest -- [X] Unit testing watcher - - [X] pytest-watcher -- [-] Coverage analysis - - [X] pytest-cov - - [ ] Go test coverage tool -- [ ] Vulnerability analysis - - [ ] Python - - [ ] Go -- [ ] License compatibility checks - - [ ] Python - - [ ] Go -- [X] Automated license header check - - [X] Python - - [X] Go -** Git 
Hooks [0/1] -- [-] Pre-commit and pre-push hooks - - [-] Build Code - - [-] go build - - [X] Genkit - - [ ] Dotprompt - - [X] build python distribution - - [X] Genkit - - [X] Dotprompt - - [X] Distribution - - [X] Genkit - - [X] Dotprompt - - [-] Documentation - - [-] godoc - - [X] Genkit - - [ ] Dotprompt - - [X] engdoc using mkdocs - - [X] Genkit - - [X] Dotprompt - - [ ] Python API doc using mkdocstrings - - [ ] Genkit - - [ ] Dotprompt - - [-] Test - - [-] go test - - [X] Genkit - - [ ] Dotprompt - - [X] pytest with coverage threshold - - [X] Genkit - - [X] Dotprompt - - [X] Format - - [-] Lint - - [-] Python - - [-] mypy static type checks - - [X] dotprompt - - [ ] genkit -** Dependencies -- [X] Handlebars - - [X] handlebars-py (MIT License; feasibility test done) - - [X] pybars3 (LGPL 3.0 License; cannot use) -- [ ] JSON Schema - - [ ] Go: - - [ ] https://github.com/swaggest/jsonschema-go - - [ ] https://github.com/xeipuuv/gojsonschema - - [ ] https://github.com/santhosh-tekuri/jsonschema - - [ ] https://github.com/qri-io/jsonschema -- [ ] Picoschema - - [ ] Go - - [ ] https://github.com/jumonapp/picoschema -** Release management -- [X] Semantic Versioning and tagging -- [X] PyPi project for dotprompt https://pypi.org/project/dotprompt/ -- [X] PyPi project for genkit https://pypi.org/project/genkit/ -- [X] Version consistency check script (bin/check_versions) -- [X] Dynamic plugin matrix for publish workflow -- [X] Post-publish verification job -- [X] Shell script linting (shellcheck) - -** API Documentation [0/4] -- [X] MkDocs configuration with Material theme -- [X] mkdocstrings for all 22 plugins (mkdocs.yml) -- [ ] Docs publish workflow (deploy to GitHub Pages on release) - - [ ] Create publish_docs.yml workflow - - [ ] Configure GitHub Pages source - - [ ] Add version selector for multiple SDK versions - - [ ] Add search functionality -- [ ] API reference pages for each plugin - - [ ] Auto-generate from docstrings - - [ ] Add usage examples - - [ ] Add configuration reference - -** Integration Tests [/] - - [ ] Go - - [ ] Python - - [X] JS diff --git a/py/engdoc/blog-genkit-python-0.5.0.md b/py/engdoc/blog-genkit-python-0.5.0.md deleted file mode 100644 index 0edf69ce73..0000000000 --- a/py/engdoc/blog-genkit-python-0.5.0.md +++ /dev/null @@ -1,304 +0,0 @@ -# Genkit Python SDK 0.5.0: A Major Leap Forward - -Building intelligent AI-powered applications in Python just got significantly better. Today, we're thrilled to announce the release of **Genkit Python SDK 0.5.0**—our most significant update yet, with **178 commits**, **680+ files changed**, and contributions from **13 developers** across **188 PRs** over the past 8 months. - -This release transforms Genkit for Python from an experimental SDK into a production-ready framework with comprehensive plugin coverage, enterprise-grade security, and first-class developer experience. 
- -## What's New in 0.5.0 - -### Massive Plugin Ecosystem Expansion - -We've added **7 new model provider plugins** and **3 telemetry plugins**, giving you access to virtually every major AI provider: - -**New Model Providers:** -- **AWS Bedrock**: Access Claude, Titan, Llama, and more through AWS -- **Azure OpenAI (Microsoft Foundry)**: Enterprise Azure OpenAI integration -- **Cloudflare Workers AI**: Edge AI with Cloudflare's global network -- **Mistral AI**: Mistral Large, Small, Codestral, and Pixtral models -- **Hugging Face**: 17+ inference providers through one plugin -- **Anthropic**: Full Claude model support -- **DeepSeek**: DeepSeek models with structured output - -**New Telemetry Plugins:** -- **AWS X-Ray**: Production observability with SigV4 signing -- **Observability**: Third-party backends (Sentry, Honeycomb, Datadog) -- **Google Cloud Telemetry**: Full parity with JS/Go SDKs - -### Async-First Architecture - -The Python SDK now embraces async-first design throughout. Here's how clean your code can be: - -```python -from genkit.ai import Genkit -from genkit.plugins.google_genai import GoogleAI - -ai = Genkit( - plugins=[GoogleAI()], - model='googleai/gemini-2.5-flash', -) - -@ai.flow() -async def analyze_sentiment(text: str) -> str: - """Analyzes sentiment of the given text.""" - response = await ai.generate( - prompt=f'Analyze the sentiment of this text: {text}' - ) - return response.text # Property, not method -``` - -### Agentive Tool Calling - -Genkit makes it easy to give your AI agents the ability to call functions. Define tools with the `@ai.tool()` decorator: - -```python -from pydantic import BaseModel, Field - -class WeatherInput(BaseModel): - location: str = Field(description='City and state, e.g. San Francisco, CA') - -@ai.tool() -def get_weather(input: WeatherInput) -> dict: - """Get the current weather for a location.""" - return { - 'location': input.location, - 'temperature': 21.5, - 'conditions': 'sunny', - } - -@ai.flow() -async def weather_agent(location: str) -> str: - """AI agent that can check the weather.""" - response = await ai.generate( - prompt=f"What's the weather in {location}?", - tools=[get_weather], - ) - return response.text -``` - -### Enhanced Dotprompt Integration - -We've deeply integrated with [Dotprompt](https://github.com/google/dotprompt), our prompt templating engine, bringing: - -- **Directory/file prompt loading**: Automatic prompt discovery matching the JS SDK -- **Handlebars partials**: Template reuse with `define_partial` -- **Python 3.14 support**: Full compatibility via our Rust-based Handlebars engine -- **Cycle detection**: Prevents infinite recursion in partial resolution -- **Path traversal hardening**: Security fix for CWE-22 vulnerability - -```python -from genkit.ai import Genkit -from genkit.plugins.google_genai import GoogleAI - -ai = Genkit(plugins=[GoogleAI()], model='googleai/gemini-2.5-flash') - -# Define partials for reusable template components -ai.define_partial('greeting', 'Hello, {{name}}!') -ai.define_partial('signature', '\n\nBest regards,\n{{sender}}') - -# Use partials in prompts ({{> greeting}} and {{> signature}}) -@ai.flow() -async def send_email(name: str, sender: str) -> str: - response = await ai.generate( - prompt=f'Write an email to {name} from {sender}' - ) - return response.text -``` - -### Comprehensive Type Safety - -We now run **three type checkers** on every commit: - -| Type Checker | Provider | Purpose | -|-------------|----------|---------| -| **ty** | Astral (Ruff) | Fast, strict 
checking | -| **pyrefly** | Meta | Additional coverage | -| **pyright** | Microsoft | Full type analysis | - -This means you get better IDE support, fewer runtime errors, and more confident refactoring. - -### Pydantic Output Instances - -Generate structured data directly into Pydantic models: - -```python -from pydantic import BaseModel, Field -from genkit.ai import Genkit, Output -from genkit.plugins.google_genai import GoogleAI - -ai = Genkit(plugins=[GoogleAI()], model='googleai/gemini-2.5-flash') - -class RpgCharacter(BaseModel): - name: str = Field(description='name of the character') - backstory: str = Field(description='character backstory') - abilities: list[str] = Field(description='list of abilities (3-4)') - -@ai.flow() -async def generate_character(name: str) -> RpgCharacter: - result = await ai.generate( - prompt=f'Generate an RPG character named {name}', - output=Output(schema=RpgCharacter), # Use Output wrapper - ) - # Returns an RpgCharacter instance, not dict! - return result.output -``` - -## Critical Fixes & Security - -This release addresses several important issues: - -- **Race Condition Fix**: Dev server startup race condition resolved (#4225) -- **Thread Safety**: Per-event-loop HTTP client caching prevents event loop binding errors -- **Security Audit**: Full Ruff security rules (S) audit completed -- **SigV4 Signing**: AWS X-Ray OTLP exporter now uses proper AWS signatures - -## Developer Experience Improvements - -### Hot Reloading - -All samples now support hot reloading via [Watchdog](https://github.com/gorakhargosh/watchdog): - -```bash -# Start with hot reloading -genkit start -- python main.py -``` - -### CI Consolidation - -Every commit is now release-worthy. Our consolidated CI runs: -- All three type checkers -- Full test suite across Python 3.10-3.14 -- Security scanning -- License compliance -- Package builds - -### Rich Tracebacks - -Better error output with [Rich](https://github.com/Textualize/rich) tracebacks in all samples. 
- -## Available Plugins - -### Model Providers - -| Plugin | Models | Status | -|--------|--------|--------| -| `genkit-plugin-google-genai` | Gemini 2.5, Imagen, embeddings | ✅ Stable | -| `genkit-plugin-ollama` | Gemma, Llama, Mistral (local) | ✅ Stable | -| `genkit-plugin-anthropic` | Claude 3.5 Sonnet, Opus | ✅ New | -| `genkit-plugin-amazon-bedrock` | Claude, Titan, Llama via AWS | ✅ New | -| `genkit-plugin-microsoft-foundry` | Azure OpenAI | ✅ New | -| `genkit-plugin-cloudflare-workers-ai` | Cloudflare Workers AI | ✅ New | -| `genkit-plugin-mistral` | Mistral Large, Codestral | ✅ New | -| `genkit-plugin-huggingface` | 17+ providers | ✅ New | -| `genkit-plugin-deepseek` | DeepSeek models | ✅ New | -| `genkit-plugin-xai` | Grok models | ✅ New | - -### Telemetry & Observability - -| Plugin | Destination | Status | -|--------|-------------|--------| -| `genkit-plugin-google-cloud` | Cloud Trace/Logging | ✅ Stable | -| `genkit-plugin-aws` | AWS X-Ray | ✅ New | -| `genkit-plugin-observability` | Sentry, Honeycomb, Datadog | ✅ New | -| `genkit-plugin-firebase` | Firebase/Firestore | ✅ Stable | - -### Data & Retrieval - -| Plugin | Purpose | Status | -|--------|---------|--------| -| `genkit-plugin-firebase` | Vector search with Firestore | ✅ Stable | -| `genkit-plugin-dev-local-vectorstore` | Local vector store for dev | ✅ Stable | - -## Get Started - -Install Genkit Python SDK 0.5.0: - -```bash -pip install genkit==0.5.0 -``` - -Or with specific plugins: - -```bash -pip install genkit[google-genai,anthropic,amazon-bedrock]==0.5.0 -``` - -### Quick Start Example - -```python -import asyncio -from genkit.ai import Genkit -from genkit.plugins.google_genai import GoogleAI - -# Initialize at module level (best practice) -ai = Genkit( - plugins=[GoogleAI()], - model='googleai/gemini-2.5-flash', -) - -@ai.flow() -async def greeting_flow(name: str) -> str: - """Generates a personalized greeting.""" - response = await ai.generate( - prompt=f'Write a creative greeting for {name}' - ) - return response.text # Property, not method - -async def main(): - result = await greeting_flow('World') - print(result) - -if __name__ == '__main__': - ai.run_main(main()) # Use ai.run_main() for proper lifecycle -``` - -Run with the Developer UI: - -```bash -genkit start -- python main.py -``` - -## Contributors - -This release was made possible by an incredible team effort. 
Thank you to all **13 contributors** who made this release possible: - -| Contributor | Contributions | -|-------------|---------------| -| [@yesudeep](https://github.com/yesudeep) | Core architecture, 7 plugins, type safety, security | -| [@MengqinShen](https://github.com/MengqinShen) | Resources, samples, model configs | -| [@AbeJLazaro](https://github.com/AbeJLazaro) | Model Garden, Ollama, Gemini | -| [@pavelgj](https://github.com/pavelgj) | Reflection API, embedders | -| [@zarinn3pal](https://github.com/zarinn3pal) | Anthropic, DeepSeek, xAI, GCP telemetry | -| [@huangjeff5](https://github.com/huangjeff5) | PluginV2, type safety, telemetry | -| [@hendrixmar](https://github.com/hendrixmar) | Evaluators, OpenAI compat, Dotprompt | -| [@ssbushi](https://github.com/ssbushi) | Evaluator plugins | -| [@shrutip90](https://github.com/shrutip90) | ResourcePartSchema | -| [@schlich](https://github.com/schlich) | Type annotations | -| [@ktsmadhav](https://github.com/ktsmadhav) | Windows support | -| [@junhyukhan](https://github.com/junhyukhan) | Documentation | -| [@CorieW](https://github.com/CorieW) | Community contribution | - -Special thanks to the [google/dotprompt](https://github.com/google/dotprompt) team for the deep integration work. - -## What's Next? - -We're committed to continuously evolving Genkit Python. Coming soon: - -- **Session/Chat API**: Multi-turn conversation management -- **Reflection API v2**: WebSocket and JSON-RPC 2.0 support -- **More plugins**: Checks, Chroma, Pinecone, Cloud SQL PostgreSQL -- **Feature parity**: Continued alignment with JS/Go SDKs - -## Get Involved - -Got questions or feedback? Join us on: -- [Discord](https://discord.gg/qXt5zzQKpc) -- [Stack Overflow](https://stackoverflow.com/questions/tagged/genkit) -- [GitHub Issues](https://github.com/firebase/genkit/issues) - -Explore the [full documentation](https://python.api.genkit.dev) and start building! - -Happy coding, and we look forward to seeing what you create with Genkit Python 0.5.0! - ---- - -*Tags: Launch | Genkit | AI | Python* diff --git a/py/engdoc/extending/api.md b/py/engdoc/extending/api.md index 7b8eafb937..5fad3dd83c 100644 --- a/py/engdoc/extending/api.md +++ b/py/engdoc/extending/api.md @@ -139,210 +139,110 @@ differences here for now. * **Content negotiation**: Different response formats based on accept headers or query parameters. -# Sync vs Async Design +# Async-First Design + Genkit is a library that allows application developers to create AI flows for their applications using an API that abstracts over various components such as -indexers, retiervers, models, embedders, etc. - -Ideally, as a user, one would like the API to be async-first because this -single-threaded model of dealing with concurrency is the direction that Python -frameworks are taking and Genkit naturally lives in an async world. Genkit is -majorly I/O-bound not as much computationally-bound since it's primary purpose -is composing various AI foundational components and setting up typed -communication patterns between them. - -### Shape of the API - -Before we begin, let's study `structlog`, a structured logging library that has -had to deal with this problem as well and exposes a well-defined set of APIs -that is familiar to the Python world: - -```python -import asyncio -import structlog - -logger = structlog.get_logger(__name__) - -async def foo() -> str: - """Foo. - - Returns: - The name of this function. 
- """ - await logger.ainfo('Returning foo from function', fn=foo.__name__) - - return foo.__name__ - - -if __name__ == '__main__': - asyncio.run(foo()) - -``` - -Running the program displays the following on the console: - -```shell -zsh❯ uv run foo.py -2025-03-30 14:23:13 [info ] Returning foo from function fn=foo - -``` - -`structlog` exposes the async equivalent (`await logger.ainfo()`) functionality -of their `logger.info()` calls using the minimally-invasive `a*` prefix, without -resorting to any sort of magic. - -We propose to do the same: - -```python -ai = Genkit() - -@ai.flow() -async def async_flow(...): - response = await ai.generate(f"Answer this: {query}") - return {"answer": response.text} - -@ai.flow() -def sync_flow(...): - response = ai.generate(f"Answer this: {query}") - return {"answer": response.text} - -async def main() -> None: - """Main entry-point.""" - ... - -if __name__ == '__main__': - asyncio.run(main()) - -``` - -!!! note - - In an initial iteration of this design, we were considering using decorators - to detect whether the callable is a coroutine and change the meaning of the - `ai` treating it as a special variable inside it, but this increases the - complexity of the implementation and adds very little value. +indexers, retrievers, models, embedders, etc. - We have, therefore, decided to favor simplicity and add the `a*` prefix to - every asynchronous method made available by the API. +The API is **async-first** because this single-threaded model of dealing with +concurrency is the direction that Python frameworks are taking and Genkit +naturally lives in an async world. Genkit is majorly I/O-bound, not as much +computationally-bound, since its primary purpose is composing various AI +foundational components and setting up typed communication patterns between them. -To make this work, we could have a user-facing veneer -`genkit.ai.GenkitExperimental` class that composes 2 implementations of Genkit: +### Class Hierarchy -- `genkit.ai.AsyncGenkit` -- `genkit.ai.SyncGenkit` - -#### ASCII Diagram +The implementation uses a three-level class hierarchy: ```ascii -+---------------------+ +-------------------+ -| RegistrarMixin | | Registry | -|---------------------| |-------------------| -| - _registry |<>----|(placeholder type) | (Composition: RegistrarMixin has a Registry) -|---------------------| +-------------------+ -| + __init__(registry)| -| + flow() | -| + tool() | ++---------------------+ +| GenkitRegistry | (in _registry.py) +|---------------------| +| + flow() | Decorator to register flows +| + tool() | Decorator to register tools +| + define_model() | Register model actions +| + define_embedder() | Register embedder actions | + registry (prop) | +--------^------------+ - | (Inheritance: GenkitExperimental is-a RegistrarMixin) -+--------|-----------------+ +----------------------+ +----------------------+ -| GenkitExperimental |----->| AsyncGenkit | | SyncGenkit | -| (in _veneer.py) |<>-- | (in _async.py) | | (in _sync.py) | -|--------------------------| | |----------------------| |----------------------| -| - _registry (inherited) | | | + generate() | | + generate() | -| - _async_ai : AsyncGenkit| | | + generate_stream() | | + generate_stream() | -| - _sync_ai : SyncGenkit | *-->+----------------------+ *-->+----------------------+ -|--------------------------| (Async Implementation) (Independent Sync Impl.) 
-| + __init__(registry) | -| + flow() (inherited) | -| + tool() (inherited) | -| | -| + generate() ----------> calls _sync_ai.generate() -| + generate_stream() ---> calls _sync_ai.generate_stream() -| | -| + agenerate() ---------> calls _async_ai.generate() -| + agenerate_stream() --> calls _async_ai.generate_stream() -| | -| + aio (prop) ---------> returns _async_ai instance -| + io (prop) ----------> returns _sync_ai instance -+--------------------------+ + | ++--------|-----------+ +| GenkitBase | (in _base_async.py) +|--------------------| +| + __init__( | +| plugins, | +| model, | +| reflection_ | +| server_spec) | ++--------^-----------+ + | ++--------|-----------+ +| Genkit | (in _aio.py) +|--------------------| +| + generate() | async — text generation +| + generate_stream()| streaming generation +| + embed() | async — create embeddings +| + retrieve() | async — fetch documents +| + rerank() | async — reorder documents +| + evaluate() | async — evaluate outputs +| + chat() | session-based chat ++--------------------+ ``` -#### Mermaid Diagram - ```mermaid classDiagram - class RegistrarMixin { - -Registry _registry - +__init__(registry: Registry | None) - +flow(name: str | None, description: str | None) Callable - +tool(name: str | None, description: str | None) Callable + class GenkitRegistry { + <<_registry.py>> + +flow(name, description) Callable + +tool(name, description) Callable + +define_model(config, fn) Action + +define_embedder(config, fn) Action +registry() Registry } - class Registry { - %% Placeholder for Registry type %% + class GenkitBase { + <<_base_async.py>> + +__init__(plugins, model, reflection_server_spec) } - class AsyncGenkit { - <<_async.py>> - +generate(prompt: str) str - +generate_stream(prompt: str) AsyncGenerator + class Genkit { + <<_aio.py>> + +generate(model, prompt, system, ...) GenerateResponseWrapper + +generate_stream(model, prompt, ...) tuple + +embed(embedder, content) EmbedResponse + +retrieve(retriever, query) list + +rerank(reranker, query, documents) list + +evaluate(evaluator, dataset) EvalResponse } - class SyncGenkit { - <<_sync.py>> - +generate(prompt: str) str - +generate_stream(prompt: str) Generator - } - - class GenkitExperimental { - <<_veneer.py>> - -AsyncGenkit _async_ai - -SyncGenkit _sync_ai - +__init__(registry: Registry | None) - +generate(prompt: str) str - +generate_stream(prompt: str) Generator - +agenerate(prompt: str) str - +agenerate_stream(prompt: str) AsyncGenerator - +aio() AsyncGenkit - +io() SyncGenkit - } - - RegistrarMixin *-- Registry : has a > - GenkitExperimental --|> RegistrarMixin : inherits - GenkitExperimental *-- AsyncGenkit : has _async_ai > - GenkitExperimental *-- SyncGenkit : has _sync_ai > - - GenkitExperimental --> AsyncGenkit : calls agenerate() - GenkitExperimental --> AsyncGenkit : calls agenerate_stream() - GenkitExperimental --> SyncGenkit : calls generate() - GenkitExperimental --> SyncGenkit : calls generate_stream() + GenkitBase --|> GenkitRegistry : inherits + Genkit --|> GenkitBase : inherits ``` -An instance of each of these would be exposed as a property on the veneer class. -The veneer class should use a mixin called `RegistrarMixin` to manage the -registration of AI blocks such as tools, flows, actions, etc - -### Maintaining parity +All methods on the `Genkit` class are `async`. Synchronously-defined flows and +tools are executed using a thread-pool executor internally. -This would imply we'd have 2 implementations of Genkit. 
There's 2 ways that -occur to me in which we could maintain parity: +### Usage -1. Maintain two separate implementations one for async and another for sync. +```python +from genkit.ai import Genkit +from genkit.plugins.google_genai import GoogleAI -2. Implement one in terms of the other. +ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', +) -We recommend option 1 for simplicity and easier maintenance. +@ai.flow() +async def my_flow(query: str) -> str: + response = await ai.generate(prompt=f"Answer this: {query}") + return response.text +``` ## Implementation -Currently, the Veneer API contains an implementation that uses threads to start -a reflection server when Genkit is in use in an environment where the -`GENKIT_ENV` environment variable has been set to `'dev'`. - -There are a few ways to set that environment variable, and running the development -server using `genkit start` also sets it. +The `Genkit` class starts a reflection server when the `GENKIT_ENV` environment +variable has been set to `'dev'`. Running the following command: @@ -350,60 +250,28 @@ Running the following command: genkit start -- uv run sample.py ``` -would set `GENKIT_ENV='dev'` within a running instance of `sample.py`. +sets `GENKIT_ENV='dev'` within a running instance of `sample.py`. `genkit start` exposes a developer UI (usually called dev UI for short) that is used for debugging and that talks to a reflection API server implemented by the -veneer `Genkit` class instance. The reflection API server provides a way for -the dev UI to allow users to debug their custom flows, test features such as -models and plugins, and also observe traces emitted by these components. +`Genkit` class instance. The reflection API server provides a way for the dev UI +to allow users to debug their custom flows, test features such as models and +plugins, and also observe traces emitted by these components. ### Concurrency handling -We would like to avoid using threads since asyncio is primarily a -single-threaded design and threading complicates the internals of the API. -Synchronously-defined flows, tools, and other actions would execute using a -thread-pool executor used by the `SyncGenkit` implementation. +The implementation avoids using threads for server infrastructure since asyncio +is primarily a single-threaded design. The reflection server runs as a coroutine +on the same event loop. #### Scenarios -- For simple short lived applications, when we don't have the dev server we'd - want the program to exit since that shouldn't start the reflection server. - -- For simple short lived applications, when we have the dev server (meaning the - `GENKIT_ENV=dev` environment variable has been set), we should start the - reflection server and prevent the application's main thread from exiting and - shutting down the process to enable debugging. - -- For servers, we'd want the user to be able to add the reflection server to a - manager object such as that used in @multi_server.py passed into the - arguments of the Genkit veneer class instance so that it attaches to the - server manager alongside any application servers written by the end user. - -The end user should not need to expliclity add code to their main thread to wait -for the reflection server when dev mode is enabled. Since we're building an -asyncio-first solution it should naturally do that since we'd be running the -reflection server on the same event loop. 
- -```pseudocode -if short lived app: - if dev mode enabled: - add reflection server coroutine to the event loop so main thread waits for dev UI debugging - else: - complete all flows and exit normally -elif long-lived server: - if dev mode enabled: - add reflection server coroutine to the server manager to enable debuggging using dev UI - else: - run user-defined servers using server manager - -``` +- For simple short-lived applications without dev mode, the program exits + normally after completing all flows. -Each of these can be demonstrated using individual entry-points sharing a common -set of flows and tools. For example, the sample would define all the flows in -`flows.py` and use them in both `server_example.py` and `short_lived_example.py` -as a demonstration: +- For simple short-lived applications with dev mode (`GENKIT_ENV=dev`), the + reflection server starts and prevents the main thread from exiting to enable + debugging. -- `flows.py` -- `server_example.py` -- `short_lived_example.py` +- For long-lived servers, the reflection server attaches to the server manager + alongside any application servers written by the end user. diff --git a/py/engdoc/extending/index.md b/py/engdoc/extending/index.md index 99adf0f124..c51fe449d4 100644 --- a/py/engdoc/extending/index.md +++ b/py/engdoc/extending/index.md @@ -43,13 +43,28 @@ genkit: { } plugins: { style: {fill: "#FCE4EC"} - chroma - pinecone google_genai google_cloud - openai + vertex_ai firebase ollama + anthropic + amazon_bedrock + cloudflare_workers_ai + cohere + compat_oai + deepseek + huggingface + microsoft_foundry + mistral + xai + observability + checks + evaluators + mcp + fastapi + flask + dev_local_vectorstore } } diff --git a/py/engdoc/extending/servers.md b/py/engdoc/extending/servers.md index a32ecbc2cf..2153ad4489 100644 --- a/py/engdoc/extending/servers.md +++ b/py/engdoc/extending/servers.md @@ -38,10 +38,10 @@ the runtime. The initialization process deals with: | Server | Sources | |------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| Flows | [JS](https://github.com/firebase/genkit/blob/main/js/plugins/express/src/index.ts), [Go](TODO), [Python](TODO) | +| Flows | [JS](https://github.com/firebase/genkit/blob/main/js/plugins/express/src/index.ts), [Go](TODO), [Python](https://github.com/firebase/genkit/blob/main/py/packages/genkit/src/genkit/core/flows.py) | | Telemetry | [JS](https://github.com/firebase/genkit/blob/main/genkit-tools/telemetry-server/src/index.ts) | | Dev UI/Tools API | [JS](https://github.com/firebase/genkit/blob/main/genkit-tools/common/src/server/server.ts) | -| Reflection | [JS](https://github.com/firebase/genkit/blob/main/js/core/src/reflection.ts), [Go](https://github.com/firebase/genkit/blob/main/go/genkit/reflection.go), [Python](TODO) | +| Reflection | [JS](https://github.com/firebase/genkit/blob/main/js/core/src/reflection.ts), [Go](https://github.com/firebase/genkit/blob/main/go/genkit/reflection.go), [Python](https://github.com/firebase/genkit/blob/main/py/packages/genkit/src/genkit/core/reflection.py) | ## Environment Variables @@ -104,4 +104,3 @@ Many of these servers handle signals to handle graceful termination and clean up | `SIGTSTP` | 20 or 18 | Terminal stop signal. Sent when the user presses `Ctrl+Z`. | Stop | Used for job control. | | `SIGTTIN` | 21 | Terminal input. Sent to a background process that attempts to read from the terminal. 
| Stop | Used for job control. | | `SIGTTOU` | 22 | Terminal output. Sent to a background process that attempts to write to the terminal. | Stop | Used for job control. | - diff --git a/py/engdoc/index.md b/py/engdoc/index.md index 57909edcdb..d62a62376a 100644 --- a/py/engdoc/index.md +++ b/py/engdoc/index.md @@ -19,7 +19,7 @@ tools for testing and debugging. The following language runtimes are supported: |------------------|---------|--------------| | Node.js | 22.0+ | 1 | | Go | 1.22+ | 1 | -| Python | 3.12+ | 1 | +| Python | 3.10+ | 1 | It is designed to work with any generative AI model API or vector database. While we offer integrations for Firebase and Google Cloud, you can use Genkit @@ -52,25 +52,40 @@ capabilities in code: | Feature | Python | JavaScript | Go | |-------------------|--------|------------|----| | Agents | ❌ | ✅ | ✅ | -| Chat | ❌ | ✅ | ✅ | -| Data retrieval | ❌ | ✅ | ✅ | -| Generation | ❌ | ✅ | ✅ | -| Prompt templating | ❌ | ✅ | ✅ | -| Structured output | ❌ | ✅ | ✅ | -| Tool calling | ❌ | ✅ | ✅ | +| Chat | ✅ | ✅ | ✅ | +| Data retrieval | ✅ | ✅ | ✅ | +| Generation | ✅ | ✅ | ✅ | +| Prompt templating | ✅ | ✅ | ✅ | +| Structured output | ✅ | ✅ | ✅ | +| Tool calling | ✅ | ✅ | ✅ | ### Plugin Parity -| Plugins | Python | JavaScript | Go | -|--------------|--------|------------|----| -| Chroma DB | ❌ | ✅ | ✅ | -| Dotprompt | ❌ | ✅ | ✅ | -| Firebase | ❌ | ✅ | ✅ | -| Google AI | ❌ | ✅ | ✅ | -| Google Cloud | ❌ | ✅ | ✅ | -| Ollama | ❌ | ✅ | ✅ | -| Pinecone | ❌ | ✅ | ✅ | -| Vertex AI | ❌ | ✅ | ✅ | +| Plugins | Python | JavaScript | Go | +|------------------------|--------|------------|----| +| Amazon Bedrock | ✅ | — | — | +| Anthropic | ✅ | — | — | +| Checks | ✅ | ✅ | — | +| Cloudflare Workers AI | ✅ | — | — | +| Cohere | ✅ | — | — | +| Compat-OAI | ✅ | — | — | +| DeepSeek | ✅ | — | — | +| Dev Local Vectorstore | ✅ | ✅ | — | +| Dotprompt | ✅ | ✅ | ✅ | +| Evaluators | ✅ | ✅ | — | +| FastAPI | ✅ | — | — | +| Firebase | ✅ | ✅ | ✅ | +| Flask | ✅ | — | — | +| Google Cloud | ✅ | ✅ | ✅ | +| Google GenAI | ✅ | ✅ | ✅ | +| Hugging Face | ✅ | — | — | +| MCP | ✅ | — | — | +| Microsoft Foundry | ✅ | — | — | +| Mistral | ✅ | — | — | +| Observability | ✅ | — | — | +| Ollama | ✅ | ✅ | — | +| Vertex AI | ✅ | ✅ | ✅ | +| xAI | ✅ | — | — | ## Examples @@ -78,38 +93,31 @@ capabilities in code: === "Python" - ```python hl_lines="12 13 14 15 17 20 21 22" linenums="1" + ```python linenums="1" import asyncio - import structlog - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash + from genkit.ai import Genkit + from genkit.plugins.google_genai import GoogleAI - logger = structlog.get_logger() + ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', + ) async def main() -> None: - ai = genkit({ # (1)! - plugins: [googleAI()], - model: gemini15Flash, - }) + response = await ai.generate(prompt='Why is AI awesome?') + print(response.text) - response = await ai.generate('Why is AI awesome?') - await logger.adebug(response.text) - - stream, _ = ai.generate_stream("Tell me a story") + stream, _ = ai.generate_stream(prompt='Tell me a story') async for chunk in stream: - await logger.adebug("Received chunk", text=chunk.text) - await logger.adebug("Finished generating text stream") + print(chunk.text, end='') if __name__ == '__main__': - asyncio.run(content_generation()) + asyncio.run(main()) ``` -1. :man_raising_hand: Basic example of annotation. 
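+    To inspect the resulting traces in the Genkit developer UI, the script can
+    be launched through the Genkit CLI (assuming the code above is saved as
+    `sample.py`):
+
+    ```bash
+    genkit start -- uv run sample.py
+    ```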
- === "JavaScript" ```javascript @@ -148,42 +156,37 @@ capabilities in code: ```python import asyncio - import structlog + from enum import Enum - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash + from pydantic import BaseModel - logger = structlog.get_logger() + from genkit.ai import Genkit, Output + from genkit.plugins.google_genai import GoogleAI + + ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', + ) - from pydantic import BaseModel, Field, validator - from enum import Enum class Role(str, Enum): KNIGHT = "knight" MAGE = "mage" ARCHER = "archer" + class CharacterProfile(BaseModel): name: str role: Role backstory: str - async def main() -> None: - ai = genkit({ - plugins: [googleAI()], - model: gemini15Flash, - }) - await logger.adebug("Generating structured output", prompt="Create a brief profile for a character in a fantasy video game.") + async def main() -> None: response = await ai.generate( prompt="Create a brief profile for a character in a fantasy video game.", - output={ - "format": "json", - "schema": CharacterProfile, - }, + output=Output(schema=CharacterProfile), ) - await logger.ainfo("Generated output", output=response.output) + print(response.output) if __name__ == "__main__": @@ -224,51 +227,30 @@ capabilities in code: ```python import asyncio - import structlog - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash from pydantic import BaseModel, Field - logger = structlog.get_logger() - - - class GetWeatherInput(BaseModel): - location: str = Field(description="The location to get the current weather for") - + from genkit.ai import Genkit + from genkit.plugins.google_genai import GoogleAI - class GetWeatherOutput(BaseModel): - weather: str + ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', + ) - async def get_weather(input: GetWeatherInput) -> GetWeatherOutput: - await logger.adebug("Calling get_weather tool", location=input.location) - # Replace this with an actual API call to a weather service - weather_info = f"The current weather in {input.location} is 63°F and sunny." - return GetWeatherOutput(weather=weather_info) + @ai.tool() + async def get_weather(location: str = Field(description="The location to get the current weather for")) -> str: + """Gets the current weather in a given location.""" + return f"The current weather in {location} is 63°F and sunny." 
async def main() -> None: - ai = genkit({ - plugins: [googleAI()], - model: gemini15Flash, - }) - - get_weather_tool = ai.define_tool( - name="getWeather", - description="Gets the current weather in a given location", - input_schema=GetWeatherInput, - output_schema=GetWeatherOutput, - func=get_weather, - ) - - await logger.adebug("Generating text with tool", prompt="What is the weather like in New York?") response = await ai.generate( prompt="What is the weather like in New York?", - tools=[get_weather_tool], + tools=['get_weather'], ) - await logger.ainfo("Generated text", text=response.text) + print(response.text) if __name__ == "__main__": @@ -317,43 +299,30 @@ capabilities in code: ```python import asyncio - import structlog - - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash - from pydantic import BaseModel, Field - logger = structlog.get_logger() + from genkit.ai import Genkit + from genkit.plugins.google_genai import GoogleAI - - class ChatResponse(BaseModel): - text: str - - - async def chat(input: str) -> ChatResponse: - await logger.adebug("Calling chat tool", input=input) - # Replace this with an actual API call to a language model, - # providing the user query and the conversation history. - response_text = "Ahoy there! Your name is Pavel, you scurvy dog!" - return ChatResponse(text=response_text) + ai = Genkit( + plugins=[GoogleAI()], + model='googleai/gemini-2.0-flash', + ) async def main() -> None: - ai = genkit({ - plugins: [googleAI()], - model: gemini15Flash, - }) - - chat_tool = ai.chat({system: 'Talk like a pirate'}) - - await logger.adebug("Calling chat tool", input="Hi, my name is Pavel") - response = await chat_tool.send("Hi, my name is Pavel") - - await logger.adebug("Calling chat tool", input="What is my name?") - response = await chat_tool.send("What is my name?") + response = await ai.generate( + prompt='Hi, my name is Pavel', + system='Talk like a pirate', + ) + print(response.text) - await logger.ainfo("Chat response", text=response.text) + response = await ai.generate( + prompt='What is my name?', + system='Talk like a pirate', + messages=response.messages, + ) + print(response.text) + # Ahoy there! Your name is Pavel, you scurvy dog! if __name__ == "__main__": @@ -385,7 +354,8 @@ capabilities in code: === "Python" ```python - + # Not yet implemented in Python. + # See: https://github.com/firebase/genkit/pull/4212 ``` === "JavaScript" @@ -438,46 +408,39 @@ capabilities in code: ```python import asyncio - import structlog - from genkit.ai import genkit - from genkit.plugins.google_ai import googleAI - from genkit.plugins.google_ai.models import gemini15Flash, textEmbedding004 - from genkit.plugins.dev_local_vectorstore import devLocalVectorstore, devLocalRetrieverRef + from genkit.ai import Genkit + from genkit.plugins.google_genai import GoogleAI + from genkit.plugins.dev_local_vectorstore import DevLocalVectorstore - logger = structlog.get_logger() + ai = Genkit( + plugins=[ + GoogleAI(), + DevLocalVectorstore( + indexes=[{ + 'index_name': 'BobFacts', + 'embedder': 'googleai/text-embedding-004', + }], + ), + ], + model='googleai/gemini-2.0-flash', + ) async def main() -> None: - ai = genkit( - plugins=[ - googleAI(), - devLocalVectorstore( - [ - { - "index_name": "BobFacts", - "embedder": textEmbedding004, - } - ] - ), - ], - model=gemini15Flash, - ) - - retriever = devLocalRetrieverRef("BobFacts") - query = "How old is Bob?" 
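+        # The query drives both retrieval and the final generate() call below.
+        query = "How old is Bob?"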
- await logger.adebug("Retrieving documents", query=query) - docs = await ai.retrieve(retriever=retriever, query=query) + docs = await ai.retrieve( + retriever='devLocalVectorstore/BobFacts', + query=query, + ) - await logger.adebug("Generating answer", query=query) response = await ai.generate( - prompt=f"Use the provided context from the BobFacts database to answer this query: {query}", + prompt=f"Use the provided context to answer: {query}", docs=docs, ) + print(response.text) - await logger.ainfo("Generated answer", answer=response.text) if __name__ == "__main__": asyncio.run(main()) diff --git a/py/engdoc/model-conformance-roadmap.md b/py/engdoc/model-conformance-roadmap.md deleted file mode 100644 index 9f182de450..0000000000 --- a/py/engdoc/model-conformance-roadmap.md +++ /dev/null @@ -1,491 +0,0 @@ -# Model Conformance Testing Plan for Python Plugins - -> **Status:** Infrastructure + Native Runner Complete (P0–P3 done, P4 pending manual validation) -> **Date:** 2026-02-11 (updated) -> **Owner:** Python Genkit Team -> **Scope:** Phase 1 covers google-genai, anthropic, and compat-oai (OpenAI). -> All 13 plugins have entry points and specs. Native test runner replaces -> genkit CLI dependency. Unified multi-runtime table. - ---- - -## Problem Statement - -The Genkit CLI provides a `genkit dev:test-model` command -([genkit-tools/cli/src/commands/dev-test-model.ts][dev-test-model]) that runs -standardized conformance tests against model providers. This command already -works cross-runtime (JS and Python) via the reflection API, but we have no -Python-side conformance test specs, entry points, or automation to exercise it. - -We need to: - -1. Verify that Python model provider plugins produce correct responses for the - same test cases used by JS plugins. -2. Establish a repeatable, per-plugin conformance testing workflow. -3. Identify and close feature parity gaps between Python and JS plugins. - -[dev-test-model]: https://github.com/firebase/genkit/blob/main/genkit-tools/cli/src/commands/dev-test-model.ts - ---- - -## Architecture - -The `conform` tool supports two execution modes: - -``` - py/bin/conform check-model [PLUGIN...] - | - +--------+--------+ - | | - default (native) --use-cli (legacy) - | | - +---------+---------+ | - | | | v - python js go genkit dev:test-model - | | | | - InProcess Reflection Reflection - Runner Runner Runner - | | | | - import subprocess subprocess subprocess - entry.py entry.ts entry.go genkit CLI - | | | | - action. async HTTP async HTTP | - arun_raw reflection reflection | - | | | | - +----+----+----+----+ | - | | | - 10 Validators | | - (1:1 with JS) | | - | | | - Unified Results Table | - (Runtime column when v - multiple runtimes) Legacy per-runtime - tables -``` - -**Native runner (default):** - -1. For Python: imports `conformance_entry.py` in-process, calls - `action.arun_raw()` directly (no subprocess, no HTTP, no genkit CLI). -2. For JS/Go: starts the entry point subprocess, discovers the reflection - server via `.genkit/runtimes/*.json`, communicates via async HTTP. -3. 10 validators ported 1:1 from the canonical JS source. -4. Results displayed in a unified table with Runtime column. - -**Legacy CLI runner (`--use-cli`):** - -1. Delegates to `genkit dev:test-model` via subprocess. -2. Discovers the running Python runtime via `.genkit/runtimes/*.json`. -3. Sends standardized test requests via `POST /api/runAction`. -4. Validates responses using built-in validators. 
- ---- - -## Cross-Runtime Feature Parity Analysis - -### Plugins with JS Counterparts - -| Plugin | JS Location | JS Models | Python Models | Parity | Gaps in Python | Python Extras | -|--------|-------------|-----------|---------------|--------|----------------|---------------| -| **google-genai** | In-repo `js/plugins/google-genai/` | 24 (Gemini, TTS, Gemini-Image, Gemma, Imagen, Veo) | 23+ (same families) | **Partial** | Imagen under `googleai/` prefix (only registered under `vertexai/`) | More legacy Gemini preview versions | -| **anthropic** | In-repo `js/plugins/anthropic/` | 8 (Claude 3-haiku through opus-4-5) | 8 (identical list and capabilities) | **Full** | None | None | -| **compat-oai** | In-repo `js/plugins/compat-oai/` | 49 (30 chat, 2 image gen, 3 TTS, 3 STT, 3 embed, 2 DeepSeek, 6 xAI) | 30+ (22+ chat, 2 image gen, 3 TTS, 3 STT, 3 embed) | **Full** | Vision (gpt-4-vision*), gpt-4-32k (older models) | DeepSeek/xAI split into dedicated plugins | -| **ollama** | In-repo `js/plugins/ollama/` | Dynamic discovery | Dynamic discovery | **Full** | Cosmetic: JS declares `media=true`, `toolChoice=true`; Python omits | Python declares `output=['text','json']` | -| **amazon-bedrock** | External [aws-bedrock-js-plugin][bedrock-js] | ~35 (Amazon, Claude 2-3.7, Cohere, Mistral, AI21, Llama) | 50+ (all JS models included) | **Python superset** | None | DeepSeek, Gemma, NVIDIA, Qwen, Writer, Moonshot, newer Claude 4.x | -| **microsoft-foundry** | External [azure-foundry-js-plugin][foundry-js] | ~32 chat + DALL-E + TTS + Whisper + embed | 30+ chat + embed + dynamic catalog | **Partial** | DALL-E image gen, TTS, Whisper STT | Claude, DeepSeek, Grok, Llama, Mistral; dynamic Azure catalog (11k+ models) | -| **deepseek** | JS: in `compat-oai` as `deepseek/` prefix | 2 (deepseek-chat, deepseek-reasoner) | 4 (+ deepseek-v3, deepseek-r1) | **Python superset** | None | 2 additional models | -| **xai** | JS: in `compat-oai` as `xai/` prefix | 6 (grok-3 family, grok-2-vision, grok-2-image) | 6 (grok-3 family, grok-4, grok-2-vision) | **Partial** | Image gen (grok-2-image-1212) | grok-4 (newer model) | - -[bedrock-js]: https://github.com/genkit-ai/aws-bedrock-js-plugin -[foundry-js]: https://github.com/genkit-ai/azure-foundry-js-plugin - -### Python-Only Plugins (no JS counterpart) - -| Plugin | Models | Notes | -|--------|--------|-------| -| **mistral** | 30+ (Large 3, Medium 3.1, Small 3.2, Ministral 3, Magistral, Codestral, Devstral, Voxtral, Pixtral, Embed) | No JS plugin exists. PR #4485: embeddings + streaming fix. PR #4486: full capability update. 
| -| **huggingface** | 10+ popular models + any HF model ID | No JS plugin exists | -| **cloudflare-workers-ai** | 15+ (Llama, Mistral, Qwen, Gemma, Phi, DeepSeek) | No JS plugin exists | - -### Gaps Summary (Ordered by Priority) - -| Priority | Plugin | Gap | Impact | Fix Effort | -|----------|--------|-----|--------|------------| -| **HIGH** | google-genai | Imagen under `googleai/` prefix | Blocks spec symlink for conformance tests | Low (~20 lines in `google.py`) | -| ~~MEDIUM~~ | compat-oai | ~~Image gen (dall-e-3, gpt-image-1)~~ | ✅ Done (PR #4477) | -- | -| ~~MEDIUM~~ | compat-oai | ~~TTS (tts-1, tts-1-hd, gpt-4o-mini-tts)~~ | ✅ Done (PR #4477) | -- | -| ~~MEDIUM~~ | compat-oai | ~~STT (whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe)~~ | ✅ Done (PR #4477) | -- | -| **MEDIUM** | microsoft-foundry | DALL-E, TTS, Whisper | Mirrors compat-oai gaps | Medium | -| **LOW** | xai | Image gen (grok-2-image-1212) | Single model missing | Medium (new handler) | -| **LOW** | compat-oai | Vision models (gpt-4-vision*), gpt-4-32k | Older models, multimodal works via gpt-4o | Low (add model defs) | -| **LOW** | ollama | `media`, `toolChoice` metadata | Cosmetic only, no functional impact | Trivial | - ---- - -## Dependency Graph - -All tasks for Phase 1 and their dependency relationships: - -``` -DEPENDENCY GRAPH -================ - - +-----------------+ +-----------------+ - | fix-imagen-gap | | setup-dir | - | (P0) | | (P0) | - +----+-------+----+ +--+---------+--+-+ - | | | | | - | +----+---------------+ | | - | | | | | - +----v--v-+ +----v-----------+ +-----v--+ +-----v--------+ - | symlink | | entry- | | spec- | | spec- | - | gemini | | google-genai | | anthr. | | compat-oai | - | (P1) | | (P1) | | (P1) | | (P1) | - +----+----+ +-------+--------+ +---+----+ +-----+--------+ - | | | | - +-------+-------+-------+-------+--------------+ - | - +----v-----------+ - | runner-script | - | (P2) | - +----+-----------+ - | - +----v-----------+ - | validate- | - | google-genai | - | (P3) | - +----------------+ -``` - -**Edge list (A -> B means "A must complete before B can start"):** - -- `fix-imagen-gap` -> `symlink-gemini-spec` -- `fix-imagen-gap` -> `entry-google-genai` -- `setup-dir` -> `symlink-gemini-spec` -- `setup-dir` -> `entry-google-genai` -- `setup-dir` -> `spec-anthropic` -- `setup-dir` -> `spec-compat-oai` -- `symlink-gemini-spec` -> `runner-script` -- `entry-google-genai` -> `runner-script` -- `spec-anthropic` -> `runner-script` -- `spec-compat-oai` -> `runner-script` -- `runner-script` -> `validate-google-genai` - ---- - -## Phased Execution Plan (Reverse Topological Order) - -Execute each phase to completion before starting the next. **All tasks within a -phase are independent and should run in parallel** for fastest completion. - -**Critical path:** `fix-imagen-gap` -> `symlink-gemini-spec` -> `runner-script` --> `validate-google-genai` - -### Phase 0: Leaves ✅ COMPLETE - -| Task | Description | File(s) | Effort | Status | -|------|-------------|---------|--------|--------| -| `fix-imagen-gap` | GoogleAI already registers Imagen under `googleai/` (verified in code) | `google.py` lines 378-380, 523-527, 596-601 | N/A | ✅ Already done | -| `setup-dir` | Created `py/tests/conformance/` with dirs for all 10 plugins | `py/tests/conformance/{google-genai,anthropic,compat-oai,...}/` | Trivial | ✅ Done | - -**Parallelizable:** Yes, both tasks are independent. 
- -### Phase 1: Specs + Entry Points ✅ COMPLETE - -| Task | Description | Depends On | File(s) | Status | -|------|-------------|------------|---------|--------| -| `symlink-gemini-spec` | Symlinked JS spec into conformance dir | P0 | `google-genai/model-conformance.yaml` → JS spec | ✅ Done | -| `entry-google-genai` | Minimal google-genai entry point | P0 | `google-genai/conformance_entry.py` | ✅ Done | -| `spec-anthropic` | Anthropic entry point + YAML spec | P0 | `anthropic/{conformance_entry.py,model-conformance.yaml}` | ✅ Done | -| `spec-compat-oai` | compat-oai entry point + YAML spec (gpt-4o, gpt-4o-mini, dall-e-3, tts-1) | P0 | `compat-oai/{conformance_entry.py,model-conformance.yaml}` | ✅ Done (updated with multimodal, PR #4477) | - -**Note:** All 10 plugins (including Phase 2 plugins) have entry points and specs. - -### Phase 2: Orchestration ✅ COMPLETE - -| Task | Description | Depends On | File(s) | Status | -|------|-------------|------------|---------|--------| -| `runner-script` | Shell script to orchestrate per-plugin conformance test runs | All Phase 1 tasks | `py/bin/test-model-conformance` | ✅ Done | - -### Phase 2.5: Spec Audit + Model Updates ✅ COMPLETE - -| Task | Description | File(s) | Status | -|------|-------------|---------|--------| -| `audit-specs` | Verified all 11 plugin specs against official provider documentation (Feb 11, 2026). Fixed model names, corrected Supports flags, added missing models. Total: 24 models across 11 plugins. | All `model-conformance.yaml` files | ✅ Done | - -**Changes made during audit:** - -| Plugin | Before | After | Changes | -|--------|--------|-------|---------| -| **anthropic** | 2 models | 4 models | Added claude-sonnet-4-5, claude-opus-4-6 | -| **deepseek** | 1 model (no structured-output) | 2 models | Added structured-output to chat, added deepseek-reasoner (no tools) | -| **xai** | 1 model (grok-3, legacy) | 2 models | Replaced grok-3 → grok-4-fast-non-reasoning, added grok-2-vision-1212 | -| **mistral** | 1 model (no vision) | 2 models | Added vision tests, added mistral-large-latest | -| **amazon-bedrock** | Missing structured-output | Fixed | Added structured-output, streaming-structured-output | -| **cloudflare** | Missing tool-request | Fixed | Added tool-request, streaming-multiturn | -| **ollama** | Missing tool-request, vision | Fixed | Added tool-request, input-image-base64 | - -### Phase 3: Validation ⏳ PENDING - -| Task | Description | Depends On | File(s) | Status | -|------|-------------|------------|---------|--------| -| `validate-google-genai` | Manual end-to-end validation with live API via `genkit dev:test-model` | `runner-script` | -- (manual run) | ⏳ Not yet run | - -### Execution Timeline - -``` -TIME --> -========================================================================== - -P0: [fix-imagen-gap ~~~~~~~~~~~~] [setup-dir ~~~] - (parallel) (parallel) - | - --- all P0 complete ----------------+-------- - | -P1: [symlink-gemini-spec ~] [entry-google-genai ~] - [spec-anthropic ~~~~~~] [spec-compat-oai ~~~~] - (all 4 in parallel) - | - --- all P1 complete --- - | -P2: [runner-script ~~~~~~~~~~~~] - | -P2.5:[audit-specs ~~~~~~~~~] - | -P3: [conform tool ~~~~~~~~~~~~~~~] ← native runner, unified table - | -P4: [validate-google-genai ~~~~] - | - === PHASE 1 SCOPE COMPLETE === -``` - -### Phase 3: Conform CLI Tool + Native Runner ✅ COMPLETE - -| Task | Description | File(s) | Status | -|------|-------------|---------|--------| -| `conform-cli` | Multi-runtime CLI tool (`py/tools/conform/`) | `cli.py`, 
`config.py`, `runner.py`, etc. | ✅ Done (PR #4593) | -| `native-runner` | In-process runner for Python, reflection runner for JS/Go | `test_model.py`, `reflection.py` | ✅ Done | -| `validators` | 10 validators ported 1:1 from JS canonical source | `validators/*.py` | ✅ Done | -| `unified-table` | Single table with Runtime column across runtimes | `display.py`, `types.py` | ✅ Done | -| `global-flags` | `--runtime` accepts matrix (e.g., `python go`), shown in subcommand help | `cli.py` | ✅ Done | -| `remove-test-model` | Merged into `check-model` (native runner is default, `--use-cli` for legacy) | `cli.py` | ✅ Done | - -### Phase 4: Validation ⏳ PENDING - ---- - -## What To Build - -### Prerequisite: Fix Imagen Gap in Python google-genai Plugin - -The JS plugin supports Imagen under the `googleai/` prefix but the Python plugin -only registers it under `vertexai/`. The `ImagenModel` class is already -client-agnostic (uses `client.aio.models.generate_images()` which works for -both); only the registration code needs updating. - -**File:** `py/plugins/google-genai/src/genkit/plugins/google_genai/google.py` - -**Changes (~20 lines):** - -1. **`GoogleAI.init()`** -- Add Imagen model loop after Gemini registration: - ```python - for name in genai_models.imagen: - actions.append(self._resolve_model(googleai_name(name))) - ``` -2. **`GoogleAI._resolve_model()`** -- Add Imagen detection branch (mirror - VertexAI logic): - ```python - if clean_name.lower().startswith('imagen'): - model_ref = vertexai_image_model_info(clean_name) - model = ImagenModel(clean_name, self._client) - IMAGE_SUPPORTED_MODELS[clean_name] = model_ref - config_schema = ImagenConfigSchema - # ... create and return Action - ``` -3. **`GoogleAI.list_actions()`** -- Include Imagen in discovered actions list: - ```python - for name in genai_models.imagen: - actions_list.append( - model_action_metadata( - name=googleai_name(name), - info=vertexai_image_model_info(name).model_dump(by_alias=True), - config_schema=ImagenConfigSchema, - ) - ) - ``` - -### Directory Layout - -All conformance testing files live under `py/tests/conform/`: - -``` -py/tests/conform/ - google-genai/ - conformance_entry.py # minimal Genkit entry point - model-conformance.yaml -> symlink # -> js/plugins/google-genai/tests/model-tests-tts.yaml - anthropic/ - conformance_entry.py - model-conformance.yaml # anthropic-specific spec - compat-oai/ - conformance_entry.py - model-conformance.yaml # openai-specific spec - ...13 plugins total... -py/tools/conform/ # conform CLI tool - src/conform/ - cli.py # arg parsing + dispatch - config.py # TOML config loader - runner.py # legacy genkit CLI runner - test_model.py # native runner + ActionRunner Protocol - reflection.py # async HTTP client for reflection API - validators/ # 10 validators (1:1 with JS) -py/bin/conform # wrapper script -``` - -### Entry Point Template - -Each plugin gets a minimal Python script that initializes Genkit with just that -plugin. The reflection server starts automatically in dev mode (`GENKIT_ENV=dev`, -set by `genkit start`). 
- -```python -"""Minimal entry point for model conformance testing via genkit dev:test-model.""" -import asyncio -from genkit.ai import Genkit -from genkit.plugins.google_genai import GoogleAI # varies per plugin - -ai = Genkit(plugins=[GoogleAI()]) - -async def main(): - while True: - await asyncio.sleep(3600) - -if __name__ == '__main__': - ai.run_main(main()) -``` - -### Spec Files - -**google-genai:** Symlink to the JS spec file so both runtimes test the same -models with the same expectations: - -```bash -# From py/tests/conformance/google-genai/ -ln -s "$(git rev-parse --show-toplevel)/js/plugins/google-genai/tests/model-tests-tts.yaml" model-conformance.yaml -``` - -The JS spec tests: -- `googleai/imagen-4.0-generate-001` (output-image) -- `googleai/gemini-2.5-flash-preview-tts` (custom TTS test) -- `googleai/gemini-2.5-pro` (tool-request, structured-output, multiturn, system-role, image-base64, image-url, video-youtube) -- `googleai/gemini-3-pro-preview` (same + reasoning, streaming, tool-response custom tests) -- `googleai/gemini-2.5-flash` (same as gemini-2.5-pro) - -Env: `GEMINI_API_KEY` - -**anthropic:** New spec. Models: `anthropic/claude-sonnet-4` and -`anthropic/claude-haiku-4-5`. Tests: tool-request, multiturn, system-role, -input-image-base64, input-image-url, streaming-multiturn, streaming-tool-request. -Haiku-4-5 adds structured-output and streaming-structured-output. - -Env: `ANTHROPIC_API_KEY` - -**compat-oai (OpenAI):** New spec. Models: `openai/gpt-4o` and -`openai/gpt-4o-mini`. Tests: tool-request, structured-output, multiturn, -system-role, input-image-base64, input-image-url, streaming-multiturn, -streaming-tool-request, streaming-structured-output. - -Env: `OPENAI_API_KEY` - -### Conform CLI Tool - -**Location:** `py/bin/conform` (wrapper) → `py/tools/conform/` - -```bash -# Usage: -conform check-model # test all plugins, all runtimes -conform check-model anthropic xai # test specific plugins -conform --runtime python go check-model # matrix: python + go only -conform check-model --use-cli # legacy genkit CLI fallback -conform list # show readiness table -conform check-plugin # lint-time file check -``` - -The tool: -- Uses the native runner by default (in-process for Python, async HTTP for JS/Go) -- Falls back to `genkit dev:test-model` subprocess with `--use-cli` -- Runs across all configured runtimes by default (`--runtime` for matrix) -- Shows a unified table with Runtime column across runtimes -- Reports aggregate pass/fail and exits non-zero on failure - -> **Note:** [`uv`](https://docs.astral.sh/uv/) is the project's standard Python -> package manager and task runner, already used throughout the repository (see -> `py/pyproject.toml` workspace configuration and `py/bin/` scripts). It is -> installed as part of the developer setup via `bin/setup`. 
- -### Built-in Test Capabilities - -The following test types are available from `dev:test-model` (from -[dev-test-model.ts lines 254-476][dev-test-model]): - -| Test | Description | -|------|-------------| -| `tool-request` | Tool/function calling conformance | -| `structured-output` | JSON schema output | -| `multiturn` | Multi-turn conversation | -| `streaming-multiturn` | Streaming + multiturn | -| `streaming-tool-request` | Streaming tool calls | -| `streaming-structured-output` | Streaming structured output | -| `system-role` | System message handling | -| `input-image-base64` | Base64 image input | -| `input-image-url` | URL image input | -| `input-video-youtube` | YouTube video input | -| `output-audio` | TTS/audio output | -| `output-image` | Image generation | - -### Built-in Validators - -`has-tool-request[:toolName]`, `valid-json`, `text-includes:expected`, -`text-starts-with:prefix`, `text-not-empty`, `valid-media:type`, `reasoning`, -plus streaming variants (`stream-text-includes`, `stream-has-tool-request`, -`stream-valid-json`). - ---- - -## Phase 2 (Future -- after Phase 1 validated) - -Add conformance specs for remaining plugins. The parity analysis above informs -which capabilities to test per plugin: - -| Plugin | Test Capabilities | Notes | -|--------|-------------------|-------| -| **mistral** | tool-request, structured-output, multiturn, system-role, streaming-multiturn, input-image-base64, input-image-url | All Large 3/Medium 3.1/Small 3.2/Ministral 3/Magistral support vision. Voxtral adds audio input. | -| **deepseek** | tool-request, structured-output, multiturn, system-role, streaming-multiturn | | -| **xai** | tool-request, structured-output, multiturn, system-role, streaming-multiturn | grok-2-vision adds input-image | -| **ollama** | tool-request, structured-output, multiturn, system-role | Depends on locally installed model | -| **amazon-bedrock** | tool-request, structured-output, multiturn, system-role, streaming-multiturn, input-image-base64 | Model-dependent | -| **huggingface** | tool-request, structured-output, multiturn, system-role | Model-dependent | -| **microsoft-foundry** | tool-request, structured-output, multiturn, system-role, streaming-multiturn, input-image-base64 | Model-dependent | -| **cloudflare-workers-ai** | tool-request, structured-output, multiturn, system-role | Model-dependent | - ---- - -## CI Integration Notes - -- These are **live API tests** -- they call real model endpoints. Do NOT run in - standard CI. -- Gate behind manual trigger or CI label (e.g., `run-conformance-tests`). -- Each plugin requires its own API key/credentials. -- Consider a `--dry-run` mode in the runner script that validates spec files - parse correctly without making API calls. 
- ---- - -## Effort Estimates - -| Phase | Tasks | Effort | Parallelizable | -|-------|-------|--------|----------------| -| **P0** | 2 tasks (fix-imagen-gap, setup-dir) | ~1 hour | Yes | -| **P1** | 4 tasks (symlink, entry, 2 specs) | ~2 hours | Yes | -| **P2** | 1 task (runner script) | ~1 hour | No | -| **P3** | 1 task (E2E validation) | ~1 hour | No | -| **Total** | 8 tasks | ~3-5 hours (with parallelism) | | diff --git a/py/engdoc/parity-analysis/feature_parity_analysis.md b/py/engdoc/parity-analysis/feature_parity_analysis.md deleted file mode 100644 index ba2961a780..0000000000 --- a/py/engdoc/parity-analysis/feature_parity_analysis.md +++ /dev/null @@ -1,653 +0,0 @@ -# Genkit Feature Parity Analysis: JS vs Python - -This document analyzes feature gaps and behavioral differences between the JavaScript (canonical) implementation and the Python implementation of Genkit. - ---- - -## Executive Summary - -| Category | JS | Python | Gap | -|----------|-----|--------|-----| -| Plugins | 18 | 13 | 5 missing | -| Core API Methods | ~45 | ~35 | 10+ missing | -| Session/Chat | ✅ | ❌ | **Critical Gap** | -| Background Actions | ✅ | ❌ | **Critical Gap** | -| Dynamic Action Provider | ✅ | ❌ | Significant Gap | - ---- - -## 1. Missing Core Features - -### 1.1 Session & Chat (Critical Gap) - -> [!CAUTION] -> Python lacks stateful conversation management entirely. - -**JS has:** -- [Session](/js/ai/src/session.ts) class with: - - `updateState(data)` - Update session state - - `updateMessages(thread, messages)` - Manage thread history - - `chat()` - Create chat sessions with thread support - - `run(fn)` - Execute within session context - - `toJSON()` - Serialize session -- [Chat](/js/ai/src/chat.ts) class with: - - `send(options)` - Send message with history - - `sendStream(options)` - Streaming with history - - `messages()` - Get conversation history - - Thread management (multiple conversations per session) -- `SessionStore` interface for persistence -- `ai.createSession()` and `ai.chat()` veneer methods - -**Python has:** Nothing equivalent. No way to maintain conversation history across multiple `generate()` calls without manual message management. - ---- - -### 1.2 Background Actions & Background Models (Critical Gap) - -> [!CAUTION] -> Python lacks long-running operation support. - -**JS has:** -- [BackgroundAction](/js/core/src/background-action.ts): - - `start(input, options)` - Start background operation - - `check(operation)` - Check operation status - - `cancel(operation)` - Cancel running operation -- `Operation` type with `id`, `done`, `output`, `error`, `metadata` -- `defineBackgroundAction()` - Register background actions -- `defineBackgroundModel()` - Register models that return operations (e.g., video generation) -- [ai.checkOperation()](/js/genkit/src/genkit.ts#L866-L886) - Veneer method - -**Python has:** Nothing. Cannot use models like Veo that return operations for later retrieval. - ---- - -### 1.3 Dynamic Action Provider (Significant Gap) - -**JS has:** -- [DynamicActionProvider](/js/core/src/dynamic-action-provider.ts): - - Caching with configurable TTL - - `invalidateCache()` - Force refresh - - `getAction(type, name)` - Resolve action dynamically - - `listActionMetadata(type, name)` - List available actions - - Used by MCP plugin for dynamic tool discovery - -**Python has:** Nothing. The MCP plugin must pre-register all actions. 
- ---- - -### 1.4 Missing Veneer Methods - -| JS Method | Description | Python Status | -|-----------|-------------|---------------| -| `ai.createSession()` | Create stateful session | ❌ Missing | -| `ai.chat()` | Quick chat session | ❌ Missing | -| `ai.currentSession()` | Get active session | ❌ Missing | -| `ai.checkOperation()` | Check background op | ❌ Missing | -| `ai.defineSimpleRetriever()` | Simplified retriever | ❌ Missing | -| `ai.defineBackgroundModel()` | Background model | ❌ Missing | -| `ai.defineDynamicActionProvider()` | DAP registration | ❌ Missing | -| `ai.defineJsonSchema()` | JSON Schema registration | ❌ Missing | -| `ai.dynamicTool()` | Unregistered tools | ❌ Missing | -| `ai.run()` | Named trace step | ❌ Missing | -| `ai.embedMany()` | Bulk embedding | ❌ Missing | -| `ai.index()` | Indexing veneer | ❌ Missing | - ---- - -## 2. Plugin Gaps - -### 2.1 Missing Plugins (5) - -| JS Plugin | Description | Priority | -|-----------|-------------|----------| -| `@genkit-ai/checks` | Google Checks for safety | Medium | -| `@genkit-ai/chroma` | Chroma vector store | Low | -| `@genkit-ai/cloud-sql-pg` | Cloud SQL PostgreSQL | Medium | -| `@genkit-ai/pinecone` | Pinecone vector store | Medium | -| `@genkit-ai/langchain` | LangChain integration | Low | -| `@genkit-ai/next` | Next.js integration | N/A (Python irrelevant) | -| `@genkit-ai/express` | Express integration | N/A (Flask exists) | -| `@genkit-ai/googleai` | Legacy Google AI | Being deprecated | - -### 2.2 Plugin Feature Gaps - -#### Vertex AI Plugin - -| Feature | JS | Python | -|---------|-----|--------| -| Gemini Models | ✅ | ✅ | -| Imagen Models | ✅ | ✅ (via google-genai) | -| Embedders | ✅ | Limited | -| Rerankers | ✅ | ❌ Missing | -| Context Caching | ✅ | ❌ Missing | -| Vector Search | ✅ | ✅ | -| Evaluation | ✅ | ❌ Missing | -| Model Garden | ✅ | ✅ | - -#### Google GenAI Plugin - -| Feature | JS | Python | -|---------|-----|--------| -| Gemini Models | ✅ | ✅ | -| Imagen Models | ✅ | ✅ | -| Embedders | ✅ | ✅ | -| Context Caching | ✅ | ❌ Missing | -| Live/Realtime | ✅ | ❌ Missing | - ---- - -## 2.3 Prompt API (`ai.prompt()` / `ai.definePrompt()`) - -### JS API - -```typescript -// Lookup prompt by name -ai.prompt(name: string, options?: { variant?: string }) - : ExecutablePrompt - -// Define prompt with config + template/function -ai.definePrompt({ - name: string, - model?: string, - input?: { schema: z.ZodSchema }, - output?: { schema: z.ZodSchema }, - config?: GenerationConfig, - messages?: string | ((input) => Message[]), // Template string or function - tools?: ToolRef[], -}, templateOrFn?) 
-``` - -**Key JS Features:** -- Generic type parameters `` for type-safe input/output -- `messages` can be a Dotprompt template string -- Returns `ExecutablePrompt` with `()` call and `.stream()` method -- Automatic `.prompt` file loading from `promptDir` - -### Python API - -```python -# Lookup prompt by name -await ai.prompt(name: str, variant: str | None = None) - -> ExecutablePrompt - -# Define prompt with explicit kwargs -ai.define_prompt( - name: str | None = None, - variant: str | None = None, - model: str | None = None, - config: GenerationCommonConfig | dict | None = None, - description: str | None = None, - input_schema: type | dict | str | None = None, - system: str | Part | list[Part] | Callable | None = None, - prompt: str | Part | list[Part] | Callable | None = None, - messages: str | list[Message] | Callable | None = None, - output_format: str | None = None, - output_content_type: str | None = None, - output_instructions: bool | str | None = None, - output_schema: type | dict | str | None = None, - output_constrained: bool | None = None, - max_turns: int | None = None, - return_tool_requests: bool | None = None, - tools: list[str] | None = None, - tool_choice: ToolChoice | None = None, - use: list[ModelMiddleware] | None = None, - docs: list[DocumentData] | Callable | None = None, - metadata: dict | None = None, -) -``` - -**Key Differences:** - -| Feature | JS | Python | Notes | -|---------|-----|--------|-------| -| Type generics | ✅ `` | ❌ | No typed input/output | -| Sync lookup | ✅ Sync | ❌ `async` | Python requires `await` | -| Separate `system` param | ❌ | ✅ | Python has dedicated system param | -| Separate `prompt` param | ❌ | ✅ | Python can pass prompt separately | -| `output_*` params | Combined in `output` | ✅ Explicit | More granular in Python | -| `docs` param | ❌ | ✅ | Python has docs for RAG | -| `max_turns` | In config | ✅ Direct | Easier access in Python | -| Template strings | ✅ Dotprompt | ✅ Handlebars | Both support templates | -| `.prompt` file loading | ✅ Auto | ✅ Auto | Both support file loading | - -### ExecutablePrompt Comparison - -**JS:** -```typescript -const result = await myPrompt({ name: 'value' }); -const { stream, response } = await myPrompt.stream({ name: 'value' }); -``` - -**Python:** -```python -result = await my_prompt(name='value') -stream, response = await my_prompt.stream(name='value') -``` - -> [!NOTE] -> Python's prompt API is **more complete** in some ways (explicit `system`, `docs`, `max_turns`), but lacks the type safety of JS generics. 
- ---- - -## 2.4 Complete API Surface Comparison - -### Method Parity Matrix - -| Method | JS | Python | Notes | -|--------|-----|--------|-------| -| **Core Generation** | -| `generate()` | ✅ | ✅ | Both support | -| `generateStream()` | ✅ | ✅ `generate_stream()` | Name differs | -| `checkOperation()` | ✅ | ❌ | Python missing (needed for Veo) | -| **Prompts** | -| `prompt()` | ✅ Sync | ✅ `async` | Python requires await | -| `definePrompt()` | ✅ | ✅ `define_prompt()` | Both support | -| **Flows** | -| `defineFlow()` | ✅ | ✅ `@ai.flow()` decorator | Python uses decorator | -| `run()` | ✅ | ❌ | Step tracing missing in Python | -| `currentContext()` | ✅ | ❌ | Python missing | -| **Tools** | -| `defineTool()` | ✅ | ✅ `@ai.tool()` decorator | Python uses decorator | -| `dynamicTool()` | ✅ | ❌ | Python missing | -| **Models** | -| `defineModel()` | ✅ | ✅ `define_model()` | Both support | -| `defineBackgroundModel()` | ✅ | ❌ | Python missing | -| **RAG** | -| `retrieve()` | ✅ | ✅ | Both support | -| `index()` | ✅ | ✅ | Both support | -| `defineRetriever()` | ✅ | ✅ `define_retriever()` | Both support | -| `defineSimpleRetriever()` | ✅ | ❌ | Python missing | -| `defineIndexer()` | ✅ | ✅ `define_indexer()` | Both support | -| **Embeddings** | -| `embed()` | ✅ | ✅ | Both support | -| `embedMany()` | ✅ | ❌ | Python missing | -| `defineEmbedder()` | ✅ | ✅ `define_embedder()` | Both support | -| **Reranking** | -| `rerank()` | ✅ | ✅ | Both support | -| `defineReranker()` | ✅ | ✅ `define_reranker()` | Both support | -| **Evaluation** | -| `evaluate()` | ✅ | ✅ | Both support | -| `defineEvaluator()` | ✅ | ✅ `define_evaluator()` | Both support | -| — | — | ✅ `define_batch_evaluator()` | Python extra | -| **Schemas** | -| `defineSchema()` | ✅ | ✅ `define_schema()` | Both support | -| `defineJsonSchema()` | ✅ | ❌ | Python missing | -| **Templates** | -| `defineHelper()` | ✅ | ✅ `define_helper()` | Both support | -| `definePartial()` | ✅ | ✅ `define_partial()` | Both support | -| **Dynamic Actions** | -| `defineDynamicActionProvider()` | ✅ | ❌ | Python missing (MCP) | -| **Formats** | -| — | — | ✅ `define_format()` | Python extra | -| **Resources** | -| — | — | ✅ `define_resource()` | Python extra | -| **Session/Chat** | -| `createSession()` | ✅ | ❌ | Python missing | -| `loadSession()` | ✅ | ❌ | Python missing | -| `chat()` | ✅ | ❌ | Python missing | -| **Lifecycle** | -| `configure()` | ✅ | ❌ | Python uses constructor | -| `stopServers()` | ✅ | ❌ | Python missing | -| `run_main()` | ❌ | ✅ | Python extra | - -### Critical Missing APIs (Python) - -| API | Use Case | Priority | -|-----|----------|----------| -| `checkOperation()` | Poll long-running ops (Veo, Imagen) | P0 | -| `createSession()`/`loadSession()` | Stateful multi-turn | P0 | -| `chat()` | Simple chat interface | P0 | -| `run()` | Step tracing in flows | P1 | -| `dynamicTool()` | Runtime tool creation | P1 | -| `defineBackgroundModel()` | Long-running models | P1 | -| `defineDynamicActionProvider()` | MCP host support | P2 | -| `currentContext()` | Auth/context access | P2 | -| `embedMany()` | Batch embedding | P2 | -| `defineSimpleRetriever()` | Quick retriever setup | P3 | -| `defineJsonSchema()` | Register JSON schemas | P3 | - -### Python Extras (Not in JS) - -| API | Use Case | -|-----|----------| -| `define_batch_evaluator()` | Evaluate entire dataset at once | -| `define_format()` | Register custom output formats | -| `define_resource()` | Register MCP resources | -| `run_main()` | Dev server entry point | - ---- - -## 2.5 Telemetry and 
Tracing Comparison - -### Tracing Infrastructure - -| Feature | JS | Python | Notes | -|---------|-----|--------|-------| -| **OpenTelemetry SDK** | ✅ `@opentelemetry/sdk-node` | ✅ `opentelemetry-sdk` | Both use OTel | -| **TracerProvider** | ✅ | ✅ | Both configure | -| **SimpleSpanProcessor** | ✅ (dev) | ✅ (dev) | Same pattern | -| **BatchSpanProcessor** | ✅ (prod) | ✅ (prod) | Same pattern | -| **RealtimeSpanProcessor** | ✅ | ✅ | Parity achieved | -| **Configurable via env** | ✅ `GENKIT_ENABLE_REALTIME_TELEMETRY` | ✅ | Parity achieved | - -### Realtime Tracing - -> [!NOTE] -> Both JS and Python now have `RealtimeSpanProcessor` that exports spans on **both start AND end**, enabling live trace visualization during development. - -**JS RealtimeSpanProcessor:** -```typescript -class RealtimeSpanProcessor implements SpanProcessor { - onStart(span: Span): void { - // Export immediately for real-time updates - this.exporter.export([span], () => {}); - } - onEnd(span: ReadableSpan): void { - // Export completed span - this.exporter.export([span], () => {}); - } -} -``` - -**Python:** Equivalent implementation in `genkit.core.trace.realtime_processor`: -```python -class RealtimeSpanProcessor(SpanProcessor): - def on_start(self, span: Span, parent_context: Context | None = None) -> None: - # Export immediately for real-time updates - self._exporter.export([span]) - - def on_end(self, span: ReadableSpan) -> None: - # Export completed span - self._exporter.export([span]) -``` - -### Span Exporters - -| Exporter | JS | Python | Notes | -|----------|-----|--------|-------| -| **TelemetryServerExporter** | ✅ `TraceServerExporter` | ✅ `TelemetryServerSpanExporter` | Both have | -| **GCP Cloud Trace** | ✅ | ✅ | Both via plugins | -| **AdjustingTraceExporter** | ✅ (redacts content) | ✅ | Parity achieved | -| **Custom exporter API** | ✅ | ✅ `add_custom_exporter()` | Both support | - -### Telemetry Configuration API - -| API | JS | Python | Notes | -|-----|-----|--------|-------| -| `enableTelemetry(config)` | ✅ | ❌ | Python auto-configures | -| `flushTracing()` | ✅ | ✅ `ai.flush_tracing()` | Parity achieved | -| `cleanUpTracing()` | ✅ | ❌ | Python no cleanup | -| `TelemetryConfig` type | ✅ `Partial` | ❌ | Python untyped | - -### Metrics - -| Metric | JS (GCP Plugin) | Python (GCP Plugin) | -|--------|----------------|---------------------| -| `genkit/ai/generate/requests` | ✅ | ✅ | -| `genkit/ai/generate/failures` | ✅ | ✅ | -| `genkit/ai/generate/latency` | ✅ | ✅ | -| `genkit/ai/generate/input/tokens` | ✅ | ✅ | -| `genkit/ai/generate/output/tokens` | ✅ | ✅ | -| `genkit/ai/generate/input/characters` | ✅ | ✅ | -| `genkit/ai/generate/output/characters` | ✅ | ✅ | -| `genkit/ai/generate/input/images` | ✅ | ✅ | -| `genkit/ai/generate/output/images` | ✅ | ✅ | -| `genkit/ai/generate/input/videos` | ✅ | ✅ | -| `genkit/ai/generate/output/videos` | ✅ | ✅ | -| `genkit/ai/generate/input/audio` | ✅ | ✅ | -| `genkit/ai/generate/output/audio` | ✅ | ✅ | - -> [!NOTE] -> **Metrics parity is good!** Both JS and Python google-cloud plugins record the same AI monitoring metrics. 
- -### GCP Plugin Comparison - -| Feature | JS `@genkit-ai/google-cloud` | Python `google-cloud` | -|---------|------------------------------|----------------------| -| **Cloud Trace export** | ✅ | ✅ | -| **Cloud Metrics export** | ✅ | ✅ | -| **Automatic instrumentation** | ✅ (Pino, Winston) | ❌ | -| **Span adjustment/redaction** | ✅ `AdjustingTraceExporter` | ✅ | Parity achieved | -| **Feature markers** | ✅ (marks genkit spans) | ❌ | - -### Telemetry Gaps Summary - -| Gap | Priority | Status | -|-----|----------|--------| -| ~~**RealtimeSpanProcessor**~~ | ~~P1~~ | ✅ Implemented - live tracing now works | -| ~~**Span redaction**~~ | ~~P2~~ | ✅ Implemented - `AdjustingTraceExporter` | -| ~~**flushTracing() API**~~ | ~~P2~~ | ✅ Implemented - `ai.flush_tracing()` | -| **Logging instrumentation** | P3 | Logs not auto-correlated | -| **enableTelemetry() config** | P3 | Less flexibility | - ---- - -## 3. Model Configuration Not Showing in DevUI - -> [!CAUTION] -> **Critical Bug**: Model configuration options don't appear in DevUI for Python Google GenAI models. - -### Root Cause - -Python's `google.py` has `config_schema` **commented out** when calling `model_action_metadata()`: - -```python -# google.py line 484-490 -actions_list.append( - model_action_metadata( - name=vertexai_name(name), - info=google_model_info(name).model_dump(), - # config_schema=GeminiConfigSchema, # <-- COMMENTED OUT! - ), -) -``` - -Compare to JS which **always passes `configSchema`**: - -```typescript -// gemini.ts line 547-551 -return modelActionMetadata({ - name: ref.name, - info: ref.info, - configSchema: ref.configSchema, // <-- Always included! -}); -``` - -### Impact - -- DevUI cannot display model configuration options (temperature, topP, safety settings, etc.) -- Users cannot adjust model parameters through the UI -- Affects both GoogleAI and VertexAI plugins - -### Fix Required - -1. Uncomment `config_schema=GeminiConfigSchema` in `_resolve_model()` and `list_actions()` -2. Ensure `GeminiConfigSchema` is properly exported and includes all options -3. Verify JSON schema serialization works for Pydantic models - -### Files to Fix -- [google.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/google.py#L243) - `list_actions()` -- [google.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/google.py#L455) - VertexAI `list_actions()` -- [gemini.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/models/gemini.py#L179-L187) - `GeminiConfigSchema` definition - ---- - -## 4. Behavioral Differences - -> [!IMPORTANT] -> These are differences in how features behave, not missing features. - -### 4.1 Streaming Response Handling - -**JS:** Returns `GenerateStreamResponse` with both `stream` (async iterable) and `response` (promise) accessible simultaneously. Stream chunks are also available via `onChunk` callback. - -```typescript -const { response, stream } = ai.generateStream({...}); -for await (const chunk of stream) { - console.log(chunk.text); -} -const final = await response; -``` - -**Python:** Returns GenerateResponseWrapper that requires different access patterns. The `stream` attribute returns an async generator. - -```python -response = await ai.generate_stream(...) -async for chunk in response.stream: - print(chunk.text) -final = await response.response -``` - -**Action Needed:** Verify streaming API ergonomics match JS patterns. 
- ---- - -### 4.2 Tool Call Response Handling - -**JS:** Tools can return `Part[]` when configured with `multipart: true`, allowing rich responses with multiple content types. - -```typescript -ai.defineTool({ multipart: true }, async (input) => { - return [ - { text: "Analysis:" }, - { media: { url: "data:..." } } - ]; -}); -``` - -**Python:** Tools return a single value that gets wrapped. No explicit multipart support. - -**Action Needed:** Add multipart tool support to Python. - ---- - -### 4.3 Output Schema Validation Behavior - -**JS:** Uses Zod schemas. When `output.schema` is specified, attempts to parse and validate response. Returns typed `response.output`. - -**Python:** Uses Pydantic models. Schema is converted to JSON Schema for the model request, but response parsing may differ. - -**Action Needed:** Verify output parsing behavior matches, especially for: -- Partial JSON handling -- Array extraction -- Nested object validation - ---- - -### 4.4 Prompt Resolution Order - -**JS:** -1. Looks in prompt cache (already loaded) -2. Loads from `promptDir` (default: `./prompts`) -3. Checks registered prompts via `definePrompt()` - -**Python:** -1. Checks registry for registered prompts -2. Loads from prompt directory if configured - -**Action Needed:** Verify prompt loading order and caching behavior. - ---- - -### 4.5 Model Middleware Execution - -**JS:** Middleware wraps the generate call with before/after hooks, can modify request and response. - -**Python:** Similar structure but async/await patterns differ. - -**Action Needed:** Verify middleware execution order and error handling. - ---- - -## 5. Type System Differences - -### 5.1 Part Type Construction - -**JS:** -```typescript -{ text: "hello" } // Direct construction -{ media: { url: "..." } } -``` - -**Python:** (After recent fixes) -```python -Part(root=TextPart(text="hello")) # Must use root= -Part(root=MediaPart(media=Media(url="..."))) -``` - -**Status:** Being addressed - needs consistent patterns documented. - ---- - -### 5.2 Schema Registration - -**JS:** -```typescript -ai.defineSchema('Recipe', RecipeSchema); // Zod schema -ai.defineJsonSchema('Recipe', {...}); // JSON Schema -``` - -**Python:** -```python -ai.define_schema('Recipe', Recipe) # Pydantic model only -``` - -**Action Needed:** Add `define_json_schema()` to Python. - ---- - -## 6. Testing & Tooling Gaps - -| Feature | JS | Python | -|---------|-----|--------| -| Echo Model | ✅ | ✅ | -| Programmable Model | ✅ | ❌ | -| Test Action | ✅ | Limited | -| Trace Viewer | ✅ | ✅ | - ---- - -## 7. Priority Recommendations - -### P0 - Critical (Block key use cases) - -1. **Model Config Schema Bug** - DevUI cannot show model parameters (commented out in Python) -2. **Session/Chat** - Required for conversational AI applications -3. **Background Actions** - Required for video/image generation models - -### P1 - High (Significant functionality gaps) - -4. **Context Caching** - Cost optimization for long contexts -5. **Dynamic Action Provider** - MCP and plugin extensibility -6. **Vertex Rerankers** - RAG quality improvement -7. **Vertex Evaluation** - Built-in quality assessment - -### P2 - Medium (Completeness) - -8. **`defineSimpleRetriever()`** - Developer convenience -9. **`ai.run()`** - Trace step naming -10. **`embedMany()`** - Batch embedding efficiency -11. **Multipart tools** - Rich tool responses - -### P3 - Low (Nice to have) - -12. **Chroma plugin** -13. **Pinecone plugin** -14. **Cloud SQL plugin** - ---- - -## 8. 
Files Reference - -### JS Core -- [genkit.ts](/js/genkit/src/genkit.ts) - Main Genkit class -- [session.ts](/js/ai/src/session.ts) - Session management -- [chat.ts](/js/ai/src/chat.ts) - Chat implementation -- [background-action.ts](/js/core/src/background-action.ts) - Background ops -- [dynamic-action-provider.ts](/js/core/src/dynamic-action-provider.ts) - DAP - -### Python Core -- [_registry.py](/py/packages/genkit/src/genkit/ai/_registry.py) - GenkitRegistry -- [_base_async.py](/py/packages/genkit/src/genkit/ai/_base_async.py) - Genkit base -- [generate.py](/py/packages/genkit/src/genkit/blocks/generate.py) - Generation -- [prompt.py](/py/packages/genkit/src/genkit/blocks/prompt.py) - Prompts diff --git a/py/engdoc/parity-analysis/model_spec_compliance.md b/py/engdoc/parity-analysis/model_spec_compliance.md deleted file mode 100644 index c2efc1aa73..0000000000 --- a/py/engdoc/parity-analysis/model_spec_compliance.md +++ /dev/null @@ -1,267 +0,0 @@ -# Model Spec Compliance Analysis - -This document cross-checks Python model plugin implementations against the Genkit Model Action Specification ([model-spec.md](/docs/model-spec.md)). - ---- - -## Executive Summary - -| Area | Types Defined | Plugin Implementation | Gap Level | -|------|--------------|----------------------|-----------| -| Core Types | ✅ Complete | ⚠️ Partial use | Medium | -| Metadata | ✅ Complete | ⚠️ Missing fields | Medium | -| Docs Context | ✅ Type exists | ❌ Not implemented | High | -| Latency Tracking | ✅ Type exists | ❌ Not tracked | Medium | -| Partial Tool Streaming | ✅ Type exists | ❌ Not implemented | Low | - ---- - -## 1. Type Compliance (Python Core Types) - -### 1.1 GenerateRequest ✅ - -| Field | Spec | Python Type | Status | -|-------|------|-------------|--------| -| `messages` | Required | ✅ `list[Message]` | Complete | -| `config` | Any | ✅ `Any \| None` | Complete | -| `tools` | ToolDefinition[] | ✅ `list[ToolDefinition] \| None` | Complete | -| `toolChoice` | enum | ✅ `ToolChoice \| None` | Complete | -| `output` | OutputConfig | ✅ `OutputConfig \| None` | Complete | -| `docs` | DocumentData[] | ✅ `list[DocumentData] \| None` | Complete | - -### 1.2 GenerateResponse ✅ - -| Field | Spec | Python Type | Status | -|-------|------|-------------|--------| -| `message` | Message | ✅ `Message \| None` | Complete | -| `finishReason` | enum | ✅ `FinishReason \| None` | Complete | -| `finishMessage` | string | ✅ `str \| None` | Complete | -| `usage` | GenerationUsage | ✅ `GenerationUsage \| None` | Complete | -| `latencyMs` | number | ✅ `float \| None` | Complete | -| `custom` | any | ✅ `Any \| None` | Complete | -| `request` | GenerateRequest | ✅ `GenerateRequest \| None` | Complete | - -### 1.3 GenerateResponseChunk ✅ - -| Field | Spec | Python Type | Status | -|-------|------|-------------|--------| -| `role` | Role | ✅ `Role \| None` | Complete | -| `index` | number | ✅ `float \| None` | Complete | -| `content` | Part[] | ✅ `list[Part]` | Complete | -| `aggregated` | boolean | ✅ `bool \| None` | Complete | -| `custom` | any | ✅ `Any \| None` | Complete | - -### 1.4 Part Types ✅ - -| Part Type | Spec | Python Type | Status | -|-----------|------|-------------|--------| -| TextPart | ✅ | ✅ `TextPart` | Complete | -| MediaPart | ✅ | ✅ `MediaPart` | Complete | -| ToolRequestPart | ✅ | ✅ `ToolRequestPart` | Complete | -| ToolResponsePart | ✅ | ✅ `ToolResponsePart` | Complete | -| CustomPart | ✅ | ✅ `CustomPart` | Complete | -| ReasoningPart | ✅ | ✅ `ReasoningPart` | Complete | -| DataPart | Reserved | ✅ `DataPart` 
| Complete | - -### 1.5 ToolRequest (Partial Streaming) ✅ - -| Field | Spec | Python Type | Status | -|-------|------|-------------|--------| -| `partial` | boolean | ✅ `bool \| None` | Complete | - ---- - -## 2. Model Action Metadata Gaps - -> [!WARNING] -> Plugin implementations don't fully populate action metadata per spec. - -### 2.1 Spec Requirements - -```json -{ - "model": { - "label": "Human-readable name", - "versions": ["version1", "version2"], - "supports": { - "multiturn": true, - "media": true, - "tools": true, - "systemRole": true, - "output": ["json", "text"], - "contentType": ["application/json"], - "context": false, - "constrained": "no-tools", - "toolChoice": true, - "longRunning": false - }, - "stage": "stable", - "customOptions": { /* JSON Schema */ } - } -} -``` - -### 2.2 Gap Analysis - -| Field | Google GenAI | Anthropic | Ollama | -|-------|-------------|-----------|--------| -| `label` | ✅ | ⚠️ Missing | ⚠️ Missing | -| `versions` | ⚠️ Some models | ✅ | ❌ | -| `multiturn` | ✅ | ✅ | ✅ | -| `media` | ✅ | ✅ | ✅ | -| `tools` | ✅ | ✅ | ✅ | -| `systemRole` | ✅ | ✅ | ✅ | -| `output` | ⚠️ Some | ❌ Missing | ❌ Missing | -| `contentType` | ❌ Missing | ❌ Missing | ❌ Missing | -| `context` | ❌ Missing | ❌ Missing | ❌ Missing | -| `constrained` | ✅ | ❌ Missing | ⚠️ Hardcoded | -| `toolChoice` | ✅ | ❌ Missing | ⚠️ Missing | -| `longRunning` | ❌ Missing | ❌ Missing | ❌ Missing | -| `stage` | ✅ | ❌ Missing | ❌ Missing | -| `customOptions` | ❌ Not exposed | ❌ | ❌ | - ---- - -## 3. Plugin Implementation Gaps - -### 3.1 Docs Context Handling ❌ - -> [!CAUTION] -> **Critical**: No Python plugin implements `docs` context augmentation. - -**Spec Requirement:** -> If `docs` are provided, the model action should incorporate them into the context, typically by augmenting the message history. - -**Current State:** -- Types: `GenerateRequest.docs` exists -- Google GenAI: Does not process `docs` field -- Anthropic: Does not process `docs` field -- Ollama: Does not process `docs` field - -### 3.2 Latency Tracking ❌ - -**Spec Requirement:** -> `latencyMs`: Time taken for generation in milliseconds. - -**Current State:** -- Types: `GenerateResponse.latency_ms` exists -- Google GenAI: Not populating latency_ms -- Anthropic: Not populating latency_ms -- Ollama: Not populating latency_ms - -### 3.3 Request Echo ⚠️ - -**Spec Requirement:** -> `request`: The request that triggered this response. - -**Current State:** -- Types: `GenerateResponse.request` exists -- Plugins: Not consistently populating this field - -### 3.4 Partial Tool Streaming ❌ - -**Spec Requirement:** -> Some models support streaming tool calls with `partial: true`. The final chunk should have `partial: false`. - -**Current State:** -- Types: `ToolRequest.partial` exists -- Google GenAI: Not implemented -- Anthropic: Not implemented -- Ollama: Not implemented - -### 3.5 Server-Side Tools Configuration ⚠️ - -**Spec Requirement:** -> Features like Web Search, Code Execution, or URL Context are configured in `config`, not `tools`. - -**Current State:** -- Google GenAI: ✅ Supports `url_context`, `file_search` in config -- Anthropic: ❌ Not implemented -- Ollama: ❌ Not applicable - -### 3.6 Config Passthrough ⚠️ - -**Spec Requirement:** -> Pass all remaining unknown keys directly to the underlying model API. - -**Current State:** -- Google GenAI: ✅ Inherits from SDK type, passes through -- Anthropic: ⚠️ Only extracts known keys, doesn't pass through -- Ollama: ✅ Passes through via `ollama_api.Options(**config)` - ---- - -## 4. 
Behavior Compliance - -### 4.1 System Message Handling ✅ - -| Plugin | Extracts System | Separate Field | Status | -|--------|----------------|----------------|--------| -| Google GenAI | ✅ | ✅ `systemInstruction` | Compliant | -| Anthropic | ✅ | ✅ `system` | Compliant | -| Ollama | ✅ | ⚠️ Varies by model | Mostly | - -### 4.2 Tool Definition Conversion ✅ - -| Plugin | Name Sanitization | Schema Convert | Description | -|--------|-------------------|----------------|-------------| -| Google GenAI | ✅ | ✅ | ✅ | -| Anthropic | ✅ | ✅ | ✅ | -| Ollama | ✅ | ✅ | ✅ | - -### 4.3 Finish Reason Mapping ✅ - -| Plugin | Maps Provider Reasons | Standard Enum | -|--------|----------------------|---------------| -| Google GenAI | ✅ | ✅ | -| Anthropic | ✅ | ✅ | -| Ollama | ⚠️ Limited | ✅ | - -### 4.4 Structured Output ⚠️ - -| Plugin | Schema Passed | Constrained Support | Status | -|--------|--------------|---------------------|--------| -| Google GenAI | ✅ | ✅ `no-tools` | Good | -| Anthropic | ⚠️ | ❌ | Limited | -| Ollama | ⚠️ | ⚠️ | Limited | - ---- - -## 5. Priority Recommendations - -### P0 - Critical - -1. **Implement `docs` context handling** - RAG use case is broken without it -2. **Add latency tracking** - Important for monitoring - -### P1 - High - -3. **Complete metadata fields** - `contentType`, `context`, `longRunning` -4. **Config passthrough** for Anthropic/Ollama - Future-proofing -5. **customOptions JSON Schema** - DevUI config display - -### P2 - Medium - -6. **Partial tool streaming** - Advanced feature -7. **Request echo** in response - Debugging support -8. **Stage field** for all plugins - Model lifecycle - -### P3 - Low - -9. **Versions array** for dynamic models -10. **Output contentType** support - ---- - -## 6. Files Reference - -### Spec -- [model-spec.md](/docs/model-spec.md) - -### Python Types -- [typing.py](/py/packages/genkit/src/genkit/core/typing.py) - Core types - -### Plugin Implementations -- [gemini.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/models/gemini.py) -- [anthropic/models.py](/py/plugins/anthropic/src/genkit/plugins/anthropic/models.py) -- [ollama/models.py](/py/plugins/ollama/src/genkit/plugins/ollama/models.py) diff --git a/py/engdoc/parity-analysis/plugin_api_consistency.md b/py/engdoc/parity-analysis/plugin_api_consistency.md deleted file mode 100644 index 24f33c4ba0..0000000000 --- a/py/engdoc/parity-analysis/plugin_api_consistency.md +++ /dev/null @@ -1,295 +0,0 @@ -# Plugin API Consistency Report - -This document analyzes model provider API consistency across JS and Python Genkit plugins, comparing initialization parameters, config schemas, and feature support. - ---- - -## Executive Summary - -| Plugin | JS Config Schema | Python Config Schema | Gap Level | -|--------|-----------------|---------------------|-----------| -| Google GenAI | Full Zod Schema (25+ fields) | Pydantic (inherits from SDK) | **Medium** | -| Anthropic | AnthropicConfigSchema (10+ fields) | GenerationCommonConfig only | **Critical** | -| Ollama | OllamaConfigSchema (6 fields) | GenerationCommonConfig | **Medium** | - ---- - -## 1. 
Google GenAI / Vertex AI Plugin - -### 1.1 Plugin Initialization Options - -| Parameter | JS | Python | Notes | -|-----------|-----|--------|-------| -| `apiKey` | ✅ | ✅ | Both support | -| `apiVersion` | ✅ | ❌ | Python missing | -| `baseUrl` | ✅ | ❌ | Python missing | -| `customHeaders` | ✅ | ❌ (internal only) | Python injects headers internally | -| `legacyResponseSchema` | ✅ | ❌ | Python missing | -| `experimental_debugTraces` | ✅ | ❌ | Python missing | -| `credentials` | ✅ | ✅ | Both support | -| `project` | ✅ | ✅ | VertexAI only | -| `location` | ✅ | ✅ | VertexAI only | -| `debug_config` | ❌ | ✅ | Python has SDK debug | -| `http_options` | ❌ | ✅ | Python SDK-specific | - -### 1.2 GeminiConfigSchema Comparison - -**JS (Zod Schema):** -```typescript -GeminiConfigSchema = GenerationCommonConfigSchema.extend({ - apiKey: z.string().optional(), // Override plugin apiKey - baseUrl: z.string().optional(), // Override baseUrl - apiVersion: z.string().optional(), // Override apiVersion - safetySettings: z.array(...), // Safety filters - codeExecution: z.boolean().optional(), // Enable code execution - contextCache: z.boolean().optional(), // Enable context caching - functionCallingConfig: z.object({...}), // Tool control - responseModalities: z.array(...), // TEXT/IMAGE/AUDIO - googleSearchRetrieval: z.boolean(), // Grounding with Google Search - fileSearch: z.object({...}), // File search stores - urlContext: z.boolean(), // URL context grounding - temperature: z.number().min(0).max(2), // With descriptions - topP: z.number().min(0).max(1), - thinkingConfig: z.object({ - includeThoughts: z.boolean(), - thinkingBudget: z.number().min(0).max(24576), - thinkingLevel: z.enum(['MINIMAL', 'LOW', 'MEDIUM', 'HIGH']), - }), -}); -``` - -**Python (Pydantic Model):** -```python -class GeminiConfigSchema(genai_types.GenerateContentConfig): - code_execution: bool | None = None - response_modalities: list[str] | None = None - thinking_config: dict[str, Any] | None = None - file_search: dict[str, Any] | None = None - url_context: dict[str, Any] | None = None - api_version: str | None = None -``` - -### 1.3 Config Schema Gaps - -| Field | JS | Python | Priority | -|-------|-----|--------|----------| -| `safetySettings` | ✅ Typed array | Inherits from SDK | P1 | -| `contextCache` | ✅ boolean | ❌ Missing | P1 | -| `functionCallingConfig` | ✅ Typed object | Inherits from SDK | P2 | -| `googleSearchRetrieval` | ✅ boolean/object | Inherits from SDK | P2 | -| Per-field descriptions | ✅ All fields | ❌ None | P2 | -| Type validation bounds | ✅ min/max | ❌ None | P2 | - -### 1.4 API Surface Gap - -> [!WARNING] -> Python plugin does not expose `googleAI.model()` or `vertexAI.model()` convenience methods for creating model references with typed configs. - -**JS Pattern:** -```typescript -const model = googleAI.model('gemini-2.5-flash', { - temperature: 0.8, - thinkingConfig: { includeThoughts: true } -}); -``` - -**Python Pattern:** -```python -# No equivalent - must use string reference -response = await ai.generate( - model='googleai/gemini-2.5-flash', - config={'temperature': 0.8} # Untyped dict -) -``` - ---- - -## 2. 
Anthropic Plugin - -### 2.1 Plugin Initialization Options - -| Parameter | JS | Python | Notes | -|-----------|-----|--------|-------| -| `apiKey` | ✅ | ✅ | Both support | -| `apiVersion` | ✅ ('stable'/'beta') | ❌ | Python missing | -| `models` | ❌ | ✅ | Python-specific | -| `**anthropic_params` | ❌ | ✅ | Python passes to SDK | - -### 2.2 Config Schema Comparison - -> [!CAUTION] -> **Critical Gap**: Python Anthropic uses only `GenerationCommonConfig`, missing all Claude-specific features. - -**JS AnthropicConfigSchema:** -```typescript -AnthropicConfigSchema = GenerationCommonConfigSchema.extend({ - tool_choice: z.union([ - z.object({ type: z.literal('auto') }), - z.object({ type: z.literal('any') }), - z.object({ type: z.literal('tool'), name: z.string() }), - ]), - metadata: z.object({ user_id: z.string() }).optional(), - apiVersion: z.enum(['stable', 'beta']).optional(), - thinking: z.object({ - enabled: z.boolean().optional(), - budgetTokens: z.number().min(1024).optional(), - }).optional(), -}); -``` - -**Python (Uses GenerationCommonConfig only):** -```python -# No Claude-specific config! -# Just: temperature, max_output_tokens, top_p, stop_sequences, top_k -``` - -### 2.3 Missing Python Features - -| Feature | Description | Impact | -|---------|-------------|--------| -| **Thinking Config** | Extended thinking with budget tokens | Users cannot enable Claude thinking | -| **API Version** | Switch between stable/beta APIs | No access to beta features | -| **Tool Choice** | Force specific tool use | Less control over tool calling | -| **Metadata** | User ID tracking | No usage tracking | -| **Citations** | Document citation support | `anthropicDocument()` missing | -| **Cache Control** | `cacheControl()` helper | No prompt caching | - -### 2.4 Model Mapping Gap - -**JS:** -```typescript -KNOWN_CLAUDE_MODELS = { - 'claude-3-haiku': AnthropicBaseConfigSchema, - 'claude-3-5-haiku': AnthropicBaseConfigSchema, - 'claude-sonnet-4': AnthropicThinkingConfigSchema, // Separate schema! - 'claude-opus-4': AnthropicThinkingConfigSchema, - 'claude-sonnet-4-5': AnthropicThinkingConfigSchema, - // ... -}; -``` - -**Python:** -```python -# All models use same GenerationCommonConfig -# No model-specific config schemas -``` - ---- - -## 3. Ollama Plugin - -### 3.1 Plugin Initialization Options - -| Parameter | JS | Python | Notes | -|-----------|-----|--------|-------| -| `serverAddress` | ✅ | ✅ | Both support | -| `requestHeaders` | ✅ | ✅ | Both support | -| `models` | ✅ | ✅ | Pre-register models | -| `embedders` | ✅ | ✅ | Pre-register embedders | - -### 3.2 Config Schema Comparison - -**JS OllamaConfigSchema:** -```typescript -OllamaConfigSchema = GenerationCommonConfigSchema.extend({ - temperature: z.number().min(0.0).max(1.0) - .describe('...defaults value is 0.8'), - topP: z.number().min(0).max(1.0) - .describe('...defaults value is 0.9'), -}); -``` - -**Python (Uses GenerationCommonConfig):** -```python -# Untyped config - relies on Ollama SDK defaults -``` - -### 3.3 Gaps - -| Gap | Impact | -|-----|--------| -| No per-field descriptions | Less IDE help | -| No type validation | Invalid values sent to Ollama | - ---- - -## 4. 
Common API Pattern Gaps - -### 4.1 Model Reference Factory - -**JS Pattern (All Plugins):** -```typescript -// Type-safe model reference with IDE autocomplete -const model = googleAI.model('gemini-2.5-flash'); -const model = anthropic.model('claude-sonnet-4'); -const model = ollama.model('llama3'); -``` - -**Python Pattern:** -```python -# Only string-based references - no type safety -model='googleai/gemini-2.5-flash' -model='anthropic/claude-sonnet-4' -model='ollama/llama3' -``` - -### 4.2 Embedder Reference Factory - -**JS Pattern:** -```typescript -const embedder = googleAI.embedder('gemini-embedding-001'); -``` - -**Python:** ❌ No equivalent - -### 4.3 Config Schema in DevUI - -| Plugin | JS DevUI | Python DevUI | -|--------|----------|--------------| -| Google GenAI | ✅ Full config | ❌ Empty (commented out) | -| Anthropic | ✅ Full config | ⚠️ Basic only | -| Ollama | ✅ Full config | ⚠️ Basic only | - ---- - -## 5. Priority Recommendations - -### P0 - Critical - -1. **Add Python `config_schema` to model metadata** - Fix the commented out code -2. **Anthropic ThinkingConfig** - Required for Claude 4.x models - -### P1 - High - -3. **Anthropic-specific config schema** - tool_choice, metadata, apiVersion -4. **Google GenAI plugin options** - apiVersion, baseUrl, customHeaders -5. **Model reference factories** - `plugin.model()` pattern for Python - -### P2 - Medium - -6. **Config field descriptions** - Match JS documentation -7. **Type validation** - min/max bounds on numeric fields -8. **Ollama config schema** - Match JS validation - -### P3 - Low - -9. **Embedder reference factories** - `plugin.embedder()` pattern -10. **Debug trace options** - Match JS tracing options - ---- - -## 6. Files Reference - -### JS Plugins -- [googleai/types.ts](/js/plugins/google-genai/src/googleai/types.ts) - Plugin options -- [googleai/gemini.ts](/js/plugins/google-genai/src/googleai/gemini.ts) - Config schema -- [anthropic/types.ts](/js/plugins/anthropic/src/types.ts) - Anthropic config -- [anthropic/models.ts](/js/plugins/anthropic/src/models.ts) - Model definitions -- [ollama/index.ts](/js/plugins/ollama/src/index.ts) - Ollama config - -### Python Plugins -- [google_genai/google.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/google.py) -- [google_genai/models/gemini.py](/py/plugins/google-genai/src/genkit/plugins/google_genai/models/gemini.py) -- [anthropic/plugin.py](/py/plugins/anthropic/src/genkit/plugins/anthropic/plugin.py) -- [anthropic/models.py](/py/plugins/anthropic/src/genkit/plugins/anthropic/models.py) -- [ollama/plugin_api.py](/py/plugins/ollama/src/genkit/plugins/ollama/plugin_api.py) diff --git a/py/engdoc/parity-analysis/roadmap.md b/py/engdoc/parity-analysis/roadmap.md deleted file mode 100644 index 23e0c0586a..0000000000 --- a/py/engdoc/parity-analysis/roadmap.md +++ /dev/null @@ -1,256 +0,0 @@ -# Parity Analysis & Roadmap - -> [!NOTE] -> This document tracks the feature parity of Genkit Python plugins against the -> Genkit Node.js reference implementation. Use this to identify gaps and plan work. - ---- - -## Current Status (Updated 2026-02-06) - -> [!IMPORTANT] -> **Overall Parity: ~99% Complete** - Nearly all milestones done! -> -> Legacy formatting and type checking issues fixed throughout the repo. -> Remaining work is focused on resolving specific model quirks (e.g. DeepSeek R1 reasoning). 
- -| Plugin | API Conformance | Missing Features | Security Issues | Test Coverage | Priority | -|--------|-----------------|------------------|-----------------|---------------|----------| -| google-genai | ✅ Verified | Minor | None | Good | - | -| anthropic | ✅ Mostly Conformant (PR #4482) | Citations | None | ✅ Good | Low | -| amazon-bedrock | ✅ Verified | Guardrails | None | Good | Low | -| ollama | ✅ Verified | Vision via chat API | None | Fair | Low | -| mistral | ✅ Mostly Conformant (PR #4481) | Agents API, Codestral FIM | None | ✅ Good | Low | -| xai | ⚠️ Gaps | Agent Tools API (server/client-side) | None | Fair | Medium | -| deepseek | ✅ Mostly Conformant (PR #4480) | Multi-round reasoning | None | ✅ Good | Low | -| cloudflare-workers-ai | ✅ Verified | Async Batch API | None | Good | Low | -| huggingface | ⚠️ Gaps | Inference Endpoints, TGI | None | Fair | Medium | -| azure | ⚠️ Gaps | Azure AI Studio | None | Fair | Medium | - -### Priority Actions - -| Priority | Task | Plugin | Effort | Description | -|----------|------|--------|--------|-------------| -| ~~P0~~ | ~~Fix `reasoning_content` extraction~~ | ~~deepseek~~ | ~~M~~ | ✅ Done (PR #4480) - Extracted via `MessageAdapter` in compat-oai, emits `ReasoningPart` | -| ~~P0~~ | ~~Add parameter validation warnings~~ | ~~deepseek~~ | ~~S~~ | ✅ Done (PR #4480) - `_warn_reasoning_params()` logs warnings for ignored params | -| ~~P1~~ | ~~Add cache control support~~ | ~~anthropic~~ | ~~M~~ | ✅ Done (PR #4482) - `cache_control` with TTL for cost savings | -| ~~P1~~ | ~~Add PDF/Document support~~ | ~~anthropic~~ | ~~M~~ | ✅ Done (PR #4482) - `DocumentBlockParam` for common use case | -| ~~P1~~ | ~~Add embeddings support~~ | ~~mistral~~ | ~~S~~ | ✅ Done (PR #4481) - `mistral-embed` model | -| **P2** | Add Agent Tools API | xai | M | Server/client-side tool calling (Jan 2026) | -| **P2** | Add Agents API | mistral | L | Mistral Agents endpoint | -| **P2** | Add Inference Endpoints | huggingface | M | Dedicated endpoints for production | -| **P3** | Add Guardrails | amazon-bedrock | M | Bedrock Guardrails integration | -| **P3** | Add Azure AI Studio | azure | L | New unified API | - -### Detailed Gap Analysis - -#### 1. google-genai (Gemini/Vertex AI) - -**Status**: ✅ Mostly Conformant - -**Verified Features**: -- Text generation (streaming/non-streaming) ✓ -- Embeddings ✓ -- Image generation (Imagen) ✓ -- Video generation (Veo) ✓ -- Function/tool calling ✓ -- Context caching ✓ -- Safety settings ✓ -- Evaluators (Vertex AI) ✓ -- Rerankers (Vertex AI Discovery Engine) ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Grounding with Google Search | Not implemented for Gemini API | Medium - useful for RAG | Medium | -| Code execution tool | Built-in code execution not exposed | Low | Low | -| Audio generation (Lyria) | Partial - helpers only, no full model | Low | Low | - ---- - -#### 2. 
anthropic (Claude) - -**Status**: ✅ Mostly Conformant (PR #4482) - -**Verified Features**: -- Messages API ✓ -- Tool/function calling ✓ -- Streaming ✓ -- Vision (images) ✓ -- Thinking mode (extended thinking) ✓ -- ✅ Cache control (ephemeral) ✓ (PR #4482) -- ✅ PDF/Document support (`DocumentBlockParam`) ✓ (PR #4482) -- ✅ URL image source ✓ (PR #4482) - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| ~~Cache control (ephemeral)~~ | ~~`cache_control` with TTL not supported~~ | ~~High~~ | ✅ Done (PR #4482) | -| ~~PDF/Document support~~ | ~~`DocumentBlockParam` not implemented~~ | ~~High~~ | ✅ Done (PR #4482) | -| Citations | Citation extraction not supported | Medium | P2 | -| Web search tool | Server-side `web_search` tool not supported | Medium | P2 | -| Batch API | Message batches not supported | Low - async processing | P3 | - ---- - -#### 3. amazon-bedrock - -**Status**: ✅ Mostly Conformant - -**Verified Features**: -- Converse API ✓ -- ConverseStream API ✓ -- Tool calling ✓ -- Multi-provider support (Claude, Nova, Llama, etc.) ✓ -- Inference profiles for cross-region ✓ -- Embeddings ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Guardrails | Bedrock Guardrails not integrated | Medium - content filtering | P3 | -| Knowledge bases | RAG via Bedrock KB not supported | Medium | P3 | -| Model invocation logging | CloudWatch logging config | Low | P4 | - ---- - -#### 4. ollama - -**Status**: ✅ Conformant - -**Verified Features**: -- Chat API (/api/chat) ✓ -- Generate API (/api/generate) ✓ -- Embeddings API (/api/embeddings) ✓ -- Tool calling ✓ -- Streaming ✓ -- Model discovery ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Vision in chat | Images via chat API need testing | Low - works via generate | P4 | -| Pull models | Model download/management | Low - user manages | P4 | - ---- - -#### 5. mistral - -**Status**: ✅ Mostly Conformant (PR #4481) - -**Verified Features**: -- Chat completions ✓ -- Streaming ✓ -- Tool/function calling ✓ -- JSON mode ✓ -- Vision models (Pixtral) ✓ -- ✅ Embeddings (`mistral-embed`) ✓ (PR #4481) - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| ~~Embeddings~~ | ~~`mistral-embed` model not supported~~ | ~~Medium~~ | ✅ Done (PR #4481) | -| Agents API | Mistral Agents endpoint not supported | High - agentic workflows | P2 | -| FIM (Fill-in-Middle) | Codestral FIM for code completion | Medium - code use cases | P2 | -| Built-in tools | websearch, code_interpreter, image_generation | Medium | P3 | - ---- - -#### 6. xai (Grok) - -**Status**: ⚠️ Has Gaps - -**Verified Features**: -- Chat completions ✓ -- Streaming ✓ -- Tool/function calling ✓ -- Vision (grok-2-vision) ✓ -- Reasoning effort parameter ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Agent Tools API | Server-side and client-side tool calling (Jan 2026) | High - new feature | P2 | -| Web search options | Built-in web search configuration | Medium | P3 | -| New models | grok-4-1-fast-reasoning, grok-4-1-fast-non-reasoning | Medium | P2 | - ---- - -#### 7. 
deepseek - -**Status**: ✅ Mostly Conformant (PR #4480) - -**Verified Features**: -- Chat completions (OpenAI-compatible) ✓ -- Streaming ✓ -- Uses compat-oai for implementation ✓ -- `reasoning_content` extraction for R1/reasoner models ✓ (PR #4480) -- Parameter validation warnings for R1 (temp, top_p, tools) ✓ (PR #4480) -- Chat vs. reasoning model capability split ✓ (PR #4480) -- `is_reasoning_model()` helper ✓ (PR #4480) - -**Implementation Details (PR #4480)**: -- **compat-oai layer**: `MessageAdapter` wraps raw Pydantic `ChatCompletionMessage` for safe `reasoning_content` access (Pydantic raises `AttributeError` for unknown fields). `MessageConverter.to_genkit()` emits `ReasoningPart` before `TextPart` (matching JS order). -- **Streaming**: `MessageAdapter(delta).reasoning_content` in `_generate_stream()` replaces unsafe `getattr()` pattern. -- **deepseek plugin**: `_warn_reasoning_params()` logs warnings when `temperature`, `top_p`, or `tools` are passed to R1 models. Model capabilities split into chat (`tools=True`) vs. reasoning (`tools=False`). - -**Remaining Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| ~~`reasoning_content`~~ | ~~CoT output not extracted/exposed~~ | ~~**Critical**~~ | ✅ Done (PR #4480) | -| ~~Parameter validation~~ | ~~R1 ignores temp/top_p but no warning~~ | ~~High~~ | ✅ Done (PR #4480) | -| ~~Multi-round reasoning~~ | ~~Must strip reasoning_content from context~~ | ~~High~~ | ✅ Done — `ReasoningPart` skipped in `MessageConverter.to_openai()` | -| Tool calling in R1 | Not supported in reasoner mode | Medium - documented limitation | P2 | - ---- - -#### 8. cloudflare-workers-ai (Cloudflare Workers AI) - -**Status**: ✅ Mostly Conformant - -**Verified Features**: -- Text generation ✓ -- Streaming (SSE) ✓ -- Tool calling (via CF specific implementation) ✓ -- Embeddings ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Async Batch API | Not implemented | Low | Low | -| Function calling standardization | Uses custom impl instead of OpenAI compat | Medium | Low | - ---- - -#### 9. huggingface - -**Status**: ⚠️ Has Gaps - -**Verified Features**: -- Text generation (inference API) ✓ -- Streaming ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Inference Endpoints | Dedicated endpoints not supported | Medium - production use | P2 | -| TGI Integration | Text Generation Inference specific features | Medium | P3 | -| Chat templating | Better reliance on tokenizer chat templates | Low | P3 | - ---- - -#### 10. azure (Azure OpenAI) - -**Status**: ⚠️ Has Gaps - -**Verified Features**: -- Chat completions ✓ -- Streaming ✓ -- Tool calling ✓ - -**Gaps**: -| Gap | Description | Impact | Priority | -|-----|-------------|--------|----------| -| Azure AI Studio | New unified API not supported | Medium | P3 | -| Entra ID Auth | Managed identity support | Medium - enterprise | P2 | -| On Your Data | Azure Search integration | Medium | P3 | diff --git a/py/engdoc/parity-analysis/sample_parity_roadmap.md b/py/engdoc/parity-analysis/sample_parity_roadmap.md deleted file mode 100644 index 04b99b2c06..0000000000 --- a/py/engdoc/parity-analysis/sample_parity_roadmap.md +++ /dev/null @@ -1,471 +0,0 @@ -# Sample Parity Analysis: JS vs Python - -> **Updated:** 2026-02-07 -> **Scope:** Every JS code sample on genkit.dev docs -> Python `py/samples/` counterpart. 
-> -> **Exclusions (per team decision):** -> - **Chat/Session API** -- Deprecated, skip -> - **Agents / Multi-Agent** -- Not yet in Python SDK, skip -> - **MCP** -- Will come later, skip -> - **Durable Streaming** -- Not yet in Python SDK, skip -> - **Client SDK** -- JS client-side only, not applicable to Python backend SDK - ---- - -## Summary - -**JS Sample Locations:** -- `/samples/` - 9 polished demo samples (js-angular, js-chatbot, js-menu, etc.) -- `/js/testapps/` - 32 internal test/demo apps (advanced scenarios) - -**Python Sample Location:** -- `/py/samples/` - 36 samples (including shared, sample-test) - -| Metric | JS (`samples/` + `testapps/`) | Python (`py/samples/`) | Gap | -|--------|-------------------------------|------------------------|-----| -| Plugin hello demos | 8 | 14 | **Python superset** | -| Advanced feature demos | 15 | 10 | **-5** | -| RAG samples | 5 | 4 | -1 | -| Evaluation | 2 | 2 | Parity | -| Media generation | 1 | 3 | **Python superset** | -| Observability | 0 | 2 | **Python superset** | - ---- - -## 1. genkit.dev Docs -> Python Sample Coverage - -This is the authoritative mapping from every JS code feature demonstrated in -the genkit.dev documentation to its Python sample coverage. Only in-scope -features are listed (exclusions above apply). - -### `/docs/models` -- Generating Content with AI Models - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Basic generation | `ai.generate('prompt')` | All hello samples (`generate_greeting`) | Covered | -| Model reference | `googleAI.model('gemini-2.5-flash')` | All hello samples | Covered | -| Model string ID | `model: 'googleai/gemini-2.5-flash'` | All hello samples | Covered | -| System prompts | `system: "..."` | `provider-google-genai-hello` + most hello samples | Covered | -| Multi-turn (messages) | `messages: [{role, content}]` | `provider-google-genai-hello` + most hello samples | Covered | -| Model parameters | `config: {maxOutputTokens, temperature, ...}` | Most hello samples (`generate_with_config`) | Covered | -| Structured output | `output: { schema: ZodSchema }` | Most samples (`generate_character`) | Covered | -| Streaming text | `ai.generateStream()` | Most samples (`generate_streaming_story`) | Covered | -| Streaming + structured | `generateStream() + output schema` | `provider-google-genai-hello` | Covered | -| Multimodal input (image URL) | `prompt: [{media: {url}}, {text}]` | `provider-google-genai-hello`, `provider-anthropic-hello`, `provider-xai-hello`, etc. | Covered | -| Multimodal input (base64) | `data:image/jpeg;base64,...` | `provider-google-genai-hello` describe_image | Covered | -| Generating media (images) | `output: {format: 'media'}` (Imagen) | `provider-google-genai-media-models-demo` | Covered | -| Generating media (TTS) | Text-to-speech | `provider-google-genai-media-models-demo`, `provider-compat-oai-hello` | Covered | -| Middleware (retry) | `use: [retry({...})]` | `framework-middleware-demo` | Covered | -| Middleware (fallback) | `use: [fallback({...})]` | `framework-middleware-demo` | Covered | - -> **SDK Status:** Python has `use=` middleware infrastructure in `generate()`. -> `framework-middleware-demo` demonstrates custom retry and logging middleware. 
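As a rough illustration of that `use=` hook, a custom retry middleware might look like the sketch below. The middleware signature shown here is an assumption (the canonical pattern lives in `framework-middleware-demo`); the sketch only conveys the general shape of a wrapper that forwards to a `next` continuation.

```python
async def retry_on_error(request, next_call):
    """Hypothetical middleware: retry the downstream model call up to 3 times."""
    last_error: Exception | None = None
    for _ in range(3):
        try:
            return await next_call(request)
        except Exception as err:  # sketch only; real code should narrow this
            last_error = err
    raise last_error


# Usage sketch, assuming `use=` accepts a list of middleware callables:
# response = await ai.generate(
#     prompt='Summarize the incident report.',
#     use=[retry_on_error],
# )
```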
- -### `/docs/tool-calling` -- Tool Calling - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Define tools | `ai.defineTool()` | All samples with tools | Covered | -| Use tools in generate | `tools: [getWeather]` | Most samples (`generate_weather`) | Covered | -| `maxTurns` | `maxTurns: 8` | 3 samples use `max_turns=2` | Covered | -| `returnToolRequests` | `returnToolRequests: true` | `provider-google-genai-context-caching` | Covered | -| Interrupts (tool-based) | `ctx.interrupt()` | `framework-tool-interrupts`, `provider-google-genai-hello` | Covered | -| Dynamic tools at runtime | Tool defined inline at generate() | `framework-dynamic-tools-demo` uses `ai.dynamic_tool()` | Covered | -| Streaming + tool calling | Stream with tools | All provider hello samples (`generate_streaming_with_tools`) | Covered | - -> **SDK Status:** `ai.dynamic_tool()` exists. Streaming + tools is demonstrated -> in all 12 provider hello samples via `generate_streaming_with_tools` flow. - -### `/docs/interrupts` -- Interrupts - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Tool-based interrupt | `@ai.tool(interrupt=True)` + `ctx.interrupt()` | `framework-tool-interrupts`, `provider-google-genai-hello` | Covered | -| Check response.interrupts | Loop checking for interrupts | `framework-tool-interrupts` | Covered | -| Resume with respond | `resume: { respond: [...] }` | `framework-tool-interrupts`, `provider-google-genai-hello` | Covered | -| `defineInterrupt()` | Standalone interrupt API | Not in Python SDK | N/A (SDK gap) | -| Restartable interrupts | `restart` option | Not in Python SDK | N/A (SDK gap) | - -> **SDK Status:** Python only supports tool-based interrupts via -> `@ai.tool(interrupt=True)`. No standalone `define_interrupt()` API exists. -> This is a SDK feature gap, not a sample gap. - -### `/docs/context` -- Context - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Context in generate() | `context: { auth: {...} }` | `framework-context-demo` (`context_in_generate`) | Covered | -| Context in flow | `{context}` destructured | `framework-context-demo` (`context_in_flow`) | Covered | -| Context in tool | `{context}` in tool handler | `framework-context-demo` | Covered | -| Context propagation | Auto-propagation to sub-actions | `framework-context-demo` (`context_propagation_chain`) | Covered | -| `ai.current_context()` | Access current context | `framework-context-demo` (`context_current_context`) | Covered | - -> **SDK Status:** Full context support exists: `context=` on `generate()` and -> flows, `ActionRunContext`, `ai.current_context()`, and auto-propagation. -> `framework-context-demo` provides comprehensive coverage with 4 dedicated flows. 
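To make the mapping above concrete, here is a minimal sketch of the context flow, assuming the `Genkit()` / `@ai.tool()` API and the `context=` keyword described above, and assuming `current_context()` hands back the dict passed via `context=`. The tool name and the shape of the context dict are illustrative, not taken from the sample.

```python
from genkit.ai import Genkit

ai = Genkit()  # plugin and default model configuration omitted for brevity


@ai.tool()
def list_orders() -> list[str]:
    """Illustrative tool that reads the caller identity from the propagated context."""
    ctx = ai.current_context() or {}
    uid = ctx.get('auth', {}).get('uid', 'anonymous')
    return [f'order-1001 (owner: {uid})']


async def answer_for_user(question: str, uid: str) -> str:
    # context= set here is auto-propagated to the tool call.
    response = await ai.generate(
        prompt=question,
        tools=['list_orders'],
        context={'auth': {'uid': uid}},
    )
    return response.text
```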
- -### `/docs/dotprompt` -- Managing Prompts with Dotprompt - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| .prompt files | YAML frontmatter + template | `framework-prompt-demo` | Covered (bug: see below) | -| Running prompts from code | `ai.prompt('name')` | `framework-prompt-demo` | Covered (bug: see below) | -| Streaming prompts | `prompt.stream()` | `framework-prompt-demo` | Covered (bug: see below) | -| Input/Output schemas (Picoschema) | `schema:` in frontmatter | `framework-prompt-demo` | Covered (bug: see below) | -| Schema references | `ai.defineSchema()` + name ref | `framework-prompt-demo` | Covered (bug: see below) | -| Model configuration | `config:` in frontmatter | `framework-prompt-demo` | Covered (bug: see below) | -| Handlebars templates | `{{variable}}`, `{{#if}}` | `framework-prompt-demo` | Covered (bug: see below) | -| Multi-message prompts | `{{role "system"}}` | `framework-prompt-demo` (in partial) | Covered (bug: see below) | -| Partials | `{{>partialName}}` | `framework-prompt-demo` (`_style.prompt`) | Covered (bug: see below) | -| Custom helpers | `ai.defineHelper()` | `framework-prompt-demo` (`list` helper) | Covered (bug: see below) | -| Prompt variants | `.variant.prompt` files | Blocked by SDK bug | **BUG** | -| **Tool calling in prompts** | `tools: [...]` in frontmatter | Not in framework-prompt-demo | **GAP** | -| **Multimodal prompts** | `{{media url=photoUrl}}` | Not in framework-prompt-demo | **GAP** | -| **Defining prompts in code** | `ai.definePrompt()` | Not in framework-prompt-demo | **GAP** | -| **Default input values** | `default:` in frontmatter | Not in framework-prompt-demo | **GAP** | - -> **SDK Bug (B1b):** `framework-prompt-demo` had a P0 bug: `Failed to load lazy -> action recipe.robot: maximum recursion depth exceeded`. Root cause is a -> **self-referential lazy loading loop** in the SDK's `create_prompt_from_file()` -> at `py/packages/genkit/src/genkit/blocks/prompt.py` -- when loading a variant -> prompt, `resolve_action_by_key()` is called with the action's own key before -> `_cached_prompt` is set, which triggers `_trigger_lazy_loading()` to re-invoke -> `create_prompt_from_file()` for the same action, causing infinite recursion. -> This is NOT a dotprompt library bug. Only Python is affected (JS uses a -> `lazy()` wrapper guaranteeing single evaluation). -> -> **Workaround (B1a):** `recipe.robot.prompt` was removed to unblock the sample. -> **Fix:** Tracked at [firebase/genkit#4491](https://github.com/firebase/genkit/issues/4491). -> Once fixed, variant demo should be re-added. - -### `/docs/flows` -- Flows - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Define flows | `@ai.flow()` decorator | All samples | Covered | -| Input/output schemas | Pydantic models | All samples | Covered | -| Streaming flows | `ctx.send_chunk()` | Several samples | Covered | -| Deploy with Flask | Flask integration | `web-flask-hello` | Covered | -| Flow steps (`ai.run()`) | Named trace spans | `provider-google-genai-hello` (line 434), `framework-realtime-tracing-demo` | Covered | - -> All flow features documented on genkit.dev are covered. 
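For reference, the rows above compose roughly as follows. This is a sketch assuming the `Genkit()` API, a Pydantic input model, and an `ActionRunContext`-style second parameter exposing `send_chunk` (as the samples cited above suggest); the flow body is illustrative, and named steps via `ai.run()` are omitted because their exact Python signature is not shown here.

```python
from pydantic import BaseModel

from genkit.ai import Genkit

ai = Genkit()  # plugin and default model configuration omitted for brevity


class StoryInput(BaseModel):
    topic: str


@ai.flow()
async def short_story(input: StoryInput, ctx) -> str:
    """Streaming flow: emits progress chunks and returns the final text."""
    ctx.send_chunk(f'Working on a story about {input.topic}...')
    response = await ai.generate(prompt=f'Write a two-sentence story about {input.topic}.')
    ctx.send_chunk(response.text)
    return response.text
```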
- -### `/docs/rag` -- Retrieval-Augmented Generation - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Basic RAG flow | Retriever + generate | `framework-restaurant-demo` (case_04/05), `provider-firestore-retriever` | Covered | -| Embedders | `ai.embed()` | `provider-google-genai-hello`, `provider-ollama-hello` | Covered | -| Custom retriever | `ai.defineRetriever()` | `provider-firestore-retriever` | Covered | -| Simple retriever | `ai.defineSimpleRetriever()` | No equivalent | **GAP** (minor) | -| Vector search (Firestore) | Firestore vector store | `provider-vertex-ai-vector-search-firestore` | Covered | -| Vector search (BigQuery) | BigQuery vector store | `provider-vertex-ai-vector-search-bigquery` | Covered | -| Reranker | `ai.rerank()` | `provider-vertex-ai-rerank-eval` | Covered | -| Custom reranker | `ai.defineReranker()` | No sample | **GAP** (minor) | -| **Indexer** | `ai.index()` + flow | **No indexer sample** | **GAP** | - -> **SDK Status:** Python SDK does not have a built-in local dev vector store -> plugin (like JS `@genkit-ai/dev-local-vectorstore`). Indexing is done via -> external SDKs (Firestore, etc.). The RAG Python tab on genkit.dev shows -> Firestore-based retrieval only. - -### `/docs/evaluation` -- Evaluation - -| Feature | JS Doc Example | Python Sample | Status | -|---------|---------------|---------------|--------| -| Custom evaluator | `ai.defineEvaluator()` | `framework-evaluator-demo` | Covered | -| Built-in metrics | `GenkitMetric.MALICIOUSNESS` | `provider-vertex-ai-rerank-eval` (BLEU, ROUGE, etc.) | Covered | -| **Full eval pipeline** | Dataset -> inference -> metrics -> results | **No end-to-end pipeline sample** | **GAP** | -| **Data synthesis** | Generate test questions from docs | **No sample** | **GAP** | - -> The JS `evals` testapp demonstrates dataset creation, flow evaluation, and -> result analysis as a complete pipeline. Python needs an equivalent. - ---- - -## 2. Plugin Hello World Demos - -| Plugin | JS | Python | Notes | -|--------|-----|--------|-------| -| Google GenAI | Yes | `provider-google-genai-hello` | Parity | -| Vertex AI | Yes (in basic-gemini) | `provider-google-genai-vertexai-hello` | Parity | -| Anthropic | Yes | `provider-anthropic-hello` | Parity | -| Ollama | Yes | `provider-ollama-hello` | Parity | -| OpenAI Compat | Yes | `provider-compat-oai-hello` | Parity | -| xAI (Grok) | No | `provider-xai-hello` | Python extra | -| DeepSeek | No | `provider-deepseek-hello` | Python extra | -| Model Garden | Yes | `provider-vertex-ai-model-garden` | Parity | -| Mistral | No | `provider-mistral-hello` | Python extra | -| HuggingFace | No | `provider-huggingface-hello` | Python extra | -| Amazon Bedrock | No | `provider-amazon-bedrock-hello` | Python extra | -| Cloudflare Workers AI | No | `provider-cloudflare-workers-ai-hello` | Python extra | -| Microsoft Foundry | No | `provider-microsoft-foundry-hello` | Python extra | - ---- - -## 3. Incomplete Hello Samples - -Several hello samples are missing `generate_with_system_prompt` and/or -`generate_multi_turn_chat` flows that other hello samples already have. 
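For the samples still pending below, the missing flows are small. The sketch that follows shows the general shape they take in the hello samples that already have them; the wording is illustrative, model selection is omitted, and the typed message construction follows the `Part(root=TextPart(...))` pattern noted for the Python SDK (import path assumed from the core `typing.py` module).

```python
from genkit.ai import Genkit
from genkit.core.typing import Message, Part, Role, TextPart

# In the real samples, `ai` is the existing, fully configured Genkit() instance.
ai = Genkit()


@ai.flow()
async def generate_with_system_prompt(topic: str) -> str:
    """System-prompt flow: fixed persona plus a user question."""
    response = await ai.generate(
        system='You are a terse assistant. Answer in one sentence.',
        prompt=f'Tell me about {topic}.',
    )
    return response.text


@ai.flow()
async def generate_multi_turn_chat(question: str) -> str:
    """Multi-turn flow: replays a short history, then asks a follow-up."""
    response = await ai.generate(
        messages=[
            Message(role=Role.USER, content=[Part(root=TextPart(text='My name is Ada.'))]),
            Message(role=Role.MODEL, content=[Part(root=TextPart(text='Nice to meet you, Ada!'))]),
            Message(role=Role.USER, content=[Part(root=TextPart(text=question))]),
        ],
    )
    return response.text
```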
- -### `generate_with_system_prompt` flow - -- [x] `provider-microsoft-foundry-hello` -- DONE -- [x] `provider-mistral-hello` -- DONE -- [x] `provider-huggingface-hello` -- DONE -- [x] `provider-google-genai-vertexai-hello` -- DONE -- [ ] `web-short-n-long` (still uses old name `system_prompt`) -- [ ] `provider-vertex-ai-model-garden` (still uses old name `system_prompt`) - -### `generate_multi_turn_chat` flow - -- [x] `provider-microsoft-foundry-hello` -- DONE -- [x] `provider-google-genai-vertexai-hello` -- DONE -- [ ] `web-short-n-long` (still uses old name `multi_turn_chat`) -- [ ] `provider-vertex-ai-model-garden` (still uses old name `multi_turn_chat`) - ---- - -## 4. Items Already Covered (verified) - -These were previously flagged as gaps but are now confirmed covered: - -| Feature | Sample | Notes | -|---------|--------|-------| -| Streaming + structured output | `provider-google-genai-hello` | Has streaming structured output flow | -| Media generation (images) | `provider-google-genai-media-models-demo` | Imagen, Gemini Image, image editing | -| Media generation (TTS) | `provider-google-genai-media-models-demo`, `provider-compat-oai-hello` | Google TTS, OpenAI TTS | -| Reranker | `provider-vertex-ai-rerank-eval` | Vertex AI semantic reranker + eval metrics | -| Dynamic tools | `framework-dynamic-tools-demo` | Standalone sample with `ai.dynamic_tool()` | -| Flow steps (`ai.run()`) | `provider-google-genai-hello`, `framework-realtime-tracing-demo` | Named trace spans | -| Multimodal input | Multiple hello samples | Image, video, audio input | -| Tool interrupts | `framework-tool-interrupts`, `provider-google-genai-hello` | Full interrupt + resume flow | -| Context propagation | `framework-context-demo` | 4 flows covering generate, flow, tool, and current_context | -| Custom middleware | `framework-middleware-demo` | Retry, logging, and chained middleware | -| Streaming + tool calling | All provider hello samples | `generate_streaming_with_tools` flow in all 12 | - ---- - -## 5. Items Out of Scope (not in Python SDK) - -| Feature | Doc Page | Reason | -|---------|----------|--------| -| Chat/Session API | `chat.mdx` | Deprecated | -| Agents / Multi-Agent | `agentic-patterns.mdx`, `multi-agent.mdx` | Not yet in Python SDK | -| MCP | `mcp-server.mdx`, `model-context-protocol.mdx` | Will come later | -| Durable Streaming | `durable-streaming.mdx` | Not in Python SDK | -| `defineInterrupt()` | `interrupts.mdx` | Only tool-based interrupts in Python | -| Client SDK | `client.mdx` | JS client-side only | - ---- - -## 6. 
Execution Roadmap - -### Dependency Graph - -```mermaid -flowchart TD - subgraph phase0 [Phase 0 - Leaves] - B1a["B1a: Remove recipe.robot.prompt DONE"] - B1b["B1b: Fix SDK lazy loading bug firebase/genkit#4491"] - G1["G1: Context demo DONE"] - G6["G6: Streaming + tools DONE"] - G7["G7: Custom middleware DONE"] - N6["N6: Fix typo firestore-retriever DONE"] - end - - subgraph phase1 [Phase 1 - Dotprompt + Eval + Hello] - G2["G2: Dotprompt tool calling"] - G3["G3: Dotprompt define in code"] - G4["G4: Dotprompt multimodal"] - G5["G5: Dotprompt defaults"] - G8["G8: Eval pipeline"] - H1["H1: generate_with_system_prompt 2 remaining"] - H2["H2: generate_multi_turn_chat 2 remaining"] - end - - subgraph phase2 [Phase 2 - RAG + Eval Extras] - N1["N1: Simple retriever"] - N2["N2: Custom reranker"] - N3["N3: Data synthesis"] - N4["N4: Indexer sample"] - N7["N7: Firebase Functions"] - end - - subgraph phase3 [Phase 3 - Polish] - N5["N5: DevUI gallery"] - end - - B1a --> B1b - B1b --> G2 - B1b --> G3 - B1b --> G4 - B1b --> G5 - G8 --> N3 - G1 --> N5 - G2 --> N5 - G3 --> N5 - G6 --> N5 - G7 --> N5 - G8 --> N5 - H1 --> N5 -``` - -### Edge List - -`A -> B` means "A must complete before B can start": - -- `B1a -> B1b` (removing the bad variant file makes the sample usable; SDK fix restores variant support) -- `B1b -> G2` (SDK fix unblocks dotprompt tool calling) -- `B1b -> G3` (SDK fix unblocks dotprompt define-in-code) -- `B1b -> G4` (SDK fix unblocks dotprompt multimodal) -- `B1b -> G5` (SDK fix unblocks dotprompt defaults) -- `G8 -> N3` (eval pipeline design informs data synthesis) -- `{G1, G2, G3, G6, G7, G8, H1} -> N5` (DevUI gallery showcases all features) - -**Critical path:** `B1a -> B1b -> G2/G3/G4/G5 -> N5` - ---- - -### Phase 0: Leaves (no dependencies, all parallel) -- MOSTLY DONE - -All tasks in this phase are independent (except B1a -> B1b which are sequential). - -| Task | Description | Status | Notes | -|------|-------------|--------|-------| -| **B1a** | Remove `recipe.robot.prompt` from framework-prompt-demo to unblock the sample. | **DONE** | Variant file and code removed | -| **B1b** | Fix the SDK lazy loading bug in `create_prompt_from_file()` that causes infinite recursion when loading `.variant.prompt` files. Root cause: self-referential loop where `resolve_action_by_key()` is called with own key before `_cached_prompt` is set. Once fixed, re-add variant demo. | **BLOCKED** | Tracked at [firebase/genkit#4491](https://github.com/firebase/genkit/issues/4491). Only Python affected. | -| **G1** | Context demo -- `framework-context-demo` with flows for `context=` in generate, context in flows, context in tools, auto-propagation, `ai.current_context()`. | **DONE** | 4 flows: `context_in_generate`, `context_in_flow`, `context_current_context`, `context_propagation_chain` | -| **G6** | Streaming + tool calling -- `generate_streaming_with_tools` flow added to all 12 provider hello samples. | **DONE** | Uses shared `generate_streaming_with_tools_logic` | -| **G7** | Custom middleware demo -- `framework-middleware-demo` with retry, logging, and chained middleware. | **DONE** | 3 flows: `logging_demo`, `request_modifier_demo`, `chained_middleware_demo` | -| **N6** | Rename `firestore-retreiver` to `firestore-retriever` (typo fix). Now `provider-firestore-retriever`. | **DONE** | Directory renamed | - ---- - -### Phase 1: Dotprompt Completion + Eval + Hello Consistency - -G2-G5 are all unblocked by B1b. G8, H1, H2 are independent leaves placed here -for workload balancing. 
- -| Task | Description | Depends On | Status | -|------|-------------|------------|--------| -| **G2** | Dotprompt: tool calling in prompts -- add a `.prompt` file with `tools: [search, calculate]` in frontmatter, plus a flow that loads and runs it. | B1b | Pending | -| **G3** | Dotprompt: define prompts in code -- add `ai.define_prompt()` usage (no `.prompt` file, purely programmatic). | B1b | Pending | -| **G4** | Dotprompt: multimodal prompts -- add a `.prompt` file using `{{media url=photoUrl}}` helper with image input schema. | B1b | Pending | -| **G5** | Dotprompt: default input values -- add `default:` section to an existing or new `.prompt` file. | B1b | Pending | -| **G8** | Eval pipeline sample -- end-to-end evaluation: define a custom evaluator, prepare a dataset, run inference-based eval, report results. | -- | Pending | -| **H1** | Add `generate_with_system_prompt` flow to 2 remaining samples: `web-short-n-long`, `provider-vertex-ai-model-garden`. | -- | 4/6 Done | -| **H2** | Add `generate_multi_turn_chat` flow to 2 remaining samples: `web-short-n-long`, `provider-vertex-ai-model-garden`. | -- | 4/6 Done | - -**Parallelizable:** G2-G5 are independent of each other (all just need B1b). -G8, H1, H2 are independent of everything. - ---- - -### Phase 2: RAG and Eval Extras - -Lower-priority items that round out coverage for `rag.mdx` and `evaluation.mdx`. -N3 depends on G8. All others are independent leaves. - -| Task | Description | Depends On | Status | -|------|-------------|------------|--------| -| **N1** | Simple retriever -- `ai.define_simple_retriever()` equivalent if SDK supports it, or a minimal custom retriever pattern. | -- | Pending | -| **N2** | Custom reranker -- `ai.define_reranker()` with custom scoring logic. | -- | Pending | -| **N3** | Data synthesis -- generate test questions from documents using an LLM. | G8 | Pending | -| **N4** | Indexer sample -- document ingestion pipeline: chunk PDFs, generate embeddings, store in vector DB. | -- | Pending | -| **N7** | Firebase Functions sample -- Python Cloud Functions deployment with Genkit. | -- | Pending | - ---- - -### Phase 3: Polish - -DevUI gallery depends on most features being in place so it can showcase them all. - -| Task | Description | Depends On | Status | -|------|-------------|------------|--------| -| **N5** | DevUI gallery -- a single sample that showcases all DevUI features: prompts, flows, tools, evaluators, structured output, streaming, context, middleware. 
| G1, G2, G3, G6, G7, G8, H1 | Pending | - ---- - -### Execution Timeline - -``` -TIME --> -========================================================================== - -P0: [B1a: remove recipe.robot.prompt] DONE - [B1b: fix SDK lazy loading bug] BLOCKED (firebase/genkit#4491) - [G1: context demo] DONE - [G6: streaming+tools] DONE - [G7: custom middleware] DONE - [N6: typo fix] DONE - (5 of 6 P0 tasks complete; B1b awaits SDK fix) - | - --- P0 partially complete (B1b on critical path) --- - | -SDK: [S1: fix plugin structlog blowaway ~~~~] (HIGH - 5 plugins) - [S2: fix awarn protocol gap ~~~~~~~~~~~] (LOW) - [S3: fix ToolRunContext sole param ~~~~] (MEDIUM - #4492) - [S4: fix lazy loading recursion ~~~~~~~] (MEDIUM - #4491, same as B1b) - (all independent, each a separate PR) - | -P1: [G2: dotprompt tools ~~] [G8: eval pipeline ~~~~~~] - [G3: dotprompt code ~~~] [H1: system_prompt x2 ~~~] - [G4: dotprompt media ~~] [H2: multi_turn x2 ~~~~~~] - [G5: dotprompt defaults] - (G2-G5 blocked by B1b/S4; G8/H1/H2 ready now) - | - --- all P1 complete --- - | -P2: [N1: simple retriever ~~~] [N4: indexer ~~~~~~~~] - [N2: custom reranker ~~~~] [N7: firebase funcs ~] - [N3: data synthesis ~~~~~~~~~~] - (N1/N2/N4/N7 parallel, N3 after G8) - | - --- all P2 complete --- - | -P3: [N5: DevUI gallery ~~~~~~~~~~~~~~] - | - === SAMPLE PARITY COMPLETE === -``` - ---- - -### Progress Summary - -| Phase | Tasks | Done | Remaining | Blockers | -|-------|-------|------|-----------|----------| -| **P0** | B1a, B1b, G1, G6, G7, N6 | 5/6 | B1b | firebase/genkit#4491 | -| **P1** | G2, G3, G4, G5, G8, H1, H2 | 0/7 (H1 4/6, H2 4/6) | All | G2-G5 blocked by B1b | -| **P2** | N1, N2, N3, N4, N7 | 0/5 | All | N3 blocked by G8 | -| **P3** | N5 | 0/1 | All | Broad P0-P2 deps | -| **SDK** | S1, S2, S3, S4 | 0/4 | All | Separate PRs needed | -| **Total** | 23 tasks | ~5.5 | ~17.5 | 1 SDK bug + 4 SDK fixes | - ---- - -## 7. Pending SDK / Infrastructure Fixes (Separate PRs) - -Issues discovered during the sample consolidation and logging refactoring. -These should NOT be fixed in the samples PR -- each needs its own PR touching -core SDK or plugin code. - -### SDK Bugs - -| ID | Severity | Description | Affected Code | Notes | -|----|----------|-------------|---------------|-------| -| ~~**S1**~~ | ~~**HIGH**~~ | ~~**Observability plugins blow away structlog config.** Five plugins call `structlog.configure(processors=new_processors)` with *only* the `processors` kwarg. Since `structlog.configure()` is a full-replace (not partial-update), this resets `wrapper_class`, `logger_factory`, `cache_logger_on_first_use` back to defaults -- silently destroying any custom structlog setup (e.g. the `setup_sample()` stdlib integration). **Fix:** Use `structlog.configure(**{**structlog.get_config(), 'processors': new_processors})` to preserve the full config.~~ | ~~All 5 plugins~~ | ✅ Done — all 5 plugins fixed | -| **S2** | LOW | **`awarn` gap in `Logger` protocol.** `genkit.core.logging.Logger` declares `awarn()` but `structlog.stdlib.BoundLogger` only has `awarning` (no `awarn` alias). Calling `logger.awarn(...)` would raise `AttributeError`. Previously masked because `make_filtering_bound_logger` dynamically creates all method names. **Fix:** Either remove `awarn`/`warn` from the protocol, or add runtime aliases. 
| `py/packages/genkit/src/genkit/core/logging.py` | Only matters if `awarn` is actually called somewhere | -| **S3** | MEDIUM | **`ToolRunContext` as sole parameter crashes with `PydanticSchemaGenerationError`.** When a `@ai.tool()` has `ToolRunContext` as its only parameter, the SDK tries to create a `TypeAdapter` for it (which fails) and would also dispatch the tool input instead of the context at runtime. **Workaround:** Use `Genkit.current_context()` with zero-arg tools. | `py/packages/genkit/src/genkit/core/action/_action.py` (lines 493-494, 592-598), `py/packages/genkit/src/genkit/ai/_registry.py` (lines 555-565) | Tracked at [firebase/genkit#4492](https://github.com/firebase/genkit/issues/4492) | -| **S4** | MEDIUM | **SDK lazy loading infinite recursion for `.variant.prompt` files.** `create_prompt_from_file()` self-references via `resolve_action_by_key()` before caching, causing `RecursionError`. | `py/packages/genkit/src/genkit/blocks/prompt.py` | Tracked at [firebase/genkit#4491](https://github.com/firebase/genkit/issues/4491) | - -### Sample Naming Convention - -All samples follow a consistent prefix scheme: - -| Prefix | Category | Examples | -|--------|----------|----------| -| `provider-` | Model provider-specific | `provider-google-genai-hello`, `provider-anthropic-hello`, `provider-vertex-ai-model-garden` | -| `framework-` | Genkit framework features | `framework-context-demo`, `framework-middleware-demo`, `framework-prompt-demo` | -| `web-` | Web framework integration | `web-flask-hello`, `web-multi-server`, `web-short-n-long` | -| (none) | Other | `dev-local-vectorstore-hello` | diff --git a/py/engdoc/planning/FEATURE_MATRIX.md b/py/engdoc/planning/FEATURE_MATRIX.md deleted file mode 100644 index 0a3ee15239..0000000000 --- a/py/engdoc/planning/FEATURE_MATRIX.md +++ /dev/null @@ -1,448 +0,0 @@ -# Plugin Feasibility & Feature Matrix - -This document provides a comprehensive comparison of all proposed plugins to help -prioritize implementation efforts. 
- -## Executive Summary - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ PLUGIN PRIORITY RECOMMENDATION │ -├─────────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ PHASE 1 (Build Now) PHASE 2 (Consider) PHASE 3 (If Demanded) │ -│ ────────────────── ───────────────── ──────────────────── │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ azure │ │ cloudflare │ │ vercel │ │ -│ │ (telemetry) │ │ (telemetry) │ │ (helpers) │ │ -│ │ Score: 92/100 │ │ Score: 75/100 │ │ Score: 55/100 │ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -│ │ -│ ┌─────────────────┐ │ -│ │ cloudflare-ai │ │ -│ │ (models) │ │ -│ │ Score: 88/100 │ │ -│ └─────────────────┘ │ -│ │ -│ ┌─────────────────┐ │ -│ │ observability │ ← NEW │ -│ │ (3rd party) │ │ -│ │ Score: 89/100 │ │ -│ └─────────────────┘ │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - ---- - -## Part 1: Model/AI Plugins - -### Feature Comparison - -| Feature | amazon-bedrock ❶ | google-genai ❶ | microsoft-foundry ❶ | cloudflare-ai ❷ | -|---------|---------------|----------------|-------------|-----------------| -| **Text Generation** | ✅ | ✅ | ✅ | ✅ | -| **Streaming (SSE)** | ✅ | ✅ | ✅ | ✅ | -| **Tool/Function Calling** | ✅ | ✅ | ✅ | ✅ (Llama 3+) | -| **Embeddings** | ✅ | ✅ | ✅ | ✅ (BGE) | -| **Image Generation** | ✅ (Nova) | ✅ (Imagen) | ✅ (DALL-E) | ✅ (Flux, SD) | -| **Image Understanding** | ✅ | ✅ | ✅ | ✅ (Llama 4) | -| **Speech-to-Text** | ✅ | ✅ | ✅ (Whisper) | ✅ (Whisper) | -| **Text-to-Speech** | ✅ | ✅ | ✅ | ❌ | -| **Video Generation** | ❌ | ✅ (Veo) | ❌ | ❌ | -| **Audio Generation** | ❌ | ✅ (Lyria) | ❌ | ❌ | - -❶ = Already implemented -❷ = Proposed - -### Model Availability - -| Provider | Models | Notable Models | -|----------|--------|----------------| -| **AWS Bedrock** | 20+ | Claude 3.5, Llama 3, Nova, Titan | -| **Google GenAI** | 10+ | Gemini 2, Imagen, Veo, Lyria | -| **MS Foundry** | 11,000+ | GPT-4o, Claude, Llama, Mistral | -| **Cloudflare AI** | 50+ | Llama 4, Mistral, Flux, Whisper | -| **Vercel AI Gateway** | Pass-through | Any via OpenAI/Anthropic API | - -### Implementation Complexity - -| Plugin | API Type | Auth | SDK | Complexity | -|--------|----------|------|-----|------------| -| **amazon-bedrock** ❶ | Converse API | IAM/Keys | boto3 | Medium | -| **google-genai** ❶ | REST/gRPC | API Key/ADC | google-genai | Medium | -| **microsoft-foundry** ❶ | OpenAI-compat | API Key | openai | Low | -| **cloudflare-ai** ❷ | REST | API Token | httpx | Low-Medium | - -### Cloudflare AI Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 9/10 | Excellent, clear examples | -| **Python Support** | 7/10 | REST API, no official SDK | -| **Model Variety** | 9/10 | 50+ models across categories | -| **Streaming Support** | 9/10 | Native SSE for all LLMs | -| **Tool Calling** | 8/10 | Supported on Llama 3+ | -| **Community Demand** | 7/10 | Growing edge AI market | -| **Maintenance Burden** | 8/10 | Simple REST, few breaking changes | -| **Strategic Value** | 9/10 | Edge computing differentiator | -| **TOTAL** | **88/100** | ✅ **BUILD** | - ---- - -## Part 2: Telemetry Plugins - -### Architecture Overview - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ TELEMETRY PLUGIN ARCHITECTURE │ -├─────────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ NATIVE PLATFORM 
BACKENDS THIRD-PARTY BACKENDS │ -│ ──────────────────────── ──────────────────── │ -│ │ -│ ┌─────────┐ ┌─────────┐ ┌─────────────────────────┐ │ -│ │ aws │ │ google- │ │ observability │ │ -│ │ │ │ cloud │ │ │ │ -│ │ • SigV4 │ │ • ADC │ │ • Sentry │ │ -│ │ • X-Ray │ │ • Trace │ │ • Honeycomb │ │ -│ │ • CW │ │ • Logs │ │ • Datadog │ │ -│ └────┬────┘ └────┬────┘ │ • Grafana │ │ -│ │ │ │ • Axiom │ │ -│ ▼ ▼ │ • Custom OTLP │ │ -│ ┌─────────┐ ┌─────────┐ └───────────┬─────────────┘ │ -│ │ X-Ray │ │ Cloud │ │ │ -│ │ Console │ │ Trace │ ▼ │ -│ └─────────┘ └─────────┘ ┌─────────────────────────┐ │ -│ │ Any OTLP Backend │ │ -│ ┌─────────┐ │ (Sentry, Honeycomb, │ │ -│ │ azure │ │ Datadog, etc.) │ │ -│ │ │ └─────────────────────────┘ │ -│ │ • Distro│ │ -│ │ • Live │ CAN'T BE REPLICATED CAN BE REPLICATED │ -│ │ • Map │ WITH GENERIC OTLP WITH GENERIC OTLP │ -│ └────┬────┘ │ -│ │ │ -│ ▼ │ -│ ┌─────────┐ │ -│ │ App │ │ -│ │Insights │ │ -│ └─────────┘ │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - -### When to Use What - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ TELEMETRY PLUGIN DECISION GUIDE │ -├─────────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ "I'm on AWS and want X-Ray" → aws plugin (SigV4, X-Ray format) │ -│ "I'm on GCP and want Cloud Trace" → google-cloud plugin (ADC) │ -│ "I'm on Azure and want App Insights" → azure plugin (Live Metrics, Map) │ -│ │ -│ "I'm on AWS but want Honeycomb" → observability plugin (just OTLP) │ -│ "I'm on GCP but want Sentry" → observability plugin (just OTLP) │ -│ "I'm multi-cloud, want Datadog" → observability plugin (just OTLP) │ -│ "I don't care, just give me traces" → observability plugin (just OTLP) │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - -### Feature Comparison - -| Feature | aws ❶ | google-cloud ❶ | azure ❷ | observability ❷ | cloudflare ❷ | vercel ❷ | -|---------|-------|----------------|---------|-----------------|--------------|----------| -| **Distributed Tracing** | ✅ X-Ray | ✅ Cloud Trace | ✅ App Insights | ✅ Any OTLP | ⚠️ 3rd party | ⚠️ 3rd party | -| **Structured Logging** | ✅ CloudWatch | ✅ Cloud Logging | ✅ App Insights | ⚠️ Via backend | ✅ Logpush | ⚠️ 3rd party | -| **Metrics** | ✅ CloudWatch | ✅ Cloud Monitoring | ✅ App Insights | ⚠️ Via backend | ⚠️ Workers Analytics | ⚠️ 3rd party | -| **Live Metrics** | ❌ | ❌ | ✅ Built-in | ❌ | ❌ | ❌ | -| **Application Map** | ❌ | ❌ | ✅ Built-in | ❌ | ❌ | ❌ | -| **Log-Trace Correlation** | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | -| **Auto-Instrumentation** | ⚠️ Manual | ⚠️ Manual | ✅ Distro | ⚠️ Manual | ⚠️ AI Gateway | ⚠️ Manual | -| **Sentry Support** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ Native | ✅ OTLP | -| **Honeycomb Support** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ Native | ✅ OTLP | -| **Datadog Support** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ Native | ✅ OTLP | - -❶ = Already implemented -❷ = Proposed - -### Third-Party Backend Support - -| Backend | aws | google-cloud | azure | cloudflare | vercel | -|---------|-----|--------------|-------|------------|--------| -| **Sentry** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | -| **Honeycomb** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | -| **Datadog** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | -| **Grafana Cloud** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | -| **Axiom** | ✅ OTLP | ✅ OTLP | ✅ OTLP | ✅ Native | ✅ OTLP | - -### Implementation Approach - -| 
Plugin | Approach | SDK | Setup Complexity | -|--------|----------|-----|------------------| -| **aws** ❶ | Custom OTLP + SigV4 | opentelemetry-* | Medium | -| **google-cloud** ❶ | Custom OTLP | opentelemetry-* | Medium | -| **azure** ❷ | Official Distro | azure-monitor-opentelemetry | **Very Low** | -| **cloudflare** ❷ | Presets for 3rd party | opentelemetry-* | Low | -| **vercel** ❷ | Standard OTLP | opentelemetry-* | Low | - -### Azure Telemetry Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 10/10 | Microsoft official docs | -| **Python Support** | 10/10 | Official SDK with distro | -| **Setup Simplicity** | 10/10 | One-liner `configure_azure_monitor()` | -| **Feature Richness** | 9/10 | Live Metrics, App Map included | -| **Community Demand** | 9/10 | Enterprise Azure users | -| **Maintenance Burden** | 9/10 | Microsoft maintains SDK | -| **Strategic Value** | 9/10 | Pairs with microsoft-foundry plugin | -| **TOTAL** | **92/100** | ✅ **BUILD NOW** | - -### Cloudflare Telemetry Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 8/10 | Good Workers OTEL docs | -| **Python Support** | 6/10 | REST API, standard OTEL | -| **Setup Simplicity** | 7/10 | Dashboard config + code | -| **Feature Richness** | 8/10 | AI Gateway auto-traces | -| **Community Demand** | 7/10 | Growing edge users | -| **Maintenance Burden** | 8/10 | Standard OTEL patterns | -| **Strategic Value** | 8/10 | Pairs with cloudflare-ai | -| **TOTAL** | **75/100** | ⚠️ **CONSIDER** | - -### Observability Plugin Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 9/10 | Standard OTLP, well-documented | -| **Python Support** | 10/10 | Official opentelemetry-python | -| **Setup Simplicity** | 9/10 | One function call with preset | -| **Feature Coverage** | 8/10 | Traces + basic metrics | -| **Community Demand** | 9/10 | Common request for 3rd party | -| **Maintenance Burden** | 9/10 | Stable OTLP protocol | -| **Strategic Value** | 8/10 | Platform-agnostic option | -| **TOTAL** | **89/100** | ✅ **BUILD** | - -### Vercel Telemetry Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 6/10 | Node.js focused | -| **Python Support** | 5/10 | No Vercel-specific SDK | -| **Setup Simplicity** | 7/10 | Standard OTEL works | -| **Feature Richness** | 5/10 | No unique features | -| **Community Demand** | 6/10 | Python on Vercel growing | -| **Maintenance Burden** | 8/10 | Standard OTEL patterns | -| **Strategic Value** | 5/10 | No Vercel AI plugin needed | -| **TOTAL** | **55/100** | ⚠️ **IF DEMANDED** | - ---- - -## Part 3: Effort vs Impact Matrix - -``` - IMPACT - Low High - ┌───────────┬───────────┐ - Low │ │ azure │ ← Quick wins - │ vercel │cloudflare │ - EFFORT │ │ (both) │ - ├───────────┼───────────┤ - High │ │ │ - │ │ │ - │ │ │ - └───────────┴───────────┘ -``` - -### Effort Estimates - -| Plugin | Estimated Days | Dependencies | -|--------|---------------|--------------| -| **azure** | 5-7 days | azure-monitor-opentelemetry (official) | -| **observability** | 5-7 days | opentelemetry-* | -| **cloudflare-ai** | 10-15 days | httpx, pydantic | -| **cloudflare** (telemetry) | 5-7 days | opentelemetry-* | -| **vercel** | 3-5 days | opentelemetry-* | - -### Impact Factors - -| Plugin | New Users | Ecosystem Fit | Differentiation | -|--------|-----------|---------------|-----------------| -| **azure** | High (enterprise) | Pairs with 
microsoft-foundry | Good | -| **cloudflare-ai** | Medium (edge) | New market | High | -| **cloudflare** | Medium | Pairs with cloudflare-ai | Medium | -| **vercel** | Low | Standalone | Low | - ---- - -## Part 4: Dependencies & Risk Analysis - -### External Dependencies - -| Plugin | Key Dependencies | Risk Level | -|--------|-----------------|------------| -| **azure** | azure-monitor-opentelemetry | ✅ Low (Microsoft maintained) | -| **cloudflare-ai** | httpx | ✅ Low (stable library) | -| **cloudflare** | opentelemetry-* | ✅ Low (CNCF standard) | -| **vercel** | opentelemetry-* | ✅ Low (CNCF standard) | - -### API Stability Risk - -| Plugin | API Stability | Breaking Change Risk | -|--------|--------------|---------------------| -| **azure** | ✅ Stable | Low - versioned SDK | -| **cloudflare-ai** | ⚠️ Evolving | Medium - new models added | -| **cloudflare** | ✅ Stable | Low - standard OTEL | -| **vercel** | ✅ Stable | Low - standard OTEL | - -### Maintenance Burden - -| Plugin | Ongoing Maintenance | Reason | -|--------|-------------------|--------| -| **azure** | Low | Microsoft maintains SDK | -| **cloudflare-ai** | Medium | New models, config updates | -| **cloudflare** | Low | Standard OTEL patterns | -| **vercel** | Very Low | Just URL helpers | - ---- - -## Part 5: Final Recommendations - -### Priority Order - -| Priority | Plugin | Score | Action | Timeline | -|----------|--------|-------|--------|----------| -| **1** | azure (telemetry) | 92/100 | ✅ Build Now | 1 week | -| **2** | observability (3rd party) | 89/100 | ✅ Build Now | 1 week | -| **3** | cloudflare-ai (models) | 88/100 | ✅ Build Now | 2-3 weeks | -| **4** | cloudflare (telemetry) | 75/100 | ⚠️ Consider | 1 week | -| **5** | vercel (helpers) | 55/100 | ⚠️ If Demanded | 3-5 days | - -### Rationale - -**1. Azure Telemetry (Priority 1)** -- Official Microsoft SDK with one-liner setup -- Pairs naturally with existing `microsoft-foundry` plugin -- High enterprise demand -- Very low implementation effort - -**2. Observability Plugin (Priority 2)** -- Platform-agnostic third-party backend support -- One plugin for Sentry, Honeycomb, Datadog, Grafana, Axiom -- Common user request -- Uses stable OTLP protocol - -**3. Cloudflare AI (Priority 3)** -- Growing edge AI market -- 50+ models including latest Llama 4 -- Clear REST API -- Differentiates Genkit in edge computing space - -**4. Cloudflare Telemetry (Priority 4)** -- Pairs with cloudflare-ai plugin -- Good third-party backend support (via observability plugin) -- AI Gateway auto-traces are valuable -- Lower priority since observability plugin covers 3rd party - -**5. 
Vercel (Priority 5)** -- Python works on Vercel, but no unique features -- AI Gateway = just URL change -- Standard OTEL works fine -- Build only if users explicitly request - -### What NOT to Build - -| Plugin | Reason | -|--------|--------| -| Vercel AI SDK wrapper | JS/TS only, use existing plugins | -| Vercel OTEL package | Node.js only, standard OTEL works | -| Generic OTEL presets | Too generic, not Genkit-specific value | - ---- - -## Appendix: Complete Feature Matrix - -### AI/Model Capabilities - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ AI/MODEL FEATURE MATRIX │ -├──────────────────────┬─────────┬──────────┬─────────┬─────────────┬────────────┤ -│ Feature │ AWS │ Google │ Azure │ Cloudflare │ Vercel │ -│ │ Bedrock │ GenAI │ Foundry │ Workers AI │ AI Gateway │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼────────────┤ -│ Text Generation │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ proxy │ -│ Streaming │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ proxy │ -│ Tool Calling │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ proxy │ -│ Structured Output │ ✅ │ ✅ │ ✅ │ ⚠️ partial │ ✅ proxy │ -│ Embeddings │ ✅ │ ✅ │ ✅ │ ✅ │ ❌ │ -│ Image Generation │ ✅ │ ✅ │ ✅ │ ✅ │ ❌ │ -│ Image Understanding │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ proxy │ -│ Speech-to-Text │ ✅ │ ✅ │ ✅ │ ✅ │ ❌ │ -│ Text-to-Speech │ ✅ │ ✅ │ ✅ │ ❌ │ ❌ │ -│ Video Generation │ ❌ │ ✅ │ ❌ │ ❌ │ ❌ │ -│ Audio Generation │ ❌ │ ✅ │ ❌ │ ❌ │ ❌ │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼────────────┤ -│ Python SDK │ ✅ boto3│ ✅ google│ ✅ openai│ ❌ REST │ ✅ openai │ -│ Auth Method │ IAM/Key │ Key/ADC │ API Key │ API Token │ API Key │ -│ Regional │ ✅ │ ✅ │ ✅ │ ❌ Global │ ❌ Global │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼────────────┤ -│ STATUS │ ✅ DONE │ ✅ DONE │ ✅ DONE │ 📋 PLANNED │ ❌ SKIP │ -└──────────────────────┴─────────┴──────────┴─────────┴─────────────┴────────────┘ -``` - -### Telemetry Capabilities - -``` -┌──────────────────────────────────────────────────────────────────────────────────────────────┐ -│ TELEMETRY FEATURE MATRIX │ -├──────────────────────┬─────────┬──────────┬─────────┬─────────────┬─────────────┬───────────┤ -│ Feature │ AWS │ GCP │ Azure │ observ. 
│ Cloudflare │ Vercel │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼─────────────┼───────────┤ -│ Native Trace Backend │ ✅ X-Ray│ ✅ Trace │ ✅ Insght│ ❌ (3rd pty)│ ❌ (3rd pty)│ ❌ (3rd) │ -│ Distributed Tracing │ ✅ │ ✅ │ ✅ │ ✅ any OTLP │ ✅ export │ ✅ export │ -│ Structured Logging │ ✅ │ ✅ │ ✅ │ ⚠️ backend │ ✅ Logpush │ ⚠️ backend│ -│ Metrics │ ✅ │ ✅ │ ✅ │ ⚠️ backend │ ⚠️ basic │ ❌ │ -│ Live Metrics │ ❌ │ ❌ │ ✅ │ ❌ │ ❌ │ ❌ │ -│ Application Map │ ❌ │ ❌ │ ✅ │ ❌ │ ❌ │ ❌ │ -│ Log-Trace Correlation│ ✅ │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ │ -│ Auto-Instrumentation │ ⚠️ │ ⚠️ │ ✅ │ ⚠️ manual │ ⚠️ AI only │ ❌ │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼─────────────┼───────────┤ -│ Sentry Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -│ Honeycomb Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -│ Datadog Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -│ Grafana Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -│ Axiom Export │ ✅ OTLP │ ✅ OTLP │ ✅ OTLP │ ✅ PRESET │ ✅ Native │ ✅ OTLP │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼─────────────┼───────────┤ -│ Official Python SDK │ ❌ manual│ ❌ manual│ ✅ distro│ ✅ OTEL │ ❌ REST │ ❌ manual │ -│ Setup Complexity │ Medium │ Medium │ Very Low│ Very Low │ Low │ Low │ -├──────────────────────┼─────────┼──────────┼─────────┼─────────────┼─────────────┼───────────┤ -│ STATUS │ ✅ DONE │ ✅ DONE │ 📋 PLAN │ 📋 PLAN │ 📋 CONSIDER │ ⚠️ DEFER │ -└──────────────────────┴─────────┴──────────┴─────────┴─────────────┴─────────────┴───────────┘ -``` - ---- - -## Decision: What to Attack - -Based on this analysis: - -### ✅ Build Now (Q1 2026) - -1. **azure** - 92/100 score, 1 week effort, high enterprise value -2. **observability** - 89/100 score, 1 week effort, platform-agnostic 3rd party -3. **cloudflare-ai** - 88/100 score, 2-3 weeks effort, edge differentiator - -### ⚠️ Consider (Q2 2026) - -4. **cloudflare** (telemetry) - 75/100 score, pairs with cloudflare-ai - -### ⏸️ Defer - -5. **vercel** - 55/100 score, build only if explicitly requested diff --git a/py/engdoc/planning/README.md b/py/engdoc/planning/README.md deleted file mode 100644 index 34ae8141ef..0000000000 --- a/py/engdoc/planning/README.md +++ /dev/null @@ -1,121 +0,0 @@ -# Plugin Implementation Plans - -This directory contains detailed implementation plans for proposed Genkit plugins. - -## Summary Table - -| Plugin | Type | Feasibility | Effort | Priority | Status | -|--------|------|-------------|--------|----------|--------| -| **azure** | Telemetry | ✅ HIGH | 1 week | High | Ready | -| **observability** | Telemetry | ✅ HIGH | 1 week | High | Ready | -| **cloudflare-ai** | Model | ✅ HIGH | 2-3 weeks | High | Ready | -| **cloudflare** | Telemetry | ⚠️ MEDIUM-HIGH | 1-2 weeks | Medium | Consider | -| **vercel** | Combined | ⚠️ MEDIUM | 1 week | Low | If demanded | - -> **Note:** The `observability` plugin provides presets for Sentry, Honeycomb, Datadog, -> Grafana, and Axiom. It complements platform plugins (aws, google-cloud, azure) for -> users who prefer third-party backends. - -## Detailed Plans - -### Ready for Implementation - -1. **[azure-telemetry-plugin.md](./azure-telemetry-plugin.md)** - Azure Application Insights - - Official Microsoft OTEL distro - - One-liner setup with `configure_azure_monitor()` - - Live metrics, application map, log correlation - -2. 
**[observability-plugin.md](./observability-plugin.md)** - Third-Party Backends - - Presets for Sentry, Honeycomb, Datadog, Grafana, Axiom - - Platform-agnostic OTLP export - - One function call setup - -3. **[cloudflare-ai-plugin.md](./cloudflare-ai-plugin.md)** - Cloudflare Workers AI - - 50+ models at the edge (Llama, Mistral, Flux, etc.) - - Streaming, tool calling, embeddings - - REST API with simple auth - -### Consider Building - -4. **[cloudflare-telemetry-plugin.md](./cloudflare-telemetry-plugin.md)** - Cloudflare Telemetry - - No native backend, but exports to Sentry, Honeycomb, Datadog, etc. - - AI Gateway auto-exports AI traces to third-party backends - - Recommend: Plugin with presets for common backends - -5. **[vercel-plugins.md](./vercel-plugins.md)** - Vercel AI & Telemetry - - Python DOES work on Vercel (FastAPI, Flask) - - AI SDK and @vercel/otel are JS-only, but AI Gateway + standard OTEL work - - Recommend: Build simple helper plugin if user demand exists - -## Feasibility Criteria - -### ✅ HIGH Feasibility -- Official SDK/library available -- Clear API documentation -- Python support confirmed -- Similar patterns to existing plugins - -### ⚠️ MEDIUM Feasibility -- REST API available but no SDK -- Limited Python-specific documentation -- Workarounds required -- May have feature gaps - -### ❌ LOW Feasibility -- No Python support -- Platform-specific (JS/Node only) -- Would duplicate existing functionality -- Not worth maintenance overhead - -## Implementation Priority - -### Phase 1 (Immediate) -1. **azure** - Strong enterprise demand, official OTEL support -2. **observability** - Platform-agnostic, Sentry/Honeycomb/Datadog presets -3. **cloudflare-ai** - Growing edge AI market, good REST API - -### Phase 2 (Consider) -4. **cloudflare** (telemetry) - AI Gateway integration, pairs with cloudflare-ai - -### Phase 3 (If Demanded) -5. **vercel** - Simple helper plugin if users request it - -## Architecture Patterns - -All plugins should follow these patterns from existing implementations: - -### Model Plugins (like `amazon-bedrock`, `microsoft-foundry`) -``` -plugins/{name}/ -├── src/genkit/plugins/{name}/ -│ ├── __init__.py # ELI5 docs, exports -│ ├── typing.py # Config schemas per model -│ ├── models/ -│ │ ├── model.py # Base model implementation -│ │ └── {family}.py # Model-specific configs -│ └── embedders/ # If applicable -└── tests/ -``` - -### Telemetry Plugins (like `aws`, `google-cloud`) -``` -plugins/{name}/ -├── src/genkit/plugins/{name}/ -│ ├── __init__.py # ELI5 docs, exports -│ ├── telemetry/ -│ │ ├── __init__.py -│ │ └── tracing.py # Manager class -│ └── typing.py # Config schemas -└── tests/ -``` - -## Documentation Requirements - -All plugins must include: - -1. **ELI5 Concepts Table** - In module docstring -2. **Data Flow Diagram** - ASCII art showing architecture -3. **README.md** - Setup instructions, examples -4. **Sample Application** - In `samples/{name}-hello/` - -See [GEMINI.md](../../GEMINI.md) for full documentation requirements. 
diff --git a/py/engdoc/planning/azure-telemetry-plugin.md b/py/engdoc/planning/azure-telemetry-plugin.md deleted file mode 100644 index 748a5229c2..0000000000 --- a/py/engdoc/planning/azure-telemetry-plugin.md +++ /dev/null @@ -1,452 +0,0 @@ -# Azure Telemetry Plugin Implementation Plan - -**Status:** Ready for Implementation -**Feasibility:** ✅ HIGH -**Estimated Effort:** Medium (1-2 weeks) -**Dependencies:** `azure-monitor-opentelemetry`, `opentelemetry-sdk` - -## Overview - -The `azure` plugin exports Genkit telemetry to Azure Monitor (Application Insights), -providing distributed tracing, logging, and metrics for Azure-hosted applications. - -``` -┌─────────────────────────────────────────────────────────────────────────┐ -│ AZURE TELEMETRY PLUGIN ARCHITECTURE │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────┐ │ -│ │ Azure Monitor │ Microsoft's observability platform. See │ │ -│ │ │ traces, logs, metrics in Azure Portal. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Application Insights│ The part of Azure Monitor for apps. │ │ -│ │ │ Tracks requests, dependencies, exceptions. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Connection String │ Your key to send data. Found in Azure │ │ -│ │ │ Portal > App Insights > Connection String. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Azure Monitor Distro│ Microsoft's "batteries included" OTEL │ │ -│ │ │ package. One line to enable everything. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Live Metrics │ Real-time view of your app's health. │ │ -│ │ │ See requests as they happen! │ │ -│ └─────────────────────┴────────────────────────────────────────────┘ │ -│ │ -│ Data Flow: │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ Azure Monitor │────▶│ Application │ │ -│ │ (Your Code) │ │ OTEL Distro │ │ Insights │ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -│ │ │ │ │ -│ │ │ │ │ -│ │ ▼ ▼ │ -│ │ ┌─────────────────────────────────────────┐ │ -│ │ │ Azure Portal │ │ -│ │ │ ┌─────────┐ ┌─────────┐ ┌─────────┐ │ │ -│ │ │ │ Traces │ │ Logs │ │ Metrics │ │ │ -│ │ │ │ (E2E) │ │ (Query) │ │ (Charts)│ │ │ -│ │ │ └─────────┘ └─────────┘ └─────────┘ │ │ -│ │ └─────────────────────────────────────────┘ │ -│ │ │ -│ │ ┌─────────────────┐ │ -│ └───▶│ structlog │──── Logs with trace correlation │ -│ │ integration │ │ -│ └─────────────────┘ │ -└─────────────────────────────────────────────────────────────────────────┘ -``` - -## Azure Monitor OpenTelemetry Distro - -Microsoft provides an official "batteries included" package that handles everything: - -```bash -pip install azure-monitor-opentelemetry -``` - -**Version:** 1.8.5 (January 2026) -**Python Support:** 3.9 - 3.14 - -### What It Includes - -- **Azure Monitor exporters** - Send data to Application Insights -- **Auto-instrumentation** - HTTP, database, and framework libraries -- **Trace correlation** - Links traces across services -- **Live Metrics** - Real-time monitoring stream - -## Implementation - -### Core Plugin Class - -```python -"""Azure telemetry plugin for Genkit. - -Exports traces, logs, and metrics to Azure Monitor (Application Insights). 
- -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Azure Monitor │ Microsoft's observability platform. Like a │ - │ │ dashboard showing your app's vital signs. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Application Insights│ The part that tracks your app specifically. │ - │ │ Requests, errors, dependencies, performance. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Connection String │ Your unique key to send telemetry. Like an │ - │ │ address where your data should be delivered. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Live Metrics │ Real-time view of requests as they happen. │ - │ │ Like watching a live scoreboard. │ - └─────────────────────┴────────────────────────────────────────────────┘ - -Data Flow:: - - ┌─────────────────────────────────────────────────────────────────────┐ - │ HOW AZURE TELEMETRY WORKS │ - │ │ - │ Your Genkit App │ - │ │ │ - │ │ (1) Initialize AzureTelemetry │ - │ ▼ │ - │ ┌─────────────────┐ │ - │ │ AzureTelemetry │ Configures OTEL with Azure exporters │ - │ │ (Manager) │ │ - │ └────────┬────────┘ │ - │ │ │ - │ │ (2) Auto-instruments your code │ - │ ▼ │ - │ ┌─────────────────┐ ┌─────────────────┐ │ - │ │ TracerProvider │────▶│ AzureMonitor │ │ - │ │ (OTEL) │ │ Exporter │ │ - │ └─────────────────┘ └────────┬────────┘ │ - │ │ │ - │ ┌───────────────────────┼───────────────────────┐ │ - │ │ │ │ │ - │ ▼ ▼ ▼ │ - │ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────┐ │ - │ │ Traces │ │ Logs │ │ Metrics │ │ - │ │ (Distributed) │ │ (Structured) │ │ (Counters) │ │ - │ └─────────────────┘ └─────────────────┘ └─────────────┘ │ - │ │ │ │ │ - │ └───────────────────────┼───────────────────────┘ │ - │ │ │ - │ ▼ │ - │ ┌─────────────────────┐ │ - │ │ Application │ │ - │ │ Insights Portal │ │ - │ └─────────────────────┘ │ - └─────────────────────────────────────────────────────────────────────┘ - -Example:: - - from genkit.ai import Genkit - from genkit.plugins.microsoft_foundry import AzureTelemetry - from genkit.plugins.microsoft_foundry import MicrosoftFoundry - - # Initialize Azure telemetry - AzureTelemetry().initialize() - - ai = Genkit( - plugins=[MicrosoftFoundry()], - model='microsoft-foundry/gpt-4o', - ) -""" - -import os -import logging -from typing import Any, MutableMapping, Mapping - -import structlog -from azure.monitor.opentelemetry import configure_azure_monitor -from opentelemetry import trace -from opentelemetry.sdk.trace import TracerProvider - -from genkit.core.logging import get_logger - - -logger = get_logger(__name__) - - -class AzureTelemetry: - """Azure Monitor telemetry manager for Genkit applications. - - This class provides a centralized way to configure Azure Application Insights - telemetry, including distributed tracing, structured logging, and metrics. - - Args: - connection_string: Application Insights connection string. - Falls back to APPLICATIONINSIGHTS_CONNECTION_STRING env var. - service_name: Name of your service (appears in traces). - service_version: Version of your service. - enable_live_metrics: Enable real-time metrics stream. - log_level: Minimum log level to export. 
- - Example: - >>> telemetry = AzureTelemetry(service_name="my-genkit-app") - >>> telemetry.initialize() - """ - - def __init__( - self, - connection_string: str | None = None, - service_name: str = "genkit-app", - service_version: str = "1.0.0", - enable_live_metrics: bool = True, - log_level: int = logging.INFO, - ): - self.connection_string = ( - connection_string - or os.environ.get('APPLICATIONINSIGHTS_CONNECTION_STRING') - ) - self.service_name = service_name - self.service_version = service_version - self.enable_live_metrics = enable_live_metrics - self.log_level = log_level - - if not self.connection_string: - raise ValueError( - "Connection string required. Set APPLICATIONINSIGHTS_CONNECTION_STRING " - "or pass connection_string parameter." - ) - - def initialize(self) -> None: - """Initialize Azure Monitor telemetry. - - This method: - 1. Configures the Azure Monitor OpenTelemetry distro - 2. Sets up structured logging with trace correlation - 3. Enables live metrics if configured - """ - # Configure Azure Monitor (one-liner!) - configure_azure_monitor( - connection_string=self.connection_string, - service_name=self.service_name, - service_version=self.service_version, - enable_live_metrics=self.enable_live_metrics, - logger_name="", # Capture all loggers - ) - - # Configure structlog for trace correlation - self._configure_logging() - - logger.info( - "Azure telemetry initialized", - service_name=self.service_name, - live_metrics=self.enable_live_metrics, - ) - - def _configure_logging(self) -> None: - """Configure structlog to include Azure trace context.""" - processors = list(structlog.get_config().get("processors", [])) - - # Check if already configured - if any( - getattr(p, '__name__', '') == 'inject_azure_trace_context' - for p in processors - ): - return - - def inject_azure_trace_context( - _logger: Any, - method_name: str, - event_dict: MutableMapping[str, Any], - ) -> Mapping[str, Any]: - """Inject Azure trace context into log events.""" - span = trace.get_current_span() - if span and span.is_recording(): - ctx = span.get_span_context() - # Azure uses operation_Id and operation_ParentId - event_dict['operation_Id'] = format(ctx.trace_id, '032x') - event_dict['operation_ParentId'] = format(ctx.span_id, '016x') - return event_dict - - new_processors = list(processors) - new_processors.insert(max(0, len(new_processors) - 1), inject_azure_trace_context) - structlog.configure(processors=new_processors) -``` - -### Directory Structure - -``` -py/plugins/microsoft-foundry/ -├── pyproject.toml -├── README.md -├── LICENSE -├── src/genkit/plugins/microsoft-foundry/ -│ ├── __init__.py # Plugin entry, ELI5 docs, exports -│ ├── telemetry/ -│ │ ├── __init__.py -│ │ └── tracing.py # AzureTelemetry class -│ ├── typing.py # Configuration schemas -│ └── py.typed -└── tests/ - ├── conftest.py - └── azure_telemetry_test.py -``` - -### pyproject.toml - -```toml -[project] -name = "genkit-azure-plugin" -version = "0.1.0" -description = "Azure Monitor telemetry plugin for Genkit" -requires-python = ">=3.10" -dependencies = [ - "genkit", - "azure-monitor-opentelemetry>=1.8.0", - "structlog>=24.0.0", -] - -[project.optional-dependencies] -dev = [ - "pytest>=8.0.0", - "pytest-asyncio>=0.24.0", -] -``` - -## Configuration Options - -### Connection String - -Get from Azure Portal: -1. Go to your Application Insights resource -2. Click "Overview" or "Properties" -3. 
Copy the "Connection String" - -Format: -``` -InstrumentationKey=xxx;IngestionEndpoint=https://xxx.in.applicationinsights.azure.com/;LiveEndpoint=https://xxx.livediagnostics.monitor.azure.com/;ApplicationId=xxx -``` - -### Environment Variables - -| Variable | Required | Description | -|----------|----------|-------------| -| `APPLICATIONINSIGHTS_CONNECTION_STRING` | Yes | App Insights connection string | -| `AZURE_SDK_TRACING_IMPLEMENTATION` | No | Set to "opentelemetry" for SDK tracing | - -## Features - -### 1. Distributed Tracing - -Automatic tracing for: -- HTTP requests (incoming and outgoing) -- Database calls (via auto-instrumentation) -- Genkit flows, models, and tools -- Cross-service correlation - -### 2. Structured Logging - -Logs automatically include: -- `operation_Id` - Links logs to traces -- `operation_ParentId` - Parent span context -- Custom properties from structlog - -### 3. Live Metrics - -Real-time stream showing: -- Request rate -- Failure rate -- Response time -- Server health - -### 4. Application Map - -Visual diagram of: -- Service dependencies -- Call flows -- Performance bottlenecks - -## Sample Application - -```python -# py/samples/provider-microsoft-foundry-hello/src/main.py -"""Azure telemetry hello sample - Monitor Genkit with Application Insights. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Application Insights│ Microsoft's app monitoring. See traces, logs, │ - │ │ and metrics in Azure Portal. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Connection String │ Your key to send data. Find it in Azure │ - │ │ Portal > App Insights > Properties. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Live Metrics │ Real-time view of requests as they happen. │ - │ │ Great for debugging production issues! │ - └─────────────────────┴────────────────────────────────────────────────┘ -""" - -from genkit.ai import Genkit -from genkit.plugins.microsoft_foundry import AzureTelemetry -from genkit.plugins.microsoft_foundry import MicrosoftFoundry - -# Initialize Azure telemetry FIRST -AzureTelemetry( - service_name="microsoft-foundry-hello-sample", - enable_live_metrics=True, -).initialize() - -ai = Genkit( - plugins=[MicrosoftFoundry()], - model='microsoft-foundry/gpt-4o', -) - -@ai.flow() -async def say_hi(name: str) -> str: - """Say hello - traced in Application Insights.""" - response = await ai.generate(prompt=f"Say hi to {name}!") - return response.text -``` - -## Comparison with AWS/GCP Telemetry - -| Feature | AWS (`aws`) | GCP (`google-cloud`) | Azure (`azure`) | -|---------|-------------|---------------------|-----------------| -| Native Backend | X-Ray | Cloud Trace | Application Insights | -| OTEL Distro | Manual setup | Manual setup | ✅ Official distro | -| One-liner Setup | ❌ | ❌ | ✅ `configure_azure_monitor()` | -| Live Metrics | ❌ | ❌ | ✅ Built-in | -| Application Map | ❌ | ❌ | ✅ Built-in | -| Log Correlation | ✅ | ✅ | ✅ | -| Auto-instrumentation | Manual | Manual | ✅ Automatic | - -## Implementation Phases - -### Phase 1: Core Telemetry (3-4 days) - -1. Plugin skeleton with `AzureTelemetry` class -2. Integration with `azure-monitor-opentelemetry` -3. Structlog trace correlation -4. Basic tests - -### Phase 2: Sample & Docs (2-3 days) - -1. `microsoft-foundry-hello` sample application -2. README with setup instructions -3. 
Integration with `microsoft-foundry` plugin - -### Phase 3: Advanced Features (Optional) - -1. Custom metrics support -2. Exception tracking -3. Availability tests integration - -## Risks and Mitigations - -| Risk | Impact | Mitigation | -|------|--------|------------| -| Connection string exposure | High | Document secure storage practices | -| High telemetry volume | Medium | Configure sampling | -| SDK version conflicts | Low | Pin compatible versions | - -## References - -- [Azure Monitor OpenTelemetry Distro](https://learn.microsoft.com/en-us/python/api/overview/azure/monitor-opentelemetry-readme) -- [Configure Azure Monitor OpenTelemetry](https://learn.microsoft.com/en-us/azure/azure-monitor/app/opentelemetry-configuration) -- [Application Insights Overview](https://learn.microsoft.com/en-us/azure/azure-monitor/app/app-insights-overview) -- [PyPI Package](https://pypi.org/project/azure-monitor-opentelemetry/) diff --git a/py/engdoc/planning/cloudflare-ai-plugin.md b/py/engdoc/planning/cloudflare-ai-plugin.md deleted file mode 100644 index cb9e4d64ea..0000000000 --- a/py/engdoc/planning/cloudflare-ai-plugin.md +++ /dev/null @@ -1,376 +0,0 @@ -# Cloudflare Workers AI Plugin Implementation Plan - -**Status:** Ready for Implementation -**Feasibility:** ✅ HIGH -**Estimated Effort:** Medium (2-3 weeks) -**Dependencies:** `httpx`, `pydantic` - -## Overview - -The `cloudflare-ai` plugin provides access to Cloudflare Workers AI, enabling Genkit -applications to use 50+ open-source AI models running at the edge across 200+ data centers. - -``` -┌─────────────────────────────────────────────────────────────────────────┐ -│ CLOUDFLARE WORKERS AI PLUGIN ARCHITECTURE │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────┐ │ -│ │ Workers AI │ Cloudflare's AI at the edge. Models run │ │ -│ │ │ close to users (200+ data centers). │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Account ID │ Your Cloudflare account identifier. │ │ -│ │ │ Found in dashboard URL or API settings. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ API Token │ Auth token with Workers AI permissions. │ │ -│ │ │ Create at dash.cloudflare.com/profile/api │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ @cf/ Models │ Model names start with @cf/ prefix. │ │ -│ │ │ @cf/meta/llama-3.1-8b-instruct │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Edge Computing │ Processing close to users. Lower latency │ │ -│ │ │ than centralized cloud data centers. 
│ │ -│ └─────────────────────┴────────────────────────────────────────────┘ │ -│ │ -│ Data Flow: │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ CF Workers AI │────▶│ Edge Location │ │ -│ │ (Your Code) │ │ REST API │ │ (Nearest DC) │ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -│ │ │ │ -│ │ ┌─────────────────────────────────────┘ │ -│ │ │ │ -│ │ ▼ │ -│ │ ┌─────────────────┐ │ -│ │ │ AI Models │ │ -│ │ │ • Llama 3/4 │ │ -│ │ │ • Mistral │ │ -│ │ │ • Flux (Image) │ │ -│ │ │ • Whisper │ │ -│ │ └─────────────────┘ │ -└─────────────────────────────────────────────────────────────────────────┘ -``` - -## API Details - -### Authentication - -```python -# Environment variables -CLOUDFLARE_ACCOUNT_ID=your_account_id -CLOUDFLARE_API_TOKEN=your_api_token -``` - -### Base URL Pattern - -``` -https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL} -``` - -### Request Format - -```python -# Text Generation -{ - "messages": [ - {"role": "system", "content": "You are a helpful assistant"}, - {"role": "user", "content": "Hello!"} - ], - "stream": true, # Optional: Enable SSE streaming - "max_tokens": 256, - "temperature": 0.6 -} - -# Image Generation -{ - "prompt": "A sunset over mountains", - "num_steps": 20, - "guidance": 7.5 -} - -# Embeddings -{ - "text": ["Hello world", "Goodbye world"] -} -``` - -### Response Format - -```python -# Non-streaming -{ - "result": { - "response": "Hello! How can I help you today?" - }, - "success": true, - "errors": [], - "messages": [] -} - -# Streaming (SSE) -data: {"response": "Hello"} -data: {"response": "!"} -data: {"response": " How"} -data: [DONE] -``` - -## Model Support Matrix - -| Model | Type | Streaming | Tools | Status | -|-------|------|-----------|-------|--------| -| `@cf/meta/llama-3.3-70b-instruct-fp8-fast` | Text | ✅ | ✅ | Priority | -| `@cf/meta/llama-3.1-8b-instruct-fast` | Text | ✅ | ✅ | Priority | -| `@cf/meta/llama-4-scout-17b-16e-instruct` | Multimodal | ✅ | ✅ | Priority | -| `@cf/mistral/mistral-7b-instruct-v0.2` | Text | ✅ | ❌ | Phase 1 | -| `@cf/qwen/qwen1.5-14b-chat-awq` | Text | ✅ | ❌ | Phase 2 | -| `@cf/black-forest-labs/flux-2-klein-9b` | Image | ❌ | ❌ | Phase 2 | -| `@cf/stabilityai/stable-diffusion-xl-base-1.0` | Image | ❌ | ❌ | Phase 2 | -| `@cf/openai/whisper` | Speech→Text | ❌ | ❌ | Phase 3 | -| `@cf/baai/bge-base-en-v1.5` | Embedding | ❌ | ❌ | Phase 1 | -| `@cf/baai/bge-large-en-v1.5` | Embedding | ❌ | ❌ | Phase 1 | - -## Model Configuration Parameters - -### Text Generation (Llama/Mistral) - -```python -class CloudflareLlamaConfig(BaseModel): - """Configuration for Llama models on Workers AI.""" - - # Core parameters - temperature: float | None = Field(default=0.6, ge=0.0, le=5.0) - max_tokens: int | None = Field(default=256, ge=1, le=4096) - top_p: float | None = Field(default=0.9, ge=0.0, le=1.0) - top_k: int | None = Field(default=40, ge=1, le=100) - - # Repetition control - repetition_penalty: float | None = Field(default=1.0, ge=0.0, le=2.0) - presence_penalty: float | None = None # Llama 3.1+ - frequency_penalty: float | None = None # Llama 3.1+ - - # Output control - seed: int | None = None # For reproducibility - raw: bool | None = None # Return raw tokens -``` - -### Image Generation (Flux/Stable Diffusion) - -```python -class CloudflareImageConfig(BaseModel): - """Configuration for image generation models.""" - - num_steps: int | None = Field(default=20, ge=1, le=50) - guidance: float | None = Field(default=7.5, ge=0.0, le=20.0) - strength: 
float | None = Field(default=1.0, ge=0.0, le=1.0) # For img2img - width: int | None = Field(default=1024) - height: int | None = Field(default=1024) - seed: int | None = None -``` - -## Directory Structure - -``` -py/plugins/cloudflare-ai/ -├── pyproject.toml -├── README.md -├── LICENSE -├── src/genkit/plugins/cloudflare_ai/ -│ ├── __init__.py # Plugin entry, ELI5 docs, exports -│ ├── typing.py # All Pydantic config schemas -│ ├── constants.py # Model names, URLs, defaults -│ ├── models/ -│ │ ├── __init__.py -│ │ ├── model.py # CloudflareModel base implementation -│ │ ├── text.py # Text generation (Llama, Mistral, Qwen) -│ │ ├── image.py # Image generation (Flux, SD) -│ │ ├── speech.py # Speech-to-text (Whisper) -│ │ └── utils.py # Response parsing, error handling -│ ├── embedders/ -│ │ ├── __init__.py -│ │ └── embedder.py # BGE embeddings implementation -│ └── py.typed -└── tests/ - ├── conftest.py - ├── cloudflare_model_test.py - ├── cloudflare_embedder_test.py - └── integration_test.py -``` - -## Implementation Phases - -### Phase 1: Core Plugin (Week 1) - -1. **Plugin skeleton** - - `CloudflareAI` plugin class - - Authentication handling (API token, Account ID) - - HTTP client setup with `httpx` - -2. **Text generation models** - - Llama 3.1/3.3 support - - Streaming via SSE - - Tool/function calling - -3. **Embeddings** - - BGE embedder implementation - - Batch embedding support - -### Phase 2: Extended Models (Week 2) - -1. **Image generation** - - Flux models - - Stable Diffusion XL - - Base64 image handling - -2. **Additional text models** - - Mistral family - - Qwen models - -3. **Model configuration** - - Full parameter support per model family - - DevUI integration - -### Phase 3: Advanced Features (Week 3) - -1. **Speech models** - - Whisper integration - - Audio input handling - -2. **Multimodal** - - Llama 4 Scout vision support - - Image + text inputs - -3. **Sample application** - - `cloudflare-workers-ai-hello` sample - - README with setup instructions - -## Key Implementation Details - -### Plugin Class - -```python -class CloudflareAI(Plugin): - """Cloudflare Workers AI plugin for Genkit. - - Provides access to 50+ AI models running at the edge. - - Example: - >>> from genkit.ai import Genkit - >>> from genkit.plugins.cloudflare_ai import CloudflareAI, cloudflare_model - >>> - >>> ai = Genkit( - ... plugins=[CloudflareAI()], - ... model=cloudflare_model("@cf/meta/llama-3.1-8b-instruct-fast"), - ... 
) - """ - - def __init__( - self, - account_id: str | None = None, - api_token: str | None = None, - models: list[str] | None = None, # Subset of models to register - ): - self.account_id = account_id or os.environ.get('CLOUDFLARE_ACCOUNT_ID') - self.api_token = api_token or os.environ.get('CLOUDFLARE_API_TOKEN') - - if not self.account_id: - raise ValueError("CLOUDFLARE_ACCOUNT_ID required") - if not self.api_token: - raise ValueError("CLOUDFLARE_API_TOKEN required") -``` - -### Streaming Implementation - -```python -async def _generate_stream( - self, - model: str, - messages: list[dict], - config: CloudflareLlamaConfig, -) -> AsyncIterator[GenerateResponseChunk]: - """Generate streaming response using SSE.""" - - url = f"{BASE_URL.format(account_id=self.account_id)}/{model}" - - async with httpx.AsyncClient() as client: - async with client.stream( - "POST", - url, - headers={ - "Authorization": f"Bearer {self.api_token}", - "Content-Type": "application/json", - }, - json={ - "messages": messages, - "stream": True, - **config.model_dump(exclude_none=True), - }, - ) as response: - async for line in response.aiter_lines(): - if line.startswith("data: "): - data = line[6:] - if data == "[DONE]": - break - chunk = json.loads(data) - yield GenerateResponseChunk( - content=[TextPart(text=chunk.get("response", ""))], - ) -``` - -## Testing Strategy - -1. **Unit tests** - Mock HTTP responses, test config validation -2. **Integration tests** - Live API calls (requires credentials) -3. **Model-specific tests** - Verify each model family works correctly - -## Environment Variables - -| Variable | Required | Description | -|----------|----------|-------------| -| `CLOUDFLARE_ACCOUNT_ID` | Yes | Your Cloudflare account ID | -| `CLOUDFLARE_API_TOKEN` | Yes | API token with Workers AI permissions | - -## Sample Application - -```python -# py/samples/provider-cloudflare-workers-ai-hello/src/main.py -"""Cloudflare Workers AI hello sample - Edge AI with Genkit.""" - -from genkit.ai import Genkit -from genkit.plugins.cloudflare_ai import CloudflareAI, cloudflare_model - -ai = Genkit( - plugins=[CloudflareAI()], - model=cloudflare_model("@cf/meta/llama-3.1-8b-instruct-fast"), -) - -@ai.flow() -async def say_hi(name: str) -> str: - """Say hello using Llama at the edge.""" - response = await ai.generate(prompt=f"Say hi to {name} in a friendly way!") - return response.text - -@ai.flow() -async def generate_image(prompt: str) -> str: - """Generate an image using Flux.""" - response = await ai.generate( - model=cloudflare_model("@cf/black-forest-labs/flux-2-klein-9b"), - prompt=prompt, - ) - return response.media[0].url # Base64 data URL -``` - -## Risks and Mitigations - -| Risk | Impact | Mitigation | -|------|--------|------------| -| Rate limiting | Medium | Implement exponential backoff | -| Model availability | Low | Graceful fallback to alternative models | -| SSE parsing edge cases | Medium | Comprehensive error handling | -| Tool calling variations | Medium | Test with multiple model families | - -## References - -- [Workers AI Documentation](https://developers.cloudflare.com/workers-ai/) -- [Workers AI Models Catalog](https://developers.cloudflare.com/workers-ai/models/) -- [REST API Guide](https://developers.cloudflare.com/workers-ai/get-started/rest-api) -- [Cloudflare API Reference](https://developers.cloudflare.com/api/resources/ai/) diff --git a/py/engdoc/planning/cloudflare-telemetry-plugin.md b/py/engdoc/planning/cloudflare-telemetry-plugin.md deleted file mode 100644 index 
3d533ad919..0000000000 --- a/py/engdoc/planning/cloudflare-telemetry-plugin.md +++ /dev/null @@ -1,339 +0,0 @@ -# Cloudflare Telemetry Plugin Implementation Plan - -**Status:** Research Complete - Limited Native Support -**Feasibility:** ⚠️ MEDIUM (requires workarounds) -**Estimated Effort:** Low-Medium (1-2 weeks) -**Dependencies:** `httpx`, `opentelemetry-sdk` - -## Overview - -Cloudflare does not have a native tracing backend like AWS X-Ray or GCP Cloud Trace. -However, they support **exporting OpenTelemetry data** to third-party observability platforms -and have recently adopted OpenTelemetry internally for their logging pipeline. - -``` -┌─────────────────────────────────────────────────────────────────────────┐ -│ CLOUDFLARE TELEMETRY OPTIONS ARCHITECTURE │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────┐ │ -│ │ Logpush │ Exports logs to external services. Like a │ │ -│ │ │ pipe sending data to S3, Datadog, etc. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Workers Analytics │ Built-in metrics for Workers. Basic │ │ -│ │ │ request counts, CPU time, errors. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ AI Gateway OTEL │ Auto-exports AI request traces! Includes │ │ -│ │ │ model, tokens, cost, latency. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Workers OTEL Export │ Export traces from Workers to Honeycomb, │ │ -│ │ │ Grafana, Axiom, Datadog, etc. │ │ -│ └─────────────────────┴────────────────────────────────────────────┘ │ -│ │ -│ OPTION A: AI Gateway Integration (Recommended for AI apps) │ -│ ────────────────────────────────────────────────────────── │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ CF AI Gateway │────▶│ Workers AI │ │ -│ │ (Your Code) │ │ (Proxy) │ │ or OpenAI etc │ │ -│ └─────────────────┘ └────────┬────────┘ └─────────────────┘ │ -│ │ │ -│ Auto-export OTEL traces │ -│ │ │ -│ ▼ │ -│ ┌─────────────────┐ │ -│ │ Your OTEL │ │ -│ │ Backend │ │ -│ │ (Honeycomb, │ │ -│ │ Grafana, etc) │ │ -│ └─────────────────┘ │ -│ │ -│ OPTION B: Direct OTLP Export (For non-Workers apps) │ -│ ─────────────────────────────────────────────────── │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ OTLP Exporter │────▶│ Any OTEL │ │ -│ │ (Your Code) │ │ (Standard) │ │ Backend │ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -└─────────────────────────────────────────────────────────────────────────┘ -``` - -## Cloudflare's OTEL Support - -### Supported Third-Party Backends - -Cloudflare Workers and AI Gateway support exporting OTEL data to: - -| Provider | Traces | Logs | Notes | -|----------|--------|------|-------| -| **Sentry** | ✅ | ✅ | Error tracking + traces | -| **Honeycomb** | ✅ | ✅ | Full OTEL support | -| **Grafana Cloud** | ✅ | ✅ | Tempo + Loki | -| **Axiom** | ✅ | ✅ | Log + trace ingestion | -| **Datadog** | ✅ | ✅ | Full APM integration | - -### 1. 
AI Gateway OpenTelemetry (Best for AI Apps) - -Cloudflare AI Gateway automatically exports traces with: - -- **Model information** - Which model was called -- **Token usage** - Input/output tokens -- **Cost estimates** - Approximate cost per request -- **Prompts & completions** - Full request/response content -- **Latency metrics** - Time to first token, total time - -Configuration via dashboard or API: -```json -{ - "otel": { - "endpoint": "https://your-sentry-endpoint/v1/traces", - "headers": { - "Authorization": "Bearer your-sentry-dsn" - } - } -} -``` - -Or for Honeycomb: -```json -{ - "otel": { - "endpoint": "https://api.honeycomb.io/v1/traces", - "headers": { - "x-honeycomb-team": "your-api-key" - } - } -} -``` - -### 2. Workers OTEL Export - -For Workers-deployed apps, traces can be exported to any of the supported backends. -Configure in `wrangler.toml` or via the Cloudflare dashboard. - -### 3. Logpush - -Exports logs (not traces) to: -- AWS S3 -- Google Cloud Storage -- Azure Blob Storage -- Elastic -- Datadog -- Splunk -- Sentry -- And more... - -## Implementation Options - -### Option A: AI Gateway Proxy Plugin (Recommended) - -Route AI requests through Cloudflare AI Gateway to get automatic telemetry. - -```python -class CloudflareAIGateway(Plugin): - """Route AI requests through Cloudflare AI Gateway for telemetry. - - Works with ANY model provider (OpenAI, Anthropic, etc.) while adding: - - Automatic tracing to your OTEL backend - - Request caching - - Rate limiting - - Cost tracking - """ - - def __init__( - self, - gateway_id: str, # Your AI Gateway ID - account_id: str | None = None, - # The underlying provider (OpenAI, Anthropic, etc.) - provider: Literal["openai", "anthropic", "workers-ai"] = "openai", - ): - self.base_url = f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}" -``` - -**Pros:** -- Automatic OTEL traces for all AI requests -- Works with any model provider -- Includes cost tracking, caching, rate limiting -- No code changes needed for telemetry - -**Cons:** -- Adds network hop (slight latency) -- Only covers AI requests, not all app traces -- Requires AI Gateway setup in CF dashboard - -### Option B: Generic OTLP Export (Standard Approach) - -Use standard OpenTelemetry exporters to any backend. - -```python -class CloudflareTelemetry(Plugin): - """Export Genkit telemetry to any OTEL-compatible backend. - - This is a thin wrapper around standard OTLP export, but with - preset configurations for popular Cloudflare-compatible backends. - """ - - def __init__( - self, - backend: Literal["honeycomb", "grafana", "axiom", "datadog", "custom"], - endpoint: str | None = None, - api_key: str | None = None, - headers: dict[str, str] | None = None, - ): - ... -``` - -**Pros:** -- Works for any Python app (not just CF Workers) -- Full control over what's traced -- Standard OTEL approach - -**Cons:** -- No Cloudflare-specific features -- Basically same as any OTLP export -- Less "Cloudflare native" - -### Option C: Hybrid (Both) - -Combine AI Gateway for AI telemetry + standard OTEL for app telemetry. - -## Recommended Implementation - -Given Cloudflare's limited native tracing, I recommend **Option A (AI Gateway)** as the -primary implementation with a simple helper for configuration. 
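For contrast, Option B boils down to standard OTLP export with the stock `opentelemetry-*` packages. A minimal sketch of that path, reusing the Honeycomb endpoint and header shown in the AI Gateway config above (the service name and `HONEYCOMB_API_KEY` env var are illustrative placeholders, not part of any committed API):

```python
# Option B sketch: plain OTLP/HTTP export of spans to a third-party backend.
# Honeycomb shown; endpoint/header mirror the AI Gateway example above.
import os

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

provider = TracerProvider(resource=Resource.create({"service.name": "genkit-app"}))
provider.add_span_processor(
    BatchSpanProcessor(
        OTLPSpanExporter(
            endpoint="https://api.honeycomb.io/v1/traces",
            # Illustrative env var; any secret-management approach works.
            headers={"x-honeycomb-team": os.environ["HONEYCOMB_API_KEY"]},
        )
    )
)
# Register globally; spans recorded through the global tracer provider are
# then batched and exported to the configured backend.
trace.set_tracer_provider(provider)
```

A preset-style `CloudflareTelemetry` wrapper (Option B above) would essentially hard-code these endpoint/header pairs for each supported backend.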
- -### Directory Structure - -``` -py/plugins/cloudflare/ -├── pyproject.toml -├── README.md -├── LICENSE -├── src/genkit/plugins/cloudflare/ -│ ├── __init__.py # Plugin entry, ELI5 docs -│ ├── ai_gateway.py # AI Gateway proxy configuration -│ ├── typing.py # Configuration schemas -│ └── py.typed -└── tests/ - ├── conftest.py - └── ai_gateway_test.py -``` - -### Implementation - -```python -# __init__.py -"""Cloudflare plugin for Genkit - AI Gateway integration. - -This plugin configures Genkit to route AI requests through Cloudflare's -AI Gateway, which provides automatic OpenTelemetry trace export. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ AI Gateway │ A proxy that sits between you and AI models. │ - │ │ Adds caching, rate limits, and tracing. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Gateway ID │ Your gateway's unique name. Create in the │ - │ │ Cloudflare dashboard under AI > AI Gateway. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Automatic OTEL │ AI Gateway exports traces automatically. │ - │ │ Configure destination in CF dashboard. │ - └─────────────────────┴────────────────────────────────────────────────┘ - -Example:: - - from genkit.ai import Genkit - from genkit.plugins.cloudflare import configure_ai_gateway - from genkit.plugins.compat_oai import OpenAI - - # Route OpenAI requests through AI Gateway - ai = Genkit( - plugins=[ - OpenAI( - base_url=configure_ai_gateway( - account_id="your-account-id", - gateway_id="your-gateway-id", - provider="openai", - ), - ), - ], - ) -""" - -def configure_ai_gateway( - account_id: str | None = None, - gateway_id: str | None = None, - provider: str = "openai", -) -> str: - """Get the AI Gateway base URL for a provider. - - Args: - account_id: Cloudflare account ID (or CLOUDFLARE_ACCOUNT_ID env var) - gateway_id: AI Gateway ID (or CLOUDFLARE_GATEWAY_ID env var) - provider: Provider name ("openai", "anthropic", "workers-ai", etc.) - - Returns: - Base URL to use with the provider's SDK/plugin - """ - account_id = account_id or os.environ.get('CLOUDFLARE_ACCOUNT_ID') - gateway_id = gateway_id or os.environ.get('CLOUDFLARE_GATEWAY_ID') - - if not account_id or not gateway_id: - raise ValueError( - "CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_GATEWAY_ID required" - ) - - return f"https://gateway.ai.cloudflare.com/v1/{account_id}/{gateway_id}/{provider}" -``` - -## Alternative: Don't Create a Separate Plugin - -Given the limited scope, this functionality could also be: - -1. **Documented in cloudflare-ai plugin** - Show how to use AI Gateway with Workers AI -2. **Added as a utility function** - Simple helper in `genkit.plugins.cloudflare_ai` -3. **Left to user configuration** - Just document how to set `base_url` - -## Environment Variables - -| Variable | Required | Description | -|----------|----------|-------------| -| `CLOUDFLARE_ACCOUNT_ID` | Yes | Your Cloudflare account ID | -| `CLOUDFLARE_GATEWAY_ID` | Yes (for AI Gateway) | Your AI Gateway ID | - -## Comparison with Other Telemetry Plugins - -| Feature | AWS (`aws`) | GCP (`google-cloud`) | Cloudflare | -|---------|-------------|---------------------|------------| -| Native Tracing Backend | ✅ X-Ray | ✅ Cloud Trace | ❌ None (use 3rd party) | -| Third-Party Export | ✅ | ✅ | ✅ Sentry, Honeycomb, etc. 
| -| OTLP Export | ✅ | ✅ | ✅ (Workers + AI Gateway) | -| Log Correlation | ✅ | ✅ | ✅ Logpush to many backends | -| Metrics | ✅ CloudWatch | ✅ Cloud Monitoring | ⚠️ Workers Analytics | -| Auto-instrumentation | ✅ | ✅ | ✅ AI Gateway auto-traces | -| Python SDK | ✅ Official | ✅ Official | ❌ REST API only | - -## Feasibility Assessment - -**Feasibility: ⚠️ MEDIUM-HIGH** - -**Reasons:** -1. No native Cloudflare tracing backend, BUT excellent third-party support -2. AI Gateway auto-exports traces to Sentry, Honeycomb, Datadog, etc. -3. Workers OTEL export supports all major observability platforms -4. Implementation would provide presets for common backends - -**Recommendation:** -- **Consider implementing** a `cloudflare` telemetry plugin that: - - Provides presets for Sentry, Honeycomb, Datadog, Grafana, Axiom - - Helps configure AI Gateway OTEL export - - Documents the integration patterns -- Could be combined with `cloudflare-ai` plugin or kept separate - -## References - -- [AI Gateway OTEL Integration](https://developers.cloudflare.com/ai-gateway/observability/otel-integration/) -- [Workers OTEL Export](https://developers.cloudflare.com/workers/observability/exporting-opentelemetry-data/) -- [Logpush Documentation](https://developers.cloudflare.com/logs/logpush/) -- [Workers Analytics](https://developers.cloudflare.com/workers/observability/) diff --git a/py/engdoc/planning/observability-plugin.md b/py/engdoc/planning/observability-plugin.md deleted file mode 100644 index 0856922bb6..0000000000 --- a/py/engdoc/planning/observability-plugin.md +++ /dev/null @@ -1,453 +0,0 @@ -# Observability Plugin Implementation Plan - -**Status:** Ready for Implementation -**Feasibility:** ✅ HIGH -**Estimated Effort:** 1 week -**Dependencies:** `opentelemetry-sdk`, `opentelemetry-exporter-otlp-proto-http` - -## Overview - -The `observability` plugin provides a unified way to export Genkit telemetry to any -OTLP-compatible backend (Sentry, Honeycomb, Datadog, Grafana, Axiom, etc.) with -simple presets for popular services. - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ OBSERVABILITY PLUGIN ARCHITECTURE │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────────────┐ │ -│ │ OTLP │ OpenTelemetry Protocol. The universal language │ │ -│ │ │ for sending traces. Sentry, Honeycomb, all speak it.│ │ -│ ├─────────────────────┼────────────────────────────────────────────────────┤ │ -│ │ Backend Preset │ Pre-configured settings for a service. Just add │ │ -│ │ │ your API key and you're done! │ │ -│ ├─────────────────────┼────────────────────────────────────────────────────┤ │ -│ │ Sentry │ Error tracking + tracing. Great for debugging │ │ -│ │ │ crashes and performance issues. │ │ -│ ├─────────────────────┼────────────────────────────────────────────────────┤ │ -│ │ Honeycomb │ Observability platform built for debugging. │ │ -│ │ │ Query your traces like a database. │ │ -│ ├─────────────────────┼────────────────────────────────────────────────────┤ │ -│ │ Datadog │ Full-stack monitoring. Traces, metrics, logs, │ │ -│ │ │ all in one place. │ │ -│ └─────────────────────┴────────────────────────────────────────────────────┘ │ -│ │ -│ Data Flow: │ -│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │ -│ │ Genkit App │────▶│ OTLP Exporter │────▶│ Your Backend │ │ -│ │ (Your Code) │ │ (HTTP/gRPC) │ │ (Sentry, etc.) 
│ │ -│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │ -│ │ -│ Supported Backends: │ -│ ┌────────────┬────────────┬────────────┬────────────┬────────────┐ │ -│ │ Sentry │ Honeycomb │ Datadog │ Grafana │ Axiom │ │ -│ │ ✅ │ ✅ │ ✅ │ ✅ │ ✅ │ │ -│ └────────────┴────────────┴────────────┴────────────┴────────────┘ │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - -## When to Use This Plugin - -``` -┌─────────────────────────────────────────────────────────────────────────────────┐ -│ WHEN TO USE WHAT │ -├─────────────────────────────────────────────────────────────────────────────────┤ -│ │ -│ "I'm on AWS and want X-Ray" → aws plugin (SigV4, X-Ray format) │ -│ "I'm on GCP and want Cloud Trace" → google-cloud plugin (ADC) │ -│ "I'm on Azure and want App Insights" → azure plugin (Live Metrics, Map) │ -│ │ -│ "I'm on AWS but want Honeycomb" → observability plugin ← THIS ONE │ -│ "I'm on GCP but want Sentry" → observability plugin ← THIS ONE │ -│ "I'm multi-cloud, want Datadog" → observability plugin ← THIS ONE │ -│ "I don't care, just give me traces" → observability plugin ← THIS ONE │ -│ │ -└─────────────────────────────────────────────────────────────────────────────────┘ -``` - -## Why Not Just Use Platform Plugins? - -Platform plugins (aws, google-cloud, azure) provide: -- Platform-specific authentication (SigV4, ADC) -- Native backend features (Live Metrics, X-Ray Service Map) -- Platform log correlation (CloudWatch, Cloud Logging) - -**These CAN'T be replicated with generic OTLP.** - -The observability plugin is for users who: -- Don't want platform-native tools -- Need the same backend across multiple clouds -- Prefer third-party tools like Sentry or Honeycomb -- Want simple setup without platform-specific auth - -## Supported Backends - -| Backend | Endpoint | Auth | Features | -|---------|----------|------|----------| -| **Sentry** | `https://{org}.ingest.sentry.io/api/{project}/envelope/` | DSN | Error tracking, performance | -| **Honeycomb** | `https://api.honeycomb.io/v1/traces` | API Key | Query-based debugging | -| **Datadog** | `https://trace.agent.datadoghq.com/v0.4/traces` | API Key | Full-stack APM | -| **Grafana Cloud** | `https://{stack}.grafana.net/otlp` | API Key | Tempo traces | -| **Axiom** | `https://api.axiom.co/v1/traces` | API Token | Log + trace ingestion | -| **Custom** | Any OTLP endpoint | Headers | Bring your own | - -## API Design - -```python -"""Observability plugin for Genkit - Third-party telemetry backends. - -This plugin provides simple presets for popular observability platforms, -all using standard OpenTelemetry Protocol (OTLP) export. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ OTLP │ The universal language for traces. Like USB but │ - │ │ for observability data. │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Backend │ Where your traces go. Sentry, Honeycomb, etc. │ - │ │ Pick one, add your API key, done! │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Preset │ Pre-configured settings for a backend. Knows │ - │ │ the right URLs, headers, and formats. │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Span │ A single operation's timing. Like a stopwatch │ - │ │ for one function call. 
│ - └─────────────────────┴────────────────────────────────────────────────────┘ - -Data Flow:: - - ┌─────────────────────────────────────────────────────────────────────────┐ - │ HOW OBSERVABILITY EXPORT WORKS │ - │ │ - │ Your Genkit App │ - │ │ │ - │ │ (1) Flows, models, tools create spans │ - │ ▼ │ - │ ┌─────────────────┐ │ - │ │ TracerProvider │ Collects all spans from your app │ - │ │ (OpenTelemetry)│ │ - │ └────────┬────────┘ │ - │ │ │ - │ │ (2) Batch and export via OTLP │ - │ ▼ │ - │ ┌─────────────────┐ │ - │ │ OTLP Exporter │ Sends to your chosen backend │ - │ │ (HTTP POST) │ │ - │ └────────┬────────┘ │ - │ │ │ - │ │ (3) View in your dashboard │ - │ ▼ │ - │ ┌─────────────────┐ │ - │ │ Sentry / │ Query, alert, debug your traces │ - │ │ Honeycomb / │ │ - │ │ Datadog / etc │ │ - │ └─────────────────┘ │ - └─────────────────────────────────────────────────────────────────────────┘ - -Example:: - - from genkit.plugins.observability import configure_telemetry - - # Sentry - configure_telemetry(backend="sentry", sentry_dsn="https://...") - - # Honeycomb - configure_telemetry(backend="honeycomb", honeycomb_api_key="...") - - # Datadog - configure_telemetry(backend="datadog", datadog_api_key="...") - - # Custom OTLP endpoint - configure_telemetry( - backend="custom", - endpoint="https://my-collector/v1/traces", - headers={"Authorization": "Bearer ..."}, - ) -""" - -from enum import Enum -from typing import Literal - - -class Backend(str, Enum): - """Supported observability backends.""" - - SENTRY = "sentry" - HONEYCOMB = "honeycomb" - DATADOG = "datadog" - GRAFANA = "grafana" - AXIOM = "axiom" - CUSTOM = "custom" - - -def configure_telemetry( - backend: Backend | Literal["sentry", "honeycomb", "datadog", "grafana", "axiom", "custom"], - *, - # Common options - service_name: str = "genkit-app", - service_version: str = "1.0.0", - environment: str | None = None, - - # Sentry - sentry_dsn: str | None = None, - - # Honeycomb - honeycomb_api_key: str | None = None, - honeycomb_dataset: str | None = None, - - # Datadog - datadog_api_key: str | None = None, - datadog_site: str = "datadoghq.com", - - # Grafana Cloud - grafana_endpoint: str | None = None, - grafana_api_key: str | None = None, - - # Axiom - axiom_api_token: str | None = None, - axiom_dataset: str | None = None, - - # Custom OTLP - endpoint: str | None = None, - headers: dict[str, str] | None = None, -) -> None: - """Configure telemetry export to a third-party backend. - - Args: - backend: Which backend to use (sentry, honeycomb, datadog, etc.) - service_name: Name of your service (appears in traces) - service_version: Version of your service - environment: Environment name (production, staging, etc.) - - # Backend-specific (provide based on chosen backend): - sentry_dsn: Sentry DSN (for backend="sentry") - honeycomb_api_key: Honeycomb API key (for backend="honeycomb") - datadog_api_key: Datadog API key (for backend="datadog") - grafana_endpoint: Grafana Cloud OTLP endpoint (for backend="grafana") - axiom_api_token: Axiom API token (for backend="axiom") - - # Custom OTLP: - endpoint: Custom OTLP endpoint URL (for backend="custom") - headers: Custom headers for authentication (for backend="custom") - - Example: - >>> # Sentry - >>> configure_telemetry(backend="sentry", sentry_dsn="https://...") - >>> - >>> # Honeycomb - >>> configure_telemetry(backend="honeycomb", honeycomb_api_key="...") - """ - ... 
-``` - -## Directory Structure - -``` -py/plugins/observability/ -├── pyproject.toml -├── README.md -├── LICENSE -├── src/genkit/plugins/observability/ -│ ├── __init__.py # Main API, configure_telemetry() -│ ├── backends/ -│ │ ├── __init__.py -│ │ ├── base.py # Base backend configuration -│ │ ├── sentry.py # Sentry preset -│ │ ├── honeycomb.py # Honeycomb preset -│ │ ├── datadog.py # Datadog preset -│ │ ├── grafana.py # Grafana Cloud preset -│ │ ├── axiom.py # Axiom preset -│ │ └── custom.py # Custom OTLP -│ ├── typing.py # Configuration schemas -│ └── py.typed -└── tests/ - ├── conftest.py - ├── sentry_test.py - ├── honeycomb_test.py - └── integration_test.py -``` - -## Implementation - -### Core Configuration - -```python -# src/genkit/plugins/observability/__init__.py - -import os -from typing import Any - -from opentelemetry import trace -from opentelemetry.sdk.trace import TracerProvider -from opentelemetry.sdk.trace.export import BatchSpanProcessor -from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter -from opentelemetry.sdk.resources import Resource, SERVICE_NAME, SERVICE_VERSION - -from .backends import get_backend_config - - -def configure_telemetry( - backend: str, - *, - service_name: str = "genkit-app", - service_version: str = "1.0.0", - **kwargs: Any, -) -> None: - """Configure telemetry export to a third-party backend.""" - - # Get backend-specific configuration - config = get_backend_config(backend, **kwargs) - - # Create resource with service info - resource = Resource.create({ - SERVICE_NAME: service_name, - SERVICE_VERSION: service_version, - }) - - # Create OTLP exporter with backend config - exporter = OTLPSpanExporter( - endpoint=config.endpoint, - headers=config.headers, - ) - - # Configure tracer provider - provider = TracerProvider(resource=resource) - provider.add_span_processor(BatchSpanProcessor(exporter)) - trace.set_tracer_provider(provider) -``` - -### Backend Presets - -```python -# src/genkit/plugins/observability/backends/sentry.py - -from dataclasses import dataclass - - -@dataclass -class SentryConfig: - """Sentry OTLP configuration.""" - - endpoint: str - headers: dict[str, str] - - -def get_sentry_config(dsn: str) -> SentryConfig: - """Create Sentry configuration from DSN. - - DSN format: https://{key}@{org}.ingest.sentry.io/{project} - """ - # Parse DSN and construct OTLP endpoint - # Sentry accepts OTLP at: https://{org}.ingest.sentry.io/api/{project}/envelope/ - - return SentryConfig( - endpoint=f"https://sentry.io/api/0/envelope/", - headers={ - "X-Sentry-Auth": f"Sentry sentry_key={dsn}", - }, - ) -``` - -## pyproject.toml - -```toml -[project] -name = "genkit-observability-plugin" -version = "0.1.0" -description = "Third-party observability backends for Genkit" -requires-python = ">=3.10" -dependencies = [ - "genkit", - "opentelemetry-sdk>=1.20.0", - "opentelemetry-exporter-otlp-proto-http>=1.20.0", -] - -[project.optional-dependencies] -dev = [ - "pytest>=8.0.0", - "pytest-asyncio>=0.24.0", -] -``` - -## Sample Application - -```python -# py/samples/provider-observability-hello/src/main.py -"""Observability hello sample - Third-party telemetry with Genkit. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Observability │ Seeing what your app is doing. Like X-ray │ - │ │ vision for your code! 
│ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Traces │ The journey of a request through your app. │ - │ │ Shows timing, errors, everything. │ - ├─────────────────────┼────────────────────────────────────────────────────┤ - │ Backend │ Where traces are stored and visualized. │ - │ │ Sentry, Honeycomb, Datadog, etc. │ - └─────────────────────┴────────────────────────────────────────────────────┘ -""" - -import os -from genkit.ai import Genkit -from genkit.plugins.observability import configure_telemetry -from genkit.plugins.google_genai import GoogleAI - -# Configure telemetry FIRST (before any Genkit operations) -configure_telemetry( - backend="honeycomb", # or "sentry", "datadog", etc. - honeycomb_api_key=os.environ["HONEYCOMB_API_KEY"], - service_name="provider-observability-hello", -) - -ai = Genkit( - plugins=[GoogleAI()], - model="googleai/gemini-2.0-flash", -) - -@ai.flow() -async def say_hi(name: str) -> str: - """Say hello - traced to your observability backend.""" - response = await ai.generate(prompt=f"Say hi to {name}!") - return response.text -``` - -## Environment Variables - -| Variable | Backend | Description | -|----------|---------|-------------| -| `SENTRY_DSN` | Sentry | Your Sentry DSN | -| `HONEYCOMB_API_KEY` | Honeycomb | Honeycomb API key | -| `DD_API_KEY` | Datadog | Datadog API key | -| `GRAFANA_OTLP_ENDPOINT` | Grafana | Grafana Cloud OTLP endpoint | -| `AXIOM_TOKEN` | Axiom | Axiom API token | - -## Feasibility Score - -| Factor | Score | Notes | -|--------|-------|-------| -| **API Documentation** | 9/10 | Standard OTLP, well-documented | -| **Python Support** | 10/10 | Official opentelemetry-python | -| **Setup Simplicity** | 9/10 | One function call with preset | -| **Feature Coverage** | 8/10 | Traces + basic metrics | -| **Community Demand** | 9/10 | Common request | -| **Maintenance Burden** | 9/10 | Stable OTLP protocol | -| **Strategic Value** | 8/10 | Platform-agnostic option | -| **TOTAL** | **89/100** | ✅ **BUILD** | - -## References - -- [OpenTelemetry Python](https://opentelemetry.io/docs/languages/python/) -- [Sentry OTLP](https://docs.sentry.io/platforms/python/tracing/) -- [Honeycomb OpenTelemetry](https://docs.honeycomb.io/send-data/opentelemetry/) -- [Datadog OTLP](https://docs.datadoghq.com/tracing/trace_collection/open_standards/otlp_ingest_in_the_agent/) -- [Grafana Cloud OTLP](https://grafana.com/docs/grafana-cloud/send-data/otlp/) -- [Axiom OpenTelemetry](https://axiom.co/docs/send-data/opentelemetry) diff --git a/py/engdoc/planning/vercel-plugins.md b/py/engdoc/planning/vercel-plugins.md deleted file mode 100644 index 26d9523395..0000000000 --- a/py/engdoc/planning/vercel-plugins.md +++ /dev/null @@ -1,384 +0,0 @@ -# Vercel Plugins Implementation Plan - -**Status:** Research Complete -**AI Plugin Feasibility:** ⚠️ LOW-MEDIUM (AI SDK is JS/TS only, but AI Gateway works) -**Telemetry Plugin Feasibility:** ⚠️ MEDIUM (standard OTEL works, @vercel/otel is Node.js only) -**Estimated Effort:** Low (if implemented) -**Dependencies:** `httpx`, `openai` or `anthropic`, `opentelemetry-sdk` - -## Overview - -**Important Clarification:** Python IS fully supported on Vercel as a runtime platform. -FastAPI, Flask, and other Python frameworks work great as Vercel Functions. - -However, Vercel's **AI-specific SDKs** and **@vercel/otel** are JavaScript/TypeScript only. 
- -``` -┌─────────────────────────────────────────────────────────────────────────┐ -│ VERCEL PYTHON SUPPORT MATRIX │ -│ │ -│ ┌─────────────────────────────────────────────────────────────────┐ │ -│ │ Vercel + Python │ │ -│ ├─────────────────────────────────────────────────────────────────┤ │ -│ │ Feature │ Python Support │ Notes │ │ -│ ├─────────────────────────────────────────────────────────────────┤ │ -│ │ Vercel Platform │ ✅ YES │ FastAPI, Flask work great │ │ -│ │ Vercel Functions │ ✅ YES │ Python serverless │ │ -│ │ AI Gateway │ ✅ YES │ HTTP API, any language │ │ -│ │ AI SDK │ ❌ JS/TS only │ No Python package │ │ -│ │ @vercel/otel │ ❌ Node.js │ No Python package │ │ -│ │ Standard OTEL │ ✅ YES │ Works from Python apps │ │ -│ └─────────────────────────────────────────────────────────────────┘ │ -│ │ -│ Key Concepts (ELI5): │ -│ ┌─────────────────────┬────────────────────────────────────────────┐ │ -│ │ Vercel Functions │ Run Python (FastAPI/Flask) as serverless. │ │ -│ │ │ Auto-scales, 250MB limit per function. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ AI Gateway │ A proxy that adds caching, rate limiting, │ │ -│ │ │ and routing to AI API calls. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ Vercel AI SDK │ JavaScript library for building AI apps. │ │ -│ │ │ NOT available for Python (use AI Gateway).│ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ @vercel/otel │ Vercel's OTEL package for Node.js only. │ │ -│ │ │ Python apps use standard OTEL instead. │ │ -│ ├─────────────────────┼────────────────────────────────────────────┤ │ -│ │ OIDC Token │ Auto-generated auth token on Vercel. │ │ -│ │ │ Available to Python apps too! │ │ -│ └─────────────────────┴────────────────────────────────────────────┘ │ -└─────────────────────────────────────────────────────────────────────────┘ - -Reference: https://vercel.com/docs/frameworks/backend/fastapi -``` - -## Part 1: Vercel AI Gateway Plugin - -### What AI Gateway Provides - -The AI Gateway is an HTTP proxy that: -- Routes requests to multiple AI providers -- Adds request caching -- Provides rate limiting -- Offers fallback routing -- Works with ANY language via HTTP - -### Python Integration - -Since AI Gateway uses OpenAI-compatible and Anthropic-compatible APIs, Python apps can -use it by pointing existing SDKs at the gateway URL. - -```python -# Using OpenAI SDK with Vercel AI Gateway -from openai import OpenAI - -client = OpenAI( - api_key=os.getenv('AI_GATEWAY_API_KEY'), - base_url='https://ai-gateway.vercel.sh/v1' -) - -response = client.chat.completions.create( - model='anthropic/claude-sonnet-4.5', # Can use any provider! - messages=[{'role': 'user', 'content': 'Hello!'}] -) -``` - -### Implementation Option: Simple Helper - -Rather than a full plugin, provide a helper function: - -```python -# py/plugins/vercel/__init__.py -"""Vercel AI Gateway integration for Genkit. - -Vercel's AI Gateway is a proxy that works with any AI provider, adding -caching, rate limiting, and fallback routing. - -Key Concepts (ELI5):: - - ┌─────────────────────┬────────────────────────────────────────────────┐ - │ Concept │ ELI5 Explanation │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ AI Gateway │ A middleman between you and AI providers. │ - │ │ Adds caching and rate limiting automatically. 
│ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ Universal Provider │ Use any model (OpenAI, Anthropic, etc.) │ - │ │ through one consistent API. │ - ├─────────────────────┼────────────────────────────────────────────────┤ - │ API Key │ Your AI_GATEWAY_API_KEY for authentication. │ - │ │ Get it from Vercel dashboard. │ - └─────────────────────┴────────────────────────────────────────────────┘ - -Example:: - - from genkit.ai import Genkit - from genkit.plugins.vercel import vercel_gateway_url - from genkit.plugins.compat_oai import OpenAI - - # Route OpenAI requests through Vercel AI Gateway - ai = Genkit( - plugins=[ - OpenAI( - base_url=vercel_gateway_url(), - api_key=os.environ['AI_GATEWAY_API_KEY'], - ), - ], - ) -""" - -import os - -AI_GATEWAY_BASE_URL = "https://ai-gateway.vercel.sh/v1" - - -def vercel_gateway_url() -> str: - """Get the Vercel AI Gateway base URL. - - Returns: - The AI Gateway URL to use with OpenAI-compatible SDKs. - - Example: - >>> from genkit.plugins.compat_oai import OpenAI - >>> OpenAI(base_url=vercel_gateway_url()) - """ - return AI_GATEWAY_BASE_URL - - -def get_vercel_auth() -> str | None: - """Get the appropriate auth token for Vercel. - - On Vercel deployments, uses OIDC token. Otherwise, uses API key. - - Returns: - Auth token string or None if not configured. - """ - # On Vercel, OIDC token is auto-generated - if oidc := os.environ.get('VERCEL_OIDC_TOKEN'): - return oidc - # For local development, use API key - return os.environ.get('AI_GATEWAY_API_KEY') -``` - -### Feasibility Assessment - -**Feasibility: ⚠️ LOW-MEDIUM** - -**Reasons:** -1. AI Gateway works fine with Python via existing SDKs -2. No Vercel-specific AI functionality beyond the gateway -3. Implementation would be trivial (just URL helper) -4. Users can already do this without a plugin - -**Recommendation:** -- Document how to use AI Gateway with existing plugins (`compat-oai`, `anthropic`) -- Don't create a separate plugin unless there's strong user demand -- Could add as a simple utility function in documentation - ---- - -## Part 2: Vercel Telemetry Plugin - -### Current State - -Vercel's `@vercel/otel` package is **Node.js only**, but Python apps on Vercel CAN use -standard OpenTelemetry to export traces to any OTEL-compatible backend. - -```typescript -// @vercel/otel - JavaScript/TypeScript only -import { registerOTel } from '@vercel/otel'; - -export function register() { - registerOTel({ serviceName: 'your-project-name' }); -} -``` - -### What @vercel/otel Provides (Node.js only) - -- Auto-configuration for Vercel's OTEL collector -- Node.js and Edge runtime support -- W3C Trace Context propagation -- Fetch API instrumentation - -### Python Options on Vercel - -Python apps deployed on Vercel (FastAPI, Flask, etc.) can use standard OTEL: - -```python -# Standard OTLP export from Python on Vercel -from opentelemetry import trace -from opentelemetry.sdk.trace import TracerProvider -from opentelemetry.sdk.trace.export import BatchSpanProcessor -from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter - -# Works from Vercel Python Functions! 
-provider = TracerProvider() -provider.add_span_processor( - BatchSpanProcessor( - OTLPSpanExporter( - endpoint="https://api.honeycomb.io/v1/traces", # Or any OTEL backend - headers={"x-honeycomb-team": os.environ["HONEYCOMB_API_KEY"]}, - ) - ) -) -trace.set_tracer_provider(provider) -``` - -### Potential Vercel Telemetry Plugin - -A simple plugin could provide preset configs for common backends: - -```python -class VercelTelemetry: - """Telemetry for Python apps on Vercel. - - Since @vercel/otel is Node.js only, this provides standard OTEL - configuration with presets for popular backends. - """ - - def __init__( - self, - backend: Literal["honeycomb", "datadog", "grafana", "axiom"], - service_name: str = "genkit-vercel-app", - api_key: str | None = None, - ): - ... -``` - -### Feasibility Assessment - -**Feasibility: ⚠️ MEDIUM** - -**Reasons:** -1. Python DOES work on Vercel (FastAPI, Flask are first-class) -2. Standard OTEL works from Python Vercel Functions -3. No Vercel-specific OTEL package, but standard approach works -4. Could provide convenience presets for common backends - -**Recommendation:** -- **Consider a simple helper** for common OTEL backends -- Not Vercel-specific, but useful for Vercel Python users -- Lower priority than `azure` and `cloudflare-ai` - ---- - -## Summary: Should We Build Vercel Plugins? - -### Vercel AI Plugin - -| Aspect | Assessment | -|--------|------------| -| **Need** | Low - existing plugins work fine with AI Gateway | -| **Effort** | Very low - just URL helper | -| **Value** | Convenience for Vercel users | -| **Recommendation** | **Low priority** - document AI Gateway usage | - -### Vercel Telemetry Plugin - -| Aspect | Assessment | -|--------|------------| -| **Need** | Medium - Python on Vercel is growing | -| **Effort** | Low - standard OTEL with presets | -| **Value** | Convenience for common backends | -| **Recommendation** | **Consider** if user demand exists | - ---- - -## Alternative: Documentation Only - -Instead of plugins, provide documentation showing how to: - -### Using AI Gateway with Genkit - -```markdown -# Using Vercel AI Gateway with Genkit - -Vercel AI Gateway can be used with Genkit's `compat-oai` plugin by setting -the base URL: - -```python -from genkit.ai import Genkit -from genkit.plugins.compat_oai import OpenAI -import os - -ai = Genkit( - plugins=[ - OpenAI( - base_url="https://ai-gateway.vercel.sh/v1", - api_key=os.environ['AI_GATEWAY_API_KEY'], - ), - ], - model="openai/gpt-4o", # or "anthropic/claude-sonnet-4.5" -) -``` - -### Telemetry for Python on Vercel - -For Python serverless functions on Vercel, use standard OpenTelemetry -with your preferred backend (Honeycomb, Datadog, etc.): - -```python -from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter -# ... standard OTLP setup -``` -``` - ---- - -## Comparison with Other Platforms - -| Platform | AI Plugin | Telemetry Plugin | Python Runtime | -|----------|-----------|------------------|----------------| -| **AWS** | ✅ amazon-bedrock | ✅ aws | ✅ Full | -| **GCP** | ✅ google-genai | ✅ google-cloud | ✅ Full | -| **Azure** | ✅ microsoft-foundry | ✅ azure (planned) | ✅ Full | -| **Cloudflare** | ✅ cloudflare-ai (planned) | ⚠️ AI Gateway only | ✅ Workers | -| **Vercel** | ⚠️ AI Gateway helper | ⚠️ Standard OTEL | ✅ Functions | - ---- - -## Final Recommendation - -**Low priority, but feasible if user demand exists.** Python works great on Vercel! - -### Recommended Approach - -1. 
**Document AI Gateway usage** with existing `compat-oai` and `anthropic` plugins -2. **Document standard OTEL** for telemetry from Python Vercel Functions -3. **Consider a simple `vercel` plugin** if users request it, containing: - - `vercel_gateway_url()` helper for AI Gateway - - `VercelTelemetry` class with presets for Honeycomb, Datadog, etc. - -### Priority - -| Plugin | Priority | Reason | -|--------|----------|--------| -| `azure` | High | Official OTEL distro, pairs with microsoft-foundry | -| `cloudflare-ai` | High | Growing edge AI market | -| `vercel` | Low | Works without plugin, add if demanded | - -### If We Build It - -A minimal `vercel` plugin would look like: - -```python -# py/plugins/vercel/src/genkit/plugins/vercel/__init__.py -"""Vercel integration helpers for Genkit. - -Provides utilities for Python apps deployed on Vercel Functions. -""" - -def vercel_gateway_url() -> str: - """Get Vercel AI Gateway URL.""" - return "https://ai-gateway.vercel.sh/v1" - -class VercelTelemetry: - """Standard OTEL with presets for common backends.""" - ... -``` - -## References - -- [Vercel AI Gateway - Python](https://vercel.com/docs/ai-gateway/python) -- [Vercel AI SDK](https://sdk.vercel.ai/docs) (JS/TS only) -- [@vercel/otel](https://www.npmjs.com/package/@vercel/otel) (Node.js only) -- [Python AI SDK (Community)](https://github.com/python-ai-sdk/sdk) (unofficial) diff --git a/py/engdoc/release-publishing-guide.md b/py/engdoc/release-publishing-guide.md deleted file mode 100644 index 6a1941d1b3..0000000000 --- a/py/engdoc/release-publishing-guide.md +++ /dev/null @@ -1,340 +0,0 @@ -# Python SDK Release and Publishing Guide - -This guide documents the complete process for releasing and publishing the Genkit Python SDK. - -## Pre-Release Requirements - -### 1. Version Verification - -All packages must have the same version (`0.5.0` for this release): - -```bash -# Check all package versions -grep "^version = " packages/*/pyproject.toml plugins/*/pyproject.toml | sort -``` - -### 2. Documentation Requirements - -| Requirement | Location | Status | -|-------------|----------|--------| -| CHANGELOG.md updated | `py/CHANGELOG.md` | ✅ | -| PR description created | `py/.github/PR_DESCRIPTION_0.5.0.md` | ✅ | -| Blog article written | `py/engdoc/blog-genkit-python-0.5.0.md` | ✅ | -| Release validation passed | `./bin/validate_release_docs` | ✅ | - -### 3. Code Quality Requirements - -```bash -# All checks must pass -cd py && ./bin/lint # Linting and type checks -cd py && uv run pytest . # All tests pass -cd py && ./bin/validate_release_docs # Release doc validation -``` - -### 4. PyPI Package Status - -**Existing packages (update from v0.4.0 to v0.5.0):** -- genkit -- genkit-plugin-compat-oai -- genkit-plugin-dev-local-vectorstore -- genkit-plugin-firebase -- genkit-plugin-flask -- genkit-plugin-google-cloud -- genkit-plugin-google-genai -- genkit-plugin-ollama -- genkit-plugin-vertex-ai - -**New packages (first publish at v0.5.0):** -- genkit-plugin-anthropic -- genkit-plugin-aws -- genkit-plugin-amazon-bedrock -- genkit-plugin-cloudflare-workers-ai -- genkit-plugin-deepseek -- genkit-plugin-evaluators -- genkit-plugin-huggingface -- genkit-plugin-mcp -- genkit-plugin-mistral -- genkit-plugin-microsoft-foundry -- genkit-plugin-observability -- genkit-plugin-xai - -### 5. 
GitHub Environment Configuration - -Ensure the `pypi_github_publishing` environment is configured in GitHub repository settings with: -- PyPI trusted publishing enabled -- Required reviewers (if applicable) - -## Release Process - -### Step 1: Merge the Release PR - -After PR approval: -```bash -# Merge the PR (use squash or merge commit as appropriate) -gh pr merge 4417 --squash -``` - -### Step 2: Create a GitHub Release - -```bash -# Create and push a tag -git checkout main -git pull origin main -git tag -a py/v0.5.0 -m "Genkit Python SDK v0.5.0" -git push origin py/v0.5.0 -``` - -Then create a GitHub release: -1. Go to https://github.com/firebase/genkit/releases/new -2. Select tag: `py/v0.5.0` -3. Title: `Genkit Python SDK v0.5.0` -4. Copy release notes from `py/CHANGELOG.md` -5. Publish release - -### Step 3: Publish to PyPI - -1. Go to **Actions** → **Publish Python Package** -2. Click **Run workflow** -3. Select: - - `publish_scope: all` (to publish all 21 packages) -4. Click **Run workflow** -5. Monitor the workflow - it will build and publish all packages in parallel - -### Step 4: Verify Publication - -After workflow completes: -```bash -# Verify packages on PyPI -for pkg in genkit genkit-plugin-google-genai genkit-plugin-anthropic; do - pip index versions $pkg | head -1 -done -``` - -Or check on PyPI directly: -- https://pypi.org/project/genkit/ -- https://pypi.org/project/genkit-plugin-google-genai/ - -### Step 5: Post-Release Verification - -```bash -# Test installation in a fresh environment -python -m venv /tmp/genkit-test -source /tmp/genkit-test/bin/activate -pip install genkit genkit-plugin-google-genai -python -c "from genkit.ai import Genkit; print('Success!')" -``` - -## Troubleshooting - -### Package Already Exists at This Version - -If a package was partially published: -```bash -# Check the version on PyPI -curl -s "https://pypi.org/pypi/genkit/json" | jq -r '.info.version' -``` - -You cannot re-upload the same version. Either: -1. Bump the version (e.g., 0.5.1) -2. Delete the release on PyPI (only within 24 hours) - -### Trusted Publishing Fails - -Ensure the GitHub environment `pypi_github_publishing` is configured with: -1. Go to repository Settings → Environments -2. Create/edit `pypi_github_publishing` -3. Configure trusted publisher on PyPI for each package - -### Individual Package Publish - -To publish a single package: - -**Via GitHub UI:** -1. Go to **Actions** → **Publish Python Package** -2. Select `publish_scope: single` -3. Select `project_type: plugins` (or `packages` for genkit core) -4. 
Select the specific `project_name` - -**Via CLI:** -```bash -# Publish all packages -gh workflow run publish_python.yml -f publish_scope=all - -# Publish just the core genkit package -gh workflow run publish_python.yml \ - -f publish_scope=single \ - -f project_type=packages \ - -f project_name=genkit - -# Publish a specific plugin (3 parameters required) -gh workflow run publish_python.yml \ - -f publish_scope=single \ - -f project_type=plugins \ - -f project_name=anthropic - -gh workflow run publish_python.yml \ - -f publish_scope=single \ - -f project_type=plugins \ - -f project_name=google-genai - -gh workflow run publish_python.yml \ - -f publish_scope=single \ - -f project_type=plugins \ - -f project_name=vertex-ai -``` - -### Available Plugin Names for project_name - -| Plugin Name | PyPI Package Name | -|-------------|-------------------| -| `anthropic` | genkit-plugin-anthropic | -| `aws` | genkit-plugin-aws | -| `amazon-bedrock` | genkit-plugin-amazon-bedrock | -| `cloudflare-workers-ai` | genkit-plugin-cloudflare-workers-ai | -| `compat-oai` | genkit-plugin-compat-oai | -| `deepseek` | genkit-plugin-deepseek | -| `dev-local-vectorstore` | genkit-plugin-dev-local-vectorstore | -| `evaluators` | genkit-plugin-evaluators | -| `firebase` | genkit-plugin-firebase | -| `flask` | genkit-plugin-flask | -| `google-cloud` | genkit-plugin-google-cloud | -| `google-genai` | genkit-plugin-google-genai | -| `huggingface` | genkit-plugin-huggingface | -| `mcp` | genkit-plugin-mcp | -| `mistral` | genkit-plugin-mistral | -| `microsoft-foundry` | genkit-plugin-microsoft-foundry | -| `observability` | genkit-plugin-observability | -| `ollama` | genkit-plugin-ollama | -| `vertex-ai` | genkit-plugin-vertex-ai | -| `xai` | genkit-plugin-xai | - -### Monitoring Workflow Progress - -```bash -# List recent publish workflow runs -gh run list --workflow=publish_python.yml --limit=5 - -# Watch a specific run in real-time -gh run watch - -# View detailed job status -gh run view --json status,conclusion,jobs - -# View failed job logs -gh run view --log-failed | head -100 -``` - -### Retrying Failed Jobs - -```bash -# Re-run all failed jobs from a specific run -gh run rerun --failed - -# Or trigger a fresh workflow run -gh workflow run publish_python.yml -f publish_scope= -``` - -## Package Installation Reference - -After release, users install packages with: - -```bash -# Core package (required) -pip install genkit - -# Model providers -pip install genkit-plugin-google-genai # Google AI (Gemini) -pip install genkit-plugin-anthropic # Anthropic (Claude) -pip install genkit-plugin-ollama # Ollama (local models) -pip install genkit-plugin-vertex-ai # Vertex AI -pip install genkit-plugin-amazon-bedrock # AWS Bedrock -pip install genkit-plugin-mistral # Mistral AI -pip install genkit-plugin-deepseek # DeepSeek -pip install genkit-plugin-xai # xAI (Grok) -pip install genkit-plugin-huggingface # Hugging Face -pip install genkit-plugin-cloudflare-workers-ai # Cloudflare Workers AI + OTLP telemetry -pip install genkit-plugin-microsoft-foundry # Azure AI Foundry + Azure Application Insights telemetry - -# Telemetry -pip install genkit-plugin-google-cloud # GCP Cloud Trace -pip install genkit-plugin-aws # AWS X-Ray -# Azure telemetry is included in genkit-plugin-microsoft-foundry -pip install genkit-plugin-observability # Sentry, Honeycomb, Datadog - -# Other -pip install genkit-plugin-firebase # Firebase/Firestore -pip install genkit-plugin-evaluators # Evaluation metrics -pip install genkit-plugin-flask # Flask 
integration -pip install genkit-plugin-compat-oai # OpenAI compatibility -pip install genkit-plugin-mcp # Model Context Protocol -``` - -## Troubleshooting - -### PyPI 500 Error: Trusted Publishing Exchange Failure - -**Error:** `Trusted publishing exchange failure: Token request failed: the index produced an unexpected 500 response.` - -**Cause:** This is a transient PyPI server error, not a configuration issue. - -**Solution:** -1. Check PyPI status: https://status.python.org/ -2. Wait 5-10 minutes -3. Retry the failed jobs: - ```bash - gh run rerun --failed - ``` - -### PyPI 400 Error: Non-user Identities Cannot Create New Projects - -**Error:** `400 Non-user identities cannot create new projects. This was probably caused by successfully using a pending publisher but specifying the project name incorrectly.` - -**Cause:** The package doesn't exist on PyPI yet and needs Trusted Publisher setup. - -**Solution for new packages:** -1. Go to https://pypi.org/manage/account/publishing/ -2. Add a **Pending Publisher** for each new package: - - **PyPI Project Name:** `genkit-plugin-` (exact package name from pyproject.toml) - - **Owner:** `firebase` - - **Repository:** `genkit` - - **Workflow name:** `publish_python.yml` - - **Environment:** `pypi` -3. Retry the workflow - -### Package Already Exists at This Version - -**Error:** `File already exists` - -**Cause:** The exact version was already uploaded to PyPI. - -**Solution:** -- You cannot re-upload the same version to PyPI -- Either bump the version (e.g., 0.5.1) or verify the existing package is correct -- Use `--skip-existing` flag if publishing multiple packages and some already exist - -### Authentication Failure - -**Error:** `403 Forbidden` or `401 Unauthorized` - -**Cause:** OIDC token exchange failed between GitHub and PyPI. - -**Solution:** -1. Verify the GitHub environment `pypi` exists in repository settings -2. Verify Trusted Publisher is configured correctly on PyPI -3. Ensure workflow file path matches PyPI configuration exactly - -### Manual Fallback: API Token Upload - -If Trusted Publishing continues to fail: - -```bash -cd py - -# Build the package -uv build --package genkit-plugin- - -# Upload with API token (get token from https://pypi.org/manage/account/token/) -TWINE_USERNAME=__token__ TWINE_PASSWORD= twine upload dist/* -``` - -**Note:** This is a fallback for emergencies. Prefer Trusted Publishing for security. diff --git a/py/engdoc/user_guide/python/publishing_pypi.md b/py/engdoc/user_guide/python/publishing_pypi.md index d712df4679..98e3ae3fb4 100644 --- a/py/engdoc/user_guide/python/publishing_pypi.md +++ b/py/engdoc/user_guide/python/publishing_pypi.md @@ -1,24 +1,30 @@ # Overview -The Genkit Python AI SDK publishes some packages to PYPI in order to be able to -use them as python packages using any python package manager. +The Genkit Python AI SDK publishes packages to PyPI so they can be installed +with any Python package manager (`pip`, `uv`, etc.). -In order to generate a new version of any package or plugin, a CI with github -actions has been created. +## Publishing with ReleaseKit (current) -## CI to release new PYPI versions +The primary publishing mechanism is **ReleaseKit**, an internal release +orchestration tool. 
It automates the full release lifecycle: -The github action located on `.github/workflows/publish_python.yml` has two -inputs: +* Version bumping across all packages (core + 22 plugins + samples) +* Changelog generation from conventional commits +* Dependency-graph-aware publish ordering with retries +* SBOM generation -* Type of project to build. E.g. Package or Plugin -* Name of project. E.g. genkit +The automated workflow is at `.github/workflows/releasekit-uv.yml`. See +[`py/tools/releasekit/README.md`](../../../tools/releasekit/README.md) for +full documentation. -The process is separated in two steps. The first one make validations over the -project to build. Mainly, the project's new version to publish must be greater -that the current one. This step also builds with uv the package and validates -the wheel with twine. +## Legacy manual workflow -The last step uses an action `pypa/gh-action-pypi-publish@release/v1` to publish -the package with trusted publishers. See -(gh-action-pypi-publish)\[https://github.com/pypa/gh-action-pypi-publish] +The older manual workflow at `.github/workflows/publish_python.yml` is still +available as a fallback. It accepts two inputs: + +* Type of project to build (Package or Plugin) +* Name of project (e.g. `genkit`) + +It validates that the new version is greater than the current PyPI version, +builds with `uv`, validates the wheel with `twine`, and publishes using +`pypa/gh-action-pypi-publish@release/v1` with trusted publishers. diff --git a/py/plugins/README.md b/py/plugins/README.md index 6ac25cc9c3..29f0199398 100644 --- a/py/plugins/README.md +++ b/py/plugins/README.md @@ -664,6 +664,4 @@ model provider diversity. ## Further Reading -- [Plugin Planning & Roadmap](../engdoc/planning/) -- [Feature Matrix](../engdoc/planning/FEATURE_MATRIX.md) - [Contributing Guide](../engdoc/contributing/) diff --git a/py/tools/conform/ANNOUNCEMENT.md b/py/tools/conform/ANNOUNCEMENT.md deleted file mode 100644 index f275c3ea1b..0000000000 --- a/py/tools/conform/ANNOUNCEMENT.md +++ /dev/null @@ -1,275 +0,0 @@ -# Announcing Conform: Cross-Runtime Model Conformance Testing for Genkit - -## TL;DR - -**Conform** is a purpose-built model conformance test runner for the -Genkit SDK. It validates that every model plugin — across Python, JS, -and Go runtimes — behaves correctly and consistently. The Python -runtime runs **in-process** with zero subprocess overhead; JS and Go -runtimes communicate via async HTTP to their reflection servers. One -command tests **13 plugins**, runs **150+ test cases**, and reports -results in **under 4 minutes**. - ---- - -## The Problem - -The Genkit SDK supports **13+ model plugins** (Anthropic, Google GenAI, -Amazon Bedrock, Mistral, DeepSeek, Cohere, xAI, Ollama, …) across -**3 runtimes** (Python, JS, Go). Each plugin must correctly: - -1. **Generate text** — simple prompts, system messages, multi-turn -2. **Handle structured output** — JSON mode, schema conformance -3. **Support tool calling** — tool requests, tool responses, multi-step -4. **Stream responses** — text chunks, streamed JSON, streamed tool calls -5. **Process media** — image inputs, media outputs -6. 
**Expose reasoning** — thinking / reasoning content from supported models - -Previously, conformance was tested ad hoc: - -- Manual spot-checks against live APIs -- Plugin-specific unit tests with mocked responses -- No cross-runtime consistency verification -- No shared test suite between Python, JS, and Go -- Failures discovered in production, not at PR time - ---- - -## The Solution - -Conform provides a unified test framework with a single CLI: - -```bash -conform list # Show all plugins, runtimes, and env-var readiness -conform check-model # Run model conformance tests across all plugins -conform check-plugin # Verify every model plugin has conformance specs -``` - ---- - -## Features - -### Live Conformance Results - -Conform runs all plugin tests concurrently (bounded by a configurable -semaphore) and displays a live Rich progress table. Log lines scroll -above while the summary table stays pinned at the bottom: - -![conform check-model results](https://raw.githubusercontent.com/firebase/genkit/main/py/tools/conform/docs/images/conform_results.png) - -13 plugins. 150+ tests. Under 4 minutes wall time. - -### In-Process Python Runner - -The Python runtime uses an **InProcessRunner** that imports the -plugin's entry point directly — no subprocess, no HTTP server, no -genkit CLI dependency: - -```python -class ActionRunner(Protocol): - async def run_action( - self, key: str, input_data: dict, *, stream: bool = False, - ) -> tuple[dict, list[dict]]: ... - async def close(self) -> None: ... -``` - -| Runner | When | How | -|--------|------|-----| -| **InProcessRunner** | Python (default) | Imports entry point, calls `action.arun_raw()` directly | -| **ReflectionRunner** | JS / Go | Subprocess → async HTTP to reflection server | -| **genkit CLI** | `--use-cli` flag | Delegates to `genkit dev:test-model` | - -### 10 Validators — 1:1 Parity with JS - -Every validator is ported from the canonical JS implementation: - -| Validator | What it checks | -|-----------|----------------| -| `text-includes` | Response text contains expected substring | -| `text-starts-with` | Response text starts with expected prefix | -| `text-not-empty` | Response text is non-empty | -| `valid-json` | Response text is valid JSON | -| `has-tool-request` | Response contains a tool request part | -| `valid-media` | Response contains a media part with valid URL | -| `reasoning` | Response contains a reasoning / thinking part | -| `stream-text-includes` | Streamed chunks contain expected text | -| `stream-has-tool-request` | Streamed chunks contain a tool request | -| `stream-valid-json` | Final streamed chunk is valid JSON | - -New validators: decorate a function with `@register('name')`. 
- -### YAML-Driven Test Specs - -Each plugin defines its tests in a declarative YAML file: - -```yaml -models: - - name: "anthropic/claude-sonnet-4" - supported_features: [text, json, tools, streaming, reasoning] - tests: - - name: "basic text generation" - prompt: "Say 'hello' and nothing else" - assertions: - - type: text-includes - value: hello - - - name: "streaming structured output" - prompt: "Output a JSON object with a 'name' field" - stream: true - output: - format: json - schema: { "type": "object" } - assertions: - - type: stream-valid-json -``` - -### Full Feature Matrix - -| Feature | Description | -|---------|-------------| -| **In-process Python runner** | Zero-overhead native execution — no subprocess, no HTTP | -| **Reflection runner** | Cross-runtime support via async HTTP (JS, Go) | -| **10 validators** | Ported 1:1 from canonical JS source | -| **YAML-driven specs** | Declarative test definitions per plugin | -| **Live progress table** | Rich terminal UI with real-time updates | -| **Inline progress bars** | Per-row colored bars (green/red/dim) with pre-calculated totals | -| **Log redaction** | Data URIs auto-truncated in debug logs for readability | -| **Concurrent execution** | Semaphore-bounded parallelism (default: 8 plugins, 3 tests/model) | -| **Retry with backoff** | Exponential backoff + full jitter on failure; serial fallback | -| **Human-readable details** | Details column shows `8 std + 0 custom` instead of cryptic `8s+0c` | -| **Per-plugin overrides** | `[conform.plugin-overrides.]` for rate-sensitive plugins | -| **Pre-flight checks** | Validates specs, entry points, and env vars before running | -| **CI integration** | `check-plugin` runs in `bin/lint` on every PR | -| **Multi-runtime** | Python, JS, Go from a single command | -| **Rust-style diagnostics** | Unique error codes with actionable help messages | -| **TOML configuration** | `conform.toml` alongside specs — concurrency, env vars, runtime paths | -| **Legacy CLI fallback** | `--use-cli` delegates to `genkit dev:test-model` | - ---- - -## Architecture - -``` -conform check-model google-genai - │ - ├── Auto-detect runtimes with entry points - │ ├── python? ──→ InProcessRunner - │ │ Import conformance_entry.py - │ │ Call action.arun_raw() directly - │ │ No subprocess · No HTTP · No reflection server - │ │ - │ ├── js? ──→ ReflectionRunner - │ │ Start conformance_entry.ts subprocess - │ │ Async HTTP (httpx) → reflection API - │ │ - │ └── go? 
──→ ReflectionRunner (same as JS) - │ - All runners share: - ├── ActionRunner Protocol ← common interface - ├── Validators ← 10 validators, Protocol + @register - ├── Test cases ← 12 built-in, 1:1 with JS - └── Rich console output ← live progress + summary table -``` - -### Layout - -``` -py/ -├── tools/conform/ ← The CLI tool -│ ├── pyproject.toml ← Private package metadata -│ └── src/conform/ -│ ├── cli.py ← Argument parsing + subcommand dispatch -│ ├── config.py ← TOML config loader -│ ├── checker.py ← check-plugin: verify conformance files -│ ├── display.py ← Rich tables, inline progress bars, Rust-style errors -│ ├── log_redact.py ← Structlog processor to truncate data URIs -│ ├── plugins.py ← Plugin discovery + env-var checking -│ ├── reflection.py ← Async HTTP client for reflection API -│ ├── util_test_model.py ← Native test runner (ActionRunner) -│ ├── util_test_cases.py ← 12 built-in test cases -│ ├── types.py ← Shared types (PluginResult, Status) -│ └── validators/ ← Protocol-based validator registry -│ ├── __init__.py ← Validator Protocol + @register -│ ├── json.py ← valid-json -│ ├── streaming.py ← stream-* validators -│ ├── text.py ← text-* validators -│ └── tool.py ← has-tool-request -│ -└── tests/conform/ ← Per-plugin conformance specs - ├── conform.toml ← All repo-specific config (auto-discovered) - ├── anthropic/ - │ ├── model-conformance.yaml - │ ├── conformance_entry.py - │ ├── conformance_entry.ts - │ └── conformance_entry.go - ├── google-genai/ - ├── amazon-bedrock/ - ├── vertex-ai/ - └── ... (13 plugins total) -``` - ---- - -## Impact - -| Metric | Before | After | -|--------|--------|-------| -| **Cross-plugin testing** | Manual spot-checks | 150+ automated tests | -| **Cross-runtime parity** | Not verified | Unified test suite | -| **Time to run all plugins** | Hours (manual) | < 4 minutes | -| **New plugin onboarding** | Write custom tests | Add YAML spec + entry point | -| **CI coverage** | Unit tests only | Unit + conformance on every PR | -| **Failure diagnosis** | Dig through logs | Rust-style errors with codes | -| **Validator extensibility** | N/A | `@register` decorator | - -### CI Integration - -1. **PR checks** (`bin/lint` → `conform check-plugin`) — verifies every - model plugin has conformance specs and entry points. -2. **Conformance runs** (`conform check-model`) — full test suite - against live APIs with real model calls. 
- ---- - -## Try It - -```bash -# List all plugins and their readiness -py/bin/conform list - -# Run conformance tests for a single plugin -py/bin/conform check-model google-genai - -# Run all plugins (Python runtime) -py/bin/conform check-model - -# Run with verbose output -py/bin/conform check-model -v - -# Control concurrency: 4 plugins, 1 test/model (safe for free tiers) -py/bin/conform check-model -j 4 -t 1 - -# Disable retries (default: 2 retries with exponential backoff) -py/bin/conform check-model --max-retries 0 - -# Custom retry settings -py/bin/conform check-model --max-retries 3 --retry-base-delay 2.0 - -# Filter to a specific runtime -py/bin/conform check-model --runtime python - -# Specify config explicitly (flags are per-subcommand) -py/bin/conform list --config py/tests/conform/conform.toml - -# Verify all plugins have conformance specs (used by bin/lint) -py/bin/conform check-plugin -``` - ---- - -## Links - -- **Source**: `py/tools/conform/` -- **Specs + config**: `py/tests/conform/` (includes `conform.toml`) -- **Documentation**: `py/tools/conform/README.md` -- **Validators**: `py/tools/conform/src/conform/validators/` diff --git a/py/tools/conform/README.md b/py/tools/conform/README.md index a5078a0238..3ba1a9db83 100644 --- a/py/tools/conform/README.md +++ b/py/tools/conform/README.md @@ -329,8 +329,9 @@ py/ │ ├── pyproject.toml ← Private package + [tool.conform] config │ ├── README.md │ └── src/conform/ +│ ├── __main__.py ← Entry point for `python -m conform` │ ├── cli.py ← Argument parsing + subcommand dispatch -│ ├── config.py ← TOML config loader +│ ├── config.py ← TOML config loader (auto-discovers conform.toml) │ ├── checker.py ← check-plugin: verify conformance files exist │ ├── display.py ← Rich tables, inline progress bars, Rust-style errors │ ├── log_redact.py ← Structlog processor to truncate data URIs in logs @@ -338,8 +339,8 @@ py/ │ ├── plugins.py ← Plugin discovery and env-var checking │ ├── reflection.py ← Async HTTP client for reflection API (httpx) │ ├── runner.py ← Legacy parallel runner (genkit CLI subprocess) -│ ├── test_cases.py ← 12 built-in test cases (1:1 parity with JS) -│ ├── test_model.py ← Native test runner with ActionRunner Protocol +│ ├── util_test_cases.py ← 12 built-in test cases (1:1 parity with JS) +│ ├── util_test_model.py ← Native test runner with ActionRunner Protocol │ ├── types.py ← Shared types (PluginResult, Status, Runtime) │ └── validators/ ← Protocol-based validator registry │ ├── __init__.py ← Validator Protocol + @register decorator @@ -488,6 +489,9 @@ The wrapper script `py/bin/conform` passes `--config` automatically. [conform] concurrency = 8 test-concurrency = 3 +action-timeout = 120 # seconds per LLM action call +health-timeout = 5 # seconds per health check +startup-timeout = 30 # seconds to wait for reflection server additional-model-plugins = ["google-genai", "vertex-ai", "ollama"] [conform.env] @@ -499,6 +503,11 @@ google-genai = ["GEMINI_API_KEY"] [conform.plugin-overrides.cloudflare-workers-ai] test-concurrency = 1 +# Per-model overrides (e.g. longer timeout for slow models). +# 3-level resolution: model → plugin → global. +[conform.model-overrides."gemini-2.0-flash"] +action-timeout = 180 + # Paths are relative to the conform.toml file. [conform.runtimes.python] cwd = "../../.." 
@@ -530,6 +539,9 @@ CLI flags override TOML values: | `-j N` | `concurrency` | Max concurrent plugins | | `-t N` | `test-concurrency` | Max concurrent tests per model spec | | `--verbose` | — | Print full output for failures | +| — | `action-timeout` | Timeout in seconds for a single LLM action call (default: 120) | +| — | `health-timeout` | Timeout in seconds for health checks (default: 5) | +| — | `startup-timeout` | Timeout in seconds for reflection server startup (default: 30) | ## Adding a New Plugin diff --git a/py/tools/releasekit/ANNOUNCEMENT.md b/py/tools/releasekit/ANNOUNCEMENT.md deleted file mode 100644 index e6162ab2db..0000000000 --- a/py/tools/releasekit/ANNOUNCEMENT.md +++ /dev/null @@ -1,236 +0,0 @@ -# Announcing ReleaseKit: Automated Release Orchestration for the Genkit Python SDK - -## TL;DR - -**ReleaseKit** is a purpose-built release orchestration tool for the Genkit -Python SDK. It automates the end-to-end process of publishing 60+ Python -packages to PyPI in the correct dependency order — a process that was -previously manual, error-prone, and took hours. With ReleaseKit, a full -release takes **one command** and completes in minutes. - ---- - -## The Problem - -The Genkit Python SDK is a [uv](https://docs.astral.sh/uv/) workspace -with **62 interdependent packages**: 1 core framework, 22 plugins, and -39 samples. These packages form a 4-level dependency graph with 121 -dependency edges. - -Publishing them to PyPI requires: - -1. **Correct ordering** — `genkit` (core) must be published *before* any - plugin that depends on it, and plugins *before* samples that use them. -2. **Ephemeral version pinning** — during build, workspace-sourced - dependencies (`genkit = { workspace = true }`) must be temporarily - rewritten to concrete versions (`genkit>=0.5.0`), then restored. -3. **Transitive bump propagation** — if `genkit` bumps from 0.5.0 → 0.6.0, - every plugin and sample that depends on it must also be bumped. -4. **Crash safety** — if the process fails mid-way through package #37, - we need to resume from that point, not restart from scratch. - -**No existing tool does this.** `uv publish` is a single-package command. -`release-please` doesn't understand Python workspaces. PyPI-specific tools -like `twine` and `flit` have no concept of dependency ordering. - -Our previous release process was: -- Manual `uv publish` for each package, one at a time -- Copy-paste version numbers into pyproject.toml files -- Hope we didn't miss a dependency or publish in the wrong order -- If something failed mid-release, start over - ---- - -## The Solution - -ReleaseKit automates the entire release lifecycle: - -``` -releasekit prepare → Opens a Release PR with computed version bumps - and generated changelogs - -releasekit publish → Builds and publishes all packages to PyPI - in topological dependency order - -releasekit release → Tags the merge commit and creates a GitHub Release -``` - ---- - -## Features - -### Dependency Graph Visualization - -ReleaseKit discovers all 62 workspace packages and builds a dependency -graph, which can be visualized in 8 output formats (ASCII art, Mermaid, -Graphviz DOT, CSV, JSON, Markdown table, D2, and plain text levels): - -![releasekit graph --format ascii](https://raw.githubusercontent.com/firebase/genkit/main/py/tools/releasekit/docs/docs/images/releasekit_graph_ascii.png) - -The topological sort guarantees that every package is published only after -all its dependencies are available on PyPI. 
- -### Workspace Health Checks - -19 automated health checks run on every PR via `bin/lint`, catching -issues before they reach PyPI: - -![releasekit check](https://raw.githubusercontent.com/firebase/genkit/main/py/tools/releasekit/docs/docs/images/releasekit_check.png) - -Checks include: circular dependency detection, missing LICENSE/README -files, version consistency across all plugins, PEP 561 type markers, -lockfile staleness, naming conventions, and PyPI metadata completeness. - -### Architecture Overview - -The publish pipeline processes each package through 8 stages, with a -dependency-triggered scheduler that maximizes parallelism: - -![releasekit architecture](https://raw.githubusercontent.com/firebase/genkit/main/py/tools/releasekit/docs/docs/images/releasekit_overview.png) - -### Full Feature Matrix - -| Feature | Description | -|---------|-------------| -| **Dependency-triggered publishing** | Packages publish as soon as their dependencies complete, maximizing parallelism | -| **Conventional commits → semver** | Automatic version bump computation from git history | -| **Transitive propagation** | A change in `genkit` triggers patch bumps for all 61 dependents | -| **Crash-safe resume** | State persistence after each package; resume from failure point | -| **19 pre-publish health checks** | Catch issues at PR time, not after a broken release | -| **Ephemeral pinning** | Workspace deps temporarily pinned to concrete versions for build | -| **Post-publish verification** | SHA-256 checksums verified against PyPI | -| **Smoke testing** | `python -c 'import ...'` after publish to verify installability | -| **Changelog generation** | Per-package changelogs from conventional commits | -| **Git tagging** | Per-package tags (`genkit-v0.5.0`) + umbrella tag (`v0.5.0`) | -| **8 graph formats** | ASCII, CSV, DOT, D2, JSON, Mermaid, Markdown table, levels | -| **Rust-style diagnostics** | Every error has a unique code (e.g. `RK-GRAPH-CYCLE-DETECTED`) | -| **SIGUSR1/SIGUSR2 controls** | Pause/resume the scheduler from another terminal | -| **Release groups** | Publish a subset of packages (e.g. `--group core`) | -| **Rollback** | Delete a tag and its GitHub release with one command | - ---- - -## Architecture - -ReleaseKit is built on a **protocol-based backend architecture** that -makes it fully testable with in-memory fakes — no subprocess calls, no -network I/O, no file system side effects in tests: - -``` -releasekit -├── Backends (DI / Protocol-based) -│ ├── VCS git operations (tag, commit, push) -│ ├── PackageManager build, publish, lock (uv) -│ ├── Workspace package discovery (uv) -│ ├── Registry package registry queries (PyPI) -│ └── Forge release / PR management (GitHub CLI + API) -│ -├── Core Pipeline -│ ├── workspace.py discover packages from pyproject.toml -│ ├── graph.py build & topo-sort dependency graph -│ ├── versioning.py conventional commits → semver bumps -│ ├── scheduler.py dependency-triggered queue dispatcher -│ ├── publisher.py async publish orchestration -│ ├── preflight.py pre-publish safety checks -│ └── checks/ standalone workspace health checks (subpackage) -│ -├── Formatters 8 output formats (ASCII, CSV, DOT, Mermaid, ...) 
-├── UX Rust-style errors, structured logging, CLI -└── UI Rich live progress (TTY) / structured logs (CI) -``` - -### Publish Pipeline - -Each package goes through an 8-stage pipeline: - -``` -pin → build → checksum → publish → poll → verify_checksum → smoke_test → restore -``` - -The **dependency-triggered scheduler** is more efficient than level-based -lockstep — each package starts as soon as all its dependencies complete, -not when the entire level finishes. - ---- - -## Impact - -| Metric | Before | After | -|--------|--------|-------| -| **Release time** | Hours (manual) | Minutes (automated) | -| **Risk of wrong ordering** | High | Zero (topological sort) | -| **Crash recovery** | Start over | Resume from failure point | -| **Version consistency** | Error-prone | Enforced by 19 checks | -| **Missing metadata** | Found after publish | Caught at PR time | -| **Changelog** | Manual | Auto-generated from commits | -| **PyPI verification** | Manual spot-check | Automated checksum + smoke test | - -### CI Integration - -ReleaseKit is integrated into CI at two levels: - -1. **PR checks** (`bin/lint` → `releasekit check`) — runs 19 health - checks on every PR touching `py/`. Catches issues before merge. -2. **Publish workflow** (`.github/workflows/publish_python.yml`) — - orchestrates the full publish pipeline on release tags. - ---- - -## Dependency Graph (62 packages, 121 edges, 4 levels) - -``` -┌────────────────────────────────────────────────────────┐ -│ Level 0 │ -│ genkit (0.5.0) │ -├────────────────────────────────────────────────────────┤ -│ Level 1 (19 plugins + 1 sample) │ -│ genkit-plugin-anthropic (0.5.0) │ -│ genkit-plugin-google-genai (0.5.0) │ -│ genkit-plugin-firebase (0.5.0) │ -│ genkit-plugin-vertex-ai (0.5.0) │ -│ genkit-plugin-ollama (0.5.0) │ -│ ... 15 more │ -├────────────────────────────────────────────────────────┤ -│ Level 2 (35 packages) │ -│ genkit-plugin-deepseek (0.5.0) │ -│ genkit-plugin-flask (0.5.0) │ -│ provider-google-genai-hello (0.1.0) │ -│ web-endpoints-hello (0.1.0) │ -│ ... 31 more │ -├────────────────────────────────────────────────────────┤ -│ Level 3 (6 packages) │ -│ framework-restaurant-demo (0.1.0) │ -│ provider-vertex-ai-model-garden (0.1.0) │ -│ ... 4 more │ -└────────────────────────────────────────────────────────┘ -``` - ---- - -## Try It - -```bash -# From the genkit repo root -cd py/tools/releasekit - -# Discover all workspace packages -uv run releasekit discover - -# View the dependency graph -uv run releasekit graph --format ascii - -# Run workspace health checks -uv run releasekit check - -# Preview what a release would look like -uv run releasekit plan -``` - ---- - -## Links - -- **Source**: `py/tools/releasekit/` in the genkit repo -- **CI PR**: [#4590](https://github.com/firebase/genkit/pull/4590) — enables `releasekit check` in CI -- **Documentation PR**: [#4589](https://github.com/firebase/genkit/pull/4589) — MkDocs engineering docs -- **Publish Workflow**: `.github/workflows/publish_python.yml` diff --git a/py/tools/releasekit/FIXES.md b/py/tools/releasekit/FIXES.md deleted file mode 100644 index 325eb8aacb..0000000000 --- a/py/tools/releasekit/FIXES.md +++ /dev/null @@ -1,143 +0,0 @@ -# Releasekit: Audit Fixes Roadmap - -Findings from an exhaustive audit cross-referencing known pain points from -[release-please](https://github.com/googleapis/release-please/issues) and -[python-semantic-release](https://github.com/python-semantic-release/python-semantic-release/issues) -against the releasekit codebase. 
- -## Dependency Graph - -```text - F1: Label new PRs ──────────┐ - │ - F2: Filter by head branch ──┼──▶ F5: Auto-prepare on push to main - │ │ - F3: checkout@v5 → v4 ───────┘ │ - ▼ - F4: --first-parent dedup ──────▶ F6: Write CHANGELOG.md to disk - (per-package files, prepend new - entries, commit in release branch) -``` - -## Reverse Topological Order (phases) - -### Phase 1 — Foundations (no dependencies, land first) - -These are prerequisites for the auto-prepare feature and fix real bugs. - -| ID | Severity | File | Fix | -|----|----------|------|-----| -| **F1** | Medium | `prepare.py` | Add `autorelease: pending` label to **newly created** PRs, not just updated ones. Without this, `tag_release` can't find the merged PR. | -| **F2** | Medium | `release.py` | Filter `list_prs()` by `head='releasekit--release'` in addition to label. Prevents race where a stale PR with the same label is picked up. | -| **F3** | Medium | `publish_python.yml` | Change `actions/checkout@v5` → `@v4`. v5 doesn't exist; workflow will fail. | - -### Phase 2 — Changelog Quality (independent) - -| ID | Severity | File | Fix | -|----|----------|------|-----| -| **F4** | Critical | `vcs/git.py` + `vcs/__init__.py` | Add `--first-parent` to `git log` in the `log()` method. Prevents duplicate changelog entries when merge commits repeat the same conventional commit message as the squashed commit. See [release-please#2476](https://github.com/googleapis/release-please/issues/2476). | - -### Phase 3 — Auto-Prepare + Changelog Files (depends on F1 + F2 + F4) - -| ID | Severity | File | Fix | -|----|----------|------|-----| -| **F5** | Feature | `release.yml` | Add `push` trigger on `main` so `prepare` runs automatically on every merge. The Release PR stays up-to-date with accumulated changelogs. Publish remains manual or merge-triggered. | -| **F6** | Feature | `prepare.py` | Write per-package `CHANGELOG.md` files to disk during `prepare`. Prepend new entries to existing file (or create it). Commit alongside version bumps on the release branch. Depends on F4 so written changelogs are dedup-clean. | - -## Detailed Fix Descriptions - -### F1: Label new PRs with `autorelease: pending` - -**Problem**: In `prepare.py:334-349`, the label is only added when an -existing PR is found and updated. When a brand-new PR is created, it -never gets the label. The `release` step searches for merged PRs by -this label, so it will miss PRs that were created fresh and then merged. - -**Fix**: After `forge.create_pr()`, extract the PR number from the -result and call `forge.add_labels(pr_number, [_AUTORELEASE_PENDING])`. - -### F2: Filter merged PR lookup by head branch - -**Problem**: In `release.py:236-240`, `list_prs(label=..., state='merged')` -could return a stale PR from a previous release cycle if the label wasn't -cleaned up. Adding `head='releasekit--release'` narrows the search to -only the correct branch. - -**Fix**: Add `head=_RELEASE_BRANCH` to the `list_prs` call. Import the -branch constant or define it locally. - -### F3: Fix `actions/checkout` version - -**Problem**: `publish_python.yml:84` uses `actions/checkout@v5` which -does not exist. The latest stable is `v4`. - -**Fix**: One-line change: `@v5` → `@v4`. - -### F4: Deduplicate changelog entries with `--first-parent` - -**Problem**: When a PR is merged (not squashed), both the merge commit -and the original feature commit appear in `git log`. 
If both have the -same conventional commit message (common with GitHub's default merge -commit format), the changelog gets duplicate entries. - -**Fix**: Add `--first-parent` flag to the git log command in -`GitCLIBackend.log()`. This follows only the first parent of merge -commits, which is the mainline. Also add the parameter to the `VCS` -protocol so all backends are aware of it. - -### F5: Auto-prepare on push to main - -**Problem**: Currently `release.yml` only triggers on `workflow_dispatch`, -requiring manual intervention to create/update the Release PR. - -**Fix**: Add a `push` trigger filtered to `main` (and scoped to `py/` -paths). The `prepare` job's `if` condition is updated to also run on -push events. The `publish` job remains gated on manual dispatch or -PR merge with the `autorelease: pending` label. - -Result: - -```text -push to main ──▶ prepare runs ──▶ Release PR created/updated - │ - (human reviews, merges when ready) - │ - ▼ - publish triggers automatically -``` - -### F6: Write per-package CHANGELOG.md files to disk - -**Problem**: Currently, changelogs are only rendered into the Release PR -body. There are no `CHANGELOG.md` files in any package directory. Users -cannot view the changelog locally, and published PyPI packages have no -changelog file included. - -**Where changelogs go today**: -- PR body only (via `_build_pr_body` in `prepare.py`) -- `releasekit plan` shows version bumps but not changelog text - -**Fix**: After generating changelogs in `prepare_release` (step 6), -write each package's rendered changelog to `{pkg.path}/CHANGELOG.md`: - -1. If the file exists, prepend the new version's section above the - existing content (below the `# Changelog` heading). -2. If the file doesn't exist, create it with a `# Changelog` heading - followed by the new section. -3. Include the changelog files in the release branch commit (step 8). - -This ensures: -- Changelog is visible in the repo alongside each package -- Changelog is included in the sdist/wheel (via `pyproject.toml` - `[tool.setuptools]` or default inclusion) -- The Release PR diff shows the changelog additions for review -- `releasekit plan --format table` can optionally preview changelog text - -## Low-Priority Notes (not blocking, document only) - -| Issue | Notes | -|-------|-------| -| Signal handler in `pin.py` not async-safe | Acceptable — crash recovery only, atexit + finally handle normal cases | -| Prerelease format (`0.6.0rc1` vs `0.6.0a1`) | Valid PEP 440, but unconventional for `alpha`/`beta` labels | -| Pre-1.0 major bump convention | Design choice — `0.x` + breaking → `1.0.0`. Some prefer `0.x+1`. Document. | -| Advisory lock not atomic | Acceptable — CI concurrency group prevents races in practice | diff --git a/py/tools/releasekit/README.md b/py/tools/releasekit/README.md index 712252b1d8..33fe6a8161 100644 --- a/py/tools/releasekit/README.md +++ b/py/tools/releasekit/README.md @@ -64,18 +64,18 @@ implementation plan. 
| 🔄 Retry with backoff | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | | 🔒 Release lock | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | | ✍️ Signing / provenance | 🔜 | ❌ | ⚠️ npm | ❌ | ❌ | ❌ | ✅ GPG/Cosign | -| 📋 SBOM | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | +| 📋 SBOM | ✅ CycloneDX+SPDX | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | | 📢 Announcements | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | -| 📊 Plan profiling | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | -| 🔭 OpenTelemetry tracing | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | +| 📊 Plan profiling | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | +| 🔭 OpenTelemetry tracing | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | | 🔄 Migrate from alternatives | 🔜 | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | **Legend:** ✅ = supported, ⚠️ = partial, ❌ = not supported, 🔜 = planned See [docs/competitive-gap-analysis.md](docs/competitive-gap-analysis.md) for the full analysis with issue tracker references, and -[docs/roadmap-execution-plan.md](docs/roadmap-execution-plan.md) for the -dependency-graphed, topo-sorted execution plan. +[roadmap.md](roadmap.md) for the detailed roadmap with dependency graphs +and execution phases. ## Getting Started @@ -121,6 +121,7 @@ uvx releasekit check | `explain` | Look up any error code (e.g. `releasekit explain RK-GRAPH-CYCLE-DETECTED`) | | `version` | Show the releasekit version | | `migrate` | Migrate from another release tool (release-please, semantic-release, changesets, etc.) | +| `doctor` | Diagnose inconsistent state between workspace, git tags, and platform releases | | `completion` | Generate shell completion scripts (bash/zsh/fish) | ## Features @@ -399,17 +400,27 @@ in the workspace root. Use `releasekit init` to scaffold one: ```toml # releasekit.toml -changelog = true -smoke_test = true -tag_format = "{name}-v{version}" -umbrella_tag = "v{version}" +forge = "github" +repo_owner = "firebase" +repo_name = "genkit" +default_branch = "main" +pr_title_template = "chore(release): v{version}" + +[workspace.py] +ecosystem = "python" +tool = "uv" # defaults from ecosystem if omitted +root = "py" +tag_format = "{name}@{version}" +umbrella_tag = "py/v{version}" +changelog = true +smoke_test = true +major_on_zero = false +max_commits = 500 # limit git log depth for large repos +extra_files = [] exclude_publish = ["group:samples"] -major_on_zero = false -pr_title_template = "chore(release): v{version}" -extra_files = [] -[groups] +[workspace.py.groups] core = ["genkit"] samples = ["*-hello", "*-demo", "web-*"] ``` @@ -432,6 +443,7 @@ samples = ["*-hello", "*-demo", "web-*"] | `major_on_zero` | `false` | Allow `0.x → 1.0.0` on breaking changes (default: downgrade to minor) | | `pr_title_template` | `"chore(release): v{version}"` | Template for the Release PR title. 
Placeholder: `{version}` | | `extra_files` | `[]` | Extra files with version strings to bump (path or `path:regex` pairs) | +| `max_commits` | `0` | Limit git log depth (0 = unlimited; useful for large repos) | ### Exclusion Hierarchy @@ -516,7 +528,14 @@ releasekit │ ├── release_notes.py release notes generation │ ├── commitback.py commit-back version bumps │ ├── detection.py multi-ecosystem auto-detection -│ └── groups.py release group filtering +│ ├── groups.py release group filtering +│ ├── sbom.py CycloneDX + SPDX SBOM generation +│ ├── profiling.py pipeline step timing + bottleneck analysis +│ ├── tracing.py optional OpenTelemetry tracing (graceful no-op) +│ ├── doctor.py release state consistency checker +│ ├── distro.py distro packaging dep sync (Debian/Fedora/Homebrew) +│ ├── branch.py default branch resolution +│ └── commit_parsing/ conventional commit parser (subpackage) │ ├── Formatters │ ├── ascii_art.py box-drawing terminal art @@ -532,7 +551,7 @@ releasekit ├── UX │ ├── errors.py error catalog + Rust-style render_error/render_warning │ ├── logging.py structured logging setup -│ ├── config.py TOML config loading + validation +│ ├── config.py TOML config loading + validation (workspace-aware) │ ├── init.py workspace config scaffolding │ └── cli.py argparse + rich-argparse + shell completion │ @@ -616,6 +635,8 @@ enables multi-ecosystem support: ## Testing +The test suite has **1,274 tests** across 19k+ lines: + ```bash # Run all tests uv run pytest tests/ diff --git a/py/tools/releasekit/docs/competitive-gap-analysis.md b/py/tools/releasekit/docs/competitive-gap-analysis.md index d4e374d7e9..01c834974b 100644 --- a/py/tools/releasekit/docs/competitive-gap-analysis.md +++ b/py/tools/releasekit/docs/competitive-gap-analysis.md @@ -595,9 +595,8 @@ signing, and publishing. 27. **Dart/Pub workspace backend** (`pubspec.yaml`, `dart pub publish`). 28. **Rustification** — Rewrite core in Rust with PyO3/maturin (see roadmap §12). -> **See [roadmap-execution-plan.md](roadmap-execution-plan.md)** for the -> dependency-graphed, topo-sorted parallel execution plan with Gantt chart -> and critical path analysis. +> **See [../roadmap.md](../roadmap.md)** for the detailed roadmap with +> dependency graphs and execution phases. --- diff --git a/py/tools/releasekit/docs/roadmap-execution-plan.md b/py/tools/releasekit/docs/roadmap-execution-plan.md deleted file mode 100644 index 3bc750fce9..0000000000 --- a/py/tools/releasekit/docs/roadmap-execution-plan.md +++ /dev/null @@ -1,1002 +0,0 @@ -# Releasekit Roadmap — Dependency Graph & Parallel Execution Plan - -**Date:** 2026-02-13 - -This document models every roadmap item as a node in a dependency graph, -reverse-topologically sorts it, and partitions it into **parallel execution -phases** (levels) so that independent work streams can proceed simultaneously. - ---- - -## 0. Genkit Python Release — Status - -The full roadmap (§1–§9) covers releasekit's long-term vision across all -ecosystems. This section tracks items **immediately relevant to shipping -Genkit Python**, ordered by release-blocking priority. - -Context: [PR #4586](https://github.com/firebase/genkit/pull/4586) migrates -`publish_python.yml` to use `releasekit publish`. The -[FIXES.md](../FIXES.md) audit identified 6 fixes (F1–F6). The -`releasekit.toml` config defines groups (core, google_plugins, -community_plugins), tag format, and publish exclusions. 
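-
-As a rough sketch of how a `group:`-prefixed exclusion entry can expand
-into concrete package names, consider the following. The group names and
-glob patterns are illustrative (borrowed from the sample config), and the
-matching rules are an assumption; the real behaviour is releasekit's
-group filtering (`groups.py`) and may differ in detail.
-
-```python
-# Illustrative only: expand exclude_publish entries against glob-style
-# group patterns. Not the actual groups.py implementation.
-from fnmatch import fnmatch
-
-groups = {
-    'core': ['genkit'],
-    'samples': ['*-hello', '*-demo', 'web-*'],
-}
-exclude_publish = ['group:samples']
-
-def excluded(package: str) -> bool:
-    """Return True if the package matches any exclusion entry."""
-    for entry in exclude_publish:
-        if entry.startswith('group:'):
-            patterns = groups[entry.removeprefix('group:')]
-        else:
-            patterns = [entry]
-        if any(fnmatch(package, pattern) for pattern in patterns):
-            return True
-    return False
-
-print(excluded('provider-google-genai-hello'))  # True: matches '*-hello'
-print(excluded('genkit-plugin-firebase'))       # False: gets published
-```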
- -### Tier 0 — Release Blockers — ✅ ALL DONE - -| ID | Item | Status | Notes | -|----|------|--------|-------| -| **F4** | `--first-parent` in `git log` | ✅ Done | `versioning.py:316`, `changelog.py:320` already pass `first_parent=True` | -| **F1** | Label new PRs with `autorelease: pending` | ✅ Done | `prepare.py:376-390` labels both new and existing PRs | -| **F2** | Filter merged PR lookup by head branch | ✅ Done | `release.py:237-241` filters by `head=_RELEASE_BRANCH` | -| **F3** | Fix `actions/checkout@v5` → `@v4` | ✅ N/A | `actions/checkout@v5` exists (released 2024). Not a bug. | - -### Tier 1 — High Value — ✅ ALL DONE - -| ID | Item | Status | Notes | -|----|------|--------|-------| -| **F6** | Write per-package `CHANGELOG.md` to disk | ✅ Done | `prepare.py:313-321` + `changelog.py:write_changelog()` | -| **F5** | Auto-prepare on push to main | ✅ Done | `releasekit-uv.yml:46-50` triggers on push to `py/packages/**`, `py/plugins/**` | -| **R07** | Internal dep version propagation | ✅ Done | `versioning.py:386-400` BFS propagation via `graph.reverse_edges` | -| **R32** | Parallel `vcs.log()` in `compute_bumps` | ✅ Done | Replaced sequential loop with `asyncio.gather` (2026-02-12) | -| **R04** | Revert commit handling | ✅ Done | `parse_conventional_commit` detects `Revert "..."` and `revert:` formats; bump counter cancellation (2026-02-12) | -| **R27** | `--ignore-unknown-tags` flag | ✅ Done | `compute_bumps(ignore_unknown_tags=True)` falls back to full history on bad tags; CLI flag on publish/plan/version (2026-02-12) | -| — | `--no-merges` in VCS protocol | ✅ Done | `VCS.log(no_merges=True)` filters accidental merge commits from bump computation and changelogs (2026-02-12) | -| — | Default branch auto-detection | ✅ Done | `VCS.default_branch()` + `branch.py:resolve_default_branch()` + `config.default_branch` override. Git: `symbolic-ref` → probe → fallback. Mercurial: `"default"` (2026-02-12) | -| — | Distro packaging dep sync | ✅ Done | `distro.py`: auto-syncs Debian/Ubuntu `control`, Fedora/RHEL `.spec`, and Homebrew formula deps from `pyproject.toml`. Check via `releasekit check`, fix via `releasekit check --fix` (2026-02-12, Homebrew added 2026-02-13) | -| — | Non-conventional commit warnings | ✅ Done | `versioning.py` and `changelog.py` now log `non_conventional_commit` warnings for improperly formatted commit messages (2026-02-12) | -| — | Debian/Ubuntu + Fedora/RHEL + Homebrew packaging | ✅ Done | `packaging/debian/` (control, changelog, copyright, rules) + `packaging/fedora/*.spec` + `packaging/homebrew/*.rb` + `packaging/README.md` (2026-02-12, Homebrew added 2026-02-13) | -| — | pnpm publish params (`dist_tag`, `publish_branch`, `provenance`) | ✅ Done | Threaded through `PackageManager` protocol → `PnpmBackend.publish()` → `PublishConfig` → `WorkspaceConfig` → CLI `--dist-tag` flag (2026-02-13) | -| — | Ecosystem-aware `discover_packages` | ✅ Done | `discover_packages(ecosystem=)` dispatches to `PnpmWorkspace` for JS, `uv` for Python. Async bridge via `_discover_js_packages` (2026-02-13) | -| — | `pyproject_path` → `manifest_path` rename | ✅ Done | Renamed across all source and test files for ecosystem-agnostic naming (2026-02-13) | - -### Tier 2 — Important but Not Blocking — ✅ ALL DONE - -| ID | Item | Effort | Status | Why Important | -|----|------|--------|--------|---------------| -| **R25** | `--commit-depth` / `--max-commits` | S | ✅ Done | `max_commits` param on VCS protocol, `compute_bumps`, `WorkspaceConfig`. 
| -| **R05** | `releasekit doctor` | M | ✅ Done | `run_doctor` in `doctor.py` with 6 checks (config, tag alignment, orphaned tags, VCS state, forge, default branch). CLI `releasekit doctor` subcommand wired (2026-02-13). | -| **R26** | `bootstrap-sha` config | S | ✅ Done | `bootstrap_sha` on `WorkspaceConfig`, threaded through `compute_bumps`, `prepare_release`, and all CLI call sites. Falls back to full history when no tag exists (2026-02-13). | -| **R08** | Contributor attribution in changelogs | S | ✅ Done | `ChangelogEntry.author` field, git log format `%H\x00%an\x00%s`, rendered as `— @author` in changelog entries (2026-02-13). | -| **R28** | Lockfile update after version bump | S | ✅ Done | `prepare.py` step 5 calls `pm.lock(upgrade_package=ver.name)` after each `bump_pyproject` (2026-02-13). | -| **R17** | Auto-merge release PRs | S | ✅ Done | `auto_merge` config on `WorkspaceConfig`. `prepare.py` step 10 calls `forge.merge_pr()` after labeling. All 4 forge backends implement `merge_pr` (2026-02-13). | - -### Genkit JS Release — Parity Analysis & Migration Plan - -**Goal:** Migrate Genkit JS from its current shell-script-based release -process to releasekit, achieving full parity before switching over. - -#### Current Genkit JS Release Process (as-is) - -The JS release pipeline is spread across 6 GitHub Actions workflows and -4 shell scripts: - -| Workflow / Script | What It Does | -|-------------------|-------------| -| `bump-js-version.yml` | Manual dispatch → runs `bump_and_tag_js.sh` to bump **all** JS packages in lockstep, commit, tag, push. | -| `bump-cli-version.yml` | Manual dispatch → runs `bump_and_tag_cli.sh` to bump CLI packages (`tools-common`, `telemetry-server`, `genkit-cli`) separately. | -| `bump-package-version.yml` | Manual dispatch → bumps a **single** package by dir + name. | -| `release_js_main.yml` | Manual dispatch → `pnpm install && pnpm build && pnpm test:js`, then runs `scripts/release_main.sh` which publishes ~20 packages **sequentially** to Wombat Dressing Room (Google's npm proxy). | -| `release_js_package.yml` | Manual dispatch → publishes a **single** package to Wombat. | -| `build-cli-binaries.yml` | Manual dispatch → cross-compiles CLI binaries via Bun for 5 platforms (linux-x64, linux-arm64, darwin-x64, darwin-arm64, win32-x64), uploads artifacts, runs smoke tests. | - -**Key characteristics:** -- **Manual version bumps** — operator picks `patch`/`minor`/`major`/`prerelease` via workflow dispatch; no Conventional Commits automation. -- **Synchronized versions** — `bump_and_tag_js.sh` bumps all JS packages to the same version (lockstep mode). -- **Separate CLI versioning** — CLI packages (`genkit-tools/*`) are versioned independently from `js/*` packages. -- **Tag format** — dual tags per package: `{tag_prefix}{version}` (e.g. `core-v1.2.3`) **and** `{package_name}@{version}` (e.g. `@genkit-ai/core@1.2.3`). -- **npm dist-tag** — publishes with `--tag next` or `--tag latest` (operator choice). -- **Wombat Dressing Room** — all publishes go through `https://wombat-dressing-room.appspot.com/` (Google's npm proxy that adds provenance). -- **No changelogs** — no automated CHANGELOG generation. -- **No Release PR** — version bumps are committed directly to the branch. -- **No dependency graph awareness** — publish order is hardcoded in `release_main.sh`. -- **Sequential publish** — one package at a time, no parallelism. -- **Clean worktree check** — `ensure-clean-working-tree.sh` runs after build, before publish. 
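-
-For contrast with the manual bump selection noted above, here is a
-simplified sketch of the Conventional Commits to bump-level mapping that
-releasekit automates. The regex and rules are an approximation for
-illustration, not the actual `versioning.py` implementation, which also
-handles revert commits that cancel the bump they revert.
-
-```python
-# Approximate sketch of conventional-commit classification; the real
-# parse_conventional_commit in versioning.py is more complete.
-import re
-
-_HEADER = re.compile(r'^(?P<type>\w+)(?:\([^)]*\))?(?P<bang>!)?:\s')
-
-def bump_level(message: str) -> str:
-    """Classify a commit message as a major, minor, patch, or no-op bump."""
-    match = _HEADER.match(message)
-    if match is None:
-        return 'none'  # not a conventional commit
-    if match.group('bang') or 'BREAKING CHANGE:' in message:
-        return 'major'
-    if match.group('type') == 'feat':
-        return 'minor'
-    if match.group('type') == 'fix':
-        return 'patch'
-    return 'none'  # chore, docs, refactor, ...
-
-print(bump_level('feat(scheduler): dependency-triggered dispatch'))  # minor
-print(bump_level('fix: restore pyproject after a failed publish'))   # patch
-```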
- -#### Releasekit Parity Gap Analysis - -| JS Capability | Releasekit Status | Gap / Work Needed | -|--------------|-------------------|-------------------| -| pnpm workspace discovery | ✅ Done | `PnpmWorkspaceBackend` reads `pnpm-workspace.yaml`, discovers packages from `package.json`. | -| `npm version` bump | ✅ Done | `PnpmBackend.version_bump()` uses `npm version --no-git-tag-version`. | -| Synchronized (lockstep) versions | ✅ Done | `synchronize = true` in `WorkspaceConfig`. | -| Independent per-package bump | ✅ Done | Default mode. | -| Separate release groups (JS vs CLI) | ✅ Done | `groups` config in `WorkspaceConfig`. | -| Dual tag format (`prefix-v` + `name@`) | ✅ Done | `tag_format` with `{label}` placeholder. Per-workspace config. | -| npm dist-tag (`next` / `latest`) | ✅ Done | `--dist-tag` CLI flag → `WorkspaceConfig` → `PublishConfig` → `PnpmBackend.publish(--tag)`. | -| Wombat Dressing Room registry | ✅ Done | `PnpmBackend.publish(index_url=...)` maps to `--registry`. | -| `pnpm publish` | ✅ Done | `PnpmBackend.publish()` with `--access=public`, `--registry`, `--tag`, `--publish-branch`, `--provenance`. | -| `pnpm install` / `pnpm build` / `pnpm test` | ✅ Done | `PnpmBackend.build()` (`pnpm pack`), `lock()`, `smoke_test()`. | -| pnpm lockfile update | ✅ Done | `PnpmBackend.lock()` — `pnpm install --lockfile-only` / `--frozen-lockfile`. | -| Cross-compiled CLI binaries | ❌ Out of scope | **R23**: Cross-compilation orchestration. Separate concern from release. | -| Conventional Commits automation | ✅ Done | JS currently lacks this; releasekit adds it. **Upgrade.** | -| Changelog generation | ✅ Done | JS currently lacks this; releasekit adds it. **Upgrade.** | -| Release PR workflow | ✅ Done | JS currently lacks this; releasekit adds it. **Upgrade.** | -| Dependency-aware publish order | ✅ Done | JS currently hardcodes order; releasekit computes it from the graph. **Upgrade.** | -| Parallel publish | ✅ Done | JS publishes sequentially; releasekit parallelizes by dependency level. **Upgrade.** | -| Clean worktree preflight | ✅ Done | `preflight.py` checks this. | -| Prerelease support (`--preid rc`) | ⚠️ Partial | **R03**: Full prerelease workflow (rollup vs separate). Basic `prerelease` param exists in `compute_bumps`. | - -#### Migration Workflow - -**Phase 1 — pnpm Backend (R11)** - -Implement the pnpm workspace backend so releasekit can discover, build, -test, version-bump, and publish JS packages: - -1. `PnpmWorkspaceBackend` — discover packages from `pnpm-workspace.yaml` -2. `PnpmBackend.publish()` — `pnpm publish` with `--tag`, `--registry`, - `--publish-branch`, `--access=public`, `--provenance=false` -3. `PnpmBackend.lock()` — `pnpm install --lockfile-only` -4. `PnpmBackend.version_bump()` — `npm version` or `pnpm version` -5. `PnpmBackend.build()` / `test()` — `pnpm build`, `pnpm test` - -**Phase 2 — npm Registry Backend (R12, R37)** - -1. `NpmRegistryBackend` — check if a version is already published - (`npm view @`) -2. Wombat Dressing Room support — custom `--registry` URL for publish -3. npm dist-tag support — `--tag next` / `--tag latest` - -**Phase 3 — JS Workspace Config** - -Add a `[workspace.js]` section to `releasekit.toml`: - -```toml -[workspace.js] -ecosystem = "js" -tool = "pnpm" -root = "." # JS packages span root + js/ -tag_format = "{name}@{version}" -umbrella_tag = "js/v{version}" -synchronize = true # lockstep versions for js/* -bootstrap_sha = "abc123..." 
# starting point for adoption - -[workspace.js-cli] -ecosystem = "js" -tool = "pnpm" -root = "genkit-tools" -tag_format = "{name}@{version}" -synchronize = true -``` - -**Phase 4 — Parallel Cutover** - -1. Run releasekit in `--dry-run` mode alongside existing scripts for - 1–2 release cycles to validate parity. -2. Verify tag format, version bumps, and publish output match. -3. Switch `release_js_main.yml` to call `releasekit publish`. -4. Archive `scripts/release_main.sh` and `js/scripts/bump_*.sh`. - -#### What Releasekit Gains Over Current JS Process - -- **Automated version bumps** from Conventional Commits (no manual - `patch`/`minor`/`major` selection). -- **Changelogs** generated automatically per package. -- **Release PR workflow** with review gate before publish. -- **Dependency-aware parallel publish** instead of hardcoded sequential. -- **Unified tooling** across Python and JS ecosystems. -- **Rollback support** (`releasekit rollback `). -- **Preflight checks** (cycles, lockfile, shallow clone, forge). -- **Doctor diagnostics** for state consistency. - -### Tier 3 — Extended Features (partially done) - -| ID | Item | Status | Notes | -|----|------|--------|-------| -| ★ **R11** | pnpm workspace publish pipeline | ✅ Done | `PnpmBackend` + `PnpmWorkspace` fully implemented (2026-02-13). | -| ★ **R12** | npm registry backend | ✅ Done | `NpmRegistry` with `npm view` version check (2026-02-13). | -| ★ **R37** | Custom registry URL / Wombat Dressing Room | ✅ Done | `index_url` wired through `PnpmBackend.publish(--registry)` (2026-02-13). | -| **R13** | Scoped tag format | ✅ Done | `parse_tag()` reverse-parses scoped npm tags (`@scope/name@version`). `secondary_tag_format` config for dual-tagging in `create_tags`. (2026-02-13). | -| **R30** | Plan profiling | ✅ Done | `profiling.py`: `StepTimer` context manager, `PipelineProfile` with summary stats, JSON export, and ASCII table rendering (2026-02-13). | -| **R31** | OpenTelemetry tracing | ✅ Done | `tracing.py`: optional OTel spans with zero-overhead no-op fallback when `opentelemetry-api` is not installed. `@span` decorator for sync/async. `pip install releasekit[tracing]` (2026-02-13). | -| **R02** | Standalone repo packaging | Pending | PyPI-publishable wheel + entry point. | -| **R03** | Full prerelease workflow | Pending | Rollup vs separate prerelease modes. | -| **R06** | Hotfix / maintenance branches | Pending | `--base-branch` for non-default branch releases. | -| **R10** | Snapshot releases | Pending | `--snapshot` for CI testing with ephemeral versions. | -| **R14** | npm provenance (Sigstore) | Pending | `--provenance` attestation for npm publishes. | -| **R15** | GPG / Sigstore signing | Pending | Sign tags and release artifacts. | -| **R16** | SBOM generation | ✅ Done | `sbom.py`: CycloneDX 1.5 + SPDX 2.3 JSON generation from release manifest. Package URLs (purl), license IDs, supplier metadata. `generate_sbom()` + `write_sbom()` (2026-02-13). | -| **R23** | Cross-compilation orchestration | Pending | CLI binary builds for multiple platforms. | -| **R24** | PEP 440 scheme | Pending | Full PEP 440 version scheme support. | -| **R29** | `releasekit migrate` | Pending | Protocol-based migration from alternatives. | -| **R38** | Cherry-pick for release branches | Pending | `releasekit cherry-pick` subcommand. 
| -| **R18–R22** | Changelog templates, announcements, changesets, plugins, programmatic API | Pending | | -| **R33–R36** | Bazel, Rust, Java, Dart ecosystem backends | Pending | | - -### Implementation Summary (2026-02-13) - -All Tier 0, Tier 1, and Tier 2 items are complete. The release pipeline -is production-ready for Genkit Python. JS parity backends (pnpm, npm) -are implemented and ecosystem-aware. - -**2026-02-13 additions:** `releasekit doctor` CLI wired, `bootstrap_sha` -confirmed wired, contributor attribution in changelogs (`@author`), -lockfile update after bump confirmed wired, auto-merge release PRs -(`auto_merge` config + `forge.merge_pr()`). - -**Codebase stats:** 73 source modules (~23,400 LOC), 51 test files -(~20,700 LOC), 1293 tests passing (86% coverage), 14 CLI subcommands. - -**Protocols:** 5 backend protocols (VCS, PackageManager, Workspace, -Registry, Forge) + 1 check protocol (CheckBackend). - -**Backends implemented:** - -| Protocol | Backends | -|----------|----------| -| VCS | Git (full), Mercurial (full) | -| PackageManager | uv, pnpm | -| Workspace | uv, pnpm | -| Registry | PyPI, npm | -| Forge | GitHub (CLI + API), GitLab (CLI), Bitbucket (API) | -| CheckBackend | PythonCheckBackend (34 checks + 14 auto-fixers) | - -**Key changes (2026-02-12):** - -- **R32** — `versioning.py`: `compute_bumps` Phase 1 now uses - `asyncio.gather` to run per-package `vcs.log()` + `tag_exists()` - concurrently (~10× speedup for 60+ packages). -- **R04** — `versioning.py`: `parse_conventional_commit` handles - `Revert "feat: ..."` (GitHub format) and `revert: feat: ...` - (conventional format). Bump computation uses per-level counters - where reverts decrement, so a reverted `feat:` cancels the MINOR bump. -- **R27** — `versioning.py` + `cli.py`: New `ignore_unknown_tags` - parameter on `compute_bumps`. When `True`, a failed `git log {tag}..HEAD` - falls back to `since_tag=None` (full history) with a warning. - CLI flag `--ignore-unknown-tags` added to `publish`, `plan`, `version`. -- **`--no-merges`** — VCS protocol + Git/Mercurial backends filter - accidental merge commits from bump computation and changelogs. -- **Default branch detection** — `VCS.default_branch()` auto-detects - via `git symbolic-ref` (Git) or returns `"default"` (Mercurial). - Config override via `default_branch` in `releasekit.toml`. - `prepare.py` uses `resolve_default_branch()` for PR base. -- **Distro dep sync** — New `distro.py` module parses `pyproject.toml` - deps and generates/validates Debian/Ubuntu `control` and Fedora/RHEL - `.spec` dependency lists. Integrated as check (`distro_deps`) and - auto-fixer (`releasekit check --fix`). -- **Non-conventional commit warnings** — `versioning.py` and - `changelog.py` now log structured warnings for commit messages that - don't follow Conventional Commits format. -- **Distro packaging** — Added `packaging/debian/` and - `packaging/fedora/` with full Debian and RPM packaging files. - -**Key changes (2026-02-13):** - -- **Config at repo root** — `releasekit.toml` moved to repo root. - `_find_workspace_root()` updated. New `_effective_workspace_root()` - resolves per-workspace root from `config_root / ws_config.root`. -- **Ecosystem-aware backends** — `_create_backends()` selects - `PnpmBackend`/`NpmRegistry` for JS workspaces, `UvBackend`/`PyPIBackend` - for Python, based on `ws_config.tool`. -- **R25** — `max_commits` param added to VCS protocol, `compute_bumps`, - and `WorkspaceConfig`. Bounds changelog generation for large repos. 
-- **Tag `{label}` placeholder** — `format_tag()`, `create_tags()`, - `delete_tags()` accept `label` param. Tag format `{name}@{version}`, - umbrella tag `py/v{version}`. -- **VCS `list_tags` + `current_branch`** — Added to VCS protocol, - `GitCLIBackend`, `MercurialCLIBackend`, and all 8 FakeVCS test classes. - Enables `releasekit doctor` orphan tag and branch checks. -- **R05 partial** — `run_doctor()` in `doctor.py` with 7 diagnostic - checks (config, VCS, forge, registry, orphaned tags, branch, packages). - CLI wiring still pending. -- **CI matrix expansion** — `tool-tests` and `conform-tests` now run - on Python 3.10–3.14 (5 versions). Path-filtered via - `dorny/paths-filter` so tests only run when relevant files change. -- **Repo portability** — Audited: zero genkit imports, zero hardcoded - paths, self-contained deps. Ready for standalone repo extraction. -- **pnpm publish params** — `dist_tag`, `publish_branch`, `provenance` - added to `PackageManager` protocol, `PnpmBackend.publish()` (maps to - `--tag`, `--publish-branch`, `--provenance`), `UvBackend.publish()` - (accepts and ignores for protocol compat), `PublishConfig`, - `WorkspaceConfig`, `_WORKSPACE_TYPE_MAP`. CLI `--dist-tag` flag on - `publish` subcommand. All `FakePackageManager.publish()` signatures - updated across 3 test files. -- **Ecosystem-aware `discover_packages`** — New `ecosystem` parameter - dispatches to `PnpmWorkspace.discover()` for JS workspaces via - `_discover_js_packages` async bridge. All `discover_packages` call - sites in `cli.py` updated to pass `ws_config.ecosystem`. -- **`pyproject_path` → `manifest_path`** — Renamed across all source - and test files for ecosystem-agnostic naming (`Package` dataclass, - `discover_packages`, `ephemeral_pin`, CLI, tests). -- **Homebrew packaging** — `packaging/homebrew/releasekit.rb` formula - with `virtualenv_install_with_resources` and 10 dependency resource - blocks. `distro.py`: `_brew_resource_name()`, `expected_brew_resources()`, - `_parse_brew_resources()`, `check_brew_deps()`, `fix_brew_formula()`. - Wired into `check_distro_deps()` and `fix_distro_deps()`. 19 new tests. - `packaging/README.md` updated with Homebrew section. -- **R38** — Cherry-pick for release branches added to roadmap (depends - on R06). Added to items table, gap traceability, Mermaid graph, topo - sort, parallel execution phases, and Gantt chart. -- **FAQ edge cases** — Added 19 edge case entries to `docs/guides/faq.md` - covering dependency graph topologies (diamond, disconnected, chain, - cycle, self-dep) and version bump edge cases (revert cancellation, - mixed levels, `major_on_zero`, `synchronize`, `propagate_bumps`, - `force_unchanged`, `exclude_bump` vs `exclude_publish`, `max_commits`, - unreachable tags). - -All 1293 tests pass (86% coverage). - ---- - -## 1. Roadmap Items (Nodes) - -Each item has an ID, description, estimated effort, and list of dependencies. 
- -| ID | Item | Effort | Depends On | -|----|------|--------|------------| -| `R01` | Core protocol audit — ensure all 6 protocols are fully agnostic | S | — | -| `R02` | Standalone repo scaffolding (CI, pyproject.toml, LICENSE, docs) | S | `R01` | -| `R03` | Pre-release workflow (`--prerelease` flag, PEP 440 / SemVer) | M | `R01` | -| `R04` | Revert commit handling (cancel bumps for reverted commits) | S | — | -| `R05` | `releasekit doctor` (state consistency checker) | M | — | -| `R06` | Hotfix / maintenance branch support (`--base-branch`) | M | `R03` | -| `R07` | Internal dep version propagation (`fix_internal_dep_versions`) | M | — | -| `R08` | Contributor attribution in changelogs | S | — | -| `R09` | Incremental changelog generation (perf for large repos) | M | — | -| `R10` | Snapshot releases (`--snapshot` for CI testing) | S | `R03` | -| `R11` | pnpm workspace publish pipeline (end-to-end JS support) | L | `R01` | -| `R12` | npm registry backend (wire up `NpmRegistry` for publish) | M | `R11` | -| `R13` | Wombat proxy auth support (Google internal npm proxy) | S | `R12` | -| `R14` | `@scope/name@version` tag format support | S | `R11` | -| `R15` | Sigstore / GPG signing + provenance | M | `R02` | -| `R16` | SBOM generation (CycloneDX / SPDX) | M | `R15` | -| `R17` | Auto-merge release PRs | S | — | -| `R18` | Custom changelog templates (Jinja2) | S | — | -| `R19` | Announcement integrations (Slack, Discord) | S | — | -| `R20` | Optional changeset file support (hybrid with conv. commits) | M | — | -| `R21` | Plugin system for custom steps (entry-point discovery) | L | `R01` | -| `R22` | Programmatic Python API | L | `R01`, `R21` | -| `R23` | Cross-compilation orchestration (CLI binaries) | M | `R02` | -| `R24` | PEP 440 version scheme (`version_scheme = "pep440"`) | S | `R03` | -| `R25` | `--commit-depth` / `--max-commits` for large repos | S | — | -| `R26` | `bootstrap-sha` config for mid-stream adoption | S | `R05` | -| `R27` | `--ignore-unknown-tags` flag | S | — | -| `R28` | Lockfile update after version bump | S | `R07` | -| `R29` | `releasekit migrate` — protocol-based migration from alternatives | M | `R01`, `R02` | -| `R30` | `releasekit plan --analyze` — critical path & bottleneck profiling | S | — | -| `R31` | OpenTelemetry tracing backend (spans for publish stages, HTTP, git) | M | `R01` | -| `R32` | Parallel `vcs.log()` in `compute_bumps` via `asyncio.gather` | S | — | -| `R33` | Bazel workspace backend (BUILD files, `bazel run //pkg:publish`) | L | `R01` | -| `R34` | Rust/Cargo workspace backend (`Cargo.toml`, `cargo publish`) | M | `R01` | -| `R35` | Java backend (Maven `pom.xml` / Gradle `build.gradle`, `mvn deploy`) | L | `R01` | -| `R36` | Dart/Pub workspace backend (`pubspec.yaml`, `dart pub publish`) | M | `R01` | -| `R37` | pyx package registry backend | M | `R01` | -| `R38` | Cherry-pick for release branches (`releasekit cherry-pick`) | M | `R06` | - -**Effort key:** S = Small (1–3 days), M = Medium (3–7 days), L = Large (1–2 weeks) - -### Gap → Roadmap Traceability - -Every gap identified in the [competitive analysis](competitive-gap-analysis.md) -maps to one or more roadmap nodes: - -| Severity | Gap | Roadmap Node(s) | Alternative Tool Issues | -|----------|-----|-----------------|-------------------| -| 🔴 Critical | Pre-release workflow | `R03`, `R24` | release-please [#510](https://github.com/googleapis/release-please/issues/510), semantic-release [#563](https://github.com/semantic-release/semantic-release/issues/563) | -| 🔴 Critical | Revert commit 
handling | `R04` | release-please [#296](https://github.com/googleapis/release-please/issues/296) | -| 🔴 Critical | Hotfix / maintenance branches | `R06` | release-please [#2475](https://github.com/googleapis/release-please/issues/2475), semantic-release [#1038](https://github.com/semantic-release/semantic-release/issues/1038) | -| 🟠 High | Dep version propagation | `R07`, `R28` | release-please [#1032](https://github.com/googleapis/release-please/issues/1032) | -| 🟠 High | Contributor attribution | `R08` | release-please [#292](https://github.com/googleapis/release-please/issues/292) | -| 🟠 High | PEP 440 version scheme | `R24` | python-semantic-release [#455](https://github.com/python-semantic-release/python-semantic-release/issues/455) | -| 🟠 High | Performance on large repos | `R09`, `R25`, `R26` | python-semantic-release [#722](https://github.com/python-semantic-release/python-semantic-release/issues/722) | -| 🟠 High | `releasekit doctor` | `R05`, `R26` | release-please [#1946](https://github.com/googleapis/release-please/issues/1946) | -| 🟡 Nice | GPG / Sigstore signing | `R15`, `R16` | release-please [#1314](https://github.com/googleapis/release-please/issues/1314) | -| 🟡 Nice | Auto-merge release PRs | `R17` | release-please [#2299](https://github.com/googleapis/release-please/issues/2299) | -| 🟡 Nice | Custom changelog templates | `R18` | release-please [#2007](https://github.com/googleapis/release-please/issues/2007) | -| 🟡 Nice | Plugin / extension system | `R21`, `R22` | python-semantic-release [#321](https://github.com/python-semantic-release/python-semantic-release/issues/321) | -| 🟡 Nice | Snapshot releases | `R10` | changesets (built-in feature) | -| 🟡 Nice | Changeset file support | `R20` | changesets [#862](https://github.com/changesets/changesets/issues/862) | -| 🟡 Nice | Announcement integrations | `R19` | goreleaser (built-in feature) | -| 🟢 Growth | `releasekit migrate` command | `R29` | Users of all alternatives | -| 🟠 High | Plan profiling / bottleneck analysis | `R30` | python-semantic-release [#722](https://github.com/python-semantic-release/python-semantic-release/issues/722) | -| 🟠 High | OpenTelemetry tracing | `R31` | No alternative has this | -| 🟠 High | Parallel commit log fetching | `R32` | python-semantic-release [#722](https://github.com/python-semantic-release/python-semantic-release/issues/722) | -| 🟢 Growth | Bazel workspace support | `R33` | No alternative supports Bazel monorepos | -| 🟢 Growth | Rust/Cargo workspace support | `R34` | cargo-release is single-crate only | -| 🟢 Growth | Java (Maven/Gradle) support | `R35` | jreleaser covers Java but no monorepo graph | -| 🟢 Growth | Dart/Pub workspace support | `R36` | No alternative supports Dart workspaces | -| 🟢 Growth | pyx package registry support | `R37` | No alternative supports pyx | -| 🟠 High | Cherry-pick for release branches | `R38` | release-please [#2475](https://github.com/googleapis/release-please/issues/2475), semantic-release [#1038](https://github.com/semantic-release/semantic-release/issues/1038) | - ---- - -## 2. 
Dependency Graph (Mermaid) - -```mermaid -graph TD - R01[R01: Protocol audit] - R02[R02: Standalone repo] - R03[R03: Pre-release workflow] - R04[R04: Revert handling] - R05[R05: Doctor command] - R06[R06: Hotfix branches] - R07[R07: Dep version propagation] - R08[R08: Contributor changelogs] - R09[R09: Incremental changelog] - R10[R10: Snapshot releases] - R11[R11: pnpm publish pipeline] - R12[R12: npm registry backend] - R13[R13: Wombat proxy auth] - R14[R14: Scoped tag format] - R15[R15: Signing + provenance] - R16[R16: SBOM generation] - R17[R17: Auto-merge PRs] - R18[R18: Changelog templates] - R19[R19: Announcements] - R20[R20: Changeset file support] - R21[R21: Plugin system] - R22[R22: Programmatic API] - R23[R23: Cross-compilation] - R24[R24: PEP 440 scheme] - R25[R25: Commit depth limit] - R26[R26: Bootstrap SHA] - R27[R27: Ignore unknown tags] - R28[R28: Lockfile update] - R29[R29: Migrate command] - R30[R30: Plan profiling] - R31[R31: OTel tracing] - R32[R32: Parallel vcs.log] - R33[R33: Bazel backend] - R34[R34: Rust/Cargo backend] - R35[R35: Java Maven/Gradle] - R36[R36: Dart/Pub backend] - R37[R37: pyx registry] - R38[R38: Cherry-pick for release branches] - - R01 --> R37 - R01 --> R33 - R01 --> R34 - R01 --> R35 - R01 --> R36 - R01 --> R02 - R01 --> R29 - R02 --> R29 - R01 --> R31 - R01 --> R03 - R01 --> R11 - R01 --> R21 - R01 --> R22 - R03 --> R06 - R03 --> R10 - R03 --> R24 - R05 --> R26 - R07 --> R28 - R11 --> R12 - R11 --> R14 - R12 --> R13 - R02 --> R15 - R15 --> R16 - R02 --> R23 - R21 --> R22 - R06 --> R38 - - classDef small fill:#d4edda,stroke:#28a745 - classDef medium fill:#fff3cd,stroke:#ffc107 - classDef large fill:#f8d7da,stroke:#dc3545 - - class R01,R04,R08,R10,R13,R14,R17,R18,R19,R24,R25,R26,R27,R28,R02 small - class R03,R05,R06,R07,R09,R12,R15,R16,R20,R23,R29,R31 medium - class R11,R21,R22,R33,R35 large - class R30,R32 small - class R34,R36,R37,R38 medium -``` - ---- - -## 3. Reverse Topological Sort - -Reverse topological order (leaves first, roots last): - -``` -Level 0 (no deps): R01, R04, R05, R07, R08, R09, R17, R18, R19, R20, R25, R27, R30, R32 -Level 1 (deps on L0): R02, R03, R11, R21, R26, R28 -Level 2 (deps on L1): R06, R10, R12, R14, R15, R22, R23, R24, R29, R31, R33, R34, R35, R36, R37 -Level 3 (deps on L2): R13, R16, R38 -``` - ---- - -## 4. Parallel Execution Phases - -Items within each phase can execute **simultaneously**. A phase starts only -after all items in the previous phase are complete. 
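-
-One way to derive such phases mechanically is to set an item's level to
-one more than the highest level among its dependencies. A minimal sketch
-over a handful of the IDs above, with dependencies copied from the
-"Depends On" column in §1:
-
-```python
-# Toy sketch: compute dependency levels (phases) for a few roadmap items.
-# level(item) = 1 + max(level of its dependencies), or 0 with no deps.
-from functools import cache
-
-deps = {
-    'R01': [],
-    'R03': ['R01'],
-    'R11': ['R01'],
-    'R12': ['R11'],
-    'R13': ['R12'],
-    'R06': ['R03'],
-    'R38': ['R06'],
-}
-
-@cache
-def level(item: str) -> int:
-    return 1 + max((level(dep) for dep in deps[item]), default=-1)
-
-phases: dict[int, list[str]] = {}
-for item in deps:
-    phases.setdefault(level(item), []).append(item)
-print(phases)
-# {0: ['R01'], 1: ['R03', 'R11'], 2: ['R12', 'R06'], 3: ['R13', 'R38']}
-```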
- -### Phase 0 — Foundation (all independent, max parallelism) - -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ R01 Protocol audit [S] │ -│ R04 Revert commit handling [S] │ -│ R05 Doctor command [M] │ -│ R07 Internal dep version propagation [M] │ -│ R08 Contributor attribution in changelogs [S] │ -│ R09 Incremental changelog generation [M] │ -│ R17 Auto-merge release PRs [S] │ -│ R18 Custom changelog templates [S] │ -│ R19 Announcement integrations [S] │ -│ R20 Optional changeset file support [M] │ -│ R25 Commit depth limit [S] │ -│ R27 Ignore unknown tags [S] │ -│ R30 Plan profiling / bottleneck analysis [S] │ -│ R32 Parallel vcs.log in compute_bumps [S] │ -├─────────────────────────────────────────────────────────────────────┤ -│ 14 items │ ~7 days wall-clock (limited by M items) │ -│ Critical path: R01 (gates Phase 1) │ -└─────────────────────────────────────────────────────────────────────┘ -``` - -### Phase 1 — Core Features (depends on Phase 0) - -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ R02 Standalone repo scaffolding [S] ← R01 │ -│ R03 Pre-release workflow [M] ← R01 │ -│ R11 pnpm workspace publish pipeline [L] ← R01 │ -│ R21 Plugin system [L] ← R01 │ -│ R26 Bootstrap SHA config [S] ← R05 │ -│ R28 Lockfile update after bump [S] ← R07 │ -├─────────────────────────────────────────────────────────────────────┤ -│ 6 items │ ~10 days wall-clock (limited by L items: R11, R21) │ -│ Critical path: R11 (gates JS publish in Phase 2) │ -└─────────────────────────────────────────────────────────────────────┘ -``` - -### Phase 2 — Ecosystem & Extensions (depends on Phase 1) - -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ R06 Hotfix branch support [M] ← R03 │ -│ R10 Snapshot releases [S] ← R03 │ -│ R12 npm registry backend [M] ← R11 │ -│ R14 Scoped tag format [S] ← R11 │ -│ R15 Sigstore / GPG signing [M] ← R02 │ -│ R22 Programmatic Python API [L] ← R01, R21 │ -│ R23 Cross-compilation orchestration [M] ← R02 │ -│ R24 PEP 440 version scheme [S] ← R03 │ -│ R29 Migrate command [M] ← R01, R02 │ -│ R31 OpenTelemetry tracing [M] ← R01 │ -│ R33 Bazel workspace backend [L] ← R01 │ -│ R34 Rust/Cargo workspace backend [M] ← R01 │ -│ R35 Java (Maven/Gradle) backend [L] ← R01 │ -│ R36 Dart/Pub workspace backend [M] ← R01 │ -│ R37 pyx package registry backend [M] ← R01 │ -├─────────────────────────────────────────────────────────────────────┤ -│ 15 items │ ~10 days wall-clock (limited by L items: R22, R33, R35)│ -│ Critical path: R12 (gates Wombat proxy in Phase 3) │ -└─────────────────────────────────────────────────────────────────────┘ -``` - -### Phase 3 — Polish (depends on Phase 2) - -``` -┌─────────────────────────────────────────────────────────────────────┐ -│ R13 Wombat proxy auth [S] ← R12 │ -│ R16 SBOM generation [M] ← R15 │ -│ R38 Cherry-pick for release branches [M] ← R06 │ -├─────────────────────────────────────────────────────────────────────┤ -│ 3 items │ ~7 days wall-clock │ -└─────────────────────────────────────────────────────────────────────┘ -``` - ---- - -## 5. 
Critical Path Analysis - -The **longest path** through the dependency graph determines the minimum -total wall-clock time: - -``` -R01 (S:3d) → R11 (L:10d) → R12 (M:5d) → R13 (S:2d) -Total critical path: ~20 working days -``` - -Alternative critical path (for plugin system): -``` -R01 (S:3d) → R21 (L:10d) → R22 (L:10d) -Total: ~23 working days -``` - -**Optimization:** R22 (Programmatic API) can start as soon as R21 reaches -a stable internal API, even before R21 is fully complete. With this overlap, -effective critical path is ~20 days. - ---- - -## 6. Gantt Chart (Mermaid) - -```mermaid -gantt - title Releasekit Roadmap Execution - dateFormat YYYY-MM-DD - axisFormat %b %d - - section Phase 0 — Foundation - R01 Protocol audit :r01, 2026-02-17, 3d - R04 Revert handling :r04, 2026-02-17, 2d - R05 Doctor command :r05, 2026-02-17, 5d - R07 Dep propagation :r07, 2026-02-17, 5d - R08 Contributor changelogs :r08, 2026-02-17, 2d - R09 Incremental changelog :r09, 2026-02-17, 5d - R17 Auto-merge PRs :r17, 2026-02-17, 2d - R18 Changelog templates :r18, 2026-02-17, 2d - R19 Announcements :r19, 2026-02-17, 2d - R20 Changeset support :r20, 2026-02-17, 5d - R25 Commit depth limit :r25, 2026-02-17, 1d - R27 Ignore unknown tags :r27, 2026-02-17, 1d - - section Phase 1 — Core Features - R02 Standalone repo :r02, after r01, 3d - R03 Pre-release workflow :r03, after r01, 5d - R11 pnpm publish pipeline :crit, r11, after r01, 10d - R21 Plugin system :r21, after r01, 10d - R26 Bootstrap SHA :r26, after r05, 2d - R28 Lockfile update :r28, after r07, 2d - - section Phase 2 — Ecosystem - R06 Hotfix branches :r06, after r03, 5d - R10 Snapshot releases :r10, after r03, 2d - R12 npm registry backend :crit, r12, after r11, 5d - R14 Scoped tag format :r14, after r11, 2d - R15 Signing + provenance :r15, after r02, 5d - R22 Programmatic API :r22, after r21, 10d - R23 Cross-compilation :r23, after r02, 5d - R24 PEP 440 scheme :r24, after r03, 2d - R29 Migrate command :r29, after r02, 5d - R31 OTel tracing :r31, after r01, 5d - R33 Bazel backend :r33, after r01, 10d - R34 Rust/Cargo backend :r34, after r01, 5d - R35 Java Maven/Gradle :r35, after r01, 10d - R36 Dart/Pub backend :r36, after r01, 5d - R37 pyx registry :r37, after r01, 5d - R30 Plan profiling :r30, 2026-02-17, 2d - R32 Parallel vcs.log :r32, 2026-02-17, 2d - - section Phase 3 — Polish - R13 Wombat proxy auth :r13, after r12, 2d - R16 SBOM generation :r16, after r15, 5d - R38 Cherry-pick release br :r38, after r06, 5d -``` - ---- - -## 7. Standalone Repo Readiness Checklist - -Releasekit is already architecturally independent. These items ensure it -can live in its own repository: - -- [x] **No hardcoded paths** — All paths are relative to workspace root - (discovered at runtime via `releasekit.toml` location). -- [x] **Protocol-based backends** — 6 protocols (VCS, PackageManager, - Workspace, Registry, Forge, Telemetry) with no concrete coupling in core. -- [x] **Ecosystem-agnostic core** — `graph.py`, `scheduler.py`, - `versioning.py`, `changelog.py` operate on abstract `Package` objects. -- [x] **Config-driven** — All repo-specific settings in `releasekit.toml`. -- [x] **No imports from parent packages** — `releasekit` has zero imports - from the genkit monorepo. -- [x] **Own pyproject.toml** — Complete with build system, dependencies, - entry point. -- [x] **Own test suite** — `tests/` directory with full coverage. -- [ ] **LICENSE file** — Currently references `../../LICENSE`; needs own copy. 
-- [ ] **CI workflows** — Needs own `.github/workflows/` for testing and - publishing. -- [ ] **PyPI publishing** — Needs Trusted Publisher setup. -- [ ] **Documentation site** — `docs/mkdocs.yml` exists; needs deployment. - -### Abstraction Layers (6 Protocols) - -``` -┌──────────────────────────────────────────────────────────────────┐ -│ releasekit core │ -│ │ -│ graph.py scheduler.py versioning.py changelog.py plan.py │ -│ preflight.py state.py lock.py tags.py groups.py │ -│ │ -│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────┐│ -│ │ VCS │ │ Package │ │Workspace │ │ Registry │ │ Forge ││ -│ │ Protocol │ │ Manager │ │ Protocol │ │ Protocol │ │Protocol││ -│ │ │ │ Protocol │ │ │ │ │ │ ││ -│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └────┬─────┘ └───┬────┘│ -└───────┼────────────┼────────────┼────────────┼────────────┼─────┘ - │ │ │ │ │ - ┌────┴────┐ ┌────┴────┐ ┌───┴────┐ ┌───┴────┐ ┌───┴─────┐ - │ git │ │ uv │ │ uv │ │ PyPI │ │ GitHub │ - │ hg │ │ pnpm │ │ pnpm │ │ npm │ │ GitLab │ - │ │ │ cargo │ │ cargo │ │crates.io│ │Bitbucket│ - │ │ │ maven │ │ bazel │ │ Maven │ │ Gitea │ - │ │ │ gradle │ │ dart │ │ Pub │ │ │ - │ │ │ dart │ │ maven │ │ │ │ │ - └─────────┘ └─────────┘ └────────┘ └────────┘ └─────────┘ -``` - -Each protocol is a `typing.Protocol` (structural subtyping) — no base -class inheritance required. New backends are added by implementing the -protocol and registering in `detection.py`. - ---- - -## 8. Algorithm & Data Structure Audit - -An audit of the current codebase confirms optimal choices across all -performance-critical paths: - -### Algorithms - -| Module | Algorithm | Complexity | Status | -|--------|-----------|------------|--------| -| `graph.py` `topo_sort` | Kahn's algorithm (BFS-based) | O(V+E) | ✅ Optimal | -| `graph.py` `detect_cycles` | DFS with 3-color marking | O(V+E) | ✅ Optimal | -| `graph.py` `forward_deps` / `reverse_deps` | BFS with `deque` | O(V+E) | ✅ Optimal | -| `versioning.py` transitive propagation | BFS via `deque` over reverse edges | O(V+E) | ✅ Optimal | -| `scheduler.py` dispatch | Dependency-triggered queue (not level-lockstep) | O(1) per completion | ✅ Optimal | -| `scheduler.py` retry | Exponential backoff + full jitter (capped 60s) | — | ✅ Best practice | -| `net.py` HTTP retry | Exponential backoff on 429/5xx + connection errors | — | ✅ Best practice | - -### Data Structures - -| Structure | Where Used | Why | -|-----------|-----------|-----| -| `dict[str, Package]` | `DependencyGraph.packages` | O(1) lookup by name | -| `dict[str, list[str]]` | `edges`, `reverse_edges` | O(1) adjacency lookup | -| `dict[str, int]` | `in_degree` in Kahn's | O(1) decrement | -| `set[str]` | `_done`, `_enqueued`, `_cancelled` in Scheduler | O(1) membership test | -| `deque[str]` | BFS queues in topo sort, forward/reverse deps | O(1) append + popleft | -| `asyncio.Queue` | Scheduler work queue | Thread-safe async FIFO | -| `asyncio.Semaphore` | Concurrency limiter | Cooperative async gating | -| `frozenset[int]` | `RETRYABLE_STATUS_CODES` | O(1) membership, immutable | -| `frozen dataclass` | `Package`, `SchedulerResult` | Hashable, safe to share | - -### Async Runtime - -| Component | Implementation | Notes | -|-----------|---------------|-------| -| Event loop | `asyncio.run()` (stdlib) | Single-loop, no thread contention | -| Concurrency | `asyncio.Semaphore(N)` | Cooperative, no OS thread overhead | -| Worker pool | `asyncio.create_task()` × N | Lightweight coroutines, not threads | -| HTTP | `httpx.AsyncClient` with connection pooling | Reuses 
TCP connections | -| Subprocess | `asyncio.create_subprocess_exec` (via `_run.py`) | Non-blocking process I/O | -| File I/O | `aiofiles` | Non-blocking disk I/O | -| Pause/resume | `asyncio.Event` gate | Zero-cost when not paused | -| Signals | `loop.add_signal_handler` (SIGUSR1/2) | OS-level, no polling | - -### Identified Optimization: R32 — Parallel `vcs.log()` - -**Current:** `compute_bumps` calls `vcs.log()` sequentially for each -package (N serial git subprocess calls for N packages). - -**Fix:** Use `asyncio.gather()` to fetch all commit logs in parallel, -bounded by a semaphore to avoid fork-bombing: - -```python -# Before (sequential): -for pkg in packages: - log_lines = await vcs.log(since_tag=tag, paths=[str(pkg.path)]) - -# After (parallel): -sem = asyncio.Semaphore(10) -async def _fetch(pkg): - async with sem: - return await vcs.log(since_tag=tag, paths=[str(pkg.path)]) -results = await asyncio.gather(*[_fetch(p) for p in packages]) -``` - -For a 60-package workspace, this reduces commit log fetching from -~60 × 0.1s = 6s to ~0.6s (10× speedup). - ---- - -## 9. OpenTelemetry Tracing Design (R31) - -### Why - -No alternative has built-in observability. For large workspaces (60+ -packages), understanding where time is spent is critical: - -- Which packages are on the critical path? -- Is the bottleneck git, the registry, or the build? -- How long does each publish stage take? - -### Architecture - -``` -┌──────────────────────────────────────────────────────────┐ -│ releasekit core │ -│ │ -│ scheduler.py ──┐ │ -│ publisher.py ──┤── @traced decorator ──► TracerProvider │ -│ versioning.py ─┤ │ │ -│ net.py ────────┘ ▼ │ -│ ┌────────────────┐ │ -│ │ SpanExporter │ │ -│ │ (pluggable) │ │ -│ └───┬────┬───┬───┘ │ -└─────────────────────────────────────────┼────┼───┼───────┘ - │ │ │ - ┌───────────┘ │ └──────────┐ - ▼ ▼ ▼ - OTLP/gRPC Console JSON file - (Jaeger, (--verbose) (CI artifact) - Grafana) -``` - -### Span Hierarchy - -``` -releasekit.publish -├── releasekit.discover (workspace discovery) -├── releasekit.graph.build (graph construction) -├── releasekit.graph.topo_sort (topological sort) -├── releasekit.compute_bumps (version computation) -│ ├── releasekit.vcs.log [pkg=genkit] -│ ├── releasekit.vcs.log [pkg=genkit-plugin-foo] -│ └── ... -├── releasekit.preflight (preflight checks) -└── releasekit.scheduler.run (publish orchestration) - ├── releasekit.publish_one [pkg=genkit] - │ ├── releasekit.pin - │ ├── releasekit.build - │ ├── releasekit.checksum - │ ├── releasekit.upload (registry publish) - │ ├── releasekit.poll (availability check) - │ ├── releasekit.verify (checksum verify) - │ └── releasekit.smoke_test - ├── releasekit.publish_one [pkg=genkit-plugin-foo] - └── ... -``` - -### Implementation Plan - -1. **Optional dependency** — `opentelemetry-api` + `opentelemetry-sdk` as - extras: `pip install releasekit[telemetry]`. -2. **`Telemetry` protocol** — New 6th protocol in `backends/`: - ```python - class Telemetry(Protocol): - def start_span(self, name: str, **attrs) -> Span: ... - def record_metric(self, name: str, value: float, **attrs) -> None: ... - ``` -3. **`NullTelemetry`** — Default no-op backend (zero overhead when tracing - is not configured). -4. **`OTelTelemetry`** — OpenTelemetry backend that creates real spans. -5. **`@traced` decorator** — Wraps async functions to auto-create spans: - ```python - @traced('releasekit.vcs.log') - async def log(self, *, since_tag=None, paths=None, ...): ... - ``` -6. **`--trace` CLI flag** — Enables tracing with console exporter. 
- `--trace-endpoint` sends to OTLP collector. -7. **`plan --analyze`** (R30) — Uses trace data to compute: - - Critical path through the dependency graph - - Estimated wall-clock time per phase - - Bottleneck packages (longest build/publish time) - - Parallelism efficiency (actual vs. theoretical speedup) - -### Metrics to Track - -| Metric | Type | Description | -|--------|------|-------------| -| `releasekit.publish.duration` | Histogram | Total publish time | -| `releasekit.package.duration` | Histogram | Per-package publish time | -| `releasekit.stage.duration` | Histogram | Per-stage time (pin, build, upload, ...) | -| `releasekit.vcs.log.duration` | Histogram | Git log fetch time | -| `releasekit.http.duration` | Histogram | HTTP request time | -| `releasekit.scheduler.queue_wait` | Histogram | Time waiting in queue | -| `releasekit.scheduler.concurrency` | Gauge | Active workers | -| `releasekit.retry.count` | Counter | Total retries | - -### Plan Profiling Output (R30) - -```bash -$ releasekit plan --analyze - -Critical Path: genkit → genkit-plugin-firebase → genkit-plugin-google-cloud - Estimated: 45s (build: 20s, publish: 15s, poll: 10s) - -Bottleneck Packages: - 1. genkit-plugin-firebase — 18s build (heaviest) - 2. genkit — 15s build (most dependents: 42) - 3. genkit-plugin-ollama — 12s build - -Parallelism: - Theoretical speedup: 8.2× (60 packages, 5 workers) - Estimated speedup: 5.1× (critical path limits parallelism) - Utilization: 62% - -Phase Breakdown: - Phase 0 (12 pkgs): ~8s ████████░░░░░░░░ - Phase 1 (18 pkgs): ~12s ████████████░░░░ - Phase 2 (20 pkgs): ~15s ███████████████░ - Phase 3 (10 pkgs): ~10s ██████████░░░░░░ -``` - ---- - -## 10. Branding - -**Logo:** 🚀 Rocketship. - -The releasekit logo is a rocketship — representing launches, velocity, -and shipping releases. Use it in CLI banners, docs, and README headers. - -Deliverables: - -- SVG logo (rocketship silhouette, monochrome + color variants) -- ASCII art banner for `releasekit --version` output -- Favicon for docs site - ---- - -## 11. Repo Portability - -Releasekit is designed to be extractable to a standalone repository. - -Current state (audited): - -- **Zero imports** from any genkit package -- **Zero hardcoded paths** — all paths are config-driven via `releasekit.toml` -- **Self-contained deps** — `pyproject.toml` has no workspace-internal dependencies -- **Own build system** — hatchling with `[project.scripts]` entry point -- **Own test suite** — 1293+ tests with FakeVCS/FakeForge mocks, no genkit fixtures -- All `genkit`/`firebase` references in source are docstring examples or test fixtures - -To extract to a standalone repo: - -1. Copy `py/tools/releasekit/` to a new repo root -2. Move `releasekit.toml` into the consuming repo (it stays there) -3. Publish to PyPI: `pip install releasekit` -4. No code changes required in releasekit itself - -Post-extraction, update docs (README, getting-started guide) to reflect -standalone installation and remove genkit-specific examples from docstrings. - ---- - -## 12. Rustification - -Long-term, rewrite the performance-critical core of releasekit in Rust -and expose it to Python via PyO3/maturin. Python becomes a thin CLI -driver and configuration layer; Rust handles the heavy lifting. - -### Motivation - -- **Speed** — Commit log parsing, dependency graph resolution, and - topological sorting are CPU-bound. Rust eliminates the GIL bottleneck - and enables true parallelism. 
-- **Single binary** — A Rust core can be compiled to a standalone CLI - (`releasekit`) with zero runtime dependencies, usable from any - language ecosystem (JS, Go, etc.) without requiring Python. -- **Memory safety** — Rust's ownership model prevents the class of bugs - that arise in concurrent subprocess orchestration. - -### Architecture - -``` -┌─────────────────────────────────────────┐ -│ Python CLI driver (click/argparse) │ -│ - Config loading (releasekit.toml) │ -│ - User interaction (prompts, UI) │ -│ - Plugin system (custom backends) │ -└────────────────┬────────────────────────┘ - │ PyO3 FFI -┌────────────────▼────────────────────────┐ -│ Rust core (releasekit-core) │ -│ - Commit parsing (conventional) │ -│ - Version computation (semver) │ -│ - Dependency graph + topo sort │ -│ - Tag formatting + validation │ -│ - Changelog generation │ -│ - Parallel subprocess orchestration │ -│ - Registry polling (async reqwest) │ -└─────────────────────────────────────────┘ -``` - -### Migration phases - -1. **Phase 1 — Rust library crate** (`releasekit-core`): Implement - commit parsing, semver computation, and graph resolution in Rust. - Expose via PyO3 as a native Python extension module. -2. **Phase 2 — Hybrid mode**: Python calls into Rust for hot paths - (versioning, graph, changelog). Backends (VCS, PM, Registry, Forge) - remain in Python for flexibility. -3. **Phase 3 — Standalone binary**: Compile the Rust core into a - standalone `releasekit` CLI binary. Python driver becomes optional. -4. **Phase 4 — Full Rust**: Migrate remaining backends to Rust. - Python package becomes a thin wrapper (`releasekit-py`) for users - who prefer `pip install`. diff --git a/py/tools/releasekit/roadmap.md b/py/tools/releasekit/roadmap.md index 3e93eb1305..d299cfc1c8 100644 --- a/py/tools/releasekit/roadmap.md +++ b/py/tools/releasekit/roadmap.md @@ -144,20 +144,34 @@ Flat top-level keys, no `[tool.*]` nesting: ```toml # releasekit.toml (at the monorepo root) -synchronize = true # all packages share one version number -tag_format = "v{version}" -publish_from = "ci" - -# Ecosystems: declare which workspace roots to scan. -# Each ecosystem maps to a (Workspace, PackageManager, Registry) triple. -[ecosystems.python] -workspace_root = "py/" # contains pyproject.toml with [tool.uv.workspace] - -[ecosystems.js] -workspace_root = "js/" # contains pnpm-workspace.yaml - -[ecosystems.go] -workspace_root = "go/" # contains go.work +forge = "github" +repo_owner = "firebase" +repo_name = "genkit" +default_branch = "main" +pr_title_template = "chore(release): v{version}" + +[workspace.py] +ecosystem = "python" +tool = "uv" +root = "py" +tag_format = "{name}@{version}" +umbrella_tag = "py/v{version}" +changelog = true +smoke_test = true +max_commits = 500 + +[workspace.js] +ecosystem = "js" +tool = "pnpm" +root = "." +tag_format = "{name}@{version}" +umbrella_tag = "js/v{version}" +synchronize = true + +# Go workspace (future) +# [workspace.go] +# ecosystem = "go" +# root = "go" ``` ##### Per-package config (`releasekit.toml`) @@ -422,9 +436,9 @@ own the transport and format details. 
|---|----------|---------------|---------|--------| | 1 | **`VCS`** | Commit, tag, push, log, branch operations | `GitCLIBackend`, `MercurialBackend` | — | | 2 | **`Forge`** | PRs, releases, labels, availability check | `GitHubCLIBackend`, `GitLabBackend`, `BitbucketAPIBackend` | — | -| 3 | **`Workspace`** | Discover members, classify deps, rewrite versions | `UvWorkspace`, `PnpmWorkspace` | `GoWorkspace`, `CargoWorkspace`, `PubWorkspace`, `MavenWorkspace`, `GradleWorkspace` | -| 4 | **`PackageManager`** | Lock, build, publish | `UvBackend`, `PnpmBackend` | `GoBackend`, `CargoBackend`, `PubBackend`, `MavenBackend`, `GradleBackend` | -| 5 | **`Registry`** | Check published versions, checksums | `PyPIBackend`, `NpmRegistry` | `GolangProxy`, `CratesIoRegistry`, `PubDevRegistry`, `MavenCentralRegistry` | +| 3 | **`Workspace`** | Discover members, classify deps, rewrite versions | ✅ `UvWorkspace`, ✅ `PnpmWorkspace` | `GoWorkspace`, `CargoWorkspace`, `PubWorkspace`, `MavenWorkspace`, `GradleWorkspace` | +| 4 | **`PackageManager`** | Lock, build, publish | ✅ `UvBackend`, ✅ `PnpmBackend` | `GoBackend`, `CargoBackend`, `PubBackend`, `MavenBackend`, `GradleBackend` | +| 5 | **`Registry`** | Check published versions, checksums | ✅ `PyPIBackend`, ✅ `NpmRegistry` | `GolangProxy`, `CratesIoRegistry`, `PubDevRegistry`, `MavenCentralRegistry` | > **Design note:** `ManifestParser` and `VersionRewriter` were folded > into the `Workspace` protocol as `rewrite_version()` and @@ -436,7 +450,7 @@ own the transport and format details. | Ecosystem | Workspace Config | Source Mechanism | Manifest File | Registry | Status | |-----------|-----------------|-----------------|---------------|----------|--------| | **Python (uv)** | `[tool.uv.workspace]` | `[tool.uv.sources]` `workspace = true` | `pyproject.toml` | PyPI | ✅ Implemented | -| **TypeScript (pnpm)** | `pnpm-workspace.yaml` | `"workspace:*"` protocol in `package.json` | `package.json` | npm | 🔧 Backend done | +| **TypeScript (pnpm)** | `pnpm-workspace.yaml` | `"workspace:*"` protocol in `package.json` | `package.json` | npm | ✅ Implemented | | **Go** | `go.work` | `use ./pkg` directives | `go.mod` | proxy.golang.org | ⬜ Designed (see §7) | | **Java (Maven)** | reactor POM `` | `${project.version}` | `pom.xml` | Maven Central | ⬜ Future | | **Java (Gradle)** | `settings.gradle` `include` | `project(':sub')` deps | `build.gradle(.kts)` | Maven Central | ⬜ Future | @@ -480,7 +494,7 @@ Remaining migration steps: | 4c: UI States | ✅ Complete | observer.py, sliding window, keyboard shortcuts, signal handlers | | 5: Release-Please | ✅ Complete | Orchestrators, CI workflow, workspace-sourced deps | | 6: UX Polish | ✅ Complete | init, formatters (9), rollback, completion, diagnostics, granular flags, TOML config migration | -| 7: Quality + Ship | 🔶 In progress | 706 tests pass, 16.8K src lines, 12.1K test lines | +| 7: Quality + Ship | 🔶 In progress | 1,293 tests pass, 76 source modules, 51 test files (~19.3K test LOC) | ### Phase 5 completion status @@ -865,10 +879,17 @@ Phase 6: UX Polish ▼ ✅ COMPLETE │ Phase 7: Quality + Ship ▼ 🔶 IN PROGRESS ┌─────────────────────────────────────────────────────────┐ -│ tests (706 tests, 12.1K lines) │ +│ tests (1,293 tests, 51 files, ~19.3K lines) │ │ type checking (ty, pyright, pyrefly -- zero errors) │ │ README.md (21 sections, mermaid diagrams) │ │ workspace config (releasekit init on genkit repo) │ +│ sbom.py (CycloneDX + SPDX SBOM generation) │ +│ profiling.py (pipeline step timing + bottleneck) │ +│ tracing.py (optional 
OpenTelemetry, graceful no-op) │ +│ doctor.py (release state consistency checker) │ +│ distro.py (Debian/Fedora/Homebrew dep sync) │ +│ branch.py (default branch resolution) │ +│ commit_parsing/ (conventional commit parser) │ │ │ │ ✓ Ship v0.1.0 to PyPI │ └─────────────────────────────────────────────────────────┘ @@ -1374,12 +1395,86 @@ deletion. Shell completion works in bash/zsh/fish. | Type checking | Zero errors from `ty`, `pyright`, and `pyrefly` in strict mode. | config | | `README.md` | 21 sections with Mermaid workflow diagrams, CLI reference, config reference, testing workflow, vulnerability scanning, migration guide. | ~800 | | Workspace config | Run `releasekit init` on the genkit repo. Review auto-detected groups. Commit generated config. | config | +| `migrate.py` | `releasekit migrate` subcommand for mid-stream adoption. See details below. | ~200 | **Done when**: `pytest --cov-fail-under=90` passes, all three type checkers report zero errors, README is complete. **Milestone**: Ship `releasekit` v0.1.0 to PyPI. +#### `releasekit migrate` — Automatic Tag Detection and Bootstrap + +When adopting releasekit on a repo that already has releases, the user +currently needs to manually find the last release tag and set +`bootstrap_sha` in `releasekit.toml`. The `migrate` subcommand automates +this entirely. + +**What it does:** + +1. **Scan all git tags** in the repo (`git tag -l`). +2. **Classify each tag** by matching against known tag patterns: + - Umbrella tags: `py/v0.5.0`, `js/v1.2.3`, `go/v0.1.0` + - Per-package tags: `py/genkit-v0.5.0`, `@genkit-ai/core@1.2.3` + - Legacy tags: `genkit-python@0.4.0`, `genkit@1.0.0-rc.5` + - Unrecognized tags are reported but not associated. +3. **Associate tags with workspaces** by matching the tag prefix/format + against each `[workspace.*]` section's `tag_format`, `umbrella_tag`, + and `root` fields in `releasekit.toml`. +4. **Associate tags with packages** by matching the `{name}` component + of the tag against discovered workspace members (from + `Workspace.discover()`). +5. **Determine the latest release per workspace** by sorting associated + tags by semver and picking the highest. +6. **Auto-set `bootstrap_sha`** to the commit the latest tag points to + (via `git rev-list -1 `). +7. **Generate a migration report** showing: + - Tags found per workspace (with version, commit SHA, date). + - Tags that could not be associated (orphaned/legacy). + - The `bootstrap_sha` that will be written. + - Per-package tag status (present / missing / legacy format). +8. **Write `bootstrap_sha`** into `releasekit.toml` (using tomlkit for + comment-preserving edits), or print the diff in `--dry-run` mode. + +**CLI interface:** + +``` +releasekit migrate [--dry-run] [--workspace LABEL] + +Options: + --dry-run Show what would be written without modifying files. + --workspace Migrate a specific workspace (default: all). +``` + +**Example output:** + +``` +Scanning tags... + Found 4 tags: + py/v0.5.0 → workspace: py (commit b71a3d20c, 2026-02-05) + genkit-python@0.4.0 → workspace: py (legacy format, commit a1b2c3d) + genkit-python@0.3.2 → workspace: py (legacy format, commit e4f5g6h) + genkit-python@0.3.1 → workspace: py (legacy format, commit i7j8k9l) + + Latest release for workspace 'py': py/v0.5.0 (0.5.0) + + Per-package tag status (workspace: py): + genkit — no per-package tag (will use bootstrap_sha) + genkit-plugin-google-genai — no per-package tag (will use bootstrap_sha) + genkit-plugin-vertex-ai — no per-package tag (will use bootstrap_sha) + ... 
(22 packages total) + +Writing bootstrap_sha = "b71a3d20c74b71583edbc652e5b26117caad43f4" to releasekit.toml + ✅ Migration complete. Run 'releasekit plan' to preview the next release. +``` + +**Why this matters:** + +- Eliminates manual SHA lookup when adopting releasekit. +- Handles repos with mixed tag formats (legacy + new) gracefully. +- Works across multiple workspaces (e.g. `py` + `js` in the same repo). +- The classification logic reuses `tag_format` parsing from + `versioning.py`, ensuring consistency with how releasekit creates tags. + --- ## Critical Path @@ -1552,6 +1647,7 @@ py/tools/releasekit/ mermaid.py ← Mermaid syntax d2.py ← D2 syntax init.py ← workspace config scaffolding + migrate.py ← mid-stream adoption: tag detection + bootstrap_sha versioning.py ← Conventional Commits -> semver pin.py ← ephemeral version pinning bump.py ← version string rewriting
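
The classification described in steps 2–5 of the `releasekit migrate` design above can be illustrated with a short, self-contained sketch. This is not releasekit's implementation: `TagMatch`, `_pattern`, `classify_tag`, and `semver_key` are hypothetical names introduced here for illustration, and the real code would reuse the `tag_format` parsing that already lives in `versioning.py` plus the member list from `Workspace.discover()` rather than ad-hoc regexes.

```python
"""Sketch of the `releasekit migrate` tag-classification idea (steps 2-5).

Illustrative only: TagMatch, _pattern, classify_tag, and semver_key are
hypothetical names; the real implementation would reuse the tag_format
parsing in versioning.py and validate names against Workspace.discover().
"""

from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TagMatch:
    workspace: str       # workspace label, e.g. 'py'
    package: str | None  # None for umbrella tags
    version: str         # e.g. '0.5.0'


def _pattern(fmt: str) -> re.Pattern[str]:
    """Turn a tag format like '{name}@{version}' into an anchored regex."""
    escaped = re.escape(fmt)
    escaped = escaped.replace(r'\{name\}', r'(?P<name>[A-Za-z0-9._@/-]+)')
    escaped = escaped.replace(r'\{version\}', r'(?P<version>\d+\.\d+\.\d+\S*)')
    return re.compile(rf'^{escaped}$')


def classify_tag(tag: str, workspaces: dict[str, dict[str, str]]) -> TagMatch | None:
    """Associate a git tag with a workspace via umbrella_tag / tag_format.

    A real implementation would also check the captured {name} against
    the discovered workspace members (step 4) before accepting a match.
    """
    for label, cfg in workspaces.items():
        if m := _pattern(cfg['umbrella_tag']).match(tag):
            return TagMatch(label, None, m.group('version'))
        if m := _pattern(cfg['tag_format']).match(tag):
            return TagMatch(label, m.group('name'), m.group('version'))
    return None  # orphaned / unrecognized tag: reported, not associated


def semver_key(version: str) -> tuple[int, int, int]:
    """Coarse semver sort key (ignores pre-release ordering)."""
    major, minor, patch = version.split('-')[0].split('.')[:3]
    return (int(major), int(minor), int(patch))


if __name__ == '__main__':
    workspaces = {
        'py': {'tag_format': '{name}@{version}', 'umbrella_tag': 'py/v{version}'},
        'js': {'tag_format': '{name}@{version}', 'umbrella_tag': 'js/v{version}'},
    }
    tags = ['py/v0.5.0', 'genkit-python@0.4.0', 'genkit@1.0.0-rc.5', 'v1']
    matches = [(t, classify_tag(t, workspaces)) for t in tags]
    for tag, match in matches:
        print(f'{tag!r:28} -> {match}')

    # Step 5: latest umbrella release for workspace 'py' by semver.
    py_releases = [m.version for _, m in matches
                   if m and m.workspace == 'py' and m.package is None]
    print('latest py release:', max(py_releases, key=semver_key, default=None))
```

Run on the tags from the example output above, this sketch associates `py/v0.5.0` as the latest umbrella release for the `py` workspace, classifies the `genkit-python@*` tags as per-package/legacy matches, and leaves `v1` unassociated — the same behavior the migration report is meant to surface.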