[SUPERSEDED] feat(py/genkit): add model-level middleware support via define_model(use=[...])#4516
Summary of Changes
This pull request significantly enhances the Python Genkit SDK by introducing model-level middleware support, aligning its capabilities with the JavaScript SDK. This feature allows middleware to be associated directly with a model during its definition.
Code Review
This pull request successfully implements model-level middleware support, achieving parity with the JavaScript SDK. The changes are well-structured, touching the core Action and Registry components to store and apply middleware, and updating the dispatch logic to correctly chain call-time and model-level middleware. The addition of comprehensive tests and an updated sample application is great. I have one suggestion to improve type safety and clarity in the middleware combination logic.
# then model-level middleware (from define_model(use=[...])).
# This matches JS SDK execution order:
# call-time[0..N] → model-level[0..M] → runner
model_middleware: list[ModelMiddleware] = model.middleware if model.middleware else []
The Action.middleware property is guaranteed to return a list, so the if model.middleware else [] check is redundant. Additionally, model.middleware returns list[Any], but it's being assigned to a variable typed as list[ModelMiddleware]. Using cast makes this type conversion explicit and improves type safety.
- model_middleware: list[ModelMiddleware] = model.middleware if model.middleware else []
+ model_middleware: list[ModelMiddleware] = cast(list[ModelMiddleware], model.middleware)
Good catch — Action.middleware always returns a list, so the conditional was redundant. Replaced with cast(list[ModelMiddleware], model.middleware) for explicit type safety.
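The exchange above rests on two plain Python facts, which the following sketch illustrates: `typing.cast` returns its argument unchanged at runtime, and the `x if x else []` fallback only ever swaps an empty list for a fresh empty list. The function `get_middleware` is a hypothetical stand-in for an attribute typed as `list[Any]`; it is not part of Genkit.

```python
from typing import Any, cast


def get_middleware() -> list[Any]:
    """Hypothetical stand-in for a property that always returns a list."""
    return []


raw = get_middleware()

# cast() only informs the type checker; at runtime it returns the same object.
typed = cast(list[str], raw)

# When the value is already a list, the conditional only matters for the
# empty case, where it replaces one empty list with a new empty list.
fallback = raw if raw else []
```

Because `cast` performs no runtime validation, it documents the intended type for the checker rather than enforcing it.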
… and merge order
Comprehensive update reflecting current state of all merged and open PRs:
Merged PRs reflected:
- #4511 (G5+G6: span_id + X-Genkit-Span-Id) → marked Done
- #4507+#4508 (G11: CHANGELOG.md) → marked Done
- #4509 (plugin test coverage uplift) → marked Done
- #4505 (PARITY_AUDIT.md baseline) → marked Done
- #4488 (sample naming, py.typed, check_consistency) → marked Done
Open PRs tracked:
- #4516 (G1+G2: model middleware storage)
- #4514 (Transfer-Encoding fix)
- #4513 (G18: multipart tools)
- #4512 (G20-G22: constructor parity)
- #4510 (G3+G12-G16: middleware functions)
- #4504 (Checks plugin)
- #4495, #4494, #4401 (bug fixes + reflection v2)
New content:
- Middleware Taxonomy section with 4-layer diagram
- Decision guide for which middleware layer to use
- G38 gap (auto-wiring via get_model_middleware)
- PR merge order graph with file conflict matrix
- Updated summary metrics and phase tables
…use=[...])
This implements parity with the JS SDK's defineModel({use: [mw]}) feature,
allowing model plugins to register middleware that runs at model execution
time, after any call-time middleware passed via generate(use=[...]).
The execution order matches the JS SDK:
call-time[0..N] → model-level[0..M] → runner
Changes:
- Add middleware param to Action.__init__() and Action.middleware property
- Add middleware param to Registry.register_action()
- Add use param to define_model() in ai/_registry.py
- Update dispatch() in generate.py to chain model-level middleware
- Add 3 comprehensive tests: model-level only, execution order, chaining
- Update PARITY_AUDIT.md: mark G1 and G2 as done
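The execution order described in this commit message can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the SDK's dispatch() code: the two-argument middleware signature `(req, next_fn)` and the names `make_tagging_middleware`, `run_chain`, and `demo_order` are hypothetical and simplified for the example.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical, simplified signatures (not the Genkit ModelMiddleware type).
Next = Callable[[Any], Awaitable[Any]]
Middleware = Callable[[Any, Next], Awaitable[Any]]


def make_tagging_middleware(tag: str, log: list[str]) -> Middleware:
    """Build a middleware that records its tag, then calls the next step."""

    async def middleware(req: Any, next_fn: Next) -> Any:
        log.append(tag)
        return await next_fn(req)

    return middleware


async def run_chain(req: Any, call_time: list[Middleware],
                    model_level: list[Middleware], runner: Next) -> Any:
    """Run call-time middleware first, then model-level, then the runner."""
    chain = [*call_time, *model_level]

    async def step(index: int, current: Any) -> Any:
        if index == len(chain):
            # End of the chain: hand the request to the model runner.
            return await runner(current)

        async def next_fn(next_req: Any) -> Any:
            return await step(index + 1, next_req)

        return await chain[index](current, next_fn)

    return await step(0, req)


def demo_order() -> list[str]:
    """Return the order in which the middleware and runner were invoked."""
    log: list[str] = []

    async def runner(req: Any) -> Any:
        log.append("runner")
        return req

    call_time = [
        make_tagging_middleware("call-time[0]", log),
        make_tagging_middleware("call-time[1]", log),
    ]
    model_level = [make_tagging_middleware("model-level[0]", log)]
    asyncio.run(run_chain("hello", call_time, model_level, runner))
    return log
```

Calling `demo_order()` records the call-time tags, then the model-level tag, then `"runner"`. Concatenating the two lists before recursing keeps the ordering rule in one place.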
…ware-demo Adds define_model(use=[...]) and combined call-time + model-level middleware demonstrations. Updates README with execution order documentation. New flows: - model_level_middleware_demo: model-level mw via define_model(use=[...]) - combined_middleware_demo: both call-time and model-level mw together
Adds §8f.1 analyzing how built-in model middleware overlaps with inline plugin implementations. Identifies new gap G38 (auto-wiring middleware at define-model time based on ModelInfo.supports, matching JS getModelMiddleware). Key findings: - Python plugins do NOT currently implement system prompt, constrained generation, or media download inline — so adding middleware auto-wiring will not conflict with existing plugin logic - G38 (auto-wiring) is the critical enabler for G3, G4, G14, G15, G16 - G12 (retry) and G13 (fallback) are standalone, not auto-wired
Cast p.root.text to str() for str.join() compatibility with pyrefly/ty. Apply ruff format to fix formatting inconsistencies.
…_AUDIT.md Adds §10j with a concrete dependency-aware merge order for all 8 open PRs, including a visual merge sequence diagram and a file conflict matrix showing which PRs touch the same source files.
Force-pushed from 4826920 to cf567f9.
Status: SUPERSEDED (Pending Middleware V2 Redesign)
This PR is being converted to draft and marked superseded because the upstream JS and Go SDKs are actively redesigning the middleware architecture (the commit notes in this PR cite upstream JS #4515 and Go #4422 Middleware V2 RFCs).
What's changing
The […]
What this means for this PR
The Action-level middleware storage (G2) and […]
Tracking in PARITY_AUDIT.md gap G38.
… and merge order
Comprehensive update reflecting current state of all merged and open PRs:
Merged PRs reflected:
- #4511 (G5+G6: span_id + X-Genkit-Span-Id) → marked Done
- #4507+#4508 (G11: CHANGELOG.md) → marked Done
- #4509 (plugin test coverage uplift) → marked Done
- #4505 (PARITY_AUDIT.md baseline) → marked Done
- #4488 (sample naming, py.typed, check_consistency) → marked Done
Open PRs tracked:
- #4516 (G1+G2: model middleware storage)
- #4514 (Transfer-Encoding fix)
- #4513 (G18: multipart tools)
- #4512 (G20-G22: constructor parity)
- #4510 (G3+G12-G16: middleware functions)
- #4504 (Checks plugin)
- #4495, #4494, #4401 (bug fixes + reflection v2)
New content:
- Middleware Taxonomy section with 4-layer diagram
- Decision guide for which middleware layer to use
- G38 gap (auto-wiring via get_model_middleware)
- PR merge order graph with file conflict matrix
- Updated summary metrics and phase tables

fix: address review comments - fix G1/G2 status consistency, middleware level count, #4514 merged

docs(py): update PARITY_AUDIT.md - mark #4494 and #4514 merged, add #4518 Cohere
Update PR status tracking:
- #4494 (RedactedSpan fix): marked merged
- #4514 (Transfer-Encoding fix): marked merged
- #4518 (Cohere provider plugin): added as open PR
- Updated dependency chain, summary metrics, and open PR counts

docs(py): update PARITY_AUDIT.md - add #4519 (Core fix)

docs(py): update PARITY_AUDIT.md with latest merged PR status
Rationale: Several PRs have merged since the last update. This syncs the document with the current state of all PRs.
Changes:
- Mark #4495, #4518, #4520 as merged in all tables
- Move #4495, #4518 from INDEPENDENT to MERGED in dependency chain
- Add #4520 (converter extraction) to merged list
- Update summary metrics: 6 open PRs (down from 8)
- Simplify Layer 3 deps since #4495 is now merged

docs(py): add issue tracker analysis, dependency graph, model conformance roadmap, and sample flow test plan
Rationale: Comprehensive update to PARITY_AUDIT.md with 5 new sections (§12–§16) covering:
- Cross-SDK issue tracker analysis verified against Python source code
- Dependency-aware reverse topological sort roadmap for prioritized fixes
- Model conformance testing roadmap with provider parity matrix
- Sample flow test plan with optimal execution order for error detection
Changes:
- §12: Fixability assessment of 9 'likely' issues with code-level verdicts
- §13: Dependency graph (W1–W14), file conflict matrix, PR manifest with regression test specifications, sprint-based execution plan
- §14: Model conformance roadmap (Phases 0–4), plugin parity matrix, conformance PR mapping, JS-only plugin gaps
- §15: Combined roadmap unifying parity gaps, issue fixes, and conformance
- §16: Sample flow test plan with 5-phase error detection priority pyramid, 36 samples ordered by feature coverage, quick-start commands, and env var reference table

docs(py): add active RFC redesigns (Middleware V2, Bidi, Agent) to parity audit
- Add 7 new gaps (G38-G44) for Middleware V2, Bidi Action/Flow/Model, Agent primitive, Plugin V2, and Reflection API V2
- Mark middleware gaps G1-G3, G12-G16 as PAUSED (blocked on upstream JS #4515 and Go #4422 Middleware V2 RFCs)
- Mark G19 (Model API V2) as SUPERSEDED by G38 + G41
- Add deep-dive sections §8l-8p covering all 5 active RFC designs
- Update dependency graph: critical path now G38→G2→G1→G3 (4 levels)
- Restructure phased roadmap: Phase 1 (unblocked), Phase 2-3 (paused), Phase 4 (bidi/agent, blocked), Phase 5 (integration), Phase 6 (deferred)
- Update §5g cross-SDK gaps table with new primitives
- Update §9b status tracker and §9c dependency matrix
- Update summary metrics: 36 total gaps, 8 paused, 6 upstream-blocked

docs(py): update PARITY_AUDIT.md PR status to 2026-02-11
Comprehensive status update reflecting 23 PRs merged since last update (2026-02-09), 3 PRs closed/superseded, and 11 currently open PRs including releasekit tooling, dotprompt fixes, and CI workflow migration.
Key changes:
- §14f: Checks plugin marked as merged (#4504)
- §15b: Split into Recently Merged / Closed / Currently Open
- §15c: Updated metrics (31 merged, 11 open, 3 closed)
- Added releasekit PRs (14 merged, 3 open) as new workstream
Cross-checked all markdown files in py/ against the codebase and open PRs. Fixed outdated content across 9 files.

engdoc/index.md:
- Fix Python version: 3.12+ → 3.10+
- Update feature parity table (6 of 7 features now ✅, Agents still ❌)
- Replace 8-plugin table with full 23-plugin parity table
- Rewrite all 6 Python code examples (generation, structured output, tool calling, chat, agents, data retrieval) with correct imports, Genkit() class API, and @ai.tool() decorator pattern

engdoc/extending/api.md:
- Replace stale Sync/Async design section (GenkitExperimental/SyncGenkit/AsyncGenkit never implemented) with actual async-first architecture documenting GenkitRegistry → GenkitBase → Genkit hierarchy

engdoc/extending/index.md:
- Update d2 diagram plugin list from 7 to 22 plugins

engdoc/extending/servers.md:
- Fill Python TODO links with actual file paths (flows.py, reflection.py)

engdoc/user_guide/python/publishing_pypi.md:
- Add ReleaseKit as primary publishing mechanism
- Demote manual workflow to "Legacy" section

GEMINI.md:
- Remove 7 dangling references to deleted files (engdoc/planning/, blog-genkit-python-*.md, release-publishing-guide.md)
- Update blog article guidelines from mandatory to optional
- Remove stale validation script checking deleted paths

.github/PR_RELEASE.md:
- Remove dangling reference to deleted blog-genkit-python-0.5.0.md

PARITY_AUDIT.md:
- G7: ✅ Done → ⬜ Reverted (#4459 reverted by #4469, needs re-land)
- §8c.3/§8c.4: Update stale text: X-Genkit-Span-Id IS now sent (#4511)
- §1d: genkitx-cohere ❌ → ✅ (in-tree cohere plugin exists)
- §6c: Community coverage 3/6 → 4/6
- G17: 🔄 draft → ⬜ (#4521 closed, needs new PR)
- G3/G12-G16/G4: Note #4510 is closed, needs new PR after G38
- G2→G1: Mark as superseded (#4516 titled [SUPERSEDED])
Summary
Implements model-level middleware support in the Python Genkit SDK, achieving parity with the JavaScript SDK's defineModel({use: [mw]}) feature.

What Changed

Core Framework
- Action class (core/action/_action.py): Added middleware parameter to __init__() and middleware property for read-only access
- Registry.register_action() (core/registry.py): Accepts and passes through middleware parameter to Action constructor
- define_model() (ai/_registry.py): Added use parameter accepting list[ModelMiddleware], matching JS SDK's defineModel({use: [...]})
- dispatch() (blocks/generate.py): Correctly chains model-level middleware after call-time middleware, matching JS execution order: call-time[0..N] → model-level[0..M] → runner

Tests
- test_model_level_middleware_applied: Model-level middleware applied to requests
- test_call_time_middleware_runs_before_model_level: Execution order verification
- test_multiple_model_level_middleware_chain: Multiple model-level MW chain correctly

Sample
- framework-middleware-demo with new flows:
  - model_level_middleware_demo: Model-level middleware via define_model(use=[...])
  - combined_middleware_demo: Call-time + model-level middleware together

PARITY_AUDIT.md
- […] define_model(use=[...])) and G2 (Action-level middleware storage) as Done
- […] getModelMiddleware())

Execution Order
call-time[0..N] → model-level[0..M] → runner

Parity Status
define_model(use=[...]): Done
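The storage path this description outlines (a `use` list handed to define_model, passed through the registry, and exposed read-only on the action) might look roughly like the sketch below. The classes `SketchAction` and `SketchRegistry` and the free function `define_model` here are hypothetical stand-ins written for illustration; the real Genkit classes have more parameters and different internals.

```python
from typing import Any, Callable, Optional


class SketchAction:
    """Hypothetical action that stores middleware given at construction."""

    def __init__(self, name: str, fn: Callable[..., Any],
                 middleware: Optional[list[Any]] = None) -> None:
        self.name = name
        self.fn = fn
        # Always store a list, so readers never need a None check.
        self._middleware = list(middleware) if middleware is not None else []

    @property
    def middleware(self) -> list[Any]:
        # No setter is defined, so assigning to action.middleware fails.
        return self._middleware


class SketchRegistry:
    """Hypothetical registry that passes middleware through to the action."""

    def __init__(self) -> None:
        self._actions: dict[str, SketchAction] = {}

    def register_action(self, name: str, fn: Callable[..., Any],
                        middleware: Optional[list[Any]] = None) -> SketchAction:
        action = SketchAction(name, fn, middleware)
        self._actions[name] = action
        return action


def define_model(registry: SketchRegistry, name: str, fn: Callable[..., Any],
                 use: Optional[list[Any]] = None) -> SketchAction:
    """Forward the `use` list as the action's middleware."""
    return registry.register_action(name, fn, middleware=use)


def echo(req: Any) -> Any:
    """Trivial model function used only for the demonstration."""
    return req


def tag_mw(req: Any) -> Any:
    """Placeholder middleware object; its behavior is irrelevant here."""
    return req


registry = SketchRegistry()
with_mw = define_model(registry, "echo-with-mw", echo, use=[tag_mw])
without_mw = define_model(registry, "echo-plain", echo)
```

In this sketch `without_mw.middleware` is an empty list rather than `None`, which is the property the review thread relied on when it called the `if model.middleware else []` check redundant.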