Use case
We are adding Google ADK support to Kitaru, an OSS durable execution/checkpointing runtime for AI workflows.
For ADK, we can already support two useful paths:
- Whole-runner checkpointing: wrap an ADK runner invocation as one durable operation.
- Explicit model/tool checkpointing: users pass wrapped ADK objects such as
KitaruADKModel and KitaruADKTool, and ADK calls those wrapped objects directly.
The second path works because the actual model/tool operation happens inside the wrapper method:
ADK runner
-> ADK LlmAgent
-> KitaruADKModel.generate_content_async(...)
-> Kitaru checkpoint
-> real model.generate_content_async(...)
ADK runner
-> ADK tool dispatch
-> KitaruADKTool.run_async(...)
-> Kitaru checkpoint
-> real tool.run_async(...)
That gives replay-safe behavior: if a process crashes after a model/tool call completes, Kitaru can replay from the stored checkpoint instead of repeating the provider call or tool side effect.
What is missing
For arbitrary ADK runners, we do not see a public plugin hook that lets a plugin wrap the actual model/tool call body.
The current plugin/callback shape appears to be before/after/error:
before_model_callback(callback_context, llm_request)
ADK calls the model
after_model_callback(callback_context, llm_response)
and similarly for tools:
before_tool_callback(tool, tool_args, tool_context)
ADK runs the tool
after_tool_callback(tool, tool_args, tool_context, result)
Those hooks are useful for logging, guardrails, short-circuiting, and response modification. But they do not let a durability runtime run the actual provider/tool call inside its own checkpoint.
The failure case is:
1. before_model_callback runs.
2. It returns None, so ADK calls the provider.
3. The provider returns successfully.
4. The process crashes before after_model_callback persists the response.
5. Replay calls the provider again.
For a durability system, that means duplicate model cost or duplicated tool side effects.
Returning a cached response from before_model_callback helps when the cache already exists, but it does not solve the first successful uncached call unless the call itself can happen inside the durable checkpoint.
Requested API shape
Would ADK consider adding middleware-style around-call plugin hooks that receive both the call input and a continuation/proceed callable?
For example, model calls:
async def wrap_model_call(
self,
*,
callback_context: CallbackContext,
llm_request: LlmRequest,
proceed: Callable[[LlmRequest], Awaitable[LlmResponse]],
) -> LlmResponse:
...
Tool calls:
async def wrap_tool_call(
self,
*,
tool: BaseTool,
tool_args: dict[str, Any],
tool_context: ToolContext,
proceed: Callable[[], Awaitable[dict]],
) -> dict:
...
Then a durability plugin can do this:
open checkpoint
-> call proceed(...)
-> persist returned response/result as checkpoint output
return response/result to ADK
This differs from before/after callbacks because the plugin controls when the side effect happens and can place that side effect inside its own durable boundary.
Why this matters
This would enable replay-safe integrations for:
- durable model-call checkpointing
- durable tool-call checkpointing
- exact retry/replay behavior after process crashes
- provider-cost deduplication after successful calls
- avoiding duplicated tool side effects when tools are idempotent only at the runtime/checkpoint layer
It would also let ADK support durable execution systems without requiring users to manually wrap every model/tool object before constructing their agents.
Current workaround
The current safe workaround is explicit wrapping:
user passes wrapped model/tool objects into ADK
ADK calls the wrappers
wrappers checkpoint their own method bodies
That works, but it is opt-in per model/tool and does not give a runner-level integration for arbitrary ADK runners.
We intentionally do not treat before/after metadata callbacks as replay-safe checkpointing, because that would make users think provider/tool side effects are protected when they may still be repeated on replay.
Questions
- Is there already a public around-call hook or continuation-style plugin API that we missed?
- If not, would the ADK team be open to a
wrap_model_call / wrap_tool_call style plugin API?
- Are there constraints in ADK's runner/model/tool execution model that would make this unsafe or hard to support?
Use case
We are adding Google ADK support to Kitaru, an OSS durable execution/checkpointing runtime for AI workflows.
For ADK, we can already support two useful paths:
KitaruADKModelandKitaruADKTool, and ADK calls those wrapped objects directly.The second path works because the actual model/tool operation happens inside the wrapper method:
That gives replay-safe behavior: if a process crashes after a model/tool call completes, Kitaru can replay from the stored checkpoint instead of repeating the provider call or tool side effect.
What is missing
For arbitrary ADK runners, we do not see a public plugin hook that lets a plugin wrap the actual model/tool call body.
The current plugin/callback shape appears to be before/after/error:
and similarly for tools:
Those hooks are useful for logging, guardrails, short-circuiting, and response modification. But they do not let a durability runtime run the actual provider/tool call inside its own checkpoint.
The failure case is:
For a durability system, that means duplicate model cost or duplicated tool side effects.
Returning a cached response from
before_model_callbackhelps when the cache already exists, but it does not solve the first successful uncached call unless the call itself can happen inside the durable checkpoint.Requested API shape
Would ADK consider adding middleware-style around-call plugin hooks that receive both the call input and a continuation/proceed callable?
For example, model calls:
Tool calls:
Then a durability plugin can do this:
This differs from before/after callbacks because the plugin controls when the side effect happens and can place that side effect inside its own durable boundary.
Why this matters
This would enable replay-safe integrations for:
It would also let ADK support durable execution systems without requiring users to manually wrap every model/tool object before constructing their agents.
Current workaround
The current safe workaround is explicit wrapping:
That works, but it is opt-in per model/tool and does not give a runner-level integration for arbitrary ADK runners.
We intentionally do not treat before/after metadata callbacks as replay-safe checkpointing, because that would make users think provider/tool side effects are protected when they may still be repeated on replay.
Questions
wrap_model_call/wrap_tool_callstyle plugin API?