🔴 Required Information
Is your feature request related to a specific problem?
Plugin before_model_callbacks run before the agent's own before_model_callbacks and before ADK's post-callback request finalization. As a result, observability plugins that snapshot the LlmRequest in before_model_callback capture the request too early — they cannot observe the request that is actually sent to the model. There is currently no plugin hook positioned after all callbacks/finalization but immediately before the model call.
This surfaced in #4202 (discussion comment): the BigQueryAgentAnalyticsPlugin's LLM_REQUEST event does not reflect context modifications made in a user's before_model_callback.
Current behavior (flows/llm_flows/base_llm_flow.py):
- _handle_before_model_callback runs plugin before_model_callbacks first, then agent.canonical_before_model_callbacks.
- After the callbacks return, _call_llm_async finalizes the request (e.g. injects config.labels, including adk_agent_name) and only then calls llm.generate_content_async(llm_request).
So a plugin snapshotting the request in before_model_callback captures:
- ✅ Everything assembled in _preprocess_async: request processors, system instruction, contents, tools.
- ❌ Misses anything mutated afterward: the agent's before_model_callback, any plugin registered after this one, and ADK's own config.labels injection.
The LlmRequest is mutated in place through this whole chain, so the gap is purely about when a plugin is allowed to observe it — there is no hook at the actual send point.
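The ordering described above can be reproduced in isolation. The following is a minimal sketch, not ADK code: FakeLlmRequest, SnapshotPlugin, agent_before_model_callback and simulate_call are hypothetical stand-ins that only mimic the order of operations (plugin callback, then agent callback, then label injection). It shows a snapshot taken in before_model_callback drifting from the request that is finally sent.

```python
import asyncio
import copy
from dataclasses import dataclass, field


@dataclass
class FakeLlmRequest:
    """Hypothetical stand-in for LlmRequest; only the fields the demo needs."""
    contents: list = field(default_factory=list)
    labels: dict = field(default_factory=dict)


class SnapshotPlugin:
    """Hypothetical observability plugin that snapshots in before_model_callback."""

    def __init__(self):
        self.snapshot = None

    async def before_model_callback(self, llm_request):
        # Deep copy, so later in-place mutations are NOT reflected here.
        self.snapshot = copy.deepcopy(llm_request)


async def agent_before_model_callback(llm_request):
    # A user's agent-level callback that edits the context in place.
    llm_request.contents.append("context added by agent callback")


async def simulate_call(plugin, agent_name="my_agent"):
    # State after preprocessing: this is all the plugin gets to see.
    llm_request = FakeLlmRequest(contents=["user prompt"])
    await plugin.before_model_callback(llm_request)     # 1. plugin callbacks
    await agent_before_model_callback(llm_request)      # 2. agent callbacks
    llm_request.labels["adk_agent_name"] = agent_name   # 3. finalization
    return llm_request  # what would be handed to the model


plugin = SnapshotPlugin()
sent = asyncio.run(simulate_call(plugin))
print("snapshot:", plugin.snapshot)
print("sent:    ", sent)
```

In this sketch the snapshot lacks both the agent callback's extra content and the adk_agent_name label, which is the gap the issue describes.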
Describe the Solution You'd Like
Add an additive plugin lifecycle hook that fires in _call_llm_async after all before_model_callbacks and request finalization, immediately before llm.generate_content_async, receiving the final LlmRequest. For example:
    async def on_model_request_callback(
        self, *, callback_context: CallbackContext, llm_request: LlmRequest
    ) -> Optional[LlmResponse]:
        """Called with the final LlmRequest, right before it is sent to the model."""
        ...
This is correct by construction: it observes exactly what is sent, at the right time, and preserves LLM_REQUEST → LLM_RESPONSE event ordering and span/latency semantics. It is backward-compatible (no-op for existing plugins) and benefits every observability consumer, not just BigQueryAgentAnalyticsPlugin.
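To illustrate what "observes exactly what is sent" would mean in practice, here is a rough sketch under stated assumptions. The hook name on_model_request_callback comes from the proposal above and does not exist in ADK today; FakeLlmRequest, RequestAuditPlugin and fake_call_llm are hypothetical stand-ins, and the model call is replaced by a simple copy of the request.

```python
import asyncio
import copy
from dataclasses import dataclass, field


@dataclass
class FakeLlmRequest:
    """Hypothetical stand-in for LlmRequest."""
    contents: list = field(default_factory=list)
    labels: dict = field(default_factory=dict)


class RequestAuditPlugin:
    """Hypothetical plugin that logs from the proposed pre-send hook."""

    def __init__(self):
        self.logged = []

    async def on_model_request_callback(self, *, callback_context, llm_request):
        # Proposed hook (assumption): runs after all callbacks and finalization.
        self.logged.append(copy.deepcopy(llm_request))
        return None


async def fake_call_llm(plugin, llm_request):
    # Stand-in for the agent's before_model_callback mutating the request.
    llm_request.contents.append("context added by agent callback")
    # Stand-in for ADK's label injection during finalization.
    llm_request.labels["adk_agent_name"] = "my_agent"
    # Proposed hook fires last, immediately before the model call.
    await plugin.on_model_request_callback(
        callback_context=None, llm_request=llm_request
    )
    # Stand-in for the model call: record what the model would receive.
    return copy.deepcopy(llm_request)


plugin = RequestAuditPlugin()
sent = asyncio.run(fake_call_llm(plugin, FakeLlmRequest(contents=["user prompt"])))
print(plugin.logged[0] == sent)
```

Because the hook runs after every mutation in this sketch, the logged request and the sent request compare equal without the plugin needing to know about other callbacks.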
Impact on your work
Any plugin doing request-level observability/auditing (BigQueryAgentAnalyticsPlugin, logging_plugin, custom plugins) currently logs a request that can differ from what the model received. This makes the logged prompt/context untrustworthy for debugging, auditing, and eval/replay — exactly the use cases these plugins exist for. Without a shared hook, each integration has to re-implement its own fragile workaround.
Willingness to contribute
Yes — happy to help with the design and/or a PR.
🟡 Recommended Information
Describe Alternatives You've Considered
Plugin-local workaround (not recommended long-term): a plugin can stash the llm_request reference in before_model_callback and serialize it in after_model_callback / on_model_error_callback, relying on in-place mutation to read the final state. Downsides: the LLM_REQUEST event lands after the response; it needs per-in-flight-call keying; and short-circuited calls (a callback returning an LlmResponse) never reach after_model_callback, so nothing is logged in that case.
Expose the final llm_request to after_model_callback: simpler, but it keeps the timing wrong (request observed after the response). A dedicated pre-send hook keeps event ordering correct.
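The downsides of the first alternative (the plugin-local workaround) can be seen in a small simulation. This is a sketch only: StashingPlugin, simulate_call and the call_id key are hypothetical, the real callbacks receive a callback context rather than an explicit id, and the short-circuit behavior is modeled as described above rather than taken from ADK source.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class FakeLlmRequest:
    """Hypothetical stand-in for LlmRequest."""
    contents: list = field(default_factory=list)


class StashingPlugin:
    """Hypothetical plugin: stash a reference early, serialize it late."""

    def __init__(self):
        self._pending = {}  # per-in-flight-call keying (call_id is an assumption)
        self.events = []

    async def before_model_callback(self, call_id, llm_request):
        # Keep the reference, not a copy, so later mutations stay visible.
        self._pending[call_id] = llm_request

    async def after_model_callback(self, call_id, llm_response):
        llm_request = self._pending.pop(call_id)
        # The request event is only emitted once the response already exists.
        self.events.append(("LLM_REQUEST", list(llm_request.contents)))
        self.events.append(("LLM_RESPONSE", llm_response))

    def pending_ids(self):
        return sorted(self._pending)


async def simulate_call(plugin, call_id, short_circuit):
    llm_request = FakeLlmRequest(contents=["user prompt"])
    await plugin.before_model_callback(call_id, llm_request)
    llm_request.contents.append("context added by agent callback")
    if short_circuit:
        # A callback returned a response: after_model_callback is never reached.
        return "short-circuit response"
    response = "model response"
    await plugin.after_model_callback(call_id, response)
    return response


plugin = StashingPlugin()
asyncio.run(simulate_call(plugin, "call-1", short_circuit=False))
asyncio.run(simulate_call(plugin, "call-2", short_circuit=True))
print(plugin.events)
print(plugin.pending_ids())
```

In the sketch, the normal call does log the final request contents, but only after the response is available, while the short-circuited call logs nothing and leaves a stale entry in the pending map.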
Proposed API / Implementation
In _call_llm_async, after callbacks + label injection and immediately before the model call:
    # after _handle_before_model_callback(...) and config.labels finalization
    await invocation_context.plugin_manager.run_on_model_request_callback(
        callback_context=callback_context,
        llm_request=llm_request,
    )
    # ... then:
    llm = self.__get_llm(invocation_context)
    async for llm_response in llm.generate_content_async(llm_request, ...):
        ...
BigQueryAgentAnalyticsPlugin would then emit its LLM_REQUEST event from on_model_request_callback instead of before_model_callback.
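One possible shape for the plugin-manager side is sketched below. Everything here is an assumption for illustration: FakeBasePlugin, FakePluginManager, LegacyPlugin and AuditPlugin are hypothetical, the request is a plain dict, and the sketch ignores the hook's return value because the call site above does not use it. The point is only that a default no-op on the base class would keep existing plugins working unchanged.

```python
import asyncio


class FakeBasePlugin:
    """Hypothetical base class with a default no-op for the proposed hook."""

    async def on_model_request_callback(self, *, callback_context, llm_request):
        return None  # existing plugins inherit this and are unaffected


class LegacyPlugin(FakeBasePlugin):
    """An existing plugin that knows nothing about the new hook."""


class AuditPlugin(FakeBasePlugin):
    """A plugin that opts in and records the final request."""

    def __init__(self):
        self.seen = []

    async def on_model_request_callback(self, *, callback_context, llm_request):
        self.seen.append(dict(llm_request))  # shallow copy of the demo dict


class FakePluginManager:
    """Hypothetical manager that fans the hook out in registration order."""

    def __init__(self, plugins):
        self.plugins = list(plugins)

    async def run_on_model_request_callback(self, *, callback_context, llm_request):
        for plugin in self.plugins:
            await plugin.on_model_request_callback(
                callback_context=callback_context, llm_request=llm_request
            )


audit = AuditPlugin()
manager = FakePluginManager([LegacyPlugin(), audit])
final_request = {"contents": ["user prompt"], "labels": {"adk_agent_name": "my_agent"}}
asyncio.run(
    manager.run_on_model_request_callback(
        callback_context=None, llm_request=final_request
    )
)
print(audit.seen)
```

Whether the real hook should be able to short-circuit the model call through its return value is a design question this sketch leaves open.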
Additional Context
flows/llm_flows/base_llm_flow.py (_handle_before_model_callback, _call_llm_async), plugins/base_plugin.py, plugins/bigquery_agent_analytics_plugin.py