Add a plugin hook to observe the final LlmRequest immediately before generate_content_async

## 🔴 Required Information

### Is your feature request related to a specific problem?

Plugin `before_model_callback`s run **before** the agent's own `before_model_callback`s and before ADK's post-callback request finalization. As a result, observability plugins that snapshot the `LlmRequest` in `before_model_callback` capture the request *too early* — they cannot observe the request that is actually sent to the model. There is currently **no plugin hook** positioned after all callbacks/finalization but immediately before the model call.

This surfaced in #4202 ([discussion comment](https://github.com/google/adk-python/discussions/4202#discussioncomment-17450332)): the `BigQueryAgentAnalyticsPlugin`'s `LLM_REQUEST` event does not reflect context modifications made in a user's `before_model_callback`.

**Current behavior** (`flows/llm_flows/base_llm_flow.py`):

- `_handle_before_model_callback` runs **plugin** `before_model_callback`s first, then `agent.canonical_before_model_callbacks`.
- After the callbacks return, `_call_llm_async` finalizes the request (e.g. injects `config.labels`, including `adk_agent_name`) and only then calls `llm.generate_content_async(llm_request)`.

So a plugin snapshotting the request in `before_model_callback` captures:

- ✅ Everything assembled in `_preprocess_async` — request processors, system instruction, contents, tools.
- ❌ **Misses** anything mutated afterward: the agent's `before_model_callback`, any plugin registered after this one, and ADK's own `config.labels` injection.

The `LlmRequest` is mutated **in place** through this whole chain, so the gap is purely about *when* a plugin is allowed to observe it — there is no hook at the actual send point.
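
The ordering gap can be reproduced in isolation. The following is a minimal sketch, not ADK code: `FakeLlmRequest`, `plugin_before_model`, `agent_before_model`, and `fake_call_llm` are hypothetical stand-ins that only mimic the order of operations listed under "Current behavior".

```python
import asyncio
import copy
from dataclasses import dataclass, field


@dataclass
class FakeLlmRequest:
    """Stand-in for ADK's LlmRequest; only the fields this sketch needs."""
    contents: list = field(default_factory=list)
    labels: dict = field(default_factory=dict)


snapshots = {}


async def plugin_before_model(req: FakeLlmRequest) -> None:
    # Observability plugin: snapshots the request as it exists right now.
    snapshots["plugin"] = copy.deepcopy(req)


async def agent_before_model(req: FakeLlmRequest) -> None:
    # User callback: mutates the request in place, after the plugin has run.
    req.contents.append("context added by the agent callback")


async def fake_call_llm(req: FakeLlmRequest) -> None:
    await plugin_before_model(req)                # 1. plugin callbacks
    await agent_before_model(req)                 # 2. agent callbacks
    req.labels["adk_agent_name"] = "demo_agent"   # 3. request finalization
    snapshots["sent"] = copy.deepcopy(req)        # 4. what the model would receive


asyncio.run(fake_call_llm(FakeLlmRequest(contents=["user prompt"])))
print(snapshots["plugin"] == snapshots["sent"])  # prints False
```

The plugin's snapshot lacks both the agent callback's added content and the injected label, which is the mismatch reported in #4202.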

### Describe the Solution You'd Like

Introduce a new, purely additive plugin lifecycle hook that fires in `_call_llm_async` **after** all `before_model_callback`s and request finalization, **immediately before** `llm.generate_content_async`, and receives the final `LlmRequest`. For example:

```python
async def on_model_request_callback(
    self, *, callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
    """Called with the final LlmRequest, right before it is sent to the model."""
    ...
```

This is correct by construction: it observes exactly what is sent, at the right time, and preserves `LLM_REQUEST` → `LLM_RESPONSE` event ordering and span/latency semantics. It is backward-compatible (no-op for existing plugins) and benefits every observability consumer, not just `BigQueryAgentAnalyticsPlugin`.

### Impact on your work

Any plugin doing request-level observability/auditing (`BigQueryAgentAnalyticsPlugin`, `logging_plugin`, custom plugins) currently logs a request that can differ from what the model received. This makes the logged prompt/context untrustworthy for debugging, auditing, and eval/replay — exactly the use cases these plugins exist for. Without a shared hook, each integration has to re-implement its own fragile workaround.

### Willingness to contribute

Yes — happy to help with the design and/or a PR.

---

## 🟡 Recommended Information

### Describe Alternatives You've Considered

**Plugin-local workaround (not recommended long-term):** a plugin can stash the `llm_request` reference in `before_model_callback` and serialize it in `after_model_callback` / `on_model_error_callback`, relying on in-place mutation to read the final state. Downsides: the `LLM_REQUEST` event lands *after* the response; it needs per-in-flight-call keying; and short-circuited calls (a callback returning an `LlmResponse`) never reach `after_model_callback`, so nothing is logged in that case.
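
To make the downsides concrete, here is a rough sketch of that workaround under simplifying assumptions. `FakeLlmRequest`, `StashingPlugin`, `fake_call_llm`, and the `call_id` key are all hypothetical; real ADK callbacks receive a `callback_context` rather than an explicit call ID, so a real plugin would have to derive its own key.

```python
import asyncio
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class FakeLlmRequest:
    """Stand-in for ADK's LlmRequest (assumption: only `contents` matters here)."""
    contents: list = field(default_factory=list)


class StashingPlugin:
    """Hypothetical plugin that defers LLM_REQUEST logging to the after-hook."""

    def __init__(self) -> None:
        self.pending = {}  # call_id -> request reference, one per in-flight call
        self.events = []

    async def before_model_callback(self, *, call_id: str, llm_request: FakeLlmRequest) -> None:
        # Keep the reference (not a copy) so later in-place mutations are visible.
        self.pending[call_id] = llm_request

    async def after_model_callback(self, *, call_id: str, llm_response: str) -> None:
        request = self.pending.pop(call_id, None)
        if request is not None:
            # Emitted only once the response already exists.
            self.events.append(("LLM_REQUEST", asdict(request)))
        self.events.append(("LLM_RESPONSE", llm_response))


async def fake_call_llm(plugin, call_id, request, short_circuit: Optional[str] = None) -> str:
    await plugin.before_model_callback(call_id=call_id, llm_request=request)
    request.contents.append("added by agent callback")  # later in-place mutation
    if short_circuit is not None:
        # A callback returned a response: the model call and after-hook are skipped.
        return short_circuit
    response = "model output"
    await plugin.after_model_callback(call_id=call_id, llm_response=response)
    return response


plugin = StashingPlugin()
asyncio.run(fake_call_llm(plugin, "call-1", FakeLlmRequest(["prompt"])))
asyncio.run(fake_call_llm(plugin, "call-2", FakeLlmRequest(["prompt"]), short_circuit="cached"))
print(plugin.events)         # only call-1 produced events
print(list(plugin.pending))  # call-2 is still stashed and was never logged
```

In this sketch the normal call does log the final request, but only after the response is available, while the short-circuited call logs nothing and leaves a stale entry in `pending`.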

**Expose the final `llm_request` to `after_model_callback`:** simpler, but it keeps the timing wrong (request observed after the response). A dedicated pre-send hook keeps event ordering correct.

### Proposed API / Implementation

In `_call_llm_async`, after the callbacks and label injection have run, and immediately before the model call:

```python
# after _handle_before_model_callback(...) and config.labels finalization
await invocation_context.plugin_manager.run_on_model_request_callback(
    callback_context=callback_context,
    llm_request=llm_request,
)
# ... then:
llm = self.__get_llm(invocation_context)
async for llm_response in llm.generate_content_async(llm_request, ...):
    ...
```

`BigQueryAgentAnalyticsPlugin` would then emit its `LLM_REQUEST` event from `on_model_request_callback` instead of `before_model_callback`.
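
As an illustration of how the proposed hook might behave, the sketch below uses stand-in classes rather than ADK itself. `FakeBasePlugin`, `LegacyPlugin`, `RequestLogger`, and `fake_call_llm` are hypothetical, `on_model_request_callback` is the hook proposed in this issue (it does not exist in ADK today), and the `callback_context` argument is omitted for brevity.

```python
import asyncio
import copy
from dataclasses import dataclass, field


@dataclass
class FakeLlmRequest:
    """Stand-in for ADK's LlmRequest; only the fields this sketch needs."""
    contents: list = field(default_factory=list)
    labels: dict = field(default_factory=dict)


class FakeBasePlugin:
    """Stand-in base class; both hooks default to no-ops."""

    async def before_model_callback(self, *, llm_request):
        return None

    async def on_model_request_callback(self, *, llm_request):
        # Proposed hook: a default no-op keeps existing plugins unaffected.
        return None


class LegacyPlugin(FakeBasePlugin):
    """An existing plugin that never overrides the new hook."""


class RequestLogger(FakeBasePlugin):
    """Observability plugin that records the request at the send point."""

    def __init__(self):
        self.logged = []

    async def on_model_request_callback(self, *, llm_request):
        self.logged.append(copy.deepcopy(llm_request))


async def agent_before_model(llm_request):
    # User callback that mutates the request after plugin callbacks ran.
    llm_request.contents.append("added by agent callback")


async def fake_call_llm(plugins, llm_request):
    for plugin in plugins:
        await plugin.before_model_callback(llm_request=llm_request)
    await agent_before_model(llm_request)
    llm_request.labels["adk_agent_name"] = "demo_agent"  # finalization
    for plugin in plugins:
        await plugin.on_model_request_callback(llm_request=llm_request)
    return copy.deepcopy(llm_request)  # stands in for what the model receives


logger = RequestLogger()
sent = asyncio.run(fake_call_llm([LegacyPlugin(), logger], FakeLlmRequest(["prompt"])))
print(logger.logged[0] == sent)  # prints True
```

Because the hook runs after the agent callback and label injection, the logged request matches what is sent, and `LegacyPlugin` keeps working without changes.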

### Additional Context

- Discussion: #4202 — https://github.com/google/adk-python/discussions/4202#discussioncomment-17450332
- Relevant code: `flows/llm_flows/base_llm_flow.py` (`_handle_before_model_callback`, `_call_llm_async`), `plugins/base_plugin.py`, `plugins/bigquery_agent_analytics_plugin.py`

