feat(plugins): add fallback plugin to automatically fall back between Gemini models #88

@benmizrahi

Description

🔴 Required Information

Is your feature request related to a specific problem?

When using ADK with models that can return transient HTTP errors such as
rate-limit (429) or gateway timeout (504), there is no built-in mechanism
to:

  1. Guarantee that every new request always starts with the intended primary
    model (non-persistent fallback).
  2. Detect those error responses and record structured metadata about the fallback
    event so that the caller or model layer can take remedial action.

Manually wiring this logic in every agent callback is repetitive, error-prone,
and leaks infrastructure concerns into agent business logic.

Describe the Solution You'd Like

A new prebuilt plugin — FallbackPlugin — that:

  1. Resets the model to root_model before every LLM request via
    before_model_callback, so that any fallback state from a previous failed
    turn does not carry over ("non-persistent fallback").

  2. Detects retriable errors in the response via after_model_callback:
    when LlmResponse.error_code matches one of the configured error_status
    codes ([429, 504] by default) and a fallback_model is configured, it
    writes the following keys to LlmResponse.custom_metadata:

    | Key | Type | Description |
    | --- | --- | --- |
    | `fallback_triggered` | `bool` | Always `True`. |
    | `original_model` | `str` | The `root_model` value. |
    | `fallback_model` | `str` | The backup model identifier. |
    | `fallback_attempt` | `int` | Cumulative attempt count for this context. |
    | `error_code` | `str` | String representation of the error code. |
  3. Tracks per-context attempt counts in an internal dictionary, pruned
    automatically to avoid unbounded memory growth.
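
The per-context attempt counter in item 3 could be bounded like the following self-contained sketch. The `AttemptTracker` class, the cap of 1000 contexts, and the least-recently-updated eviction policy are all illustrative assumptions, not details from the proposal:

```python
# Hedged sketch of a bounded per-context attempt counter (item 3 above).
# The class name, cap, and eviction policy are assumptions for illustration.
from collections import OrderedDict


class AttemptTracker:
    """Counts fallback attempts per invocation context, bounded in size."""

    def __init__(self, max_contexts: int = 1000) -> None:
        self._counts: OrderedDict[str, int] = OrderedDict()
        self._max_contexts = max_contexts

    def increment(self, context_id: str) -> int:
        """Record one more attempt for this context and return its total."""
        # pop + reinsert moves the context to the most-recent end.
        self._counts[context_id] = self._counts.pop(context_id, 0) + 1
        # Prune the least-recently-updated contexts to bound memory.
        while len(self._counts) > self._max_contexts:
            self._counts.popitem(last=False)
        return self._counts[context_id]


tracker = AttemptTracker(max_contexts=2)
tracker.increment("ctx-a")  # first attempt for ctx-a
tracker.increment("ctx-b")
tracker.increment("ctx-c")  # exceeds the cap, evicting ctx-a
```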

The plugin deliberately does not re-issue the request itself; the actual
retry is delegated to the underlying model (e.g. LiteLlm's fallbacks
parameter), keeping the plugin focused on a single responsibility.
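
Given that metadata schema, a caller can inspect a response for a fallback event. A minimal sketch, using a plain dict to stand in for `LlmResponse.custom_metadata` (the key names come from the table above; the values and the `describe_fallback` helper are illustrative):

```python
# Plain dict standing in for LlmResponse.custom_metadata after a 429;
# key names follow the table above, values are illustrative.
custom_metadata = {
    "fallback_triggered": True,
    "original_model": "gemini-2.0-flash",
    "fallback_model": "gemini-1.5-pro",
    "fallback_attempt": 1,
    "error_code": "429",
}


def describe_fallback(metadata: dict) -> str:
    """Summarize a fallback event recorded in response metadata, if any."""
    if not metadata.get("fallback_triggered"):
        return "no fallback"
    return (
        f"attempt {metadata['fallback_attempt']}: "
        f"{metadata['original_model']} -> {metadata['fallback_model']} "
        f"(error {metadata['error_code']})"
    )


print(describe_fallback(custom_metadata))
```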

API surface:

```python
from google.adk.plugins.fallback_plugin import FallbackPlugin

plugin = FallbackPlugin(
    root_model="gemini-2.0-flash",    # Primary model, always tried first.
    fallback_model="gemini-1.5-pro",  # Backup model recorded in metadata.
    error_status=[429, 504],          # Default retriable HTTP codes.
)

app = App(
    agent=root_agent,
    plugins=[plugin],
)
```

Impact on your work

Without this plugin, every team using multi-model setups with LiteLlm
fallbacks has to reimplement the same before_model_callback /
after_model_callback boilerplate. This plugin consolidates that pattern
into a single, tested component with sensible defaults that works out of
the box.

This is important for production deployments where models can hit rate limits
and graceful degradation is required.

Willingness to contribute

Yes — implementation, unit tests, sample agent, and README are all
included in the accompanying PR.


🟡 Recommended Information

Describe Alternatives You've Considered

  • Agent-level callbacks: The same logic can be placed in
    before_model_callback / after_model_callback on a single LlmAgent.
    This does not scale — every agent that needs fallback has to duplicate the
    code, and the reset-to-root-model guarantee is easy to forget.
  • LiteLlm fallbacks alone: LiteLlm's native retry handles the actual
    re-request, but provides no per-turn reset guarantee and no structured
    metadata on the LlmResponse. The FallbackPlugin is designed to
    complement, not replace, LiteLlm's built-in mechanism.

Proposed API / Implementation

```python
# src/google/adk/plugins/fallback_plugin.py

class FallbackPlugin(BasePlugin):
    def __init__(
        self,
        name: str = "fallback_plugin",
        root_model: Optional[str] = None,
        fallback_model: Optional[str] = None,
        # None avoids a mutable default argument; treated as [429, 504].
        error_status: Optional[list[int]] = None,
    ) -> None: ...

    async def before_model_callback(
        self,
        *,
        callback_context: CallbackContext,
        llm_request: LlmRequest,
    ) -> Optional[LlmResponse]: ...

    async def after_model_callback(
        self,
        *,
        callback_context: CallbackContext,
        llm_response: LlmResponse,
    ) -> Optional[LlmResponse]: ...
```
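
The reset-then-annotate flow behind those two callbacks could look like the following self-contained sketch. The stub dataclasses stand in for ADK's `LlmRequest`/`LlmResponse`, the single `_attempts` counter simplifies the real per-context tracking, and `callback_context` is omitted for brevity, so this is an assumption-laden illustration rather than the actual implementation:

```python
# Hedged sketch of the proposed callback logic. Stub dataclasses replace
# ADK's LlmRequest/LlmResponse; a single counter replaces per-context
# tracking; callback_context is omitted for brevity.
import asyncio
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LlmRequest:  # stub
    model: Optional[str] = None


@dataclass
class LlmResponse:  # stub
    error_code: Optional[int] = None
    custom_metadata: dict = field(default_factory=dict)


class FallbackPlugin:
    def __init__(self, root_model=None, fallback_model=None,
                 error_status=None):
        self.root_model = root_model
        self.fallback_model = fallback_model
        self.error_status = [429, 504] if error_status is None else error_status
        self._attempts = 0  # the real plugin tracks this per context

    async def before_model_callback(self, *, llm_request):
        # Non-persistent fallback: every turn starts on the primary model.
        if self.root_model:
            llm_request.model = self.root_model
        return None

    async def after_model_callback(self, *, llm_response):
        if llm_response.error_code in self.error_status and self.fallback_model:
            self._attempts += 1
            llm_response.custom_metadata.update({
                "fallback_triggered": True,
                "original_model": self.root_model,
                "fallback_model": self.fallback_model,
                "fallback_attempt": self._attempts,
                "error_code": str(llm_response.error_code),
            })
        return None  # the underlying model layer performs the actual retry


plugin = FallbackPlugin(root_model="gemini-2.0-flash",
                        fallback_model="gemini-1.5-pro")
resp = LlmResponse(error_code=429)
asyncio.run(plugin.after_model_callback(llm_response=resp))
```

Returning `None` from both callbacks keeps the plugin purely observational, consistent with the stated single responsibility: the retry itself stays with the model layer.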

Additional Context

  • Pairs naturally with contributing/samples/litellm_with_fallback_models,
    which already demonstrates LiteLlm's fallbacks parameter.
  • A new sample contributing/samples/plugin_fallback/ is included with a
    README.md documenting configuration, metadata schema, and usage.
  • The plugin should be listed in the
    Prebuilt Plugins section of
    the documentation alongside GlobalInstructionPlugin,
    ReflectAndRetryToolPlugin, and DebugLoggingPlugin.
