# Python: Add MLflow AI Gateway provider samples #5507

**`README.md`**

# MLflow AI Gateway Examples

This folder contains examples demonstrating how to use the [MLflow AI Gateway](https://mlflow.org/docs/latest/genai/governance/ai-gateway/) with the Agent Framework.

## What is MLflow AI Gateway?

MLflow AI Gateway (MLflow ≥ 3.0) is a database-backed LLM proxy built into the MLflow tracking server. It provides a unified API across multiple LLM providers — OpenAI, Anthropic, Gemini, Mistral, Bedrock, Ollama, and more — with built-in:

- **Secrets management** — provider API keys stored encrypted on the server
- **Fallback & retry** — automatic failover to backup models on failure
- **Traffic splitting** — A/B test by routing percentages of requests to different models
- **Budget tracking** — per-endpoint or per-user token budgets
- **Usage tracing** — every call logged as an MLflow trace automatically

All gateway features are configured through the MLflow UI. Your application code stays the same regardless of which underlying LLM provider the gateway routes to.

## Prerequisites

1. **Install MLflow** (using [`uv`](https://docs.astral.sh/uv/), which Agent Framework uses):

   ```bash
   uv pip install 'mlflow[genai]'
   ```

   Or run it directly with `uvx` (no install needed):

   ```bash
   uvx --from 'mlflow[genai]' mlflow server --host 127.0.0.1 --port 5000
   ```

2. **Start the MLflow server** (if you didn't use `uvx` above):

   ```bash
   mlflow server --host 127.0.0.1 --port 5000
   ```

3. **Create a gateway endpoint** in the MLflow UI at [http://localhost:5000](http://localhost:5000). Navigate to **AI Gateway → Create Endpoint**, select a provider (e.g., OpenAI) and model (e.g., `gpt-4o-mini`), and enter your provider API key. The key is stored encrypted on the server.

   See the [MLflow AI Gateway documentation](https://mlflow.org/docs/latest/genai/governance/ai-gateway/endpoints/) for details on endpoint configuration.

## Recommended Approach

Since MLflow AI Gateway exposes an OpenAI-compatible endpoint at `/gateway/openai/v1`, you can connect Agent Framework to it using the existing `OpenAIChatClient` with a custom `base_url` — no extra packages required beyond the OpenAI integration.

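A minimal sketch of the connection, assuming the environment variables described under Configuration below are set (the agent name and instructions are illustrative):

```python
import os

from agent_framework import Agent
from agent_framework.openai import OpenAIChatClient

# Point the OpenAI client at the gateway's OpenAI-compatible endpoint.
# The client-side API key is unused: provider keys are managed by the MLflow server.
client = OpenAIChatClient(
    api_key="unused",
    base_url=os.environ["MLFLOW_GATEWAY_ENDPOINT"],  # e.g. "http://localhost:5000/gateway/openai/v1/"
    model=os.environ["MLFLOW_GATEWAY_MODEL"],  # the gateway endpoint name, e.g. "my-chat-endpoint"
)

agent = Agent(client=client, name="Assistant", instructions="You are a helpful assistant.")
```

The full sample below validates both variables before creating the client, since an empty `base_url` would make `OpenAIChatClient` fall back to OpenAI's public endpoint.
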
## Examples

| File | Description |
|------|-------------|
| [`mlflow_gateway_with_openai_chat_client.py`](mlflow_gateway_with_openai_chat_client.py) | Connect an Agent Framework agent to MLflow AI Gateway via the OpenAI-compatible endpoint. Shows both streaming and non-streaming responses with tool calling. |

## Configuration

Set the following environment variables before running the example:

- `MLFLOW_GATEWAY_ENDPOINT`: The base URL for the gateway's OpenAI-compatible endpoint (must include the `/gateway/openai/v1/` suffix)
  - Example: `export MLFLOW_GATEWAY_ENDPOINT="http://localhost:5000/gateway/openai/v1/"`
- `MLFLOW_GATEWAY_MODEL`: The gateway endpoint name you created in the MLflow UI
  - Example: `export MLFLOW_GATEWAY_MODEL="my-chat-endpoint"`

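The sample fails fast if either variable is missing, because an empty `MLFLOW_GATEWAY_ENDPOINT` would cause `OpenAIChatClient` to silently fall back to OpenAI's public endpoint and forward prompts there. A minimal sketch of that check, mirroring the sample's `_require_env` helper:

```python
import os
import sys


def require_env(name: str) -> str:
    """Exit with a clear error if a required variable is missing or empty."""
    value = os.getenv(name)
    if not value:
        sys.exit(f"Error: {name} is not set. See the README in this folder for setup instructions.")
    return value


base_url = require_env("MLFLOW_GATEWAY_ENDPOINT")
model = require_env("MLFLOW_GATEWAY_MODEL")
```
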
## Switching Providers Without Code Changes

A key benefit of using MLflow AI Gateway is that you can change the underlying LLM provider by reconfiguring the gateway endpoint in the MLflow UI — your Agent Framework code stays the same. For example, the same agent can route to:

- An OpenAI-backed endpoint for production
- An Anthropic-backed endpoint for fallback
- A local Ollama-backed endpoint for development

All of this is controlled by the gateway's endpoint configuration.

---

**`mlflow_gateway_with_openai_chat_client.py`**

```python
# Copyright (c) Microsoft. All rights reserved.

import asyncio
import os
import sys
from random import randint
from typing import Annotated

from agent_framework import Agent, tool
from agent_framework.openai import OpenAIChatClient
from dotenv import load_dotenv

# Load environment variables from .env file
load_dotenv()

"""
MLflow AI Gateway with OpenAI Chat Client Example

This sample demonstrates routing Agent Framework requests through the
MLflow AI Gateway using the OpenAI-compatible passthrough endpoint.

MLflow AI Gateway (MLflow >= 3.0) is a database-backed LLM proxy that
provides a unified API across multiple providers (OpenAI, Anthropic,
Gemini, Mistral, Bedrock, Ollama, and more) with built-in secrets
management, fallback/retry, traffic splitting, and budget tracking.
Provider API keys are stored encrypted on the server.

Setup:
    pip install mlflow[genai]
    mlflow server --host 127.0.0.1 --port 5000

Then create a gateway endpoint in the MLflow UI at http://localhost:5000
under AI Gateway -> Create Endpoint, select a provider and model, and
enter your provider API key.

Environment Variables:
    - MLFLOW_GATEWAY_ENDPOINT: Base URL for the gateway's OpenAI-compatible
      endpoint (e.g., "http://localhost:5000/gateway/openai/v1/")
    - MLFLOW_GATEWAY_MODEL: The gateway endpoint name you created in the
      MLflow UI (e.g., "my-chat-endpoint")

See: https://mlflow.org/docs/latest/genai/governance/ai-gateway/
"""


def _require_env(name: str) -> str:
    """Read a required env var; exit with a clear error if missing or empty.

    Without this check, an empty MLFLOW_GATEWAY_ENDPOINT would cause
    OpenAIChatClient to silently fall back to OpenAI's public endpoint and
    forward prompts there.
    """
    value = os.getenv(name)
    if not value:
        sys.exit(
            f"Error: {name} is not set. See the README in this folder for setup "
            "instructions: https://mlflow.org/docs/latest/genai/governance/ai-gateway/"
        )
    return value


# NOTE: approval_mode="never_require" is for sample brevity. Use "always_require" in production;
# see samples/02-agents/tools/function_tool_with_approval.py
# and samples/02-agents/tools/function_tool_with_approval_and_sessions.py.
@tool(approval_mode="never_require")
def get_weather(
    location: Annotated[str, "The location to get the weather for."],
) -> str:
    """Get the weather for a given location."""
    conditions = ["sunny", "cloudy", "rainy", "stormy"]
    return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."


async def non_streaming_example(base_url: str, model: str) -> None:
    """Example of non-streaming response (get the complete result at once)."""
    print("=== Non-streaming Response Example ===")

    client = OpenAIChatClient(
        api_key="unused",  # Provider keys are managed by the MLflow server
        base_url=base_url,
        model=model,
    )

    agent = Agent(
        client=client,
        name="WeatherAgent",
        instructions="You are a helpful weather agent.",
        tools=[get_weather],
    )

    query = "What's the weather like in Seattle?"
    print(f"User: {query}")
    result = await agent.run(query)
    print(f"Agent: {result}\n")


async def streaming_example(base_url: str, model: str) -> None:
    """Example of streaming response (get results as they are generated)."""
    print("=== Streaming Response Example ===")

    client = OpenAIChatClient(
        api_key="unused",  # Provider keys are managed by the MLflow server
        base_url=base_url,
        model=model,
    )

    agent = Agent(
        client=client,
        name="WeatherAgent",
        instructions="You are a helpful weather agent.",
        tools=[get_weather],
    )

    query = "What's the weather like in Portland?"
    print(f"User: {query}")
    print("Agent: ", end="", flush=True)
    async for chunk in agent.run(query, stream=True):
        if chunk.text:
            print(chunk.text, end="", flush=True)
    print("\n")


async def main() -> None:
    print("=== MLflow AI Gateway with OpenAI Chat Client Agent Example ===")

    # Validate required env vars upfront so we never silently route to OpenAI's
    # public endpoint if MLFLOW_GATEWAY_ENDPOINT is missing or empty.
    base_url = _require_env("MLFLOW_GATEWAY_ENDPOINT")
    model = _require_env("MLFLOW_GATEWAY_MODEL")

    await non_streaming_example(base_url, model)
    await streaming_example(base_url, model)


if __name__ == "__main__":
    asyncio.run(main())
```

---

**Review comment:**

> The examples table is malformed markdown (each row starts with `||`), so it won't render as a 2-column table. Use standard table syntax with a single leading `|` per row (matching other provider READMEs).
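
For reference, the corrected rows would look like this (a sketch of the suggested syntax, with an abbreviated description):

```markdown
| File | Description |
|------|-------------|
| [`mlflow_gateway_with_openai_chat_client.py`](mlflow_gateway_with_openai_chat_client.py) | Connect an agent via the gateway's OpenAI-compatible endpoint. |
```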