From 1e6f80681a29b7b39ba4377205c2481abe318176 Mon Sep 17 00:00:00 2001
From: PattaraS
Date: Mon, 27 Apr 2026 14:31:29 +0800
Subject: [PATCH 1/3] Add MLflow AI Gateway provider samples
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds a new providers/mlflow_gateway/ directory with an example showing
how to route Agent Framework requests through MLflow AI Gateway's
OpenAI-compatible endpoint using the existing OpenAIChatClient.

Follows the same pattern as ollama_with_openai_chat_client.py — no new
dependencies required beyond the OpenAI integration.
---
 python/samples/02-agents/providers/README.md  |   1 +
 .../providers/mlflow_gateway/README.md        |  63 ++++++++++
 .../mlflow_gateway_with_openai_chat_client.py | 112 ++++++++++++++++++
 3 files changed, 176 insertions(+)
 create mode 100644 python/samples/02-agents/providers/mlflow_gateway/README.md
 create mode 100644 python/samples/02-agents/providers/mlflow_gateway/mlflow_gateway_with_openai_chat_client.py

diff --git a/python/samples/02-agents/providers/README.md b/python/samples/02-agents/providers/README.md
index 6ab5fa9d76..eae5ec6a4e 100644
--- a/python/samples/02-agents/providers/README.md
+++ b/python/samples/02-agents/providers/README.md
@@ -11,6 +11,7 @@ This directory groups provider-specific samples for Agent Framework.
 | [`custom/`](custom/) | Framework extensibility samples for building custom `BaseAgent` and `BaseChatClient` implementations, including layer-composition guidance. |
 | [`foundry/`](foundry/) | Microsoft Foundry and Foundry Local samples using `FoundryChatClient`, `FoundryAgent`, `RawFoundryAgentChatClient`, and `FoundryLocalClient` for hosted agents, Responses API, local inference, tools, MCP, and sessions. |
 | [`github_copilot/`](github_copilot/) | `GitHubCopilotAgent` samples showing basic usage, session handling, permission-scoped shell/file/url access, and MCP integration. |
+| [`mlflow_gateway/`](mlflow_gateway/) | MLflow AI Gateway samples using `OpenAIChatClient` configured to route through the gateway's OpenAI-compatible endpoint for unified multi-provider access. |
 | [`ollama/`](ollama/) | Local Ollama samples using `OllamaChatClient` (recommended) plus OpenAI-compatible Ollama setup, including reasoning and multimodal examples. |
 | [`openai/`](openai/) | OpenAI provider samples for Chat and Chat Completion clients, including tools, structured output, sessions, MCP, web search, and multimodal tasks. |

diff --git a/python/samples/02-agents/providers/mlflow_gateway/README.md b/python/samples/02-agents/providers/mlflow_gateway/README.md
new file mode 100644
index 0000000000..b877be8f19
--- /dev/null
+++ b/python/samples/02-agents/providers/mlflow_gateway/README.md
@@ -0,0 +1,63 @@
+# MLflow AI Gateway Examples
+
+This folder contains examples demonstrating how to use the [MLflow AI Gateway](https://mlflow.org/docs/latest/genai/governance/ai-gateway/) with the Agent Framework.
+
+## What is MLflow AI Gateway?
+
+MLflow AI Gateway (MLflow ≥ 3.0) is a database-backed LLM proxy built into the MLflow tracking server. It provides a unified API across multiple LLM providers — OpenAI, Anthropic, Gemini, Mistral, Bedrock, Ollama, and more — with built-in:
+
+- **Secrets management** — provider API keys stored encrypted on the server
+- **Fallback & retry** — automatic failover to backup models on failure
+- **Traffic splitting** — A/B test by routing percentages of requests to different models
+- **Budget tracking** — per-endpoint or per-user token budgets
+- **Usage tracing** — every call logged as an MLflow trace automatically
+
+All gateway features are configured through the MLflow UI. Your application code stays the same regardless of which underlying LLM provider the gateway routes to.
+
+## Prerequisites
+
+1. **Install MLflow**:
+
+   ```bash
+   pip install 'mlflow[genai]'
+   ```
+
+2. **Start the MLflow server**:
+
+   ```bash
+   mlflow server --host 127.0.0.1 --port 5000
+   ```
+
+3. **Create a gateway endpoint** in the MLflow UI at [http://localhost:5000](http://localhost:5000). Navigate to **AI Gateway → Create Endpoint**, select a provider (e.g., OpenAI) and model (e.g., `gpt-4o-mini`), and enter your provider API key. The key is stored encrypted on the server.
+
+   See the [MLflow AI Gateway documentation](https://mlflow.org/docs/latest/genai/governance/ai-gateway/endpoints/) for details on endpoint configuration.
+
+## Recommended Approach
+
+Since MLflow AI Gateway exposes an OpenAI-compatible endpoint at `/gateway/openai/v1`, you can connect Agent Framework to it using the existing `OpenAIChatClient` with a custom `base_url` — no extra packages required beyond the OpenAI integration.
+
+## Examples
+
+| File | Description |
+|------|-------------|
+| [`mlflow_gateway_with_openai_chat_client.py`](mlflow_gateway_with_openai_chat_client.py) | Connect an Agent Framework agent to MLflow AI Gateway via the OpenAI-compatible endpoint. Shows both streaming and non-streaming responses with tool calling. |
+
+## Configuration
+
+Set the following environment variables before running the example:
+
+- `MLFLOW_GATEWAY_ENDPOINT`: The base URL for the gateway's OpenAI-compatible endpoint (must include the `/gateway/openai/v1/` suffix)
+  - Example: `export MLFLOW_GATEWAY_ENDPOINT="http://localhost:5000/gateway/openai/v1/"`
+
+- `MLFLOW_GATEWAY_MODEL`: The gateway endpoint name you created in the MLflow UI
+  - Example: `export MLFLOW_GATEWAY_MODEL="my-chat-endpoint"`
+
+## Switching Providers Without Code Changes
+
+A key benefit of using MLflow AI Gateway is that you can change the underlying LLM provider by reconfiguring the gateway endpoint in the MLflow UI — your Agent Framework code stays the same. For example, the same agent can route to:
+
+- An OpenAI-backed endpoint for production
+- An Anthropic-backed endpoint for fallback
+- A local Ollama-backed endpoint for development
+
+All of this is controlled by the gateway's endpoint configuration.
diff --git a/python/samples/02-agents/providers/mlflow_gateway/mlflow_gateway_with_openai_chat_client.py b/python/samples/02-agents/providers/mlflow_gateway/mlflow_gateway_with_openai_chat_client.py
new file mode 100644
index 0000000000..acec68920e
--- /dev/null
+++ b/python/samples/02-agents/providers/mlflow_gateway/mlflow_gateway_with_openai_chat_client.py
@@ -0,0 +1,112 @@
+# Copyright (c) Microsoft. All rights reserved.
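+
+# This sample never imports mlflow: the gateway is reached purely over its
+# OpenAI-compatible REST endpoint, so only the OpenAI integration is needed.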
+
+import asyncio
+import os
+from random import randint
+from typing import Annotated
+
+from agent_framework import Agent, tool
+from agent_framework.openai import OpenAIChatClient
+from dotenv import load_dotenv
+
+# Load environment variables from .env file
+load_dotenv()
+
+"""
+MLflow AI Gateway with OpenAI Chat Client Example
+
+This sample demonstrates routing Agent Framework requests through the
+MLflow AI Gateway using the OpenAI-compatible passthrough endpoint.
+
+MLflow AI Gateway (MLflow >= 3.0) is a database-backed LLM proxy that
+provides a unified API across multiple providers (OpenAI, Anthropic,
+Gemini, Mistral, Bedrock, Ollama, and more) with built-in secrets
+management, fallback/retry, traffic splitting, and budget tracking.
+Provider API keys are stored encrypted on the server.
+
+Setup:
+    pip install 'mlflow[genai]'
+    mlflow server --host 127.0.0.1 --port 5000
+
+Then create a gateway endpoint in the MLflow UI at http://localhost:5000
+under AI Gateway -> Create Endpoint, select a provider and model, and
+enter your provider API key.
+
+Environment Variables:
+- MLFLOW_GATEWAY_ENDPOINT: Base URL for the gateway's OpenAI-compatible
+  endpoint (e.g., "http://localhost:5000/gateway/openai/v1/")
+- MLFLOW_GATEWAY_MODEL: The gateway endpoint name you created in the
+  MLflow UI (e.g., "my-chat-endpoint")
+
+See: https://mlflow.org/docs/latest/genai/governance/ai-gateway/
+"""
+
+
+# NOTE: approval_mode="never_require" is for sample brevity. Use "always_require" in production;
+# see samples/02-agents/tools/function_tool_with_approval.py
+# and samples/02-agents/tools/function_tool_with_approval_and_sessions.py.
+@tool(approval_mode="never_require")
+def get_weather(
+    location: Annotated[str, "The location to get the weather for."],
+) -> str:
+    """Get the weather for a given location."""
+    conditions = ["sunny", "cloudy", "rainy", "stormy"]
+    return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."
+
+
+async def non_streaming_example() -> None:
+    """Example of non-streaming response (get the complete result at once)."""
+    print("=== Non-streaming Response Example ===")
+
+    _client = OpenAIChatClient(
+        api_key="unused",  # Provider keys are managed by the MLflow server
+        base_url=os.getenv("MLFLOW_GATEWAY_ENDPOINT"),
+        model=os.getenv("MLFLOW_GATEWAY_MODEL"),
+    )
+    agent = Agent(
+        client=_client,
+        name="WeatherAgent",
+        instructions="You are a helpful weather agent.",
+        tools=[get_weather],
+    )
+
+    query = "What's the weather like in Seattle?"
+    print(f"User: {query}")
+    result = await agent.run(query)
+    print(f"Agent: {result}\n")
+
+
+async def streaming_example() -> None:
+    """Example of streaming response (get results as they are generated)."""
+    print("=== Streaming Response Example ===")
+
+    _client = OpenAIChatClient(
+        api_key="unused",  # Provider keys are managed by the MLflow server
+        base_url=os.getenv("MLFLOW_GATEWAY_ENDPOINT"),
+        model=os.getenv("MLFLOW_GATEWAY_MODEL"),
+    )
+    agent = Agent(
+        client=_client,
+        name="WeatherAgent",
+        instructions="You are a helpful weather agent.",
+        tools=[get_weather],
+    )
+
+    query = "What's the weather like in Portland?"
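+    # agent.run(query, stream=True) yields partial updates as they are
+    # generated; the loop below prints each update's text delta and skips
+    # updates that carry no text.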
+    print(f"User: {query}")
+    print("Agent: ", end="", flush=True)
+    async for chunk in agent.run(query, stream=True):
+        if chunk.text:
+            print(chunk.text, end="", flush=True)
+    print("\n")
+
+
+async def main() -> None:
+    print("=== MLflow AI Gateway with OpenAI Chat Client Agent Example ===")
+
+    await non_streaming_example()
+    await streaming_example()
+
+
+if __name__ == "__main__":
+    asyncio.run(main())

From 48ac8c9cfb4fba8f90daaef956285d4b17e21968 Mon Sep 17 00:00:00 2001
From: PattaraS
Date: Mon, 27 Apr 2026 14:52:10 +0800
Subject: [PATCH 2/3] Validate MLFLOW_GATEWAY_ENDPOINT/MODEL upfront

Address Copilot review feedback: without explicit validation, an unset
or empty MLFLOW_GATEWAY_ENDPOINT would cause OpenAIChatClient to
silently fall back to OpenAI's public endpoint and forward prompts
there. Validate both env vars in main() and pass the resolved values
into both example functions, failing fast with a clear error message.
---
 .../mlflow_gateway_with_openai_chat_client.py | 38 +++++++++++++++----
 1 file changed, 30 insertions(+), 8 deletions(-)

diff --git a/python/samples/02-agents/providers/mlflow_gateway/mlflow_gateway_with_openai_chat_client.py b/python/samples/02-agents/providers/mlflow_gateway/mlflow_gateway_with_openai_chat_client.py
index acec68920e..18ab355b91 100644
--- a/python/samples/02-agents/providers/mlflow_gateway/mlflow_gateway_with_openai_chat_client.py
+++ b/python/samples/02-agents/providers/mlflow_gateway/mlflow_gateway_with_openai_chat_client.py
@@ -2,6 +2,7 @@
 
 import asyncio
 import os
+import sys
 from random import randint
 from typing import Annotated
 
@@ -42,6 +43,22 @@
 """
 
 
+def _require_env(name: str) -> str:
+    """Read a required env var; exit with a clear error if missing or empty.
+
+    Without this check, an empty MLFLOW_GATEWAY_ENDPOINT would cause
+    OpenAIChatClient to silently fall back to OpenAI's public endpoint and
+    forward prompts there.
+    """
+    value = os.getenv(name)
+    if not value:
+        sys.exit(
+            f"Error: {name} is not set. See the README in this folder for setup "
+            "instructions: https://mlflow.org/docs/latest/genai/governance/ai-gateway/"
+        )
+    return value
+
+
 # NOTE: approval_mode="never_require" is for sample brevity. Use "always_require" in production;
 # see samples/02-agents/tools/function_tool_with_approval.py
 # and samples/02-agents/tools/function_tool_with_approval_and_sessions.py.
@@ -54,14 +71,14 @@ def get_weather(
     return f"The weather in {location} is {conditions[randint(0, 3)]} with a high of {randint(10, 30)}°C."
 
 
-async def non_streaming_example() -> None:
+async def non_streaming_example(base_url: str, model: str) -> None:
     """Example of non-streaming response (get the complete result at once)."""
     print("=== Non-streaming Response Example ===")
 
     _client = OpenAIChatClient(
         api_key="unused",  # Provider keys are managed by the MLflow server
-        base_url=os.getenv("MLFLOW_GATEWAY_ENDPOINT"),
-        model=os.getenv("MLFLOW_GATEWAY_MODEL"),
+        base_url=base_url,
+        model=model,
     )
     agent = Agent(
         client=_client,
@@ -76,14 +93,14 @@ async def non_streaming_example() -> None:
     print(f"Agent: {result}\n")
 
 
-async def streaming_example() -> None:
+async def streaming_example(base_url: str, model: str) -> None:
     """Example of streaming response (get results as they are generated)."""
     print("=== Streaming Response Example ===")
 
     _client = OpenAIChatClient(
         api_key="unused",  # Provider keys are managed by the MLflow server
-        base_url=os.getenv("MLFLOW_GATEWAY_ENDPOINT"),
-        model=os.getenv("MLFLOW_GATEWAY_MODEL"),
+        base_url=base_url,
+        model=model,
     )
     agent = Agent(
         client=_client,
@@ -104,8 +121,13 @@ async def streaming_example() -> None:
 async def main() -> None:
     print("=== MLflow AI Gateway with OpenAI Chat Client Agent Example ===")
 
-    await non_streaming_example()
-    await streaming_example()
+    # Validate required env vars upfront so we never silently route to OpenAI's
+    # public endpoint if MLFLOW_GATEWAY_ENDPOINT is missing or empty.
+    base_url = _require_env("MLFLOW_GATEWAY_ENDPOINT")
+    model = _require_env("MLFLOW_GATEWAY_MODEL")
+
+    await non_streaming_example(base_url, model)
+    await streaming_example(base_url, model)
 
 
 if __name__ == "__main__":

From a009d68c06633b7fa1ba5822ccbbdcd542cca7c3 Mon Sep 17 00:00:00 2001
From: PattaraS
Date: Mon, 27 Apr 2026 15:18:48 +0800
Subject: [PATCH 3/3] Use uv for MLflow installation in README

---
 .../02-agents/providers/mlflow_gateway/README.md | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/python/samples/02-agents/providers/mlflow_gateway/README.md b/python/samples/02-agents/providers/mlflow_gateway/README.md
index b877be8f19..74b086d29d 100644
--- a/python/samples/02-agents/providers/mlflow_gateway/README.md
+++ b/python/samples/02-agents/providers/mlflow_gateway/README.md
@@ -16,13 +16,19 @@ All gateway features are configured through the MLflow UI. Your application code
 
 ## Prerequisites
 
-1. **Install MLflow**:
+1. **Install MLflow** (using [`uv`](https://docs.astral.sh/uv/), the package manager used throughout Agent Framework):
 
    ```bash
-   pip install 'mlflow[genai]'
+   uv pip install 'mlflow[genai]'
    ```
 
-2. **Start the MLflow server**:
+   Or run it directly with `uvx` (no install needed):
+
+   ```bash
+   uvx --from 'mlflow[genai]' mlflow server --host 127.0.0.1 --port 5000
+   ```
+
+2. **Start the MLflow server** (if you didn't use `uvx` above):
 
    ```bash
    mlflow server --host 127.0.0.1 --port 5000