From a30a10ba20ec82aec043ce1d1a448a57d7c88213 Mon Sep 17 00:00:00 2001
From: Eduard van Valkenburg
Date: Tue, 5 May 2026 10:42:38 +0200
Subject: [PATCH 01/20] Python: Channel spec (#5549)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* first iteration of channel spec

* added deny link setup

* clarify invocation hook role and dedupe ADR/spec

ADR 0026:
- Tighten Decision Outcome Summary so each concept is mentioned once; defer full definitions to the Terminology section.
- Update ChannelInvocationHook bullet to match the clarified gap #7 language (uniform ChannelRequest envelope, hook timing, illustrative examples).
- Drop Decision Drivers bullets that just restated Business Goals; cross-link to the goals section instead.
- Replace the More Information bullet list with a pointer to Non-Goals.

Spec 002:
- Trim requirement #21 to point at the canonical LinkPolicy section instead of restating the full contract.
- Add a #linkpolicy-and-trust_level subsection anchor for cross-refs.
- Trim the Terminology LinkPolicy entry's two-hosts caveat (canonical version stays in the Key Types section).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* updated adr and spec

* Update hosting channels ADR and spec

- Document FoundryHostedAgentHistoryProvider roundtrip of additional_properties namespaces via the agent_framework container key on stored OutputItems.
- Add Foundry storage gap subsection capturing the update_item service ask required for post-push delivery_tracking[] mutation.
- Triage open questions: 18 resolved (now in a Resolved Questions decisions log), 3 notes-updated, 6 unchanged. Capture spec-body follow-ups implied by the resolutions in a new Decisions-driven follow-ups subsection.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Refine hosting ADR + spec: A2A/MCP-tool channels, store-parameter matrix, open-question pass - Surface A2A and MCP-tool channels as explicitly designed-in but fast-follow work after the first Responses + Invocations + Telegram release. Updated ADR business goals, non-goals, and More Information; added spec reqs #25 (A2AChannel) and #26 (MCPToolChannel) under v1 Fast Follow; renumbered the WhatsApp/Teams entry to #27. - New 'The Responses store parameter' subsection in the spec: 2x3 destination matrix making explicit that 'store' has no canonical meaning at the hosted-agent layer — the developer decides what it maps to across service-side, hosted-agent storage, and caller-side. Includes design properties on forwarding-vs-mapping, per-deployment documentation responsibility, and richer storage vocabulary via OpenAI's extra_body. - Fixed contradicting spec text that previously claimed ResponsesChannel maps store=False to session_mode=disabled by default; updated channel options table, session_mode terminology entry, and Scenario 3 prose/comment to match the new model. - Renamed FoundryHistoryProvider -> FoundryHostedAgentHistoryProvider throughout the spec (9 occurrences) so the name reinforces the intended hosted-agent use case. - ADR open-questions pass: walked through all 15 entries with the user. 13 resolved (moved to a new 'Resolved Questions (decisions log)' table), 2 kept open with refined wording (Q6 'Channel' GA name, Q14 Responses WS subprotocol). Added a 'Decisions-driven follow-ups' bullet list capturing the spec-body / sample edits implied by the resolutions. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Hosting ADR + spec: rename Teams channel to Activity Protocol, add multi-user conversation design - Rename the planned Teams channel to ActivityChannel (package agent-framework-hosting-activity). 
Promoted to req #27 (v1 fast follow) alongside A2A and MCP-tool, with native translations from Activity Protocol objects to AF types so the contract is explicit rather than implicit through Invocations. Channel sits behind Azure Bot Service, which fronts Teams / Web Chat / Slack / etc. Naming reserves a TeamsChannel name for any future direct-to-Teams transport that bypasses Bot Service (now stretch req #28 with WhatsApp). ResponseTarget channel ids and JSON examples updated from "teams" to "activity". Appendix B updated to acknowledge that ActivityChannel deliberately reuses the Bot Service connector model (the no-connector stance applies to the rest of the channel set). - Add first-class design for multi-user surfaces (Telegram groups / supergroups / forum topics; Activity Protocol groupChat and team channels). Cleanly separate user identity (ChannelIdentity.native_id = from.id / from.aadObjectId) from conversation locator (ChannelRequest.conversation_id = chat.id (+ message_thread_id / replyToId)). New per-channel options: conversation_scope (per_user / per_user_per_conversation (default in groups) / per_conversation) and accept_in_group addressing rule (mention_only (default) / command_only / mention_or_command / all). Specifies originating reply must include conversation + thread locator, ChannelPush behavior in groups, link-ceremony privacy (challenges redirected to user DMs), and the Activity-channel mapping for personal / groupChat / channel conversationType plus Teams replyToId threading. Broadcast Telegram Channels and adaptive-card Invoke activity flows scoped as fast follow. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(hosting): rename RunHandle → ContinuationToken; HostStateStore (file-based v1); align agentserver dependency posture - Rename RunHandle → ContinuationToken (opaque URL-safe `token` field) throughout ADR + spec; update routes to /{continuation_token}; spec out equivalent continuation-token support for the Invocations channel (Q20 done). - Introduce HostStateStore as the single persistence seam for host-execution metadata (continuation tokens, identity-link grants, last-seen records). V1 default: FileHostStateStore (atomic JSON-per-record under ./.af-hosting/, per-namespace TTLs) — background runs and link grants now survive host restarts. InMemoryHostStateStore for tests; pluggable Cosmos / SQL / Redis remain v1 fast follow under req #23. Closes Q9, Q11, Q14. - Drop blanket "no agentserver dependency" claims. Hosting core is still independent of agentserver, but channel packages MAY consume lower-level building blocks (notably the Foundry response-store SDK that FoundryHostedAgentHistoryProvider builds on). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(hosting): swap Scenarios 6 and 7 so the linker comes before cross-channel continuity Scenario 6 (cross-channel continuity) previously forward-referenced Scenario 7 (linker) twice, since continuity depends on the link/merge ceremony. Invert the order so the linker scenario establishes the mechanism first and the continuity scenario builds on it. Update internal cross-references, the require_link section anchor, and Scenario 8's prerequisites/comment to match. Also tightened the new Scenario 7's closing note to point at HostStateStore (file-based default) for cross-host continuity, and dropped a stale MfaIdentityLinker reference from the linker variants paragraph (Q13 dropped MFA from phase 1). 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(hosting): rewrite Scenario 7 as trusted-relay + add ResponseTarget.identities The previous Scenario 7 (cross-channel chat continuity) implied two independent auto-issued isolation_keys would converge by themselves — they don't, that needs a linker. Replace with a more realistic and complementary scenario: a trusted server-side application backend exposes Responses + Telegram against the same agent and uses extra_body to carry app-internal identity hints (app_user_id, push_to_telegram_chat_id) that a Responses run_hook translates into both an isolation_key promotion and a push to a known Telegram chat. Includes a closing variant pointing back at Scenario 6's linker for the no-app-table flow. Adds the ResponseTarget.identities([ChannelIdentity(...)]) variant to the type table and req #12 to support 'caller already knows the channel-native recipient' delivery without going through the link store. Bypasses the link store but still consults LinkPolicy per delivery. Drops MfaIdentityLinker references from req #11, req #24, and the linker helpers table (Q13 had already dropped MFA from phase 1; the spec body just hadn't caught up). Marks ADR Q8 follow-up done. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(hosting): wire FileCheckpointStorage into Scenario 9 + show resume-from-checkpoint flow Scenario 9 now builds the workflow with a FileCheckpointStorage so executor frames are persisted across runs, and demonstrates how the run_hook surfaces a caller-supplied resume_from_checkpoint into request.attributes so the host's workflow dispatch can pass it to Workflow.run(checkpoint_id=...). Closing paragraph clarifies that CheckpointStorage is workflow-runtime state, kept structurally separate from HostStateStore and ContextProvider — three protocols that MAY share a backend but stay independently typed. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(hosting): emphasize result richness in Scenario 10 (channels are not limited to result.text) Add a 'Result is rich, not just text' callout under the channel-authoring sample. Inventories the typed Contents on the underlying AgentRunResult (TextContent, DataContent, UriContent, FunctionCallContent / FunctionResultContent, HostedFile/VectorStoreContent, UsageContent, TextReasoningContent, ErrorContent + additional_properties), the typed structured output via result.value, and shows concrete examples per channel shape: Telegram (MarkdownV2 + sendPhoto/sendAudio + inline keyboards), Responses (full content-list round-trip), chat UI (GFM/HTML + collapsible tool/reasoning panels), voice (TTS + earcons), typed RPC (result.value first). result.text is positioned as a convenience for single-string channels, not the contract. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * spec: add TeamsChannel (microsoft/teams.py) as fast-follow req #28 Add a Teams-native channel package built on the MIT-licensed microsoft/teams.py SDK as fast-follow alongside the generic ActivityChannel (req #27). Where ActivityChannel targets the generic Activity Protocol surface, TeamsChannel exploits Teams-specific affordances the generic protocol does not surface natively: Adaptive Cards (typed builder), streamed replies, AI-generated badge, feedback controls + form, suggested-prompt chips, inline citations, modal Dialogs, Message Extensions (action / search / link unfurling), proactive / targeted / threaded messages, and SSO via MSAL. Mounts the SDK's App into the host's Starlette app via a custom HttpServerAdapter; reuses the same host-tracked-session family as ActivityChannel (from.aadObjectId -> ChannelIdentity). The SDK already ships a 'Build an agent using Microsoft Agent Framework' guide so the integration story is direct. 
Renumber the WhatsApp / direct-to-Teams stretch item to req #29 and clarify its 'direct-to-Teams' placeholder is a future transport that bypasses both Bot Service and the teams.py SDK. Add the SDK to Dependencies & Commitment Status as a proposed runtime dep of agent-framework-hosting-teams. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * spec: clarify direct-to-Teams stretch as speculative (no Bot Service) Split the WhatsApp + direct-to-Teams stretch entry into two distinct items and reword the direct-to-Teams item to be honest about its current feasibility: - It MUST not rely on Azure Bot Service (otherwise it is just ActivityChannel / TeamsChannel under a different name). - No such transport is publicly available today: Graph chat APIs and microsoft/teams.py both ultimately route through Bot Service for the bot-as-conversation-participant pattern. - The slot is kept on the roadmap to preserve the naming line in case Microsoft ships a Bot-Service-free transport (native Teams REST/RPC, a Graph subscription strong enough to drive both inbound and outbound message flow, ...). - Reaffirm TeamsChannel (req #28) as the canonical Teams channel until then. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * spec: clarify TeamsChannel still rides on Bot Service in v1; add audience table Make explicit that TeamsChannel (req #28) uses Azure Bot Service in v1 — the microsoft/teams.py SDK is a higher-level Pythonic wrapper over the same Activity Protocol pipeline that ActivityChannel exposes raw. The difference is what the developer writes against, not the network path. A Bot-Service-free Teams transport is not currently possible and stays tracked as the speculative req #30. Add the ActivityChannel vs TeamsChannel audience comparison table to req #28 so the choice is obvious to readers: - ActivityChannel: maximum portability across all Bot Service-fronted channels. 
- TeamsChannel: Teams-first deployments wanting Cards / Dialogs / Message Extensions / citations / feedback / suggested prompts / SSO out of the box.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 docs/decisions/0026-hosting-channels.md   |  367 +++++
 docs/specs/002-python-hosting-channels.md | 1602 +++++++++++++++++++++
 2 files changed, 1969 insertions(+)
 create mode 100644 docs/decisions/0026-hosting-channels.md
 create mode 100644 docs/specs/002-python-hosting-channels.md

diff --git a/docs/decisions/0026-hosting-channels.md b/docs/decisions/0026-hosting-channels.md
new file mode 100644
index 00000000000..66610a14c41
--- /dev/null
+++ b/docs/decisions/0026-hosting-channels.md
@@ -0,0 +1,367 @@
+---
+status: proposed
+contact: eavanvalkenburg
+date: 2026-04-24
+deciders: eavanvalkenburg
+---
+
+# Agent Framework hosting core with pluggable channels
+
+## What are the business goals for this feature?
+
+Give Agent Framework app authors, in every supported language, one low-level hosting surface that can expose a single **hostable target** (an agent or a workflow) on **one or more channels** without requiring them to hand-build protocol routing or server glue per protocol. Channels include the Responses API, Invocations API, Telegram, future A2A, MCP-tool, Activity Protocol via Azure Bot Service (which fronts Teams, Web Chat, Slack, …), WhatsApp, optional future direct-to-Teams, and custom webhooks. The same surface must **also** let an end user start a conversation on one channel (e.g. Telegram on their phone) and seamlessly continue it on another (e.g. Teams at their desk via the Activity channel) against the same target and the same conversation history.
+ +This consolidates the protocol-specific hosting layers that exist today (in Python: `agent-framework-foundry-hosting`, `-ag-ui`, `-a2a`, `-devui`; in .NET: the analogous per-protocol hosting helpers) into a shared composable model where: + +- a host owns the application object and channels own protocol shape, +- the host's hostable target may be an **agent** (executed via the per-language agent execution seam) **or** a **workflow** (executed via the per-language workflow execution seam) — channels do not care which, because the channel's `run_hook` adapts the inbound `ChannelRequest` into the input shape the target needs, and +- session identity is **channel-neutral** — the host resolves a session from a channel-supplied `isolation_key` (e.g. a stable user identity) so two channels mounted on the same host can resolve to the **same** session for the same end user, and a shared session store extends that continuity across hosts and processes. +- channel-native identity is **mapped, not assumed** — every channel has its own user namespace (Telegram `chat_id`, Teams AAD object id, WhatsApp phone number, Slack user id, …). The host provides a first-class **identity resolver** seam that maps a channel-native identifier into the channel-neutral `isolation_key`, and a first-class **identity linker** seam that lets an end user **connect** a new channel to an existing `isolation_key` through a well-known mechanism (OAuth, MFA, signed one-time code, …) so cross-channel continuity is achievable without ad-hoc per-channel bookkeeping, and +- **response delivery is decoupled from request origin** — a target's response can be routed back to the **originating** channel (default), the user's **active** channel (the channel most recently observed for that `isolation_key`), a **specific** channel, **all linked** channels (fan-out), or **none** (background). 
Background/asynchronous runs are first-class: a channel can kick off a run, return a `ContinuationToken` to the caller, and the response is delivered when the user is next observed on any (or a chosen) channel — so a user can start a long task on Telegram and pick up the result on Teams. + +We know we're successful when: + +- after the target is created, a basic multi-channel sample requires only one host, channel objects, and one start call — no handwritten protocol routes and no per-protocol server bootstrap. The hosting core itself takes no dependency on the legacy protocol-specific hosts (e.g. Python's `agentserver`); individual channel packages MAY consume lower-level building blocks shipped in those packages where they ship reusable SDKs (e.g. the Foundry response-store SDK in `azure.ai.agentserver`), +- the same host construction works whether the target is an agent or a workflow — only the `run_hook` (channel-default or app-supplied) changes to adapt the input, +- a single host configured with two channels (e.g. Telegram + a future Activity Protocol channel — Teams via Azure Bot Service) can be exercised by one end user across both channels and observe one continuous conversation, **and** +- the same conceptual model applies in Python and .NET. + +## Problem Statement + +### How do developers solve this problem today? + +Today, every protocol surface is its own integration package with its own server. A developer who wants to expose one agent over both the Responses API and a webhook channel has to stand up two separate hosts and stitch them into one application by hand. In Python that means manually mounting two `agentserver`-based hosts into a Starlette app and calling `uvicorn.run(...)`. In .NET it means composing two protocol-specific hosting helpers into one `WebApplication` and wiring middleware twice. 
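The hand-stitching described above can be illustrated with a dependency-free sketch. It is only an approximation: plain ASGI callables stand in for the two protocol-specific hosts and for Starlette's mounting, and `make_protocol_app`, `mount`, and `call` are hypothetical helper names invented for this example, not part of any existing package.

```python
import asyncio


def make_protocol_app(label: str):
    """Stand-in for a protocol-specific host that exposes its own ASGI app."""
    async def app(scope, receive, send):
        body = f"{label} handled {scope['path']}".encode()
        await send({"type": "http.response.start", "status": 200, "headers": []})
        await send({"type": "http.response.body", "body": body})
    return app


def mount(routes: dict):
    """Hand-written prefix router: the glue an app author maintains today."""
    async def app(scope, receive, send):
        for prefix, sub_app in routes.items():
            if scope["path"].startswith(prefix):
                # Strip the mount prefix before delegating to the sub-app.
                child = dict(scope, path=scope["path"][len(prefix):] or "/")
                await sub_app(child, receive, send)
                return
        await send({"type": "http.response.start", "status": 404, "headers": []})
        await send({"type": "http.response.body", "body": b"not found"})
    return app


# Two separate hosts stitched into one application by hand.
combined = mount({
    "/responses": make_protocol_app("responses-host"),
    "/invocations": make_protocol_app("invocations-host"),
})


async def call(app, path: str) -> bytes:
    """Drive an ASGI app once and return the concatenated response body."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": "POST", "path": path}, receive, send)
    return b"".join(m.get("body", b"") for m in sent)


print(asyncio.run(call(combined, "/responses/v1")))  # b'responses-host handled /v1'
```

Every additional protocol adds another entry to the routing table plus its own lifecycle wiring, which is the per-protocol glue the hosting core is meant to absorb.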
+ +Adding a Telegram bot to the same agent today means leaving the hosting stack entirely: spinning up a separate process, installing a Telegram SDK, writing the polling/webhook loop, manually translating updates into agent calls, and wiring command handlers (`/start`, `/new`, `/cancel`, …) and native command registration (`set_my_commands(...)`) by hand — none of which is reusable across other message channels (Teams, WhatsApp, …) or across languages. + +### Why does this problem require a new hosting abstraction? + +The gap is between **owning a hostable target** (an agent or a workflow) and **operationalizing it on multiple channels**. Agent Framework already provides agents, workflows, sessions, run inputs, response/update streaming, and per-language execution seams (`SupportsAgentRun.run(...)` and the workflow execution seam in Python; `AIAgent.RunAsync(...)` and the workflow execution seam in .NET). What's missing is a generic host that: + +1. Owns one application object and one set of lifecycle hooks per language. +2. Lets channels contribute routes, middleware, commands, and startup/shutdown without protocol leakage into the host. +3. Standardizes how protocol requests become target invocations (input, options, session, streaming) and how target results flow back out — independent of whether the target is an agent or a workflow. +4. **Resolves a session from a channel-neutral `isolation_key`** so two channels mounted on the same host can converge on the same session for the same end user — enabling cross-channel chat continuity (start on Telegram, continue on Teams) without per-channel session bookkeeping. +5. **Bridges channel-native identities into the shared `isolation_key` namespace** — every channel has its own user identifier (Telegram `chat_id`, Teams AAD object id, WhatsApp phone, Slack user id). 
The generic host needs (a) an **identity resolver** seam that maps a channel-native id to an `isolation_key` for already-known users, and (b) an **identity linker** seam that lets an end user **connect** a new channel to an existing `isolation_key` through a well-known mechanism (OAuth, MFA, signed one-time code) — without each channel reinventing the linking flow. +6. Provides a first-class extension seam for webhook/message channels with native command catalogs (per PR #5393 Telegram sample). +7. Treats the **run hook** as the developer's runtime escape hatch over a uniform request envelope. Every channel translates its native protocol payload (Responses JSON body, Telegram update, Invocations request, …) into the same `ChannelRequest` shape — that uniformity is what lets one host front many channels with one target. The run hook runs **after** that channel-internal translation and **before** the target is invoked, receives the channel-built `ChannelRequest`, and returns a possibly-modified `ChannelRequest`. The same seam covers, for example: reshaping a free-form chat message into the typed input a workflow target requires, removing or adding fields on `ChatOptions` (e.g. dropping `temperature`/`store` that a particular target should never see, or injecting a default `model`), enforcing app policy (rejecting requests that omit a required option), or overriding `session_mode` / `response_target`. The list is illustrative, not exhaustive — anything the channel put on the `ChannelRequest` is fair game for the hook to validate, rewrite, or strip. +8. 
Treats **response delivery** as a first-class, configurable concern — by default the response goes back to the originating channel synchronously, but the host must support routing the response to a different channel (the user's most recently active channel, a specific channel, or all linked channels) and **background runs** where the request returns immediately with a `ContinuationToken` and the response is delivered later via a channel push when the user is next observed (or polled by the caller). +9. Applies the same conceptual model across language ecosystems so concepts, terminology, and behavior transfer between teams and docs. + +The current top-level protocol-specific hosts (e.g. `ResponsesAgentServerHost`, `InvocationAgentServerHost`) are valuable prior art but sit too high in the stack — they encode protocol ownership at the host level and are duplicated per language. The new generic core learns from their behavior without depending on those top-level wrappers; individual channel packages may still consume the lower-level SDKs that ship alongside (notably the Foundry response-store SDK). + +## Non-Goals / Relationship to existing hosting packages + +The hosting core is deliberately **not** a replacement for the existing protocol packages in their first form, and it is not a multi-agent router. It is a peer abstraction layer that lets future protocol packages share one host. + +| Dimension | Existing protocol packages | Hosting core | +|---|---|---| +| **Mental model** | One package = one protocol surface, owns its own server | One host owns the app; channels plug protocols in | +| **Scope** | Protocol-specific request/session/event mapping | Generic host + channel contract; protocol logic lives in channel packages | +| **Composition** | One protocol per process or per Mount | Many channels per host, shared middleware, lifecycle, session resolution | +| **Multi-agent** | Out of scope per package | **No.** One host = one agent. Future work, if desired. 
| +| **Cross-language** | Per language, per protocol | Same conceptual model in every implementing language | + +**Explicit non-goals:** + +- Migrating existing protocol packages (AG-UI, A2A, DevUI in Python; analogous .NET helpers) onto the new core in the first implementation. +- Standardizing a persistent session storage contract across all channels in the first phase. (Cross-channel continuity within one host is enabled by `isolation_key` resolution; cross-host/cross-process continuity requires the pluggable session store, listed as a fast follow.) +- Hosting multiple agents behind one router in this first design. +- Designing every detail of WhatsApp, the full Activity Protocol surface, or a future direct-to-Teams channel now (only Telegram is concretely targeted, informed by PR #5393; Activity Protocol via Azure Bot Service is designed-in fast follow alongside A2A and MCP-tool). Within Telegram and Activity, **broadcast Telegram Channels** (the read-only product) and **adaptive-card `Invoke` activity** flows are explicitly fast-follow scope; v1 ships group/supergroup/forum-topic and `personal` / `groupChat` / `channel` `conversationType` support. +- Shipping the **A2A** (agent-to-agent) and **MCP-tool** (exposing the agent as an MCP tool) channels in the first implementation. Both are explicitly **in scope for the overall design** — the host contract, `ChannelRequest` envelope, identity/session/response-target stack, and persisted delivery envelope must accommodate them as caller-supplied-session channels — but their concrete protocol bindings, route catalogs, and packages are **fast-follow work** after the first Telegram + Responses + Invocations release. +- Replacing protocol-specific serializers with one generic event model. +- Taking a runtime or package dependency on the legacy protocol-specific top-level hosts (e.g. `ResponsesAgentServerHost` / `InvocationAgentServerHost` in Python's `agentserver`) from the new hosting core. 
Channel packages MAY depend on lower-level building blocks shipped alongside those hosts where they provide reusable SDKs (notably the Foundry response-store SDK consumed by `FoundryHostedAgentHistoryProvider`). +- Forcing identical type names across languages — each language follows its own idioms while preserving the same concepts and terminology. + +**Boundary rule:** If you need protocol-specific event semantics, codecs, or signature validation, that lives in the channel package. The host owns the application object, lifecycle, session resolution, and the call into the agent's run/stream seam. + +## Decision Drivers + +These are the design principles applied on top of the [business goals](#what-are-the-business-goals-for-this-feature) above. + +- Keep the app author experience simple for the common case (one host, channels, one start call). +- Treat agents and workflows as peer hostable targets behind one host, so the same channel ecosystem (Responses, Invocations, Telegram, Activity, …) can serve either without rework. +- Preserve room for channel-specific capabilities (signature validation, conversations, streaming, native commands, action surfaces). +- Support message-channel capabilities — native commands, command menus, action surfaces — from the start. +- Support channels that need startup/shutdown behavior (long polling, platform-side command registration) in addition to routes. +- Use the existing protocol-specific implementations as prior art **without** taking a runtime dependency on them. +- Keep the new core protocol-agnostic. +- Align to the per-language agent **and workflow** execution seams rather than introducing a new contract for the target. +- Follow each language's idiomatic packaging conventions rather than growing a monolithic integration package. +- Avoid forcing migration of existing protocol packages as part of the first implementation. 
+- Keep the abstractions language-neutral so the same conceptual model can be implemented by Python, .NET, and future language ecosystems with idiomatic code. + +## Considered Options + +- Keep the current protocol-specific hosting packages only. +- Create one monolithic `hosting` package with the host and all channels built in. +- Create a new hosting core plus new channel packages, but reimplement the channel stack from scratch with no reference to the current protocol implementations. +- Create a new hosting core plus separate channel packages, informed by the current protocol-specific implementations but without depending on them. + +## Decision Outcome + +Chosen option: **Create a new hosting core plus separate channel packages, informed by the current protocol-specific implementations but without depending on them.** Apply the same conceptual model in Python and .NET, with idiomatic per-language API shapes. + +### Summary + +We will introduce a new hosting core distribution package per language. The full conceptual vocabulary is defined once in [Terminology](#terminology); this section calls out only the design decisions baked into each concept. + +- **Host** (`AgentFrameworkHost`) — owns the application object (Starlette in Python, ASP.NET Core / Kestrel in .NET), one **hostable target**, and a sequence of channels. Exposes the underlying app as the canonical portability surface and a `serve(...)`-style convenience for the common single-process case. **Named `AgentFrameworkHost` rather than `AgentHost` because the target is not restricted to agents.** +- **Hostable target** — may be either an **agent** (per-language agent execution seam) or a **workflow** (per-language workflow execution seam). The host detects the kind and dispatches; channels are unchanged. +- **Channel**, **`ChannelContext`**, **`ChannelRequest`**, **`ChannelSession`**, **`ChannelContribution`**, **`ChannelCommand`** — the channel-authoring surface. Defined in Terminology. 
+- **`ChannelRunHook`** — the developer's runtime escape hatch over the uniform `ChannelRequest` envelope. Channels translate their native protocol payload into `ChannelRequest`; the hook then runs **after** that translation and **before** target invocation, receiving and returning a `ChannelRequest`. Examples (illustrative): reshaping a chat message into a workflow's typed input, dropping/injecting `ChatOptions` fields, enforcing required options, overriding `session_mode` / `response_target`. +- **`IdentityResolver`** + **`IdentityLinker`** — the channel-neutral identity stack. Resolver maps channel-native ids to `isolation_key`; linker runs the **link/connect ceremony** (OAuth / MFA / signed one-time code) so a new channel can join an existing `isolation_key`. The host owns the routes and short-lived state the linker needs; channels surface entry points. Channels may declare `require_link=True` to enforce "authenticate before chatting", and the linker stores verified IdP claims (e.g. Entra ID `oid`) so subsequent channels that supply the same claim are auto-merged onto the same `isolation_key` without a second ceremony. +- **`ResponseTarget`** + **`ChannelPush`** + **`ContinuationToken`** + **active channel** — the response-delivery stack. `ResponseTarget` decouples *where* a response is delivered from *where* it originated; `ChannelPush` is the optional channel capability used for non-`originating` delivery; `ContinuationToken` makes background runs first-class with a stable id and status; the host tracks last-seen `(isolation_key, channel)` to resolve `response_target="active"`. +- **`confidentiality_tier`** + **`LinkPolicy`** — the multi-tier-on-one-host stack. `confidentiality_tier` is an opaque per-channel label; `LinkPolicy` is the host-level decision over which channel pairs may share an `isolation_key` (link) and which may push to one another (deliver). 
Built-in `DenyAllLinks` enforces "share a target, never share a session"; running multiple hosts is always a valid alternative. +- **Persisted delivery envelope** — assistant messages stored by the host carry a `deliveries[]` array on `Message.additional_properties["hosting"]` capturing the resolved destination set (per-destination `status`, `attempts`, timestamps, `last_error`, channel-issued `delivery_id`). This is the data model for **audit** ("which destinations did this response actually reach?") and for **replay** ("Telegram was offline; resend to that user when it comes back"). The replay *mechanism* is out of scope for v1; the data model is committed to so providers (especially the Foundry-backed Responses store) and operators can build on it. Live in-place updates require an opt-in `SupportsDeliveryTracking` provider capability; append-only providers degrade to write-once at completion. +- **Caller-supplied vs. host-tracked session carriage** — channels split into two families based on whether the upstream protocol carries a per-conversation key on every request. *Caller-supplied* channels (Responses' `previous_response_id`, Invocations, A2A, MCP) parse it into `ChannelSession.key` and let the caller branch threads by sending fresh ids. *Host-tracked* channels (Telegram, Activity Protocol via Azure Bot Service — Teams/Web Chat/Slack/…— WhatsApp) carry only a stable identity and rely on the host's per-`isolation_key` session alias plus a `host.reset_session(...)` `/new`-style command. The split is invisible to the agent target and explains why `reset_session` and aliasing exist at all (host-tracked channels have no other way to start a fresh thread). Anonymous vs. identified is an orthogonal axis; identity is supplied by the channel, the resolver, or both. 
+- **Multi-user surfaces are first-class.** Telegram groups, supergroups, forum topics, and Activity Protocol multi-user `conversationType`s (`groupChat`, `channel`) are designed-in from v1 — not retrofitted. The contract enforces a clean separation of **user identity** (`ChannelIdentity.native_id` = `from.id` / `from.aadObjectId`) and **conversation locator** (`ChannelRequest.conversation_id` = `chat.id` (+ optional `message_thread_id` / `replyToId`)). Channel implementations expose a `conversation_scope` option (`per_user`, `per_user_per_conversation` (default in groups), `per_conversation`) and an `accept_in_group` addressing rule (`mention_only` (default), `command_only`, `mention_or_command`, `all`) so the bot does not respond to every message in a group and so a single user's group context does not leak into their DM by default. Linker challenge messages (OAuth URL / one-time code) MUST redirect to the user's DM in group contexts. +- **Built-in channels** — own their protocol-defined relative routes under default mount roots (`/responses/v1`, `/invocations/invoke`, `/telegram/webhook`) without the app author spelling those out. + +Channel implementations live in **separate distribution packages**, one per channel, with public surfaces kept stable per language. 
+ +| Concept | Python (proposed) | .NET (proposed) | +|---|---|---| +| Core | `agent-framework-hosting` → `agent_framework.hosting` | `Microsoft.Agents.AI.Hosting` | +| Responses channel | `agent-framework-hosting-responses` → `agent_framework.hosting.ResponsesChannel` (lazy) | `Microsoft.Agents.AI.Hosting.Responses` | +| Invocations channel | `agent-framework-hosting-invocations` → `agent_framework.hosting.InvocationsChannel` (lazy) | `Microsoft.Agents.AI.Hosting.Invocations` | +| Telegram channel | `agent-framework-hosting-telegram` → `agent_framework.hosting.TelegramChannel` (lazy) | `Microsoft.Agents.AI.Hosting.Telegram` | + +Each language follows its own conventions: + +- Python keeps the public import path stable at `agent_framework.hosting` via lazy imports. +- .NET keeps the public namespaces stable per package, following existing `Microsoft.Agents.AI.*` conventions. + +The new hosting core and its channel packages **must not** take a dependency on legacy protocol-specific hosts; those are prior art and parity reference only. + +The initial design target, in every implementing language, is: + +- any execution-seam-compatible target (not just the concrete `Agent`/`ChatClientAgent`), +- built-in channel designs for Responses and Invocations, +- a documented authoring model for webhook/message channels, including a first detailed Telegram design, +- conceptual alignment with existing protocol packages but no implementation or migration requirement for those in the first phase. + +### Conceptual API shape + +The top-level user experience should look the same conceptually in every language: compose one host with one agent and a list of channels, then start it. The channel-authoring seam should follow each language's idioms while preserving the same concepts. 
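As a rough illustration of "one host, one target, a list of channels", the sketch below aggregates route contributions from two channels into a single routing table. It is a deliberately reduced stand-in, not the proposed API: `SketchHost`, `ChannelContext`, the two `Sketch*Channel` classes, and the string-in/string-out target are all hypothetical names invented for this example, and real channels would contribute ASGI routes, middleware, commands, and lifecycle hooks rather than plain callables.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol

Target = Callable[[str], str]   # hypothetical stand-in for an agent or workflow
Handler = Callable[[str], str]  # hypothetical stand-in for a route handler


@dataclass
class ChannelContext:
    """Hypothetical context handed to channels; only exposes the target runner."""
    run: Target


@dataclass
class ChannelContribution:
    """Reduced shape for this sketch: routes only, keyed by path."""
    routes: dict[str, Handler] = field(default_factory=dict)


class Channel(Protocol):
    def contribute(self, context: ChannelContext) -> ChannelContribution: ...


class SketchResponsesChannel:
    def contribute(self, context: ChannelContext) -> ChannelContribution:
        # The channel owns its default route; the app author does not spell it out.
        return ChannelContribution(routes={"/responses/v1": context.run})


class SketchInvocationsChannel:
    def contribute(self, context: ChannelContext) -> ChannelContribution:
        return ChannelContribution(routes={"/invocations/invoke": context.run})


class SketchHost:
    """Aggregates every channel's contribution into one routing table."""

    def __init__(self, target: Target, channels: list[Channel]) -> None:
        context = ChannelContext(run=target)
        self.routes: dict[str, Handler] = {}
        for channel in channels:
            for path, handler in channel.contribute(context).routes.items():
                if path in self.routes:
                    raise ValueError(f"duplicate route: {path}")
                self.routes[path] = handler

    def dispatch(self, path: str, body: str) -> str:
        return self.routes[path](body)


def shouting_agent(text: str) -> str:
    """Toy target used only to make the sketch observable."""
    return text.upper()


host = SketchHost(
    shouting_agent,
    channels=[SketchResponsesChannel(), SketchInvocationsChannel()],
)
```

Both routes end up fronting the same target, which is the property the host composition is meant to guarantee; the duplicate-route check is an assumption of this sketch rather than documented host behavior.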
+ +| Concept | Python idiom | .NET idiom | +|---|---|---| +| Define a host | `AgentFrameworkHost(target, channels=[...])` (target = agent or workflow) | `AgentFrameworkHostBuilder` / `AddAgentFrameworkHost(target, ...)` on the host builder | +| Canonical app surface | `host.app` (Starlette `Starlette`) — supports HTTP **and** WebSocket scopes via ASGI | `WebApplication` (ASP.NET Core) — supports HTTP **and** WebSocket via `app.UseWebSockets()` / `MapWebSocket(...)` | +| Convenience start | `host.serve(host=, port=)` (lazy `uvicorn`) | `host.RunAsync()` (Kestrel) | +| Channel contract | `Channel` Protocol with `contribute(context) -> ChannelContribution` | `IChannel` interface with `Contribute(IChannelContext)` returning `ChannelContribution` | +| Per-request hook | `ChannelRunHook = Callable[..., ChannelRequest \| Awaitable[ChannelRequest]]` invoked as `hook(request, *, target=..., protocol_request=...)` | `Func>` / delegate with named extras | +| Identity resolver | `IdentityResolver = Callable[[ChannelIdentity], str \| None]` | `IIdentityResolver` (returns `isolation_key`) | +| Identity linker | `IdentityLinker` Protocol with `begin(...)` / `complete(...)` plus `routes()` for callback / verification endpoints | `IIdentityLinker` interface with begin/complete + route contributions | +| Response routing | `ChannelRequest.response_target = ResponseTarget.originating \| .active \| .channel("activity") \| .all_linked \| .none`; channels expose `ChannelPush` if they can deliver proactively | `ChannelRequest.ResponseTarget` discriminated union; `IChannelPush` interface for proactive delivery | +| Background runs | `ContinuationToken` returned by `host.run_in_background(request)`; channels may return it as their protocol response and/or expose a poll route | `ContinuationToken` record + `HostStateStore` for persistence (file-based default; pluggable Cosmos / SQL / Redis) | +| Confidentiality tier on a channel | `Channel.confidentiality_tier: str \| None` (opaque) | 
`IChannel.ConfidentialityTier { get; }` (opaque string) | +| Link / delivery policy | `LinkPolicy = Callable[[LinkPolicyContext], bool]` with built-ins `AllowAllLinks`, `SameConfidentialityTierOnly`, `ExplicitAllowList`, `DenyAllLinks` | `ILinkPolicy.IsAllowed(LinkPolicyContext)` with the same set of built-in implementations | +| Command descriptor | `ChannelCommand` dataclass | `ChannelCommand` record | +| Lifecycle | `on_startup` / `on_shutdown` callables | `IHostedService` integration / explicit lifecycle delegates | + +Built-in channels own the default mapping from each protocol's request model into a `ChannelRequest`, **and** expose a per-request invocation-hook seam so app authors can validate or rewrite invocation behavior before the host invokes the agent. + +The full Python API surface — exact types, fields, default routes, code samples — is specified in the companion Python spec. A future .NET spec captures the .NET-idiomatic API surface for the same model. + +## Terminology + +These terms are language-neutral and shared between Python and .NET implementations. Each language realizes them with idiomatic types and naming. + +- **Host**: The object that owns one application, one execution-seam-compatible target, and a sequence of channels. Provides the underlying app object (canonical portability surface) and a convenience start method. +- **Channel**: A pluggable component that contributes routes (HTTP and/or WebSocket), middleware, commands, and lifecycle hooks to a host. One channel = one external protocol surface. Used interchangeably with "head" in earlier discussions; **Channel** is the canonical name. +- **`ChannelRequest`**: The host-neutral, normalized invocation envelope produced by a channel before the host invokes the agent. Carries `input`, `options`, `session` hint, `session_mode`, and channel-specific `attributes`. 
Also carries a small set of **typed slots** for protocol-extension data so multiple event-rich channels (AG-UI today, future custom front-ends) can settle on a shared shape rather than smuggling fields through `attributes`: `client_state` (mutable per-request state object), `client_tools` (frontend tool catalog the agent should see but not execute), and `forwarded_props` (pass-through bag for resume/command/HITL payloads). All three are optional and channel-defined in shape; the host treats them as opaque. +- **`ChannelSession`**: A small session hint with a stable lookup key, an optional protocol-visible conversation/thread identifier, and an opaque `isolation_key`. The host resolves it into a framework session; storage specifics are deferred. +- **`isolation_key`**: An opaque partition boundary aligned with hosted-agent terminology — may represent a user, tenant, chat, or other scope without baking direct identity semantics into the generic host. +- **Channel-native identity**: The **user/account** identifier the channel observes from its own platform (Telegram `from.id`, Teams `from.aadObjectId`, WhatsApp phone number, Slack user id). Always per-channel; never assumed to align across channels. Distinct from the **conversation locator** (e.g. Telegram `chat.id`, Teams `conversation.id` + optional `replyToId`) which lives on `ChannelRequest.conversation_id` — in multi-user surfaces (Telegram groups, Teams group chats and team channels) the two never coincide, and the spec defines a `conversation_scope` knob (`per_user`, `per_user_per_conversation` (default in groups), `per_conversation`) plus default `mention_only` addressing so the bot does not respond to every message in a group. +- **Identity resolver**: Host-level seam that maps a channel-native identity into an `isolation_key`. 
Default behavior **auto-issues and persists** a fresh, stable `isolation_key` on first contact per `(channel, native_id)` so every end user automatically gets a per-user partition without app code; linking merges the second channel's auto-issued key onto the first channel's existing key. Apps that already own an identity namespace can supply a custom resolver that returns those values directly. +- **Identity linker**: Host-level seam that runs a connect ceremony — typically OAuth, MFA, or a signed one-time code — to associate a new channel-native identity with an existing `isolation_key`. Channels expose entry points (e.g. a `/link` command or button); the host owns the ceremony's routes and short-lived state. Mechanism (OAuth provider, MFA factor, code transport) is pluggable; the contract is not. +- **`ResponseTarget`**: Per-request directive on `ChannelRequest` that controls **where** the response is delivered: `originating` (default), `active` (the user's most recently observed channel), a specific channel, a list of channels, `all_linked`, or `none` (background-only). Independent of `session_mode`. When the target differs from the originating channel, delivery uses the destination channel's `ChannelPush` capability. +- **`ChannelPush`**: Optional channel capability for **proactive** outbound delivery (proactive Telegram message, Activity Protocol proactive message via Azure Bot Service, webhook callback, SSE broadcast). Channels that don't implement it cannot be the destination of a non-`originating` `ResponseTarget`. +- **Active channel**: The channel most recently observed for a given `isolation_key`. The host tracks last-seen `(isolation_key, channel)` so `response_target="active"` resolves to whichever channel the user is currently using. +- **`confidentiality_tier`** (channel-level): An opaque label declared on a channel (`"corp"`, `"public"`, `"internal"`, …) consumed by the host's `LinkPolicy`. 
Two channels with different confidentiality tiers can share an agent target on one host while remaining session-isolated. +- **`LinkPolicy`**: Host-level decision over which channel pairs may share an `isolation_key` (link) and which channel pairs may be `ResponseTarget` source/destination for one another (deliver). Built-in variants: allow-all (default), same-tier-only, explicit allow-list, deny-all (the explicit "no cross-channel continuity" mode). Running multiple hosts is always a valid alternative; the policy exists for cases where one shared host with policy-enforced isolation is preferred. +- **`ContinuationToken`**: First-class artifact for background/asynchronous runs. Carries an opaque, URL-safe `token`, current status (`queued` | `running` | `completed` | `failed`), and the resolved `isolation_key`. Channels may return it directly in their protocol response (e.g. an Invocations 202 with the token plus a polling URL) so the caller can poll later, while the host also pushes the result to the configured response target when ready. Persisted via the host-level `HostStateStore` (file-based default in v1) so background runs survive host restarts. +- **`session_mode`**: Per-request directive (`auto` | `required` | `disabled`) that controls whether the host resolves a session before invoking the agent. Lets channels honor protocol semantics like Responses `store=False` and lets app authors enforce extra policy. +- **`ChannelContribution`**: What a channel returns from its `contribute(...)` method — routes, middleware, commands, and startup/shutdown lifecycle hooks. The host aggregates contributions into one application. +- **`ChannelCommand`**: A transport-neutral command descriptor. Message channels project these into native command surfaces — Telegram bot commands, future Activity Protocol slash commands / adaptive cards, WhatsApp menus. +- **`ChannelRunHook`**: Per-request callable on built-in channels. 
Runs after the channel's default `ChannelRequest` is produced, before session resolution. The escape hatch for forcing or forbidding session use, requiring extra options, or adapting to targets like `A2AAgent`. +- **Native command registration**: The startup-time projection of `ChannelCommand` metadata into a platform's native command catalog (e.g. Telegram `set_my_commands(...)`). +- **Hostable target**: The executable object the host fronts — either an **agent** (invoked via the agent execution seam) or a **workflow** (invoked via the workflow execution seam). The host detects the kind and dispatches to the appropriate runner; channels remain unchanged. +- **Execution seam**: The framework's existing per-language invocation contracts — for agents, `SupportsAgentRun.run(...)` in Python and `AIAgent.RunAsync(...)` in .NET; for workflows, the equivalent per-language workflow execution seam. The host requires one of these from the hosted target. +- **`HostedStreamResult.raw_events`**: Optional passthrough seam onto the underlying agent event stream **before** update normalization, for channels whose protocol carries domain events the framework does not model (e.g. AG-UI's `StateSnapshotEvent` / `StateDeltaEvent` / `ToolCallStartEvent`). Channels that consume `raw_events` bear responsibility for the full event translation; the request still flows through `context.stream(...)` so session resolution, identity, push, and policy continue to apply. The high-level normalized `updates` stream remains the happy path for Responses, Invocations, Telegram, and most channels. +- **Per-conversation storage seam**: One public seam — `ContextProvider`. Messages flow through `HistoryProvider` (its canonical subclass); non-message per-thread state for event-rich channels (e.g. AG-UI `client_state`) flows through a channel-owned `ContextProvider` subclass that writes into the same per-source state slot. 
**No parallel `StateProvider` Protocol is introduced.** Host-level pluggable state (`ContinuationToken`s, identity-link grants, last-seen records) and workflow `CheckpointStorage` are deliberately separate seams because the data shapes are structurally different; all three MAY be backed by the same physical store. Per-request transport state (response ids, platform isolation keys, future signals) flows from channels via `ChannelRequest.attributes` into `ContextProvider.bind_request_context(**attrs)`, so providers consume backend-specific request signals without app authors having to wrap the host's ASGI app or install middleware. + +## Consequences + +- Good, because app authors get one consistent low-level hosting story for single- and multi-channel scenarios in each supported language. +- Good, because channel packages can stay opinionated about protocol payloads and capabilities without pushing those semantics into the core. +- Good, because the existing protocol-specific implementations provide proven prior art and behavioral guidance. +- Good, because the design supports webhook/message channels that do not look like OpenAI or Foundry APIs. +- Good, because command-capable message channels such as Telegram are first-class channels rather than special-case samples. +- Good, because architectural portability stays at the **standard web-application object** level (ASGI app in Python, `WebApplication` in .NET), so the host is not fundamentally coupled to any one server implementation even when a `serve(...)` convenience uses one. +- Good, because channels can ship sensible invocation defaults while still giving app authors a clear place to enforce extra policy or adapt to different agent implementations (e.g. `A2AAgent`). +- Good, because cross-channel chat continuity for one end user is achievable in the first phase whenever channels can produce a stable `isolation_key`, without requiring any new cross-package storage contract. 
+- Good, because the same conceptual model is shared across languages — concepts, terminology, and behavior transfer between Python and .NET teams and docs. +- Bad, because we introduce new package and namespace surface area that must be versioned and documented in each language. +- Bad, because we still need to reimplement the needed behavior in Agent Framework-owned code per language. +- Bad, because there will be a temporary overlap with the existing protocol-specific hosts until the new channel packages are implemented and stabilized. +- Neutral, because existing protocol packages remain outside the first implementation scope even though the model keeps a path open for later convergence. + +## Validation + +The decision is validated when, in each implementing language: + +1. a one-channel Responses sample and a two-channel Responses + Invocations sample can be expressed with one host, default route layouts under `/responses/v1` and `/invocations/invoke`, and no handwritten protocol routing, +2. a Responses channel by default forwards official request parameters like `temperature` into agent options and maps `store=False` into disabled session use, +3. app authors can override that default per request with a run hook that validates or rewrites the final `ChannelRequest` (for example requiring `temperature`, ignoring `store`, or adapting for `A2AAgent`), +4. a Telegram-style message channel can express command metadata, command registration, and either webhook or polling lifecycle behavior through the new channel contract, +5. a custom webhook/message channel can be authored against only the new channel contract plus the language's web-framework primitives and lifecycle hooks, +6. two channels mounted on the same host (e.g.
Telegram + a future Activity Protocol channel — Teams via Azure Bot Service) configured with a stable per-user `isolation_key` resolve to the same session for the same end user, so a conversation started on one channel can be continued on the other against the same conversation history, +7. an end user who is known on one channel can **link a second channel to the same `isolation_key`** through a host-provided ceremony (OAuth, MFA, or a signed one-time code) without each channel reinventing the linking flow, and subsequent requests from the linked channel resolve to the same session as the original channel, +8. a request submitted on one channel can opt into **delivery on a different channel** — `response_target="active"` (whichever channel the user is currently using), a specific channel id, all linked channels, or `none` (background only) — using the destination channel's `ChannelPush` capability, without the originating channel having to know how the destination delivers, +9. **background runs are first-class**: a channel can submit a request that returns a `ContinuationToken` immediately and the response is later delivered both via channel push (when the user is next observed on the configured target channel) and via a poll route the caller can hit with the token, +10. the **same host construction** can front either an agent or a workflow target — the channel ecosystem (Responses, Invocations, Telegram, …) is unchanged, and only the `run_hook` (channel-default or app-supplied) differs to adapt the inbound `ChannelRequest` into the input shape the target requires, +11. a host configured with at least the Responses and Invocations channels can be packaged into a container image whose runtime contract (exposed routes, request/response shapes, health/lifecycle behavior) is **compatible with the Hosted Agents platform**, so the same image can be deployed to that platform without protocol shims, +12. 
a channel can contribute a **WebSocket endpoint** alongside its HTTP routes through the same `Channel` contract, the host's app object exposes it through the standard ASGI / ASP.NET Core WebSocket scope, and the built-in Responses channel exposes a WebSocket transport (default `/responses/ws`) carrying the same Responses request/event model as its HTTP+SSE transport — so the host is forward-compatible with the OpenAI Responses WebSocket transport without changing the hosting contract, +13. a host can **mix channels of different confidentiality tiers** under a `LinkPolicy` so e.g. a corporate-tier channel (Teams) and a public-tier channel (Telegram) share one agent target without sharing a session, cross-tier link attempts are refused with a typed error, cross-tier `ResponseTarget` deliveries are dropped, and the same outcome is reachable by simply running two separate hosts (validating that the policy is a convenience, not a load-bearing mechanism), and +14. the first Responses and Invocations implementations achieve parity with the important behavior of the current protocol-specific hosts without introducing a runtime dependency on them or leaking protocol-specific request models into the hosting core. + +## Pros and Cons of the Options + +### Keep the current protocol-specific hosting packages only + +- Good, because no new package or abstraction needs to be introduced. +- Good, because each protocol can move independently. +- Bad, because users still cannot host one agent on multiple channels through one shared host. +- Bad, because request/session/event bridging keeps being rebuilt at the protocol layer. +- Bad, because webhook/message channels still have no natural home. +- Bad, because the same gap exists in every language with no shared conceptual model. + +### One monolithic `hosting` package with all channels built in + +- Good, because discovery is straightforward. +- Good, because cross-channel refactoring is simpler inside one package. 
+- Bad, because every app pays the dependency and maintenance cost of every channel. +- Bad, because lifecycle and stability become coupled across unrelated channels. +- Bad, because it does not fit either ecosystem's subpackage direction. + +### New hosting core plus new channel packages, reimplemented without reference to current hosting implementations + +- Good, because the abstraction boundary can be kept very clean. +- Good, because package ownership is clear. +- Bad, because it ignores useful prior art in the current hosting implementations. +- Bad, because it increases implementation cost and migration risk. +- Bad, because it makes early channel parity harder. + +### New hosting core plus separate channel packages, informed by current protocol-specific implementations + +- Good, because it gives us a reusable host abstraction without discarding what we learned from current protocol work. +- Good, because the core stays protocol-agnostic while channel packages remain Agent Framework-owned and dependency-free with respect to the legacy protocol-specific hosts. +- Good, because it gives future channels a deeper seam than today's top-level host wrappers. +- Good, because the conceptual model can be applied uniformly in Python and .NET. +- Neutral, because some implementation details may look similar to the current hosts when they are solving the same problem. +- Bad, because the design team must still curate the boundary carefully to avoid copying protocol-specific assumptions into the generic host. + +## Open Questions + +| # | Question | Notes | +|---|---|---| +| 6 | Is "Channel" the GA name in both languages? "Head" was used interchangeably during design discussions. | Use "Channel" for now in spec, ADR, samples, and sub-package names. Other names remain on the table; revisit before public docs in either language. 
| +| 14 | For the Responses WebSocket transport, what subprotocol identifier and auth carrier should the channel adopt — `Authorization` header on the `Upgrade`, a `Sec-WebSocket-Protocol` token, or a query-string-bound short-lived token? | Wait for the upstream OpenAI Responses WS spec to land. The channel codec is intentionally swappable (the host contract does not depend on the WS framing) so the channel package can track upstream changes without touching the host. Document the swappable-codec property explicitly in the spec. | + +## Resolved Questions (decisions log) + +| # | Question | Decision | +|---|---|---| +| 1 | Final distribution package and namespace names per language. | Accept the proposed Python distribution + import names (`agent-framework-hosting` → `agent_framework.hosting`, plus per-channel `agent-framework-hosting-{responses,invocations,telegram}`). Keep the proposed .NET namespaces (`Microsoft.Agents.AI.Hosting{,.Responses,.Invocations,.Telegram}`) as the working target. | +| 2 | How tightly do Python and .NET API names need to match? | Keep concepts and terminology identical across languages; allow idiomatic naming differences (e.g. `serve` vs `RunAsync`). | +| 3 | Should generic auth helpers (HMAC signature, bearer token) live in core, in optional shared helpers, or per channel? | Per-channel auth + host-level middleware composition (current draft). No separate shared-helpers package in v1. **Cross-check the matching decision in the Python spec.** | +| 4 | Should a later phase define a pluggable session store interface, and should it be cross-language or per-language? | Per-language interface (idiomatic per ecosystem). Cross-language compatibility is **not** a v1 goal; revisit if/when concrete demand emerges. | +| 5 | Should the host support multi-target hosting (one host fronting a router across multiple agents/workflows)? | **No.** One host = one target. External routers compose multiple single-target hosts (e.g. 
via Starlette mount in Python, equivalent in .NET). Confirms the existing non-goal. | +| 7 | Should command scopes / projection metadata (private vs group, per-locale descriptions) become first-class on `ChannelCommand`? | Add **optional** `scopes` and `locales` fields on `ChannelCommand`. Channels are free to ignore them. Keeps the cross-channel surface lean while letting Telegram (and the future Activity Protocol channel) project the metadata into their native command catalog. | +| 8 | Which identity-linking mechanisms ship in the first phase? | Ship two first-party helpers in v1 fast-follow: **Entra OAuth** (preset on `OAuthIdentityLinker`) and **`OneTimeCodeIdentityLinker`** (cross-channel code exchange). **Drop `MfaIdentityLinker`** from the v1 fast-follow list. The generic `IdentityLinker` contract still admits any other linker app authors want to write. | +| 9 | Where do issued link grants live? | **File storage for v1**, leveraging Hosted Agents' isolated, persistent per-instance file storage. Resolved together with Q11. | +| 10 | Should the identity resolver be invoked per channel or once on the host with `(channel_id, native_id)`? | **Host-level resolver receiving `(channel_id, native_id)`** so cross-channel decisions stay in one place. Per-channel overrides remain a future option if real cases emerge. | +| 11 | Where does the continuation-token store live? At-rest format and TTL? | Same as Q9 — **file storage for v1** (`FileHostStateStore` under `./.af-hosting/continuations/`, atomic JSON-per-token writes, 24h default TTL on completed entries). Shares the host-level `HostStateStore` contract with link grants and last-seen records. Pluggable Cosmos / SQL / Redis adapters tracked in spec req #23. | +| 12 | Contract for `ChannelPush` failure (offline destination, opt-out, expired token)? | Default: **fall back to the originating channel**, recorded on the persisted `deliveries[]` array with telemetry. Per-request override via `run_hook`. 
(This already matches the spec; cross-check the wording.) | +| 13 | Should `response_target="active"` use a time window? Behavior on expiry? | Yes — configurable `active_window_seconds` on the host (suggested default **300 s**). On expiry, fall back to `originating`, then to `all_linked`. Recorded on `deliveries[]`. Per-request override via `run_hook`. | +| 15 | Should `Channel.confidentiality_tier` stay opaque or become an ordered enum? | **Keep as opaque string.** Apps define their own taxonomy. Built-in policies do equality / set membership checks only — no ordered-comparison policy is shipped. | + +## Decisions-driven follow-ups + +These are spec-body / sample / code edits implied by the resolutions above, **out of scope for this ADR pass** but tracked here so they aren't lost: + +- **Q3** — cross-check the Python spec's auth-helpers stance against the resolved "per-channel + host middleware" decision; reconcile any drift. +- **Q7** — spec, `ChannelCommand` reference, and the Telegram channel design need optional `scopes` and `locales` fields with clear "channels free to ignore" semantics. +- **Q8** — ✅ Done in spec rev. Req #24 lists only `OAuthIdentityLinker` and `OneTimeCodeIdentityLinker`; the linker-helper table and the OAuth scenario no longer reference `MfaIdentityLinker`. +- **Q9 + Q11** — ✅ Resolved in spec rev. Spec req #23 now names the seam **`HostStateStore`** with a v1 default of `FileHostStateStore` (atomic JSON writes under `./.af-hosting/`), so continuation tokens, link grants, and last-seen records all survive single-node restarts. Pluggable Cosmos / SQL / Redis adapters remain v1 fast follow. +- **Q12** — verify the spec's `ChannelPush` failure narrative includes "recorded on `deliveries[]`" alongside "telemetry warning"; tighten if needed. +- **Q13** — add `active_window_seconds` (default 300 s) to the host config surface and document the `originating` → `all_linked` fallback chain. 
+- **Q14** — explicitly document the **swappable WS codec** property in the Responses channel section (host contract does not depend on the framing) so the spec stays valid as upstream OpenAI evolves. +- **Q15** — confirm the spec consistently treats `confidentiality_tier` as an opaque string and that no built-in policy assumes an ordered hierarchy. + +## More Information + +See [Non-Goals](#non-goals--relationship-to-existing-hosting-packages) for what this ADR explicitly does **not** require in the first phase. + +The Telegram sample proposed in PR #5393 is prior art for native command catalogs and for channels that need startup/shutdown lifecycle behavior beyond plain route registration. The same shape is expected to inform the future Activity Protocol channel (Teams/Web Chat/etc. via Azure Bot Service) and a future WhatsApp channel in both languages. + +**Designed-in followup channels.** Three further channels are explicitly part of the overall design but are scheduled as fast-follow work after the first Responses + Invocations + Telegram release: + +- **A2A channel** — exposes the hostable target over the Agent-to-Agent protocol so other agents can consume it as a peer. Fits the existing **caller-supplied session** family (alongside Responses and Invocations): A2A's per-conversation identifier is parsed into `ChannelSession.key`, the calling agent's identity (e.g. its A2A agent card / signed JWT) flows through the standard `IdentityResolver` seam, and structured replies fit the existing `ChannelRequest` / `ResponseTarget` envelope. No new host primitives are required to support it; the work is the protocol binding and the package. +- **MCP-tool channel** — exposes the hostable target as a **Model Context Protocol tool** so MCP clients (other agents, IDE tooling, …) can invoke it. 
Same caller-supplied-session family: the MCP `tools/call` request carries the conversation key into `ChannelSession.key`, the MCP client identity flows through `IdentityResolver`, and the tool result is the target's response. Streaming MCP tools map onto the host's existing streaming response delivery; non-streaming MCP tools map onto background runs with `ContinuationToken` if the target needs more time than a single tool-call round-trip allows. +- **Activity Protocol channel** (`ActivityChannel`) — exposes the hostable target behind **Azure Bot Service**, which fronts Teams, Web Chat, Slack-style connectors, and the rest of the Bot Framework / M365 connector ecosystem. It translates Activity Protocol objects (`Activity`, `ConversationReference`, adaptive cards, `Invoke` activities, …) natively onto the host's `ChannelRequest` / `ChannelResponse` types, so the contract is **explicit** rather than smuggled through a generic Invocations endpoint. Fits the **host-tracked session** family: Bot Service authenticates with a JWT carrying the AAD object id, the channel populates `ChannelIdentity` from `from.aadObjectId`, the host's per-`isolation_key` alias decides which `AgentSession` to resolve, and `host.reset_session(...)` is reachable via a Teams slash command or adaptive-card action. `ChannelPush` is implemented over Bot Service's `ConversationReference` + `continueConversationAsync` pattern. Naming this channel **Activity** rather than **Teams** keeps a `TeamsChannel` name available for any future direct-to-Teams transport that bypasses Bot Service. + +All three channels MUST be reachable through the **same** `AgentFrameworkHost` as Responses, Invocations, and Telegram so the cross-channel `isolation_key` continuity story (start a task via MCP from an IDE, follow up on Telegram, deliver the result on Teams via the Activity channel) is coherent. Their detailed API surfaces are deferred to dedicated follow-up specs.
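The continuity story above rests on two behaviors described in the Terminology section: the default resolver auto-issues a stable `isolation_key` per `(channel, native_id)` on first contact, and linking merges the second channel's key onto the first channel's existing key. A minimal in-memory sketch of those two behaviors follows. `InMemoryIdentityStore`, its method names, and the `iso-` key prefix are hypothetical, chosen for illustration only; the real design persists this state through `HostStateStore` and runs a link ceremony before any merge.

```python
import uuid


class InMemoryIdentityStore:
    """Hypothetical stand-in for the resolver plus link-grant storage."""

    def __init__(self) -> None:
        # (channel, native_id) -> isolation_key
        self._keys: dict[tuple[str, str], str] = {}

    def resolve(self, channel: str, native_id: str) -> str:
        """Return the stable key for this identity, issuing one on first contact."""
        identity = (channel, native_id)
        if identity not in self._keys:
            self._keys[identity] = f"iso-{uuid.uuid4().hex}"
        return self._keys[identity]

    def link(self, existing: tuple[str, str], new: tuple[str, str]) -> str:
        """Merge the new identity onto the existing identity's key.

        Assumes the link ceremony (OAuth or one-time code) already succeeded;
        this sketch performs no verification of its own.
        """
        target_key = self.resolve(*existing)
        old_key = self._keys.get(new)
        self._keys[new] = target_key
        if old_key is not None and old_key != target_key:
            # Repoint every identity that was previously merged onto the old key.
            for identity, key in self._keys.items():
                if key == old_key:
                    self._keys[identity] = target_key
        return target_key


store = InMemoryIdentityStore()
ide_key = store.resolve("mcp", "ide-user")        # task started from an IDE
telegram_key = store.resolve("telegram", "42")    # separate key until linked
store.link(("mcp", "ide-user"), ("telegram", "42"))
```

After the `link` call both identities resolve to `ide_key`, so a session keyed on `isolation_key` is shared across the two channels without either channel knowing about the other.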
+ +Companion specs cover the per-language API surface, information design, and sample code: + +- [SPEC-002 Python hosting core and pluggable channels](../specs/002-python-hosting-channels.md) +- *(future)* SPEC-00X .NET hosting core and pluggable channels + +## Appendix A — Comparison with Microsoft 365 Activity Protocol + +The [Microsoft 365 Agents SDK Activity Protocol](https://learn.microsoft.com/en-us/microsoft-365/agents-sdk/activity-protocol) (and its underlying [protocol-activity spec](https://github.com/microsoft/Agents/blob/main/specs/activity/protocol-activity.md)) is the closest existing Microsoft prior art for a multi-channel hosting layer. It powers Microsoft 365 Copilot, Copilot Studio, and the M365 Agents SDK across Teams, web chat, Slack-style connectors, and so on. This appendix contrasts the two designs so future readers know which problems we deliberately solve differently and why. + +### Mental model + +| Concept | Activity Protocol | This ADR | +|---|---|---| +| **Inbound + outbound envelope** | A single `Activity` JSON envelope used in both directions, distinguished by `type` (`message`, `event`, `invoke`, `conversationUpdate`, `typing`, …). | Asymmetric: `ChannelRequest` for inbound, `HostedRunResult` / `HostedStreamResult` / `ChannelPush` for outbound. Protocol-native bytes never leave the channel package. | +| **Channel surface** | A `ChannelID` string (e.g. `msteams`) on every Activity; channels are connected via Bot Framework Connector Service or M365 Agents SDK adapters. | A `Channel` Protocol contributed by an in-process Python package. Each channel owns its own routes, parsing, auth validation, and protocol model — no central connector service. | +| **Adapter** | `Adapter` / `CloudAdapter` translates channel-native protocol ↔ Activity and runs the turn. Adapters are framework-supplied. | `Channel.contribute(context) -> ChannelContribution` returns Starlette routes + lifecycle. Channels are user-extensible packages. 
| +| **Turn** | `TurnContext` bundles incoming `Activity`, outbound `SendActivityAsync`, `TurnState`, and adapter. Per-turn, disposed at end. | Channel handler calls `await context.run(channel_request)` / `context.stream(...)`; reply is the awaited `HostedRunResult`. No per-turn state object beyond the request itself. Earlier draft had a `ChannelRunHookContext`; that wrapper was removed in favor of `(request, **kwargs)`. | +| **Identity** | `Activity.From` + `Activity.Recipient` carry per-turn identities; cross-channel identity unification is not in protocol. | `ChannelIdentity(channel, native_id, attributes)` extracted by the channel; host-level `IdentityResolver` maps to a stable `isolation_key`; `IdentityLinker` performs cross-channel link ceremonies. | +| **Conversation context** | `Activity.Conversation.id` is the per-channel conversation key; conversation history is the agent author's responsibility. | `ChannelSession(key, isolation_key)` resolves to an `AgentSession` host-side, with cross-channel continuity when channels emit the same `isolation_key`. | +| **Routing reply target** | Reply goes to `Activity.Conversation.id` on the originating channel. Cross-channel proactive sends require manually persisting a `ConversationReference`. | `ResponseTarget` (`originating`, `active`, `channel(name)`, `channels([...])`, `all_linked`, `none`) is first-class on every request, resolved by the host against last-seen channel state and the identity store. | +| **Background work** | No first-class `ContinuationToken`; long work uses proactive messaging via stored `ConversationReference`. | `ContinuationToken` + `host.run_in_background(...)` + per-channel poll routes are part of the host contract; result delivery follows `ResponseTarget`. | +| **Auth** | Bot Framework Auth: JWTs signed by the Bot Connector Service, verified by the SDK adapter. 
| Each channel implements its own validation against the upstream protocol (Telegram secret token, Bot Service JWT for the planned `ActivityChannel`, OAuth on identity-link routes); host can layer Starlette middleware. | +| **Activity types beyond messages** | First-class `ConversationUpdate`, `Event`, `Invoke`, `Typing`, plus 20+ others — channels emit them uniformly. | `ChannelRequest.operation` is a free-form discriminator (default `"message.create"`); other categories (typing indicators, membership change, structured `invoke` request/reply) are channel-package concerns and not modeled centrally. | +| **Outbound streaming** | `SendActivityAsync(typing)` + multiple `SendActivity` calls. | `HostedStreamResult` async iterator returned to the channel; channel decides how to render onto its protocol (SSE for Responses, long messages for Telegram, etc.). | + +### Where we deliberately diverge + +1. **Asymmetric envelopes instead of a single `Activity`.** The Activity envelope is heavyweight and tightly coupled to Bot Framework conventions (`From`/`Recipient`/`Conversation`/`ServiceUrl`). For a hosting layer that fronts the Responses HTTP API, OpenAI-style invocations, and Telegram all at once, forcing every channel through a unified envelope would either dilute it (Responses-shaped JSON wedged into `Activity.Value`) or impose Bot Framework semantics on protocols that don't carry them (Responses has no per-message `From` to fill). The cost of asymmetry is that channels write their own outbound serialization; the gain is each channel stays idiomatic to its upstream protocol. + +2. **In-process channel packages instead of a connector service.** Activity Protocol assumes a Bot Connector Service (cloud-hosted by Microsoft for Teams/Web Chat/etc.) sits between the channel and the agent. We target a single Starlette ASGI app the developer runs anywhere, with each channel package owning its own webhook/HTTP/SSE/WS surface. 
This is critical for the Responses and Invocations channels (which **are** the upstream protocol; there is no connector to terminate them) and removes the operational dependency for self-hosted deployments. The trade-off is that scaling, auth federation, and channel-update rollout become the operator's problem instead of being centralized. **Note:** the planned `ActivityChannel` (designed-in fast follow) does deliberately sit behind Azure Bot Service so we inherit the connector model for Teams/Web Chat/Slack — that channel is the *interop* path for Activity Protocol; the contrast above is about the rest of the channel set (Responses, Invocations, Telegram, A2A, MCP-tool) where there is no equivalent service and a direct in-process binding is the only sensible option. + +3. **Cross-channel identity is first-class.** Activity Protocol has no native concept of "this Teams user is the same person as this Telegram user." Bot Framework's User Authentication / OAuth Connection Settings handle per-channel sign-in but not the merge. Our `IdentityLinker` + host-managed identity store explicitly model the link ceremony and the resulting merge so a single `AgentSession` can span channels. This is required for the multi-channel scenarios this hosting layer was created to support (Scenarios 7 and 8 in SPEC-002) and is intentionally above what the Activity Protocol contract guarantees. + +4. **`ResponseTarget` as a request-level field instead of an out-of-band proactive-send pattern.** Activity Protocol treats proactive cross-channel delivery as a deployment exercise (persist `ConversationReference`, restore later, call `continueConversationAsync`). We elevate it to a typed field on every request, consumed by the host. This makes "submit on Telegram, deliver result on Teams" a one-line authoring change instead of a custom pipeline, but it does require that channels capable of proactive delivery implement the `ChannelPush` capability. + +5. 
**No central activity-type taxonomy in v1.** `ChannelRequest.operation` is intentionally free-form. Activity Protocol's `Type` discriminator (`message`, `event`, `invoke`, `conversationUpdate`, `typing`, …) is a real strength — it lets generic middleware reason about non-message events uniformly. We accept the gap in v1 because (a) the Responses + Invocations + Telegram set has effectively one "type" (a message that wants a reply), and (b) modeling the long tail of typed events properly is a design exercise that should not block hosting v1. See **Where Activity Protocol could influence future iterations** below. + +6. **No `TurnContext`-style per-turn bag.** Earlier drafts of this ADR proposed `ChannelRunHookContext` to play a similar role to `TurnContext`. It was removed in favor of `def hook(request, **kwargs) -> ChannelRequest` because the only consumers (run hooks) don't need most of what `TurnContext` provides, and forcing a wrapper made simple hooks awkward to write inline. Channels that need adapter-style state can compose it inside their own `Channel` implementation. + +### Where Activity Protocol could influence future iterations + +- **Typed event taxonomy.** Adopting a small enum for `ChannelRequest.operation` modeled on Activity Protocol's set (`message`, `event`, `conversationUpdate`, `invoke`, `typing`) would let generic middleware (rate limit, audit, content moderation) reason about channel traffic uniformly. This is additive and could land alongside the v1.x telemetry work without breaking the free-form string field. +- **Outbound `Activity`-style envelope as a serialization target.** This is the planned `ActivityChannel` (designed-in fast follow) — it maps `HostedRunResult` ↔ `Activity` inside the channel package and forwards through Azure Bot Service. The hosting contract was designed so this binding requires no new host primitives.
+- **`ConversationReference`-style proactive seed.** When `ResponseTarget.active` cannot find a recently seen channel, falling back to a stored `ConversationReference`-equivalent (last-known channel + last-known native id, persisted in the identity store) would mirror Bot Framework's proactive-message recovery story. This is implicit in the v1.x identity-store work (Open Question 9). +- **Invoke-style synchronous request/reply.** Activity Protocol's `Invoke` (`task/fetch`, `task/submit`) is a useful precedent for what a typed `InvocationsChannel.invoke()` operation could look like beyond "post one message, get one reply" — particularly for Teams adaptive-card submit flows that the `ActivityChannel` will eventually need to host. + +### Summary + +Activity Protocol optimizes for **a single Microsoft-operated abstraction over many client surfaces**, with a uniform envelope, a connector service in the middle, and per-channel adapters supplied by the SDK. This ADR optimizes for **a self-hosted, in-process Python (and later .NET) layer that fronts both LLM-shaped HTTP protocols and human-chat channels**, with each channel owning its idiomatic protocol and the host owning identity, sessions, and cross-channel routing. The two designs solve overlapping but distinct problems; nothing in this ADR precludes a future Activity Protocol channel package, and several of Activity Protocol's primitives (typed event taxonomy, conversation reference, invoke) are tracked as candidate future enhancements. diff --git a/docs/specs/002-python-hosting-channels.md b/docs/specs/002-python-hosting-channels.md new file mode 100644 index 00000000000..d5ece88023c --- /dev/null +++ b/docs/specs/002-python-hosting-channels.md @@ -0,0 +1,1602 @@ +--- +status: proposed +contact: eavanvalkenburg +date: 2026-04-24 +deciders: eavanvalkenburg +--- + +# Python hosting core and pluggable channels + +## What are the business goals for this feature? 
+ +Give Python app authors one low-level, Starlette-based hosting surface that can expose a single **hostable target** — either a `SupportsAgentRun`-compatible agent **or** a `Workflow` — on one or more channels (Responses API, Invocations API, Telegram, future A2A, MCP-tool, Activity Protocol via Azure Bot Service — which fronts Teams, Web Chat, Slack, …— WhatsApp, optional future direct-to-Teams, etc.) without requiring them to hand-build protocol routing or server glue per protocol, **and** let an end user start a conversation on one channel (e.g. Telegram on their phone) and seamlessly continue it on another (e.g. Teams at their desk via the Activity channel) against the same target and the same conversation history. + +This consolidates the protocol-specific hosting layers that exist today (`agent-framework-foundry-hosting`, `agent-framework-ag-ui`, `agent-framework-a2a`, `agent-framework-devui`) into a shared composable model where: + +- a host owns the ASGI app and channels own protocol shape, +- session identity is **channel-neutral** — the host resolves a session from a channel-supplied `isolation_key` (e.g. 
a stable user identity) so two channels mounted on the same host can resolve to the **same** `AgentSession` for the same end user, and a future pluggable session store extends that continuity across hosts and processes, and +- channel-native identity is **mapped, not assumed** — the host owns a first-class `IdentityResolver` seam (channel-native id → `isolation_key`) and an `IdentityLinker` seam (well-known connect ceremony — OAuth, MFA, signed one-time code — to associate a new channel-native id with an existing `isolation_key`), so cross-channel continuity does not depend on each channel's user namespace happening to align, and +- response delivery is **decoupled from request origin** — every `ChannelRequest` carries a `ResponseTarget` (`originating` (default), `active` for the user's most recently used channel, a specific channel id, all linked channels, or `none` for background-only). Background/asynchronous runs are first-class via a `ContinuationToken` returned by `host.run_in_background(...)` so a user can submit a long-running request on one channel and receive the result on another (or poll by continuation token), and +- channels can be assigned different **confidentiality tiers** so two channels on one host can share an agent without sharing a session — e.g. Teams (corporate, allowed to access internal resources) and Telegram (public) can run against the same target while remaining session-isolated, with a host-level `LinkPolicy` that decides which confidentiality tiers may be linked (and includes an explicit "deny all" variant for hosts that want no cross-channel continuity at all). 
Running two separate hosts is always a valid alternative; the per-tier policy exists for cases where one shared host with two policy-isolated tiers is preferred, and +- **multi-user surfaces** (Telegram groups, supergroups, forum topics; Teams group chats and team channels) are first-class — the channel layer separates user identity from the conversation locator, defaults to safe behavior (`mention_only` addressing, `per_user_per_conversation` session scoping, link ceremonies redirected to DMs), and exposes per-channel options to opt into shared-context modes when desired (see [Multi-user conversations](#multi-user-conversations-telegram-groups-teams-group-chats-and-channels)). + +We know we're successful when: + +- after the agent is created, a basic multi-channel sample requires only one `AgentFrameworkHost`, channel objects, and one `host.serve(...)` call — no handwritten protocol routes and no per-protocol server bootstrap. The hosting core itself takes no dependency on `agentserver`; individual channel packages MAY depend on it where it provides directly reusable building blocks (e.g. `agent-framework-foundry-hosting` builds on the Foundry response-store SDK that ships in `azure.ai.agentserver`), +- a single `AgentFrameworkHost` configured with two channels (e.g. Telegram + a future Activity Protocol channel — Teams via Azure Bot Service) can be exercised by one end user across both channels, and that user observes one continuous conversation, +- an end user known on one channel can run a host-provided `link`/`connect` command on a second channel, complete an OAuth (or MFA, or one-time-code) ceremony, and see subsequent messages on the second channel resolved against the same `AgentSession` as the first, **and** +- a user can submit a long-running request on Telegram with `response_target="active"`, switch to Teams (via the Activity channel), and receive the result there as a proactive message — with a poll route as a fallback for callers that prefer polling.
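
The continuity criteria above come down to two lookups: channel-native identity to `isolation_key`, then `isolation_key` to session. The sketch below illustrates that flow with in-memory dictionaries. Every name in it (`SketchHost`, `SketchSession`, the `link`/`resolve`/`handle` methods, the field names on `ChannelIdentity`) is a hypothetical stand-in, not the proposed API. Giving unlinked identities a channel-scoped key is an assumption of the sketch, not something the spec prescribes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChannelIdentity:
    """Stand-in for a channel-native identity (field names are assumed)."""

    channel: str
    native_id: str


@dataclass
class SketchSession:
    """Stand-in for AgentSession: an isolation key plus a message history."""

    isolation_key: str
    history: list[str] = field(default_factory=list)


class SketchHost:
    """Toy host that resolves identities to keys and keys to sessions."""

    def __init__(self) -> None:
        self._links: dict[ChannelIdentity, str] = {}
        self._sessions: dict[str, SketchSession] = {}
        self.last_seen: dict[str, str] = {}  # isolation_key -> channel name

    def link(self, identity: ChannelIdentity, isolation_key: str) -> None:
        # In the spec this is the outcome of a link ceremony (OAuth, one-time code).
        self._links[identity] = isolation_key

    def resolve(self, identity: ChannelIdentity) -> str | None:
        # Mirrors the resolver contract: None means the identity is unknown.
        return self._links.get(identity)

    def handle(self, identity: ChannelIdentity, text: str) -> SketchSession:
        # Assumption: unlinked identities get an isolated, channel-scoped key.
        key = self.resolve(identity) or f"{identity.channel}:{identity.native_id}"
        session = self._sessions.setdefault(key, SketchSession(key))
        session.history.append(f"{identity.channel}: {text}")
        self.last_seen[key] = identity.channel  # feeds an "active" target lookup
        return session


host = SketchHost()
telegram_user = ChannelIdentity("telegram", "12345")
teams_user = ChannelIdentity("activity", "aad-object-id-1")
host.link(telegram_user, "user-1")
host.link(teams_user, "user-1")

first = host.handle(telegram_user, "start a report")
second = host.handle(teams_user, "continue the report")
stranger = host.handle(ChannelIdentity("telegram", "999"), "hello")
```

Here `first` and `second` are the same session object because both identities resolve to `user-1`, while `stranger` lands in a separate session. A production host would persist these maps rather than hold them in process memory.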
+ +## Problem Statement + +### How do developers solve this problem today? + +Today, every protocol surface is its own package with its own server. A developer who wants to expose one agent over both the Responses API and a webhook channel has to stand up two separate hosts and stitch them into one ASGI app by hand: + +```python +# Today: developer composes two protocol-specific hosts manually +import os +import uvicorn +from starlette.applications import Starlette +from starlette.routing import Mount + +from agent_framework import Agent +from agent_framework.openai import OpenAIChatClient +from agent_framework.foundry_hosting import ( + ResponsesHostServer, + InvocationsHostServer, +) + +agent = Agent( + name="WeatherAgent", + instructions="You are a helpful weather agent.", + client=OpenAIChatClient(model="gpt-4.1-mini"), +) + +# Two separate, protocol-specific host wrappers, each with their own +# request/session/event mapping inside. +responses_host = ResponsesHostServer(agent=agent) +invocations_host = InvocationsHostServer(agent=agent) + +# Manually mount each into a Starlette app so they share a process. +app = Starlette(routes=[ + Mount("/responses", app=responses_host.app), + Mount("/invocations", app=invocations_host.app), +]) + +# Bring up the server by hand. +if __name__ == "__main__": + uvicorn.run(app, host="localhost", port=8000) +``` + +Adding a Telegram bot to the same agent today means leaving this stack entirely: spinning up a separate process, installing a Telegram SDK, writing the polling/webhook loop, manually translating updates into agent calls, and wiring command handlers (`/start`, `/new`, `/cancel`, ...) and `set_my_commands(...)` registration by hand — none of which is reusable across other message channels. + +### Why does this problem require a new hosting abstraction? + +The gap is between **owning a hostable target** (a `SupportsAgentRun` agent or a `Workflow`) and **operationalizing it on multiple channels**. 
Agent Framework already provides agents, workflows, sessions, run inputs, response/update streaming, the `SupportsAgentRun` execution seam, and the `Workflow` execution seam. What's missing is a generic host that: + +1. Owns one Starlette app and one set of lifecycle hooks. +2. Lets channels contribute routes, middleware, commands, and startup/shutdown without protocol leakage into the host. +3. Standardizes how protocol requests become agent invocations (input, options, session, streaming) and how agent results flow back out. +4. **Resolves a session from a channel-neutral `isolation_key`** so two channels mounted on the same host can converge on the same `AgentSession` for the same end user — enabling cross-channel chat continuity (start on Telegram, continue on Teams) without per-channel session bookkeeping. +5. Provides a first-class extension seam for webhook/message channels with native command catalogs (per PR #5393 Telegram sample). + +The current `agentserver`-based hosts are valuable prior art but sit too high in the stack — they encode protocol ownership at the host level. The new generic core learns from their behavior without depending on them; individual channel packages may still depend on the parts of `agentserver` that ship reusable building blocks (notably the Foundry response-store SDK). + +## Non-Goals / Relationship to existing hosting packages + +The hosting core is deliberately **not** a replacement for the existing protocol packages in their first form, and it is not a multi-agent router. 
Hosting core, `ag-ui`, `a2a`, `devui`, and `foundry-hosting` solve adjacent but distinct problems: + +| Dimension | Existing protocol packages | `agent-framework-hosting` | +|---|---|---| +| **Mental model** | One package = one protocol surface, owns its own server | One host owns ASGI app; channels plug protocols in | +| **Scope** | Protocol-specific request/session/event mapping | Generic host + channel contract; protocol logic lives in channel packages | +| **Composition** | One protocol per process or per Mount | Many channels per host, shared middleware, lifecycle, session resolution | +| **Multi-agent** | Out of scope per package | **No.** One host = one agent. Future work. | + +**Explicit non-goals:** +- Migrating `ag-ui`, `a2a`, or `devui` onto the new core in the first implementation. +- Standardizing a persistent session storage contract across all channels. +- Hosting multiple agents behind one router in this first design. +- Designing every detail of WhatsApp, the full Activity Protocol surface, or a future direct-to-Teams channel now (only Telegram is concretely targeted, informed by PR #5393; Activity Protocol via Azure Bot Service, A2A, MCP-tool, and Teams-native via `microsoft/teams.py` are designed-in fast follow — see reqs #25–#28). +- Replacing protocol-specific serializers with one generic event model. +- Taking a runtime or package dependency on the legacy protocol-specific hosts (e.g. `ResponsesAgentServerHost`, `InvocationAgentServerHost`) from the new hosting core. Channel packages MAY depend on lower-level parts of `azure.ai.agentserver` where it ships reusable building blocks (e.g. the Foundry response-store SDK consumed by `FoundryHostedAgentHistoryProvider`). + +**Boundary rule:** If you need protocol-specific event semantics, codecs, or signature validation, that lives in the channel package. 
The host owns ASGI, lifecycle, session resolution, and the call into the target's execution seam (`SupportsAgentRun.run(...)` for agents, the workflow execution seam for workflows). + +## Requirements + +After we deliver `agent-framework-hosting` and its first channel packages, users will be able to: + +1. **Compose one host with one or more channels** — instantiate `AgentFrameworkHost(target=..., channels=[...])` where `target` is either a `SupportsAgentRun`-compatible agent or a `Workflow`, and get one Starlette application with all channels mounted. +2. **Expose the Responses API** — add `ResponsesChannel()` and serve `/responses/v1` (and conversation routes) without writing protocol handlers. +3. **Expose the Invocations API** — add `InvocationsChannel()` and serve `/invocations/invoke` without writing protocol handlers. +4. **Expose a Telegram bot** — add `TelegramChannel(bot_token=...)` with either `polling` or `webhook` transport, and register native commands declaratively with `ChannelCommand`. +5. **Override mount roots without breaking protocol paths** — pass `path="/public/responses"` and the channel still owns the protocol-relative suffix (`/v1`, `/invoke`, `/webhook`). +6. **Customize per-request invocation behavior** — pass a `run_hook` to any built-in channel. The hook receives the channel-produced `ChannelRequest` (the host-neutral envelope each channel builds from its own protocol parsing — see [Key Types](#key-types)) and returns a possibly-modified `ChannelRequest`. Use it to validate, rewrite, or strip channel-derived options (e.g. enforce or drop `temperature`, override `session_mode`) before the host calls the target's execution seam. It is also the **adapter** that reshapes the channel's default `ChannelRequest.input` into the typed inputs a workflow target requires. +7. 
**Control session use per request** — built-in channels set `ChannelRequest.session_mode` to `auto`, `required`, or `disabled`; the host honors that when resolving `AgentSession`. +8. **Partition sessions by isolation key** — channels populate `ChannelSession.isolation_key` (user, tenant, chat, …) using hosted-agent terminology. +9. **Resolve to the same session across channels on one host** — two channels mounted on the same `AgentFrameworkHost` that produce the same `isolation_key` (e.g. a stable user identity mapped from each channel's native identifier) resolve to the same `AgentSession`, so an end user starting a chat on Telegram can continue it on Teams against the same conversation history without per-channel session bookkeeping. +10. **Map channel-native identity into `isolation_key`** — every channel has its own user namespace (Telegram `chat_id`, Teams AAD object id, WhatsApp phone, Slack user id). The host accepts a host-level `identity_resolver` callable that maps a `ChannelIdentity(channel_id, native_id, attributes)` into an `isolation_key` (or `None` if unknown). Channels publish the native identity they observed; the resolver decides whether it maps to an existing user. +11. **Link a new channel to an existing identity through a well-known ceremony** — the host accepts a host-level `identity_linker` (e.g. `OAuthIdentityLinker(...)`, `OneTimeCodeIdentityLinker(...)`) which contributes its own routes/lifecycle and exposes a `begin(channel_identity) -> LinkChallenge` / `complete(challenge_id, proof) -> isolation_key` flow. Channels surface a `link`/`connect` `ChannelCommand` that delegates to the linker; on success the resolver subsequently maps the new channel-native identity to the existing `isolation_key`. Mechanism (OAuth provider, signed one-time code, future linker types) is pluggable; the contract is fixed. +12. 
**Route the response to a chosen channel** — `ChannelRequest.response_target` accepts `ResponseTarget.originating` (default — synchronous response on the originating channel), `ResponseTarget.active` (the channel most recently observed for the resolved `isolation_key`), `ResponseTarget.channel("activity")` (specific channel id, recipient resolved from the link store), `ResponseTarget.channels([...])` (a list), `ResponseTarget.identities([ChannelIdentity(...)])` (one or more **explicit channel-native identities** — bypasses the link store, used when the caller already knows the recipient's channel-native id), `ResponseTarget.all_linked` (every channel where this `isolation_key` is known), or `ResponseTarget.none` (background-only — caller must poll the `ContinuationToken`). When the target is not the originating channel, the host delivers via the destination channel's `ChannelPush` capability. +13. **Push proactively from a channel** — channels that can deliver outbound messages without a prior request (Telegram bot proactive message, Activity Protocol proactive message via Azure Bot Service, webhook callbacks, SSE broadcasts) implement an optional `ChannelPush` capability on top of the base `Channel` protocol. Channels without push can only be the `originating` target. +14. **Submit background runs as a first-class operation** — `host.run_in_background(request) -> ContinuationToken` returns immediately with an opaque, URL-safe `token` and a status (`queued` | `running` | `completed` | `failed`). The host invokes the target asynchronously and, when complete, both delivers the result via the configured `ResponseTarget` push **and** records it against the token so callers can poll `host.get_continuation(token)`. Built-in channels expose poll routes (`/responses/v1/{continuation_token}`, `/invocations/{continuation_token}`) that surface this without app code. 
Continuation tokens are persisted via a `HostStateStore` (file-based by default — see [Host state storage](#host-state-storage)) so background runs survive host restarts. +15. **Track the active channel per `isolation_key`** — the host records `(isolation_key, last_seen_channel, last_seen_at)` on every successfully resolved request so `ResponseTarget.active` resolves correctly. Apps can override in the `run_hook` (e.g. force `active` to a specific channel for a particular request). +16. **Add Starlette middleware at the host level** — pass `middleware=[Middleware(CORSMiddleware, ...)]` to `AgentFrameworkHost`. +17. **Serve with one call** — call `host.serve(host="localhost", port=8000)` without manually importing `uvicorn`, while `host.app` remains the canonical ASGI surface for any other server (Hypercorn, Daphne, Granian, Gunicorn+uvicorn workers). +18. **Author new channels** — implement the `Channel` protocol, return a `ChannelContribution` with routes/middleware/commands/lifecycle hooks, and call `context.run(...)` or `context.stream(...)` to invoke the agent. +19. **Target any `SupportsAgentRun` or `Workflow`** — host an `Agent`, `A2AAgent`, or a `Workflow`; the `run_hook` is the seam for adapting the channel's default `ChannelRequest` into the target-specific input shape (free-form messages for agents, typed inputs for workflows). +20. **Contribute WebSocket endpoints from a channel** — `ChannelContribution.routes` accepts both `Route` (HTTP) and `WebSocketRoute` (WS); the channel codec is responsible for framing and the same `run_hook` / default mapping pipeline applies. Built-in `ResponsesChannel` exposes a WebSocket transport (default `/responses/ws`, controlled by `transports=("http", "websocket")`) alongside its HTTP+SSE transport, anticipating the OpenAI Responses WebSocket transport. The host requires an ASGI server with WebSocket scope support (Uvicorn, Hypercorn, Daphne, Granian). +21. 
**Mix channels of different confidentiality tiers on one host** — every `Channel` may declare an opaque `confidentiality_tier: str | None` (e.g. `"corp"`, `"public"`). The host's `LinkPolicy` decides which `(source_tier, target_tier)` pairs may share an `isolation_key` (link) and which may be `ResponseTarget` source/destination for one another (deliver). Built-in policies (`AllowAllLinks` (default), `SameConfidentialityTierOnly`, `ExplicitAllowList`, `DenyAllLinks`) and the policy contract are defined in [LinkPolicy](#linkpolicy-and-confidentiality_tier). Cross-tier link attempts are refused with a typed error; cross-tier deliveries are dropped — so two tiers can share **an agent target** on one host while remaining strictly session-isolated. + +### v1 Fast Follow +22. **Generic auth helpers** — shared middleware for common channel auth patterns (HMAC signature, bearer token). +23. **Pluggable host state store** — interface for cross-host persistence of `ContinuationToken`s, identity-link grants, and last-seen `(isolation_key, channel)` records. Default implementation in v1 is **file-based** (`FileHostStateStore`); `InMemoryHostStateStore` is available for tests. A future `CosmosHostStateStore` / `SQLHostStateStore` would extend cross-channel chat continuity (req #9), background runs (req #14), and identity-link continuity (req #11) beyond a single host/process — but the v1 file-based default already survives host restarts on a single node. Same protocol covers session aliasing where applicable. +24. **First-party identity linker helpers** — concrete `OAuthIdentityLinker` (with provider presets) and `OneTimeCodeIdentityLinker` (cross-channel code exchange) shipped as opt-in helpers on top of the `IdentityLinker` contract. Investigation of additional first-party linker types tracked as a follow-up. +25. **`A2AChannel` package** (`agent-framework-hosting-a2a`) — exposes the hostable target over the Agent-to-Agent protocol so other agents can consume it as a peer. 
Caller-supplied-session family (alongside Responses and Invocations): A2A's per-conversation id maps to `ChannelSession.key`; the calling agent's identity (e.g. its A2A agent card / signed JWT) flows through `IdentityResolver`; structured replies fit the existing `ChannelRequest` + `ResponseTarget` envelope. No new host primitives required — only the protocol binding and package. +26. **`MCPToolChannel` package** (`agent-framework-hosting-mcp`) — exposes the hostable target as a **Model Context Protocol tool** so MCP clients (other agents, IDE tooling) can invoke it. Same caller-supplied-session family: the MCP `tools/call` request carries the conversation key into `ChannelSession.key`; the MCP client identity flows through `IdentityResolver`; the tool result is the target's response. Streaming MCP tools map onto the host's existing streaming response delivery; long-running MCP tools map onto background runs with `ContinuationToken` when the work outlasts a single tool-call round-trip. +27. **`ActivityChannel` package** (`agent-framework-hosting-activity`) — exposes the hostable target behind **Azure Bot Service**, which fronts Teams, Web Chat, Slack-style connectors, and the rest of the Bot Framework / M365 connector ecosystem. Provides **native translations** between Activity Protocol objects (`Activity`, `ConversationReference`, adaptive cards, `Invoke` activities, …) and the host's `ChannelRequest` / `ChannelResponse` types — so the contract is **explicit** rather than implicit through a generic Invocations endpoint. Host-tracked-session family: Bot Service authenticates with a JWT carrying the AAD object id, the channel populates `ChannelIdentity` from `from.aadObjectId`, the host's per-`isolation_key` alias decides which `AgentSession` to resolve, and `host.reset_session(...)` is reachable via a Teams slash command or adaptive-card action. `ChannelPush` is implemented over Bot Service's `ConversationReference` + `continueConversationAsync` pattern.
Naming this channel **Activity** rather than **Teams** keeps a `TeamsChannel` name available for the Teams-native channel below (req #28) and for any future direct-to-Teams transport. +28. **`TeamsChannel` package** (`agent-framework-hosting-teams`) — Teams-native channel built on the MIT-licensed [`microsoft/teams.py`](https://github.com/microsoft/teams.py) SDK (`microsoft-teams-apps`, `microsoft-teams-api`, `microsoft-teams-cards`). Where `ActivityChannel` (req #27) targets the **generic** Activity Protocol surface across all Bot Service-fronted channels, `TeamsChannel` exploits **Teams-specific affordances** that the generic Activity Protocol does not surface natively: + - **Adaptive Cards** via the typed `microsoft-teams-cards` builder, attached as tool side-effects through a `ContextVar`-scoped pending-cards collector consumed by the channel's result projector. + - **Streamed assistant replies** via `ctx.stream.emit(chunk)` — the channel projects `agent.run(..., stream=True)` chunks directly. + - **Teams "AI generated" badge**, **built-in feedback controls + custom feedback form**, **suggested-prompt chips** (`SuggestedActions` / `CardAction(IM_BACK)`), **inline citations** (`CitationAppearance` populated from a `FunctionMiddleware` that assigns stable positions to tool-result sources). + - **Modal Dialogs** (multi-step forms) with submission events routed through the host's normal request pipeline. + - **Message Extensions** — action commands (modal forms invoked from the compose box / message context menu), search commands (typed-ahead inline cards), and link unfurling (preview cards on URL paste). Each is exposed via the same `ChannelCommand` model as Telegram-style slash commands. 
+ - **Proactive, targeted (ephemeral), and threaded messages** via `app.send(conversation_id, MessageActivityInput(...))`, `with_recipient(account, is_targeted=True)`, and `to_threaded_conversation_id(conversation_id, message_id)` — used by `ChannelPush` and by `ResponseTarget.identities([ChannelIdentity(channel="teams", chat_id=…)])`. + - **SSO / OAuth** via the SDK's MSAL-backed connections, surfaced through `IdentityResolver` and the channel's run hook. + - **Teams API client + Microsoft Graph client** preconfigured on the SDK's `App`, available to the run hook for Teams-specific lookups (team roster, channel metadata, …) without re-implementing auth. + + Mounts the SDK's `App` into the host's Starlette app via a custom `HttpServerAdapter` that defers `register_route(...)` to `ChannelContribution.routes` — the SDK does **not** start its own server; the host owns the lifecycle. Host-tracked-session family (same as `ActivityChannel`): `from.aadObjectId` populates `ChannelIdentity`. The result projector reads `AgentRunResult.messages[*].contents` and routes the rich content variants to their Teams-native renderings (`TextContent` → markdown body, `DataContent`/structured output → Adaptive Card, citation entries from `additional_properties` → `add_citation`, `ErrorContent` → typed error card). + + **Note on transport.** `TeamsChannel` **still rides on Azure Bot Service in v1** — the `microsoft/teams.py` SDK is a higher-level Pythonic wrapper over the same Activity Protocol pipeline that `ActivityChannel` exposes raw. The difference is **what the developer writes against**, not the underlying network path. A truly Bot-Service-free Teams transport is *not currently possible* and is tracked as a separate, speculative stretch item (req #30); when/if Microsoft ships one, the new transport would slot in under the same `TeamsChannel` package without changing this requirement. 
+ + **`ActivityChannel` vs `TeamsChannel` — pick by audience:** + + | Channel | Built on | Audience | + |---|---|---| + | `ActivityChannel` (req #27) | Activity Protocol over HTTP, no Teams-specific helpers | Bot Service-fronted channels generically (Teams, Web Chat, Slack-style connectors, DirectLine, …); maximum portability across the Bot Framework / M365 connector ecosystem | + | `TeamsChannel` (req #28) | `microsoft/teams.py` `App` mounted via custom `HttpServerAdapter` into the host's Starlette app | Teams-first deployments that want Adaptive Cards, modal Dialogs, Message Extensions, citations, feedback, suggested-prompt chips, and SSO out of the box | + + Deployments that only need plain Activity Protocol over Bot Service stick with `ActivityChannel`; `TeamsChannel` is the upgrade path when Teams-native richness is wanted. + +### Stretch +29. **WhatsApp channel package** — using the same `Channel` + `ChannelCommand` model, designed so it participates in cross-channel continuity (req #9) and can serve as a `ChannelPush` destination (req #13) when paired with a stable per-user `isolation_key`. +30. **Direct-to-Teams channel package** — *speculative*. Reserved for a future transport that connects to Teams **without going through Azure Bot Service** (and therefore without the Activity Protocol pipeline that backs both `ActivityChannel` (req #27) and `TeamsChannel` (req #28)). At the time of writing **no such transport is publicly available** — the Microsoft Graph chat APIs (`/teams/{id}/channels/{id}/messages`, `/chats/{id}/messages`) and the `microsoft/teams.py` SDK both ultimately route through Bot Service for the bot-as-conversation-participant pattern. This requirement stays on the roadmap only to keep the `TeamsChannel` name reserved in case Microsoft ships a Bot-Service-free transport (a native Teams REST/RPC, a Graph subscription strong enough to drive both inbound and outbound message flow, or similar). 
Until then, **the canonical Teams channel is `TeamsChannel` (req #28)** and `ActivityChannel` (req #27) covers the generic Bot Service surface. + +## API Surface + +### Packages + +| Distribution package | Public import surface | Purpose | +| --- | --- | --- | +| `agent-framework-hosting` | `agent_framework.hosting` | Core Starlette host, channel contract, session/request bridge | +| `agent-framework-hosting-responses` | `agent_framework.hosting` (lazy) | `ResponsesChannel` | +| `agent-framework-hosting-invocations` | `agent_framework.hosting` (lazy) | `InvocationsChannel` | +| `agent-framework-hosting-telegram` | `agent_framework.hosting` (lazy) | `TelegramChannel` and Telegram-specific helpers | + +The split is between distribution packages. The **public import path stays stable at `agent_framework.hosting`** via lazy imports, consistent with the repository's packaging conventions. + +### Built-in routes + +For built-in channels, `path` is the configurable mount root, not the full final endpoint. The channel package owns the fixed protocol-relative suffix. + +| Channel | Default `path` | Default exposed route(s) | +| --- | --- | --- | +| `ResponsesChannel` | `/responses` | `/responses/v1` and nested responses/conversation routes below it | +| `InvocationsChannel` | `/invocations` | `/invocations/invoke` | +| `TelegramChannel` | `/telegram` | webhook mode: `/telegram/webhook`; polling mode: no required HTTP route | + +Overrides only replace the outer mount root: + +```python +ResponsesChannel(path="/public/responses") # -> /public/responses/v1 +InvocationsChannel(path="/internal/invocations") # -> /internal/invocations/invoke +TelegramChannel(path="/bots/telegram", bot_token=token) # -> /bots/telegram/webhook +``` + +### Key Types + +**`AgentFrameworkHost`** — owner of the Starlette app and channel lifecycle. Fronts one **hostable target** (an agent or a workflow). 
+ +| Field / Method | Type | Description | +|---|---|---| +| `__init__(target, *, channels, middleware=(), identity_resolver=None, identity_linker=None, debug=False)` | constructor | Composes one host from one **hostable target** (`SupportsAgentRun` or `Workflow`) and a sequence of channels. Optional `identity_resolver` and `identity_linker` provide channel-native-id → `isolation_key` mapping and a connect ceremony for linking new channels to existing identities. The host detects the target kind and dispatches to the appropriate runner. | +| `app` | `Starlette` | Canonical ASGI surface; can be handed to any ASGI server. | +| `serve(*, host="127.0.0.1", port=8000, **kwargs)` | method | Convenience wrapper around `uvicorn.run(self.app, ...)`. Lazy-imports `uvicorn`. | +| `run_in_background(request)` | `-> ContinuationToken` | Submits a `ChannelRequest` for asynchronous execution. Returns a `ContinuationToken` immediately; the result is delivered via the configured `ResponseTarget` push when ready and recorded against the token (in the configured `HostStateStore`) for later polling. Channels typically call this when their protocol response should be a 202 / acknowledgement rather than the agent reply. | +| `get_continuation(token)` | `-> ContinuationToken \| None` | Look up a previously submitted background run by its opaque token. Returns `None` when the token is unknown or has expired. Reads through the `HostStateStore` so tokens issued before the most recent restart still resolve. | + +**`HostableTarget`** — the union of executable targets the host can front. + +| Variant | Type | Execution seam | +|---|---|---| +| Agent | `SupportsAgentRun` | `target.run(input, *, session=..., stream=...)` | +| Workflow | `Workflow` | `target.run(input, ...)` (workflow execution seam) | + +**`Channel`** (Protocol) — anything that contributes routes/commands/lifecycle to a host. 
+ +| Field | Type | Description | +|---|---|---| +| `name` | `str` | Channel name used for routing, telemetry, and `ChannelRequest.channel`. | +| `confidentiality_tier` | `str?` | Optional opaque confidentiality tier (e.g. `"corp"`, `"public"`). Consumed by the host's `LinkPolicy` to decide which channels may be linked into the same `isolation_key` and which may be `ResponseTarget` destinations for a given originating request. `None` = single-tier (no policy filtering). See `LinkPolicy`. | +| `contribute(context: ChannelContext) -> ChannelContribution` | method | Called once at host construction; returns routes/middleware/commands/lifecycle. | + +**`ChannelContext`** — host-owned bridge channels use to invoke the agent. + +| Method | Type | Description | +|---|---|---| +| `run(request: ChannelRequest)` | `-> HostedRunResult` | One-shot invocation. | +| `stream(request: ChannelRequest)` | `-> HostedStreamResult` | Streaming invocation. | + +**`ChannelContribution`** — what a channel returns from `contribute(...)`. + +| Field | Type | Description | +|---|---|---| +| `routes` | `Sequence[BaseRoute]` | Starlette routes mounted under the channel's `path`. Accepts both `Route` (HTTP) and `WebSocketRoute` (WS) — both are `BaseRoute`. | +| `middleware` | `Sequence[Middleware]` | Channel-scoped middleware. | +| `commands` | `Sequence[ChannelCommand]` | Native command catalog (e.g. Telegram bot commands). | +| `on_startup` | `Sequence[Callable]` | Lifecycle hooks for polling workers, command registration, etc. | +| `on_shutdown` | `Sequence[Callable]` | Lifecycle hooks for cleanup. | + +**`ChannelRequest`** — normalized ingress passed to the host. + +| Field | Type | Description | +|---|---|---| +| `channel` | `str` | Originating channel name. | +| `operation` | `str` | e.g. `message.create`, `command.invoke`, `approval.respond`. | +| `input` | `AgentRunInputs` | Reuses framework input types. | +| `session` | `ChannelSession?` | Session hint from the channel. 
| +| `options` | `ChatOptions?` | Caller-derived options (e.g. Responses `temperature`). | +| `session_mode` | `Literal["auto", "required", "disabled"]` | Whether host-managed session use is automatic, mandatory, or bypassed. | +| `metadata` | `Mapping[str, Any]` | Protocol-level metadata for telemetry. | +| `attributes` | `Mapping[str, Any]` | Channel-specific structured values (signature state, capability hints). Host code never reads this map; reserved for channel-private bookkeeping. | +| `client_state` | `Mapping[str, Any] \| None` | Bidirectional, mutable per-request state object supplied by event-rich front-ends (e.g. AG-UI). Channel-defined shape; the host treats it as opaque. Channels typically thread this into a channel-owned `ContextProvider` (see [Channel-owned per-thread state](#channel-owned-per-thread-state)) and read it back after the run to emit state-snapshot/delta events. | +| `client_tools` | `Sequence[ToolDescriptor] \| None` | Frontend tool catalog supplied per request. The channel forwards definitions onto the agent's `ChatOptions` so the LLM can call them, but tool *execution* returns to the originating client (the host does not invoke them). Run hooks may filter or rewrite the catalog. | +| `forwarded_props` | `Mapping[str, Any] \| None` | Pass-through bag for channel-protocol extras the run hook needs to route into the target — e.g. AG-UI `resume` / `command` / HITL response payloads that drive workflow `RequestInfo` / `RequestResponse` round-trips. Opaque to the host; the run hook decides where it lands on the rebuilt `ChannelRequest.input`. | +| `identity` | `ChannelIdentity?` | Channel-native **user** identity observed on this request — `(channel, native_id, attributes)`. 
Channels populate it from the inbound payload's user field (Telegram `from.id`, Teams `from.aadObjectId`, Responses `safety_identifier`, …) — **not** the chat / conversation id, which is carried separately on `conversation_id` and matters in multi-user surfaces (Telegram groups, Teams group chats and channels — see [Multi-user conversations](#multi-user-conversations-telegram-groups-teams-group-chats-and-channels)). The host records `(isolation_key, channel) → identity` on every successful resolve so `ResponseTarget.active`, `.channel(name)`, `.channels([...])`, and `.all_linked` can find a destination native id without per-request payload bookkeeping. | +| `stream` | `bool` | Whether to invoke `stream(...)` rather than `run(...)`. | +| `response_target` | `ResponseTarget` | Where the response is delivered (default: `ResponseTarget.originating`). See `ResponseTarget` below. | +| `background` | `bool` | If `True`, host returns a `ContinuationToken` immediately rather than awaiting the response. Forced `True` when `response_target == ResponseTarget.none`. | + +**`ChannelSession`** — small, host-neutral session hint. + +| Field | Type | Description | +|---|---|---| +| `key` | `str?` | Stable host lookup key for an `AgentSession`. **Caller-supplied** channels populate it from the wire payload (e.g. `previous_response_id`, request-body `session_id`). **Host-tracked** channels leave it `None` and let the host's per-`isolation_key` alias decide which `AgentSession` to resolve (see [Channel session-carriage models](#channel-session-carriage-models)). | +| `conversation_id` | `str?` | Protocol-visible conversation/thread identifier when one exists. | +| `isolation_key` | `str?` | Opaque isolation boundary (user, tenant, chat, …) using hosted-agent terminology. | +| `attributes` | `Mapping[str, Any]` | Channel-specific session hints. | + +**`ChannelRunHook`** — per-request escape hatch for built-in channels. 
+ +```python +ChannelRunHook = Callable[..., Awaitable[ChannelRequest] | ChannelRequest] +``` + +Channels invoke the hook positionally with the channel-built `ChannelRequest` and pass named extras as keyword arguments. The minimum signature an app author needs is: + +```python +def my_hook(request: ChannelRequest, **kwargs) -> ChannelRequest: ... +``` + +Hooks that want the named extras pull them out by name: + +| Keyword | Type | Description | +|---|---|---| +| `target` | `SupportsAgentRun \| Workflow` | The hosted target (so hooks can adapt to e.g. `A2AAgent` or to a `Workflow`'s typed inputs). | +| `protocol_request` | `Any?` | Original channel-native protocol payload — Responses JSON body, Telegram `Update` dict, Activity Protocol `Activity` dict, Invocations body, … (loosely typed in v1). | + +Runs **after** the channel has produced its default `ChannelRequest`, **before** the host resolves session behavior and calls the target's execution seam. This is the canonical adapter point for workflow targets, where the channel's free-form input must be reshaped into the workflow's typed inputs. + +> Earlier drafts wrapped these arguments into a `ChannelRunHookContext` object. The signature was simplified so the typical hook only needs `(request, **kwargs)` — making it safe against future named extras and easier to write inline. + +**`ChannelIdentity`** — the channel-native identity the host sees on each request, used as the resolver/linker input. + +| Field | Type | Description | +|---|---|---| +| `channel` | `str` | Originating channel name (matches `Channel.name`). | +| `native_id` | `str` | Channel-native **user** identifier (Telegram `from.id`, Teams `from.aadObjectId`, WhatsApp phone number, Slack user id, …). 
In 1:1 chats this often coincides with the chat / conversation id; in multi-user surfaces (Telegram groups, Teams group chats and channels) it is **strictly the user** — the conversation locator lives separately on `ChannelRequest.conversation_id` / `ChannelSession.conversation_id`. Always per-channel; never assumed to align across channels. | +| `attributes` | `Mapping[str, Any]` | Optional per-channel context (display name, locale, group/private chat flag, Teams `tenantId`, Telegram `chat.type`, Teams `conversationType`, …) the resolver/linker may key on. | + +**`IdentityResolver`** — host-level seam that maps a `ChannelIdentity` to an `isolation_key`. + +```python +IdentityResolver = Callable[[ChannelIdentity], Awaitable[str | None] | (str | None)] +``` + +The **default resolver auto-issues** an `isolation_key` the first time a `(channel, native_id)` is seen and persists the mapping in the host's identity store, so every end user automatically gets a stable per-user `isolation_key` on first contact through **any** channel — no per-channel boilerplate is required for the single-channel case. Returning `None` is reserved for advanced cases where the resolver wants to refuse unknown identities (e.g. allow-list enforcement). + +Cross-channel continuity is then a one-shot **merge** operation: after a successful link ceremony (Scenario 6), the host atomically rewrites the second channel's auto-issued key to point at the first channel's existing `isolation_key`. Apps never have to write per-channel mapping hooks just to get continuity to work. + +Apps that already own an identity namespace (corporate user id, tenant-scoped account id) can supply a custom resolver that returns those values directly — bypassing auto-issuance. + +**`IdentityLinker`** (Protocol) — host-level seam that runs a connect ceremony to associate a new `ChannelIdentity` with an existing `isolation_key`. The linker is a peer of `Channel` for routing purposes and contributes its own routes/lifecycle. 
+ +| Field / Method | Type | Description | +|---|---|---| +| `name` | `str` | Linker name; used for telemetry and to namespace its routes. | +| `contribute(context: ChannelContext) -> ChannelContribution` | method | Same shape as `Channel.contribute(...)`; lets the linker publish callback/verification routes (e.g. `/identity/oauth/callback`, `/identity/verify`) and lifecycle hooks. | +| `begin(identity: ChannelIdentity, *, requested_isolation_key=None) -> LinkChallenge` | method | Starts the ceremony for a channel-native identity. Returns a `LinkChallenge` describing what the user must do (URL to visit, code to enter, MFA prompt). | +| `complete(challenge_id: str, proof: Mapping[str, Any]) -> str` | method | Verifies the proof and returns the resolved `isolation_key`. On success the host atomically records both `(channel, native_id) → isolation_key` and any verified IdP claim recovered from the proof (e.g. `(microsoft.oid, )`) so subsequent channels that supply the same claim auto-link without a second ceremony. | +| `is_linked(identity: ChannelIdentity, *, verified_claims: Mapping[str, str] = {}) -> str \| None` | method | Returns the `isolation_key` for an already-linked identity, or `None` if no link exists. Channels with `require_link=True` call this on every inbound request before invoking the agent. When `verified_claims` are supplied (e.g. Teams' AAD `oid` from the inbound activity bearer) and a match exists in the link store, the linker silently auto-merges the new `(channel, native_id)` onto the existing `isolation_key` and returns it — this is the "sign in once, every other channel just works" mechanism. | + +| Built-in helper | Mechanism | Notes | +|---|---|---| +| `OAuthIdentityLinker(provider, ...)` | OAuth authorization-code redirect | Contributes `/identity/oauth/{provider}/start` + `/callback`; ships with provider presets (Microsoft, Google, GitHub) as opt-in helpers. 
Stores the verified IdP `sub` / `oid` as a verified claim alongside the channel-native identity so channels that authenticate with the same IdP (e.g. Teams via Entra ID) auto-link on first contact. | +| `OneTimeCodeIdentityLinker(...)` | Signed short-lived code | User runs `/link` on channel A, receives a code; runs `/link ` on channel B; host verifies and merges. | + +A built-in `link` (or `connect`) `ChannelCommand` is exposed automatically when an `IdentityLinker` is configured. Its `handle` invokes `linker.begin(...)` and replies with the `LinkChallenge` payload (URL, code, instructions) projected through the channel's native rendering. Channels may opt out (`expose_in_ui=False`) or override the command's name per channel. + +**`require_link` (per-channel)** — every channel that emits a `ChannelIdentity` accepts a `require_link: bool = False` constructor argument. When `True`, the channel calls `linker.is_linked(identity, verified_claims=…)` before producing a `ChannelRequest`; un-linked identities are short-circuited to a rendered `LinkChallenge` reply (the same payload the `link` command would emit) and the agent is **not** invoked for that turn. Combined with the linker's verified-claim auto-link, this gives an "authenticate before chatting" enforcement model where the first channel forces the OAuth ceremony and subsequent channels join the same `isolation_key` silently. See [Scenario 6](#scenario-6-linking-a-new-channel-to-an-existing-identity-via-oauth) for the end-to-end flow. Default is `False`, which preserves the opportunistic flow (auto-issued `isolation_key`, link manually later). Channels whose protocol does not authenticate the user (e.g. anonymous Responses calls) ignore the flag. + +#### `LinkPolicy` and `confidentiality_tier` + +**`LinkPolicy`** — host-level decision over which channels may share an `isolation_key` and which channels may be a `ResponseTarget` for one another. 
Consumed by both the `IdentityLinker` (to refuse incompatible link attempts) and the host's response-routing layer (to filter `all_linked` / `active` / specific destinations). + +```python +LinkPolicy = Callable[[LinkPolicyContext], bool] +``` + +`LinkPolicyContext` carries the originating `Channel` (and its `confidentiality_tier`), the prospective destination `Channel` (and its `confidentiality_tier`), and the operation kind (`"link"` or `"deliver"`). Returns `True` to allow, `False` to refuse. Refusal during `link` raises a typed error to the user; refusal during `deliver` excludes that destination from the route set (and falls back to `originating` if the route set becomes empty). + +| Built-in policy | Behavior | +|---|---| +| `AllowAllLinks()` | Default. Any pair allowed; preserves today's single-tier behavior. | +| `SameConfidentialityTierOnly()` | Only allows pairs whose `confidentiality_tier` matches (including both `None`). Most common multi-tier setup. | +| `ExplicitAllowList(allowed_pairs={("public", "corp"), ...})` | Allows only the listed `(source, target)` pairs. Useful for one-directional escalation flows. | +| `DenyAllLinks()` | Refuses every link attempt and excludes every non-`originating` destination — channels share an agent target on the host but never share sessions. Equivalent to running each channel on its own host minus the deployment overhead. | + +Confidentiality tiers are **opaque labels** — the host does not interpret them; the policy decides what they mean. Setting `confidentiality_tier=None` on every channel preserves single-tier behavior. Two separate hosts is always a valid alternative to using `LinkPolicy`; the policy exists for cases where shared deployment, shared middleware, or a shared target object are preferred over running multiple hosts. 
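The policy contract above can be sketched in a few lines. This is a minimal illustration, not the shipped implementation: the spec says `LinkPolicyContext` carries the source and destination `Channel` objects, whereas this stand-in flattens them to their tiers, and the factory function names, `filter_destinations`, and the `"originating"` sentinel are all hypothetical names chosen for the example.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Literal, Optional


@dataclass(frozen=True)
class LinkPolicyContext:
    """Hypothetical stand-in: only the tiers and the operation kind."""

    source_tier: Optional[str]
    target_tier: Optional[str]
    operation: Literal["link", "deliver"]


# Same shape as the alias in the spec: context in, allow/refuse out.
LinkPolicy = Callable[[LinkPolicyContext], bool]


def allow_all_links() -> LinkPolicy:
    # Any pair is allowed (the single-tier default).
    return lambda ctx: True


def same_confidentiality_tier_only() -> LinkPolicy:
    # Tiers must be equal; two None tiers also count as a match.
    return lambda ctx: ctx.source_tier == ctx.target_tier


def explicit_allow_list(allowed_pairs: set[tuple[str, str]]) -> LinkPolicy:
    # Only the listed (source, target) pairs pass; direction matters.
    allowed = frozenset(allowed_pairs)
    return lambda ctx: (ctx.source_tier, ctx.target_tier) in allowed


def deny_all_links() -> LinkPolicy:
    # Every link and every non-originating delivery is refused.
    return lambda ctx: False


def filter_destinations(
    policy: LinkPolicy,
    source_tier: Optional[str],
    destinations: dict[str, Optional[str]],
) -> list[str]:
    """Deliver-time filtering: drop refused destinations and fall back to
    the (hypothetical) "originating" sentinel when nothing survives."""
    kept = [
        name
        for name, tier in destinations.items()
        if policy(LinkPolicyContext(source_tier, tier, "deliver"))
    ]
    return kept or ["originating"]


if __name__ == "__main__":
    one_way = explicit_allow_list({("public", "corp")})
    print(one_way(LinkPolicyContext("public", "corp", "link")))
    print(one_way(LinkPolicyContext("corp", "public", "link")))
    print(filter_destinations(
        same_confidentiality_tier_only(),
        "public",
        {"telegram": "public", "activity": "corp"},
    ))
```

Keeping the policy a plain callable means a deployment can pass a lambda for a one-off rule, while the built-ins stay small factories. The sketch models only the allow/refuse decision. It leaves out the typed error raised on a refused `link`.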
+ +#### Multi-user conversations (Telegram groups, Teams group chats and channels) + +Telegram and Activity Protocol (Bot Service) both surface **multi-user conversations** alongside 1:1 chats — Telegram has private chats, groups, supergroups, forum topics inside supergroups, and broadcast channels; Activity Protocol has `conversationType` of `personal`, `groupChat`, and `channel` (a Teams team channel, with optional threaded `replyToId`). The hosting contract treats these uniformly, but channel implementations and host configuration both need to make a few explicit choices: + +**Identity vs. conversation are two axes, not one.** `ChannelIdentity.native_id` is always the **user** (`from.id` / `from.aadObjectId`); `ChannelRequest.conversation_id` is the **chat / channel / thread**. In 1:1 chats they collapse onto the same value (Telegram `chat.id == from.id`); in groups they don't and must not be conflated. The default `IdentityResolver` keys on `(channel, native_id)`, so a single user automatically gets one `isolation_key` whether they message in a group or in DM — that may or may not be what you want (see scoping below). + +**Conversation scoping policy.** A channel exposes a `conversation_scope` constructor option declaring how the host should derive the resolved `isolation_key` for multi-user surfaces. Three built-ins: + +| Scope | `isolation_key` derivation in multi-user conversations | When to pick it | +|---|---|---| +| `per_user` | The user's `isolation_key` from `IdentityResolver(ChannelIdentity)` only — group and DM share state. | Personal-assistant agents where the bot follows the user across surfaces and their preferences/memory should travel with them. Risky if the agent emits user-specific data in a public group. | +| `per_user_per_conversation` (default for multi-user) | `f"{user_isolation_key}:{conversation_id}"` — same user gets a different `isolation_key` per group / channel / topic / DM. | Default and safest. 
The agent's memory of a Teams team channel is separate from its memory of the same user's DM. | +| `per_conversation` | `f"_conv:{channel}:{conversation_id}"` — every member of the group shares one `isolation_key` and one `AgentSession`. The user identity is still attached to each turn (via `ChannelRequest.identity`) so the agent can address users by name, but session state is shared. | "Bot lives in this channel" deployments: meeting-notes bot, shared scratchpad, support-triage queue. | + +1:1 chats always derive `isolation_key` from the user identity alone — the per-user-per-conversation key would just include the user's own DM and add no isolation value. + +**Addressing rule.** Group surfaces typically don't want the bot replying to every message. Channels expose an `accept_in_group` constructor option: + +| Mode | Semantics | Default for | +|---|---|---| +| `mention_only` | Accept only messages that explicitly mention the bot (`@bot` for Telegram, `botname` mention entity for Teams). | Telegram groups, Teams `groupChat`, Teams team channels | +| `command_only` | Accept only registered `ChannelCommand` invocations (e.g. `/ask …`). | — | +| `mention_or_command` | Either of the above. | — | +| `all` | Accept every inbound message. | 1:1 chats; opt-in for groups when the agent really is the only conversational participant | + +Messages that don't satisfy the rule are ignored at the channel layer — no `ChannelRequest` is produced and the agent is never invoked. This is purely an inbound filter; outbound delivery (push / response routing) is unaffected. + +**Reply / `originating` routing.** The `originating` `ResponseTarget` always replies in the **same conversation** the request came from — including the same Teams team-channel thread (`replyToId`) or Telegram forum topic (`message_thread_id`). Channels carry the conversation-locator details on `ChannelRequest.conversation_id` (and additional fields on `ChannelRequest.attributes` when needed, e.g. 
`thread_id`); the channel's reply path reads them back. Channels that cannot reply in-thread (rare) fall back to a fresh top-level reply in the same conversation. + +**`ChannelPush` in groups.** When a non-`originating` `ResponseTarget` lands on a multi-user surface, the push must address a `(user, conversation)` pair: the host calls `ChannelPush.push(identity, payload)` where `identity.attributes` includes the recorded `conversation_id` (and thread/topic id when applicable) of the most recent observation under that scope. For `per_conversation` scope, every member's `ChannelIdentity` resolves to the same `isolation_key`, so the host instead picks the most recently observed `conversation_id` for that key and posts a single message to the conversation rather than fanning out to each user. + +**Linker ceremonies in groups.** OAuth and one-time-code link flows MUST NOT post the challenge URL or code into a group conversation visible to other users. Channels that support groups MUST detect group context (via `ChannelIdentity.attributes`) and, when `require_link=True` triggers a `LinkChallenge`, redirect the rendered challenge to the user's DM (Telegram: bot DM with the user; Teams: `personal` scope conversation with the same user). If a DM cannot be opened (Telegram user has not started the bot, Teams personal scope not installed), the channel returns a short prompt asking the user to DM the bot and retry. Verified-claim auto-link is unaffected — when a Teams `groupChat` request carries an AAD-verified `from.aadObjectId` that already matches an existing claim in the link store, the merge happens silently with no group-visible artifact. + +**Confidentiality tier interaction.** A Teams team channel post is visible to every member of the team; a 1:1 DM is not. Operators who care about the distinction MUST configure separate `Channel` instances (e.g. 
`ActivityChannel(scopes=["personal"], confidentiality_tier="user")` + `ActivityChannel(scopes=["channel", "groupChat"], confidentiality_tier="team")`) and apply a `LinkPolicy` so cross-tier `ResponseTarget` deliveries and identity links are filtered. The hosting layer does not infer tier from `conversationType`; it is an explicit deployment choice. + +**Telegram broadcast `Channel` (the Telegram product) and forum topics.** + +- *Broadcast Channels* — bots that are members of a Telegram broadcast Channel can post but generally do not receive user replies; treat as `ChannelPush`-only and configure with `accept_in_group="command_only"` so admin-issued commands (`/announce …`) are the only inbound trigger. Out of scope for v1; v1 ships group/supergroup support and leaves broadcast Channels for fast follow. +- *Forum topics* — supergroups with topics surface `message_thread_id`. The `TelegramChannel` populates `ChannelRequest.conversation_id` as `f"{chat_id}:{message_thread_id}"` so `per_user_per_conversation` and `per_conversation` scopes naturally separate topics from each other and from the group's general thread. + +**Activity Protocol specifics for `ActivityChannel`.** + +- `conversationType` mapping: `personal` → 1:1 (`accept_in_group="all"` rule applied), `groupChat` and `channel` → multi-user (default `mention_only`). +- Teams team channels carry both a channel id and an optional `replyToId`. The channel populates `conversation_id` as `f"{conversation.id}:{replyToId}"` when replying in-thread is desired (`per_user_per_conversation` scope makes thread-isolated sessions easy); deployments that prefer a single session per Teams channel can set `conversation_scope="per_conversation"` and the channel will key on `conversation.id` alone. +- `tenantId` is recorded on `ChannelIdentity.attributes` so multi-tenant deployments can implement an `IdentityResolver` that scopes `isolation_key` by tenant (or refuses unknown tenants). 
+- Adaptive-card submit (`Invoke` activities) flows are addressed in fast-follow alongside the `ActivityChannel` package; v1 of the host contract supports them via `ChannelRequest.forwarded_props`, so no host-level change is needed. + +**`ResponseTarget`** — directs **where** the host delivers the agent response. Independent of `session_mode`. + +| Variant | Constructor | Behavior | +|---|---|---| +| Originating | `ResponseTarget.originating` (default) | Synchronous response on the originating channel. | +| Active | `ResponseTarget.active` | Delivered to the channel most recently observed for the resolved `isolation_key`. | +| Specific channel (link-store recipient) | `ResponseTarget.channel("activity")` | Delivered via the named channel's `ChannelPush` to whichever channel-native identity is recorded for the resolved `isolation_key` in the link store. | +| Explicit identities | `ResponseTarget.identities([ChannelIdentity("telegram", native_id=""), ...])` | Delivered via each named channel's `ChannelPush` to the **caller-supplied channel-native identity** — bypasses the link store entirely. Used when the originating caller already knows the recipient's channel-native id (e.g. a server-side Responses caller relaying for a known user). The host still consults `LinkPolicy` for each delivery. Convenience alias: `ResponseTarget.identity(ChannelIdentity(...))` for the single-identity case. | +| Multiple channels | `ResponseTarget.channels(["telegram", "activity"])` | Delivered to each named channel (link-store recipient per channel). | +| All linked | `ResponseTarget.all_linked` | Delivered to every channel where the resolved `isolation_key` is known. | +| None | `ResponseTarget.none` | Background-only — caller must poll the `ContinuationToken`. Forces `background=True`. | + +When `response_target` is anything other than `originating`, the originating channel's protocol response is the **`ContinuationToken`** (e.g. 
an Invocations 202 with the token in the response body and/or a polling URL header), and the actual agent response is delivered out-of-band via the destination channel(s)' `ChannelPush`. If the destination channel doesn't implement `ChannelPush`, the host falls back per the configured policy (default: deliver to `originating`; surfaces a warning in telemetry). The configured `LinkPolicy` is consulted for every destination — destinations that fail the policy (e.g. a corp-tier channel addressed from a public-tier originating request) are dropped, and if every destination is dropped the host falls back to `originating`. + +**`ChannelPush`** (Protocol) — optional capability for channels that can deliver outbound messages without a prior request. + +| Method | Type | Description | +|---|---|---| +| `push(identity: ChannelIdentity, payload: HostedRunResult)` | async | Proactively delivers a completed run result to the given channel-native identity (Telegram proactive message, Activity Protocol proactive message via Bot Service `continueConversation`, webhook callback, SSE broadcast). Channels implement this in addition to `Channel`; channels that cannot push omit it. | + +**`ContinuationToken`** — first-class artifact for asynchronous / background runs. + +| Field | Type | Description | +|---|---|---| +| `token` | `str` | Opaque, URL-safe continuation token. The only field channels expose to callers; all other fields are implementation detail of the host's `HostStateStore`. Stable for the lifetime of the run record (until expiry / eviction). | +| `status` | `Literal["queued", "running", "completed", "failed"]` | Current status. | +| `isolation_key` | `str?` | The resolved isolation key the run is associated with. | +| `created_at` | `datetime` | Submission time. | +| `completed_at` | `datetime?` | Set when status is `completed` or `failed`. | +| `result` | `HostedRunResult?` | Populated on `completed`. | +| `error` | `str?` | Populated on `failed`. 
| +| `response_target` | `ResponseTarget` | The configured delivery target (recorded for diagnostics). | + +The host stores `ContinuationToken`s through a `HostStateStore` (see [Host state storage](#host-state-storage)). The v1 default is **`FileHostStateStore`** — one JSON file per token under a configurable directory (default `./.af-hosting/continuations/`), written atomically (`.tmp` + `os.replace`) so a host crash mid-write doesn't corrupt the record. This means background runs **survive host restarts**: a caller that polls `/responses/v1/{continuation_token}` after the process recycles still gets a valid status (and the result if the run had completed before the crash). Completed/failed entries are evicted by a configurable TTL (default 24h). `InMemoryHostStateStore` is available for tests / ephemeral hosts. Built-in channels expose poll routes that surface the token in their native shape (`/responses/v1/{continuation_token}` returns a Responses-shaped object; `/invocations/{continuation_token}` returns the Invocations status envelope). + +#### Host state storage + +`HostStateStore` is the single persistence seam for **host-execution metadata** that needs to outlive a single request: continuation tokens, identity-link grants, and last-seen `(isolation_key, channel)` records. It is deliberately separate from `ContextProvider` (per-conversation context) and `CheckpointStorage` (workflow checkpoints) because the data shapes are structurally different — but a deployment MAY back all three with the same physical store. + +| Method | Purpose | +|---|---| +| `put_continuation(token: ContinuationToken)` / `get_continuation(token: str)` / `delete_continuation(token: str)` | Background-run records. | +| `put_link_grant(grant: LinkGrant)` / `get_link_grant(code: str)` / `consume_link_grant(code: str)` | Pending identity-link grants (Entra OAuth state, one-time codes). 
| +| `record_last_seen(isolation_key: str, channel: str, identity: ChannelIdentity, ts: datetime)` / `get_last_seen(isolation_key: str)` | Backs `ResponseTarget.active`. | + +V1 ships two implementations: + +- **`FileHostStateStore(directory: Path = "./.af-hosting/")`** — default; one JSON file per record under `continuations/`, `link_grants/`, plus a `last_seen.json` keyed by isolation key. Atomic writes; per-namespace TTL cleanup (continuations 24h, link grants 15min, last-seen 30d by default). Suitable for single-node hosts and dev; works in hosted-agent environments where the working directory is persisted and isolated per agent. +- **`InMemoryHostStateStore()`** — testing / ephemeral; same protocol, no persistence. + +Pluggable v1-fast-follow implementations (Cosmos, SQL, Redis) plug into the same protocol — see req #23. + +**`ChannelCommand` / `ChannelCommandContext` / `CommandHandler`** — cross-channel native command model (per PR #5393). + +| Type | Fields | Description | +|---|---|---| +| `ChannelCommand` | `name`, `description`, `handle`, `expose_in_ui=True`, `metadata={}` | Transport-neutral command descriptor. | +| `ChannelCommandContext` | `session`, `state`, `raw_event`, `reply(...)`, `run(request)` | Runtime context for command handlers. | +| `CommandHandler` | `Callable[[ChannelCommandContext], Awaitable[None] \| None]` | Command implementation; may reply locally, mutate state, or invoke the agent. | + +**`HostedRunResult` / `HostedStreamResult`** — outbound results from the host. + +| Type | Fields | Description | +|---|---|---| +| `HostedRunResult` | `response: AgentResponse`, `session: AgentSession?`, `text` | One-shot outcome. | +| `HostedStreamResult` | `updates: ResponseStream[...]`, `raw_events: AsyncIterable[Any] \| None`, `session: AgentSession?` | Streaming outcome. 
`updates` is the **normalized** stream of `AgentRunResponseUpdate` (lossless for messages, function calls, usage) and is the happy path for Responses, Invocations, Telegram, and most channels. `raw_events` is an optional **passthrough seam** onto the underlying agent event stream (before update normalization) for channels whose protocol carries domain events the framework does not model — e.g. AG-UI's `StateSnapshotEvent` / `StateDeltaEvent` / `ToolCallStartEvent`. Channels that consume `raw_events` bear responsibility for the full event translation; the request still flows through `context.stream(...)` so session resolution, identity, push, and policy continue to apply. `None` when the host has no raw upstream (e.g. a workflow-only target produced from cached events). | + +The host does **not** emit protocol events directly — channels translate `HostedRunResult`/`HostedStreamResult` into Responses events, Invocations SSE, webhook callbacks, or platform messages. + +### Built-in channel constructors + +```python +class ResponsesChannel(Channel): + def __init__( + self, + *, + path: str = "/responses", + run_hook: ChannelRunHook | None = None, + expose_conversations: bool = True, + transports: Sequence[Literal["http", "websocket"]] = ("http",), + websocket_path: str = "/ws", + options: object | None = None, + ) -> None: ... + +class InvocationsChannel(Channel): + def __init__( + self, + *, + path: str = "/invocations", + run_hook: ChannelRunHook | None = None, + openapi_spec: dict[str, Any] | None = None, + ) -> None: ... + +class TelegramChannel(Channel): + def __init__( + self, + *, + bot_token: str, + transport: Literal["webhook", "polling"] = "webhook", + path: str = "/telegram", + run_hook: ChannelRunHook | None = None, + commands: Sequence[ChannelCommand] = (), + register_native_commands: bool = True, + require_link: bool = False, + ) -> None: ... 
+``` + +`options` on `ResponsesChannel` is intentionally loosely typed in this draft because the option-mapping boundary is still settling. If it becomes a formal type later, it should be Agent Framework-owned, not imported from `agentserver`. + +#### Conversation history for the Responses channel + +The Responses channel does **not** introduce its own history seam. Conversation history for every channel — Responses, Invocations, Telegram, Activity Protocol — flows through the agent's standard core `HistoryProvider` (`agent_framework._sessions.HistoryProvider`). The Responses channel is a *caller-supplied session* channel (see [Channel session-carriage models](#channel-session-carriage-models)): it parses `previous_response_id` (and/or `conversation_id`) off the inbound request and projects it into `ChannelSession.key`. The host then resolves an `AgentSession` for that key and the agent's `HistoryProvider` does the load / append exactly as it would for any other session. + +```text +POST /responses { "previous_response_id": "resp_018f…", "input": [...] } + -> ResponsesChannel parses previous_response_id + -> ChannelRequest.session = ChannelSession(key="resp_018f…") + -> host resolves AgentSession(id="resp_018f…") + -> agent.HistoryProvider.load_messages(session=…) # if load_messages=True + -> agent.run(input, session=…) + -> agent.HistoryProvider.save_messages(session=…, new_messages) + -> ResponsesChannel serializes the result with response_id="resp_018f…+1" +``` + +This means **any** AF `HistoryProvider` backs Responses out of the box — `FileHistoryProvider`, an in-memory provider, a future `CosmosHistoryProvider`, etc. The wire `previous_response_id` is just a session id with channel-defined formatting; nothing in the provider has to know "this is a Responses session". + +##### The Responses `store` parameter + +The OpenAI Responses API exposes a `store` boolean on every request. 
Its meaning in the official SDK is "service-side: persist this response so a later call can reference it via `previous_response_id`." In the hosting world this gets more interesting because there are **three** independent places a turn can end up persisted: + +- **Service-side** — the upstream provider's response store (e.g. OpenAI's hosted response store, accessible by `previous_response_id` against that provider directly). Controlled by the `store` flag on the agent's underlying `ChatClient` at construction time. +- **Hosted-agent storage** — the `HistoryProvider`(s) attached to the agent (`FileHistoryProvider`, `FoundryHostedAgentHistoryProvider`, in-memory, dual-write, …). Controlled by the host's `session_mode` directive, which `run_hook` can rewrite per request. +- **Caller-side** — the API caller keeps the `response_id` returned by the host and chains future calls with `previous_response_id`. Always available; out of host scope. + +These axes are **independent**. The same wire `store` value can land in any combination of them — or none — depending on (a) how the developer assembled the agent (`HistoryProvider` attached or not? `ChatClient` configured with its own `store=True` or not?) and (b) what the channel's `run_hook` does with the value. **The point of the matrix below is that `store` does not have a single canonical meaning at the hosted-agent layer — the developer of the hosted agent decides what it means.** + +| Caller sends | **Service-side** (underlying `ChatClient`'s own `store`) | **Hosted-agent storage** (agent's `HistoryProvider`) | **Caller-side** (caller chains `previous_response_id`) | +|---|---|---|---| +| `store=true` (or omitted; OpenAI default is `true`) | Writes **iff** the `ChatClient` was constructed to honor `store=true` against the upstream service. The host forwards the wire value into the chat client's options but does not look at it itself. 
| **Default:** loads and writes via the configured `HistoryProvider` (`session_mode="auto"`).
**Developer overrides** (via `run_hook`): `session_mode="disabled"` to suppress (compliance hold, ephemeral one-shots); `session_mode="required"` to fail closed if no session can be resolved instead of auto-issuing. | Always available — the host returns a chained `response_id` the caller may keep and re-send as `previous_response_id`. | +| `store=false` | Typically suppresses the service-side write — but the exact behavior depends on the `ChatClient` (some providers ignore the per-request flag, some honor it, some require a different opt-out). The host does not interpret it on the chat client's behalf. | **Default:** **still loads and writes** via the configured `HistoryProvider` — `store=false` is **not** auto-translated into a session-disable. The `HistoryProvider` is configured on the agent for app-level reasons (audit, replay, multi-channel continuity) the API caller has no business unilaterally overriding.
**Developer overrides** (via `run_hook`): `session_mode="disabled"` to **honor caller intent** (the path most apps that expose `store=false` as a real "stateless" guarantee will take); `session_mode="required"` (Scenario 3) to **ignore caller intent** and force host-managed sessions; conditional rules (e.g. honor `store=false` only from internal callers). | Always available, and the default fallback when both server-side surfaces are suppressed. | + +The same `store=false` request can therefore end up persisted in: + +- **service-side only** (chat client ignores the flag → service-side write happens; `HistoryProvider` not attached → no hosted-agent write), +- **hosted-agent storage only** (chat client honors the flag → no service-side write; `HistoryProvider` attached and `run_hook` does not override → host writes anyway), +- **both** (chat client ignores the flag → service-side write happens; `HistoryProvider` attached and not overridden → hosted-agent write also happens), +- **neither** (chat client honors the flag and `run_hook` translates it into `session_mode="disabled"` → only the caller's local copy exists). + +Three design properties fall out of this: + +1. **`store` is forwarded, not auto-mapped to host policy.** The caller's `store` value is forwarded into the chat client's options (where the upstream provider's own `store` semantics apply), but it is **not** translated into a `session_mode` directive against the agent's `HistoryProvider` by default. Collapsing the two (for example, to make `store=false` a real end-to-end "stateless" guarantee) is an explicit developer choice expressed in `run_hook`. +2.
**Documenting `store` semantics is a per-deployment responsibility.** Because the resolved persistence depends on three independent developer decisions, the meaning of `store=true` / `store=false` against any given hosted agent is something the deployment **must document for its callers** — there is no framework-level guarantee beyond "the wire value is forwarded to the chat client, and the host's `HistoryProvider` runs by default unless `run_hook` says otherwise." +3. **Richer storage vocabulary via `extra_body`.** A single boolean is often too coarse to express what a deployment actually wants to offer. The OpenAI Responses request envelope supports an `extra_body` mapping (the official Python SDK exposes it on every call as a passthrough into the request JSON); the `ResponsesChannel` parses unknown body keys onto `ChannelRequest.attributes`, so `run_hook` can read deployment-specific knobs from there and translate them into `session_mode`, the chat client's `store` flag, or anything else. Examples a deployment might expose: `extra_body={"af_store": "audit_only"}` to write to the `HistoryProvider` but suppress the service-side mirror; `{"af_store": "ephemeral"}` to skip both server-side surfaces; `{"af_store": "replay_safe"}` to force `session_mode="required"` and reject calls without a resolvable session. The framework does not standardize these names — they are part of the deployment's documented contract with its callers, on top of the standard `store` flag. + +##### `FoundryHostedAgentHistoryProvider` — Foundry-backed history + +For users who want the conversation persisted in the **same Foundry response store** that `azure.ai.agentserver.responses.store._foundry_provider.FoundryStorageProvider` writes to (so e.g. 
Foundry Workbench can replay the conversation, or other Foundry tools can introspect it), a new provider is added — proposed name `FoundryHostedAgentHistoryProvider` — implementing the standard `HistoryProvider` Protocol and built **on top of** the Foundry response-store SDK that ships in `azure.ai.agentserver` (so the wire contract, auth, and isolation headers stay aligned with the SDK without re-implementation). Shipped in `agent-framework-foundry-hosting`, attached the same way any other history provider is attached to an agent: + +```python +agent = Agent( + client=client, + history_provider=FoundryHostedAgentHistoryProvider( + endpoint=os.environ["FOUNDRY_ENDPOINT"], + load_messages=True, + ), +) + +host = AgentFrameworkHost(target=agent, channels=[ResponsesChannel()]) +``` + +The provider implements the standard `HistoryProvider` interface — there is no Responses-specific Protocol in between. It is also valid for any other channel (Telegram, Invocations, …) — Foundry storage simply becomes the chosen backend. + +Foundry's storage backend keys writes off two platform-injected request headers (`x-agent-user-isolation-key`, `x-agent-chat-isolation-key`) rather than the request body. The Responses and Invocations channels parse both headers off the inbound request and forward them as an opaque mapping on `ChannelRequest.attributes["isolation"]` (`{"user_key", "chat_key"}`); the host's per-request `bind_request_context` then passes that value to `FoundryHostedAgentHistoryProvider.bind_request_context(isolation=...)`, which the provider applies to its storage calls. Channels never import `IsolationContext`; the provider accepts both an `IsolationContext` instance and a plain mapping. When the headers are absent (local dev outside the Hosted Agents runtime) the attribute is omitted and storage falls back to non-isolated reads/writes, so the same code path works in both environments. 
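The header-to-attribute flow described above can be sketched in a few lines. This is a minimal illustration rather than the shipped implementation: `isolation_from_headers`, `build_attributes`, `forward_to_providers`, and `RecordingProvider` are hypothetical names, and treating a request that carries only one of the two headers as "absent" is an assumption the spec does not state.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping
from typing import Any

USER_HEADER = "x-agent-user-isolation-key"
CHAT_HEADER = "x-agent-chat-isolation-key"


def isolation_from_headers(headers: Mapping[str, str]) -> dict[str, str] | None:
    """Return the opaque isolation mapping, or None when the headers are missing."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    user_key = lowered.get(USER_HEADER)
    chat_key = lowered.get(CHAT_HEADER)
    # Assumption: a partial pair is treated the same as no isolation headers.
    if not user_key or not chat_key:
        return None
    return {"user_key": user_key, "chat_key": chat_key}


def build_attributes(headers: Mapping[str, str]) -> dict[str, Any]:
    """Build the attributes mapping a channel would place on the request."""
    attributes: dict[str, Any] = {}
    isolation = isolation_from_headers(headers)
    if isolation is not None:
        # The key is omitted entirely when running outside the hosted runtime.
        attributes["isolation"] = isolation
    return attributes


class RecordingProvider:
    """Hypothetical stand-in for a provider; it only records what it was bound to."""

    def __init__(self) -> None:
        self.isolation: dict[str, str] | None = None

    def bind_request_context(
        self, *, isolation: Mapping[str, str] | None = None, **_: Any
    ) -> None:
        # Unknown attributes land in **_ and are ignored for forward-compat.
        self.isolation = dict(isolation) if isolation is not None else None


def forward_to_providers(
    providers: Iterable[RecordingProvider], attributes: Mapping[str, Any]
) -> None:
    """Forward every request attribute as a keyword argument to each provider."""
    for provider in providers:
        provider.bind_request_context(**attributes)
```

With both headers present, `build_attributes` yields `{"isolation": {"user_key": ..., "chat_key": ...}}`; with none, it yields an empty mapping and the provider keeps `isolation=None`, which corresponds to the non-isolated local-dev path.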
+ +##### Multi-provider composition + +The existing AF convention applies: an agent may compose **multiple** `HistoryProvider`s, but **only one** carries `load_messages=True`. Common patterns: + +- *Single store.* `FileHistoryProvider(load_messages=True)` — local dev. Or `FoundryHostedAgentHistoryProvider(load_messages=True)` — Foundry-backed prod. +- *Audit dual-write.* `FoundryHostedAgentHistoryProvider(load_messages=True)` + `CosmosHistoryProvider(load_messages=False)` — Foundry is the source of truth used to reconstruct context for the LLM; Cosmos receives a write-only audit copy. +- *Mirror to Foundry for Workbench replay only.* Conversely, an in-house store can hold `load_messages=True` while `FoundryHostedAgentHistoryProvider(load_messages=False)` mirrors writes into Foundry purely so the conversation shows up in Foundry tooling. + +The choice of where to store, and whether to dual-write, is fully the developer's. The channel does not need to know which backing store(s) the agent is using. + +#### Channel-owned per-thread state + +Some channel protocols carry **non-message** durable state attached to the conversation — most notably AG-UI's per-thread `state` object, mutated mid-stream via `StateSnapshotEvent` / `StateDeltaEvent` (JSON-Patch-shaped) and read by the front-end on the next turn. This is *not* message history, so it does not belong on `HistoryProvider`; but it has the same lifetime, isolation, and "opaque to the host" properties as messages, so the framework already has the right primitive: **`ContextProvider`**. + +`HistoryProvider` is only one concrete `ContextProvider` (the one that uses the per-source `state: dict[str, Any]` slot to hold messages). Channels with non-message per-thread state SHOULD ship their own `ContextProvider` subclass and write into the same per-source `state` slot. 
+ +Sketch (for AG-UI; the same pattern applies to any event-rich front-end): + +```python +from agent_framework import ContextProvider, ContextProviderState + +class AgUiStateProvider(ContextProvider): + """Per-thread non-message state for AG-UI front-ends. + + Persists the AG-UI ``state`` object scoped by ``source_id`` (the + AgentSession id). Reads from ``ChannelRequest.client_state`` before + the run, exposes the current value to the agent via Context, and lets + the channel diff it after the run to emit StateSnapshotEvent / + StateDeltaEvent on the wire. + """ + + state_key = "ag_ui_state" # slot in the per-source state dict + + async def before_run(self, context, *, source_id, **kw): + slot = context.state.setdefault(source_id, {}) + # If the request supplied a fresh client_state, seed/replace it. + if (incoming := context.request.client_state) is not None: + slot[self.state_key] = dict(incoming) + # Expose the live value to the agent (e.g. into context.metadata). + + async def after_run(self, context, *, source_id, **kw): + # The current value lives in context.state[source_id][self.state_key]; + # the channel reads it and emits StateSnapshotEvent / StateDeltaEvent. + ... +``` + +Composition rules are unchanged: one `HistoryProvider` carries `load_messages=True`, additional `ContextProvider`s (including `AgUiStateProvider`) attach alongside. Backing storage is whatever the user wires — in-memory for dev, the same physical store as messages for prod. **No new storage protocol is introduced for channel state**; it shares the same per-source state slot that `HistoryProvider` uses. + +#### Storage taxonomy + +To make the picture explicit: there are exactly three distinct *storage seams* in the hosting design, each with a clear scope. The first two are usually backed by the same physical store the user wires; they stay distinct as protocols because the data shapes differ. 
+ +| Seam | Scope | Examples | +|---|---|---| +| **`ContextProvider`** (per-conversation) | Per-`source_id` data the agent needs at run time. Messages (via `HistoryProvider`), AG-UI per-thread state (via `AgUiStateProvider`), or any future per-conversation extension. **The only public per-conversation seam.** | `FileHistoryProvider`, `FoundryHostedAgentHistoryProvider`, `AgUiStateProvider` | +| **Host-level pluggable store** (per-host) | `ContinuationToken`s for background runs, identity-link grants, last-seen `(isolation_key, channel)` records. **File-based by default** in v1 (`FileHostStateStore`, atomic JSON writes under `./.af-hosting/`); `InMemoryHostStateStore` for tests; pluggable for Cosmos / SQL / Redis adapters in v1 fast follow (req #23). MAY be backed by the same physical store as `ContextProvider`, but the protocol is distinct because the data is host-execution metadata, not per-conversation context. | `FileHostStateStore` (v1 default), `InMemoryHostStateStore`, future Cosmos / SQL / Redis adapters | +| **`CheckpointStorage`** (workflow runtime) | Workflow executor frames so a workflow can resume after process restart. Structurally distinct from both seams above (the data is workflow-runtime state, not session/identity state). MAY share a physical backend, but the protocol stays separate. | `FileCheckpointStorage`, future `CosmosCheckpointStorage` | + +Concretely, this means an app deploying onto e.g. Foundry storage can run **all three** against the same Foundry backend and still have three orthogonal protocol surfaces — one per concern — instead of one universal store everything accidentally collides in. + +Channels surface per-request transport state (response ids, isolation keys, future signals) on `ChannelRequest.attributes`; the host's `bind_request_context` forwards those attributes as kwargs to each `ContextProvider.bind_request_context` call so providers can apply them to their reads and writes. 
Providers SHOULD accept `**_` to ignore unknown attributes for forward-compat. This keeps channel↔provider coupling to a documented attribute name (e.g. `"isolation"`) instead of requiring providers to install ASGI middleware. + +The `ResponsesChannel` exposes both an HTTP transport (`{path}/v1/...`) and an optional **WebSocket transport** (`{path}{websocket_path}`, default `/responses/ws`) controlled by `transports`. The WS transport carries the same Responses request/event model as the HTTP+SSE variant — clients open a single connection per conversation and send/receive Responses frames as JSON messages. Both transports go through the same `run_hook`, the same default mapping, and the same `ChannelRequest` shape; the channel codec is responsible for framing only. Auth is reused from the HTTP transport (Authorization header on the `Upgrade` request); subprotocol negotiation is open (see Open Questions). + +### Default invocation behavior by channel + +Each built-in channel owns a **default** mapping from its protocol request model into a `ChannelRequest`. That mapping flows through the optional `run_hook` before the host resolves session behavior and invokes the target. + +| Channel | Default mapping | +|---|---| +| `ResponsesChannel` | Forwards relevant caller settings (e.g. `temperature`, `store`) into `ChannelRequest.options` so the underlying chat client receives them; **does not** map `store=false` to `session_mode="disabled"` by default — see [The Responses store parameter](#the-responses-store-parameter) for the full matrix and the developer-override path. The same default mapping is used for both HTTP and WebSocket transports — WS frames are decoded into the same Responses request model before invocation. | +| `InvocationsChannel` | Maps the request body into `input`, `options`, and session behavior for the hosted target. | +| `TelegramChannel` | Maps incoming messages or commands into `input`, `stream`, and session defaults appropriate for the chat. 
| + +### ASGI server portability + +The hosting architecture is coupled to **ASGI/Starlette**, not to **Uvicorn** specifically. + +- `host.app` is the canonical portability surface. +- `host.serve(...)` is only the default convenience path (lazy-imports `uvicorn`). +- Because `host.app` is a standard Starlette/ASGI app, it can run on Hypercorn, Daphne, Granian, or Gunicorn-with-Uvicorn-workers. +- ASGI **WebSocket** scope/frames are first-class: any channel may contribute `WebSocketRoute`s alongside HTTP routes, and the chosen ASGI server must support the WebSocket scope (Uvicorn, Hypercorn, Daphne, and Granian all do). + +The packaging question for `uvicorn` (required dependency vs optional extra) is therefore a **convenience choice**, not an architectural constraint. See Open Questions. + +### Error Responses + +| Status | Condition | Notes | +|---|---|---| +| `400 Bad Request` | Channel-specific protocol validation failure | Owned by the channel codec. | +| `401 Unauthorized` / `403 Forbidden` | Channel-specific auth/signature validation failure | Owned by channel middleware (e.g. Telegram secret token, Invocations auth). | +| `404 Not Found` | Route not contributed by any channel | Standard Starlette behavior. | +| `409 Conflict` | Session-resolution conflict with `session_mode="required"` and no resolvable session | Host-level. | +| `422 Unprocessable Entity` | `run_hook` raised a validation error | Channel surfaces the hook's error per protocol conventions. | + +## Terminology + +- **Host** (`AgentFrameworkHost`): The Python object that owns one Starlette app, one **hostable target** (an agent or a workflow), and a sequence of channels. Provides `host.app` (canonical ASGI surface) and `host.serve(...)` (uvicorn convenience). Named `AgentFrameworkHost` rather than `AgentHost` because the target is not restricted to agents. +- **Hostable target**: The executable object the host fronts — either a `SupportsAgentRun`-compatible agent or a `Workflow`. 
The host detects the kind and dispatches to the appropriate execution seam; channels remain unchanged. +- **Channel**: A pluggable component that contributes routes, middleware, commands, and lifecycle hooks to a host. One channel = one external protocol surface (Responses, Invocations, Telegram, …). Used interchangeably with "head" in earlier discussions; **Channel** is the canonical name. +- **`ChannelRequest`**: The host-neutral, normalized invocation envelope produced by a channel before the host calls the target's execution seam. Carries `input`, `options`, `session`, `session_mode`, and channel-specific `attributes`. +- **`ChannelSession`**: A small session hint with a stable `key`, an optional protocol-visible `conversation_id`, and an opaque `isolation_key`. The host resolves it into an `AgentSession`; storage specifics are deferred. +- **`isolation_key`**: An opaque partition boundary aligned with hosted-agent terminology — may represent a user, tenant, chat, or other scope without baking direct identity semantics into the generic host. +- **Channel-native identity** (`ChannelIdentity`): The **user/account** identifier the channel observes from its own platform (Telegram `from.id`, Teams `from.aadObjectId`, WhatsApp phone number, Slack user id). Always per-channel; never assumed to align across channels. Distinct from the **conversation locator** (`ChannelRequest.conversation_id` / `ChannelSession.conversation_id`) — in multi-user surfaces (Telegram groups, Teams group chats and channels) the two never coincide. See [Multi-user conversations](#multi-user-conversations-telegram-groups-teams-group-chats-and-channels). +- **`IdentityResolver`**: Host-level callable that maps a `ChannelIdentity` to an `isolation_key`. 
The default resolver **auto-issues** a fresh, stable `isolation_key` the first time a `(channel, native_id)` pair is seen and persists it in the host's identity store, so every end user automatically gets a per-user partition on first contact through any channel, without app code. Linking (see `IdentityLinker`) **merges** the second channel's auto-issued key onto the first channel's `isolation_key`, so cross-channel continuity is a one-shot operation, not a per-channel mapping hook. Apps that already own an identity namespace (corporate user id, tenant-scoped account id) can supply a custom resolver that returns those values directly. +- **`IdentityLinker`**: Host-level component that runs a connect ceremony (typically OAuth, MFA, or a signed one-time code) to associate a new `ChannelIdentity` with an existing `isolation_key`. Contributes its own routes (e.g. OAuth callback) and lifecycle to the host. A built-in `link`/`connect` `ChannelCommand` is exposed automatically when a linker is configured. On successful ceremony completion, the linker also stores any verified IdP claim recovered from the proof (e.g. Entra ID `oid`) so subsequent channels that supply the same claim can be auto-merged onto the same `isolation_key` silently. Combined with `Channel(require_link=True)`, this enables an "authenticate before chatting" enforcement model where the first channel forces the OAuth ceremony and every other channel using the same IdP joins the same session without a second `/link`. +- **`LinkChallenge`**: The protocol-neutral artifact returned by `IdentityLinker.begin(...)` describing what the user must do to complete the ceremony, typically one of: a URL to visit (OAuth), a short code to enter on the other channel (one-time code), or an MFA prompt. +- **`ResponseTarget`**: Per-request directive on `ChannelRequest` controlling **where** the response is delivered: `originating` (default), `active`, a specific channel, explicit channel-native identities, a list of channels, `all_linked`, or `none`.
Independent of `session_mode`. +- **`ChannelPush`**: Optional channel capability for proactive outbound delivery — Telegram proactive message, Activity Protocol proactive message via Azure Bot Service, webhook callback, SSE broadcast. Required to be the destination of a non-`originating` `ResponseTarget`. +- **Active channel**: The channel most recently observed for a given `isolation_key`. Tracked by the host on every successfully resolved request; consumed by `ResponseTarget.active`. +- **`ContinuationToken`**: First-class artifact for background/asynchronous runs, returned immediately from `host.run_in_background(request)`. Carries an opaque, URL-safe `token` plus `status`, `isolation_key`, `result`/`error`, and the configured `response_target`. Persisted via `HostStateStore` (file-based by default in v1) so background runs survive host restarts. Host pushes the result to the response target when ready and serves it via channel poll routes. +- **Background run**: A `ChannelRequest` submitted via `host.run_in_background(request)` (or any request with `background=True`). The originating call returns a `ContinuationToken` immediately; the response is delivered later via the configured `ResponseTarget` and/or polled by token. +- **`HostStateStore`**: Single persistence seam for host-execution metadata — continuation tokens, identity-link grants, last-seen records. V1 default `FileHostStateStore` (atomic JSON writes under `./.af-hosting/`); `InMemoryHostStateStore` for tests; pluggable for Cosmos / SQL / Redis (fast follow, req #23). Distinct from `ContextProvider` (per-conversation) and `CheckpointStorage` (workflow), but a deployment MAY back all three with the same physical store. +- **`session_mode`**: Per-request directive (`auto` | `required` | `disabled`) that controls whether the host resolves a session before invoking the target. Lets `run_hook`s express explicit policy — e.g. 
translating Responses `store=false` into `session_mode="disabled"` to honor the caller's "don't store" intent at the `HistoryProvider` layer (the channel does not do this automatically — see [The Responses store parameter](#the-responses-store-parameter)). +- **`confidentiality_tier`** (channel-level): Opaque label (`"corp"`, `"public"`, `"internal"`, …) declared on a `Channel` and consumed by the host's `LinkPolicy`. Two channels with different confidentiality tiers can share an agent target on one host while remaining session-isolated. +- **`LinkPolicy`**: Host-level decision over which channel pairs may share an `isolation_key` (link) and which channel pairs may be `ResponseTarget` source/destination for one another (deliver). Built-in variants: allow-all (default), same-tier-only, explicit allow-list, deny-all. See [LinkPolicy and confidentiality_tier](#linkpolicy-and-confidentiality_tier) for the full contract and built-ins table. +- **`ChannelContribution`**: What a channel returns from `contribute(...)` — routes, middleware, commands, and `on_startup`/`on_shutdown` hooks. The host aggregates contributions into one Starlette app. +- **`ChannelCommand`**: A transport-neutral command descriptor (`name`, `description`, `handle`). Message channels project these into native command surfaces — Telegram bot commands, future Activity Protocol slash commands / adaptive cards, WhatsApp menus. +- **`ChannelRunHook`**: Per-request callable on built-in channels. Runs after the channel's default `ChannelRequest` is produced, before session resolution. The escape hatch for forcing or forbidding session use, requiring extra options, adapting to targets like `A2AAgent`, **and** reshaping a channel's free-form input into the typed inputs a `Workflow` target expects. +- **Native command registration**: The startup-time projection of `ChannelCommand` metadata into a platform's native command catalog (e.g. Telegram `set_my_commands(...)`). 
+- **`SupportsAgentRun`**: The existing framework agent execution seam (`run(..., session=..., stream=...)`) — the contract the host uses when the hostable target is an agent. +- **`Workflow`**: The framework workflow execution seam — the contract the host uses when the hostable target is a workflow. The host wraps the workflow's outputs into the same `HostedRunResult` / `HostedStreamResult` shape so channels do not need to distinguish. + +## Hero Code Samples + +> **Common prerequisite:** Every sample below calls `host.serve(...)`, which lazy-imports `uvicorn`. Install `uvicorn` (e.g. `pip install uvicorn`) — or the corresponding `agent-framework-hosting[serve]` extra if the package ships one (see Open Question #2) — alongside the per-sample dependencies listed in each scenario's **Prerequisites** block. Samples that hand `host.app` directly to another ASGI server (Hypercorn, Daphne, Granian) do not require `uvicorn`; Gunicorn with uvicorn workers still does, because the worker class depends on `uvicorn`. + +### Scenario 1: Expose one agent on the Responses API + +A developer has an agent and wants to expose it as the OpenAI-compatible Responses API on `localhost:8000` with no manual server bootstrap. + +> **Prerequisites:** This sample assumes: +> - `agent-framework-hosting` and `agent-framework-hosting-responses` are installed +> - An `OPENAI_API_KEY` is available in the environment + +```python +from agent_framework import Agent +from agent_framework.openai import OpenAIChatClient +from agent_framework.hosting import AgentFrameworkHost, ResponsesChannel + +agent = Agent( + name="WeatherAgent", + instructions="You are a helpful weather agent.", + client=OpenAIChatClient(model="gpt-4.1-mini"), +) + +host = AgentFrameworkHost( + target=agent, + channels=[ResponsesChannel()], +) + +if __name__ == "__main__": + host.serve(host="localhost", port=8000) +``` + +This exposes the Responses routes under `/responses/v1`. No manual `uvicorn` import, no protocol handlers written by the user.
+ +### Scenario 2: Expose Responses + Invocations on one host with shared Starlette middleware + +Same agent, both protocols, with CORS applied at the host level. + +> **Prerequisites:** This sample assumes: +> - `agent-framework-hosting`, `-responses`, and `-invocations` are installed +> - A Foundry project with a `gpt-4.1` model deployment + +```python +from azure.identity import AzureCliCredential +from starlette.middleware import Middleware +from starlette.middleware.cors import CORSMiddleware + +from agent_framework import Agent +from agent_framework.foundry import FoundryChatClient +from agent_framework.hosting import AgentFrameworkHost, InvocationsChannel, ResponsesChannel + +agent = Agent( + name="TravelAgent", + instructions="Help users plan travel and keep answers concise.", + client=FoundryChatClient( + project_endpoint="https://my-project.services.ai.azure.com/api/projects/travel", + model="gpt-4.1", + credential=AzureCliCredential(), + ), +) + +host = AgentFrameworkHost( + target=agent, + channels=[ + ResponsesChannel(), # -> /responses/v1 + InvocationsChannel(), # -> /invocations/invoke + ], + middleware=[ + Middleware( + CORSMiddleware, + allow_origins=["https://chat.contoso.com"], + allow_methods=["*"], + allow_headers=["*"], + ), + ], +) + +# Hand the canonical ASGI app to any server, or use the convenience method. +app = host.app # for Hypercorn / Granian / Gunicorn+uvicorn workers +host.serve(host="localhost", port=8000) +``` + +### Scenario 3: Per-request run hook on the Responses channel + +The developer wants to enforce that every Responses call sets `temperature`, and to **harden** session handling by forcing `session_mode="required"` (fail if no session can be resolved). The hook deliberately ignores caller `store=false`, since the channel's default already keeps the agent's `HistoryProvider` active regardless of that wire flag (see [The Responses store parameter](#the-responses-store-parameter)).
None of this is part of the official Responses spec, but all of it is valid app policy. + +> **Prerequisites:** This sample assumes: +> - The Responses channel is wired into an `AgentFrameworkHost` (see Scenario 1) + +```python +from dataclasses import replace + +from agent_framework.hosting import ( + AgentFrameworkHost, + ChannelRequest, + ResponsesChannel, +) + + +def responses_policy(request: ChannelRequest, **kwargs) -> ChannelRequest: + if request.options is None or request.options.temperature is None: + raise ValueError("This host requires temperature on every Responses call.") + + # Harden session handling: even when the caller sends store=false, keep host-managed + # sessions and fail closed instead of auto-issuing. The HistoryProvider would already + # run under the default "auto" mode; "required" upgrades that to a hard error if no + # session can be resolved (e.g. missing previous_response_id and no resolver match). + return replace(request, session_mode="required") + + +host = AgentFrameworkHost( + target=agent, + channels=[ResponsesChannel(run_hook=responses_policy)], +) +host.serve(host="localhost", port=8000) +``` + +The hook runs **after** the channel produces its default `ChannelRequest` and **before** the host resolves session behavior and calls `SupportsAgentRun.run(...)`. The same shape works to adapt to targets like `A2AAgent` — strip or remap channel-derived options that the target does not consume. + +### Scenario 4: Telegram channel with native command catalog (polling) + +A developer wants to expose the same agent as a Telegram bot, with first-class native commands (`/start`, `/new`, `/sessions`, …) registered into Telegram's command menu at startup. Modeled after PR #5393. 
+ +> **Prerequisites:** This sample assumes: +> - `agent-framework-hosting-telegram` is installed +> - `TELEGRAM_BOT_TOKEN` is set in the environment + +```python +import os + +from agent_framework.hosting import ( + AgentFrameworkHost, + ChannelCommand, + ChannelCommandContext, + TelegramChannel, +) + + +async def handle_start(context: ChannelCommandContext) -> None: + await context.reply( + "Hi! Commands: /new, /sessions, /todo, /memories, /reminders, " + "/resume, /cancel, /reasoning, /tokens." + ) + + +async def handle_noop(context: ChannelCommandContext) -> None: + await context.reply("Command received.") + + +TELEGRAM_COMMANDS = [ + ChannelCommand("start", "Introduce the bot", handle_start), + ChannelCommand("new", "Start a new local session", handle_noop), + ChannelCommand("sessions", "List local sessions", handle_noop), + ChannelCommand("todo", "List todos for the active session", handle_noop), + ChannelCommand("memories", "List memory topics for the active session", handle_noop), + ChannelCommand("reminders", "List reminders for the active session", handle_noop), + ChannelCommand("resume", "Resume the latest pending or previous session", handle_noop), + ChannelCommand("cancel", "Cancel the active response", handle_noop), + ChannelCommand("reasoning", "Toggle the transient reasoning preview", handle_noop), + ChannelCommand("tokens", "Toggle token usage details", handle_noop), +] + +telegram = TelegramChannel( + bot_token=os.environ["TELEGRAM_BOT_TOKEN"], + transport="polling", + commands=TELEGRAM_COMMANDS, + register_native_commands=True, +) + +host = AgentFrameworkHost(target=agent, channels=[telegram]) +host.serve(host="localhost", port=8000) +``` + +This mirrors the important shape from PR #5393: command metadata is declared once, the channel registers it into Telegram's native menu at startup (`set_my_commands(...)`), and runtime command dispatch stays channel-local. 
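
The "declare once, project twice" shape described above can be sketched without any Telegram dependency. This is a minimal illustration under stated assumptions: `Command`, `native_menu`, and `dispatch` are hypothetical stand-ins, not the `ChannelCommand` API from this spec, and handlers here take a plain argument string rather than a `ChannelCommandContext`. The menu entries use the `command` / `description` field names of Telegram's `BotCommand` object.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass(frozen=True)
class Command:
    """Hypothetical stand-in for ChannelCommand (name, description, handle)."""

    name: str
    description: str
    handle: Callable[[str], Awaitable[str]]


async def handle_start(args: str) -> str:
    return "Hi! Try /new or /sessions."


async def handle_new(args: str) -> str:
    return "Started a new session."


COMMANDS = [
    Command("start", "Introduce the bot", handle_start),
    Command("new", "Start a new local session", handle_new),
]


def native_menu(commands: list[Command]) -> list[dict[str, str]]:
    """Startup-time projection: metadata only, shaped like Telegram BotCommand entries."""
    return [{"command": c.name, "description": c.description} for c in commands]


async def dispatch(commands: list[Command], text: str) -> str | None:
    """Runtime dispatch: route '/name args' to its handler; None means 'not a command'."""
    if not text.startswith("/"):
        return None
    name, _, args = text[1:].partition(" ")
    name = name.split("@", 1)[0]  # group chats may send '/name@botname'
    table = {c.name: c for c in commands}
    command = table.get(name)
    if command is None:
        return f"Unknown command: /{name}"
    return await command.handle(args)
```

Both projections read the same list, so adding a command means touching one declaration; the menu never sees the handlers and the dispatcher never sees the descriptions.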
+ +### Scenario 5: Telegram webhook mode on the same host as Responses + Invocations + +Same agent, three channels, one Starlette app, one process. + +> **Prerequisites:** Same as Scenario 4, plus a public HTTPS URL for the webhook. + +```python +host = AgentFrameworkHost( + target=agent, + channels=[ + ResponsesChannel(), # -> /responses/v1 + InvocationsChannel(), # -> /invocations/invoke + TelegramChannel( + bot_token=os.environ["TELEGRAM_BOT_TOKEN"], + transport="webhook", # -> /telegram/webhook + commands=TELEGRAM_COMMANDS, + ), + ], +) + +host.serve(host="0.0.0.0", port=8000) +``` + +Webhook transport contributes `/telegram/webhook` by default; the command catalog remains identical to the polling sample. + +### Scenario 6: Linking a new channel to an existing identity via OAuth + +A developer wants every Telegram chat to be **authenticated up front** via OAuth (Microsoft Entra ID) before the agent will respond, and wants Teams chats from the same Entra ID user to be **auto-linked** to the existing session — no second `/link` ceremony, just sign in once on the first channel and the rest follow automatically. This delivers cross-channel chat continuity as a side-effect of identity linking; Scenario 7 covers the alternative pattern where a trusted server-side relay supplies identity directly without a link ceremony. + +> **Prerequisites:** This sample assumes: +> - `agent-framework-hosting`, `agent-framework-hosting-telegram`, and the (future) `agent-framework-hosting-activity` channel are installed +> - An OAuth provider is configured (Microsoft Entra ID in this example) + +```python +import os + +from agent_framework.hosting import ( + AgentFrameworkHost, + OAuthIdentityLinker, + TelegramChannel, +) + + +# The OAuth linker contributes its own /identity/oauth/microsoft/{start,callback} +# routes to the host. 
On successful completion, the host's built-in identity +# store atomically records BOTH the originating channel-native identity AND the +# verified IdP claim (Entra ID object id) so future channels that authenticate +# the same IdP account can auto-link without a second ceremony. +linker = OAuthIdentityLinker( + provider="microsoft", + client_id=os.environ["AAD_CLIENT_ID"], + client_secret=os.environ["AAD_CLIENT_SECRET"], +) + +host = AgentFrameworkHost( + target=agent, + identity_linker=linker, + channels=[ + # require_link=True gates the channel: any inbound message from an + # un-linked ChannelIdentity is short-circuited to a LinkChallenge reply + # instead of being dispatched to the agent. + TelegramChannel( + bot_token=os.environ["TELEGRAM_BOT_TOKEN"], + transport="webhook", + require_link=True, + ), + # ActivityChannel(app_id=..., require_link=True), # future — same flag + ], +) +host.serve(host="0.0.0.0", port=8000) +``` + +The flow: + +1. `alice` sends her first message on Telegram. The `TelegramChannel` extracts `ChannelIdentity(channel="telegram", native_id="")` and asks the linker `is_linked(...)`. It is not. Because `require_link=True`, the channel does **not** invoke the agent; instead it asks `linker.begin(channel_identity)` for a `LinkChallenge`, renders the challenge URL into Telegram (clickable button), and returns. +2. `alice` clicks the button, signs in with Microsoft Entra ID, and the OAuth callback hits the linker's route. `linker.complete(...)` verifies the authorization code and records **two things atomically** in the identity store: + - `(channel="telegram", native_id="") → isolation_key="hk_018f…a3"` + - `verified_claim("microsoft.oid", "") → isolation_key="hk_018f…a3"` +3. `alice` replies on Telegram. The channel sees the link is now present, resolves the existing `isolation_key`, and forwards the message to the agent normally. From here on, Telegram chats are routed without further ceremony. +4. The next day, `alice` opens Teams. 
The `ActivityChannel` extracts both the channel-native identity (`activity`, ``) **and** the verified IdP claim from the inbound activity (Teams already authenticates with Entra ID via Bot Service, so the AAD object id is trusted). It asks the linker `is_linked(...)`. The `(activity, )` pair is **not** in the store — but the verified claim `("microsoft.oid", "")` **is**. The linker auto-merges `(activity, ) → isolation_key="hk_018f…a3"` without any user-visible `/link` ceremony. +5. From the next turn on, both Telegram and Teams resolve to the **same** `isolation_key` and the **same** `AgentSession`. The agent sees the conversation history from both channels as one continuous thread. + +The two enabling pieces: + +- **`require_link: bool` on the channel** — when `True`, the channel checks the linker before dispatching every inbound request. Un-linked identities are short-circuited to a rendered `LinkChallenge` instead of an agent invocation. Default is `False` (the opportunistic flow below). +- **Verified IdP claims in the linker's identity store** — when an OAuth ceremony completes, the linker records the verified identity claim (e.g. `(microsoft.oid, )`) alongside the channel-native identity. Channels that can supply the same kind of verified claim from their own auth context (Teams via the AAD bearer on the activity, future M365 channels via the same bearer, …) get **auto-linked silently** on first contact when their claim matches an existing entry. This is what makes "sign in once on Telegram, Teams just works" possible without any per-channel link ceremony. + +**Variant — opportunistic linking (`require_link=False`).** Leave the flag at its default and the channel will dispatch un-linked identities straight to the agent (the host's default resolver auto-issues a fresh `isolation_key` for them). The user can later run the `link` `ChannelCommand` manually to merge that auto-issued key onto an existing one. 
This is the lower-friction onboarding flow at the cost of allowing pre-link conversations to exist in their own isolated session until merged. + +**Variant — alternative ceremony.** Swapping the linker for `OneTimeCodeIdentityLinker(...)` changes the ceremony to "complete `/link` on channel A, get a 6-digit code, run `/link 482931` on channel B"; with `require_link=True` the channel just renders the code-entry instructions instead of an OAuth URL. Apps with their own corporate identity namespace can additionally pass a custom `identity_resolver` so the post-link `isolation_key` is the corporate user id instead of the host-issued opaque key. Channels themselves are unchanged across these variants — only the linker and (optionally) the resolver change. + +### Scenario 7: Trusted server-side caller relays a Responses request and pushes the answer back to the user's Telegram chat + +A developer runs an internal application server that already knows its end users (e.g. via an SSO session) and wants to expose **two surfaces against the same agent**: the OpenAI-compatible **Responses API** (so the application backend can drive the agent programmatically on behalf of the signed-in user) and **Telegram** (so the same end user can also chat with the agent directly). When the application backend submits a Responses call, it should be possible to (a) link that call to the same `isolation_key` as the user's existing Telegram chats — so the agent sees one continuous conversation history — and optionally (b) have the agent's response pushed back to the user's Telegram chat instead of (or in addition to) being returned synchronously on the Responses HTTP call. 
+ +This works **without** an `IdentityLinker` because the application backend is a **trusted relay**: it already authenticated the user through its own SSO and knows both the user's app-internal id and (because the user has previously connected their Telegram account in the application's own settings page) the user's Telegram `chat_id`. The host just needs to be told. + +> **Prerequisites:** This sample assumes: +> - `agent-framework-hosting`, `agent-framework-hosting-responses`, and `agent-framework-hosting-telegram` are installed +> - The application backend can attach two extra fields to its Responses call: an `app_user_id` (the user's stable id in the application's own namespace) and, optionally, a `push_to_telegram_chat_id` (the user's known Telegram chat id from the application's own database) + +```python +import os +from dataclasses import replace + +from agent_framework.hosting import ( + AgentFrameworkHost, + ChannelIdentity, + ChannelRequest, + IdentityResolver, + ResponseTarget, + ResponsesChannel, + TelegramChannel, +) + + +# A custom identity resolver that promotes the app's own user id to the +# isolation_key whenever a channel can supply one. The Telegram channel exposes +# the chat_id (pre-registered in the application's settings page → so the +# application maps chat_id → app_user_id and tells the host); the Responses +# channel exposes the app_user_id directly via extra_body (see run_hook below). +async def app_identity_resolver(identity: ChannelIdentity, **_) -> str | None: + # Both channels populate ChannelIdentity.attributes["app_user_id"] — see + # the run hooks below. + return identity.attributes.get("app_user_id") + + +# Telegram channel maps Telegram chat_id → app_user_id from the application's +# pre-registered chat-id table. Cached locally; in real apps this is whatever +# lookup matches the application's own user-account schema. +KNOWN_TELEGRAM_USERS: dict[str, str] = { + "": "user_alice", + # ... 
+} + + +async def telegram_promote_app_user(request: ChannelRequest, **_) -> ChannelRequest: + chat_id = request.identity.native_id + app_user_id = KNOWN_TELEGRAM_USERS.get(chat_id) + if app_user_id is None: + return request # falls back to host's auto-issued isolation_key + return replace( + request, + identity=replace( + request.identity, + attributes={**request.identity.attributes, "app_user_id": app_user_id}, + ), + ) + + +# The application backend POSTs to /responses/v1/responses with +# +# { +# "model": "...", +# "input": "...", +# "extra_body": { +# "hosting": { +# "app_user_id": "user_alice", # who this request is for +# "push_to_telegram_chat_id": "", # optional +# } +# } +# } +# +# The Responses channel surfaces extra_body["hosting"] on +# ChannelRequest.attributes["hosting"]; this run_hook reads it and rewrites +# both the identity (so the request resolves to the same isolation_key as the +# user's Telegram chats) and the response_target (so the answer is pushed to +# Telegram in addition to / instead of the synchronous Responses reply). +async def responses_relay_hook(request: ChannelRequest, **_) -> ChannelRequest: + hosting = request.attributes.get("hosting", {}) + app_user_id = hosting.get("app_user_id") + push_chat_id = hosting.get("push_to_telegram_chat_id") + + if app_user_id is None: + return request # plain Responses call, no relay → keep defaults + + # Promote app_user_id onto the identity so the resolver returns it as + # isolation_key. + new_identity = replace( + request.identity, + attributes={**request.identity.attributes, "app_user_id": app_user_id}, + ) + + # If the caller also supplied a Telegram chat id, push the answer there + # via ResponseTarget.identities (explicit recipient — bypasses the link + # store, which is empty for this user since no link ceremony ran). The + # Responses HTTP call returns a ContinuationToken so the application + # backend can correlate. 
+ if push_chat_id: + return replace( + request, + identity=new_identity, + response_target=ResponseTarget.identities([ + ChannelIdentity(channel="telegram", native_id=push_chat_id), + ]), + background=True, + ) + + return replace(request, identity=new_identity) + + +host = AgentFrameworkHost( + target=agent, + identity_resolver=IdentityResolver(app_identity_resolver), + channels=[ + ResponsesChannel(run_hook=responses_relay_hook), + TelegramChannel( + bot_token=os.environ["TELEGRAM_BOT_TOKEN"], + transport="webhook", + run_hook=telegram_promote_app_user, + ), + ], +) +host.serve(host="0.0.0.0", port=8000) +``` + +The flow: + +1. Alice has previously connected her Telegram account on the application's settings page; the application stored `chat_id_of_alice → user_alice` in `KNOWN_TELEGRAM_USERS` (a real deployment uses a database). +2. Alice opens the application's web UI and types a question. The application backend (signed in as `user_alice`) calls the Responses API mounted on this host with `extra_body={"hosting": {"app_user_id": "user_alice"}}` (and no `push_to_telegram_chat_id`). The `responses_relay_hook` promotes `app_user_id` onto the identity, the resolver returns `isolation_key="user_alice"`, the agent runs, and the answer is returned synchronously over HTTP. The agent's `HistoryProvider` appends both turns to the session keyed by `user_alice`. +3. Later, Alice messages the same agent on Telegram from her registered chat. The Telegram channel's `run_hook` promotes `app_user_id="user_alice"` onto the identity (because her chat_id is in the known-users table), the resolver returns the **same** `isolation_key="user_alice"`, the agent loads the **same** session — and sees the earlier turn from the web UI. **One continuous conversation across two channels, no link ceremony required, no `IdentityLinker` configured.** +4. Now Alice walks away from her desk. 
The application backend wants to fire a long-running task on her behalf and have the answer reach her on Telegram. It calls the Responses API with `extra_body={"hosting": {"app_user_id": "user_alice", "push_to_telegram_chat_id": ""}}`. The `responses_relay_hook` rewrites the request to `background=True` and `response_target=ResponseTarget.identities([ChannelIdentity("telegram", "")])`. The Responses HTTP call returns a `ContinuationToken` immediately (so the application backend can correlate); when the agent completes, the host calls `TelegramChannel.push(ChannelIdentity("telegram", ""), result)` and the answer arrives in Alice's Telegram chat. + +The two enabling pieces: + +- **`extra_body["hosting"]` as a developer-controlled relay envelope.** The Responses channel surfaces an opaque `hosting` block from `extra_body` onto `ChannelRequest.attributes["hosting"]`. The hosting core does **not** define what goes in there — the developer decides what their trusted backend may carry (here `app_user_id` and `push_to_telegram_chat_id`) and reads it in their `run_hook`. This is the same pattern the `store=` table calls out for richer per-call control. +- **`ResponseTarget.identities([...])` for explicit caller-known recipients.** This bypasses the link store and pushes to a channel-native identity the caller already knows. Use it when the originating caller is a trusted relay that authenticated the user through some other means (corporate SSO, an internal API key bound to a user) and just needs the host to dispatch. `LinkPolicy` is still consulted per delivery, so a corp-tier Responses call cannot smuggle a public-tier Telegram push if the policy disallows it. 
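
The per-delivery `LinkPolicy` check mentioned in the last bullet can be sketched as a filter over explicit recipients. This is an illustration only, assuming a same-tier-only policy: `Identity`, `CHANNEL_TIERS`, `same_tier_only`, `allow_all`, and `plan_deliveries` are hypothetical names, and the tier assignments are invented for the example rather than taken from the spec.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Identity:
    """Hypothetical stand-in for ChannelIdentity."""

    channel: str
    native_id: str


# Assumed per-channel confidentiality tiers; the labels are opaque to the host.
CHANNEL_TIERS = {"responses": "corp", "telegram": "public", "activity": "corp"}


def same_tier_only(source: str, destination: str) -> bool:
    """Allow delivery only when both channels declare the same tier."""
    return CHANNEL_TIERS[source] == CHANNEL_TIERS[destination]


def allow_all(source: str, destination: str) -> bool:
    """The permissive default: every channel pair may deliver."""
    return True


def plan_deliveries(
    source: str,
    recipients: list[Identity],
    may_deliver: Callable[[str, str], bool],
) -> tuple[list[Identity], list[Identity]]:
    """Split explicit recipients into (allowed, denied) pushes for one response."""
    allowed: list[Identity] = []
    denied: list[Identity] = []
    for recipient in recipients:
        bucket = allowed if may_deliver(source, recipient.channel) else denied
        bucket.append(recipient)
    return allowed, denied
```

Under these assumed tiers, a corp-tier Responses call that names a Telegram recipient ends up with that recipient in the denied list under `same_tier_only`, while `allow_all` lets it through. What the host does with denied recipients (error, fallback, telemetry) is left open here.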
+ +**Variant — same scenario with an `IdentityLinker` configured.** If the host *does* have an `IdentityLinker` (Scenario 6), the application backend doesn't need to maintain its own `chat_id → app_user_id` table at all: when Alice runs `/link` once on Telegram, the linker records the channel-native identity against `isolation_key="user_alice"` (resolved from the Entra OAuth claim that matches the application's own SSO). After that, the run hook can simply use `ResponseTarget.channel("telegram")` (link-store recipient) instead of `ResponseTarget.identities([...])`. The explicit-identities variant remains useful when the application owns identity end-to-end and prefers not to delegate to a host-level linker. + +### Scenario 8: Background run with cross-channel response delivery + +A developer wants the user to start a long-running task on Telegram and pick up the response on Teams (whichever channel the user happens to be on when the result is ready). The originating Telegram message returns a `ContinuationToken` immediately; when the agent completes, the host pushes the result to the user's currently active channel via `ChannelPush`. A poll route is also exposed for callers that prefer polling. + +> **Prerequisites:** This sample assumes: +> - `agent-framework-hosting`, `agent-framework-hosting-telegram`, and the (future) `agent-framework-hosting-activity` channel are installed +> - The user is already linked across Telegram and Teams (Scenario 6) + +```python +import os +from dataclasses import replace + +from agent_framework.hosting import ( + AgentFrameworkHost, + ChannelRequest, + ResponseTarget, + TelegramChannel, +) + + +# Override the Telegram channel default: any inbound message becomes a +# background run delivered to the user's currently active channel. 
+async def telegram_background(request: ChannelRequest, **kwargs) -> ChannelRequest: + return replace( + request, + background=True, + response_target=ResponseTarget.active, + ) + + +host = AgentFrameworkHost( + target=agent, + identity_linker=linker, # from Scenario 6 + channels=[ + TelegramChannel( + bot_token=os.environ["TELEGRAM_BOT_TOKEN"], + transport="webhook", + run_hook=telegram_background, + ), + # ActivityChannel(...), # future + ], +) +host.serve(host="0.0.0.0", port=8000) +``` + +The flow: + +1. `alice` sends a Telegram message that triggers a long-running tool. The Telegram channel produces a `ChannelRequest`; the hook flips `background=True` and sets `response_target=ResponseTarget.active`. +2. `host.run_in_background(request)` returns a `ContinuationToken(token="ct_018f…", status="queued")`. The Telegram channel acknowledges with a short "Working on it…" reply that includes the token (it could equally render a "Cancel" inline button bound to the token). +3. The host runs the target asynchronously. When complete, it resolves `ResponseTarget.active` against the host-tracked last-seen channel for `isolation_key="alice@contoso.com"`. If `alice` is currently on Teams, the host calls `ActivityChannel.push(channel_identity, hosted_run_result)`; if she is still on Telegram, it calls `TelegramChannel.push(...)` (so the same setup gracefully degrades to "reply on Telegram if she never switched"). +4. `ContinuationToken` is updated to `status="completed"` with the populated `result`. Any caller can poll `GET /telegram/runs/{continuation_token}` (or the equivalent route the channel exposes) to retrieve the run state by id. + +Variants without changing channel code: + +- `ResponseTarget.channel("activity")` — always deliver to Teams, regardless of where the user is. +- `ResponseTarget.all_linked` — broadcast to every channel `alice` has linked. 
+- `ResponseTarget.none` — fully detached: caller polls `host.get_continuation(token)` (or the channel's poll route); no proactive push. +- `background=False` with `response_target=ResponseTarget.active` — synchronous wait, but result still routed away from the originating channel (rare; mostly useful for pipelines where the originating call is a programmatic trigger and the human user lives elsewhere). + +If the chosen destination channel does not implement `ChannelPush` (e.g. Responses), the host falls back to the `originating` channel and records the fallback in telemetry. This makes the Responses + background-run combo work as "submit on Responses, poll on Responses" without surprising silent drops. + +### Scenario 9: Hosting a `Workflow` instead of an agent (with checkpoint storage) + +> **Prerequisites:** This sample assumes: +> - `agent-framework-hosting` and `agent-framework-hosting-invocations` are installed +> - A `Workflow` definition with typed inputs (`OrderIntakeInputs`) +> - A directory writable by the host process for workflow checkpoints + +```python +from dataclasses import dataclass, replace +from pathlib import Path + +from agent_framework import FileCheckpointStorage, WorkflowBuilder +from agent_framework.hosting import ( + AgentFrameworkHost, + ChannelRequest, + InvocationsChannel, +) + + +@dataclass +class OrderIntakeInputs: + customer_id: str + sku: str + quantity: int + + +# Build the workflow with a CheckpointStorage so individual executor frames +# are persisted as the workflow runs. FileCheckpointStorage writes one file +# per checkpoint under the configured directory; survives host restarts. +checkpoint_storage = FileCheckpointStorage(directory=Path("./.af-hosting/checkpoints/")) + +workflow = ( + WorkflowBuilder(checkpoint_storage=checkpoint_storage) + .add_executor(...) 
# application-defined + .build() +) + + +def adapt_to_workflow_inputs(request: ChannelRequest, *, protocol_request=None, **kwargs) -> ChannelRequest: + # The channel produces a default ChannelRequest with text input. The workflow + # needs typed OrderIntakeInputs — the hook is the adapter point. The same + # hook is the place to surface a caller-supplied checkpoint id (to resume + # an interrupted run) by promoting it onto request.attributes; the host's + # workflow dispatch reads it on the way to Workflow.run(...). + payload = protocol_request # raw Invocations request body + inputs = OrderIntakeInputs( + customer_id=payload["customer_id"], + sku=payload["sku"], + quantity=int(payload["quantity"]), + ) + new_attrs = dict(request.attributes) + if checkpoint_id := payload.get("resume_from_checkpoint"): + new_attrs["workflow.checkpoint_id"] = checkpoint_id + return replace(request, input=inputs, attributes=new_attrs) + + +host = AgentFrameworkHost( + target=workflow, + channels=[ + InvocationsChannel(run_hook=adapt_to_workflow_inputs), + ], +) +host.serve(host="localhost", port=8000) +``` + +The host detects that `target` is a `Workflow` and dispatches the resulting `ChannelRequest.input` to `Workflow.run(...)` instead of `SupportsAgentRun.run(...)`. The channel does not need to know which kind of target it is fronting — `HostedRunResult` and `HostedStreamResult` are normalized across both seams. The same workflow target could equally be exposed on Telegram or a Responses channel by supplying the appropriate `run_hook` to translate inbound chat messages into typed workflow inputs. + +**Checkpoint storage** is wired onto the workflow itself (via `WorkflowBuilder(checkpoint_storage=...)` or per-run via `Workflow.run(..., checkpoint_storage=...)`), **not** on the host. 
The host treats it as workflow-runtime state — structurally distinct from the `HostStateStore` (which persists `ContinuationToken`s, identity-link grants, and last-seen records — host-execution metadata, not workflow internals) and from `ContextProvider` (per-conversation context). All three protocols stay separate, but a deployment MAY back them with the same physical store. When `request.attributes["workflow.checkpoint_id"]` is set (as the run hook does above when the caller supplies `resume_from_checkpoint`), the host's workflow dispatch path passes it through to `Workflow.run(checkpoint_id=...)` so the workflow resumes from that frame instead of running from scratch — useful for long-running intake flows that survive host restarts or retries. + +### Scenario 10: Authoring a new channel package + +The shape any new channel follows: parse external protocol → produce default `ChannelRequest` → optionally apply hook → `context.run(...)` / `context.stream(...)` → serialize back. + +```python +from starlette.requests import Request +from starlette.responses import JSONResponse +from starlette.routing import Route + +from agent_framework.hosting import ( + Channel, + ChannelContext, + ChannelContribution, + ChannelRequest, + ChannelSession, +) + + +class MyWebhookChannel: + name = "mywebhook" + + def __init__(self, *, path: str = "/mywebhook") -> None: + self._path = path + + def contribute(self, context: ChannelContext) -> ChannelContribution: + async def endpoint(request: Request) -> JSONResponse: + payload = await request.json() + channel_request = ChannelRequest( + channel=self.name, + operation="message.create", + input=payload["text"], + session=ChannelSession( + key=payload["thread_id"], + isolation_key=payload["account_id"], + ), + ) + result = await context.run(channel_request) + # See "Result is rich, not just text" below — `result.text` is the + # plain-text projection; this channel chooses to also surface + # citations and any tool-call traces it cares 
about. The exact + # serialization is the channel's call. + return JSONResponse(_render_for_mywebhook(result)) + + return ChannelContribution(routes=[Route(f"{self._path}/inbound", endpoint, methods=["POST"])]) +``` + +**Result is rich, not just text.** `result` here is a `HostedRunResult` wrapping an `AgentRunResult` (or a workflow output). It is **not** limited to a flat string — `result.text` is the convenience plain-text projection, but the underlying object carries: + +- the full `messages: list[ChatMessage]` thread the agent produced this turn — each message holds an ordered list of typed `Contents` (see [`Contents` in core](https://github.com/microsoft/agent-framework/blob/main/python/packages/core/agent_framework/_types.py)): `TextContent`, `DataContent` (inline base64 blobs), `UriContent` (URLs to images/audio/files), `FunctionCallContent` and `FunctionResultContent` (tool-call traces), `HostedFileContent` / `HostedVectorStoreContent` (provider-side file/vector references), `UsageContent` (token usage), `ErrorContent`, `TextReasoningContent` (reasoning traces), and channel-extensible custom content kinds. Each content also has `additional_properties` for provider-specific extensions (citations, image alt text, source spans, …), +- `value: T | None` — the typed structured output when the agent returned one (e.g. via response-format / structured-output features), +- `usage_details: UsageDetails | None`, `raw_representation`, and per-message `additional_properties` carrying provider-native extras. + +A channel author is free to project this into **whatever the channel's native shape supports**. 
Examples: + +- The built-in **Telegram channel** renders `text` segments with Telegram's `MarkdownV2` parse mode (escaping the special set), uploads `DataContent` images via `sendPhoto` and audio via `sendAudio` as separate Telegram messages in the same chat, and emits inline-button keyboards from `FunctionCallContent` traces when the channel is configured to surface tool calls as user-confirmable actions. Citations attached to a `TextContent.additional_properties["citations"]` slot are rendered as numbered footnote links the user can tap. +- The built-in **Responses channel** preserves the full content-list shape on the wire — every `ChatMessage` round-trips as a Responses-shaped output item so callers can inspect the typed mix of text, function-call traces, image/file outputs, reasoning, and structured-output `value`s exactly as the agent produced them. There is no lossy collapse to a single text field. +- A channel fronting a **chat UI** can render `TextContent` as full GitHub-Flavored Markdown / HTML (tables, code fences with syntax highlighting, math), `DataContent` and `UriContent` as inline images/audio/video players, `FunctionCallContent` / `FunctionResultContent` as collapsible "tool ran" cards, and `TextReasoningContent` as a collapsible reasoning panel — all from the same `result`. +- A **voice channel** can route `TextContent` through TTS, play `DataContent(audio/*)` directly, and surface `FunctionCallContent` only as audio earcons (or skip them entirely) — the same `result` object drives a completely different surface. +- A **richly-typed RPC channel** can return `result.value` (the structured output) directly when the workflow / agent produced one, and fall back to `result.text` only when no typed output is available. 
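
As a rough illustration of the projections listed above, the following sketch walks a result's messages and maps each content kind to a webhook-friendly JSON body. The dataclasses are simplified stand-ins defined only for this example (they are not the real `agent_framework` types, whose constructors and field names may differ), and `render_for_mywebhook` is a hypothetical helper in the spirit of the `_render_for_mywebhook` call in the Scenario 10 sample. The output shape (`attachments`, `tool_calls`, `citations`) is likewise an assumption, not something the spec mandates.

```python
from dataclasses import dataclass, field
from typing import Any


# Simplified stand-ins for illustration only; NOT the real agent_framework types.
@dataclass
class TextContent:
    text: str
    additional_properties: dict[str, Any] = field(default_factory=dict)


@dataclass
class UriContent:
    uri: str
    media_type: str


@dataclass
class FunctionCallContent:
    name: str
    arguments: dict[str, Any]


@dataclass
class ChatMessage:
    role: str
    contents: list[Any]


@dataclass
class RunResult:
    messages: list[ChatMessage]
    value: Any = None

    @property
    def text(self) -> str:
        # Plain-text projection: concatenate every text segment in order.
        return "".join(
            c.text for m in self.messages for c in m.contents if isinstance(c, TextContent)
        )


def render_for_mywebhook(result: RunResult) -> dict[str, Any]:
    """Project a rich result into a JSON-serializable body (hypothetical shape)."""
    body: dict[str, Any] = {
        "text": result.text,
        "attachments": [],
        "tool_calls": [],
        "citations": [],
    }
    for message in result.messages:
        for content in message.contents:
            if isinstance(content, TextContent):
                # Citations are assumed to live under additional_properties["citations"].
                body["citations"].extend(content.additional_properties.get("citations", []))
            elif isinstance(content, UriContent):
                body["attachments"].append({"url": content.uri, "media_type": content.media_type})
            elif isinstance(content, FunctionCallContent):
                body["tool_calls"].append({"name": content.name, "arguments": content.arguments})
            # Unknown content kinds are skipped; `text` remains the fallback.
    if result.value is not None:
        # Typed structured output is surfaced only when the target produced one.
        body["value"] = result.value
    return body


example = RunResult(
    messages=[
        ChatMessage(
            role="assistant",
            contents=[
                FunctionCallContent(name="lookup_order", arguments={"order_id": "42"}),
                TextContent(
                    text="Your order shipped.",
                    additional_properties={"citations": ["https://example.com/orders/42"]},
                ),
                UriContent(uri="https://example.com/label.png", media_type="image/png"),
            ],
        )
    ]
)
rendered = render_for_mywebhook(example)
```

A channel with a narrower native shape could return only `rendered["text"]`; one with a richer shape could add branches for further content kinds without changing how the host produces the result.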
+ +The host imposes no projection — `result.text` is offered as a convenience for channels whose native shape really is "single string in, single string out", and channels are encouraged to lean on the full content list when their protocol supports more. + +## Information Design + +### Canonical flow + +```text +external request/event + -> channel-specific parsing + validation + -> ChannelIdentity extraction (per-channel native id) + -> default channel invocation mapping + -> optional run_hook + -> ChannelRequest (carries response_target, background) + -> AgentFrameworkHost / ChannelContext + -> identity_resolver(ChannelIdentity) -> isolation_key + -> host records (isolation_key, channel, now) as last-seen (for ResponseTarget.active) + -> AgentSession resolution (per session_mode, scoped by isolation_key) + -> [foreground] target execution seam -> HostedRunResult/HostedStreamResult -> originating channel serialization + -> [background or response_target != originating] + -> ContinuationToken returned immediately to originating channel + -> target executes asynchronously + -> on completion, deliver to ResponseTarget via destination channel.push(...) + -> ContinuationToken updated; available via host.get_continuation(token) and channel poll routes +``` + +A parallel **link ceremony flow** runs out-of-band when a user invokes the host-provided `link`/`connect` command on a channel: + +```text +channel /link command + -> linker.begin(ChannelIdentity) -> LinkChallenge + -> channel-specific rendering (URL, code, MFA prompt) + -> user completes the ceremony out-of-band (browser, second channel, MFA app) + -> linker callback/verification route + -> linker.complete(challenge_id, proof) -> isolation_key + -> host atomically associates (channel, native_id) -> isolation_key + -> subsequent requests resolve to the linked AgentSession +``` + +### Inbound ownership + +| Concern | Owned by | Notes | +|---|---|---| +| HTTP / WebSocket route shape | Channel package | e.g. 
`/responses/v1`, `/responses/ws`, `/invocations/invoke`, `/telegram/webhook` — channels may contribute either or both | +| Protocol request model | Channel package | e.g. Responses items (HTTP body or WS frames), Invocations body, Telegram webhook payload | +| Signature/auth validation | Channel package or host middleware | channel-specific unless generic Starlette middleware | +| Request-to-agent invocation mapping | Channel package + optional `run_hook` | forwards caller parameters into `ChannelRequest.options`, chooses `session_mode`, can enforce extra app policy | +| Native command catalog | Channel package using host-defined `ChannelCommand` | e.g. Telegram bot commands, future Activity Protocol slash-command / adaptive-card surfaces, WhatsApp menus | +| Command registration at startup | Channel package | e.g. Telegram `set_my_commands(...)` | +| Command dispatch | Channel package | commands may reply locally, manipulate channel-owned state, or invoke the agent | +| Normalized input to the agent | Host core | `ChannelRequest.input` reuses `AgentRunInputs` | +| Session resolution | Host core | based on `ChannelSession` + `ChannelRequest.session_mode`; storage specifics deferred | +| Channel-native identity extraction | Channel package | populates `ChannelIdentity(channel, native_id, attributes)` per request | +| Identity resolution (`native_id` → `isolation_key`) | Host core via `IdentityResolver` | default **auto-issues and persists** a per-user `isolation_key` on first contact per `(channel, native_id)`; user-supplied resolver can return app-owned identities directly | +| Identity store (`(channel, native_id) → isolation_key`) | Host core via `HostStateStore` | file-based by default in v1 (`FileHostStateStore`); pluggable for Cosmos / SQL / Redis in fast follow (req #23). Owns auto-issuance and atomic merge-on-link. 
| +| Identity link ceremony (OAuth / MFA / one-time code) | Host core via `IdentityLinker` | linker contributes its own routes + lifecycle; channels surface a built-in `link`/`connect` command | +| Link & delivery policy across confidentiality tiers | Host core via `LinkPolicy` | consulted at link time (refuse incompatible link attempts) and at delivery time (drop incompatible `ResponseTarget` destinations); built-in policies cover all-allow, same-tier, explicit allow-list, deny-all | +| Active-channel tracking | Host core | updated on every successfully resolved request; consumed by `ResponseTarget.active` | +| Response-target resolution | Host core | translates `ResponseTarget` (originating, active, specific, list, all_linked, none) into an ordered set of `(channel, ChannelIdentity)` deliveries | +| Proactive outbound delivery | Channel package via optional `ChannelPush` capability | channels that can push (Telegram, Activity Protocol via Bot Service, webhook, SSE) implement `push(identity, result)`; channels that can't are only valid as `originating` targets | +| Per-delivery audit + replay state | Host core writes the intent + status onto the assistant `Message.additional_properties["hosting"]["deliveries"]`; provider opts into in-place updates via `SupportsDeliveryTracking` for crash-safe lifecycle | Universal data model; live update is provider capability. See [Delivery tracking on assistant messages](#delivery-tracking-on-assistant-messages). 
| +| Background-run lifecycle | Host core | owns `ContinuationToken` issuance, async execution, completion notification; persists via `HostStateStore` (file-based default — survives restarts) | +| Run poll routes | Channel package | each channel exposes its own protocol-shaped poll route (`/responses/v1/{continuation_token}`, `/invocations/{continuation_token}`) backed by `host.get_continuation(token)` | +| Conversation history (all channels — Responses, Invocations, Telegram, Activity Protocol, …) | Agent's core `HistoryProvider` (`agent_framework._sessions.HistoryProvider`) | Channels project their wire id (`previous_response_id`, `conversation_id`, request body `session_id`, host-tracked alias, …) into `ChannelSession.key`; the host resolves an `AgentSession` and the agent's `HistoryProvider` does the load / append. No channel-specific history seam. Multi-provider composition (with a single `load_messages=True`) is the standard AF convention; see [Conversation history for the Responses channel](#conversation-history-for-the-responses-channel) for the Foundry-backed variant. | +| Channel-owned non-message per-thread state (e.g. AG-UI `client_state`) | Channel-shipped `ContextProvider` subclass written into the same per-source state slot | Reuses the existing `ContextProvider` seam — *not* a new storage protocol. Channel reads `ChannelRequest.client_state` in `before_run`, lets the agent observe/mutate the slot, then reads the post-run value in `after_run` to emit channel-specific events (e.g. AG-UI `StateSnapshotEvent` / `StateDeltaEvent`). Composition rules unchanged (one `HistoryProvider` carries `load_messages=True`; additional `ContextProvider`s attach alongside). See [Channel-owned per-thread state](#channel-owned-per-thread-state). 
| +| Agent invocation | Host core | always through the target's execution seam — `SupportsAgentRun.run(...)` for agent targets, `Workflow.run(...)` for workflow targets | +| Protocol response/event model | Channel package | core returns agent results; channel serializes them | +| ASGI server bootstrap | Host core convenience | `host.serve(...)` for default uvicorn path; `host.app` for custom hosting | + +### Channel session-carriage models + +Channels split into two families based on **who owns the session identifier across requests**. This distinction is invisible to the agent target, but it changes which host-side mechanisms are load-bearing for that channel. + +| Model | Examples | `ChannelSession.key` source | How a caller starts a new thread | +|---|---|---|---| +| **Caller-supplied session** | Responses (`previous_response_id` / `conversation_id`), Invocations, A2A, MCP — generally any HTTP/RPC-shaped channel | The wire payload carries it; the channel parses it into `ChannelSession.key`. `None` means "ephemeral / fresh thread". | Omit the previous id (or send a fresh one). The caller is in control. | +| **Host-tracked session** | Telegram, Activity Protocol via Azure Bot Service (Teams/Web Chat/Slack/…), WhatsApp — generally any chat surface whose protocol carries identity (`chat_id`, AAD oid, `from.id`) but no per-conversation key | The channel leaves `ChannelSession.key = None` and lets the host's per-`isolation_key` alias decide which `AgentSession` to resolve (rule #8 below). | The channel surfaces a `/new`-style command (a `ChannelCommand`) that calls `host.reset_session(isolation_key)`; the host's session-id alias rotates. There is no in-band way for the user to address a specific past thread. | + +Identity is an **orthogonal axis** (anonymous vs. identified). The realized cells in v1 are: + +| | Anonymous | Identified | +|---|---|---| +| **Caller-supplied session** | ✓ — bare `curl /responses` + `previous_response_id`. 
The id effectively *is* the identity (the resolver may project `previous_response_id` into the `isolation_key` for that turn). | ✓ — Responses + `safety_identifier`, or any caller-supplied channel behind a JWT/OAuth bearer that the resolver maps to an `isolation_key`. | +| **Host-tracked session** | n/a in v1 | ✓ — Telegram / Activity Protocol (Bot Service) / WhatsApp. The channel always authenticates; the resolver maps `(channel, native_id)` to `isolation_key`. | + +**Channel-author guidance.** When implementing a new channel: + +- If your upstream protocol carries a per-conversation identifier on every request, populate `ChannelSession.key` from it. You are a **caller-supplied** channel. `host.reset_session(...)` is **not** the right primitive for your `/new`-equivalent (your callers control that by simply omitting the previous id). Cross-channel linking via `IdentityLinker` is opt-in and depends on whether you also extract a stable identity (header, JWT, etc.) into `ChannelIdentity`. +- If your upstream protocol carries identity but **no** per-conversation key, leave `ChannelSession.key = None`. You are a **host-tracked** channel. To support "start a fresh thread", expose a channel-native command (Telegram `/new`, Teams adaptive-card button, …) that invokes `host.reset_session(isolation_key)` — the host alias rotation does the rest, and prior history remains addressable under its previous session id. You are the canonical case for cross-channel linking; populate `ChannelIdentity` faithfully so `IdentityLinker` and `ResponseTarget.active`/`.all_linked` can find your users. + +**Mixing on one host.** A single `AgentFrameworkHost` can mount channels of both families. 
A user can chat on Telegram (host-tracked) and have it linked via `IdentityLinker` to a Responses-channel session keyed by `previous_response_id`; in that case the linker's identity merge collapses both sides onto the same `isolation_key` and the host-tracked channel's alias becomes a peer of the caller-supplied `previous_response_id` for the same `AgentSession`. This is the v1 mechanism for "agent built on Responses, exposed to humans on Telegram, with continuity across both". + +### Session resolution rules + +1. If `ChannelRequest.session_mode == "disabled"`, the host bypasses session resolution and calls the target with `session=None`. +2. If `session_mode == "auto"`, the host resolves `ChannelSession.key` to an `AgentSession`, scoped by `isolation_key` when supplied. +3. If `session_mode == "auto"` and no key is supplied, the host may create an ephemeral session. +4. If `session_mode == "required"`, the host must resolve or create a usable session before invoking the target. +5. **Cross-channel resolution rule:** when two channels mounted on the same `AgentFrameworkHost` produce the same `isolation_key` (and either both omit `key` or both produce equivalent keys derived from `isolation_key`), the host resolves them to the **same** `AgentSession`. This is the v1 mechanism for cross-channel chat continuity (e.g. Telegram → Teams against the same conversation history). The **canonical** path for translating a channel's native per-channel identifier (Telegram `chat_id`, Teams AAD object id, …) into the stable `isolation_key` is the host-level `IdentityResolver` (per-channel `run_hook` mapping is supported as a lower-level alternative). When the channel-native identity is not yet linked, the `IdentityLinker` runs a connect ceremony (OAuth, MFA, signed one-time code) to associate it with an existing `isolation_key`. +6. 
The first spec does **not** standardize a cross-package storage API; cross-host/cross-process continuity is deferred to the pluggable session store (req #23), which also persists identity-link grants beyond the host process lifetime. +7. Responses and other conversation-aware channels may still own protocol-specific conversation/item storage above this layer. +8. **Session rotation (`reset_session`).** The host exposes `reset_session(isolation_key)` so **host-tracked** channels (see [Channel session-carriage models](#channel-session-carriage-models)) can implement "start a fresh thread" commands (e.g. Telegram `/new`). The default behavior **rotates the active session id alias** (`` → `#`) rather than deleting on-disk history: prior history remains addressable by its original session id while subsequent runs for that `isolation_key` resolve to a brand-new `AgentSession`. Apps that want destructive reset can layer that on top by calling into their own `HistoryProvider`. **Caller-supplied** channels do not call `reset_session`; their callers branch threads by sending a fresh / no `previous_response_id` (or equivalent) on the next request. + +### Channel metadata persisted onto stored messages + +When the host invokes the target, it does **not** pass the raw `ChannelRequest.input` directly. It first wraps the input into a `Message(role="user", contents=[...])` whose `additional_properties["hosting"]` carries an envelope describing where the message came from and where its response should go. This makes the resulting conversation history self-describing for any `HistoryProvider` (`FileHistoryProvider`, future Cosmos/Foundry providers, …) without that provider having to know anything channel-specific. 
+ +```jsonc +{ + "channel": "telegram", // ChannelRequest.channel + "identity": { // populated from ChannelRequest.identity + "channel": "telegram", + "native_id": "", + "attributes": { /* channel-specific */ } + }, + "response_target": { // populated from ChannelRequest.response_target + "kind": "originating", + "targets": [] // [(channel, native_id), ...] for explicit targets + } +} +``` + +Round-trip is guaranteed by `Message.to_dict()` / `Message.from_dict()`. Future providers that key on protocol shape (e.g. a Responses `previous_response_id`-keyed store) can read this envelope to reconstruct cross-channel context without needing a separate channel-metadata sidecar. + +`FoundryHostedAgentHistoryProvider` round-trips the entire `additional_properties["hosting"]` namespace (and any other AF-side namespace) through the Foundry response store via a single opaque `agent_framework` container key written onto each `OutputItem`. See [Foundry storage gap: `update_item`](#foundry-storage-gap-update_item) for the one part of the schema (post-push `deliveries[]` mutation) that depends on a service-side addition. + +### Delivery tracking on assistant messages + +The inbound envelope above captures **intent**. To support **audit** ("which destinations actually received this response, and when?") and **replay** ("Telegram was offline; resend to that user when it comes back"), the assistant `Message` produced by the host carries a parallel envelope that records the *resolved destination set* and per-destination outcome. + +Schema on `Message.additional_properties["hosting"]` for a host-produced assistant message: + +```jsonc +{ + "originating": { // mirror of the inbound envelope above + "channel": "telegram", + "identity": { "channel": "telegram", "native_id": "12345", "attributes": {} }, + "response_target": { "kind": "all_linked", "targets": [] } + }, + "deliveries": [ + { + "destination": { "channel": "activity", "native_id": "29:abc..." 
}, + "status": "delivered", // pending | delivered | failed | skipped + "attempts": 1, + "first_attempt_at": "2026-04-29T08:31:11Z", + "last_attempt_at": "2026-04-29T08:31:11Z", + "last_error": null, + "delivery_id": "msg_018f..." // channel-issued id, when the channel returns one + }, + { + "destination": { "channel": "telegram", "native_id": "12345" }, + "status": "failed", + "attempts": 3, + "first_attempt_at": "2026-04-29T08:31:11Z", + "last_attempt_at": "2026-04-29T08:36:11Z", + "last_error": { "code": "channel_offline", "message": "Telegram getUpdates 502" }, + "delivery_id": null + } + ] +} +``` + +Status values: + +| Value | Meaning | +|---|---| +| `pending` | Host has resolved the destination but has not yet attempted (or is between attempts) `ChannelPush.push(...)`. | +| `delivered` | Push succeeded. `delivery_id` is populated when the destination channel returns a stable id. | +| `failed` | Push raised. `last_error` is populated. Eligible for replay. | +| `skipped` | Destination was excluded by `LinkPolicy`, or the destination channel does not implement `ChannelPush`. Recorded so audit shows *why* a destination resolved by `ResponseTarget` did not receive the message. | + +Lifecycle the host follows: + +1. After `ResponseTarget` resolution and `LinkPolicy` filtering, **before** any push attempt, the host writes the assistant `Message` with one `deliveries[]` entry per destination, all `status="pending"` (excluded ones written as `"skipped"`). This guarantees the intent is durable across host crashes. +2. After each `ChannelPush.push(destination_identity, result)`, the host updates the matching `deliveries[]` entry in place — `status`, `attempts`, `first_attempt_at` (set on first attempt), `last_attempt_at`, `last_error`, `delivery_id`. +3. 
The mechanism for **retrying** failed deliveries (background worker, operator action, `host.retry_delivery(message_id, destination)`, …) is **out of scope** for this spec; it is enabled by the data model and tracked under Open Questions. + +#### `SupportsDeliveryTracking` provider capability + +Updating a stored message in place is provider-specific. The shape above is universal; the *update semantics* are opt-in: + +```python +from typing import Any, Mapping, Protocol, Sequence, runtime_checkable + +# runtime_checkable is required for the host's isinstance(...) capability check. +@runtime_checkable +class SupportsDeliveryTracking(Protocol): + async def update_deliveries( + self, + *, + session_id: str, + message_id: str, + deliveries: Sequence[Mapping[str, Any]], + ) -> None: ... +``` + +| Provider | `SupportsDeliveryTracking`? | Behavior | +|---|---|---| +| `FileHistoryProvider` (append-only JSONL) | No (capability not implemented) | Host writes the assistant `Message` **once**, at the end of the delivery cycle, with terminal `deliveries[]`. Pre-attempt `pending` snapshot is not durable; a host crash mid-delivery loses per-destination state for in-flight pushes. **Audit-complete, replay-best-effort.** | +| `FoundryHostedAgentHistoryProvider` (Foundry response store) | **Partial: initial-write only** (see [Foundry storage gap](#foundry-storage-gap-update_item) below) | Inbound envelope (`channel`/`identity`/`response_target`) and **initial-write `deliveries[]` snapshot** (all `pending`, plus any `skipped`) round-trip through the Foundry response store unchanged via the `agent_framework` extras container key the provider writes onto each `OutputItem`. **Per-destination updates after each push attempt are not durable** because the Foundry storage SDK does not yet expose a way to mutate an individual stored history item. Behaves as `FileHistoryProvider` in that regard until the [service ask](#foundry-storage-gap-update_item) lands. **Audit-complete-on-write, replay-best-effort.** | +| Cosmos / SQL providers (when introduced) | Expected to implement | Same as above. 
| + +Providers that omit the capability are still valid hosts for any `ResponseTarget` configuration — they just cannot offer durable replay. The host detects the capability with `isinstance(provider, SupportsDeliveryTracking)` and degrades to write-once when absent. + +> **Why on the message and not in a separate delivery log?** Two reasons. First, the message store is the single source of truth for an assistant turn; piggy-backing on it avoids a second consistency boundary between "message written" and "delivery scheduled". Second, any operator who wants a queryable delivery dashboard can ETL the array out of `additional_properties["hosting"]["deliveries"]` into their preferred outbox/log store — the on-message form does not preclude that. The spec commits only to the on-message shape; outbox layers are an implementation choice. + +#### Foundry storage gap: `update_item` + +`FoundryHostedAgentHistoryProvider` round-trips arbitrary `Message.additional_properties` namespaces through the Foundry response store as opaque JSON via a single `agent_framework` container key on each `OutputItem` (see `_shared.py:_collect_af_extras` / `_inject_af_extras` / `_attach_extras`). This makes the **initial-write** parts of the schema above durable: + +- The inbound `hosting` envelope (`channel`, `identity`, `response_target`) on user messages. +- The initial-write `deliveries[]` snapshot on assistant messages (all entries `pending` or `skipped`, written before the first push attempt). + +What is **not yet** durable through this provider is **post-push mutation** of an individual stored item. The `azure.ai.agentserver.responses.store.FoundryStorageProvider` SDK exposes `create_response`, `get_response`, `update_response`, `delete_response`, `get_input_items`, `get_items`, and `get_history_item_ids` — but no `update_item` / PATCH on a single history item. 
So when the host updates an entry in `deliveries[]` after `ChannelPush.push()` returns (`status` → `delivered`/`failed`, `attempts`, timestamps, `last_error`, `delivery_id`), there is no way to push that mutation back into the per-item storage row. + +**Workarounds and trade-offs:** + +| Option | Trade-off | +|---|---| +| Encode `deliveries[]` on the *response object* (under `agent_framework`) instead of the assistant *item*, and use `update_response` to mutate it. | Works today, but deliveries are no longer co-located with the assistant message — schema for the `Message` round-trip becomes provider-specific. | +| Delete + recreate the assistant item with the updated body. | Likely loses the `previous_response_id` chain pointer, breaks subsequent `get_history_item_ids` walks, and re-stamps the storage `id` (audit-trail noise). | +| Wait for Foundry storage to add `update_item`. | Cleanest end-state. **This is the recommended path.** | + +**Service ask for the FoundryHostedAgent / Foundry response store team:** + +- Add `update_item(item_id, item_body, *, isolation: IsolationContext | None = None) -> None` (PATCH semantics) to `azure.ai.agentserver.responses.store.FoundryStorageProvider` and the underlying `POST/PATCH /storage/items/{item_id}` REST surface. +- Required because the Hosting spec's per-destination delivery-tracking lifecycle (`pending → delivered`/`failed`/`skipped`) needs to mutate an individual stored item after the first push attempt completes. +- Without it, the FoundryHostedAgentHistoryProvider's `SupportsDeliveryTracking` implementation is permanently stuck at "write-once-best-effort" and durable replay through Foundry storage is unreachable. +- Existing `update_response` is not sufficient because deliveries belong on the **assistant `Message`** (so they round-trip with the message into any provider that consumes the standard `Message` schema), not on the response envelope. 
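
The delivery-tracking lifecycle and the capability check described in this section can be sketched end to end. This is a minimal sketch under stated assumptions: `InMemoryTrackingProvider`, `WriteOnceProvider`, `deliver`, `fake_push`, and the fixed `session_id` / `message_id` values are hypothetical stand-ins invented for this example and are not the host's actual implementation. Only the `deliveries[]` field names and the `update_deliveries` signature come from the text above. The initial write of the assistant `Message` is modeled here as a first `update_deliveries` call, which is a simplification.

```python
import asyncio
from datetime import datetime, timezone
from typing import Any, Awaitable, Callable, Mapping, Protocol, Sequence, runtime_checkable


@runtime_checkable  # needed so isinstance(...) works against the Protocol
class SupportsDeliveryTracking(Protocol):
    async def update_deliveries(
        self, *, session_id: str, message_id: str, deliveries: Sequence[Mapping[str, Any]]
    ) -> None: ...


class InMemoryTrackingProvider:
    """Hypothetical provider that can persist in-place delivery updates."""

    def __init__(self) -> None:
        self.snapshots: list[list[dict[str, Any]]] = []

    async def update_deliveries(self, *, session_id, message_id, deliveries) -> None:
        # Copy each entry so later mutations do not rewrite stored snapshots.
        self.snapshots.append([dict(d) for d in deliveries])


class WriteOnceProvider:
    """Hypothetical append-only provider that lacks the capability."""


def _now() -> str:
    return datetime.now(timezone.utc).isoformat()


async def deliver(
    provider: Any,
    destinations: Sequence[dict[str, str]],
    push: Callable[[dict[str, str]], Awaitable[str]],
) -> list[dict[str, Any]]:
    deliveries: list[dict[str, Any]] = [
        {"destination": d, "status": "pending", "attempts": 0, "first_attempt_at": None,
         "last_attempt_at": None, "last_error": None, "delivery_id": None}
        for d in destinations
    ]
    tracking = isinstance(provider, SupportsDeliveryTracking)
    if tracking:
        # Step 1: make the intent durable before any push attempt.
        await provider.update_deliveries(session_id="s1", message_id="m1", deliveries=deliveries)
    for entry in deliveries:
        stamp = _now()
        entry["attempts"] += 1
        entry["first_attempt_at"] = entry["first_attempt_at"] or stamp
        entry["last_attempt_at"] = stamp
        try:
            entry["delivery_id"] = await push(entry["destination"])
            entry["status"] = "delivered"
        except Exception as exc:  # a raised push marks the entry as failed
            entry["status"] = "failed"
            entry["last_error"] = {"code": type(exc).__name__, "message": str(exc)}
        if tracking:
            # Step 2: update the stored entry after each attempt.
            await provider.update_deliveries(session_id="s1", message_id="m1", deliveries=deliveries)
    # Without the capability, the caller writes this terminal list exactly once.
    return deliveries


async def fake_push(destination: dict[str, str]) -> str:
    if destination["channel"] == "telegram":
        raise ConnectionError("channel offline")
    return "msg_1"


destinations = [
    {"channel": "activity", "native_id": "a"},
    {"channel": "telegram", "native_id": "12345"},
]
provider = InMemoryTrackingProvider()
final = asyncio.run(deliver(provider, destinations, fake_push))
```

With a provider that lacks `update_deliveries`, the same function performs no intermediate writes and only returns the terminal list, which mirrors the write-once degradation described for `FileHistoryProvider`.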
+
+## Reference and Parity Plan
+
+The new core sits **below** the conceptual boundary of today's top-level Responses/Invocations host wrappers but is implemented in Agent Framework-owned code. Existing top-level `agentserver` hosts inform behavior, naming, and parity targets — **without** becoming runtime dependencies of the hosting core. Individual channel packages MAY consume lower-level building blocks shipped in `azure.ai.agentserver` (e.g. `FoundryHostedAgentHistoryProvider` builds on the Foundry response-store SDK).
+
+| Existing code area | Proposed treatment | Why |
+|---|---|---|
+| `SupportsAgentRun.run(..., session=..., stream=...)` | Reuse directly in core for agent targets | Already the correct Python execution seam |
+| `Workflow.run(...)` and workflow streaming events | Reuse directly in core for workflow targets; normalize outputs into `HostedRunResult`/`HostedStreamResult` | Lets channels stay target-agnostic |
+| Session resolution logic in current hosting layers | Implement in core, using current behavior as reference | Host behavior, not protocol behavior |
+| Starlette app assembly and route aggregation | Implement in core, referencing current servers | Needed by every channel |
+| PR #5393 Telegram `BOT_COMMANDS`, `CommandHandler(...)`, `set_my_commands(...)` | Reference for the generic `ChannelCommand` capability | Clearest current prior art for native command catalogs + runtime dispatch |
+| `agent_framework_foundry_hosting._to_chat_options` | Inspiration for Responses channel-owned mapping | Still protocol-specific |
+| `agent_framework_foundry_hosting._items_to_messages` / `_output_item_to_message` | Inspiration / parity reference in Responses channel codec | Useful, not generic hosting |
+| `agent_framework_foundry_hosting._to_outputs` and `ResponseEventStream` | Inspiration for Responses event mapping; the new Responses channel owns its own AF-native serialization rather than reusing top-level `agentserver` host wrappers | Responses-specific serialization |
+| `azure.ai.agentserver.responses.ResponseContext.get_history()` + `Store` | Folded into the agent's normal core `HistoryProvider` flow. The Responses channel projects `previous_response_id` / `conversation_id` into `ChannelSession.key`; the agent's `HistoryProvider` does the load / append exactly as for any other session. No Responses-specific history Protocol. | One uniform history seam across channels — the developer chooses where to store, and may compose multiple providers under the standard "single `load_messages=True`" rule. |
+| `azure.ai.agentserver.responses.store._foundry_provider.FoundryStorageProvider` (HTTP-backed Foundry storage with `IsolationContext` user/chat headers) | Wrapped by a native `FoundryHostedAgentHistoryProvider` in `agent-framework-foundry-hosting` that **builds on top of** the SDK and exposes the standard core `HistoryProvider` Protocol. Agents attach it the same way they attach `FileHistoryProvider`. | Lets the Foundry response store back conversations driven through the new host, while keeping the channel agnostic to the storage backend. The provider owns a runtime dependency on `azure.ai.agentserver` (for the storage SDK) so it stays aligned with the SDK's wire contract, auth, and isolation headers without duplication. Same provider also works for non-Responses channels (Telegram, Invocations, …) so the choice is "where do I want history persisted" rather than "which channel am I exposing". |
+| `agent_framework_foundry_hosting._invocations.InvocationsHostServer._sessions` (in-process `dict[str, AgentSession]`) | Replace with the host's normal `ChannelSession.key → AgentSession` resolution; agent history flows through its own (optional) core `HistoryProvider(load_messages=True)` | Invocations does **not** need a protocol-shaped history seam — confirmed by today's foundry hosting which keeps no `Store` on the Invocations side |
+| `ResponsesAgentServerHost` / `InvocationAgentServerHost` top-level wrappers | Conceptual prior art only | Sit too high; encode protocol ownership |
+| Workflow checkpoint behavior in current Responses hosting | Defer; reference only for future work | Needs separate design if it becomes shared |
+
+## Dependencies & Commitment Status
+
+| Dependency | Team | DRI | Status |
+|---|---|---|---|
+| `SupportsAgentRun` execution seam | Agent Framework Core (Python) | TBD | Committed (existing) |
+| `Workflow` execution seam | Agent Framework Core (Python) | TBD | Committed (existing); host wraps workflow outputs into `HostedRunResult`/`HostedStreamResult` |
+| `AgentSession` / conversation primitives | Agent Framework Core (Python) | TBD | Committed (existing); cross-package storage standardization deferred |
+| Starlette | External (BSD-licensed) | n/a | Committed; required runtime dep of `agent-framework-hosting` |
+| Uvicorn | External (BSD-licensed) | n/a | Open Question — required dep vs optional extra (see Open Questions) |
+| `agent-framework-foundry-hosting` parity reference | Agent Framework Hosting | TBD | Reference-only, no runtime dependency |
+| `FoundryHostedAgentHistoryProvider` (in `agent-framework-foundry-hosting`, built on `azure.ai.agentserver.responses.store._foundry_provider.FoundryStorageProvider`) | Agent Framework Foundry | TBD | Proposed v1 deliverable so Foundry-defined (and any other) agents can use Foundry's response store as a `HistoryProvider` through the new host. Implements the standard core `HistoryProvider` Protocol — usable from any channel, no Responses-specific Protocol. Owns a runtime dep on `azure.ai.agentserver` for the storage SDK. |
+| PR #5393 Telegram sample (commands, polling/webhook patterns) | Agent Framework | PR author | Reference-only; informs `ChannelCommand` and `TelegramChannel` design |
+| Telegram Bot API SDK | External | n/a | Committed (runtime dep of `agent-framework-hosting-telegram`) |
+| `microsoft/teams.py` SDK (`microsoft-teams-apps`, `microsoft-teams-api`, `microsoft-teams-cards`) | External (MIT, Microsoft) | n/a | Proposed runtime dep of `agent-framework-hosting-teams` (req #28). The SDK already ships a "Build an agent using Microsoft Agent Framework" guide and a pluggable `HttpServerAdapter`, so the hosting package mounts the SDK's `App` into the host's Starlette app and reuses its Adaptive Cards / Streaming / Citations / Feedback / Suggested-prompts / Dialogs / Message-Extensions / SSO surface instead of re-implementing them. |
+| `agent-framework-ag-ui`, `-a2a`, `-devui` | Agent Framework | various | Out of scope for first implementation; future convergence kept as a possibility |
+
+## Open Questions
+
+| # | Question | On Point | Notes |
+|---|---|---|---|
+| 5 | How much of the Responses Conversations API should the Responses channel own vs a future shared conversation utility? | Eng / PM | Tied to whether session storage gets standardized. |
+| 6 | Should a later phase define a pluggable session store interface? | Eng | Needs to be designed **holistically across all storage axes** — sessions, messages, identity links, run-state / continuation tokens, workflow checkpoints — rather than per-axis. Tracked as v1 fast-follow / requirement #23. |
+| 8 | Should command scopes / projection metadata become first-class — e.g. private-chat-only vs group-chat-visible commands, or per-locale descriptions? | Eng / PM | Telegram's `BotCommandScope` and `language_code` would need to be representable cross-channel. |
+| 10 | Is "Channel" the GA name? "Head" was used interchangeably during design discussions. | PM | "Channel" chosen for the spec; confirm before public docs. |
+| 12 | Should `ChannelRequest.session_mode` grow additional values (e.g. `"shared"` for multi-channel session sharing) or stay closed at three? | Eng | The taxonomy needs a **dedicated design exercise** covering all known channel session-shape patterns; revisit after that exercise. |
+| 14 | Where do issued link grants live — short-lived in-memory state on the host, the same pluggable session store (#23), or a separate identity store? | Eng | Resolved as part of the **`HostStateStore`** seam (see [Host state storage](#host-state-storage)). Link grants live alongside continuation tokens and last-seen records in the v1 file-based default (`FileHostStateStore` → `link_grants/` namespace, 15min TTL). Pluggable Cosmos / SQL / Redis adapters tracked in req #23. **→ Move to Resolved Questions in next pass.** |
+| 17 | Should `ResponseTarget.active` honor a configurable **time window** (last seen within N minutes) and what is the fallback when the window has expired before the response is ready — `originating`, `all_linked`, drop with `ContinuationToken` `status="failed"`? | PM / Eng | Likely yes with sensible default (e.g. 24h fall back to `originating`); per-request override via the run hook. |
+| 22 | For the Responses WebSocket transport, what subprotocol identifier (if any) should be advertised on the `Upgrade` and how is auth conveyed — `Authorization` header on the upgrade, a `Sec-WebSocket-Protocol` token, or a query-string-bound short-lived token? | Eng / PM | Aligning with whatever OpenAI ships for Responses WS is preferable; keep the codec swappable so the channel can track upstream changes without breaking the host contract. |
+| 27 | What is the retention contract for completed `deliveries[]` entries — keep forever for audit, GC after the message itself ages out, or cap per-message at a fixed attempt count? Should `last_error` payloads be redacted to a code/message pair to avoid logging PII from the underlying channel SDK? | Eng / Compliance | Suggest "lifetime equals message lifetime" + redacted error shape (`{code, message}` only, no provider stack frames or payload echoes) as the default; revisit when the persistent store contract lands. |
+
+### Resolved Questions (decisions log)
+
+Original numbering preserved so external references (checkpoints, ADR cross-links) still resolve. Decisions captured here may imply spec-body changes elsewhere — see [Decisions-driven follow-ups](#decisions-driven-follow-ups) below.
+
+| # | Question | Decision |
+|---|---|---|
+| 1 | Final distribution package names? | `agent-framework-hosting` with suffixes (`-responses`, `-invocations`, `-telegram`, …). Public imports stay at `agent_framework.hosting`. |
+| 2 | `uvicorn` required vs optional extra? | Use **hypercorn** instead of uvicorn; the `serve` extra remains optional. `host.app` is still the canonical server-agnostic ASGI surface. |
+| 3 | Keep `HostedRunResult` wrapper or return `AgentResponse` directly? | **Keep `HostedRunResult`.** It wraps both `AgentRunResult` *and* the unknown output type of a `Workflow`, and adds host-run metadata (resolved session, etc.). |
+| 4 | Where do generic auth helpers live? | Only the **mechanisms** live in core. Concrete implementations sit in their own packages when they pull dependencies; dep-free helpers may live in `hosting`. |
+| 7 | `protocol_request` typed (`Any`) or typed kwargs? | **Keep `Any`.** |
+| 9 | Allow nested routers / `path=""`? | **Yes.** The host developer is responsible for ensuring routes do not overlap. |
+| 11 | Should the host support multiple targets? | **No** — final. Solve a layer above (an external router that owns multiple single-target hosts). |
+| 13 | Which identity linkers ship in phase 1? | **Entra linker** (in the Entra package) + **one-time-code linker** (in core). Drop MFA for now; investigating additional linkers tracked as a follow-up. |
+| 15 | Identity resolver invoked once on host vs per channel? | **Once on the host** with `ChannelIdentity(channel, native_id, ...)`. |
+| 16 | Should `IdentityLinker` and `Channel` share a base `Contributor` protocol? | **A linker *is* a Channel — specialised.** Use the single Channel-shaped contract; collapse `IdentityLinker` into a Channel specialisation. |
+| 18 | Contract for `ChannelPush` failures? | **Annotate the failure on the relevant `deliveries[]` entry** in the data model (see §"Delivery tracking on assistant messages"). Re-delivery is future work (Q26). |
+| 19 | `host.run_in_background(...)` `notify` callback? | Programmatic non-channel delivery will be expressed via the **`continuation_token`** mechanism (see Q20), not a separate `notify` callback. |
+| 20 | Storage / TTL of `ContinuationToken`s? | **Done in this revision.** `ContinuationToken` is the type, with an opaque `token: str` field that channels surface to callers; equivalent continuation-token support is added to the **Invocations channel** alongside the existing Responses behaviour. Push-capable channels can still use it; default behaviour remains "push on completion", but the developer can choose other UX (poll-after-push, hybrid, …). Persistence is the **`HostStateStore`** seam — v1 default is **`FileHostStateStore`** (atomic JSON writes, 24h TTL on completed entries), so background runs survive host restarts. |
+| 21 | Partial-failure surfacing for `all_linked`? | **Handled by the `deliveries[]` array** in the data model, updated per-destination as each push attempt completes. |
+| 23 | Share one backing store contract for host-level vs `ContextProvider`? | **Stay separate protocols** (current draft direction confirmed). A deployment may still bind both onto the same physical backend. |
+| 24 | Where does the Foundry history provider live? | Tentative name **`FoundryHostedAgentHistoryProvider`**, in the **`foundry-hosting`** package (shares the dependency). Confirm with Foundry package owners before launch. |
+| 25 | `Channel.confidentiality_tier` opaque vs enum? | Keep as `str?` for now; can revisit before Release. |
+| 26 | Where does the delivery-replay mechanism live? | **In the Host**, but **out of scope for v1.** The on-message `deliveries[]` envelope is sufficient input for any future replayer. |
+
+### Decisions-driven follow-ups
+
+The following resolutions imply prose / API edits elsewhere in the spec body (not just the table above). Captured here so they aren't lost; the edits themselves are deferred to a separate pass.
+
+- **Q2** — Switch all install / `host.serve()` references from `uvicorn` to `hypercorn`.
+- **Q3** — Update `HostedRunResult` documentation to cover the workflow-output case and the host-run metadata it adds on top of `AgentRunResult`.
+- **Q11** — Strip any remaining "multi-target hedge" language from the spec body.
+- **Q13** — Update the linker catalogue: Entra (in Entra package) + one-time-code (in core); remove MFA references.
+- **Q16** — Collapse `IdentityLinker` into a Channel specialisation in the spec body (architecture diagrams, contracts, examples).
+- **Q20** — ✅ Done. `ContinuationToken` type carries an opaque `token: str`; routes use `/{continuation_token}`; Invocations channel gets equivalent continuation-token support; persistence via `HostStateStore` (v1 default file-based).
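
The "atomic JSON writes, 24h TTL on completed entries" behaviour recorded under Q20 can be illustrated with a minimal sketch. This is not the `FileHostStateStore` implementation: the class name `SketchFileStateStore`, its `put` / `get` methods, and the `completed_at` field are hypothetical, and the sketch assumes tokens are host-generated and filesystem-safe.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Any, Optional


class SketchFileStateStore:
    """Hypothetical stand-in for a file-based host state store (illustration only)."""

    def __init__(self, root: Path, ttl_seconds: float = 24 * 60 * 60) -> None:
        self._root = root
        self._ttl = ttl_seconds
        self._root.mkdir(parents=True, exist_ok=True)

    def _path(self, token: str) -> Path:
        # Assumption: tokens are opaque, host-generated, and safe as file names.
        return self._root / f"{token}.json"

    def put(self, token: str, record: dict[str, Any]) -> None:
        # Write to a temp file in the same directory, then swap it in with
        # os.replace so readers never observe a half-written JSON document.
        fd, tmp_name = tempfile.mkstemp(dir=self._root, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as handle:
                json.dump(record, handle)
                handle.flush()
                os.fsync(handle.fileno())
            os.replace(tmp_name, self._path(token))
        except BaseException:
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
            raise

    def get(self, token: str, now: Optional[float] = None) -> Optional[dict[str, Any]]:
        path = self._path(token)
        try:
            record = json.loads(path.read_text(encoding="utf-8"))
        except FileNotFoundError:
            return None
        now = time.time() if now is None else now
        completed_at = record.get("completed_at")
        # Only completed entries expire; in-flight runs are kept until they finish.
        if completed_at is not None and now - completed_at > self._ttl:
            path.unlink(missing_ok=True)
            return None
        return record
```

Writing the temp file into the same directory keeps the final `os.replace` on one filesystem, which is what makes the swap atomic; a cross-device rename would not give that guarantee.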
From 1e2599e68e43219f86bd921a43fc3b8701f91db9 Mon Sep 17 00:00:00 2001
From: eavanvalkenburg
Date: Fri, 22 May 2026 14:01:17 +0200
Subject: [PATCH 02/20] docs: renumber hosting channels ADR

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .../{0026-hosting-channels.md => 0027-hosting-channels.md} | 0
 1 file changed, 0 insertions(+), 0 deletions(-)
 rename docs/decisions/{0026-hosting-channels.md => 0027-hosting-channels.md} (100%)

diff --git a/docs/decisions/0026-hosting-channels.md b/docs/decisions/0027-hosting-channels.md
similarity index 100%
rename from docs/decisions/0026-hosting-channels.md
rename to docs/decisions/0027-hosting-channels.md

From a6027a1c9199713176f286611172f5115a446809 Mon Sep 17 00:00:00 2001
From: Eduard van Valkenburg
Date: Fri, 22 May 2026 14:55:56 +0200
Subject: [PATCH 03/20] Python: add agent-framework-hosting core package (#5638)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* feat(hosting): add agent-framework-hosting core package

New ``agent-framework-hosting`` package implementing ADR 0026 / SPEC-002: the channel-neutral host that lets a single ``Agent`` (or ``Workflow``) fan out across multiple wire protocols ("channels") behind one Starlette ASGI app.

Surface (re-exported from ``agent_framework_hosting``):

- ``AgentFrameworkHost`` — wraps a hostable target, mounts channels onto an ASGI app, owns per-isolation-key ``AgentSession`` reuse, threads request context (``response_id`` / ``previous_response_id``) into context providers via an ``ExitStack`` of ``bind_request_context`` calls, and exposes an opt-in Hypercorn ``serve()`` helper (extra ``[serve]``).
- ``Channel`` protocol + ``ChannelContribution`` — the surface a channel package implements (routes, lifespans, identity hooks, …).
- ``ChannelRequest`` / ``ChannelSession`` / ``ChannelIdentity`` / ``ChannelPush`` / ``ChannelCommand[Context]`` / ``ChannelRunHook`` / ``ChannelStreamTransformHook`` / ``DeliveryReport`` / ``HostedRunResult`` / ``ResponseTarget`` / ``ResponseTargetKind`` / ``apply_run_hook`` — channel-side dataclasses + helpers.
- ``IsolationKeys`` + ``ISOLATION_HEADER_USER`` / ``..._CHAT`` + ``get/set/reset_current_isolation_keys`` — the host's ASGI middleware reads the ``x-agent-{user,chat}-isolation-key`` headers off each inbound request and exposes them to the agent stack via a ``ContextVar`` so storage-side providers (e.g. ``FoundryHostedAgentHistoryProvider``) can apply per-tenant partitioning without channels having to forward anything.

Includes 45 unit tests covering the host, channel contributions, isolation contextvar, and shared types. Registers the package in ``python/pyproject.toml`` ``[tool.uv.sources]`` and adds the matching pyright ``executionEnvironments`` entry for tests. Hypercorn is an optional dependency (``[serve]`` extra); the soft import in ``serve()`` is annotated for pyright since it isn't on the default install.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(hosting): address PR-2 review comments

Source-code changes

- _suppress_already_consumed: narrow contract — RuntimeError now logs at WARNING with exc_info; non-RuntimeError still logs at exception(). Docstring clarifies that any non-clean teardown is observable.
- _BoundResponseStream: add aclose() and route __await__ through get_final_response() so the binding is always released — fixes contextvar leak when channels abandon the stream or use the await-the-stream convenience.
- Lifespan: aggregate startup/shutdown callback errors; every callback runs, all failures are logged with their qualname, and the first error is re-raised so Starlette still aborts boot.
- _build_run_kwargs: switch session-cache write to dict.setdefault so concurrent racers cannot orphan a session if create_session ever yields.
- _deliver_response: introduce DeliveryReport.failed for push outages vs explicit "no link" drops; an outage no longer triggers an originating fallback so the channel can decide degraded behaviour.

Test additions

- tests/test_isolation.py (new): full coverage of IsolationKeys, the contextvar helpers, header constants, and end-to-end ASGI middleware lift / reset / passthrough.
- tests/test_host.py: TestBindRequestContext, TestBoundResponseStream (aclose / __await__ / __getattr__ forwarding / double-close idempotency), TestWrapInputListMessages (list[Message] LAST precedence), TestLifespanAggregation (startup + shutdown).
- tests/test_types.py: TestApplyRunHook (sync/async/None), and TestDeliveryReport (new failed field).
- Updated test_push_exception_marks_skipped -> test_push_exception_lands_in_failed_no_fallback to match the new delivery contract.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(hosting): address PR-2 round-2 review comments

- Refactor workflow checkpoint restoration into shared helpers (_restore_workflow_checkpoint for blocking; the streaming sibling drains the rehydration stream) so the blocking and streaming paths rehydrate identically — clarifies the previously inline _maybe_restore by hoisting the pattern next to the blocking call site.
- Document that blocking workflow output is text-only by design; richer modalities ride the streaming AgentResponseUpdate channel, which preserves all content parts.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* review: address PR-4 _host.py round 2 feedback

These review comments were filed on PR-4 (#5640) but target lines that live in the hosting-core package (PR-2 / #5638), so the fixes land here and PR-4's stack will pick them up on rebase.

- _suppress_already_consumed: narrow the RuntimeError catch to the two documented benign messages (`Inner stream not available`, `Event loop is closed`); any other RuntimeError now logs at ERROR with a full traceback so executor bugs / runner-context state errors / checkpoint RuntimeErrors during the post-run flush no longer masquerade as benign cleanup noise. Still no propagation (we're in an async-generator finally during teardown) — see the docstring.
- _restore_workflow_checkpoint{,_streaming}: log a WARNING when a non-None latest checkpoint drains to zero events, so a stale or partially-written checkpoint_id surfaces as an operator signal instead of a silent state-loss.

(The `deliver_response` "no destinations resolvable" vs "every destination errored" concern raised in 3198268038 is already addressed by the existing `failed` vs `skipped` distinction surfaced through `DeliveryReport.failed` — see lines 1080-1102 and the `DeliveryReport` docstring.)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(hosting): reject path-traversal patterns in checkpoint isolation_key

The host's `_resolve_checkpoint_storage` joined `request.session.isolation_key` directly into the configured `checkpoint_location`. The key is caller-controlled — sourced from inbound headers (`x-agent-{user,chat}-isolation-key` injected by the Foundry runtime), from channel-supplied derivations such as `telegram:` / `entra:`, or from values set by a channel `run_hook`. A value like `../../../etc/foo` or an absolute path would let the resulting checkpoint directory escape the configured root (CWE-22). This matches the path-traversal class fixed upstream in #5851 for the foundry_hosting checkpoint storage.

New `_checkpoint_path_for_isolation_key(root, isolation_key)` helper:

- Uses a denylist (not allowlist) so legitimate namespaced keys (`telegram:42`, `entra:abc-def`) continue to pass through unmodified.
- Rejects path separators (`/`, `\`), NUL, all-dot reductions (`.`, `..`, `...`, ...), absolute paths (`os.path.isabs`), and drive-letter prefixes (`os.path.splitdrive` plus an explicit `^[A-Za-z]:` check so payloads crafted on a POSIX host still fail closed if the resulting directory ever round-trips to Windows storage).
- After joining, resolves both sides and verifies `target.is_relative_to(root)` as defence-in-depth.

`_resolve_checkpoint_storage` now logs a WARNING and returns `None` for invalid keys rather than crashing the request — checkpointing is best-effort and we prefer dropping it to letting one malformed key abort an otherwise valid agent run.

Tests:

- `TestCheckpointPathForIsolationKey` exercises the helper directly with legitimate keys (alphanumeric, `:`-namespaced, dotted, 200-char), all rejected traversal patterns from #5851's MSRC repro list, and non-string input.
- `TestHostWorkflowCheckpointingPathTraversal` verifies the end-to-end request path: a traversal key (`../escape`) and an in-key separator (`evil/sub`) both produce a successful agent response with no files written under `checkpoint_location`, and the traversal case logs a WARNING citing `isolation_key`.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(hosting): address PR-2 round-3 review feedback + add response hooks

Round-3 review comment fixes:

- _types.py: drop the _EMPTY_MAPPING sentinel; ChannelIdentity.attributes uses plain dict() as the default — simpler, no extra symbol to track.
- _host.py: drop the local `import asyncio` + `from typing import cast as _cast` inside `serve()`; rely on the module-level imports.
- _host.py: switch `_log_incoming` to structured `extra={...}` payloads for both INFO and DEBUG so log aggregators get queryable fields.
- _host.py: delete `_flat_context_providers` and stop descending into a `.providers` attribute. Aggregator providers (AggregateContextProvider / ContextProviderBase) are responsible for forwarding `response_context` to their children themselves; the host treats whatever `agent.context_providers` exposes as the final, flat list.
- _host.py: stop collapsing agent / workflow output to text. `_invoke` forwards `AgentResponse.messages` (and `raw_response`) on the `HostedRunResult`. `_invoke_workflow` builds a per-event message list via a new `_workflow_output_to_messages` helper that preserves AgentResponse / AgentResponseUpdate / Message / Content branches and falls back to text only for arbitrary objects.
- _host.py: `_workflow_event_to_update` carries Content payloads through unchanged so multi-modal workflow outputs (images, function-call metadata, ...) survive into channels.

New features (per design discussion in the PR thread):

- HostedRunResult: rebuilt around `messages: list[Message]` with `.text` / `.contents` as projections, a `raw_response` slot for the underlying AgentResponse, and a `replace(messages=..., raw_response=...)` clone helper used by the delivery layer for per-destination isolation. The `HostedRunResult(text="...")` ctor is preserved as a back-compat shim that synthesises a single assistant text message.
- ResponseTarget: gain `echo_input: bool = False` (also exposed on `.channel(name, *, echo_input=...)` / `.channels([...], *, echo_input=...)`). When set, the host pushes the originating user message to each non-originating destination before the agent reply. Channels can filter or transform echoes via their response_hook.
- DeliveryReport: add `echoed` / `echo_failed` tuples to surface per-destination outcomes of the new echo phase. Echo failures do not abort the corresponding response push on the same destination.
- ChannelResponseHook + ChannelResponseContext + apply_response_hook: duck-typed `response_hook` attribute on channels for per-destination post-processing. Receives a clone of the HostedRunResult and a context carrying the request, channel name, destination identity, originating flag, and `is_echo` phase flag. Channels stay modality-aware (text-only wires flatten via the hook; card-capable channels render structured contents directly).
- _deliver_response: clone-before-hook fan-out so a hook mutating one channel's payload cannot leak into another destination's view.

Tests:

- Update _FakeAgentResponse to expose `.messages` (single assistant text message synthesised from `text`) so existing tests pass unchanged on the new multi-modal _invoke path.
- Replace the obsolete `test_bind_descends_one_level_into_providers_attribute` with a regression guard asserting the host does NOT descend into `.providers` (matches new contract).
- New tests for HostedRunResult multi-modal preservation, echo_input fan-out with success + failure, response_hook applied per destination, per-destination mutation isolation, and is_echo phase observability.

Docs:

- spec 002: rewrite Canonical flow with the new input → run_hook → host → target → wrap → per-destination clone → response_hook → push pipeline; document multi-modality contract and per-destination cloning; add `echo_input` row to ResponseTarget table; rewrite HostedRunResult/HostedStreamResult row; add ChannelResponseHook / ChannelResponseContext / apply_response_hook table; log decisions Q28 (no host-side text collapse), Q29 (duck-typed response_hook), Q30 (opt-in `echo_input` on ResponseTarget).
- ADR 0026: add ChannelResponseHook + multi-modality bullets; surface `echo_input` on the ResponseTarget bullet.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(hosting): drop HostedRunResult(text=...) back-compat shim; use from_text()

Pre-release cleanup — no released callers to break, so consolidate on one canonical entry point plus a classmethod for the ergonomic single-text-message case:

- HostedRunResult.__init__ takes ``messages`` positionally (required); no more ``text=`` kwarg overload, no more "synthesise an empty message when no args" path.
- New HostedRunResult.from_text(text, *, role="assistant", raw_response=None) classmethod for the common "wrap a single text content as one message" case (tests, channels emitting plain strings, the echo-input phase wrapping a user's text turn).
- ``_build_echo_payload`` uses ``HostedRunResult.from_text(raw, role="user")`` for the ``str`` and fallback branches; the other branches use the plain ctor with explicit ``Message`` lists.
- Tests rewritten to use ``from_text("reply")`` everywhere ``HostedRunResult(text="reply")`` appeared. Added an explicit ``test_from_text_role_kwarg_overrides_default`` regression guard.
- spec 002: HostedRunResult row updated to describe the ``from_text(text, *, role="assistant")`` classmethod instead of the removed back-compat shim.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* refactor(hosting-core): reshape HostedRunResult into generic typed envelope

Replace the flattened multi-modal HostedRunResult (carrying messages/raw_response/.text projections) with a typed generic envelope around the target's full-fidelity output:

    class HostedRunResult(Generic[TResult]):
        result: TResult
        session: AgentSession | None

- Agent targets produce HostedRunResult[AgentResponse]; channels read result.messages, result.text, result.value, result.response_id, result.usage_details directly off the underlying response.
- Workflow targets produce HostedRunResult[WorkflowRunResult]; channels iterate result.get_outputs() and inspect result.get_final_state() themselves (the host no longer collapses workflow outputs onto a synthesised message list).
- The echo-input phase synthesises a HostedRunResult[AgentResponse] wrapping the user's turn so the same per-destination delivery machinery applies.
- replace() is now {result, session} only; the host's clone is shallow — channels that need to mutate result itself are responsible for their own deep copy.

Rationale: the earlier shape pre-shaped target output (collapsing workflows onto a Message list, losing per-executor outputs, final state, and structured value affordances). Carrying the target output unchanged keeps the host modality-agnostic, gives channel authors static typing where they want it, and removes 30+ lines of host-side projection helpers.

Also updates ADR 0026 + spec 002 (Q3, Q28, Q29 amended; new Q31 captures the generic-envelope decision and rationale).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(hosting-core): document echo vs response distinction for push channels

The host already encodes the echo-vs-response phase via the underlying Message.role on the pushed HostedRunResult:

- echo phase: payload.result.messages[*].role == "user"
- response phase: payload.result.messages[*].role == "assistant"

Both pushes go through the same ChannelPush.push(identity, payload) entry point. Channels distinguish either by inspecting role (which works for any push-capable channel) or — when a response_hook is wired — by branching on ChannelResponseContext.is_echo directly.

Expand the ChannelPush Protocol docstring to make this discoverable for channel implementers (esp. chat bots that cannot impersonate the user on their wire and need to render echoes as quoted / prefixed blocks rather than as bot replies). Mirror the explanation into the spec's echo_input section.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(hosting-core): fix quickstart to use current Agent API

ChatAgent was renamed to Agent and the preferred construction pattern is client.as_agent(...). Also drop the sibling channel import so the snippet imports only modules declared as dependencies of this package; point readers at the sibling packages instead.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* test(hosting-core): drop redundant @pytest.mark.asyncio decorators

asyncio_mode = "auto" is configured in pyproject.toml, so individual @pytest.mark.asyncio decorators are unnecessary.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(hosting): add authorization profiles + IdentityAllowlist seam to ADR/spec

Composes `require_link` + `allowlist` into three named profiles (open, forced-link, allowlist) with the allowlist itself keyed on either the channel-native id (pre-link) or a verified IdP claim (post-link), plus `AnyOf`/`AllOf` combinators for mixed setups. Lifts the design into an explicit host seam (`host.authorize(...)` → `AuthorizationOutcome` of `Allowed` / `LinkRequired` / `Denied`) instead of leaving each channel to roll its own.

Key contract bits:

- Tri-state `AllowlistDecision` (ALLOW / DENY / ABSTAIN) so claim-based lists can ABSTAIN until claims are available without composition silently flipping that into DENY.
- `AuthorizationContext` carries explicit `phase` + `claim_source` so allowlists can tell pre-link from post-link without overloading `verified_claims is None`.
- Channel-side `allowlist: ... | Literal["inherit"] | None` with an explicit inheritance sentinel, so the host-level `default_allowlist` is opt-out, not opt-in.
- Construction-time validator rejects silent-deny configurations (`LinkedClaimAllowlist` without a claim source) with a typed `ChannelConfigurationError`.
- Group-chat denial mirrors the existing `LinkChallenge` DM-redirect pattern; only the redacted `user_message` reaches the wire, structured `log_details` stay in telemetry.
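
The tri-state composition named in the key contract bits can be sketched as follows. The `AllowlistDecision` name and its three members come from the commit message; the `any_of` / `all_of` helper functions and the exact precedence rules they apply are assumptions chosen for illustration (including treating an empty `all_of` as ABSTAIN rather than a vacuous ALLOW), not the package's combinator implementation.

```python
from enum import Enum
from typing import Iterable


class AllowlistDecision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ABSTAIN = "abstain"


def any_of(decisions: Iterable[AllowlistDecision]) -> AllowlistDecision:
    """Assumed AnyOf rule: one ALLOW wins; otherwise an explicit DENY; otherwise ABSTAIN."""
    seen = set(decisions)
    if AllowlistDecision.ALLOW in seen:
        return AllowlistDecision.ALLOW
    if AllowlistDecision.DENY in seen:
        return AllowlistDecision.DENY
    # Nothing had an opinion: stay ABSTAIN instead of silently denying.
    return AllowlistDecision.ABSTAIN


def all_of(decisions: Iterable[AllowlistDecision]) -> AllowlistDecision:
    """Assumed AllOf rule: any DENY wins; any ABSTAIN (or no members) defers; else ALLOW."""
    seen = set(decisions)
    if AllowlistDecision.DENY in seen:
        return AllowlistDecision.DENY
    if AllowlistDecision.ABSTAIN in seen or not seen:
        return AllowlistDecision.ABSTAIN
    return AllowlistDecision.ALLOW
```

Keeping ABSTAIN as a distinct result is what lets a claim-based list defer before a link exists; the caller (here, the host's authorization seam) then decides what an unresolved ABSTAIN maps to.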

Ships in two waves: the Protocol + `NativeIdAllowlist` + config validator land with the next core PR ahead of the linker; the full pipeline + `LinkedClaimAllowlist` enforcement land with the `IdentityLinker` core PR.

Updates: ADR 0026 (summary bullet + conceptual-API table row + resolved Q16), spec 002 (new req #22, renumbered v1 fast-follow #23..#29 and stretch #30..#31, new "Authorization profiles and the IdentityAllowlist seam" subsection, inbound-ownership row, resolved Q32, follow-up entry).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(hosting): add DurableTaskRunner seam + runtime_mode auto-detect

Introduces the explicit long-running vs ephemeral runtime distinction and a generic DurableTaskRunner Protocol that owns non-originating push dispatch — collapsing the previous deliveries[] per-destination state machine, SupportsDeliveryTracking provider capability, and Foundry update_item service ask down to a single immutable intended_targets[] write on the message.

Spec / ADR:

- New §"Runtime modes" with auto-detect markers + defaults matrix.
- Rewrites §"Delivery tracking" → §"Intended targets + durable delivery": intent-only on the message, operational state lives in the runner.
- New §"Durable task runner" defining DurableTaskRunner / RetryPolicy / TaskHandle / TaskStatus.
- Drops §SupportsDeliveryTracking and §Foundry update_item gap.
- Resolved Qs: 12, 18, 21, 26 revised; new 17/18/19 (ADR) and 33/34/35 (spec).

Code:

- New _runner.py with InProcessTaskRunner (asyncio + bounded retry, bounded terminal-status cache, register-after-start guard, shutdown drain).
- _host.py: runtime_mode + durable_task_runner ctor params; auto-detect via FOUNDRY_HOSTING_ENVIRONMENT / AZURE_FUNCTIONS_ENVIRONMENT / AWS_LAMBDA_FUNCTION_NAME; HOSTING_PUSH_TASK_NAME handler registered eagerly so _deliver_response can be called outside the lifespan; _handle_push_task does echo-then-response inline per destination; _deliver_response now schedules one task per destination via the runner (DeliveryReport.pushed = scheduled; .failed = schedule-time outage only).
- _types.py: new DurableTaskRunner Protocol + RetryPolicy / TaskHandle / TaskStatus; DeliveryReport drops echoed / echo_failed (echo outcome owned by the runner).
- __init__.py exports the new public surface.

Tests: 132 passing, 90% coverage. New test_runner.py covers InProcessTaskRunner success/retry/terminal-failure/cancellation/register-after-start, runtime-mode auto-detect with synthetic env, and the warning-on-ephemeral-without-runner path. test_host.py delivery tests use a sync runner fake for deterministic assertions and validate the new "schedule succeeded vs runner backend unreachable" semantics.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(hosting): rubber-duck round-5 — strict ephemeral, codec seam, allowlist Wave-1, drop DeliveryReport

Adopts the rubber-duck-approved package of changes from the round-5 review of PR #5638 (modulo DeliveryReport.failed — the value type is removed entirely now that durable delivery covers the failure surface, per user direction).

Code:

- Drop DeliveryReport value type; host-internal _deliver_response returns bool. Failure observability is now logs (in-process) / runner backend (durable adapters).
- Strict ephemeral default: ephemeral runtime_mode with the default in-process runner raises RuntimeError; opt-in via allow_in_process_runner=True (warns).
- ChannelPushCodec Protocol + DurableTaskPayloadMode enum + _validate_runner_codec_pairing so JSON-mode runners can be safely paired with channels via codecs; _handle_push_task accepts both object- and JSON-envelope shapes.
- ResponseTarget.identity(...) / .identities([...]) builders + IDENTITIES kind for explicit caller-supplied recipients; field rename identities → _target_identities (private) with a target_identities property to resolve the classmethod collision.
- Intent-only audit: _annotate_intended_targets writes hosting.intended_targets / skipped_targets / includes_originating / originating_channel onto assistant messages — single immutable write per the runner-owned operational-state model.
- InProcessTaskRunner: 2-phase drain on shutdown (shutdown_grace_seconds, default 5.0) so a clean shutdown does not abandon work mid-retry; payload_mode = OBJECT class-level.
- Echo idempotency: _handle_push_task tracks an echo_done cursor on runner-owned task state so a retry that fires after the echo phase succeeded does not double-echo.

Wave-1 authorization seam (full landing):

- New _authorization.py with AllowlistDecision tri-state, AuthorizationContext, IdentityAllowlist Protocol, AllowAll / NativeIdAllowlist (with async loader cache + channel-scope ABSTAIN) / LinkedClaimAllowlist (raise-until-Wave-2) / AnyOfAllowlists / AllOfAllowlists / CallableAllowlist built-ins, Allowed / LinkRequired / Denied outcomes, ChannelConfigurationError.
- Host(default_allowlist=..., identity_linker=...) + per-channel allowlist parameter with 'inherit' / None semantics.
- _validate_channel_authorization enforces all three rules at construction: claim-source requirement, linker presence for require_link=True (elevated from no-op — must not ship unenforced), and NativeIdAllowlist(channel=...) typo detection. Combinator-walking via _flatten_allowlists catches nested misconfigs.
- host.authorize(...) for the native-id pipeline: open path returns Allowed with auto-issued : isolation key (or the existing key when the identity has been seen); ABSTAIN on a claim-required allowlist maps to Denied(reason_code='allowlist_requires_link') until Wave 2 wires the linker to convert it to LinkRequired.

Spec / ADR:

- docs/specs/002-python-hosting-channels.md: Wave-1 status updated to reflect the linker-presence rule elevation and the host.authorize landing; new sub-sections (codec contract, drain, echo cursor); Qs 18 / 21 DeliveryReport references purged; new resolved Qs 36–40 covering the strict-ephemeral default, codec contract, DeliveryReport removal, echo cursor, and drain.
- docs/decisions/0026-hosting-channels.md: Q12 DeliveryReport reference purged; Q16 updated to reflect Wave-1 landing; new resolved Qs 20 (codec contract) + 21 (strict ephemeral / drain / echo cursor).

Tests:

- New tests/test_authorization.py (35 cases) covering every Wave-1 built-in, the three validator rules, combinator decision semantics, and host.authorize across open / allow / deny / abstain-with-claim-dep / abstain-without-claim-dep paths plus existing-key reuse and verified-claims propagation.
- tests/test_host.py: TestDeliverResponse rewritten for the bool return + runner.scheduled-count assertions; new tests for IDENTITIES variant + echo idempotency.
- tests/test_runner.py: strict-ephemeral now expects RuntimeError; allow_in_process_runner opt-in tests; shutdown drain test; payload_mode default test.
- tests/test_types.py: TestDeliveryReport removed; new TestDurableTaskPayloadMode + TestResponseTargetIdentities.

Validation: 178 tests pass, 91% coverage, fmt + lint + pyright + mypy clean.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(hosting): add mermaid flow diagrams to ADR, spec, README Insert the 10 hosting flow diagrams reviewed in python/.user/hosting-diagrams.md into the public docs: - README: runtime topology (1a) + cross-link to the spec for the richer set. - ADR: runtime topology, channel contribution shape, and authorization decision (1a, 1b, 3) at the end of 'Conceptual API shape'. - Spec: all 10 diagrams — 1a/1b at the top of API Surface, 2 in Canonical flow, 3 in Authorization profiles, 4-7 in Scenarios 6-8, 8 in Codec contract, 9 in Echo idempotency, 10 in Scenario 9. Doc-only; no API or behaviour change. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(hosting): add opt-in disk persistence via state_dir Long-running hosts (always-on container, single-VM bot, local dev) lose state on every restart today. Add an opt-in disk persistence layer under a new `state_dir` constructor parameter on `AgentFrameworkHost` that survives process restarts without taking on a heavyweight database dependency. Backed by `diskcache` (installed via the new `[disk]` optional extra). An OS-level advisory file lock guarantees single-owner semantics so two hosts pointed at the same directory cannot double-execute scheduled pushes. What persists when `state_dir` is set: - Pending durable-task records — scheduled-but-not-yet-completed pushes replay on the next host startup via `InProcessTaskRunner.resume()`. Records that crashed mid-attempt resume with the already-consumed retry budget (no full-budget re-grant). - `_session_aliases` — per-isolation-key session-id rewrites. - `_active` — most-recently-active channel per isolation key. - `_identities` — `ChannelIdentity` rows for fan-out targeting, including nested mutations of the form `self._identities[ik][channel] = identity`. The `state_dir` parameter accepts any of: - `None` — today's purely in-memory behaviour. 
- `str` / `PathLike` — single root; host auto-creates `runner/` and `sessions/` subfolders. - `HostStatePaths` TypedDict / plain mapping — per-component overrides routed to different roots. Unknown keys raise `ValueError` to surface typos early. Unpicklable push payloads raise `PushPayloadNotPicklable` eagerly from `schedule()` so issues surface at the call site rather than on the next restart. Corrupt on-disk records are quarantined-and-logged; the runner never crashes on resume. Live `AgentSession` objects stay in memory and are rehydrated lazily by the history provider on the next turn. - New modules: `_persistence.py` (lock + normalisation), `_state_store.py` (session-bookkeeping store). - Runner rewrite: 4-state model (`pending` / `succeeded` / `failed` / `cancelled`); the transient `running` state was a bug that caused resume to skip records that crashed mid-handler. - New tests: `test_runner_disk.py` (8 tests), `test_host_disk.py` (8 tests). 194 passed total. pyright + mypy + ruff clean. - README: new "Optional disk persistence" section with code samples. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(hosting): add checkpoints to state_dir + fix host docstring Three related polish changes on top of the disk-persistence landing: 1. Extend `state_dir` to cover workflow checkpoints. Adds `checkpoints` as a third `HostStatePaths` key. Single-path form (`state_dir="/foo"`) now also auto-derives `/foo/checkpoints/` for workflow targets (equivalent to passing `checkpoint_location="/foo/checkpoints"`). The mapping form lets workflow callers opt out by omitting the key, or route checkpoints to a different volume. Conflict / precedence rules: * Explicit `checkpoint_location` always wins over the state_dir derived path; a warning surfaces the double-config. * Single-path `state_dir` + non-Workflow target → checkpoints path silently ignored (no eager directory creation either). 
* Mapping form with `checkpoints` + non-Workflow target → warn (almost certainly dead config). * Derived path with a workflow that already has its own `checkpoint_storage` → same `RuntimeError` as the explicit parameter triggers, so ownership stays unambiguous. Checkpoint persistence uses `FileCheckpointStorage` from the framework core — no extra dependency. Only `runner` and `sessions` require the `[disk]` extra. 2. Move `AgentFrameworkHost.__init__` parameter docs from `Args:` to `Keyword Args:` for every parameter after the `*`. Only `target` remains under `Args:`. Brings the docstring in line with the actual signature (the params have always been keyword-only). 3. `HostStatePaths` already existed as a TypedDict but did not cover `checkpoints`; updated to document the new key with the same per-attribute docstring style as `runner` / `sessions` so editors can surface help on the keys. Validation: 201 tests pass (was 194; +7 checkpoint integration tests in test_host_disk.py). pyright + mypy + ruff + bandit clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(hosting): add core IdentityLinker authorization seam Fold the core IdentityLinker pieces into the hosting-core PR so the authorization surface no longer has a deferred Wave-2 placeholder. Provider-specific linkers (for example Entra OAuth helpers) can now plug into core without core depending on an IdP SDK. Core additions: - Add LinkChallenge, LinkedIdentity, LinkResolution, and IdentityLinker. IdentityLinker.resolve(identity) is a single-call decision that returns either a linked identity with verified claims or a challenge the channel can render. - Enable LinkedClaimAllowlist end-to-end. It now abstains pre-link and allows/denies post-link against verified claims, including multi-valued claims such as groups. - Add AuthPolicy factories for common allowlist shapes. 
- Extend Allowed with verified_claims and claim_source for audit/telemetry without requiring callers to re-derive how the decision was made. Host behavior: - identity_linker is now typed as IdentityLinker | None. - authorize() supports open, native-id, forced-link, and linked-claim profiles end-to-end. - require_link=True resolves via the linker and returns LinkRequired when the identity is not linked. - claim-based allowlists use channel-emitted verified_claims when present, or linker-resolved claims otherwise. - authorize() remains decision-only and does not mutate _identities/_active; identity registry writes remain on the actual request execution path. Docs/tests: - Remove Wave-1/Wave-2 language from core/spec/ADR surfaces touched here. - Update the spec/ADR to describe the core linker seam and provider-specific linker packages. - Add authorization tests for linker challenges, linked identities, linked claim allowlists, channel-emitted claims, AuthPolicy factories, and the no-mutation contract. Validation: 214 tests pass, pyright/mypy/ruff clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(hosting): add link-store path to state_dir Identity linking introduces host-adjacent state that needs the same state_dir treatment as runner, session, and checkpoint state. Add a links component to the host state paths so applications and linker packages have a typed, discoverable persistence location. Changes: - Extend HostStatePaths with links and include it in state_dir normalization (state_dir/links/ for the single-path form). - Add SupportsLinkStorePath, an optional protocol for identity linkers that accept a host-provided link-store path. - AgentFrameworkHost now offers state_dir links to compatible linkers, warns when an explicit links path is supplied without a linker, and warns when the configured linker manages persistence directly instead of implementing SupportsLinkStorePath. 
- Update README and spec text to document the link-store component and clarify that concrete linkers still own the storage format. - Add disk-state tests for compatible, missing, and non-configurable linkers. Validation: 217 tests pass, pyright/mypy/ruff clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/decisions/0027-hosting-channels.md | 123 +- docs/specs/002-python-hosting-channels.md | 878 +++++- python/packages/hosting/LICENSE | 21 + python/packages/hosting/README.md | 192 ++ .../agent_framework_hosting/__init__.py | 143 + .../agent_framework_hosting/_authorization.py | 485 ++++ .../hosting/agent_framework_hosting/_host.py | 2353 +++++++++++++++++ .../agent_framework_hosting/_isolation.py | 76 + .../agent_framework_hosting/_persistence.py | 195 ++ .../agent_framework_hosting/_runner.py | 751 ++++++ .../agent_framework_hosting/_state_store.py | 402 +++ .../hosting/agent_framework_hosting/_types.py | 915 +++++++ python/packages/hosting/pyproject.toml | 110 + python/packages/hosting/tests/__init__.py | 0 .../hosting/tests/_workflow_fixtures.py | 43 + .../hosting/tests/test_authorization.py | 580 ++++ python/packages/hosting/tests/test_host.py | 1846 +++++++++++++ .../packages/hosting/tests/test_host_disk.py | 424 +++ .../packages/hosting/tests/test_isolation.py | 282 ++ python/packages/hosting/tests/test_runner.py | 333 +++ .../hosting/tests/test_runner_disk.py | 278 ++ python/packages/hosting/tests/test_types.py | 252 ++ python/pyproject.toml | 2 + python/uv.lock | 44 + 24 files changed, 10620 insertions(+), 108 deletions(-) create mode 100644 python/packages/hosting/LICENSE create mode 100644 python/packages/hosting/README.md create mode 100644 python/packages/hosting/agent_framework_hosting/__init__.py create mode 100644 python/packages/hosting/agent_framework_hosting/_authorization.py create mode 100644 
python/packages/hosting/agent_framework_hosting/_host.py create mode 100644 python/packages/hosting/agent_framework_hosting/_isolation.py create mode 100644 python/packages/hosting/agent_framework_hosting/_persistence.py create mode 100644 python/packages/hosting/agent_framework_hosting/_runner.py create mode 100644 python/packages/hosting/agent_framework_hosting/_state_store.py create mode 100644 python/packages/hosting/agent_framework_hosting/_types.py create mode 100644 python/packages/hosting/pyproject.toml create mode 100644 python/packages/hosting/tests/__init__.py create mode 100644 python/packages/hosting/tests/_workflow_fixtures.py create mode 100644 python/packages/hosting/tests/test_authorization.py create mode 100644 python/packages/hosting/tests/test_host.py create mode 100644 python/packages/hosting/tests/test_host_disk.py create mode 100644 python/packages/hosting/tests/test_isolation.py create mode 100644 python/packages/hosting/tests/test_runner.py create mode 100644 python/packages/hosting/tests/test_runner_disk.py create mode 100644 python/packages/hosting/tests/test_types.py diff --git a/docs/decisions/0027-hosting-channels.md b/docs/decisions/0027-hosting-channels.md index 66610a14c41..24fc7ae04b7 100644 --- a/docs/decisions/0027-hosting-channels.md +++ b/docs/decisions/0027-hosting-channels.md @@ -110,10 +110,12 @@ We will introduce a new hosting core distribution package per language. The full - **Hostable target** — may be either an **agent** (per-language agent execution seam) or a **workflow** (per-language workflow execution seam). The host detects the kind and dispatches; channels are unchanged. - **Channel**, **`ChannelContext`**, **`ChannelRequest`**, **`ChannelSession`**, **`ChannelContribution`**, **`ChannelCommand`** — the channel-authoring surface. Defined in Terminology. - **`ChannelRunHook`** — the developer's runtime escape hatch over the uniform `ChannelRequest` envelope. 
Channels translate their native protocol payload into `ChannelRequest`; the hook then runs **after** that translation and **before** target invocation, receiving and returning a `ChannelRequest`. Examples (illustrative): reshaping a chat message into a workflow's typed input, dropping/injecting `ChatOptions` fields, enforcing required options, overriding `session_mode` / `response_target`. -- **`IdentityResolver`** + **`IdentityLinker`** — the channel-neutral identity stack. Resolver maps channel-native ids to `isolation_key`; linker runs the **link/connect ceremony** (OAuth / MFA / signed one-time code) so a new channel can join an existing `isolation_key`. The host owns the routes and short-lived state the linker needs; channels surface entry points. Channels may declare `require_link=True` to enforce "authenticate before chatting", and the linker stores verified IdP claims (e.g. Entra ID `oid`) so subsequent channels that supply the same claim are auto-merged onto the same `isolation_key` without a second ceremony. -- **`ResponseTarget`** + **`ChannelPush`** + **`ContinuationToken`** + **active channel** — the response-delivery stack. `ResponseTarget` decouples *where* a response is delivered from *where* it originated; `ChannelPush` is the optional channel capability used for non-`originating` delivery; `ContinuationToken` makes background runs first-class with a stable id and status; the host tracks last-seen `(isolation_key, channel)` to resolve `response_target="active"`. +- **`ChannelResponseHook`** — the *outbound* counterpart to `ChannelRunHook`. Applied per destination after target invocation, before the channel pushes the result onto its wire. Receives the `HostedRunResult` and a `ChannelResponseContext` (request, channel name, destination identity, originating flag, `is_echo` phase flag) and returns a (possibly transformed) `HostedRunResult`. 
Used for channel-side projections of the target's output: a text-only wire reads `result.result.text` (for agent targets) or projects `result.result.get_outputs()` into a single text turn (for workflow targets); a card-capable channel iterates the underlying contents; a workflow result with a typed final output can be rebound to a channel-friendly `AgentResponse` via `result.replace(result=...)`. Hooks are stored as a `response_hook` attribute on the channel instance — duck-typed, not part of the `Channel` Protocol, so adding hook support to a new channel package never breaks the Protocol contract. The host clones the `HostedRunResult` envelope per destination before invoking the hook so one channel's `replace(result=...)` cannot leak into another's payload. +- **`HostedRunResult[TResult]` is a generic typed envelope around the target's full-fidelity output.** For agent targets `TResult` narrows to `AgentResponse` (channels read `result.messages`, `result.value`, `result.usage_details`, `result.response_id`, … directly); for workflow targets to `WorkflowRunResult` (channels iterate `result.get_outputs()` / inspect `result.get_final_state()`). The host never collapses or pre-shapes — multi-modality and structured outputs survive end-to-end. The envelope also carries the resolved `session: AgentSession | None` (None for workflows, which do not own session state in the agent sense). Channels decide what subset their wire can carry through their `response_hook` and their native serializer. +- **`IdentityResolver`** + **`IdentityLinker`** + **`IdentityAllowlist`** — the channel-neutral identity stack. Resolver maps channel-native ids to `isolation_key`; linker runs the **link/connect ceremony** (OAuth / MFA / signed one-time code) so a new channel can join an existing `isolation_key`. The host owns the routes and short-lived state the linker needs; channels surface entry points. 
`IdentityAllowlist` is the **authorization** seam, orthogonal to linking: combined with the per-channel `require_link: bool` it produces three named profiles — **open** (default), **forced-link** (any authenticated identity), and **allowlist** (native-id list, IdP-claim list, or composition of both). Decisions are tri-state (`ALLOW` / `DENY` / `ABSTAIN`) so the host can run the allowlist twice — once with the raw channel identity and again with linker-emitted claims — and compose multiple lists (`AnyOfAllowlists`, `AllOfAllowlists`) without one list's missing information silently denying the request. The host runs a startup validator that rejects silent-deny-everyone configurations (e.g. a claim-based allowlist with no source of verified claims). Channels may declare `require_link=True` to enforce "authenticate before chatting", and the linker stores verified IdP claims (e.g. Entra ID `oid`) so subsequent channels that supply the same claim are auto-merged onto the same `isolation_key` without a second ceremony. +- **`ResponseTarget`** + **`ChannelPush`** + **`ContinuationToken`** + **active channel** — the response-delivery stack. `ResponseTarget` decouples *where* a response is delivered from *where* it originated; `ChannelPush` is the optional channel capability used for non-`originating` delivery; `ContinuationToken` makes background runs first-class with a stable id and status; the host tracks last-seen `(isolation_key, channel)` to resolve `response_target="active"`. `ResponseTarget` constructors that name destinations accept `echo_input=True` to also push the originating user message onto each non-originating destination before the agent reply — keeps the destination channel's UI coherent with the user's actual turn when the host orchestrates cross-channel delivery. - **`confidentiality_tier`** + **`LinkPolicy`** — the multi-tier-on-one-host stack. 
`confidentiality_tier` is an opaque per-channel label; `LinkPolicy` is the host-level decision over which channel pairs may share an `isolation_key` (link) and which may push to one another (deliver). Built-in `DenyAllLinks` enforces "share a target, never share a session"; running multiple hosts is always a valid alternative. -- **Persisted delivery envelope** — assistant messages stored by the host carry a `deliveries[]` array on `Message.additional_properties["hosting"]` capturing the resolved destination set (per-destination `status`, `attempts`, timestamps, `last_error`, channel-issued `delivery_id`). This is the data model for **audit** ("which destinations did this response actually reach?") and for **replay** ("Telegram was offline; resend to that user when it comes back"). The replay *mechanism* is out of scope for v1; the data model is committed to so providers (especially the Foundry-backed Responses store) and operators can build on it. Live in-place updates require an opt-in `SupportsDeliveryTracking` provider capability; append-only providers degrade to write-once at completion. +- **Intent-only delivery envelope + pluggable `DurableTaskRunner`** — assistant messages stored by the host carry an `intended_targets[]` array on `Message.additional_properties["hosting"]` capturing the resolved destination set (after `ResponseTarget` + `LinkPolicy` filtering). The write is **immutable** — a single record of intent, never mutated post-push. Per-destination operational state (attempts, retries, last error, success timestamp, channel-issued id) lives in a pluggable `DurableTaskRunner` (`register` / `schedule` / `get`) that the host uses to fan out non-originating pushes. Built-in `InProcessTaskRunner` (asyncio + bounded retry) is the default for `long_running` deployments; adapter packages (`agent-framework-hosting-durabletask`, future Foundry adapter) plug in for `ephemeral` deployments. 
Replay is a property of the configured runner — native for durable adapters, not supported for in-process. This eliminates the earlier `pending`/`delivered`/`failed`/`skipped` state machine, the `SupportsDeliveryTracking` provider capability, and the Foundry `update_item` service ask. - **Caller-supplied vs. host-tracked session carriage** — channels split into two families based on whether the upstream protocol carries a per-conversation key on every request. *Caller-supplied* channels (Responses' `previous_response_id`, Invocations, A2A, MCP) parse it into `ChannelSession.key` and let the caller branch threads by sending fresh ids. *Host-tracked* channels (Telegram, Activity Protocol via Azure Bot Service — Teams/Web Chat/Slack/…— WhatsApp) carry only a stable identity and rely on the host's per-`isolation_key` session alias plus a `host.reset_session(...)` `/new`-style command. The split is invisible to the agent target and explains why `reset_session` and aliasing exist at all (host-tracked channels have no other way to start a fresh thread). Anonymous vs. identified is an orthogonal axis; identity is supplied by the channel, the resolver, or both. - **Multi-user surfaces are first-class.** Telegram groups, supergroups, forum topics, and Activity Protocol multi-user `conversationType`s (`groupChat`, `channel`) are designed-in from v1 — not retrofitted. The contract enforces a clean separation of **user identity** (`ChannelIdentity.native_id` = `from.id` / `from.aadObjectId`) and **conversation locator** (`ChannelRequest.conversation_id` = `chat.id` (+ optional `message_thread_id` / `replyToId`)). 
Channel implementations expose a `conversation_scope` option (`per_user`, `per_user_per_conversation` (default in groups), `per_conversation`) and an `accept_in_group` addressing rule (`mention_only` (default), `command_only`, `mention_or_command`, `all`) so the bot does not respond to every message in a group and so a single user's group context does not leak into their DM by default. Linker challenge messages (OAuth URL / one-time code) MUST redirect to the user's DM in group contexts. - **Built-in channels** — own their protocol-defined relative routes under default mount roots (`/responses/v1`, `/invocations/invoke`, `/telegram/webhook`) without the app author spelling those out. @@ -154,8 +156,11 @@ The top-level user experience should look the same conceptually in every languag | Per-request hook | `ChannelRunHook = Callable[..., ChannelRequest \| Awaitable[ChannelRequest]]` invoked as `hook(request, *, target=..., protocol_request=...)` | `Func>` / delegate with named extras | | Identity resolver | `IdentityResolver = Callable[[ChannelIdentity], str \| None]` | `IIdentityResolver` (returns `isolation_key`) | | Identity linker | `IdentityLinker` Protocol with `begin(...)` / `complete(...)` plus `routes()` for callback / verification endpoints | `IIdentityLinker` interface with begin/complete + route contributions | +| Authorization policy | `require_link: bool` + `allowlist: IdentityAllowlist \| Literal["inherit"] \| None` on each channel; built-in allowlists `AllowAll`, `NativeIdAllowlist`, `LinkedClaimAllowlist`, `AnyOfAllowlists`, `AllOfAllowlists`, `CallableAllowlist`; host seam `host.authorize(identity, *, require_link, allowlist, verified_claims) -> AuthorizationOutcome` (`Allowed` \| `LinkRequired` \| `Denied`) with tri-state `AllowlistDecision` (`ALLOW` / `DENY` / `ABSTAIN`); named factories on `AuthPolicy` (`.open()` / `.require_link()` / `.native_allowlist(...)` / `.linked_claim_allowlist(...)`) | 
`IIdentityAllowlist.EvaluateAsync(AuthorizationContext)` returning `AllowlistDecision`; built-ins `AllowAll`, `NativeIdAllowlist`, `LinkedClaimAllowlist`, `AnyOfAllowlists`, `AllOfAllowlists`; `IAgentFrameworkHost.AuthorizeAsync(...)` returning `AuthorizationOutcome` discriminated union with `Allowed` / `LinkRequired` / `Denied` variants | | Response routing | `ChannelRequest.response_target = ResponseTarget.originating \| .active \| .channel("activity") \| .all_linked \| .none`; channels expose `ChannelPush` if they can deliver proactively | `ChannelRequest.ResponseTarget` discriminated union; `IChannelPush` interface for proactive delivery | | Background runs | `ContinuationToken` returned by `host.run_in_background(request)`; channels may return it as their protocol response and/or expose a poll route | `ContinuationToken` record + `HostStateStore` for persistence (file-based default; pluggable Cosmos / SQL / Redis) | +| Runtime mode | `runtime_mode: Literal["long_running", "ephemeral"] \| None = None` on `AgentFrameworkHost`; `None` triggers auto-detect via deployment env markers (`FOUNDRY_HOSTING_ENVIRONMENT`, `AZURE_FUNCTIONS_ENVIRONMENT`, `AWS_LAMBDA_FUNCTION_NAME`); falls back to `"long_running"`. Advisory — drives defaults for `HostStateStore` / `DurableTaskRunner` / identity-link state. | `RuntimeMode` enum on `AgentFrameworkHostBuilder` with the same auto-detection contract | +| Durable task runner | `DurableTaskRunner` Protocol (`register` / `schedule` / `get`) on the host; built-in `InProcessTaskRunner` (asyncio + bounded retry); adapter packages plug TaskHub / Foundry / SQLite backends. Used internally for non-originating push fan-out; in v1 fast-follow shared with background-run plumbing. 
| `IDurableTaskRunner` interface with the same register/schedule/get triple; built-in in-process runner; adapter packages mirror the Python set | | Confidentiality tier on a channel | `Channel.confidentiality_tier: str \| None` (opaque) | `IChannel.ConfidentialityTier { get; }` (opaque string) | | Link / delivery policy | `LinkPolicy = Callable[[LinkPolicyContext], bool]` with built-ins `AllowAllLinks`, `SameConfidentialityTierOnly`, `ExplicitAllowList`, `DenyAllLinks` | `ILinkPolicy.IsAllowed(LinkPolicyContext)` with the same set of built-in implementations | | Command descriptor | `ChannelCommand` dataclass | `ChannelCommand` record | @@ -165,6 +170,105 @@ Built-in channels own the default mapping from each protocol's request model int The full Python API surface — exact types, fields, default routes, code samples — is specified in the companion Python spec. A future .NET spec captures the .NET-idiomatic API surface for the same model. +#### Runtime topology + +How the pieces wire at runtime. Channels contribute routes to the host's app; inbound traffic splits at the parse step into a command dispatch (handled in the channel) or a message that flows through `host.authorize` → target invocation → response delivery. Non-originating destinations go through the configured `DurableTaskRunner`; the originating channel is rendered synchronously. +
+```mermaid +graph LR + Caller[External caller /<br/>messaging app] + + subgraph Host[AgentFrameworkHost] + direction TB + ASGI[Starlette / ASP.NET Core app] + Router[Channel router] + Parse{parse →<br/>command or<br/>message?} + Auth[host.authorize] + Resolver[IdentityResolver] + Delivery[_deliver_response] + Push[_handle_push_task] + end + + Channels[Channels<br/>Responses · Invocations ·<br/>Telegram · Activity ·<br/>IdentityLinker] + CmdHandler[CommandHandler<br/>via ChannelCommandContext] + Target[(Agent or Workflow)] + Runner[DurableTaskRunner] + StateStore[(HostStateStore)] + + Caller --> ASGI + ASGI --> Router + Router --> Parse + Parse -- /command --> CmdHandler + Parse -- message --> Auth + CmdHandler -- ctx.run --> Auth + CmdHandler -- local reply --> Channels + Auth --> Resolver + Resolver --> StateStore + Auth --> Target + Target --> Delivery + Delivery -- originating sync --> Channels + Delivery -- non-originating --> Runner + Runner --> Push + Push --> Channels + Channels --> ASGI +``` +
+#### Channel contribution shape + +Every channel exposes the same three contribution slots (all optional except `routes`). The host duck-types each slot and stitches them in at construction. +
+```mermaid +graph LR + subgraph C[ConcreteChannel<br/>e.g. TelegramChannel] + direction TB + Routes[routes:<br/>webhook / poller / API endpoints<br/>→ Starlette router] + Commands[commands: Sequence ChannelCommand<br/>name · description · handle ·<br/>scopes · locales · expose_in_ui] + Push[ChannelPush.push<br/>+ optional ChannelPushCodec<br/>+ optional response_hook] + end + + Host[Host] + Native[Platform native catalog<br/>Telegram set_my_commands ·<br/>Teams app manifest · …] + Dispatch[CommandHandler dispatch] + Delivery[Originating sync delivery<br/>+ runner-scheduled fan-out] + + Routes -- contribute at startup --> Host + Commands -- startup projection --> Native + Commands -- runtime dispatch --> Dispatch + Push -- driven by --> Delivery +``` +
+#### Authorization decision + +`require_link` and `allowlist` are orthogonal axes. The `require_link` gate runs first against the current link state from `StateStore`; an unlinked identity on a `require_link=True` channel returns `LinkRequired` regardless of allowlist. A claim-dependent allowlist that has not yet seen claims returns `ABSTAIN` from `evaluate` and is converted into a `LinkRequired` outcome so the user gets a link prompt rather than a silent deny. +
+```mermaid +flowchart TB + Start([authorize identity,<br/>require_link, allowlist]) + Linked{identity already<br/>linked?<br/>StateStore lookup} + Required{require_link?} + OpenPath{allowlist is None?} + Resolve[/isolation_key:<br/>linked → existing,<br/>else auto-issue channel:native_id/] + Evaluate[/allowlist.evaluate context/] + Decision{decision} + Abstain{requires_linked_claims?} + Allowed([Allowed isolation_key]) + DeniedPre([Denied<br/>allowlist_denied_pre_link]) + LinkReq([LinkRequired<br/>via configured linker]) + + Start --> Linked + Linked -- yes --> OpenPath + Linked -- no --> Required + Required -- yes --> LinkReq + Required -- no --> OpenPath + OpenPath -- yes --> Resolve --> Allowed + OpenPath -- no --> Evaluate --> Decision + Decision -- ALLOW --> Resolve + Decision -- DENY --> DeniedPre + Decision -- ABSTAIN --> Abstain + Abstain -- yes --> LinkReq + Abstain -- no --> Resolve +``` +
## Terminology These terms are language-neutral and shared between Python and .NET implementations. Each language realizes them with idiomatic types and naming. @@ -285,9 +389,15 @@ The decision is validated when, in each implementing language: | 9 | Where do issued link grants live? | **File storage for v1**, leveraging Hosted Agents' isolated, persistent per-instance file storage. Resolved together with Q11. | | 10 | Should the identity resolver be invoked per channel or once on the host with `(channel_id, native_id)`? | **Host-level resolver receiving `(channel_id, native_id)`** so cross-channel decisions stay in one place. Per-channel overrides remain a future option if real cases emerge. | | 11 | Where does the continuation-token store live? At-rest format and TTL? | Same as Q9 — **file storage for v1** (`FileHostStateStore` under `./.af-hosting/continuations/`, atomic JSON-per-token writes, 24h default TTL on completed entries). Shares the host-level `HostStateStore` contract with link grants and last-seen records. Pluggable Cosmos / SQL / Redis adapters tracked in spec req #23. | -| 12 | Contract for `ChannelPush` failure (offline destination, opt-out, expired token)? | Default: **fall back to the originating channel**, recorded on the persisted `deliveries[]` array with telemetry. Per-request override via `run_hook`. (This already matches the spec; cross-check the wording.) | -| 13 | Should `response_target="active"` use a time window? Behavior on expiry? | Yes — configurable `active_window_seconds` on the host (suggested default **300 s**).
On expiry, fall back to `originating`, then to `all_linked`. Recorded on `deliveries[]`. Per-request override via `run_hook`. | +| 12 | Contract for `ChannelPush` failure (offline destination, opt-out, expired token)? | **Retry handled by the configured `DurableTaskRunner` per its `RetryPolicy`.** The host registers a single internal `"hosting.push"` handler at startup; each non-originating destination becomes a `runner.schedule("hosting.push", payload)` call. Failures inside the handler are caught by the runner, retried with backoff, and ultimately marked terminal-failed when `max_attempts` is exhausted. Downstream push outcomes live in the runner's own log — there is no per-destination return surface. The earlier `DeliveryReport` value type has been removed; the host's internal `_deliver_response` helper returns `bool` (whether any work was scheduled) for the originating channel. Per-request override via `run_hook`. See the Python spec's [Intended targets + durable delivery](../specs/002-python-hosting-channels.md#intended-targets--durable-delivery) and [Durable task runner](../specs/002-python-hosting-channels.md#durable-task-runner). | +| 13 | Should `response_target="active"` use a time window? Behavior on expiry? | Yes — configurable `active_window_seconds` on the host (suggested default **300 s**). On expiry, fall back to `originating`, then to `all_linked`. Per-request override via `run_hook`. | | 15 | Should `Channel.confidentiality_tier` stay opaque or become an ordered enum? | **Keep as opaque string.** Apps define their own taxonomy. Built-in policies do equality / set membership checks only — no ordered-comparison policy is shipped. | +| 16 | Should authorization (per-channel allowlist) ship as a single `auth_mode` enum or as two orthogonal parameters? | **Two orthogonal parameters (`require_link: bool` + `allowlist: IdentityAllowlist`)** plus named `AuthPolicy` factories for ergonomics. 
A single enum cannot express the Mixed profile (native ids bypass auth, everyone else is funneled into linking) without sub-parameters that defeat the point. Composition uses a **tri-state `AllowlistDecision` (`ALLOW` / `DENY` / `ABSTAIN`)** so claim-based allowlists can `ABSTAIN` until claims are available without that being read as a denial. `LinkedClaimAllowlist` use without a source of verified claims is rejected at host startup via a typed `ChannelConfigurationError` — silent deny-everyone is the worst possible default and is not allowed. The core PR includes the channel-neutral pieces: `IdentityAllowlist` Protocol, `AllowlistDecision`, built-ins (`AllowAll` / `NativeIdAllowlist` / `LinkedClaimAllowlist` / combinators / `CallableAllowlist`), `IdentityLinker` Protocol, `LinkedIdentity`, `LinkChallenge`, `AuthPolicy` factories, `Allowed` / `LinkRequired` / `Denied` outcomes, `Host(default_allowlist=..., identity_linker=...)` + per-channel `allowlist`, three-rule construction validator (`require_link=True` without a linker now raises), and `host.authorize(...)` for open, native-id, and linked-claim profiles. Provider-specific linkers (for example Entra OAuth helpers) ship as separate packages. See the Python spec's [Authorization profiles section](../specs/002-python-hosting-channels.md#authorization-profiles-and-the-identityallowlist-seam) for full mechanics. | +| 17 | How does the host decide long-running vs ephemeral runtime, and is that distinction enforced? | **Single `runtime_mode` parameter, advisory, auto-detected by default.** `None` (the default) inspects known deployment markers (`FOUNDRY_HOSTING_ENVIRONMENT`, `AZURE_FUNCTIONS_ENVIRONMENT`, `AWS_LAMBDA_FUNCTION_NAME`) and picks `"ephemeral"` on the first hit; otherwise falls back to `"long_running"` (sensible local-dev / always-on default). 
The mode drives the *default selection* of seams that have FHA-shaped vs container-shaped defaults — `HostStateStore`, `DurableTaskRunner`, identity-link state — but every choice remains independently overridable. Detected mode is logged at startup so misdetection is visible. See the Python spec's [Runtime modes](../specs/002-python-hosting-channels.md#runtime-modes). | +| 18 | How does delivery to non-originating destinations actually happen, and what is the retry / replay contract? | **Out-of-band via a pluggable `DurableTaskRunner`.** The host registers an internal `"hosting.push"` handler at startup; each non-originating destination becomes a `runner.schedule("hosting.push", payload)` call. The originating destination (when `ResponseTarget` includes it) is **still rendered synchronously** on the originating channel's wire — only fan-out goes through the runner. Default runner is `InProcessTaskRunner` (asyncio + bounded retry, no cross-restart persistence — suitable for `long_running`). Durable adapter packages (`agent-framework-hosting-durabletask`, future Foundry adapter) plug into the same Protocol for `ephemeral` deployments. Replay across host restarts is a property of the configured runner (native for durable adapters; not supported for the in-process runner). See the Python spec's [Durable task runner](../specs/002-python-hosting-channels.md#durable-task-runner). | +| 19 | What is the audit shape on the assistant message — full per-destination state machine, or intent only? | **Intent only.** `Message.additional_properties["hosting"]["intended_targets"]` is a single immutable write that records the resolved destination set (after `ResponseTarget` + `LinkPolicy` filtering). Operational state — attempt count, last error, success timestamp, channel-issued id — lives in the `DurableTaskRunner` and is observed via the runner's backend. 
This eliminates the earlier per-destination `pending`/`delivered`/`failed`/`skipped` state machine, the `SupportsDeliveryTracking` provider capability, and the Foundry `update_item` service ask. See the Python spec's [Intended targets + durable delivery](../specs/002-python-hosting-channels.md#intended-targets--durable-delivery). | +| 20 | What is the wire contract between `DurableTaskRunner` and push-capable channels when the runner is out-of-process? | **A two-piece contract.** Each `DurableTaskRunner` declares its `payload_mode` (`OBJECT` for in-process pass-by-reference; `JSON` for runners that round-trip through JSON). Push-capable channels that ship non-JSON-native payloads expose a `ChannelPushCodec` (`encode` / `decode`). At construction the host validates the pairing and refuses a `JSON`-mode runner paired with codec-less push channels (`ChannelConfigurationError`). The push handler accepts both `OBJECT` and `JSON` envelope shapes so the same handler serves both runner backends. See the Python spec's [Codec contract for durable serialisation](../specs/002-python-hosting-channels.md#codec-contract-for-durable-serialisation). | +| 21 | What is the operational contract for `runtime_mode="ephemeral"` without a configured durable runner, and for clean shutdown of the in-process runner? | **Strict ephemeral by default + 2-phase drain.** `ephemeral` + default (in-process) runner raises `RuntimeError` at construction unless `allow_in_process_runner=True` is opted in (warning logged) — silently using the in-process runner in an ephemeral environment would drop in-flight pushes on the next scale-to-zero. For `long_running`, `InProcessTaskRunner` ships a `shutdown_grace_seconds` window (default `5.0`) that lets in-flight retries finish before cancellation; `CancelledError` from the cancellation phase is swallowed as the expected shutdown shape. 
When `echo_input=True`, the push task carries an `echo_done` cursor in runner-owned state so a retry that fires after the echo succeeded does not double-echo. See the Python spec's [Durable task runner](../specs/002-python-hosting-channels.md#durable-task-runner). | ## Decisions-driven follow-ups @@ -297,10 +407,11 @@ These are spec-body / sample / code edits implied by the resolutions above, **ou - **Q7** — spec, `ChannelCommand` reference, and the Telegram channel design need optional `scopes` and `locales` fields with clear "channels free to ignore" semantics. - **Q8** — ✅ Done in spec rev. Req #24 lists only `OAuthIdentityLinker` and `OneTimeCodeIdentityLinker`; the linker-helper table and the OAuth scenario no longer reference `MfaIdentityLinker`. - **Q9 + Q11** — ✅ Resolved in spec rev. Spec req #23 now names the seam **`HostStateStore`** with a v1 default of `FileHostStateStore` (atomic JSON writes under `./.af-hosting/`), so continuation tokens, link grants, and last-seen records all survive single-node restarts. Pluggable Cosmos / SQL / Redis adapters remain v1 fast follow. -- **Q12** — verify the spec's `ChannelPush` failure narrative includes "recorded on `deliveries[]`" alongside "telemetry warning"; tighten if needed. +- **Q12** — ✅ Resolved by the durable-delivery seam (Q18). The `ChannelPush` failure narrative is now "retry per `RetryPolicy` via the runner, observe in the runner's backend"; no separate `deliveries[]` annotation required. - **Q13** — add `active_window_seconds` (default 300 s) to the host config surface and document the `originating` → `all_linked` fallback chain. - **Q14** — explicitly document the **swappable WS codec** property in the Responses channel section (host contract does not depend on the framing) so the spec stays valid as upstream OpenAI evolves. - **Q15** — confirm the spec consistently treats `confidentiality_tier` as an opaque string and that no built-in policy assumes an ordered hierarchy. 
+- **Q17 / Q18 / Q19** — ✅ Spec text added: new top-level §[Runtime modes](../specs/002-python-hosting-channels.md#runtime-modes), rewritten §[Intended targets + durable delivery](../specs/002-python-hosting-channels.md#intended-targets--durable-delivery), new §[Durable task runner](../specs/002-python-hosting-channels.md#durable-task-runner). Core code lands `DurableTaskRunner` Protocol + `InProcessTaskRunner` + `runtime_mode` constructor parameter + auto-detection in this PR; durable runner adapters (`agent-framework-hosting-durabletask`, Foundry adapter) ship as separate follow-up packages. ## More Information diff --git a/docs/specs/002-python-hosting-channels.md b/docs/specs/002-python-hosting-channels.md index d5ece88023c..ea396ceadf2 100644 --- a/docs/specs/002-python-hosting-channels.md +++ b/docs/specs/002-python-hosting-channels.md @@ -129,15 +129,16 @@ After we deliver `agent-framework-hosting` and its first channel packages, users 19. **Target any `SupportsAgentRun` or `Workflow`** — host an `Agent`, `A2AAgent`, or a `Workflow`; the `run_hook` is the seam for adapting the channel's default `ChannelRequest` into the target-specific input shape (free-form messages for agents, typed inputs for workflows). 20. **Contribute WebSocket endpoints from a channel** — `ChannelContribution.routes` accepts both `Route` (HTTP) and `WebSocketRoute` (WS); the channel codec is responsible for framing and the same `run_hook` / default mapping pipeline applies. Built-in `ResponsesChannel` exposes a WebSocket transport (default `/responses/ws`, controlled by `transports=("http", "websocket")`) alongside its HTTP+SSE transport, anticipating the OpenAI Responses WebSocket transport. The host requires an ASGI server with WebSocket scope support (Uvicorn, Hypercorn, Daphne, Granian). 21. **Mix channels of different confidentiality tiers on one host** — every `Channel` may declare an opaque `confidentiality_tier: str | None` (e.g. `"corp"`, `"public"`). 
The host's `LinkPolicy` decides which `(source_tier, target_tier)` pairs may share an `isolation_key` (link) and which may be `ResponseTarget` source/destination for one another (deliver). Built-in policies (`AllowAllLinks` (default), `SameConfidentialityTierOnly`, `ExplicitAllowList`, `DenyAllLinks`) and the policy contract are defined in [LinkPolicy](#linkpolicy-and-confidentiality_tier). Cross-tier link attempts are refused with a typed error; cross-tier deliveries are dropped — so two tiers can share **an agent target** on one host while remaining strictly session-isolated. +22. **Choose an authorization profile per channel** — every channel that emits a `ChannelIdentity` composes from two orthogonal parameters, `require_link: bool` and `allowlist: IdentityAllowlist | None`, producing the three named profiles **open** (default), **forced-link** (must authenticate, any authenticated identity accepted), and **allowlist** (only listed identities — keyed on either the channel-native id pre-link or on a verified IdP claim post-link). Built-in allowlists (`NativeIdAllowlist`, `LinkedClaimAllowlist`, plus `AnyOfAllowlists` / `AllOfAllowlists` combinators) and the unified host seam (`host.authorize(...)` → `AuthorizationOutcome` of `Allowed` / `LinkRequired` / `Denied`) are defined in [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam). The host applies a `default_allowlist` to every channel whose `allowlist` is left at the sentinel `"inherit"`, so app authors can lock down a whole bot in one place. Configuration combinations that would silently deny every user (e.g. `LinkedClaimAllowlist` on a channel with `require_link=False` and no native verified claims) are rejected at host startup with a typed `ChannelConfigurationError`. ### v1 Fast Follow -22. **Generic auth helpers** — shared middleware for common channel auth patterns (HMAC signature, bearer token). -23. 
**Pluggable host state store** — interface for cross-host persistence of `ContinuationToken`s, identity-link grants, and last-seen `(isolation_key, channel)` records. Default implementation in v1 is **file-based** (`FileHostStateStore`); `InMemoryHostStateStore` is available for tests. A future `CosmosHostStateStore` / `SQLHostStateStore` would extend cross-channel chat continuity (req #9), background runs (req #14), and identity-link continuity (req #11) beyond a single host/process — but the v1 file-based default already survives host restarts on a single node. Same protocol covers session aliasing where applicable. -24. **First-party identity linker helpers** — concrete `OAuthIdentityLinker` (with provider presets) and `OneTimeCodeIdentityLinker` (cross-channel code exchange) shipped as opt-in helpers on top of the `IdentityLinker` contract. Investigation of additional first-party linker types tracked as a follow-up. -25. **`A2AChannel` package** (`agent-framework-hosting-a2a`) — exposes the hostable target over the Agent-to-Agent protocol so other agents can consume it as a peer. Caller-supplied-session family (alongside Responses and Invocations): A2A's per-conversation id maps to `ChannelSession.key`; the calling agent's identity (e.g. its A2A agent card / signed JWT) flows through `IdentityResolver`; structured replies fit the existing `ChannelRequest` + `ResponseTarget` envelope. No new host primitives required — only the protocol binding and package. -26. **`MCPToolChannel` package** (`agent-framework-hosting-mcp`) — exposes the hostable target as a **Model Context Protocol tool** so MCP clients (other agents, IDE tooling) can invoke it. Same caller-supplied-session family: the MCP `tool/call` carries the conversation key into `ChannelSession.key`; the MCP client identity flows through `IdentityResolver`; the tool result is the target's response. 
Streaming MCP tools map onto the host's existing streaming response delivery; long-running MCP tools map onto background runs with `ContinuationToken` when the work outlasts a single tool-call round-trip. -27. **`ActivityChannel` package** (`agent-framework-hosting-activity`) — exposes the hostable target behind **Azure Bot Service**, which fronts Teams, Web Chat, Slack-style connectors, and the rest of the Bot Framework / M365 connector ecosystem. Provides **native translations** between Activity Protocol objects (`Activity`, `ConversationReference`, adaptive cards, `Invoke` activities, …) and the host's `ChannelRequest` / `ChannelResponse` types — so the contract is **explicit** rather than implicit through a generic Invocations endpoint. Host-tracked-session family: Bot Service authenticates with a JWT carrying the AAD object id, the channel populates `ChannelIdentity` from `from.aadObjectId`, the host's per-`isolation_key` alias decides which `AgentSession` to resolve, and `host.reset_session(...)` is reachable via a Teams slash command or adaptive-card action. `ChannelPush` is implemented over Bot Service's `ConversationReference` + `continueConversationAsync` pattern. Naming this channel **Activity** rather than **Teams** keeps a `TeamsChannel` name available for the Teams-native channel below (req #28) and for any future direct-to-Teams transport. -28. **`TeamsChannel` package** (`agent-framework-hosting-teams`) — Teams-native channel built on the MIT-licensed [`microsoft/teams.py`](https://github.com/microsoft/teams.py) SDK (`microsoft-teams-apps`, `microsoft-teams-api`, `microsoft-teams-cards`). Where `ActivityChannel` (req #27) targets the **generic** Activity Protocol surface across all Bot Service-fronted channels, `TeamsChannel` exploits **Teams-specific affordances** that the generic Activity Protocol does not surface natively: +23. **Generic auth helpers** — shared middleware for common channel auth patterns (HMAC signature, bearer token). +24. 
**Pluggable host state store** — interface for cross-host persistence of `ContinuationToken`s, identity-link grants, and last-seen `(isolation_key, channel)` records. Default implementation in v1 is **file-based** (`FileHostStateStore`); `InMemoryHostStateStore` is available for tests. A future `CosmosHostStateStore` / `SQLHostStateStore` would extend cross-channel chat continuity (req #9), background runs (req #14), and identity-link continuity (req #11) beyond a single host/process — but the v1 file-based default already survives host restarts on a single node. Same protocol covers session aliasing where applicable. +25. **First-party identity linker helpers** — concrete `OAuthIdentityLinker` (with provider presets) and `OneTimeCodeIdentityLinker` (cross-channel code exchange) shipped as opt-in helpers on top of the `IdentityLinker` contract. Investigation of additional first-party linker types tracked as a follow-up. +26. **`A2AChannel` package** (`agent-framework-hosting-a2a`) — exposes the hostable target over the Agent-to-Agent protocol so other agents can consume it as a peer. Caller-supplied-session family (alongside Responses and Invocations): A2A's per-conversation id maps to `ChannelSession.key`; the calling agent's identity (e.g. its A2A agent card / signed JWT) flows through `IdentityResolver`; structured replies fit the existing `ChannelRequest` + `ResponseTarget` envelope. No new host primitives required — only the protocol binding and package. +27. **`MCPToolChannel` package** (`agent-framework-hosting-mcp`) — exposes the hostable target as a **Model Context Protocol tool** so MCP clients (other agents, IDE tooling) can invoke it. Same caller-supplied-session family: the MCP `tool/call` carries the conversation key into `ChannelSession.key`; the MCP client identity flows through `IdentityResolver`; the tool result is the target's response. 
Streaming MCP tools map onto the host's existing streaming response delivery; long-running MCP tools map onto background runs with `ContinuationToken` when the work outlasts a single tool-call round-trip. +28. **`ActivityChannel` package** (`agent-framework-hosting-activity`) — exposes the hostable target behind **Azure Bot Service**, which fronts Teams, Web Chat, Slack-style connectors, and the rest of the Bot Framework / M365 connector ecosystem. Provides **native translations** between Activity Protocol objects (`Activity`, `ConversationReference`, adaptive cards, `Invoke` activities, …) and the host's `ChannelRequest` / `ChannelResponse` types — so the contract is **explicit** rather than implicit through a generic Invocations endpoint. Host-tracked-session family: Bot Service authenticates with a JWT carrying the AAD object id, the channel populates `ChannelIdentity` from `from.aadObjectId`, the host's per-`isolation_key` alias decides which `AgentSession` to resolve, and `host.reset_session(...)` is reachable via a Teams slash command or adaptive-card action. `ChannelPush` is implemented over Bot Service's `ConversationReference` + `continueConversationAsync` pattern. Naming this channel **Activity** rather than **Teams** keeps a `TeamsChannel` name available for the Teams-native channel below (req #29) and for any future direct-to-Teams transport. +29. **`TeamsChannel` package** (`agent-framework-hosting-teams`) — Teams-native channel built on the MIT-licensed [`microsoft/teams.py`](https://github.com/microsoft/teams.py) SDK (`microsoft-teams-apps`, `microsoft-teams-api`, `microsoft-teams-cards`). 
Where `ActivityChannel` (req #28) targets the **generic** Activity Protocol surface across all Bot Service-fronted channels, `TeamsChannel` exploits **Teams-specific affordances** that the generic Activity Protocol does not surface natively: - **Adaptive Cards** via the typed `microsoft-teams-cards` builder, attached as tool side-effects through a `ContextVar`-scoped pending-cards collector consumed by the channel's result projector. - **Streamed assistant replies** via `ctx.stream.emit(chunk)` — the channel projects `agent.run(..., stream=True)` chunks directly. - **Teams "AI generated" badge**, **built-in feedback controls + custom feedback form**, **suggested-prompt chips** (`SuggestedActions` / `CardAction(IM_BACK)`), **inline citations** (`CitationAppearance` populated from a `FunctionMiddleware` that assigns stable positions to tool-result sources). @@ -149,23 +150,96 @@ After we deliver `agent-framework-hosting` and its first channel packages, users Mounts the SDK's `App` into the host's Starlette app via a custom `HttpServerAdapter` that defers `register_route(...)` to `ChannelContribution.routes` — the SDK does **not** start its own server; the host owns the lifecycle. Host-tracked-session family (same as `ActivityChannel`): `from.aadObjectId` populates `ChannelIdentity`. The result projector reads `AgentRunResult.messages[*].contents` and routes the rich content variants to their Teams-native renderings (`TextContent` → markdown body, `DataContent`/structured output → Adaptive Card, citation entries from `additional_properties` → `add_citation`, `ErrorContent` → typed error card). - **Note on transport.** `TeamsChannel` **still rides on Azure Bot Service in v1** — the `microsoft/teams.py` SDK is a higher-level Pythonic wrapper over the same Activity Protocol pipeline that `ActivityChannel` exposes raw. The difference is **what the developer writes against**, not the underlying network path. 
A truly Bot-Service-free Teams transport is *not currently possible* and is tracked as a separate, speculative stretch item (req #30); when/if Microsoft ships one, the new transport would slot in under the same `TeamsChannel` package without changing this requirement. + **Note on transport.** `TeamsChannel` **still rides on Azure Bot Service in v1** — the `microsoft/teams.py` SDK is a higher-level Pythonic wrapper over the same Activity Protocol pipeline that `ActivityChannel` exposes raw. The difference is **what the developer writes against**, not the underlying network path. A truly Bot-Service-free Teams transport is *not currently possible* and is tracked as a separate, speculative stretch item (req #31); when/if Microsoft ships one, the new transport would slot in under the same `TeamsChannel` package without changing this requirement. **`ActivityChannel` vs `TeamsChannel` — pick by audience:** | Channel | Built on | Audience | |---|---|---| - | `ActivityChannel` (req #27) | Activity Protocol over HTTP, no Teams-specific helpers | Bot Service-fronted channels generically (Teams, Web Chat, Slack-style connectors, DirectLine, …); maximum portability across the Bot Framework / M365 connector ecosystem | - | `TeamsChannel` (req #28) | `microsoft/teams.py` `App` mounted via custom `HttpServerAdapter` into the host's Starlette app | Teams-first deployments that want Adaptive Cards, modal Dialogs, Message Extensions, citations, feedback, suggested-prompt chips, and SSO out-of-the-box | + | `ActivityChannel` (req #28) | Activity Protocol over HTTP, no Teams-specific helpers | Bot Service-fronted channels generically (Teams, Web Chat, Slack-style connectors, DirectLine, …); maximum portability across the Bot Framework / M365 connector ecosystem | + | `TeamsChannel` (req #29) | `microsoft/teams.py` `App` mounted via custom `HttpServerAdapter` into the host's Starlette app | Teams-first deployments that want Adaptive Cards, modal Dialogs, Message Extensions, citations, 
feedback, suggested-prompt chips, and SSO out-of-the-box | Deployments that only need plain Activity Protocol over Bot Service stick with `ActivityChannel`; `TeamsChannel` is the upgrade path when Teams-native richness is wanted. ### Stretch -29. **WhatsApp channel package** — using the same `Channel` + `ChannelCommand` model, designed so it participates in cross-channel continuity (req #9) and can serve as a `ChannelPush` destination (req #13) when paired with a stable per-user `isolation_key`. -30. **Direct-to-Teams channel package** — *speculative*. Reserved for a future transport that connects to Teams **without going through Azure Bot Service** (and therefore without the Activity Protocol pipeline that backs both `ActivityChannel` (req #27) and `TeamsChannel` (req #28)). At the time of writing **no such transport is publicly available** — the Microsoft Graph chat APIs (`/teams/{id}/channels/{id}/messages`, `/chats/{id}/messages`) and the `microsoft/teams.py` SDK both ultimately route through Bot Service for the bot-as-conversation-participant pattern. This requirement is kept on the roadmap purely to preserve the `TeamsChannel` naming line for if/when Microsoft ships a Bot-Service-free transport (a native Teams REST/RPC, a Graph subscription strong enough to drive both inbound and outbound message flow, or similar). Until then, **the canonical Teams channel is `TeamsChannel` (req #28)** and `ActivityChannel` (req #27) covers the generic Bot Service surface. +30. **WhatsApp channel package** — using the same `Channel` + `ChannelCommand` model, designed so it participates in cross-channel continuity (req #9) and can serve as a `ChannelPush` destination (req #13) when paired with a stable per-user `isolation_key`. +31. **Direct-to-Teams channel package** — *speculative*. 
Reserved for a future transport that connects to Teams **without going through Azure Bot Service** (and therefore without the Activity Protocol pipeline that backs both `ActivityChannel` (req #28) and `TeamsChannel` (req #29)). At the time of writing **no such transport is publicly available**: the Microsoft Graph chat APIs (`/teams/{id}/channels/{id}/messages`, `/chats/{id}/messages`) and the `microsoft/teams.py` SDK both ultimately route through Bot Service for the bot-as-conversation-participant pattern. This requirement is kept on the roadmap purely to keep the `TeamsChannel` name available if and when Microsoft ships a Bot-Service-free transport (a native Teams REST/RPC, a Graph subscription strong enough to drive both inbound and outbound message flow, or similar). Until then, **the canonical Teams channel is `TeamsChannel` (req #29)** and `ActivityChannel` (req #28) covers the generic Bot Service surface. ## API Surface +### Architecture overview + +The host wires one `Agent` (or `Workflow`) to one or more channels, each contributing routes, commands, and a push back-channel. The **runtime topology** diagram shows how an inbound request flows through the host. The **channel contribution shape** diagram shows what each channel hands the host at construction. + +#### Runtime topology + +```mermaid +graph LR + Caller[External caller /
messaging app] + + subgraph Host[AgentFrameworkHost] + direction TB + ASGI[Starlette app] + Router[Channel router] + Parse{parse →
command or
message?} + Auth[host.authorize] + Resolver[IdentityResolver] + Delivery[_deliver_response] + Push[_handle_push_task] + Annot[_annotate_intended_targets] + end + + Channels[Channels
Responses · Invocations ·
Telegram · Activity ·
IdentityLinker] + CmdHandler[CommandHandler
via ChannelCommandContext] + Target[(Agent or Workflow)] + Runner[DurableTaskRunner] + StateStore[(HostStateStore)] + + Caller --> ASGI + ASGI --> Router + Router --> Parse + Parse -- /command --> CmdHandler + Parse -- message --> Auth + CmdHandler -- ctx.run --> Auth + CmdHandler -- local reply --> Channels + Auth --> Resolver + Resolver --> StateStore + Auth --> Target + Target --> Delivery + Delivery -- originating sync --> Channels + Delivery -- non-originating --> Runner + Delivery --> Annot + Runner --> Push + Push --> Channels + Channels --> ASGI +``` + +#### Channel contribution shape + +Every channel exposes the same three contribution slots, all optional except `routes`. The host duck-types each slot and stitches them in at construction. + +```mermaid +graph LR + subgraph C[ConcreteChannel
e.g. TelegramChannel] + direction TB + Routes[routes:
webhook / poller / API endpoints
→ Starlette router] + Commands[commands: Sequence ChannelCommand
name · description · handle ·
scopes · locales · expose_in_ui] + Push[ChannelPush.push
+ optional ChannelPushCodec
+ optional response_hook] + end + + Host[Host] + Native[Platform native catalog
Telegram set_my_commands ·
Teams app manifest · …] + Dispatch[CommandHandler dispatch] + Delivery[Originating sync delivery
+ runner-scheduled fan-out] + + Routes -- contribute at startup --> Host + Commands -- startup projection --> Native + Commands -- runtime dispatch --> Dispatch + Push -- driven by --> Delivery +``` + +The `IdentityLinker` is itself a Channel specialisation: when one is configured, the host auto-inserts a `link` / `connect` `ChannelCommand` into every other channel's catalog (opt-out per channel via `expose_in_ui=False` or rename via metadata). + ### Packages | Distribution package | Public import surface | Purpose | @@ -226,7 +300,7 @@ TelegramChannel(path="/bots/telegram", bot_token=token) # -> /bots/telegram/web | Method | Type | Description | |---|---|---| -| `run(request: ChannelRequest)` | `-> HostedRunResult` | One-shot invocation. | +| `run(request: ChannelRequest)` | `-> HostedRunResult[Any]` | One-shot invocation. For agent targets `TResult` narrows to `AgentResponse`; for workflow targets to `WorkflowRunResult`. | | `stream(request: ChannelRequest)` | `-> HostedStreamResult` | Streaming invocation. | **`ChannelContribution`** — what a channel returns from `contribute(...)`. @@ -305,7 +379,7 @@ Runs **after** the channel has produced its default `ChannelRequest`, **before** IdentityResolver = Callable[[ChannelIdentity], Awaitable[str | None] | (str | None)] ``` -The **default resolver auto-issues** an `isolation_key` the first time a `(channel, native_id)` is seen and persists the mapping in the host's identity store, so every end user automatically gets a stable per-user `isolation_key` on first contact through **any** channel — no per-channel boilerplate is required for the single-channel case. Returning `None` is reserved for advanced cases where the resolver wants to refuse unknown identities (e.g. allow-list enforcement). 
+The **default resolver auto-issues** an `isolation_key` the first time a `(channel, native_id)` is seen and persists the mapping in the host's identity store, so every end user automatically gets a stable per-user `isolation_key` on first contact through **any** channel — no per-channel boilerplate is required for the single-channel case. Returning `None` is reserved for advanced cases where the resolver wants to refuse unknown identities; the dedicated host seam for accept/reject decisions is **`IdentityAllowlist`** — see [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam) below. Cross-channel continuity is then a one-shot **merge** operation: after a successful link ceremony (Scenario 6), the host atomically rewrites the second channel's auto-issued key to point at the first channel's existing `isolation_key`. Apps never have to write per-channel mapping hooks just to get continuity to work. @@ -328,7 +402,197 @@ Apps that already own an identity namespace (corporate user id, tenant-scoped ac A built-in `link` (or `connect`) `ChannelCommand` is exposed automatically when an `IdentityLinker` is configured. Its `handle` invokes `linker.begin(...)` and replies with the `LinkChallenge` payload (URL, code, instructions) projected through the channel's native rendering. Channels may opt out (`expose_in_ui=False`) or override the command's name per channel. -**`require_link` (per-channel)** — every channel that emits a `ChannelIdentity` accepts a `require_link: bool = False` constructor argument. When `True`, the channel calls `linker.is_linked(identity, verified_claims=…)` before producing a `ChannelRequest`; un-linked identities are short-circuited to a rendered `LinkChallenge` reply (the same payload the `link` command would emit) and the agent is **not** invoked for that turn. 
Combined with the linker's verified-claim auto-link, this gives an "authenticate before chatting" enforcement model where the first channel forces the OAuth ceremony and subsequent channels join the same `isolation_key` silently. See [Scenario 6](#scenario-6-linking-a-new-channel-to-an-existing-identity-via-oauth) for the end-to-end flow. Default is `False`, which preserves the opportunistic flow (auto-issued `isolation_key`, link manually later). Channels whose protocol does not authenticate the user (e.g. anonymous Responses calls) ignore the flag. +**`require_link` (per-channel)** — every channel that emits a `ChannelIdentity` accepts a `require_link: bool = False` constructor argument. When `True`, the channel calls `linker.is_linked(identity, verified_claims=…)` before producing a `ChannelRequest`; un-linked identities are short-circuited to a rendered `LinkChallenge` reply (the same payload the `link` command would emit) and the agent is **not** invoked for that turn. Combined with the linker's verified-claim auto-link, this gives an "authenticate before chatting" enforcement model where the first channel forces the OAuth ceremony and subsequent channels join the same `isolation_key` silently. See [Scenario 6](#scenario-6-linking-a-new-channel-to-an-existing-identity-via-oauth) for the end-to-end flow. Default is `False`, which preserves the opportunistic flow (auto-issued `isolation_key`, link manually later). Channels whose protocol does not authenticate the user (e.g. anonymous Responses calls) ignore the flag. `require_link` is the **"identity must be linked"** axis; the **orthogonal "identity is on the accept list"** axis is `allowlist` — see [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam) below. 
+ +#### Authorization profiles and the `IdentityAllowlist` seam + +`require_link` (above) and `allowlist` (below) compose into the **five named authorization profiles** the spec supports for any channel that emits a `ChannelIdentity`. The two parameters stay **orthogonal** on the channel constructor — there is no single `auth_mode` enum — but the host exposes named factories on `AuthPolicy` (`AuthPolicy.open()` / `.require_link()` / `.native_allowlist(...)` / `.linked_claim_allowlist(...)` / `.mixed(...)`) for ergonomic configuration: + +| Profile | Channel config | What gets gated | Typical use | +|---|---|---|---| +| **Open** (default) | `require_link=False`, `allowlist=None` | Nothing — every identity gets an auto-issued `isolation_key` on first contact. | Public chatbot, internal dev/demo, single-tenant deployments. | +| **Forced link** | `require_link=True`, `allowlist=None` | Identity must complete the link ceremony at least once. Any successfully authenticated identity is then allowed. | "Sign in once with your corporate account, then chat freely" style bots that gate on tenancy via the IdP rather than per-user. | +| **Native allowlist** | `require_link=False`, `allowlist=NativeIdAllowlist(...)` | Only listed channel-native ids (Telegram `chat_id`s, WhatsApp numbers, Slack user ids) get through. Pre-link, no IdP claim involved. | Personal bots, single-user prototypes, small fixed-membership channels. | +| **Linked-claim allowlist** | `require_link=True`, `allowlist=LinkedClaimAllowlist(...)` | Identity must (a) complete the link ceremony **and** (b) carry an IdP claim whose value is on the list (e.g. AAD `oid in {…}` or `tid == ""`). | Multi-channel corporate bot where any channel works but only specific people in a specific tenant are admitted. 
| +| **Mixed** | `require_link=False`, `allowlist=AnyOfAllowlists(NativeIdAllowlist(...), LinkedClaimAllowlist(...))` | Either the native id is preapproved **or** the user successfully links and matches the claim allowlist. Native-id hits bypass the link ceremony; everyone else is funneled into it. | A bot that wants ops-team Telegram ids in immediately while still letting other corp users self-onboard via OAuth. | + +The decision pipeline that produces each of those profiles: + +```mermaid +flowchart TB + Start([authorize identity,
require_link, allowlist]) + Linked{identity already
linked?
StateStore lookup} + Required{require_link?} + OpenPath{allowlist is None?} + Resolve[/isolation_key:
linked → existing,
else auto-issue channel:native_id/] + Evaluate[/allowlist.evaluate context/] + Decision{decision} + Abstain{requires_linked_claims?} + Allowed([Allowed isolation_key]) + DeniedPre([Denied
allowlist_denied_pre_link]) + LinkReq([LinkRequired
via configured linker]) + + Start --> Linked + Linked -- yes --> OpenPath + Linked -- no --> Required + Required -- yes --> LinkReq + Required -- no --> OpenPath + OpenPath -- yes --> Resolve --> Allowed + OpenPath -- no --> Evaluate --> Decision + Decision -- ALLOW --> Resolve + Decision -- DENY --> DeniedPre + Decision -- ABSTAIN --> Abstain + Abstain -- yes --> LinkReq + Abstain -- no --> Resolve +``` + +The flow shows three terminal states: `Allowed`, `LinkRequired`, `Denied`. `LinkRequired` is reachable whenever `require_link=True` and the identity has not completed the link ceremony (or an allowlist `ABSTAIN`ed and `requires_linked_claims=True`), independent of whether an allowlist is configured. + +##### `IdentityAllowlist` Protocol (tri-state) + +Allowlists are evaluated by a host-level pipeline (`host.authorize(...)`, below) that calls them twice — once with the raw channel-native identity (`phase="pre_link"`) and, if necessary, again after the link ceremony surfaces verified IdP claims (`phase="post_link"`). To make composition (`AnyOfAllowlists`, `AllOfAllowlists`) well-defined and to keep claim-based allowlists from accidentally denying everyone when claims are not yet available, the contract is **tri-state**: + +```python +class AllowlistDecision(StrEnum): + ALLOW = "allow" # accept this identity unconditionally + DENY = "deny" # reject this identity unconditionally + ABSTAIN = "abstain" # this allowlist has no opinion at this phase + # (e.g. a claim-based list during pre_link) + +@dataclass(frozen=True) +class AuthorizationContext: + identity: ChannelIdentity + phase: Literal["pre_link", "post_link"] + isolation_key: str | None # None at pre_link; resolved at post_link + verified_claims: Mapping[str, str] # {} when no claims; populated post_link + claim_source: Literal["linker", "channel", "none"] + # "channel" when the channel itself emits + # verified claims (e.g. 
Activity Protocol + # bearer with AAD oid); "linker" when the + # IdentityLinker surfaces them; "none" otherwise. + +class IdentityAllowlist(Protocol): + requires_linked_claims: bool = False # if True, host validation rejects + # configurations where neither `require_link` + # nor a claim-emitting channel can deliver + # the claims this allowlist needs. + + async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: ... +``` + +`ABSTAIN` is **not** a denial — it is "this allowlist has no information yet". The host's decision pipeline (below) is what turns an all-`ABSTAIN` outcome into the appropriate next step (allow when open, escalate to a link ceremony when the configuration calls for one). Boolean allowlists were rejected as part of this design pass because two-state composition cannot distinguish "claim allowlist denies you" from "claim allowlist hasn't seen any claims yet" — a critical distinction for the **Mixed** profile. + +##### Built-in allowlists + +| Helper | Pre-link behavior | Post-link behavior | Notes | +|---|---|---|---| +| `AllowAll()` | `ALLOW` | `ALLOW` | Explicit "open" sentinel; useful for tests and for overriding a host-level `default_allowlist`. | +| `NativeIdAllowlist(channel=None, native_ids=...)` | `ALLOW` if `(channel, native_id)` is on the list; `DENY` if `channel` matches but `native_id` does not; `ABSTAIN` if `channel` does not match (allows mixing per-channel native lists under one `AnyOfAllowlists`). | Same as pre-link — native-id allowlists do not depend on link state. | Constructor accepts `native_ids: Collection[str] \| Callable[[], Awaitable[Collection[str]]]` so the list can be loaded asynchronously (config file, secret store). | +| `LinkedClaimAllowlist(claim, values)` | `ABSTAIN` (no claims available yet). | `ALLOW` if `verified_claims.get(claim)` is in `values`; `DENY` otherwise. | `requires_linked_claims = True`. 
Host construction-time validator rejects use with `require_link=False` on a channel that does not also emit verified claims natively — this prevents the silent-deny-everyone footgun. | +| `AnyOfAllowlists(*allowlists)` | `ALLOW` if any child `ALLOW`s; `DENY` only if **all** children `DENY`; otherwise `ABSTAIN`. | Same rule. | Composition for the **Mixed** profile. | +| `AllOfAllowlists(*allowlists)` | `DENY` if any child `DENY`s; `ALLOW` only if **all** children `ALLOW`; otherwise `ABSTAIN`. | Same rule. | E.g. require both tenancy (`LinkedClaimAllowlist("tid", ...)`) **and** group membership (`LinkedClaimAllowlist("groups", ...)`). | +| `CallableAllowlist(fn)` | Calls `fn(context)` and returns its result. | Same. | Escape hatch for app-specific logic; recommended only after exhausting the structured variants. | + +##### Host configuration: `default_allowlist` + explicit channel inheritance + +Allowlists can be configured at the host level (`AgentFrameworkHost(default_allowlist=...)`) and per-channel. The channel-side default is **explicit inheritance**, not an implicit `None`: + +```python +class SomeChannel: + def __init__( + self, + *, + require_link: bool = False, + allowlist: IdentityAllowlist | Literal["inherit"] | None = "inherit", + ): ... +``` + +- `allowlist="inherit"` (default) → the host's `default_allowlist` applies. If the host did not set one either, the channel is open. +- `allowlist=None` → the channel is **explicitly open**, even if the host has a `default_allowlist`. Used to carve out a public endpoint inside an otherwise-locked-down host. +- `allowlist=` → that allowlist applies, overriding the host default. To **add to** the host default rather than replace it, compose explicitly: `allowlist=AllOfAllowlists(host.default_allowlist, MyExtraList())`. 
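The inheritance rules above, together with the tri-state composition rules from the built-in allowlists table, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the package's implementation: `resolve_allowlist`, `Fixed`, `AnyOf`, and `AllOf` are hypothetical stand-ins for the spec's types, the context is a plain `dict` rather than `AuthorizationContext`, and the behavior of an empty combinator (`ABSTAIN`) is an assumption the spec does not state.

```python
from __future__ import annotations

from enum import Enum
from typing import Any, Literal, Protocol


class AllowlistDecision(str, Enum):
    ALLOW = "allow"
    DENY = "deny"
    ABSTAIN = "abstain"


class Allowlist(Protocol):
    async def evaluate(self, context: dict[str, Any]) -> AllowlistDecision: ...


class Fixed:
    """Hypothetical stand-in that always returns the same decision."""

    def __init__(self, decision: AllowlistDecision) -> None:
        self.decision = decision

    async def evaluate(self, context: dict[str, Any]) -> AllowlistDecision:
        return self.decision


class AnyOf:
    """ALLOW if any child allows; DENY only if every child denies."""

    def __init__(self, *children: Allowlist) -> None:
        self.children = children

    async def evaluate(self, context: dict[str, Any]) -> AllowlistDecision:
        decisions = [await child.evaluate(context) for child in self.children]
        if AllowlistDecision.ALLOW in decisions:
            return AllowlistDecision.ALLOW
        # Assumption: an empty combinator abstains rather than denies.
        if decisions and all(d is AllowlistDecision.DENY for d in decisions):
            return AllowlistDecision.DENY
        return AllowlistDecision.ABSTAIN


class AllOf:
    """DENY if any child denies; ALLOW only if every child allows."""

    def __init__(self, *children: Allowlist) -> None:
        self.children = children

    async def evaluate(self, context: dict[str, Any]) -> AllowlistDecision:
        decisions = [await child.evaluate(context) for child in self.children]
        if AllowlistDecision.DENY in decisions:
            return AllowlistDecision.DENY
        # Assumption: an empty combinator abstains rather than allows.
        if decisions and all(d is AllowlistDecision.ALLOW for d in decisions):
            return AllowlistDecision.ALLOW
        return AllowlistDecision.ABSTAIN


def resolve_allowlist(
    channel_allowlist: Allowlist | Literal["inherit"] | None,
    host_default: Allowlist | None,
) -> Allowlist | None:
    """Apply the three channel-side cases: inherit, explicitly open, override."""
    if channel_allowlist == "inherit":
        return host_default  # may itself be None, which means the channel is open
    return channel_allowlist  # None (explicitly open) or the channel's own list
```

In this sketch a return value of `None` from `resolve_allowlist` means "open", which is why `"inherit"` needs to be a distinct sentinel: without it, a channel could not tell "use the host default" apart from "ignore the host default". Adding to the host default rather than replacing it stays an explicit composition, for example `AllOf(host_default, extra_list)`.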
+ +##### `host.authorize(...)` and `AuthorizationOutcome` + +Channels do not run the decision pipeline themselves — they call into a single host seam after extracting `ChannelIdentity` and any natively verified claims: + +```python +from dataclasses import dataclass, field + +@dataclass(frozen=True) +class Allowed: + isolation_key: str + +@dataclass(frozen=True) +class LinkRequired: + challenge: LinkChallenge + +@dataclass(frozen=True) +class Denied: + reason_code: str # stable, machine-readable + user_message: str | None = None # safe to render publicly (group-chat-safe) + log_details: Mapping[str, Any] = field(default_factory=dict) # never shown to users; structured for audit + +AuthorizationOutcome = Allowed | LinkRequired | Denied + +# Method on the host; channels call it as `await host.authorize(...)`. +async def authorize( + self, + identity: ChannelIdentity, + *, + require_link: bool, + allowlist: IdentityAllowlist | None, + verified_claims: Mapping[str, str] | None = None, + conversation_context: ConversationContext | None = None, # for group-chat policy +) -> AuthorizationOutcome: ... +``` + +**Decision order** (the pipeline the host runs): + +1. Build `AuthorizationContext(phase="pre_link", verified_claims=verified_claims or {}, claim_source=…)`. +2. `decision_pre = await allowlist.evaluate(context_pre)` (defaults to `ALLOW` when `allowlist is None`). +3. `decision_pre == DENY` → `Denied(reason_code="allowlist_denied_pre_link", ...)`. +4. `decision_pre == ALLOW`: + - If `require_link=True` and the linker has no record yet → `LinkRequired(linker.begin(identity))`. + - Otherwise → `Allowed(resolved_or_auto_issued_isolation_key)`. +5. `decision_pre == ABSTAIN`: + - If `require_link=True` **or** the allowlist declared `requires_linked_claims`: attempt `linker.is_linked(identity, verified_claims=…)`. + - Not linked → `LinkRequired(linker.begin(identity))`. + - Linked → evaluate again at `phase="post_link"` with the linker-emitted claims. + - `ALLOW` → `Allowed(linked_isolation_key)`. + - `DENY` → `Denied(reason_code="allowlist_denied_post_link", ...)`. 
+ - `ABSTAIN` post-link is a misconfiguration (no allowlist had an opinion even after linking); logged and treated as `Denied(reason_code="allowlist_abstain_after_link")`. + - Otherwise (open profile, no claim dependency): `Allowed(auto_issued_isolation_key)`. + +The channel **renders** the outcome — `Allowed` proceeds to `ChannelRequest`, `LinkRequired` projects the `LinkChallenge` through the channel's native UX (same path the `link` command already uses), `Denied` projects `user_message` (when set) through a short refusal. The channel **never** sees `log_details` and is responsible for not echoing `reason_code` to end users. + +##### Configuration validation (fail-fast) + +The host runs a startup validator across `(channel.require_link, channel.allowlist)` for every channel: + +1. If `channel.allowlist` (after resolving `"inherit"`) contains any allowlist with `requires_linked_claims=True`, the channel **must** either have `require_link=True` or declare via a channel attribute that it natively emits verified claims (`Channel.emits_verified_claims: bool = False`). Otherwise: `raise ChannelConfigurationError("LinkedClaimAllowlist requires a source of verified claims; set require_link=True on or use a channel that emits them natively")`. +2. If `channel.allowlist` contains a `LinkedClaimAllowlist` and the host has no `identity_linker` configured: same `ChannelConfigurationError`. +3. If `channel.allowlist` contains a `NativeIdAllowlist(channel=)` whose `` is not a known channel on this host: `ChannelConfigurationError`. + +These errors are raised eagerly at `AgentFrameworkHost.__init__` (or `host.serve(...)` startup), not on the first inbound request — silent deny-everyone is the worst possible default and is not allowed. + +##### Group chats and privacy of denial + +Authorization runs **per message**, not per conversation: in a group chat, one allowlisted user invoking the bot does not authorize other group members for subsequent messages. 
The host also mirrors the `LinkChallenge` group-chat redirect pattern (see [Multi-user conversations](#multi-user-conversations-telegram-groups-teams-group-chats-and-channels)) for denials: + +- In a 1:1 chat, the channel may render the full `user_message` from `Denied`. +- In a group chat, the channel renders a generic refusal in-room (e.g. "You don't have access to this bot.") and, where the channel supports it, follows up with a DM containing the longer `user_message`. The full `log_details` payload only reaches the host's structured logs / OpenTelemetry span — never the wire. + +Built-in `user_message` defaults are intentionally bland and tenancy-free ("You don't have access to this bot." / "Please link your account to continue.") to avoid leaking who else is in the allowlist or which tenant gates it. + +##### v1 shipping surface + +The core PR includes the channel-neutral authorization and identity-linking seam; provider-specific linker packages (for example Entra OAuth helpers) plug into it without making the core package depend on an IdP SDK: + +- `IdentityAllowlist` Protocol + `AllowlistDecision` enum + `AuthorizationContext` dataclass. +- `AllowAll`, `NativeIdAllowlist`, `LinkedClaimAllowlist`, `AnyOfAllowlists`, `AllOfAllowlists`, `CallableAllowlist` built-ins. +- `IdentityLinker` Protocol, `LinkedIdentity`, and `LinkChallenge` core types. A linker resolves a channel-native identity in one call, returning either a linked identity with verified claims or a challenge for the channel to render. +- `AuthorizationOutcome` (`Allowed` / `LinkRequired` / `Denied`) types. +- `AuthPolicy` factory helpers on the public surface. +- `Host(default_allowlist=..., identity_linker=...)` + per-channel `allowlist: ... | Literal["inherit"] | None` parameter and the construction-time config validator. 
The validator enforces rules #1 (claim-source), **#2 (linker presence — channels with `require_link=True` must be paired with a configured `identity_linker`; otherwise a `ChannelConfigurationError` is raised at construction so misconfigurations cannot ship)**, and #3 (NativeIdAllowlist channel typo). Combinator walking (`AnyOf` / `AllOf`) is recursive so nested misconfigurations are caught at the host level. +- `host.authorize(identity, *, require_link, allowlist, verified_claims=None)` supports open, native-id allowlist, and claim allowlist profiles end-to-end. The open path returns `Allowed` with an auto-issued `:` isolation key (linear-scan registry lookup re-issues a known key when the identity has been seen before). Native-id allowlists return `Allowed`/`Denied` per the list. Claim-based allowlists use channel-emitted `verified_claims` when present; otherwise, when a linker is configured, the host returns `LinkRequired(challenge)` for unresolved identities or evaluates `LinkedClaimAllowlist` against the linker's verified claims for resolved identities. + + #### `LinkPolicy` and `confidentiality_tier` @@ -408,6 +672,8 @@ Messages that don't satisfy the rule are ignored at the channel layer — no `Ch | All linked | `ResponseTarget.all_linked` | Delivered to every channel where the resolved `isolation_key` is known. | | None | `ResponseTarget.none` | Background-only — caller must poll the `ContinuationToken`. Forces `background=True`. | +`ResponseTarget` constructors that take at least one channel id (`.channel(...)`, `.channels([...])`, `.identities([...])`) accept an `echo_input: bool = False` kwarg. When true, the host pushes the **originating user's input** to each non-originating destination as a `HostedRunResult[AgentResponse]` whose underlying `messages[*].role == "user"` **before** the agent reply (whose `messages[*].role == "assistant"`). Used when the developer wants downstream channels to mirror what the user said so their UI stays coherent (e.g. 
a workflow originating on Telegram that pushes to Teams as well — the Teams transcript shows both turns). The echo and the response are bundled into the **same scheduled push task** per destination (the runner-managed unit of work — see [Intended targets + durable delivery](#intended-targets--durable-delivery)); the echo is dispatched first, and an echo-push failure is logged and swallowed inside the task so a channel that drops echoes still receives the agent reply. Both pushes go through the same `ChannelPush.push(identity, payload)` entry point — channels distinguish the echo phase from the response phase by inspecting `payload.result.messages[*].role`, or (for channels that wire a `response_hook`) by branching on `ChannelResponseContext.is_echo` directly. Channels that cannot impersonate the user on their wire (most chat bots can only send as the bot) typically render echoes as a quoted / prefixed block, drop them, or rewrite them via their `response_hook`. + When `response_target` is anything other than `originating`, the originating channel's protocol response is the **`ContinuationToken`** (e.g. an Invocations 202 with the token in the response body and/or a polling URL header), and the actual agent response is delivered out-of-band via the destination channel(s)' `ChannelPush`. If the destination channel doesn't implement `ChannelPush`, the host falls back per the configured policy (default: deliver to `originating`; surfaces a warning in telemetry). The configured `LinkPolicy` is consulted for every destination — destinations that fail the policy (e.g. a corp-tier channel addressed from a public-tier originating request) are dropped, and if every destination is dropped the host falls back to `originating`. **`ChannelPush`** (Protocol) — optional capability for channels that can deliver outbound messages without a prior request. 
@@ -446,7 +712,14 @@ V1 ships two implementations: - **`FileHostStateStore(directory: Path = "./.af-hosting/")`** — default; one JSON file per record under `continuations/`, `link_grants/`, plus a `last_seen.json` keyed by isolation key. Atomic writes; per-namespace TTL cleanup (continuations 24h, link grants 15min, last-seen 30d by default). Suitable for single-node hosts and dev; works in hosted-agent environments where the working directory is persisted and isolated per agent. - **`InMemoryHostStateStore()`** — testing / ephemeral; same protocol, no persistence. -Pluggable v1-fast-follow implementations (Cosmos, SQL, Redis) plug into the same protocol — see req #23. +Pluggable v1-fast-follow implementations (Cosmos, SQL, Redis) plug into the same protocol — see req #24. + +In the Python core package, the host-level `state_dir` shorthand reserves a +`links` component for this identity-link store. Passing a single path derives +`state_dir/links/`; the `HostStatePaths` mapping form accepts `links=...` for +placing link-store data on a separate volume. The core host offers that path to +identity linkers that implement `SupportsLinkStorePath`; linkers that own a +provider-specific store can ignore it and be configured directly. **`ChannelCommand` / `ChannelCommandContext` / `CommandHandler`** — cross-channel native command model (per PR #5393). @@ -460,11 +733,23 @@ Pluggable v1-fast-follow implementations (Cosmos, SQL, Redis) plug into the same | Type | Fields | Description | |---|---|---| -| `HostedRunResult` | `response: AgentResponse`, `session: AgentSession?`, `text` | One-shot outcome. | +| `HostedRunResult[TResult]` | `result: TResult`, `session: AgentSession \| None` | One-shot outcome. 
`result` carries the **target's full-fidelity output unchanged**: `HostedRunResult[AgentResponse]` for agent targets (channels read `result.messages`, `result.text`, `result.value`, `result.response_id`, `result.usage_details`, … directly off the underlying response), `HostedRunResult[WorkflowRunResult]` for workflow targets (channels iterate `result.get_outputs()` and inspect `result.get_final_state()`). The host never pre-shapes, flattens, or filters — multi-modality and structured outputs survive end-to-end and each channel (through its `response_hook` and its native serializer) decides what subset its wire renders. The echo-input phase synthesises a `HostedRunResult[AgentResponse]` wrapping the originating user turn so the same delivery machinery applies. `session` carries the resolved per-isolation_key `AgentSession` (`None` for workflows, which do not own session state in the agent sense). Treat instances as immutable — the host clones per-destination via `result.replace(result=...)` before invoking each channel's `response_hook`; `replace()` is shallow, so channels that need to mutate `result` itself are responsible for their own deep copy. | | `HostedStreamResult` | `updates: ResponseStream[...]`, `raw_events: AsyncIterable[Any] \| None`, `session: AgentSession?` | Streaming outcome. `updates` is the **normalized** stream of `AgentRunResponseUpdate` (lossless for messages, function calls, usage) and is the happy path for Responses, Invocations, Telegram, and most channels. `raw_events` is an optional **passthrough seam** onto the underlying agent event stream (before update normalization) for channels whose protocol carries domain events the framework does not model — e.g. AG-UI's `StateSnapshotEvent` / `StateDeltaEvent` / `ToolCallStartEvent`. Channels that consume `raw_events` bear responsibility for the full event translation; the request still flows through `context.stream(...)` so session resolution, identity, push, and policy continue to apply. 
`None` when the host has no raw upstream (e.g. a workflow-only target produced from cached events). | The host does **not** emit protocol events directly — channels translate `HostedRunResult`/`HostedStreamResult` into Responses events, Invocations SSE, webhook callbacks, or platform messages. +**`ChannelResponseHook` / `ChannelResponseContext`** — dev-supplied post-processing seam applied per destination before push. + +| Type | Shape | Description | +|---|---|---| +| `ChannelResponseHook` | `Callable[[HostedRunResult[Any], *, context: ChannelResponseContext], HostedRunResult[Any] \| Awaitable[HostedRunResult[Any]]]` | Stored as a `response_hook` attribute on a channel instance — **duck-typed**, not part of the `Channel` Protocol. Receives a per-destination clone of the `HostedRunResult` and returns a (possibly rewritten) replacement. Hooks rebind ``result`` via `HostedRunResult.replace(result=...)` rather than mutating it in place. Common uses: flatten multi-modal output to text for a text-only wire, filter out tool-call contents, project a workflow `WorkflowRunResult` into a channel-friendly `AgentResponse` for text-only channels, attach citation entities, decide an Adaptive Card vs plain-text presentation. The hook signature stays `Any`-typed in the envelope's `TResult` so a single channel can serve both agent (`HostedRunResult[AgentResponse]`) and workflow (`HostedRunResult[WorkflowRunResult]`) payloads; channels narrow at hook entry if they want static checking. | +| `ChannelResponseContext` | `request: ChannelRequest`, `channel_name: str`, `destination_identity: ChannelIdentity`, `originating: bool`, `is_echo: bool` | Per-destination context passed to a hook. `originating=False` for push deliveries (current scope of the host's `_deliver_response`); `is_echo=True` when this invocation is for the `ResponseTarget.echo_input` user-message phase rather than the agent reply phase. 
| +| `apply_response_hook(hook, result, *, context)` | helper | Standardised invocation convention so channels (and the host's delivery layer) all call hooks the same way. | + +The host runs each destination's hook on a **cloned** `HostedRunResult`, so a hook that rebinds `result` cannot leak into the payload another destination observes. The clone is shallow — channels that need to mutate `result` itself (rather than rebind it via `replace()`) are responsible for their own deep copy. + + + ### Built-in channel constructors ```python @@ -625,7 +910,7 @@ To make the picture explicit: there are exactly three distinct *storage seams* i | Seam | Scope | Examples | |---|---|---| | **`ContextProvider`** (per-conversation) | Per-`source_id` data the agent needs at run time. Messages (via `HistoryProvider`), AG-UI per-thread state (via `AgUiStateProvider`), or any future per-conversation extension. **The only public per-conversation seam.** | `FileHistoryProvider`, `FoundryHostedAgentHistoryProvider`, `AgUiStateProvider` | -| **Host-level pluggable store** (per-host) | `ContinuationToken`s for background runs, identity-link grants, last-seen `(isolation_key, channel)` records. **File-based by default** in v1 (`FileHostStateStore`, atomic JSON writes under `./.af-hosting/`); `InMemoryHostStateStore` for tests; pluggable for Cosmos / SQL / Redis adapters in v1 fast follow (req #23). MAY be backed by the same physical store as `ContextProvider`, but the protocol is distinct because the data is host-execution metadata, not per-conversation context. | `FileHostStateStore` (v1 default), `InMemoryHostStateStore`, future Cosmos / SQL / Redis adapters | +| **Host-level pluggable store** (per-host) | `ContinuationToken`s for background runs, identity-link grants, last-seen `(isolation_key, channel)` records. 
**File-based by default** in v1 (`FileHostStateStore`, atomic JSON writes under `./.af-hosting/`); `InMemoryHostStateStore` for tests; pluggable for Cosmos / SQL / Redis adapters in v1 fast follow (req #24). MAY be backed by the same physical store as `ContextProvider`, but the protocol is distinct because the data is host-execution metadata, not per-conversation context. | `FileHostStateStore` (v1 default), `InMemoryHostStateStore`, future Cosmos / SQL / Redis adapters | | **`CheckpointStorage`** (workflow runtime) | Workflow executor frames so a workflow can resume after process restart. Structurally distinct from both seams above (the data is workflow-runtime state, not session/identity state). MAY share a physical backend, but the protocol stays separate. | `FileCheckpointStorage`, future `CosmosCheckpointStorage` | Concretely, this means an app deploying onto e.g. Foundry storage can run **all three** against the same Foundry backend and still have three orthogonal protocol surfaces — one per concern — instead of one universal store everything accidentally collides in. @@ -682,7 +967,7 @@ The packaging question for `uvicorn` (required dependency vs optional extra) is - **Active channel**: The channel most recently observed for a given `isolation_key`. Tracked by the host on every successfully resolved request; consumed by `ResponseTarget.active`. - **`ContinuationToken`**: First-class artifact for background/asynchronous runs, returned immediately from `host.run_in_background(request)`. Carries an opaque, URL-safe `token` plus `status`, `isolation_key`, `result`/`error`, and the configured `response_target`. Persisted via `HostStateStore` (file-based by default in v1) so background runs survive host restarts. Host pushes the result to the response target when ready and serves it via channel poll routes. - **Background run**: A `ChannelRequest` submitted via `host.run_in_background(request)` (or any request with `background=True`). 
The originating call returns a `ContinuationToken` immediately; the response is delivered later via the configured `ResponseTarget` and/or polled by token. -- **`HostStateStore`**: Single persistence seam for host-execution metadata — continuation tokens, identity-link grants, last-seen records. V1 default `FileHostStateStore` (atomic JSON writes under `./.af-hosting/`); `InMemoryHostStateStore` for tests; pluggable for Cosmos / SQL / Redis (fast follow, req #23). Distinct from `ContextProvider` (per-conversation) and `CheckpointStorage` (workflow), but a deployment MAY back all three with the same physical store. +- **`HostStateStore`**: Single persistence seam for host-execution metadata — continuation tokens, identity-link grants, last-seen records. V1 default `FileHostStateStore` (atomic JSON writes under `./.af-hosting/`); `InMemoryHostStateStore` for tests; pluggable for Cosmos / SQL / Redis (fast follow, req #24). Distinct from `ContextProvider` (per-conversation) and `CheckpointStorage` (workflow), but a deployment MAY back all three with the same physical store. - **`session_mode`**: Per-request directive (`auto` | `required` | `disabled`) that controls whether the host resolves a session before invoking the target. Lets `run_hook`s express explicit policy — e.g. translating Responses `store=false` into `session_mode="disabled"` to honor the caller's "don't store" intent at the `HistoryProvider` layer (the channel does not do this automatically — see [The Responses store parameter](#the-responses-store-parameter)). - **`confidentiality_tier`** (channel-level): Opaque label (`"corp"`, `"public"`, `"internal"`, …) declared on a `Channel` and consumed by the host's `LinkPolicy`. Two channels with different confidentiality tiers can share an agent target on one host while remaining session-isolated. 
- **`LinkPolicy`**: Host-level decision over which channel pairs may share an `isolation_key` (link) and which channel pairs may be `ResponseTarget` source/destination for one another (deliver). Built-in variants: allow-all (default), same-tier-only, explicit allow-list, deny-all. See [LinkPolicy and confidentiality_tier](#linkpolicy-and-confidentiality_tier) for the full contract and built-ins table. @@ -693,6 +978,54 @@ The packaging question for `uvicorn` (required dependency vs optional extra) is - **`SupportsAgentRun`**: The existing framework agent execution seam (`run(..., session=..., stream=...)`) — the contract the host uses when the hostable target is an agent. - **`Workflow`**: The framework workflow execution seam — the contract the host uses when the hostable target is a workflow. The host wraps the workflow's outputs into the same `HostedRunResult` / `HostedStreamResult` shape so channels do not need to distinguish. +## Runtime modes + +The host runs in one of two operational shapes, declared (or auto-detected) via a single `runtime_mode` parameter. The parameter is **advisory** — it sets defaults for the seams below; the developer can override any individual choice. + +```python +AgentFrameworkHost( + target=my_agent, + channels=[...], + runtime_mode=None, # None → auto-detect; "long_running" | "ephemeral" to force +) +``` + +| Value | Shape | When to use | +|---|---|---| +| `"long_running"` | Always-on container / process. Owns its own scheduler. Survives across many requests. | Local dev, OpenClaw-style hosted deployments, classic web-app rollouts on AKS / App Service / Container Apps. | +| `"ephemeral"` | Scale-to-zero / per-request lifecycle. Process may terminate between requests; cold-start cost on each one. | Foundry Hosted Agent, Azure Functions consumption plan, AWS Lambda, and similar serverless runtimes. | +| `None` (default) | Auto-detect. 
The host inspects environment markers at construction; falls back to `"long_running"` when nothing is detected. | The default. Recommended for portable code that works locally and ships to a serverless target. | + +**Auto-detection.** When `runtime_mode=None`, the host checks for known deployment markers in this order and picks `"ephemeral"` on the first hit: + +| Marker | Meaning | +|---|---| +| `FOUNDRY_HOSTING_ENVIRONMENT` (env var) | Running inside Foundry Hosted Agent. | +| `AZURE_FUNCTIONS_ENVIRONMENT` (env var) | Running inside the Azure Functions worker. | +| `AWS_LAMBDA_FUNCTION_NAME` (env var) | Running inside an AWS Lambda. | + +If none of the markers match, the host defaults to `"long_running"` (a sensible local-dev / container default). Additional markers may be added without bumping the API; the list is documented and overridable via the `runtime_mode` parameter itself. + +**Defaults selected by mode.** The mode drives the *default selection* for these seams. Each is independently overridable: + +| Concern | `"long_running"` default | `"ephemeral"` default | +|---|---|---| +| `HostStateStore` | `InMemoryHostStateStore` (process owns state) | `FileHostStateStore` (atomic JSON under `./.af-hosting/`; survives single-node restart) | +| `ContinuationToken` persistence | In-memory acceptable | Persistence required (file / Cosmos / Foundry) | +| `DurableTaskRunner` | `InProcessTaskRunner` (asyncio + bounded retry) | Adapter expected (`agent-framework-hosting-durabletask`, Foundry, …); falls back to `InProcessTaskRunner` with a startup warning when none configured | +| Background runs (req #14) | Owned by the long-running worker via `InProcessTaskRunner` | Hand off to the durable runner so the process can terminate between requests | +| Channel polling (e.g. 
Telegram `getUpdates`) | Natural fit — `on_startup` spawns the poller, `on_shutdown` cancels it | Requires an external scheduled trigger or webhook transport; polling channels emit a startup warning when paired with `"ephemeral"` | +| `IdentityLinker` short-lived grants | In-memory TTL fine | Must persist via `HostStateStore` | +| `IdentityAllowlist` lookup | In-memory cache fine | Persisted source or external IdP claim resolution | +| Health checks + readiness probes | First-class | Less relevant — runtime manages liveness | +| Per-channel polling-worker isolation | Important — leaks compound over days/weeks (see [`channels_vs_openclaw.md`](../../python/.user/channels_vs_openclaw.md)) | N/A — process recycles between requests | +| Process-recycle expectations | Days/weeks | Per-request | +| Memory/leak concerns | Important | Less relevant | + +**Detection failures.** Auto-detection is best-effort. If a deployment uses a custom runtime not in the marker list, callers SHOULD set `runtime_mode="ephemeral"` (or `"long_running"`) explicitly. The host logs the detected mode at startup so misdetection is visible in normal operation. + +**Why advisory and not enforced.** Most knobs make sense in both modes (e.g. a developer running a "long-running" container may still want `FileHostStateStore` for state durability across deploys); enforcing strict defaults per mode would force every override to fight a config error. The selected defaults are a starting point. + ## Hero Code Samples > **Common prerequisite:** Every sample below calls `host.serve(...)`, which lazy-imports `uvicorn`. Install `uvicorn` (e.g. `pip install uvicorn`) — or the corresponding `agent-framework-hosting[serve]` extra if the package ships one (see Open Question #2) — alongside the per-sample dependencies listed in each scenario's **Prerequisites** block. Samples that use `host.app` directly (handed to Hypercorn/Daphne/Granian/Gunicorn+uvicorn workers) do not require `uvicorn`. 
@@ -897,6 +1230,41 @@ Webhook transport contributes `/telegram/webhook` by default; the command catalo A developer wants every Telegram chat to be **authenticated up front** via OAuth (Microsoft Entra ID) before the agent will respond, and wants Teams chats from the same Entra ID user to be **auto-linked** to the existing session — no second `/link` ceremony, just sign in once on the first channel and the rest follow automatically. This delivers cross-channel chat continuity as a side-effect of identity linking; Scenario 7 covers the alternative pattern where a trusted server-side relay supplies identity directly without a link ceremony. +```mermaid +sequenceDiagram + autonumber + actor User + participant Tg as TelegramChannel + participant Host + participant Linker as IdentityLinker
(EntraOAuth) + participant IdP as OAuth provider + participant Store as HostStateStore
(identity_links) + participant Act as ActivityChannel + + User->>Tg: /link + Tg->>Host: ChannelRequest(identity=tg:12345) + Host->>Linker: begin(identity=tg:12345) + Linker-->>Host: LinkChallenge(url, state) + Host->>Tg: response_hook → push challenge URL + Tg-->>User: "click here to sign in" + + User->>IdP: sign in (browser) + IdP-->>User: redirect with code + User->>Linker: /callback?code=…&state=… + Linker->>IdP: exchange code → tokens + claims + IdP-->>Linker: claims (oid, email, …) + Linker->>Store: persist link(tg:12345 ↔ linked_claims) + Linker-->>User: "linked ✅" + + Note over User: later, on Teams (same Entra OID) + + User->>Act: hello + Act->>Host: ChannelRequest(identity=teams-aad-oid) + Host->>Store: lookup linked_claims by oid + Store-->>Host: existing isolation_key (matches tg:12345) + Host->>Host: same session as Telegram +``` + > **Prerequisites:** This sample assumes: > - `agent-framework-hosting`, `agent-framework-hosting-telegram`, and the (future) `agent-framework-hosting-activity` channel are installed > - An OAuth provider is configured (Microsoft Entra ID in this example) @@ -965,6 +1333,37 @@ A developer runs an internal application server that already knows its end users This works **without** an `IdentityLinker` because the application backend is a **trusted relay**: it already authenticated the user through its own SSO and knows both the user's app-internal id and (because the user has previously connected their Telegram account in the application's own settings page) the user's Telegram `chat_id`. The host just needs to be told. +```mermaid +sequenceDiagram + autonumber + actor Backend as Server-side backend + participant Resp as ResponsesChannel + participant Host + participant Hook as run_hook
(responses_relay_hook) + participant Store as HostStateStore<br/>(continuations) + participant Target as Agent + participant Runner as DurableTaskRunner + participant Tg as TelegramChannel + + Backend->>Resp: POST /v1/responses<br/>extra_body.hosting.push_to_telegram_chat_id= + Resp->>Host: ChannelRequest(...) + Host->>Hook: run_hook(request, context) + Hook->>Hook: rewrite to<br/>background=True,<br/>response_target=identities([tg:]) + Host->>Store: write continuation(token, status=in_progress) + Host-->>Resp: ContinuationToken (token) + Resp-->>Backend: 200 with continuation token + + Note over Host,Target: background task + + Host->>Target: run (async) + Target-->>Host: AgentResponse + Host->>Store: continuation.complete(token, result) + Host->>Runner: schedule("hosting.push",
payload for tg:) + Runner->>Host: _handle_push_task(payload) + Host->>Tg: response_hook → push + Tg-->>User: answer arrives in Telegram chat +``` + > **Prerequisites:** This sample assumes: > - `agent-framework-hosting`, `agent-framework-hosting-responses`, and `agent-framework-hosting-telegram` are installed > - The application backend can attach two extra fields to its Responses call: an `app_user_id` (the user's stable id in the application's own namespace) and, optionally, a `push_to_telegram_chat_id` (the user's known Telegram chat id from the application's own database) @@ -1102,6 +1501,42 @@ The two enabling pieces: A developer wants the user to start a long-running task on Telegram and pick up the response on Teams (whichever channel the user happens to be on when the result is ready). The originating Telegram message returns a `ContinuationToken` immediately; when the agent completes, the host pushes the result to the user's currently active channel via `ChannelPush`. A poll route is also exposed for callers that prefer polling. +```mermaid +sequenceDiagram + autonumber + actor User + participant Tg as TelegramChannel + participant Host + participant Hook as run_hook + participant Store as HostStateStore
(continuations · last_seen) + participant Target as Agent + participant Runner as DurableTaskRunner + participant Act as ActivityChannel + + User->>Tg: long-running ask + Tg->>Host: ChannelRequest(identity=tg:12345) + Host->>Hook: run_hook + Hook->>Hook: background=True,<br/>response_target=active + Host->>Store: write continuation(in_progress) + Host-->>Tg: ContinuationToken + Tg-->>User: "working on it…" + + Note over User: user opens Teams,<br/>last_seen updates to "activity" + + User->>Act: hello on Teams + Act->>Host: ChannelRequest(identity=teams-aad-oid) + Host->>Store: record_last_seen(isolation_key, activity, now) + + Note over Host,Target: background completes + + Target-->>Host: AgentResponse + Host->>Store: get_last_seen(isolation_key) → activity + Host->>Runner: schedule("hosting.push",
payload for activity) + Runner->>Host: _handle_push_task + Host->>Act: push + Act-->>User: answer arrives on Teams +``` + > **Prerequisites:** This sample assumes: > - `agent-framework-hosting`, `agent-framework-hosting-telegram`, and the (future) `agent-framework-hosting-activity` channel are installed > - The user is already linked across Telegram and Teams (Scenario 6) @@ -1161,6 +1596,42 @@ If the chosen destination channel does not implement `ChannelPush` (e.g. Respons ### Scenario 9: Hosting a `Workflow` instead of an agent (with checkpoint storage) +The host shape is unchanged when the target is a `Workflow`; the result wrapper narrows to `HostedRunResult[WorkflowRunResult]` and `response_hook` carries the projection that lets text-only channels render workflow output. + +```mermaid +sequenceDiagram + autonumber + actor User + participant Channel + participant Host + participant Workflow + participant Store as HostStateStore
(workflow_checkpoints) + participant Hook as response_hook<br/>(per-channel) + + User->>Channel: message + Channel->>Host: ChannelRequest + Host->>Workflow: run(messages) + loop per executor / event + Workflow->>Store: write checkpoint + Workflow-->>Host: WorkflowEvent + end + Workflow-->>Host: WorkflowRunResult + Host->>Host: wrap → HostedRunResult[WorkflowRunResult]<br/>(get_outputs, get_final_state, …) + + alt text-only channel + Host->>Hook: response_hook(result, context) + Hook->>Hook: result.replace(<br/>result=AgentResponse(<br/>text=workflow.get_outputs()[-1])) + Hook-->>Host: HostedRunResult[AgentResponse] + Host->>Channel: push (sync or via runner) + else card-capable channel + Host->>Hook: response_hook(result, context) + Hook->>Hook: render adaptive card
from workflow get_outputs + Hook-->>Host: HostedRunResult[Any] + Host->>Channel: push + end + Channel-->>User: reply (channel-native) +``` + > **Prerequisites:** This sample assumes: > - `agent-framework-hosting` and `agent-framework-hosting-invocations` are installed > - A `Workflow` definition with typed inputs (`OrderIntakeInputs`) @@ -1274,11 +1745,13 @@ class MyWebhookChannel: return ChannelContribution(routes=[Route(f"{self._path}/inbound", endpoint, methods=["POST"])]) ``` -**Result is rich, not just text.** `result` here is a `HostedRunResult` wrapping an `AgentRunResult` (or a workflow output). It is **not** limited to a flat string — `result.text` is the convenience plain-text projection, but the underlying object carries: +**Result is rich, not just text.** `result` here is a `HostedRunResult[TResult]` — a thin generic envelope around the target's **full-fidelity output**. For agent targets `TResult` narrows to `AgentResponse`, so channels read everything the target produced directly off `result.result`: - the full `messages: list[ChatMessage]` thread the agent produced this turn — each message holds an ordered list of typed `Contents` (see [`Contents` in core](https://github.com/microsoft/agent-framework/blob/main/python/packages/core/agent_framework/_types.py)): `TextContent`, `DataContent` (inline base64 blobs), `UriContent` (URLs to images/audio/files), `FunctionCallContent` and `FunctionResultContent` (tool-call traces), `HostedFileContent` / `HostedVectorStoreContent` (provider-side file/vector references), `UsageContent` (token usage), `ErrorContent`, `TextReasoningContent` (reasoning traces), and channel-extensible custom content kinds. Each content also has `additional_properties` for provider-specific extensions (citations, image alt text, source spans, …), - `value: T | None` — the typed structured output when the agent returned one (e.g. 
via response-format / structured-output features), -- `usage_details: UsageDetails | None`, `raw_representation`, and per-message `additional_properties` carrying provider-native extras. +- `response_id`, `usage_details: UsageDetails | None`, `raw_representation`, and per-message `additional_properties` carrying provider-native extras. + +For workflow targets `TResult` is `WorkflowRunResult`, so `result.result.get_outputs()` iterates the per-executor output payloads and `result.result.get_final_state()` exposes terminal-state info. The host never collapses or pre-shapes workflow outputs — channels (and developer-supplied `response_hook`s) own the projection, since "what counts as a renderable output" is wire-format-specific. A channel author is free to project this into **whatever the channel's native shape supports**. Examples: @@ -1286,33 +1759,92 @@ A channel author is free to project this into **whatever the channel's native sh - The built-in **Responses channel** preserves the full content-list shape on the wire — every `ChatMessage` round-trips as a Responses-shaped output item so callers can inspect the typed mix of text, function-call traces, image/file outputs, reasoning, and structured-output `value`s exactly as the agent produced them. There is no lossy collapse to a single text field. - A channel fronting a **chat UI** can render `TextContent` as full GitHub-Flavored Markdown / HTML (tables, code fences with syntax highlighting, math), `DataContent` and `UriContent` as inline images/audio/video players, `FunctionCallContent` / `FunctionResultContent` as collapsible "tool ran" cards, and `TextReasoningContent` as a collapsible reasoning panel — all from the same `result`. - A **voice channel** can route `TextContent` through TTS, play `DataContent(audio/*)` directly, and surface `FunctionCallContent` only as audio earcons (or skip them entirely) — the same `result` object drives a completely different surface. 
-- A **richly-typed RPC channel** can return `result.value` (the structured output) directly when the workflow / agent produced one, and fall back to `result.text` only when no typed output is available. +- A **richly-typed RPC channel** can return `result.result.value` (the structured output) directly when the workflow / agent produced one, and fall back to a text projection only when no typed output is available. -The host imposes no projection — `result.text` is offered as a convenience for channels whose native shape really is "single string in, single string out", and channels are encouraged to lean on the full content list when their protocol supports more. +The host imposes no projection — channels read `result.result.text` for a convenience plain-text rollup on agent targets, but are encouraged to lean on the full underlying payload when their protocol supports more. ## Information Design ### Canonical flow +The default request/response shape — single channel, originating response, no fan-out. Authorization runs before `run_hook`; `response_hook` runs per-destination (here just one). + +```mermaid +sequenceDiagram + autonumber + actor User + participant Channel as Channel
(inbound) + participant Host + participant Auth as host.authorize + participant Target as Agent / Workflow + participant Annot as _annotate_intended_targets + + User->>Channel: native payload (webhook / poll / HTTP body) + Channel->>Channel: parse → ChannelRequest<br/>(identity, conversation_id, content, response_target=originating) + Channel->>Host: dispatch(ChannelRequest) + Host->>Auth: authorize(identity, require_link, allowlist) + alt Denied / LinkRequired + Auth-->>Host: Denied(reason_code, user_message)<br/>or LinkRequired(challenge) + Host-->>Channel: render denial / link challenge<br/>(channel-appropriate UX) + Channel-->>User: short refusal in-room + else Allowed(isolation_key) + Host->>Host: resolve session via StateStore + Host->>Host: run_hook(request, context) + Host->>Target: target.run(messages, session=...) + Target-->>Host: AgentResponse / WorkflowRunResult + Host->>Host: wrap → HostedRunResult[TResult] + Host->>Annot: write hosting.intended_targets
onto assistant message + Host->>Channel: response_hook(result, context) + Channel->>Channel: shape to native payload + Channel-->>User: reply on originating wire + end +``` + +The textual trace of the same flow (showing more of the per-step bookkeeping): + ```text external request/event -> channel-specific parsing + validation -> ChannelIdentity extraction (per-channel native id) -> default channel invocation mapping - -> optional run_hook - -> ChannelRequest (carries response_target, background) + -> optional run_hook (dev-supplied; default no-op) + -> ChannelRequest (carries response_target, background, echo_input) -> AgentFrameworkHost / ChannelContext -> identity_resolver(ChannelIdentity) -> isolation_key -> host records (isolation_key, channel, now) as last-seen (for ResponseTarget.active) -> AgentSession resolution (per session_mode, scoped by isolation_key) - -> [foreground] target execution seam -> HostedRunResult/HostedStreamResult -> originating channel serialization + -> target execution seam (Agent.run / Workflow.run) + -> HostedRunResult[AgentResponse] | HostedRunResult[WorkflowRunResult] + (full-fidelity result carried unchanged; no pre-shaping by the host) + -> [foreground] fan-out: + for each destination resolved from ResponseTarget: + -> clone HostedRunResult envelope (per-destination isolation; shallow copy) + -> optional channel response_hook (dev-supplied; default = identity) + -> hook receives ChannelResponseContext(request, channel_name, destination_identity, originating, is_echo) + -> hook may rebind result via HostedRunResult.replace(result=...) + (e.g. 
project a WorkflowRunResult to an AgentResponse for a text-only wire) + -> channel-native serialization (channel chooses what content types / outputs it can render) + -> channel.push(identity, shaped_payload) | originating return value + if ResponseTarget.echo_input is True: + each non-originating destination receives the user's input first + (synthesised as a HostedRunResult[AgentResponse] with a role="user" message), + then the agent reply. Both pushes execute inside the same scheduled + push task; an echo-push failure is logged and swallowed so the + response push on the same destination is still attempted. -> [background or response_target != originating] -> ContinuationToken returned immediately to originating channel -> target executes asynchronously - -> on completion, deliver to ResponseTarget via destination channel.push(...) + -> on completion, the same fan-out (clone + response_hook + push) applies -> ContinuationToken updated; available via host.get_continuation(token) and channel poll routes ``` +**Full-fidelity contract.** The host never collapses agent / workflow output. `HostedRunResult[TResult]` carries the target output unchanged: agent targets see the full `AgentResponse` (multi-modal `messages`, `value`, `usage_details`, `response_id`, …); workflow targets see the full `WorkflowRunResult` (per-executor outputs via `get_outputs()`, terminal state via `get_final_state()`). Each channel — through its `response_hook` and its own serializer — decides what subset its wire can carry. A text-only channel iterates `result.result.messages` (or projects the workflow's outputs into a single text turn via a response hook); a card-capable channel inspects the underlying contents directly. + +**Per-destination cloning.** Before invoking a channel's `response_hook`, the host clones the `HostedRunResult` envelope so one channel's `replace(result=...)` cannot leak into the payload another destination observes. 
The clone is shallow — channels that need to mutate `result` itself (rather than rebind it) own the deep copy. + +**`response_hook` is a channel-level convention, not part of the `Channel` Protocol.** Channels expose a `response_hook` attribute (callable accepting `(result, *, context: ChannelResponseContext) -> HostedRunResult[Any] | Awaitable[HostedRunResult[Any]]`). The host duck-types this attribute. Adding hook support to an existing channel package does not break the public `Channel` Protocol. + + A parallel **link ceremony flow** runs out-of-band when a user invokes the host-provided `link`/`connect` command on a channel: ```text @@ -1341,13 +1873,14 @@ channel /link command | Session resolution | Host core | based on `ChannelSession` + `ChannelRequest.session_mode`; storage specifics deferred | | Channel-native identity extraction | Channel package | populates `ChannelIdentity(channel, native_id, attributes)` per request | | Identity resolution (`native_id` → `isolation_key`) | Host core via `IdentityResolver` | default **auto-issues and persists** a per-user `isolation_key` on first contact per `(channel, native_id)`; user-supplied resolver can return app-owned identities directly | -| Identity store (`(channel, native_id) → isolation_key`) | Host core via `HostStateStore` | file-based by default in v1 (`FileHostStateStore`); pluggable for Cosmos / SQL / Redis in fast follow (req #23). Owns auto-issuance and atomic merge-on-link. | +| Identity store (`(channel, native_id) → isolation_key`) | Host core via `HostStateStore` | file-based by default in v1 (`FileHostStateStore`); pluggable for Cosmos / SQL / Redis in fast follow (req #24). Owns auto-issuance and atomic merge-on-link. 
| | Identity link ceremony (OAuth / MFA / one-time code) | Host core via `IdentityLinker` | linker contributes its own routes + lifecycle; channels surface a built-in `link`/`connect` command | +| Authorization (allowlist + link enforcement) | Host core via `host.authorize(...)` + per-channel `IdentityAllowlist` | tri-state allowlist evaluated pre- and post-link; combines with `require_link` to produce one of three named profiles (open / forced-link / allowlist); see [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam) | | Link & delivery policy across confidentiality tiers | Host core via `LinkPolicy` | consulted at link time (refuse incompatible link attempts) and at delivery time (drop incompatible `ResponseTarget` destinations); built-in policies cover all-allow, same-tier, explicit allow-list, deny-all | | Active-channel tracking | Host core | updated on every successfully resolved request; consumed by `ResponseTarget.active` | | Response-target resolution | Host core | translates `ResponseTarget` (originating, active, specific, list, all_linked, none) into an ordered set of `(channel, ChannelIdentity)` deliveries | | Proactive outbound delivery | Channel package via optional `ChannelPush` capability | channels that can push (Telegram, Activity Protocol via Bot Service, webhook, SSE) implement `push(identity, result)`; channels that can't are only valid as `originating` targets | -| Per-delivery audit + replay state | Host core writes the intent + status onto the assistant `Message.additional_properties["hosting"]["deliveries"]`; provider opts into in-place updates via `SupportsDeliveryTracking` for crash-safe lifecycle | Universal data model; live update is provider capability. See [Delivery tracking on assistant messages](#delivery-tracking-on-assistant-messages). 
| +| Per-delivery audit + replay state | Host core writes intent-only — the resolved destination set onto the assistant `Message.additional_properties["hosting"]["intended_targets"]` (immutable, single write). Operational state (attempts, retries, last error, success timestamp) lives in the `DurableTaskRunner` and is observed via the runner's own backend. | Replay across host restarts is a property of the configured runner (native for durable adapters; not supported for `InProcessTaskRunner`). See [Intended targets + durable delivery](#intended-targets--durable-delivery) and [Durable task runner](#durable-task-runner). | | Background-run lifecycle | Host core | owns `ContinuationToken` issuance, async execution, completion notification; persists via `HostStateStore` (file-based default — survives restarts) | | Run poll routes | Channel package | each channel exposes its own protocol-shaped poll route (`/responses/v1/{continuation_token}`, `/invocations/{continuation_token}`) backed by `host.get_continuation(token)` | | Conversation history (all channels — Responses, Invocations, Telegram, Activity Protocol, …) | Agent's core `HistoryProvider` (`agent_framework._sessions.HistoryProvider`) | Channels project their wire id (`previous_response_id`, `conversation_id`, request body `session_id`, host-tracked alias, …) into `ChannelSession.key`; the host resolves an `AgentSession` and the agent's `HistoryProvider` does the load / append. No channel-specific history seam. Multi-provider composition (with a single `load_messages=True`) is the standard AF convention; see [Conversation history for the Responses channel](#conversation-history-for-the-responses-channel) for the Foundry-backed variant. | @@ -1386,7 +1919,7 @@ Identity is an **orthogonal axis** (anonymous vs. identified). The realized cell 3. If `session_mode == "auto"` and no key is supplied, the host may create an ephemeral session. 4. 
If `session_mode == "required"`, the host must resolve or create a usable session before invoking the target. 5. **Cross-channel resolution rule:** when two channels mounted on the same `AgentFrameworkHost` produce the same `isolation_key` (and either both omit `key` or both produce equivalent keys derived from `isolation_key`), the host resolves them to the **same** `AgentSession`. This is the v1 mechanism for cross-channel chat continuity (e.g. Telegram → Teams against the same conversation history). The **canonical** path for translating a channel's native per-channel identifier (Telegram `chat_id`, Teams AAD object id, …) into the stable `isolation_key` is the host-level `IdentityResolver` (per-channel `run_hook` mapping is supported as a lower-level alternative). When the channel-native identity is not yet linked, the `IdentityLinker` runs a connect ceremony (OAuth, MFA, signed one-time code) to associate it with an existing `isolation_key`. -6. The first spec does **not** standardize a cross-package storage API; cross-host/cross-process continuity is deferred to the pluggable session store (req #23), which also persists identity-link grants beyond the host process lifetime. +6. The first spec does **not** standardize a cross-package storage API; cross-host/cross-process continuity is deferred to the pluggable session store (req #24), which also persists identity-link grants beyond the host process lifetime. 7. Responses and other conversation-aware channels may still own protocol-specific conversation/item storage above this layer. 8. **Session rotation (`reset_session`).** The host exposes `reset_session(isolation_key)` so **host-tracked** channels (see [Channel session-carriage models](#channel-session-carriage-models)) can implement "start a fresh thread" commands (e.g. Telegram `/new`). 
The default behavior **rotates the active session id alias** (`` → `#`) rather than deleting on-disk history: prior history remains addressable by its original session id while subsequent runs for that `isolation_key` resolve to a brand-new `AgentSession`. Apps that want destructive reset can layer that on top by calling into their own `HistoryProvider`. **Caller-supplied** channels do not call `reset_session`; their callers branch threads by sending a fresh / no `previous_response_id` (or equivalent) on the next request. @@ -1411,11 +1944,47 @@ When the host invokes the target, it does **not** pass the raw `ChannelRequest.i Round-trip is guaranteed by `Message.to_dict()` / `Message.from_dict()`. Future providers that key on protocol shape (e.g. a Responses `previous_response_id`-keyed store) can read this envelope to reconstruct cross-channel context without needing a separate channel-metadata sidecar. -`FoundryHostedAgentHistoryProvider` round-trips the entire `additional_properties["hosting"]` namespace (and any other AF-side namespace) through the Foundry response store via a single opaque `agent_framework` container key written onto each `OutputItem`. See [Foundry storage gap: `update_item`](#foundry-storage-gap-update_item) for the one part of the schema (post-push `deliveries[]` mutation) that depends on a service-side addition. - -### Delivery tracking on assistant messages - -The inbound envelope above captures **intent**. To support **audit** ("which destinations actually received this response, and when?") and **replay** ("Telegram was offline; resend to that user when it comes back"), the assistant `Message` produced by the host carries a parallel envelope that records the *resolved destination set* and per-destination outcome. 
+`FoundryHostedAgentHistoryProvider` round-trips the entire `additional_properties["hosting"]` namespace (and any other AF-side namespace) through the Foundry response store via a single opaque `agent_framework` container key written onto each `OutputItem`. Because the schema is now **intent-only** (no per-destination mutation after the initial write — see [Intended targets + durable delivery](#intended-targets--durable-delivery)), no service-side additions to the Foundry storage SDK are required for it to round-trip. + +### Intended targets + durable delivery + +The inbound envelope above captures the caller's **intent**. The assistant `Message` produced by the host carries a parallel envelope that records the *resolved destination set* — what the host actually intended to deliver to, after `ResponseTarget` resolution and `LinkPolicy` filtering. **This is a single write, never mutated.** Operational state for each push attempt (status, attempts, retries, last error, channel-issued id) lives in the [`DurableTaskRunner`](#durable-task-runner) — not on the message — because the runner is the component that performs and (when durable) retries the push. + +The shape of the fan-out — synchronous on the originating wire, scheduled via the runner for every non-originating destination — is the same in every multi-target scenario (`all_linked`, `active`, `channels([...])`, `identities([...])`): + +```mermaid +sequenceDiagram + autonumber + actor User + participant Tg as TelegramChannel<br/>(originating) + participant Host + participant Target as Agent + participant Runner as DurableTaskRunner + participant Annot as _annotate_intended_targets + participant Act as ActivityChannel<br/>(linked) + participant Resp as ResponsesChannel<br/>(linked) + + User->>Tg: message + Tg->>Host: ChannelRequest(<br/>identity=tg:12345,<br/>response_target=all_linked) + Host->>Host: resolve isolation_key + Host->>Target: run + Target-->>Host: AgentResponse + Host->>Annot: hosting.intended_targets =<br/>[tg, activity, responses] + + par originating — synchronous + Host->>Tg: response_hook → push (sync) + Tg-->>User: reply on Telegram + and non-originating — durable + Host->>Runner: schedule("hosting.push",<br/>payload for activity) + Runner->>Host: _handle_push_task(payload) + Host->>Act: response_hook → push + Act-->>User: reply in Teams (or wherever) + and + Host->>Runner: schedule("hosting.push",<br/>payload for responses) + Runner->>Host: _handle_push_task(payload) + Host->>Resp: response_hook → push + end +``` Schema on `Message.additional_properties["hosting"]` for a host-produced assistant message: @@ -1426,94 +1995,184 @@ Schema on `Message.additional_properties["hosting"]` for a host-produced assista "identity": { "channel": "telegram", "native_id": "12345", "attributes": {} }, "response_target": { "kind": "all_linked", "targets": [] } }, - "deliveries": [ - { - "destination": { "channel": "activity", "native_id": "29:abc..." }, - "status": "delivered", // pending | delivered | failed | skipped - "attempts": 1, - "first_attempt_at": "2026-04-29T08:31:11Z", - "last_attempt_at": "2026-04-29T08:31:11Z", - "last_error": null, - "delivery_id": "msg_018f..." // channel-issued id, when the channel returns one - }, + "intended_targets": [ + { "destination": { "channel": "activity", "native_id": "29:abc..." } }, + { "destination": { "channel": "telegram", "native_id": "12345" } } + ], + "skipped_targets": [ // optional — present only when LinkPolicy excluded something { - "destination": { "channel": "telegram", "native_id": "12345" }, - "status": "failed", - "attempts": 3, - "first_attempt_at": "2026-04-29T08:31:11Z", - "last_attempt_at": "2026-04-29T08:36:11Z", - "last_error": { "code": "channel_offline", "message": "Telegram getUpdates 502" }, - "delivery_id": null + "destination": { "channel": "corp-only", "native_id": "..." }, + "reason": "link_policy" // link_policy | no_push_capability } ] } ``` -Status values: +Lifecycle the host follows: -| Value | Meaning | -|---|---| -| `pending` | Host has resolved the destination but has not yet attempted (or is between attempts) `ChannelPush.push(...)`. | -| `delivered` | Push succeeded. `delivery_id` is populated when the destination channel returns a stable id. | -| `failed` | Push raised. `last_error` is populated. Eligible for replay.
| -| `skipped` | Destination was excluded by `LinkPolicy`, or the destination channel does not implement `ChannelPush`. Recorded so audit shows *why* a destination resolved by `ResponseTarget` did not receive the message. | +1. After `ResponseTarget` resolution and `LinkPolicy` filtering, the host writes the assistant `Message` **once**, with the resolved `intended_targets[]` (every destination it will attempt) and an optional `skipped_targets[]` for destinations dropped at resolution time (so audit can show *why* a resolved-by-`ResponseTarget` destination did not receive the message — `link_policy` or `no_push_capability`). This write is immutable. +2. For each non-originating destination, the host schedules a `"hosting.push"` task via the configured [`DurableTaskRunner`](#durable-task-runner). The runner is responsible for attempting, retrying per its `RetryPolicy`, and (for durable runners) surviving host restarts. The push handler resolves the channel, runs the channel's `response_hook`, and calls `ChannelPush.push(...)`. +3. Operational delivery state — attempt count, last error, success timestamp, channel-issued message id — lives in the runner's own log. Replay across host restarts is a property of the runner (native for durable runners; not supported for the in-process runner). Operators who want a queryable delivery dashboard can read it from their runner backend's observability surface (TaskHub, Foundry durable tasks, …) — the host does not project it back onto the message. -Lifecycle the host follows: +The originating destination (when `ResponseTarget` includes it) is **not** routed through the runner. It is rendered synchronously on the originating channel's wire; the host-internal `_deliver_response` helper returns `bool` (`True` if any push was scheduled / delivered, `False` otherwise) for the channel's own bookkeeping. 
Per-destination delivery outcomes are not collated back to the caller — durable runners surface them in their own logs / dashboards, and the in-process runner logs failures with structured fields. See [Built-in routes](#built-in-routes) for the synchronous return contract. -1. After `ResponseTarget` resolution and `LinkPolicy` filtering, **before** any push attempt, the host writes the assistant `Message` with one `deliveries[]` entry per destination, all `status="pending"` (excluded ones written as `"skipped"`). This guarantees the intent is durable across host crashes. -2. After each `ChannelPush.push(destination_identity, result)`, the host updates the matching `deliveries[]` entry in place — `status`, `attempts`, `first_attempt_at` (set on first attempt), `last_attempt_at`, `last_error`, `delivery_id`. -3. The mechanism for **retrying** failed deliveries (background worker, operator action, `host.retry_delivery(message_id, destination)`, …) is **out of scope** for this spec — it is enabled by the data model and tracked under Open Questions. +> **Why intent-only on the message, with operational state in the runner?** A single immutable write keeps the message store as the source of truth for "what the host intended", without requiring providers to implement in-place mutation (no `SupportsDeliveryTracking` capability, no Foundry `update_item` service ask). Per-destination retry, replay, and failure surfacing become responsibilities of the runner, which is the right component because it owns the work queue. Operators who already use a durable runner (TaskHub, Foundry durable tasks) get observability through the runner's existing tooling rather than through a parallel ETL on the message store. -#### `SupportsDeliveryTracking` provider capability +### Durable task runner -Updating a stored message in place is provider-specific. 
The shape above is universal; the *update semantics* are opt-in: +The host delegates non-originating push fan-out — and, in v1 fast-follow, background runs — to a pluggable `DurableTaskRunner`. The runner is the component that owns "this work needs to happen; retry on failure; survive (or don't survive) restarts depending on which runner you chose". Channel packages never see it directly; they just implement `ChannelPush.push(...)`. ```python -from typing import Protocol, Sequence +from typing import Protocol, Callable, Awaitable, Mapping, Any, Literal +from dataclasses import dataclass + +@dataclass(frozen=True) +class RetryPolicy: + max_attempts: int = 5 + initial_backoff_seconds: float = 1.0 + backoff_multiplier: float = 2.0 + max_backoff_seconds: float = 60.0 -class SupportsDeliveryTracking(Protocol): - async def update_deliveries( +@dataclass(frozen=True) +class TaskHandle: + task_id: str # opaque, runner-issued + name: str # the registered handler name + +TaskStatus = Literal["scheduled", "running", "succeeded", "failed", "cancelled"] + +class DurableTaskRunner(Protocol): + def register( self, - *, - session_id: str, - message_id: str, - deliveries: Sequence[Mapping[str, Any]], + name: str, + handler: Callable[[Mapping[str, Any]], Awaitable[None]], ) -> None: ... + + async def schedule( + self, + name: str, + payload: Mapping[str, Any], + *, + retry_policy: RetryPolicy | None = None, + ) -> TaskHandle: ... + + async def get(self, handle: TaskHandle) -> TaskStatus | None: ... ``` -| Provider | `SupportsDeliveryTracking`? | Behavior | +The host registers an internal handler `"hosting.push"` at startup. Each non-originating destination becomes a single `runner.schedule("hosting.push", payload)` call. The handler: + +1. Resolves the channel from `payload["channel_id"]`. +2. Clones the `HostedRunResult` and runs the channel's `response_hook` (if any). +3. Calls `ChannelPush.push(identity, shaped_result)`. +4. Returns normally on success. 
On exception, the runner records the failure and either schedules a retry per `RetryPolicy` or marks the task `failed` (terminal). + +Built-in runner shipped in core: + +| Runner | Persistence | Replay across restarts | Default for | +|---|---|---|---| +| `InProcessTaskRunner` | None — `asyncio.create_task` + in-process retry | No (in-flight tasks lost on process death) | `runtime_mode="long_running"` | + +Adapter packages (deferred to v1 Fast Follow; no runtime dep from core): + +| Package | Backend | Notes | |---|---|---| -| `FileHistoryProvider` (append-only JSONL) | No (capability not implemented) | Host writes the assistant `Message` **once**, at the end of the delivery cycle, with terminal `deliveries[]`. Pre-attempt `pending` snapshot is not durable; a host crash mid-delivery loses per-destination state for in-flight pushes. **Audit-complete, replay-best-effort.** | -| `FoundryHostedAgentHistoryProvider` (Foundry response store) | **Partial — initial-write only** (see [Foundry storage gap](#foundry-storage-gap-update_item) below) | Inbound envelope (`channel`/`identity`/`response_target`) and **initial-write `deliveries[]` snapshot** (all `pending`, plus any `skipped`) round-trip through the Foundry response store unchanged via the `agent_framework` extras container key the provider writes onto each `OutputItem`. **Per-destination updates after each push attempt are not durable** because the Foundry storage SDK does not yet expose a way to mutate an individual stored history item. Behaves as `FileHistoryProvider` in that regard until the [service ask](#foundry-storage-gap-update_item) lands. **Audit-complete-on-write, replay-best-effort.** | -| Cosmos / SQL providers (when introduced) | Expected to implement | Same as above. | +| `agent-framework-hosting-durabletask` | `agent-framework-durabletask` (gRPC TaskHub) | Suits `ephemeral` deployments that already run a Durable Task sidecar. 
| +| `agent-framework-hosting-foundry` (extension) | Foundry durable-task API | Deferred until the FHA durable-task surface is finalized. | +| (possibly) SQLite-outbox runner | SQLite under the existing `HostStateStore` root | Lowest-dep "survives single-node restart" option for ephemeral hosts without an external sidecar. | + +Default selection follows [Runtime modes](#runtime-modes). `long_running` defaults to `InProcessTaskRunner`. `ephemeral` is **strict**: if `durable_task_runner` is not configured and `allow_in_process_runner=True` is not opted in, the host raises `RuntimeError` at construction — falling back to the in-process runner in an ephemeral environment would silently drop in-flight pushes on the next scale-to-zero. The `allow_in_process_runner=True` escape hatch is intentionally noisy (warning) and meant for local dev / smoke tests. + +#### Codec contract for durable serialisation + +When a `DurableTaskRunner` is configured for a deployment that uses out-of-process scheduling (e.g. a sidecar / gRPC TaskHub), task payloads must be **JSON-serialisable** end to end. Two pieces of the contract enforce this: + +- **`DurableTaskRunner.payload_mode`** — a class-level attribute declared by each runner implementation: + - `OBJECT` — the in-process runner; payloads pass Python objects by reference. No serialisation required. + - `JSON` — out-of-process runners; payloads must round-trip through JSON. +- **`ChannelPushCodec`** — a Protocol exposed by push-capable channels whose payloads are not natively JSON-serialisable. The codec defines `encode(payload) -> Mapping[str, Any]` / `decode(envelope) -> Any` so the channel owns the over-the-wire shape of its push payloads. Channels without exotic payloads can leave the codec unset and rely on the host's default `dataclasses.asdict`-style encode. 
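The codec half of this contract can be illustrated with a small sketch. It is not the package's implementation: `CardPayload`, `CardCodec`, and `round_trip` are hypothetical names invented here, and only the `encode` / `decode` method shapes are taken from the text above. The sketch assumes a channel whose push payload carries raw bytes, which JSON cannot represent directly.

```python
import json
from dataclasses import dataclass
from typing import Any, Mapping, Protocol


class ChannelPushCodec(Protocol):
    """Method shapes as described in the spec text; everything else is assumed."""

    def encode(self, payload: Any) -> Mapping[str, Any]: ...

    def decode(self, envelope: Mapping[str, Any]) -> Any: ...


@dataclass(frozen=True)
class CardPayload:
    """Hypothetical channel-specific payload that is not JSON-safe as-is."""

    title: str
    body: bytes


class CardCodec:
    """Hypothetical codec: the channel owns the over-the-wire shape."""

    def encode(self, payload: CardPayload) -> Mapping[str, Any]:
        # bytes -> hex string so the envelope survives json.dumps
        return {"title": payload.title, "body_hex": payload.body.hex()}

    def decode(self, envelope: Mapping[str, Any]) -> CardPayload:
        return CardPayload(
            title=envelope["title"],
            body=bytes.fromhex(envelope["body_hex"]),
        )


def round_trip(codec: ChannelPushCodec, payload: Any) -> Any:
    """Simulate what a JSON-mode runner does: serialise, persist, deserialise."""
    wire = json.dumps(codec.encode(payload))
    return codec.decode(json.loads(wire))
```

A channel whose payloads are already plain dataclasses of JSON-safe fields would not need a class like `CardCodec` and could rely on the host's default encode instead.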
+ +At construction the host runs `_validate_runner_codec_pairing`: if the configured runner declares `payload_mode == JSON` and any push-capable channel does not expose a codec, the host raises `ChannelConfigurationError` so the misconfiguration is caught before traffic. On the consumer side `_handle_push_task` accepts both `OBJECT`-mode (in-memory object) and `JSON`-mode (`{"type": "push", ...}` envelope) shapes so the same handler serves both runner backends. + +```mermaid +sequenceDiagram + autonumber + participant Host + participant Codec as ChannelPushCodec<br/>(on the push channel) + participant Runner as Durable runner<br/>(payload_mode=JSON) + participant Worker as Runner worker<br/>(may run after host scaled to zero) + participant Channel as Push channel + participant External as External service<br/>(Telegram / Bot Framework / …) + + Note over Host,Runner: construction-time:<br/>_validate_runner_codec_pairing<br/>(refuse JSON runner + codec-less channel) + + Host->>Codec: encode(payload) → JSON-safe Mapping + Host->>Runner: schedule("hosting.push",<br/>{"type": "push", "channel": "tg",<br/>"payload": }) + Runner->>Runner: persist task + Runner-->>Host: TaskHandle + Host-->>Host: synchronous return path<br/>(originating already delivered) + + Note over Worker: ... host may scale to zero ... + + Worker->>Runner: dequeue task + Worker->>Host: invoke "hosting.push" handler<br/>(JSON envelope) + Host->>Host: _handle_push_task<br/>detect envelope shape (OBJECT or JSON) + Host->>Codec: decode(payload) → in-memory object + Host->>Channel: ChannelPush.push(identity, result) + Channel->>External: native API call + alt success + External-->>Channel: ok + Channel-->>Worker: handler returns + Worker->>Runner: mark task succeeded + else transient failure + External-->>Channel: 5xx + Channel-->>Worker: raise + Worker->>Runner: retry per RetryPolicy + else terminal failure + Worker->>Runner: max_attempts → mark failed<br/>(log only) + end +``` -Providers that omit the capability are still valid hosts for any `ResponseTarget` configuration — they just cannot offer durable replay. The host detects the capability with `isinstance(provider, SupportsDeliveryTracking)` and degrades to write-once when absent. +#### In-process runner shutdown drain -> **Why on the message and not in a separate delivery log?** Two reasons. First, the message store is the single source of truth for an assistant turn; piggy-backing on it avoids a second consistency boundary between "message written" and "delivery scheduled". Second, any operator who wants a queryable delivery dashboard can ETL the array out of `additional_properties["hosting"]["deliveries"]` into their preferred outbox/log store — the on-message form does not preclude that. The spec commits only to the on-message shape; outbox layers are an implementation choice. +`InProcessTaskRunner` ships a two-phase shutdown driven by `shutdown_grace_seconds` (default `5.0`): -#### Foundry storage gap: `update_item` +1. After lifespan shutdown signals, in-flight `"hosting.push"` tasks are given the grace period to finish — during which retries keep happening — so a clean Ctrl-C does not abandon work that is one network call away from completing. +2. When the grace expires, remaining tasks are cancelled and their `CancelledError` is swallowed (not logged as a failure — it is the expected shutdown shape). -`FoundryHostedAgentHistoryProvider` round-trips arbitrary `Message.additional_properties` namespaces through the Foundry response store as opaque JSON via a single `agent_framework` container key on each `OutputItem` (see `_shared.py:_collect_af_extras` / `_inject_af_extras` / `_attach_extras`). This makes the **initial-write** parts of the schema above durable: +This is purely operational hygiene for the `long_running` default; durable adapters get this behaviour for free from their backends.
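The two-phase drain described above can be sketched with plain `asyncio`. This is an illustration under assumptions, not the `InProcessTaskRunner` code: `drain` is a hypothetical helper name, and the sketch leaves out retries and logging. Only the `shutdown_grace_seconds` name and the wait-then-cancel shape come from the text.

```python
import asyncio


async def drain(
    tasks: set[asyncio.Task],
    shutdown_grace_seconds: float = 5.0,
) -> int:
    """Hypothetical two-phase drain; returns how many tasks were cancelled."""
    if not tasks:
        return 0

    # Phase 1: give in-flight tasks the grace period to finish on their own.
    _done, pending = await asyncio.wait(tasks, timeout=shutdown_grace_seconds)

    # Phase 2: cancel whatever is still running once the grace expires.
    for task in pending:
        task.cancel()
    if pending:
        # return_exceptions=True collects CancelledError instead of raising it,
        # so cancellation is not reported as a failure.
        await asyncio.gather(*pending, return_exceptions=True)
    return len(pending)
```

Using `asyncio.wait` with a timeout rather than `asyncio.wait_for` keeps the still-pending tasks addressable, so each one can be cancelled individually after the grace period.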
-- The inbound `hosting` envelope (`channel`, `identity`, `response_target`) on user messages. -- The initial-write `deliveries[]` snapshot on assistant messages (all entries `pending` or `skipped`, written before the first push attempt). +#### Echo idempotency on retry -What is **not yet** durable through this provider is **post-push mutation** of an individual stored item. The `azure.ai.agentserver.responses.store.FoundryStorageProvider` SDK exposes `create_response`, `get_response`, `update_response`, `delete_response`, `get_input_items`, `get_items`, and `get_history_item_ids` — but no `update_item` / PATCH on a single history item. So when the host updates an entry in `deliveries[]` after `ChannelPush.push()` returns (`status` → `delivered`/`failed`, `attempts`, timestamps, `last_error`, `delivery_id`), there is no way to push that mutation back into the per-item storage row. +When `ResponseTarget.channel(name, echo_input=True)` is set, the host packages an echo (`role="user"`) push *and* the agent reply (`role="assistant"`) into the same `"hosting.push"` task per non-originating destination. The handler tracks an `echo_done` cursor on the task state and short-circuits the echo phase on retry: a retry that fires after the echo succeeded but before the response push completed will not double-echo the user's message. The cursor lives on the runner-owned task state, not the message — same principle as the broader "intent only on the message, operational state in the runner" rule. -**Workarounds and trade-offs:** +```mermaid +sequenceDiagram + autonumber + participant Runner + participant Host + participant Channel + participant External -| Option | Trade-off | -|---|---| -| Encode `deliveries[]` on the *response object* (under `agent_framework`) instead of the assistant *item*, and use `update_response` to mutate it. | Works today, but deliveries are no longer co-located with the assistant message — schema for the `Message` round-trip becomes provider-specific. 
| -| Delete + recreate the assistant item with the updated body. | Likely loses the `previous_response_id` chain pointer, breaks subsequent `get_history_item_ids` walks, and re-stamps the storage `id` (audit-trail noise). | -| Wait for Foundry storage to add `update_item`. | Cleanest end-state. **This is the recommended path.** | + Runner->>Host: _handle_push_task(<br/>echo, response,<br/>state={echo_done: False}) + Host->>Host: read echo_done → False + Host->>Channel: response_hook(echo, is_echo=True) + Channel->>External: push user message + External-->>Channel: ok + Host->>Runner: persist state.echo_done = True + + Host->>Channel: response_hook(response, is_echo=False) + Channel->>External: push assistant reply + External-->>Channel: 5xx (transient) + Channel-->>Host: raise -**Service ask for the FoundryHostedAgent / Foundry response store team:** + Runner->>Runner: retry per RetryPolicy<br/>(backoff) -- Add `update_item(item_id, item_body, *, isolation: IsolationContext | None = None) -> None` (PATCH semantics) to `azure.ai.agentserver.responses.store.FoundryStorageProvider` and the underlying `POST/PATCH /storage/items/{item_id}` REST surface. -- Required because the Hosting spec's per-destination delivery-tracking lifecycle (`pending → delivered`/`failed`/`skipped`) needs to mutate an individual stored item after the first push attempt completes. -- Without it, the FoundryHostedAgentHistoryProvider's `SupportsDeliveryTracking` implementation is permanently stuck at "write-once-best-effort" and durable replay through Foundry storage is unreachable. -- Existing `update_response` is not sufficient because deliveries belong on the **assistant `Message`** (so they round-trip with the message into any provider that consumes the standard `Message` schema), not on the response envelope. + Runner->>Host: _handle_push_task(<br/>echo, response,<br/>state={echo_done: True}) + Host->>Host: read echo_done → True (skip echo) + Host->>Channel: response_hook(response, is_echo=False) + Channel->>External: push assistant reply + External-->>Channel: ok + Host-->>Runner: handler returns → succeeded +``` ## Reference and Parity Plan @@ -1548,7 +2207,7 @@ The new core sits **below** the conceptual boundary of today's top-level Respons | `FoundryHostedAgentHistoryProvider` (in `agent-framework-foundry-hosting`, built on `azure.ai.agentserver.responses.store._foundry_provider.FoundryStorageProvider`) | Agent Framework Foundry | TBD | Proposed v1 deliverable so Foundry-defined (and any other) agents can use Foundry's response store as a `HistoryProvider` through the new host. Implements the standard core `HistoryProvider` Protocol — usable from any channel, no Responses-specific Protocol. Owns a runtime dep on `azure.ai.agentserver` for the storage SDK. | | PR #5393 Telegram sample (commands, polling/webhook patterns) | Agent Framework | PR author | Reference-only; informs `ChannelCommand` and `TelegramChannel` design | | Telegram Bot API SDK | External | n/a | Committed (runtime dep of `agent-framework-hosting-telegram`) | -| `microsoft/teams.py` SDK (`microsoft-teams-apps`, `microsoft-teams-api`, `microsoft-teams-cards`) | External (MIT, Microsoft) | n/a | Proposed runtime dep of `agent-framework-hosting-teams` (req #28). The SDK already ships a "Build an agent using Microsoft Agent Framework" guide and a pluggable `HttpServerAdapter`, so the hosting package mounts the SDK's `App` into the host's Starlette app and reuses its Adaptive Cards / Streaming / Citations / Feedback / Suggested-prompts / Dialogs / Message-Extensions / SSO surface instead of re-implementing them. | +| `microsoft/teams.py` SDK (`microsoft-teams-apps`, `microsoft-teams-api`, `microsoft-teams-cards`) | External (MIT, Microsoft) | n/a | Proposed runtime dep of `agent-framework-hosting-teams` (req #29).
The SDK already ships a "Build an agent using Microsoft Agent Framework" guide and a pluggable `HttpServerAdapter`, so the hosting package mounts the SDK's `App` into the host's Starlette app and reuses its Adaptive Cards / Streaming / Citations / Feedback / Suggested-prompts / Dialogs / Message-Extensions / SSO surface instead of re-implementing them. | | `agent-framework-ag-ui`, `-a2a`, `-devui` | Agent Framework | various | Out of scope for first implementation; future convergence kept as a possibility | ## Open Questions @@ -1560,10 +2219,9 @@ The new core sits **below** the conceptual boundary of today's top-level Respons | 8 | Should command scopes / projection metadata become first-class — e.g. private-chat-only vs group-chat-visible commands, or per-locale descriptions? | Eng / PM | Telegram's `BotCommandScope` and `language_code` would need to be representable cross-channel. | | 10 | Is "Channel" the GA name? "Head" was used interchangeably during design discussions. | PM | "Channel" chosen for the spec; confirm before public docs. | | 12 | Should `ChannelRequest.session_mode` grow additional values (e.g. `"shared"` for multi-channel session sharing) or stay closed at three? | Eng | The taxonomy needs a **dedicated design exercise** covering all known channel session-shape patterns; revisit after that exercise. | -| 14 | Where do issued link grants live — short-lived in-memory state on the host, the same pluggable session store (#23), or a separate identity store? | Eng | Resolved as part of the **`HostStateStore`** seam (see [Host state storage](#host-state-storage)). Link grants live alongside continuation tokens and last-seen records in the v1 file-based default (`FileHostStateStore` → `link_grants/` namespace, 15min TTL). Pluggable Cosmos / SQL / Redis adapters tracked in req #23. 
**→ Move to Resolved Questions in next pass.** | +| 14 | Where do issued link grants live — short-lived in-memory state on the host, the same pluggable session store (#24), or a separate identity store? | Eng | Resolved as part of the **`HostStateStore`** seam (see [Host state storage](#host-state-storage)). Link grants live alongside continuation tokens and last-seen records in the v1 file-based default (`FileHostStateStore` → `link_grants/` namespace, 15min TTL). Pluggable Cosmos / SQL / Redis adapters tracked in req #24. **→ Move to Resolved Questions in next pass.** | | 17 | Should `ResponseTarget.active` honor a configurable **time window** (last seen within N minutes) and what is the fallback when the window has expired before the response is ready — `originating`, `all_linked`, drop with `ContinuationToken` `status="failed"`? | PM / Eng | Likely yes with sensible default (e.g. 24h fall back to `originating`); per-request override via the run hook. | | 22 | For the Responses WebSocket transport, what subprotocol identifier (if any) should be advertised on the `Upgrade` and how is auth conveyed — `Authorization` header on the upgrade, a `Sec-WebSocket-Protocol` token, or a query-string-bound short-lived token? | Eng / PM | Aligning with whatever OpenAI ships for Responses WS is preferable; keep the codec swappable so the channel can track upstream changes without breaking the host contract. | -| 27 | What is the retention contract for completed `deliveries[]` entries — keep forever for audit, GC after the message itself ages out, or cap per-message at a fixed attempt count? Should `last_error` payloads be redacted to a code/message pair to avoid logging PII from the underlying channel SDK? | Eng / Compliance | Suggest "lifetime equals message lifetime" + redacted error shape (`{code, message}` only, no provider stack frames or payload echoes) as the default; revisit when the persistent store contract lands. 
| ### Resolved Questions (decisions log) @@ -1573,7 +2231,7 @@ Original numbering preserved so external references (checkpoints, ADR cross-link |---|---|---| | 1 | Final distribution package names? | `agent-framework-hosting` with suffixes (`-responses`, `-invocations`, `-telegram`, …). Public imports stay at `agent_framework.hosting`. | | 2 | `uvicorn` required vs optional extra? | Use **hypercorn** instead of uvicorn; the `serve` extra remains optional. `host.app` is still the canonical server-agnostic ASGI surface. | -| 3 | Keep `HostedRunResult` wrapper or return `AgentResponse` directly? | **Keep `HostedRunResult`.** It wraps both `AgentRunResult` *and* the unknown output type of a `Workflow`, and adds host-run metadata (resolved session, etc.). | +| 3 | Keep `HostedRunResult` wrapper or return `AgentResponse` directly? | **Keep `HostedRunResult`,** now shaped as a **generic typed envelope `HostedRunResult[TResult]`** (see Q31). It wraps both `AgentResponse` *and* `WorkflowRunResult`, and carries host-run metadata (resolved `session`) alongside the full-fidelity target output. | | 4 | Where do generic auth helpers live? | Only the **mechanisms** live in core. Concrete implementations sit in their own packages when they pull dependencies; dep-free helpers may live in `hosting`. | | 7 | `protocol_request` typed (`Any`) or typed kwargs? | **Keep `Any`.** | | 9 | Allow nested routers / `path=""`? | **Yes.** The host developer is responsible for ensuring routes do not overlap. | @@ -1581,22 +2239,38 @@ Original numbering preserved so external references (checkpoints, ADR cross-link | 13 | Which identity linkers ship in phase 1? | **Entra linker** (in the Entra package) + **one-time-code linker** (in core). Drop MFA for now; investigating additional linkers tracked as a follow-up. | | 15 | Identity resolver invoked once on host vs per channel? | **Once on the host** with `ChannelIdentity(channel, native_id, ...)`. 
| | 16 | Should `IdentityLinker` and `Channel` share a base `Contributor` protocol? | **A linker *is* a Channel — specialised.** Use the single Channel-shaped contract; collapse `IdentityLinker` into a Channel specialisation. | -| 18 | Contract for `ChannelPush` failures? | **Annotate the failure on the relevant `deliveries[]` entry** in the data model (see §"Delivery tracking on assistant messages"). Re-delivery is future work (Q26). | +| 18 | Contract for `ChannelPush` failures? | **The `DurableTaskRunner` owns retry and final-failure semantics**, per its `RetryPolicy`. Push handler exceptions are caught by the runner, which retries with backoff and ultimately marks the task `failed` when `max_attempts` is exhausted. Downstream push outcomes live in the runner's own log — there is no per-destination status surfaced on the message and no synchronous failure object returned to the caller. The host's internal `_deliver_response` helper returns `bool` (whether any work was scheduled) for the originating channel; observability for downstream pushes comes from the runner backend (TaskHub, Foundry durable tasks, log fields on `InProcessTaskRunner`). The earlier `DeliveryReport` value type has been removed. See [Intended targets + durable delivery](#intended-targets--durable-delivery) and [Durable task runner](#durable-task-runner). | | 19 | `host.run_in_background(...)` `notify` callback? | Programmatic non-channel delivery will be expressed via the **`continuation_token`** mechanism (see Q20), not a separate `notify` callback. | | 20 | Storage / TTL of `ContinuationToken`s? | **Done in this revision.** `ContinuationToken` is the type, with an opaque `token: str` field that channels surface to callers; equivalent continuation-token support is added to the **Invocations channel** alongside the existing Responses behaviour. 
Push-capable channels can still use it; default behaviour remains "push on completion", but the developer can choose other UX (poll-after-push, hybrid, …). Persistence is the **`HostStateStore`** seam — v1 default is **`FileHostStateStore`** (atomic JSON writes, 24h TTL on completed entries), so background runs survive host restarts. | -| 21 | Partial-failure surfacing for `all_linked`? | **Handled by the `deliveries[]` array** in the data model, updated per-destination as each push attempt completes. | +| 21 | Partial-failure surfacing for `all_linked`? | **Runner-only.** Originating-destination outcome is rendered synchronously on the originating channel's wire; the host's `_deliver_response` helper returns `bool` for the channel's own bookkeeping. Non-originating destinations are scheduled as `"hosting.push"` tasks on the `DurableTaskRunner`; per-task outcome (success / retried / terminal-failure) is observable via the runner's backend (TaskHub, Foundry durable tasks, structured log fields on `InProcessTaskRunner`). The host does not collate per-destination status back onto the message and no longer emits a `DeliveryReport`. | | 23 | Share one backing store contract for host-level vs `ContextProvider`? | **Stay separate protocols** (current draft direction confirmed). A deployment may still bind both onto the same physical backend. | | 24 | Where does the Foundry history provider live? | Tentative name **`FoundryHostedAgentHistoryProvider`**, in the **`foundry-hosting`** package (shares the dependency). Confirm with Foundry package owners before launch. | | 25 | `Channel.confidentiality_tier` opaque vs enum? | Keep as `str?` for now; can revisit before Release. | -| 26 | Where does the delivery-replay mechanism live? | **In the Host**, but **out of scope for v1.** The on-message `deliveries[]` envelope is sufficient input for any future replayer. | +| 26 | Where does the delivery-replay mechanism live? 
| **In the `DurableTaskRunner`.** Durable adapters (TaskHub, Foundry durable tasks) provide retry-with-backoff and survive host restarts natively — replay is "the runner keeps retrying until `max_attempts` is exhausted or the push succeeds". The built-in `InProcessTaskRunner` retries within the process but does **not** survive restarts (in-flight tasks are lost). Operator-driven replay (`host.replay(task_handle)`) is out of scope for v1; the runner's own surface is sufficient for the common case. | +| 28 | Should the host collapse agent / workflow output to text? | **No.** `HostedRunResult[TResult]` carries the target output **unchanged** — full `AgentResponse` (with its multi-modal `messages`, `value`, `usage_details`) for agent targets, full `WorkflowRunResult` (with its `get_outputs()` / `get_final_state()`) for workflow targets. Channels decide what subset their wire renders; a `response_hook` may rebind `result` (e.g. project a workflow output into an `AgentResponse` for a text-only wire) via `HostedRunResult.replace(result=...)`. The host never loses fidelity it has, and never restricts modality. | +| 29 | How do channels do per-destination post-processing (text flattening, card rendering, citation attachment) without breaking the `Channel` Protocol? | **Channels expose a `response_hook` instance attribute** (callable accepting `(result, *, context: ChannelResponseContext) -> HostedRunResult[Any] \| Awaitable[HostedRunResult[Any]]`). The host duck-types this attribute and applies it on a per-destination clone of the `HostedRunResult` envelope before push. The `Channel` Protocol stays a small `name / path / contribute` contract — adding hook support to a new channel does not require Protocol changes. | +| 30 | Should non-originating destinations also see the user's input message, not just the agent reply? | **Opt-in via `ResponseTarget.channel(name, echo_input=True)`** (and the same kwarg on `.channels([...])` / `.identities([...])`). 
The host synthesises a `HostedRunResult[AgentResponse]` wrapping the user's input as a `role="user"` message and bundles it into the same scheduled push task as the agent reply per non-originating destination; the echo is dispatched first inside the task and an echo-push failure is logged and swallowed so the response push on the same destination is still attempted. Channels can transform or drop echoes via their `response_hook` (which receives `is_echo=True` for the echo phase). | +| 31 | Should `HostedRunResult` be flattened (text / messages) or carry the full target output? | **Carry the full target output, generically typed.** `HostedRunResult[TResult]` exposes a single `result: TResult` field — `AgentResponse` for agent targets, `WorkflowRunResult` for workflow targets — plus an optional `session: AgentSession \| None`. Earlier drafts carried a flattened `messages: list[Message]` projection alongside `raw_response`; this lost workflow-specific affordances (`get_outputs()`, `get_final_state()`, structured per-executor payloads) and forced the host to pre-shape data only some channels needed. The generic envelope keeps the host modality-agnostic, lets channels read the canonical accessor on the underlying type (`result.messages`, `result.value`, `result.get_outputs()`, …), and gives channel authors static typing where they want it. | +| 32 | Should authorization (per-channel allowlist) ship as a single `auth_mode` enum or as two orthogonal parameters? | **Two orthogonal parameters (`require_link: bool` + `allowlist: IdentityAllowlist \| Literal["inherit"] \| None = "inherit"`)** plus named `AuthPolicy` factories for the three common combinations. 
A single enum collapses `require_link` and `allowlist` into one axis and cannot express the Mixed profile (`AnyOfAllowlists(NativeIdAllowlist, LinkedClaimAllowlist)` with `require_link=False` — native ids bypass auth, everyone else is funneled into linking) without re-introducing per-value sub-parameters that would defeat the point. Composition is built on a **tri-state `AllowlistDecision` (`ALLOW` / `DENY` / `ABSTAIN`)** rather than a boolean, because boolean composition cannot distinguish "claim allowlist denies you" from "claim allowlist hasn't seen any claims yet" — a critical distinction for the Mixed profile. `LinkedClaimAllowlist` is rejected at host startup if no source of verified claims is available (config validator, fail-fast), preventing the silent-deny-everyone footgun. Group-chat denials apply the same DM-redirect pattern as `LinkChallenge` (short generic refusal in-room, fuller `user_message` in DM, structured `log_details` only in logs). Shipping in two waves: the Protocol + `NativeIdAllowlist` + config validator ship with the next core PR; full `host.authorize(...)` pipeline + `LinkedClaimAllowlist` enforcement land with the `IdentityLinker` core PR. See [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam). | +| 33 | How does the host decide whether it is running long-running vs ephemeral? | **Single `runtime_mode` parameter on `AgentFrameworkHost`**, defaulting to `None` for auto-detection. Auto-detect inspects known deployment markers (`FOUNDRY_HOSTING_ENVIRONMENT`, `AZURE_FUNCTIONS_ENVIRONMENT`, `AWS_LAMBDA_FUNCTION_NAME`) and picks `"ephemeral"` on the first hit; otherwise falls back to `"long_running"` (sensible local-dev / always-on default). The mode is **advisory** — it drives *defaults* for `HostStateStore`, `DurableTaskRunner`, identity-link state, and similar seams, but every individual choice remains overridable. Detected mode is logged at startup so misdetection is visible. 
See [Runtime modes](#runtime-modes). | +| 34 | How does delivery to non-originating destinations actually happen — synchronously in the originating request handler, or out-of-band? | **Out-of-band via a `DurableTaskRunner`.** The host registers an internal handler `"hosting.push"` at startup; each non-originating destination becomes a single `runner.schedule("hosting.push", payload)` call. The originating destination (when `ResponseTarget` includes it) is **still rendered synchronously** on the originating channel's wire — only fan-out goes through the runner. Default runner is `InProcessTaskRunner` (asyncio + bounded retry, no cross-restart persistence — suitable for `long_running`). Durable adapter packages (`agent-framework-hosting-durabletask`, future Foundry adapter) plug into the same Protocol for `ephemeral` deployments. See [Durable task runner](#durable-task-runner). | +| 35 | What is the audit shape on the assistant message — full per-destination state machine, or intent only? | **Intent only.** `Message.additional_properties["hosting"]["intended_targets"]` is a single immutable write that records the resolved destination set (after `ResponseTarget` + `LinkPolicy` filtering). Operational state — attempt count, last error, success timestamp, channel-issued id — lives in the `DurableTaskRunner` and is observed via the runner's backend. This eliminates the previous `deliveries[]` status state machine (`pending`/`delivered`/`failed`/`skipped`), the `SupportsDeliveryTracking` provider capability, and the Foundry `update_item` service ask. See [Intended targets + durable delivery](#intended-targets--durable-delivery). | +| 36 | What happens when `runtime_mode="ephemeral"` and no `durable_task_runner` is configured? | **Raise at construction.** Silently falling back to `InProcessTaskRunner` in an ephemeral environment would drop every in-flight push on the next scale-to-zero — a footgun. 
The host raises `RuntimeError` unless `allow_in_process_runner=True` is opted in (warning logged). The opt-in is intended for local-dev / smoke tests where the developer accepts the in-flight loss. See [Durable task runner](#durable-task-runner). | +| 37 | What is the wire contract for push payloads under a durable (out-of-process) runner? | **A two-piece contract.** Each `DurableTaskRunner` declares its `payload_mode` (`OBJECT` for in-process pass-by-reference; `JSON` for runners that round-trip through JSON). Channels that ship non-JSON-native payloads expose a `ChannelPushCodec` (`encode` / `decode`). At construction the host runs `_validate_runner_codec_pairing` and refuses a `JSON`-mode runner paired with codec-less push channels. The push handler accepts both `OBJECT` and `JSON` envelope shapes so the same handler serves both runner backends. See [Codec contract for durable serialisation](#codec-contract-for-durable-serialisation). | +| 38 | Should `DeliveryReport` remain as a per-destination return value? | **No — removed.** Operational state lives in the runner; observability comes from the runner's backend (TaskHub, Foundry durable tasks, structured log fields on `InProcessTaskRunner`). The host's internal `_deliver_response` helper now returns `bool` (whether any work was scheduled / delivered) for the originating channel's own bookkeeping. Removing the value type collapses the public surface and removes a coupling point that would have needed a "schedule-time failure" subtype to round-trip durable failures back to the caller — failures live where they originate (the runner), not on a parallel object passed back through the synchronous return. | +| 39 | How is double-echo avoided when a push task retries after the echo phase succeeded but the response phase failed? 
| **An `echo_done` cursor on the runner-owned task state.** When `echo_input=True`, the `"hosting.push"` handler packages both the echo (`role="user"`) and the assistant reply into the same task; on the first attempt the handler dispatches the echo, sets `echo_done=True` on the task state, and then dispatches the reply. A retry that fires after the echo succeeded but the reply failed reads the cursor and short-circuits the echo phase. The cursor lives in the runner — same principle as the broader "intent only on the message, operational state in the runner" rule. See [Echo idempotency on retry](#echo-idempotency-on-retry). | +| 40 | What happens to in-flight `"hosting.push"` tasks on a clean `InProcessTaskRunner` shutdown? | **Two-phase drain.** A `shutdown_grace_seconds` window (default `5.0`) lets in-flight retries finish; remaining tasks are then cancelled and `CancelledError` is swallowed (not logged as a failure — it is the expected shutdown shape). Operators with longer worst-case retry chains can extend the grace via the constructor. Durable adapters get equivalent behaviour from their backends. See [In-process runner shutdown drain](#in-process-runner-shutdown-drain). | ### Decisions-driven follow-ups The following resolutions imply prose / API edits elsewhere in the spec body (not just the table above). Captured here so they aren't lost; the edits themselves are deferred to a separate pass. - **Q2** — Switch all install / `host.serve()` references from `uvicorn` to `hypercorn`. -- **Q3** — Update `HostedRunResult` documentation to cover the workflow-output case and the host-run metadata it adds on top of `AgentRunResult`. +- **Q3** — ✅ Done. `HostedRunResult[TResult]` is now generic over the target output type; see Q31 in the table above for the rationale. - **Q11** — Strip any remaining "multi-target hedge" language from the spec body. - **Q13** — Update the linker catalogue: Entra (in Entra package) + one-time-code (in core); remove MFA references.
- **Q16** — Collapse `IdentityLinker` into a Channel specialisation in the spec body (architecture diagrams, contracts, examples). - **Q20** — ✅ Done. `ContinuationToken` type carries an opaque `token: str`; routes use `/{continuation_token}`; Invocations channel gets equivalent continuation-token support; persistence via `HostStateStore` (v1 default file-based). +- **Q32** — Spec text added (see [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam) and req #22). The core PR includes `IdentityAllowlist` Protocol, `AllowlistDecision` enum, `AuthorizationContext`, `AllowAll` / `NativeIdAllowlist` / `LinkedClaimAllowlist` / `AnyOfAllowlists` / `AllOfAllowlists` / `CallableAllowlist` built-ins, `IdentityLinker` Protocol, `LinkedIdentity`, `LinkChallenge`, `AuthPolicy` factories, `Allowed` / `LinkRequired` / `Denied` outcomes, `Host(default_allowlist=..., identity_linker=...)` + per-channel `allowlist` parameter, construction-time validator (rules #1 + #2 + #3 — `require_link=True` without `identity_linker` now raises), and `host.authorize(...)` for open, native-id, and linked-claim profiles. Provider-specific linkers (for example Entra OAuth helpers) are separate channel/helper packages. +- **Q36 / Q37 / Q38 / Q39 / Q40** — Spec text added: strict-ephemeral default + `allow_in_process_runner` opt-in in §[Durable task runner](#durable-task-runner); new sub-sections [Codec contract for durable serialisation](#codec-contract-for-durable-serialisation), [In-process runner shutdown drain](#in-process-runner-shutdown-drain), [Echo idempotency on retry](#echo-idempotency-on-retry); `DeliveryReport` references purged from §[Intended targets + durable delivery](#intended-targets--durable-delivery) and Qs 18 / 21. 
Code lands in this core PR: `DurableTaskPayloadMode` + `ChannelPushCodec` + `PushPayloadNotSerializable` exception in `_types.py`; `_validate_runner_codec_pairing` + dual-mode `_handle_push_task` + `_build_push_payload` + `echo_done` cursor + `_annotate_intended_targets` in `_host.py`; `shutdown_grace_seconds` + 2-phase drain in `_runner.py`. +- **Q33 / Q34 / Q35** — Spec text added: new top-level §[Runtime modes](#runtime-modes), rewritten §[Intended targets + durable delivery](#intended-targets--durable-delivery), new §[Durable task runner](#durable-task-runner). Code lands in this core PR: `DurableTaskRunner` Protocol + `InProcessTaskRunner` + `runtime_mode` constructor parameter + auto-detection. Durable runner adapters (`agent-framework-hosting-durabletask`, Foundry adapter) are separate follow-up packages tracked under §[Decisions-driven follow-ups](#decisions-driven-follow-ups). Bumping req #14 (background runs) to share the same runner is a non-goal of this PR — the `ContinuationToken` machinery and the runner can be wired together in a later pass without re-shaping either contract. diff --git a/python/packages/hosting/LICENSE b/python/packages/hosting/LICENSE new file mode 100644 index 00000000000..9e841e7a26e --- /dev/null +++ b/python/packages/hosting/LICENSE @@ -0,0 +1,21 @@ + MIT License + + Copyright (c) Microsoft Corporation. + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE diff --git a/python/packages/hosting/README.md b/python/packages/hosting/README.md new file mode 100644 index 00000000000..e214fa08088 --- /dev/null +++ b/python/packages/hosting/README.md @@ -0,0 +1,192 @@ +# agent-framework-hosting + +Multi-channel hosting for Microsoft Agent Framework agents. + +`agent-framework-hosting` lets you serve a single agent (or workflow) +target through one or more **channels** — pluggable adapters that +expose the target over different transports. The result is a single +Starlette ASGI application you can host anywhere (local Hypercorn, +Azure Container Apps, Foundry Hosted Agents, …). + +The base package contains only the channel-neutral plumbing: + +- `AgentFrameworkHost` — the Starlette host +- `Channel` / `ChannelPush` — the channel protocols +- `ChannelRequest` / `ChannelSession` / `ChannelIdentity` / `ResponseTarget` + — the request envelope and routing primitives +- `ChannelContext` / `ChannelContribution` / `ChannelCommand` — the + channel-side hooks for invoking the target and contributing routes, + commands, and lifecycle callbacks +- `ChannelRunHook` / `ChannelStreamTransformHook` — the per-request + customization seams +- `DurableTaskRunner` + `InProcessTaskRunner` — the seam used to + dispatch non-originating push fan-out; the in-process runner is the + default. Plug in a durable adapter (e.g. + `agent-framework-hosting-durabletask`) for `runtime_mode="ephemeral"` + deployments. 
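As a rough illustration of how `runtime_mode` and the runner choice fit together, the sketch below follows the rules recorded in the spec's decisions log (Q33 and Q36): auto-detection picks `"ephemeral"` on the first known deployment marker and otherwise falls back to `"long_running"`, and an ephemeral host without a durable runner fails at construction unless `allow_in_process_runner=True` is passed. The helper names `detect_runtime_mode` and `check_runner` are hypothetical and not part of `agent-framework-hosting`; treating an empty marker value as absent is also an assumption.

```python
from __future__ import annotations

import os
from collections.abc import Mapping

# Deployment markers named in the spec's runtime-mode decision (Q33).
EPHEMERAL_MARKERS = (
    "FOUNDRY_HOSTING_ENVIRONMENT",
    "AZURE_FUNCTIONS_ENVIRONMENT",
    "AWS_LAMBDA_FUNCTION_NAME",
)


def detect_runtime_mode(env: Mapping[str, str] | None = None) -> str:
    """Return "ephemeral" on the first marker found, else "long_running"."""
    env = os.environ if env is None else env
    for marker in EPHEMERAL_MARKERS:
        # Assumption: an empty value counts as "marker not set".
        if env.get(marker):
            return "ephemeral"
    return "long_running"


def check_runner(
    mode: str,
    has_durable_runner: bool,
    allow_in_process_runner: bool = False,
) -> str:
    """Hypothetical helper: choose a runner kind or fail fast (Q36)."""
    if has_durable_runner:
        return "durable"
    if mode == "ephemeral" and not allow_in_process_runner:
        # An in-process runner would lose in-flight pushes on scale-to-zero.
        raise RuntimeError(
            "ephemeral runtime needs a durable task runner "
            "(or allow_in_process_runner=True)"
        )
    return "in_process"
```

The mode only drives defaults here, which mirrors the spec's description of it as advisory: an explicitly configured durable runner wins regardless of the detected mode.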
+ +Concrete channels live in their own packages so you only install what +you use: + +| Package | Transport | +|---|---| +| `agent-framework-hosting-responses` | OpenAI Responses API | +| `agent-framework-hosting-invocations` | Foundry-native invocation envelope | +| `agent-framework-hosting-telegram` | Telegram Bot API | +| `agent-framework-hosting-activity-protocol` | Bot Framework Activity Protocol (Teams, Direct Line, Web Chat, …) | +| `agent-framework-hosting-teams` | Microsoft Teams (Teams SDK) | +| `agent-framework-hosting-entra` | Entra (OAuth) identity-link sidecar | + +## Architecture
+ +```mermaid +graph LR + Caller[External caller /<br/>messaging app] + + subgraph Host[AgentFrameworkHost] + direction TB + ASGI[Starlette app] + Router[Channel router] + Parse{parse →<br/>command or<br/>message?} + Auth[host.authorize] + Resolver[IdentityResolver] + Delivery[_deliver_response] + Push[_handle_push_task] + end + + Channels[Channels<br/>Responses · Invocations ·<br/>Telegram · Activity ·<br/>IdentityLinker] + CmdHandler[CommandHandler<br/>via ChannelCommandContext] + Target[(Agent or Workflow)] + Runner[DurableTaskRunner] + StateStore[(HostStateStore)] + + Caller --> ASGI + ASGI --> Router + Router --> Parse + Parse -- /command --> CmdHandler + Parse -- message --> Auth + CmdHandler -- ctx.run --> Auth + CmdHandler -- local reply --> Channels + Auth --> Resolver + Resolver --> StateStore + Auth --> Target + Target --> Delivery + Delivery -- originating sync --> Channels + Delivery -- non-originating --> Runner + Runner --> Push + Push --> Channels + Channels --> ASGI +```
+ +For a richer set of flow diagrams — identity linking, multi-channel +fan-out, server-side relays, background runs, durable-runner codec +envelopes, echo idempotency, workflow targets — see the +[Python hosting spec](https://github.com/microsoft/agent-framework/blob/main/docs/specs/002-python-hosting-channels.md). + +## Install + +```bash +pip install agent-framework-hosting agent-framework-hosting-responses +# or with uvicorn pre-installed for the demo `host.serve(...)` helper +pip install "agent-framework-hosting[serve]" agent-framework-hosting-responses +# add the [disk] extra to opt in to on-disk persistence (see below) +pip install "agent-framework-hosting[disk]" +``` + +## Quickstart + +```python +from agent_framework.openai import OpenAIChatClient +from agent_framework_hosting import AgentFrameworkHost, Channel + +agent = OpenAIChatClient().as_agent(name="Assistant") + +# Add channels from sibling packages, e.g. `agent-framework-hosting-responses` +# exposes a `ResponsesChannel` that serves the OpenAI Responses API. +channels: list[Channel] = [] + +host = AgentFrameworkHost(target=agent, channels=channels) +host.serve(port=8000) +``` + +See the [hosting samples](https://github.com/microsoft/agent-framework/tree/main/python/samples/04-hosting/af-hosting) +for richer multi-channel apps (Telegram + Teams + Responses fan-out, +identity linking, `ResponseTarget` routing, etc.).
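Multi-channel fan-out runs each non-originating push as a retryable task, and the spec (Q30, Q39) describes an `echo_done` cursor that keeps a retry from re-sending the user's echoed input. The following is a minimal sketch of that idea with toy in-memory types, not the host's actual `"hosting.push"` handler. `PushTask`, `FlakyChannel`, `handle_push`, and `run_with_retry` are hypothetical names, and setting the cursor only after a successful echo is an assumption.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass, field


@dataclass
class PushTask:
    """Toy stand-in for runner-owned task state (hypothetical shape)."""

    destination: str
    reply: str
    echo: str | None = None   # user's input, set only when echo_input=True
    echo_done: bool = False   # cursor: has the echo phase already succeeded?
    attempts: int = 0


@dataclass
class FlakyChannel:
    """Records pushes and fails the first `fail_replies` assistant pushes."""

    fail_replies: int = 0
    sent: list[tuple[str, str]] = field(default_factory=list)

    async def push(self, role: str, text: str) -> None:
        if role == "assistant" and self.fail_replies > 0:
            self.fail_replies -= 1
            raise ConnectionError("simulated push failure")
        self.sent.append((role, text))


async def handle_push(task: PushTask, channel: FlakyChannel) -> None:
    """One attempt: echo first (skipped once done), then the reply."""
    if task.echo is not None and not task.echo_done:
        try:
            await channel.push("user", task.echo)
            task.echo_done = True  # assumption: cursor set only after success
        except ConnectionError:
            pass  # echo failure is swallowed so the reply is still attempted
    await channel.push("assistant", task.reply)


async def run_with_retry(
    task: PushTask, channel: FlakyChannel, max_attempts: int = 3
) -> bool:
    """Bounded retry loop; returns True once the reply push succeeds."""
    while task.attempts < max_attempts:
        task.attempts += 1
        try:
            await handle_push(task, channel)
            return True
        except ConnectionError:
            continue
    return False
```

Running `asyncio.run(run_with_retry(PushTask("telegram", "hi", echo="hello"), FlakyChannel(fail_replies=1)))` retries the reply once while the echo is delivered a single time, because the second attempt reads the cursor and skips the echo phase.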
+ +## Optional disk persistence (`state_dir`) + +By default the host keeps everything in memory: the durable-task runner's +pending push queue, the per-isolation-key session aliases, the active-channel +map, and the per-channel `ChannelIdentity` map. That is the right shape for +**ephemeral** runtimes (Foundry Hosted Agents et al.) where the host is +restarted per request and persistence lives behind a service like the Foundry +response store, and for short-lived local dev. + +For **long-running** deployments (an always-on container, a local dev server +you restart often, a single-VM bot) opt in to disk persistence by passing +`state_dir` to `AgentFrameworkHost`. The runner queue and the session +bookkeeping use [`diskcache`](https://grantjenks.com/docs/diskcache/) +(installed via the `[disk]` extra) protected by an OS-level advisory file +lock so two hosts pointed at the same directory can't double-execute +scheduled pushes. Workflow checkpoints (when the target is a `Workflow`) +use the framework's `FileCheckpointStorage` — no extra dependency. The +identity-link store path is offered to linkers that implement +`SupportsLinkStorePath`; linkers that manage persistence themselves should +be configured directly. + +```python +from agent_framework_hosting import AgentFrameworkHost + +# Single path → host auto-derives `runner/`, `sessions/`, `links/`, and +# (for workflow targets) `checkpoints/` subpaths. +host = AgentFrameworkHost( + target=agent, + channels=channels, + state_dir="./.host-state", +) + +# Or route components to different roots — use the HostStatePaths TypedDict +# (or a plain dict with the same keys) for editor autocomplete on the keys. +# Omit a key to opt that component out of persistence. 
+from agent_framework_hosting import HostStatePaths + +host = AgentFrameworkHost( + target=workflow, + channels=channels, + state_dir=HostStatePaths( + runner="/var/lib/myapp/tasks", + sessions="/var/lib/myapp/state", + checkpoints="/var/lib/myapp/checkpoints", + links="/var/lib/myapp/links", + ), +) +``` + +What survives a restart: + +- **Pending durable-task records** — scheduled but not-yet-completed push + deliveries replay on the next host startup via `runner.resume()`. Records + that crashed mid-attempt resume with their already-consumed retry budget. +- **`_session_aliases`** — per-isolation-key session-id rewrites (via the + reset-session command). +- **`_active`** — the most recently active channel for each isolation key + (consumed by `ResponseTarget.active`). +- **`_identities`** — channel-native `ChannelIdentity` rows used by + `ResponseTarget.channels([...])` / `.all_linked` fan-out. +- **Workflow checkpoints** — when the target is a `Workflow`, the host wraps + the `checkpoints` path in a per-isolation-key `FileCheckpointStorage` + (equivalent to passing `checkpoint_location=...` directly; the explicit + parameter takes precedence and emits a warning when both are set). +- **Identity-link store** — when the configured linker implements + `SupportsLinkStorePath`, the host passes the `links` path to it so pending + challenges, linked identities, and verified claims can survive restarts. + +What doesn't: + +- Live `AgentSession` objects (rehydrated lazily by the history provider on the + next turn). +- The `ContinuationToken` store (separate concern, plug in your own). + +Unpicklable push payloads raise `PushPayloadNotPicklable` *eagerly* from +`schedule()` so issues surface at the call site, not on the next restart. 
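To make the eager check concrete, here is a minimal sketch of validating a payload at schedule time using only the standard library. It assumes the persisted record is a pickled `(name, payload)` tuple, which may not match the package's on-disk format. `ToyDiskQueue` and `PayloadNotPicklable` are illustrative stand-ins, not the real runner or the real `PushPayloadNotPicklable`.

```python
from __future__ import annotations

import pickle
from typing import Any


class PayloadNotPicklable(TypeError):
    """Illustrative stand-in for the package's `PushPayloadNotPicklable`."""


class ToyDiskQueue:
    """In-memory queue that serialises eagerly, like a disk-backed one would."""

    def __init__(self) -> None:
        self._records: list[bytes] = []

    def schedule(self, name: str, payload: Any) -> int:
        # Serialise immediately so a bad payload fails at the call site,
        # not later when the record is replayed after a restart.
        try:
            blob = pickle.dumps((name, payload))
        except (pickle.PicklingError, TypeError, AttributeError) as exc:
            raise PayloadNotPicklable(
                f"payload for {name!r} cannot be pickled"
            ) from exc
        self._records.append(blob)
        return len(self._records) - 1

    def load(self, index: int) -> tuple[str, Any]:
        """Rehydrate a stored record, as a resume step would."""
        return pickle.loads(self._records[index])
```

A plain dict payload round-trips through `schedule()` and `load()`, while a lambda or an open lock is rejected before anything is stored, so the queue never holds a record it cannot replay.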
+ diff --git a/python/packages/hosting/agent_framework_hosting/__init__.py b/python/packages/hosting/agent_framework_hosting/__init__.py new file mode 100644 index 00000000000..72553d1aef6 --- /dev/null +++ b/python/packages/hosting/agent_framework_hosting/__init__.py @@ -0,0 +1,143 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Multi-channel hosting for Microsoft Agent Framework agents. + +Serve a single agent target through one or more **channels** — pluggable +adapters that expose the target over different transports such as the +OpenAI Responses API, Microsoft Teams, Telegram, and others. The base +package contains only the channel-neutral plumbing; concrete channels +ship in their own packages (``agent-framework-hosting-responses``, +``agent-framework-hosting-telegram``, …) so users install only what +they need. +""" + +import importlib.metadata + +from ._authorization import ( + AllOfAllowlists, + AllowAll, + Allowed, + AllowlistDecision, + AnyOfAllowlists, + AuthorizationContext, + AuthorizationOutcome, + AuthPolicy, + CallableAllowlist, + ChannelConfigurationError, + ClaimValue, + Denied, + IdentityAllowlist, + IdentityLinker, + LinkChallenge, + LinkedClaimAllowlist, + LinkedIdentity, + LinkRequired, + LinkResolution, + NativeIdAllowlist, + SupportsLinkStorePath, +) +from ._host import AgentFrameworkHost, ChannelContext, RuntimeMode, logger +from ._isolation import ( + ISOLATION_HEADER_CHAT, + ISOLATION_HEADER_USER, + IsolationKeys, + get_current_isolation_keys, + reset_current_isolation_keys, + set_current_isolation_keys, +) +from ._runner import InProcessTaskRunner +from ._types import ( + Channel, + ChannelCommand, + ChannelCommandContext, + ChannelContribution, + ChannelIdentity, + ChannelPush, + ChannelPushCodec, + ChannelRequest, + ChannelResponseContext, + ChannelResponseHook, + ChannelRunHook, + ChannelSession, + ChannelStreamTransformHook, + DurableTaskPayloadMode, + DurableTaskRunner, + HostedRunResult, + HostStatePaths, + 
PushPayloadNotPicklable, + PushPayloadNotSerializable, + ResponseTarget, + ResponseTargetKind, + RetryPolicy, + TaskHandle, + TaskStatus, + apply_response_hook, + apply_run_hook, +) + +try: + __version__ = importlib.metadata.version(__name__) +except importlib.metadata.PackageNotFoundError: + __version__ = "0.0.0" + +__all__ = [ + "ISOLATION_HEADER_CHAT", + "ISOLATION_HEADER_USER", + "AgentFrameworkHost", + "AllOfAllowlists", + "AllowAll", + "Allowed", + "AllowlistDecision", + "AnyOfAllowlists", + "AuthPolicy", + "AuthorizationContext", + "AuthorizationOutcome", + "CallableAllowlist", + "Channel", + "ChannelCommand", + "ChannelCommandContext", + "ChannelConfigurationError", + "ChannelContext", + "ChannelContribution", + "ChannelIdentity", + "ChannelPush", + "ChannelPushCodec", + "ChannelRequest", + "ChannelResponseContext", + "ChannelResponseHook", + "ChannelRunHook", + "ChannelSession", + "ChannelStreamTransformHook", + "ClaimValue", + "Denied", + "DurableTaskPayloadMode", + "DurableTaskRunner", + "HostStatePaths", + "HostedRunResult", + "IdentityAllowlist", + "IdentityLinker", + "InProcessTaskRunner", + "IsolationKeys", + "LinkChallenge", + "LinkRequired", + "LinkResolution", + "LinkedClaimAllowlist", + "LinkedIdentity", + "NativeIdAllowlist", + "PushPayloadNotPicklable", + "PushPayloadNotSerializable", + "ResponseTarget", + "ResponseTargetKind", + "RetryPolicy", + "RuntimeMode", + "SupportsLinkStorePath", + "TaskHandle", + "TaskStatus", + "__version__", + "apply_response_hook", + "apply_run_hook", + "get_current_isolation_keys", + "logger", + "reset_current_isolation_keys", + "set_current_isolation_keys", +] diff --git a/python/packages/hosting/agent_framework_hosting/_authorization.py b/python/packages/hosting/agent_framework_hosting/_authorization.py new file mode 100644 index 00000000000..882dad18cc0 --- /dev/null +++ b/python/packages/hosting/agent_framework_hosting/_authorization.py @@ -0,0 +1,485 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""Authorization seam — :class:`IdentityAllowlist`, :class:`IdentityLinker`, and outcomes. + +Channels that emit a :class:`ChannelIdentity` compose authorization from +two **orthogonal** parameters set per channel: + +- ``require_link: bool`` — "identity must be linked to an IdP claim". The + host delegates this to the configured :class:`IdentityLinker`; pairing + ``require_link=True`` with no linker is rejected at construction + (silent-deny-everyone is the worst possible default). +- ``allowlist: IdentityAllowlist | Literal["inherit"] | None`` — "identity + is on the accept list". The host evaluates the allowlist on every + inbound message via :func:`AgentFrameworkHost.authorize`. + +The two axes compose into the three named profiles **open** (no gate), +**forced-link** (any authenticated identity), and **allowlist** (only +listed identities, keyed either on the channel-native id pre-link or on +a verified IdP claim post-link). See +``docs/specs/002-python-hosting-channels.md`` § +"Authorization profiles and the IdentityAllowlist seam". + +This module ships the channel-neutral core pieces. Provider-specific +linking channels (for example Entra OAuth helpers) can implement +:class:`IdentityLinker` without the core package taking a dependency on +their transport or identity-provider SDKs. +""" + +from __future__ import annotations + +import os +from collections.abc import Awaitable, Callable, Collection, Mapping, Sequence +from dataclasses import dataclass, field +from datetime import datetime +from enum import Enum +from typing import Any, Literal, Protocol, TypeAlias, runtime_checkable + +from ._types import ChannelIdentity + + +class AllowlistDecision(str, Enum): + """Tri-state allowlist evaluation outcome. + + ``ABSTAIN`` is **not** a denial — it means "this allowlist has no + information yet" (typically a claim-based allowlist evaluated at + ``pre_link``). 
The host's :meth:`AgentFrameworkHost.authorize` + pipeline is what turns an all-``ABSTAIN`` outcome into the next + step (allow when open, escalate to a link ceremony when the config + calls for one). Boolean composition cannot distinguish "claim + allowlist denies you" from "claim allowlist hasn't seen any claims + yet" — a critical distinction for the **Mixed** profile. + """ + + ALLOW = "allow" + DENY = "deny" + ABSTAIN = "abstain" + + +ClaimValue: TypeAlias = str | Sequence[str] +"""Verified claim value shape understood by :class:`LinkedClaimAllowlist`.""" + + +def _empty_claim_mapping() -> Mapping[str, ClaimValue]: + return {} + + +def _empty_any_mapping() -> Mapping[str, Any]: + return {} + + +@dataclass(frozen=True) +class AuthorizationContext: + """Inputs to a single :meth:`IdentityAllowlist.evaluate` call.""" + + identity: ChannelIdentity + phase: Literal["pre_link", "post_link"] + isolation_key: str | None = None + verified_claims: Mapping[str, ClaimValue] = field(default_factory=_empty_claim_mapping) + claim_source: Literal["linker", "channel", "none"] = "none" + + +@runtime_checkable +class IdentityAllowlist(Protocol): + """Per-channel accept/deny gate evaluated by the host. + + ``requires_linked_claims`` declares that this allowlist's + :meth:`evaluate` cannot ``ALLOW`` until verified claims are + available — the host's construction-time validator rejects + configurations that would silently deny everyone (e.g. a + :class:`LinkedClaimAllowlist` on a channel that neither has + ``require_link=True`` nor natively emits verified claims). + """ + + requires_linked_claims: bool + + async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: ... + + +class AllowAll: + """Explicit "open" sentinel. + + Useful for tests, sample code, and for **overriding** a host-level + ``default_allowlist`` on a specific channel that should be public + inside an otherwise locked-down host. 
+ """ + + requires_linked_claims: bool = False + + async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: + return AllowlistDecision.ALLOW + + +class NativeIdAllowlist: + """Accept only listed channel-native ids. + + Telegram ``chat_id``, WhatsApp number, Slack user id, etc. The + list can be a plain collection or an async loader so allowlist + sources can be config files, secret stores, or feature flags. + Pre-link and post-link behaviour is identical — native-id + allowlists do not depend on link state. + + When ``channel`` is set, the allowlist participates in + :class:`AnyOfAllowlists` composition by returning ``ABSTAIN`` for + requests from other channels — this lets per-channel native lists + coexist under a single combinator without one channel's ``DENY`` + masking another channel's ``ALLOW``. + + Keyword Args: + native_ids: A static collection of ids, or an async loader. + channel: When set, only requests whose + ``ChannelIdentity.channel`` matches participate; others + ``ABSTAIN``. + """ + + requires_linked_claims: bool = False + + def __init__( + self, + native_ids: Collection[str] | Callable[[], Awaitable[Collection[str]]], + *, + channel: str | None = None, + ) -> None: + self._native_ids: Collection[str] | None + self._loader: Callable[[], Awaitable[Collection[str]]] | None + if callable(native_ids): + self._native_ids = None + self._loader = native_ids + else: + self._native_ids = frozenset(native_ids) + self._loader = None + self.channel = channel + + async def _resolve(self) -> Collection[str]: + if self._native_ids is not None: + return self._native_ids + loader = self._loader + if loader is None: # pragma: no cover - defensive + raise RuntimeError("NativeIdAllowlist: loader missing after cache miss") + loaded = await loader() + # Cache the resolved set so subsequent calls avoid re-loading. 
+ self._native_ids = frozenset(loaded) + self._loader = None + return self._native_ids + + async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: + if self.channel is not None and context.identity.channel != self.channel: + return AllowlistDecision.ABSTAIN + ids = await self._resolve() + if context.identity.native_id in ids: + return AllowlistDecision.ALLOW + return AllowlistDecision.DENY + + +class LinkedClaimAllowlist: + """Accept only identities whose verified IdP claim is on the list. + + ``evaluate`` returns ``ABSTAIN`` at ``pre_link`` (no claims yet) + and ``ALLOW``/``DENY`` at ``post_link``. Claim values may be plain + strings or a sequence of strings (for multi-valued claims such as + group ids); any intersection with ``values`` allows the identity. + + Keyword Args: + claim: The verified-claim key to inspect (e.g. ``"oid"``, + ``"tid"``, ``"groups"``). + values: Accepted values. + """ + + requires_linked_claims: bool = True + + def __init__(self, claim: str, values: Collection[str]) -> None: + self.claim = claim + self.values = frozenset(values) + + async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: + if context.phase == "pre_link": + return AllowlistDecision.ABSTAIN + value = context.verified_claims.get(self.claim) + if value is None: + return AllowlistDecision.DENY + if isinstance(value, str): + return AllowlistDecision.ALLOW if value in self.values else AllowlistDecision.DENY + return AllowlistDecision.ALLOW if any(item in self.values for item in value) else AllowlistDecision.DENY + + +class AnyOfAllowlists: + """Combinator: any child ``ALLOW`` wins; ``DENY`` only if all children ``DENY``. + + Use this for the **Mixed** profile (native id OR linked claim). + Returns ``ABSTAIN`` when no child decides. 
+ """ + + def __init__(self, *allowlists: IdentityAllowlist) -> None: + self._children = allowlists + self.requires_linked_claims = any(getattr(a, "requires_linked_claims", False) for a in allowlists) + + async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: + any_abstain = False + all_deny = True + for child in self._children: + decision = await child.evaluate(context) + if decision is AllowlistDecision.ALLOW: + return AllowlistDecision.ALLOW + if decision is AllowlistDecision.ABSTAIN: + any_abstain = True + all_deny = False + # DENY contributes to all_deny without short-circuit. + if all_deny and self._children: + return AllowlistDecision.DENY + if any_abstain: + return AllowlistDecision.ABSTAIN + # No children — treat as ABSTAIN to avoid surprise DENY. + return AllowlistDecision.ABSTAIN + + +class AllOfAllowlists: + """Combinator: any child ``DENY`` wins; ``ALLOW`` only if all children ``ALLOW``. + + Use this to require multiple conditions (e.g. tenancy + **and** group membership). Returns ``ABSTAIN`` when no child + denies but at least one ``ABSTAIN``s. + """ + + def __init__(self, *allowlists: IdentityAllowlist) -> None: + self._children = allowlists + self.requires_linked_claims = any(getattr(a, "requires_linked_claims", False) for a in allowlists) + + async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: + any_abstain = False + for child in self._children: + decision = await child.evaluate(context) + if decision is AllowlistDecision.DENY: + return AllowlistDecision.DENY + if decision is AllowlistDecision.ABSTAIN: + any_abstain = True + if not self._children: + return AllowlistDecision.ABSTAIN + if any_abstain: + return AllowlistDecision.ABSTAIN + return AllowlistDecision.ALLOW + + +class CallableAllowlist: + """Escape hatch: wrap an arbitrary async function as an allowlist. + + Recommended only after exhausting the structured variants — + composition is harder to reason about with opaque callables. 
+ """ + + def __init__( + self, + fn: Callable[[AuthorizationContext], Awaitable[AllowlistDecision]], + *, + requires_linked_claims: bool = False, + ) -> None: + self._fn = fn + self.requires_linked_claims = requires_linked_claims + + async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: + return await self._fn(context) + + +# --------------------------------------------------------------------------- # +# Outcome types # +# --------------------------------------------------------------------------- # + + +@dataclass(frozen=True) +class LinkChallenge: + """Challenge a channel can render to complete an identity link. + + Attributes: + challenge_id: Opaque linker-owned id for correlating the challenge + with the later completion callback. + url: Optional URL (OAuth authorization URL, device-flow URL, etc.) + the user should open. + expires_at: Optional challenge expiry time. + message: Optional safe text a channel may render with the challenge. + attributes: Linker-specific structured metadata. Channels should + only use keys documented by the concrete linker they integrate. + """ + + challenge_id: str + url: str | None = None + expires_at: datetime | None = None + message: str | None = None + attributes: Mapping[str, Any] = field(default_factory=_empty_any_mapping) + + +@dataclass(frozen=True) +class LinkedIdentity: + """Resolved IdP-backed identity returned by :class:`IdentityLinker`. + + Attributes: + isolation_key: Stable key the host should use for the linked user. + verified_claims: Claims verified by the linker or by a channel that + natively authenticates the user. + claim_source: Where the claims came from. 
+ """ + + isolation_key: str + verified_claims: Mapping[str, ClaimValue] = field(default_factory=_empty_claim_mapping) + claim_source: Literal["linker", "channel"] = "linker" + + +LinkResolution: TypeAlias = LinkedIdentity | LinkChallenge +"""Result returned by :meth:`IdentityLinker.resolve`.""" + + +class IdentityLinker(Protocol): + """Resolve a channel-native identity or return a challenge to link it. + + Concrete linker packages own the storage, OAuth/device-code routes, and + provider-specific claim mapping. The core host only consumes the single + resolution call so authorization can be a one-round-trip decision. + """ + + async def resolve(self, identity: ChannelIdentity) -> LinkResolution: + """Return a linked identity or the challenge needed to create one.""" + ... + + +@runtime_checkable +class SupportsLinkStorePath(Protocol): + """Optional protocol for linkers that accept host-provided persistence. + + When ``AgentFrameworkHost(state_dir=...)`` derives a ``links`` path, the + host calls this hook on identity linkers that implement it. Linkers that + manage their own persistence can ignore this protocol and should be + configured directly by the application. + """ + + def configure_link_store_path(self, path: str | os.PathLike[str]) -> None: + """Configure where the linker should persist its link store.""" + ... + + +@dataclass(frozen=True) +class Allowed: + """The identity is authorized; ``isolation_key`` is its stable key.""" + + isolation_key: str + verified_claims: Mapping[str, ClaimValue] = field(default_factory=_empty_claim_mapping) + claim_source: Literal["linker", "channel", "none"] = "none" + + +@dataclass(frozen=True) +class LinkRequired: + """The identity must complete the link ceremony before proceeding. + + Channels render ``challenge`` through their native UX (the same + path the ``link`` command uses). + """ + + challenge: LinkChallenge + + +@dataclass(frozen=True) +class Denied: + """The identity is rejected. 
+ + Attributes: + reason_code: Stable, machine-readable token (e.g. + ``"allowlist_denied_pre_link"``). Never echoed to end + users. + user_message: Safe to render publicly (group-chat-safe); + ``None`` falls back to a bland default ("You don't have + access to this bot."). + log_details: Structured payload for audit/observability; + **never** shown to users. + """ + + reason_code: str + user_message: str | None = None + log_details: Mapping[str, Any] = field(default_factory=_empty_any_mapping) + + +AuthorizationOutcome = Allowed | LinkRequired | Denied +"""Result of :func:`AgentFrameworkHost.authorize`. Channels render +each variant through their native UX.""" + + +class AuthPolicy: + """Factory helpers for common authorization policies. + + These helpers are thin wrappers over the concrete allowlist types; they + exist so application code can describe authorization intent without + importing each building block separately. + """ + + @staticmethod + def open() -> AllowAll: + """Allow every identity.""" + return AllowAll() + + @staticmethod + def native_ids( + native_ids: Collection[str] | Callable[[], Awaitable[Collection[str]]], + *, + channel: str | None = None, + ) -> NativeIdAllowlist: + """Allow listed channel-native ids.""" + return NativeIdAllowlist(native_ids, channel=channel) + + @staticmethod + def linked_claim(claim: str, values: Collection[str]) -> LinkedClaimAllowlist: + """Allow identities whose verified claim matches one of ``values``.""" + return LinkedClaimAllowlist(claim, values) + + @staticmethod + def any_of(*allowlists: IdentityAllowlist) -> AnyOfAllowlists: + """Allow when any child allowlist allows.""" + return AnyOfAllowlists(*allowlists) + + @staticmethod + def all_of(*allowlists: IdentityAllowlist) -> AllOfAllowlists: + """Allow only when every child allowlist allows.""" + return AllOfAllowlists(*allowlists) + + @staticmethod + def custom( + fn: Callable[[AuthorizationContext], Awaitable[AllowlistDecision]], + *, + requires_linked_claims: 
bool = False, + ) -> CallableAllowlist: + """Wrap a custom async allowlist function.""" + return CallableAllowlist(fn, requires_linked_claims=requires_linked_claims) + + +# --------------------------------------------------------------------------- # +# Configuration error # +# --------------------------------------------------------------------------- # + + +class ChannelConfigurationError(ValueError): + """Raised at host construction for authorization config that would deny all users. + + The host validator runs three rules (see spec §"Configuration + validation"); any failure is reported here rather than letting + the misconfigured host start up and reject every request. + """ + + +__all__ = [ + "AllOfAllowlists", + "AllowAll", + "Allowed", + "AllowlistDecision", + "AnyOfAllowlists", + "AuthPolicy", + "AuthorizationContext", + "AuthorizationOutcome", + "CallableAllowlist", + "ChannelConfigurationError", + "ClaimValue", + "Denied", + "IdentityAllowlist", + "IdentityLinker", + "LinkChallenge", + "LinkRequired", + "LinkResolution", + "LinkedClaimAllowlist", + "LinkedIdentity", + "NativeIdAllowlist", + "SupportsLinkStorePath", +] diff --git a/python/packages/hosting/agent_framework_hosting/_host.py b/python/packages/hosting/agent_framework_hosting/_host.py new file mode 100644 index 00000000000..a64d3071dde --- /dev/null +++ b/python/packages/hosting/agent_framework_hosting/_host.py @@ -0,0 +1,2353 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""The :class:`AgentFrameworkHost` and its :class:`ChannelContext` bridge. + +The host is a tiny Starlette wrapper: + +- ``__init__`` accepts a hostable target (``SupportsAgentRun`` agent or + ``Workflow``) and a sequence of channels. +- :meth:`AgentFrameworkHost.app` lazily builds a Starlette app by calling + every channel's ``contribute`` and mounting the returned routes under + the channel's ``path`` (empty path → mount at the app root). 
+- :class:`ChannelContext` exposes ``run`` / ``run_stream`` / + ``deliver_response`` for channels to invoke; the host handles + per-``isolation_key`` session caching, identity tracking, and + :class:`ResponseTarget` fan-out. + +Per SPEC-002 (and ADR-0026), the host is intentionally thin so the bulk +of channel-specific behaviour stays in the channel package. Identity +linking, link policies, response targets, background runs, and the like +are pluggable extensions that the future identity/foundry packages will +contribute on top of this surface. +""" + +from __future__ import annotations + +import asyncio +import logging +import os +import uuid +from collections.abc import AsyncIterator, Awaitable, Callable, Mapping, Sequence +from contextlib import AbstractContextManager, ExitStack, asynccontextmanager +from pathlib import Path +from typing import TYPE_CHECKING, Any, Literal, cast + +from agent_framework import ( + AgentResponse, + AgentResponseUpdate, + CheckpointStorage, + Content, + FileCheckpointStorage, + Message, + ResponseStream, + SupportsAgentRun, + Workflow, + WorkflowEvent, +) +from starlette.applications import Starlette +from starlette.middleware import Middleware +from starlette.requests import Request +from starlette.responses import PlainTextResponse +from starlette.routing import BaseRoute, Mount, Route +from starlette.types import ASGIApp, Receive, Scope, Send + +from ._authorization import ( + Allowed, + AllowlistDecision, + AuthorizationContext, + AuthorizationOutcome, + ChannelConfigurationError, + ClaimValue, + Denied, + IdentityAllowlist, + IdentityLinker, + LinkChallenge, + LinkRequired, + SupportsLinkStorePath, +) +from ._isolation import ( + ISOLATION_HEADER_CHAT, + ISOLATION_HEADER_USER, + IsolationKeys, + reset_current_isolation_keys, + set_current_isolation_keys, +) +from ._persistence import normalize_state_dir +from ._runner import InProcessTaskRunner +from ._state_store import SessionsStateStore, build_session_dicts +from ._types 
import ( + Channel, + ChannelIdentity, + ChannelPush, + ChannelPushCodec, + ChannelRequest, + ChannelResponseContext, + ChannelResponseHook, + DurableTaskPayloadMode, + DurableTaskRunner, + HostedRunResult, + HostStatePaths, + PushPayloadNotSerializable, + ResponseTargetKind, + apply_response_hook, +) + +if TYPE_CHECKING: + from agent_framework._workflows._workflow import WorkflowRunResult + +logger = logging.getLogger("agent_framework.hosting") + + +# Environment markers that auto-detect ``runtime_mode="ephemeral"``. Order +# matters only for telemetry — the first match wins and is logged at +# startup. Adding a new marker is a non-breaking change; consumers can +# always override via the ``runtime_mode`` constructor parameter. +_EPHEMERAL_RUNTIME_MARKERS: tuple[str, ...] = ( + "FOUNDRY_HOSTING_ENVIRONMENT", + "AZURE_FUNCTIONS_ENVIRONMENT", + "AWS_LAMBDA_FUNCTION_NAME", +) + + +RuntimeMode = Literal["long_running", "ephemeral"] + + +def _detect_runtime_mode(env: Mapping[str, str] | None = None) -> tuple[RuntimeMode, str | None]: + """Inspect deployment markers and return ``(mode, matched_marker_or_None)``. + + Pure / side-effect-free so the host can call it once at construction + and tests can pass a synthetic env. ``env`` defaults to + :data:`os.environ`. Returns ``"long_running"`` when nothing matches — + that's the sensible default for local dev and always-on container + deployments. + """ + source = env if env is not None else os.environ + for marker in _EPHEMERAL_RUNTIME_MARKERS: + if source.get(marker): + return ("ephemeral", marker) + return ("long_running", None) + + +# Internal name the host uses when registering the push handler on the +# durable task runner. Exposed as a module constant so adapter packages +# (and the future background-run wiring under req #14) can use the same +# name for cross-runner observability. 
+HOSTING_PUSH_TASK_NAME = "hosting.push" + + +def _flatten_allowlists(allowlist: IdentityAllowlist) -> tuple[IdentityAllowlist, ...]: + """Walk an allowlist tree to expose nested :class:`IdentityAllowlist` instances. + + Used by :meth:`AgentFrameworkHost._validate_channel_authorization` + to inspect every leaf so type-checks like + ``NativeIdAllowlist(channel=)`` can be detected even + when buried inside :class:`AnyOfAllowlists` / :class:`AllOfAllowlists`. + """ + children = getattr(allowlist, "_children", None) + if children: + flat: list[IdentityAllowlist] = [allowlist] + for child in children: + flat.extend(_flatten_allowlists(child)) + return tuple(flat) + return (allowlist,) + + +def _checkpoint_path_for_isolation_key(root: Path, isolation_key: str) -> Path: + r"""Return ``root / isolation_key`` after rejecting path-traversal patterns. + + Isolation keys are intentionally caller-controlled: they originate from + inbound HTTP headers (``x-agent-{user,chat}-isolation-key`` injected by + the Foundry runtime), from channel-supplied derivations such as + ``telegram:`` / ``entra:``, or from a channel ``run_hook`` + that may read body fields. Joining such a value into a filesystem path + without validation is CWE-22: a value such as ``../../../etc/foo`` or + ``\\foo`` (Windows UNC) would let the resulting checkpoint directory + escape the configured root. + + The check intentionally uses a denylist so legitimate namespaced keys + (``telegram:42``, ``entra:abc-def``) are preserved as-is. Rejected: + + * any key containing ``/``, ``\\``, or NUL; + * keys that reduce to empty after stripping dots (``.``, ``..``, ``...``, + ...); + * absolute paths (``os.path.isabs``); + * keys carrying a drive letter prefix (``os.path.splitdrive`` — catches + Windows ``C:/...`` and single-letter ``X:foo`` constructs that + ``Path("/root") / "X:foo"`` would otherwise interpret as drive-rooted). 
+ + After joining, both ``root`` and the resolved target are normalised and + the target is verified to stay under the resolved root as defence in + depth — if the denylist ever misses a pattern, this final check still + refuses the join. + + Raises: + ValueError: If ``isolation_key`` is not a non-empty string or fails + any of the validation steps above. + """ + if not isinstance(isolation_key, str) or not isolation_key: + raise ValueError("isolation_key must be a non-empty string") + if ( + "/" in isolation_key + or "\\" in isolation_key + or "\x00" in isolation_key + or isolation_key.strip(".") == "" + or os.path.isabs(isolation_key) + or os.path.splitdrive(isolation_key)[0] + # ``splitdrive`` only recognises drive letters on Windows; reject + # the ``X:rest`` pattern explicitly so a payload crafted on a + # POSIX host still fails closed if the resulting directory ever + # round-trips to Windows storage. + or (len(isolation_key) >= 2 and isolation_key[0].isalpha() and isolation_key[1] == ":") + ): + raise ValueError(f"Invalid isolation_key for checkpoint path: {isolation_key!r}") + + root_resolved = root.resolve() + target = (root_resolved / isolation_key).resolve() + if not target.is_relative_to(root_resolved): + raise ValueError(f"Invalid isolation_key for checkpoint path: {isolation_key!r}") + return target + + +def _workflow_output_to_text(value: Any) -> str: + """Render a single workflow ``output`` payload as plain text. + + Used by the streaming path (``_workflow_event_to_update``) when an + executor emits an arbitrary Python object that the host then has to + serialise into an :class:`AgentResponseUpdate` content for the SSE + stream. ``AgentResponse`` and ``AgentResponseUpdate`` carry text + natively; everything else is best-effort ``str()``. 
+ """ + text = getattr(value, "text", None) + if isinstance(text, str): + return text + return str(value) + + +def _workflow_event_to_update(event: WorkflowEvent[Any]) -> AgentResponseUpdate | None: + """Map a :class:`WorkflowEvent` to a channel-friendly :class:`AgentResponseUpdate`. + + Returns ``None`` for events the host should drop (anything that is not + user-visible output). The original event is preserved on the update's + ``raw_representation`` so consumers can recover full workflow context. + """ + if event.type != "output": + return None + payload: Any = event.data + if isinstance(payload, AgentResponseUpdate): + # Already a streaming update — pass through but tag the source so + # downstream hooks can tell it came from a workflow executor. + if payload.raw_representation is None: + payload.raw_representation = event + return payload + if isinstance(payload, Content): + # Preserve the original content (image, function call, audio, …) + # rather than stringifying — the host stays modality-agnostic + # and lets each destination channel decide what it can render. + return AgentResponseUpdate( + contents=[payload], + role="assistant", + author_name=event.executor_id, + raw_representation=event, + ) + text = _workflow_output_to_text(payload) + return AgentResponseUpdate( + contents=[Content.from_text(text=text)], + role="assistant", + author_name=event.executor_id, + raw_representation=event, + ) + + +@asynccontextmanager +async def _suppress_already_consumed() -> AsyncIterator[None]: # noqa: RUF029 + """Yield, swallowing finalizer failures so consumer cleanup never crashes the host. + + The bridge stream calls ``get_final_response()`` after iterating the + workflow stream so the workflow's cleanup hooks run; on some paths the + stream considers itself already finalized (or its inner stream was + closed by ``__anext__`` auto-finalization) and the finalizer raises. 
+ We are inside an async-generator ``finally`` block during teardown, + so we MUST NOT propagate — that would mask the iteration's real + result and cascade into the channel's own cleanup. We always log + with ``exc_info=True`` so the swallowed failure is observable in + operator logs (a regression in the workflow's own cleanup hooks + would otherwise vanish into a clean run). + """ + try: + yield + except RuntimeError as exc: + # Narrow match: only the two documented benign messages produced + # by ``ResponseStream`` / async-iteration teardown should be + # swallowed. Anything else (executor-side ``RuntimeError`` from a + # ``raise RuntimeError(...)`` in user code, runner-context state + # error, checkpoint-store ``RuntimeError`` during the post-run + # flush, …) is a real bug and is escalated to the unexpected-error + # branch so it's logged with a full stack trace at ERROR. We + # still don't propagate (we're in an async-generator ``finally`` + # during teardown) — see the docstring. + message = str(exc) + if "Inner stream not available" in message or "Event loop is closed" in message: + logger.warning("workflow stream finalize raised RuntimeError; cleanup skipped", exc_info=True) + else: + logger.exception("workflow stream finalize raised an unexpected RuntimeError; cleanup skipped") + except Exception: + # Anything else (checkpoint write failure, context-provider + # error in a cleanup hook, executor-side bug, …) is a real + # problem. ``logger.exception`` includes the traceback and + # routes at ERROR so it's grep-able in production. We still + # don't propagate — see the docstring. + logger.exception("workflow stream finalize raised an unexpected error; cleanup skipped") + + +class _BoundResponseStream: + """Adapter that keeps an :class:`ExitStack` open across stream iteration. + + Streaming runs return a :class:`ResponseStream` synchronously, but + consumption happens later (the channel iterates). For host-bound + request context (e.g. 
Foundry response-id binding) to survive that + gap, we hold the stack open until the underlying stream is exhausted + or :meth:`aclose` is called. We forward awaitable + async-iterator + + ``get_final_response`` semantics so the channel sees a normal + ``ResponseStream``-shaped object. + + Lifecycle: + + * Async iteration (``async for u in stream``) — the stack is closed + in the iterator's ``finally`` after the inner stream is drained. + * ``await stream`` — convenience for ``await get_final_response()``; + the stack is closed when ``get_final_response`` runs because that + path also routes through :meth:`_close`. + * ``await stream.get_final_response()`` — closes the stack in + ``finally``. + * Manual cleanup — call :meth:`aclose` (idempotent). Safe to call + from a ``finally`` even after iteration / ``get_final_response`` + already closed the stack. + """ + + def __init__(self, inner: Any, stack: ExitStack) -> None: + self._inner = inner + self._stack = stack + self._closed = False + + def _close(self) -> None: + if self._closed: + return + self._closed = True + self._stack.close() + + async def aclose(self) -> None: + """Idempotently release the bound request context. + + Channels that abandon the stream without iterating it (e.g. + early-return on a validation failure) MUST call this in a + ``finally`` so the host-bound contextvars don't leak for the + lifetime of the host. Calling after the stack already closed + (via iteration / ``get_final_response``) is a no-op. + """ + self._close() + + def __await__(self) -> Any: + # Convenience: ``await stream`` ≡ ``await stream.get_final_response()``. + # We route through ``get_final_response`` so the stack closes in + # its ``finally`` block, instead of leaking the binding for the + # host's lifetime as the previous direct-await delegation did. 
+ return self.get_final_response().__await__() + + def __aiter__(self) -> AsyncIterator[Any]: + return self._wrap() + + async def _wrap(self) -> AsyncIterator[Any]: + try: + async for item in self._inner: + yield item + finally: + self._close() + + async def get_final_response(self) -> Any: + try: + return await self._inner.get_final_response() + finally: + self._close() + + def __getattr__(self, name: str) -> Any: + return getattr(self._inner, name) + + +class ChannelContext: + """Host-owned bridge that channels call to invoke the target.""" + + def __init__(self, host: AgentFrameworkHost) -> None: + """Bind the context to its owning :class:`AgentFrameworkHost`. + + The host instance is the source of truth for the target, registered + channels, identity stores, sessions, and lifecycle state. Channels + only ever receive a context; they never see the host directly. + """ + self._host = host + + @property + def target(self) -> SupportsAgentRun | Workflow: + """The hostable target the channel should invoke.""" + return self._host.target + + async def run(self, request: ChannelRequest) -> HostedRunResult[Any]: + """Invoke the target for ``request`` and return a channel-neutral result. + + For agent targets the return type narrows to + ``HostedRunResult[AgentResponse]``; for workflow targets to + ``HostedRunResult[WorkflowRunResult]``. The static return is left + as ``HostedRunResult[Any]`` because :class:`ChannelContext` is + agnostic to which target shape the host was constructed with; + channels narrow at the call site if they need it. + """ + return await self._host._invoke(request) # pyright: ignore[reportPrivateUsage] + + def run_stream(self, request: ChannelRequest) -> ResponseStream[AgentResponseUpdate, AgentResponse]: + """Invoke the target with ``stream=True`` and return the agent's ResponseStream. + + Channels iterate the stream directly (it acts like an AsyncGenerator) + and are responsible for delivering updates to their wire protocol. 
+ Apply per-channel ``transform_hook`` callables during iteration to + rewrite or drop individual updates before they hit the wire. + """ + return self._host._invoke_stream(request) # pyright: ignore[reportPrivateUsage] + + async def deliver_response( + self, + request: ChannelRequest, + payload: HostedRunResult[Any], + ) -> bool: + """Resolve ``request.response_target`` and push ``payload`` to each destination. + + Returns ``True`` when the originating channel should render the + agent reply on its own wire (i.e. the resolved target included + the originating channel — explicitly via + ``ResponseTarget.originating``, implicitly via + ``ResponseTarget.channels(["originating", ...])``, or as the + host's "every destination dropped, fall back to originating" + recovery path). Returns ``False`` when the reply is fanned out + purely to non-originating destinations (or + :data:`ResponseTarget.none` suppresses the reply entirely) — in + which case the originating channel typically responds with a + bare ack. + + Per-destination push outcomes (scheduled, retried, terminally + failed) live in the durable task runner's own log; this method + emits structured log entries for every resolution-time skip and + every schedule-time outage so operators have a single grep + anchor for "where did my reply go?". + """ + return await self._host._deliver_response(request, payload) # pyright: ignore[reportPrivateUsage] + + +class _FoundryIsolationASGIMiddleware: + """Lift the two well-known Foundry isolation headers into a contextvar. + + The Foundry Hosted Agents runtime injects + ``x-agent-{user,chat}-isolation-key`` on every inbound HTTP request. + Storage providers that need partition-aware writes (notably + :class:`FoundryHostedAgentHistoryProvider`) read those keys via + :func:`get_current_isolation_keys` to avoid every channel having to + parse Foundry-specific headers itself. We intentionally inspect + only HTTP scopes; lifespan/websocket scopes are forwarded + untouched. 
When neither header is present the contextvar stays at + its default ``None``, so local-dev requests behave as before. + """ + + def __init__(self, app: ASGIApp) -> None: + self.app = app + + async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None: + if scope["type"] != "http": + await self.app(scope, receive, send) + return + user_key: str | None = None + chat_key: str | None = None + for raw_name, raw_value in scope.get("headers") or (): + name = raw_name.decode("latin-1").lower() + if name == ISOLATION_HEADER_USER: + user_key = raw_value.decode("latin-1") or None + elif name == ISOLATION_HEADER_CHAT: + chat_key = raw_value.decode("latin-1") or None + if user_key is None and chat_key is None: + await self.app(scope, receive, send) + return + token = set_current_isolation_keys(IsolationKeys(user_key=user_key, chat_key=chat_key)) + try: + await self.app(scope, receive, send) + finally: + reset_current_isolation_keys(token) + + +class AgentFrameworkHost: + """Owns one Starlette app, one hostable target, and a sequence of channels.""" + + def __init__( + self, + target: SupportsAgentRun | Workflow, + *, + channels: Sequence[Channel], + debug: bool = False, + checkpoint_location: str | os.PathLike[str] | CheckpointStorage | None = None, + runtime_mode: RuntimeMode | None = None, + durable_task_runner: DurableTaskRunner | None = None, + allow_in_process_runner: bool = False, + default_allowlist: IdentityAllowlist | None = None, + identity_linker: IdentityLinker | None = None, + state_dir: str | os.PathLike[str] | HostStatePaths | Mapping[str, str | os.PathLike[str]] | None = None, + ) -> None: + """Create a host for ``target`` and its channels. + + Args: + target: The hostable target to invoke from channels — either a + ``SupportsAgentRun``-compatible agent or a ``Workflow``. The + host detects the kind and dispatches to the appropriate + execution seam (``agent.run(...)`` vs ``workflow.run(message=...)``). 
+ For workflow targets, channels (or their ``run_hook``) are + responsible for shaping ``ChannelRequest.input`` into the + workflow start executor's typed input. + + Keyword Args: + channels: The channels to expose. Each channel contributes routes + and commands that are mounted under ``channel.path`` (defaulting + to the channel name). + debug: Whether to enable Starlette's debug mode (stack traces in + responses, etc.) and per-channel debug logging. + checkpoint_location: When ``target`` is a :class:`Workflow`, the + location used to persist workflow checkpoints across requests. + Either a filesystem path (``str`` / ``PathLike``) — the host + creates a per-conversation + :class:`~agent_framework.FileCheckpointStorage` rooted at + ``checkpoint_location / `` — or a + :class:`~agent_framework.CheckpointStorage` instance the host + uses as-is (caller owns scoping). Per-request behaviour: + requests without ``ChannelRequest.session.isolation_key`` + are run without checkpointing. When set on a workflow that + already has its own checkpoint storage configured + (``WorkflowBuilder(checkpoint_storage=...)``), the host + refuses to start so ownership of checkpointing is + unambiguous. Ignored for ``SupportsAgentRun`` targets (a + warning is emitted). Takes precedence over + ``state_dir['checkpoints']`` (or the auto-derived + ``state_dir/checkpoints/`` subfolder); a warning surfaces + the double-configuration. + runtime_mode: Hint that drives the *defaults* for runtime-shape + dependent components (currently the durable task runner, + and — by extension — anything that wants to know whether + the process is expected to outlive a single request). + ``"long_running"`` (containers, OpenClaw-style always-on + deployments, local dev) → in-process / in-memory defaults. + ``"ephemeral"`` (Foundry Hosted Agents, Azure Functions, + AWS Lambda) → the host expects a durable runner to be + supplied via ``durable_task_runner`` and logs a warning + otherwise. 
``None`` (the default) auto-detects from + deployment environment markers (currently + ``FOUNDRY_HOSTING_ENVIRONMENT``, ``AZURE_FUNCTIONS_ENVIRONMENT``, + ``AWS_LAMBDA_FUNCTION_NAME``); falls back to + ``"long_running"``. + durable_task_runner: The runner used to dispatch + non-originating push fan-out. Defaults to a process-local + :class:`InProcessTaskRunner` (asyncio + bounded retry, no + persistence) — appropriate for ``runtime_mode="long_running"`` + deployments. Ephemeral deployments should pass a durable + adapter (e.g. ``agent-framework-hosting-durabletask``, + or a Foundry-native adapter once available) so scheduled + pushes survive process restarts. + allow_in_process_runner: Opt-in escape hatch that allows + ``runtime_mode="ephemeral"`` to be paired with the + default in-process runner. Without this flag, the host + refuses to start in ephemeral mode without an explicit + ``durable_task_runner`` because the failure mode — + non-originating pushes silently lost on process recycle — + is the worst class of production bug (works in light + testing, drops work under load / lifecycle events). + Useful for local dev that wants to exercise ephemeral + code paths without standing up a durable backend; **not** + appropriate for production. + default_allowlist: Host-level fallback applied to every + channel that leaves ``allowlist="inherit"``. ``None`` + (the default) means the channel is open unless it sets + its own ``allowlist``. Channels can opt out of the host + default by setting ``allowlist=None`` explicitly. + identity_linker: Optional :class:`IdentityLinker` used to + resolve channel-native identities into verified IdP-backed + identities, or to return a :class:`LinkChallenge` the + channel can render when the user still needs to sign in. + Channels with ``require_link=True`` require this to be + configured unless they provide their own native verified + claims. + state_dir: Opt-in disk persistence for host-managed state. 
+ When set, the host writes the in-process task runner's + pending queue and the session-related dicts + (``_session_aliases``, ``_active``, ``_identities``) to + a :mod:`diskcache`-backed store under ``state_dir`` and + replays the runner queue on next startup. When the + target is a :class:`Workflow`, the auto-derived + ``state_dir/checkpoints/`` subfolder (or the + ``checkpoints`` key of the mapping form) is also used + as the workflow checkpoint location (equivalent to + passing ``checkpoint_location`` directly). The + auto-derived ``state_dir/links/`` subfolder (or the + ``links`` key of the mapping form) is offered to + identity linkers that implement + :class:`SupportsLinkStorePath`. Accepts: + + * ``None`` (default) — everything stays in memory; the + process owns its state and loses it on exit. Matches + today's behaviour exactly. + * ``str`` / :class:`os.PathLike` — the host derives + default subpaths ``state_dir/runner/``, + ``state_dir/sessions/``, ``state_dir/links/``, and + (for workflow targets) ``state_dir/checkpoints/``. + Recommended for most + long-running-host deployments — one path, no extra + config, all components persist together. Note: when + the target is a Workflow this enables workflow + checkpoint persistence; use the mapping form below + and omit ``checkpoints`` to opt out. + * :class:`HostStatePaths` typed dict / plain + ``Mapping`` — per-component overrides for callers that + want each component on a different volume (fast local + SSD for the runner, network-attached volume for + sessions, …). Components missing from the mapping + fall back to in-memory (or, for ``checkpoints``, to + no checkpoint persistence). Unknown keys raise + ``ValueError`` to surface typos early. 
+ + The ``runner`` and ``sessions`` components require the + optional ``diskcache`` dependency (install with + ``pip install 'agent-framework-hosting[disk]'``); + ``checkpoints`` uses the core + :class:`~agent_framework.FileCheckpointStorage` and has + no extra dependency. Each disk-cache-backed component + acquires an OS-level advisory lock on its directory; a + second host pointed at the same paths raises + :class:`RuntimeError` at construction so two processes + do not double-execute queued tasks. When + ``durable_task_runner`` is supplied explicitly, the + ``runner`` sub-path is ignored — the caller owns the + runner's persistence story. When ``checkpoint_location`` + is supplied explicitly, the ``checkpoints`` sub-path is + ignored. When an ``identity_linker`` does not implement + :class:`SupportsLinkStorePath`, the ``links`` sub-path is + ignored and the linker must be configured directly. + """ + self.target: SupportsAgentRun | Workflow = target + self._is_workflow = isinstance(target, Workflow) + self.channels = list(channels) + self._debug = debug + self._app: Starlette | None = None + # Disk persistence — normalise the per-component map up front so + # the runner, session-store, and checkpoint paths are resolved + # before any consumer (including ``checkpoint_location``) is + # built. ``None`` (default) means everything stays in memory. + self._state_paths: dict[str, Path | None] = normalize_state_dir(state_dir) + # Track whether the user passed the mapping form so we can + # distinguish "auto-derived from single path" (silent ignore for + # non-workflow targets) from "explicit mapping key" (warn for + # non-workflow targets, since that's almost certainly dead config). 
+ checkpoints_explicit_in_mapping = isinstance(state_dir, Mapping) and "checkpoints" in state_dir + links_explicit_in_mapping = isinstance(state_dir, Mapping) and "links" in state_dir + # Resolve the effective workflow checkpoint location: the + # explicit ``checkpoint_location`` argument wins; otherwise we + # fall back to ``state_dir['checkpoints']`` (single-path form + # auto-derives ``state_dir/checkpoints/``). + derived_checkpoint_path = self._state_paths.get("checkpoints") + self._checkpoint_location: Path | CheckpointStorage | None = None + effective_checkpoint_source: str | os.PathLike[str] | CheckpointStorage | None = checkpoint_location + if checkpoint_location is None and derived_checkpoint_path is not None: + # Only consume the derived path when the target is a + # Workflow; non-workflow targets get a warning (explicit + # mapping case) or a silent ignore (single-path case). + if self._is_workflow: + effective_checkpoint_source = derived_checkpoint_path + elif checkpoints_explicit_in_mapping: + logger.warning("state_dir['checkpoints'] is set but target is not a Workflow; ignoring.") + elif checkpoint_location is not None and derived_checkpoint_path is not None: + # Both the legacy parameter and the new state_dir component + # configure the same thing. Keep the explicit one and + # surface the double-config so the user notices the no-op. + logger.warning( + "Both checkpoint_location and state_dir['checkpoints'] are set " + "(state_dir['checkpoints']=%s); the explicit checkpoint_location " + "takes precedence and the state_dir sub-path is ignored. 
" + "Use the HostStatePaths mapping form and omit 'checkpoints' to " + "configure runner/sessions persistence without also enabling " + "host-managed workflow checkpointing.", + derived_checkpoint_path, + ) + if effective_checkpoint_source is not None: + if not self._is_workflow: + # Only the legacy parameter path can reach here for a + # non-workflow target (the derived path was already + # short-circuited above). Preserve the historical + # warning text so existing users see the same message. + logger.warning("checkpoint_location is set but target is not a Workflow; ignoring.") + else: + workflow: Workflow = target # type: ignore[assignment] + if workflow._runner_context.has_checkpointing(): # type: ignore[reportPrivateUsage] + raise RuntimeError( + "Workflow already has checkpoint storage configured " + "(WorkflowBuilder(checkpoint_storage=...)). The host " + "manages checkpoints when checkpoint_location (or " + "state_dir['checkpoints']) is set; remove one of the " + "two configurations." + ) + if isinstance(effective_checkpoint_source, (str, os.PathLike)): + self._checkpoint_location = Path(os.fspath(effective_checkpoint_source)) + else: + # Anything else is treated as a CheckpointStorage instance. + # ``CheckpointStorage`` is a non-runtime-checkable Protocol, + # so we cannot ``isinstance``-check it directly. + self._checkpoint_location = effective_checkpoint_source + # Runtime mode + durable task runner. We resolve mode first + # because the warning-on-ephemeral-without-runner only fires + # when both are at their defaults. 
+ if runtime_mode is None: + resolved_mode, matched_marker = _detect_runtime_mode() + self._runtime_mode: RuntimeMode = resolved_mode + self._runtime_mode_source: str = ( + f"auto-detected from {matched_marker}" if matched_marker is not None else "auto-detected default" + ) + else: + self._runtime_mode = runtime_mode + self._runtime_mode_source = "explicit" + if durable_task_runner is None: + if self._runtime_mode == "ephemeral" and not allow_in_process_runner: + raise RuntimeError( + "AgentFrameworkHost is running in ephemeral runtime mode " + f"({self._runtime_mode_source}) without a durable_task_runner. " + "Non-originating push deliveries would be lost on process " + "recycle. Pass `durable_task_runner=...` (e.g. an " + "agent-framework-hosting-durabletask runner) for production, " + "or set `allow_in_process_runner=True` to opt out of this " + "check (e.g. for local dev exercising ephemeral code paths)." + ) + # When state_dir["runner"] is set, the default in-process + # runner persists its queue to disk so a long-running host + # can replay in-flight pushes after a crash / restart. + self._durable_task_runner: DurableTaskRunner = InProcessTaskRunner( + state_dir=self._state_paths.get("runner"), + ) + self._owns_runner = True + if self._runtime_mode == "ephemeral": + logger.warning( + "AgentFrameworkHost is running in ephemeral runtime mode " + "with the default InProcessTaskRunner (allow_in_process_runner=True). " + "Non-originating push deliveries will be lost if the process is " + "recycled mid-flight — this configuration is intended for local dev only." + ) + else: + self._durable_task_runner = durable_task_runner + self._owns_runner = False + if self._state_paths.get("runner") is not None: + # The caller supplied both a runner and a runner state + # path. The path would only have applied to the default + # in-process runner; surface the misconfig so it doesn't + # silently become a no-op. 
+ logger.warning( + "state_dir['runner'] is set but a durable_task_runner was " + "supplied explicitly; the runner sub-path is ignored — " + "configure persistence on the runner instance directly." + ) + # Validate the runner / push-codec pairing eagerly: a JSON-mode + # durable runner cannot persist payloads for a push-capable + # channel that has no codec. Failing here makes the misconfig + # visible at process start rather than on first push. + self._validate_runner_codec_pairing() + # Register the internal push handler eagerly so it is available + # whether callers invoke ``_deliver_response`` directly (e.g. + # tests) or through the lifespan-managed ASGI app. Doing this + # in ``__init__`` is safe because runner handler registration + # has no I/O — it only associates a name with a callable. + self._durable_task_runner.register(HOSTING_PUSH_TASK_NAME, self._handle_push_task) + # Per-isolation_key session cache. The real spec backs this with a + # pluggable session store; this base host keeps it in-process. + # NOTE: live ``AgentSession`` objects are NOT persisted to disk + # — the history provider rehydrates them from its own store on + # the next turn. ``state_dir`` only persists the lightweight + # pickle-friendly bookkeeping below. + self._sessions: dict[str, Any] = {} + # Open the disk-backed sessions store first when persistence is + # on; the three persisted dicts share the same cache + lock to + # minimise file handles and acquisition cost. + sessions_path = self._state_paths.get("sessions") + self._sessions_store: SessionsStateStore | None + if sessions_path is not None: + self._sessions_store = SessionsStateStore(sessions_path) + # ``isolation_key -> active session_id``. Normally identical to the + # isolation_key, but ``reset_session`` rotates this to a fresh id so + # the next turn starts a new ``AgentSession`` while the old history + # remains on disk under its original session_id. Persisted so a + # rotation survives a restart. 
+ aliases_dict, active_dict, identities_dict = build_session_dicts(self._sessions_store) + self._session_aliases: dict[str, str] = aliases_dict + # (isolation_key -> last-seen channel name) for ResponseTarget.active. + self._active: dict[str, str] = active_dict + # Per-isolation_key identity registry: which channels we've seen this + # user on, and which native_id they used on each. Powers + # ResponseTarget.active / .channel(name) / .channels([...]) / + # .all_linked. + # Shape: { isolation_key: { channel_name: ChannelIdentity } }. + self._identities: dict[str, dict[str, ChannelIdentity]] = identities_dict + else: + self._sessions_store = None + self._session_aliases = {} + self._active = {} + self._identities = {} + # Set by ``serve()`` so the lifespan startup handler doesn't + # double-log the banner; remains ``False`` when callers mount + # ``host.app`` under their own ASGI server. + self._startup_logged: bool = False + # Authorization seam: allowlists, optional identity linker, and + # construction-time validation for fail-fast misconfigurations. 
+ self._default_allowlist: IdentityAllowlist | None = default_allowlist + self._identity_linker: IdentityLinker | None = identity_linker + self._configure_identity_linker_state( + self._state_paths.get("links"), + explicit=links_explicit_in_mapping, + ) + self._validate_channel_authorization() + + @property + def app(self) -> Starlette: + """Lazily build (and cache) the Starlette application.""" + if self._app is None: + self._app = self._build_app() + return self._app + + def _configure_identity_linker_state(self, links_path: Path | None, *, explicit: bool) -> None: + """Offer the derived ``state_dir['links']`` path to compatible linkers.""" + if links_path is None: + return + linker = self._identity_linker + if linker is None: + if explicit: + logger.warning("state_dir['links'] is set but no identity_linker is configured; ignoring.") + return + if isinstance(linker, SupportsLinkStorePath): + linker.configure_link_store_path(links_path) + return + logger.warning( + "state_dir['links'] is set but the configured identity_linker does not implement " + "SupportsLinkStorePath; configure link-store persistence on the linker directly." + ) + + def _validate_runner_codec_pairing(self) -> None: + """Refuse to start when a JSON-mode runner is paired with codec-less push channels. + + A JSON-mode durable runner (``payload_mode=JSON``) persists every + scheduled task's payload so it survives process restarts. The + host's ``hosting.push`` payload includes a + :class:`HostedRunResult` containing the full agent / workflow + output, which cannot be JSON-serialised without help from the + destination channel. Push-capable channels therefore must + declare a :class:`ChannelPushCodec` (a duck-typed + ``push_codec`` attribute on the channel) when paired with a + JSON-mode runner. + + Object-mode runners (the default in-process runner) accept live + Python references and skip this check. 
+ """ + mode = getattr(self._durable_task_runner, "payload_mode", DurableTaskPayloadMode.OBJECT) + if mode != DurableTaskPayloadMode.JSON: + return + missing: list[str] = [] + for channel in self.channels: + if not isinstance(channel, ChannelPush): + # Channels that don't implement push are never scheduled, + # so a missing codec is fine. + continue + codec = getattr(channel, "push_codec", None) + if codec is None: + missing.append(channel.name) + if missing: + raise RuntimeError( + "Durable task runner declares payload_mode=JSON, but the following " + "push-capable channels have no `push_codec` attribute and cannot " + "be serialised for persistence: " + f"{', '.join(missing)}. Add a ChannelPushCodec to each channel " + "or switch to an object-mode runner (e.g. InProcessTaskRunner)." + ) + + def _resolve_channel_allowlist(self, channel: Channel) -> IdentityAllowlist | None: + """Apply the ``"inherit"`` / ``None`` / explicit semantics. + + - ``"inherit"`` (default) → host's ``default_allowlist``. + - ``None`` → explicitly open (carve-out inside a locked host). + - any other value → use as-is. + """ + raw: Any = getattr(channel, "allowlist", "inherit") + if raw == "inherit": + return self._default_allowlist + # ``None`` and concrete allowlists both pass through unchanged; + # the caller (``authorize``) treats ``None`` as "open". + return cast("IdentityAllowlist | None", raw) + + def _validate_channel_authorization(self) -> None: + """Reject configurations that would silently deny every user. + + Runs three rules (see spec § "Configuration validation"): + + 1. If a channel's resolved allowlist declares + ``requires_linked_claims=True``, the channel must either set + ``require_link=True`` or declare + ``emits_verified_claims=True`` — otherwise no verified + claims will ever reach :meth:`evaluate` and the allowlist + would always ``ABSTAIN`` / ``DENY``. + 2. If any channel has ``require_link=True``, an + ``identity_linker`` must be configured. 
Silent + deny-everyone is the worst possible default. + 3. ``NativeIdAllowlist(channel=)`` must reference a + channel name that exists on this host — typo-detection. + """ + known_channels = {c.name for c in self.channels} + for channel in self.channels: + allowlist = self._resolve_channel_allowlist(channel) + require_link = bool(getattr(channel, "require_link", False)) + emits_claims = bool(getattr(channel, "emits_verified_claims", False)) + # Rule #2: require_link without a linker. + if require_link and self._identity_linker is None: + raise ChannelConfigurationError( + f"Channel '{channel.name}' has require_link=True but no " + "identity_linker is configured on the host. Configure one or " + "remove require_link=True (silent deny-everyone is rejected)." + ) + if allowlist is None: + continue + # Rule #1: claim-dependent allowlist needs a claim source. + if getattr(allowlist, "requires_linked_claims", False) and not (require_link or emits_claims): + raise ChannelConfigurationError( + f"Channel '{channel.name}' has an allowlist that requires " + "verified IdP claims (requires_linked_claims=True) but the " + "channel neither sets require_link=True nor emits verified " + "claims natively. Configure a source of verified claims for " + "the allowlist (silent deny-everyone is rejected)." + ) + # Rule #3: native-id allowlists pointing at unknown channels. + for nested in _flatten_allowlists(allowlist): + target = getattr(nested, "channel", None) + if target is not None and target not in known_channels: + raise ChannelConfigurationError( + f"NativeIdAllowlist on channel '{channel.name}' references " + f"unknown channel '{target}'. Known channels: " + f"{sorted(known_channels)}." 
+ ) + + async def authorize( + self, + identity: ChannelIdentity, + *, + require_link: bool = False, + allowlist: IdentityAllowlist | None = None, + verified_claims: Mapping[str, ClaimValue] | None = None, + ) -> AuthorizationOutcome: + """Evaluate authorization for ``identity`` against ``allowlist``. + + Channels should call this **before** producing a + :class:`ChannelRequest` so a denied identity never reaches the + agent. The host's run path also re-checks authorization for + defense-in-depth, but channels that surface :class:`Denied` or + :class:`LinkRequired` themselves can render the outcome + through their native UX (refusal message, link challenge) + rather than a generic error. + + Supports open, native-id allowlist, and verified-claim allowlist + profiles. ``require_link=True`` or claim-based allowlists use + the configured :class:`IdentityLinker`; channels that natively + authenticate users may pass ``verified_claims`` directly. + + Returns: + One of :class:`Allowed`, :class:`LinkRequired`, or + :class:`Denied`. + """ + claims: Mapping[str, ClaimValue] = verified_claims or {} + claim_source: Literal["linker", "channel", "none"] = "channel" if claims else "none" + auto_isolation_key = self._auto_issue_isolation_key(identity) + if allowlist is None: + # Open profile (or explicitly carved-out channel). 
+ if require_link: + return await self._resolve_required_link(identity) + return Allowed(isolation_key=auto_isolation_key, verified_claims=claims, claim_source=claim_source) + pre_context = AuthorizationContext( + identity=identity, + phase="pre_link", + isolation_key=None, + verified_claims=claims, + claim_source=claim_source, + ) + decision = await allowlist.evaluate(pre_context) + if decision is AllowlistDecision.ALLOW: + if require_link: + return await self._resolve_required_link(identity) + return Allowed(isolation_key=auto_isolation_key, verified_claims=claims, claim_source=claim_source) + if decision is AllowlistDecision.DENY: + return Denied( + reason_code="allowlist_denied_pre_link", + user_message="You don't have access to this bot.", + log_details={ + "channel": identity.channel, + "phase": "pre_link", + }, + ) + # ABSTAIN: claim-dependent allowlists need a post-link / + # verified-claim evaluation. Non-claim allowlists can fall + # through to the open path, while still honoring require_link. 
+ if getattr(allowlist, "requires_linked_claims", False): + if claims: + post_context = AuthorizationContext( + identity=identity, + phase="post_link", + isolation_key=auto_isolation_key, + verified_claims=claims, + claim_source="channel", + ) + post_decision = await allowlist.evaluate(post_context) + return self._authorization_outcome_from_post_link( + identity=identity, + isolation_key=auto_isolation_key, + claims=claims, + claim_source="channel", + decision=post_decision, + ) + return await self._resolve_and_evaluate_claim_allowlist(identity, allowlist) + if require_link: + return await self._resolve_required_link(identity) + return Allowed(isolation_key=auto_isolation_key, verified_claims=claims, claim_source=claim_source) + + async def _resolve_required_link(self, identity: ChannelIdentity) -> AuthorizationOutcome: + """Resolve ``identity`` through the configured linker or request linking.""" + linker = self._identity_linker + if linker is None: + # Defensive: the construction-time validator should catch this. 
+ return Denied( + reason_code="link_required_without_linker", + user_message="Sign-in is not configured for this bot.", + log_details={"channel": identity.channel}, + ) + resolution = await linker.resolve(identity) + if isinstance(resolution, LinkChallenge): + return LinkRequired(challenge=resolution) + return Allowed( + isolation_key=resolution.isolation_key, + verified_claims=resolution.verified_claims, + claim_source=resolution.claim_source, + ) + + async def _resolve_and_evaluate_claim_allowlist( + self, + identity: ChannelIdentity, + allowlist: IdentityAllowlist, + ) -> AuthorizationOutcome: + """Resolve identity, then run a claim-dependent allowlist post-link.""" + linker = self._identity_linker + if linker is None: + return Denied( + reason_code="allowlist_requires_link", + user_message="Please link your account to continue.", + log_details={"channel": identity.channel, "phase": "pre_link"}, + ) + resolution = await linker.resolve(identity) + if isinstance(resolution, LinkChallenge): + return LinkRequired(challenge=resolution) + post_context = AuthorizationContext( + identity=identity, + phase="post_link", + isolation_key=resolution.isolation_key, + verified_claims=resolution.verified_claims, + claim_source=resolution.claim_source, + ) + post_decision = await allowlist.evaluate(post_context) + return self._authorization_outcome_from_post_link( + identity=identity, + isolation_key=resolution.isolation_key, + claims=resolution.verified_claims, + claim_source=resolution.claim_source, + decision=post_decision, + ) + + def _authorization_outcome_from_post_link( + self, + *, + identity: ChannelIdentity, + isolation_key: str, + claims: Mapping[str, ClaimValue], + claim_source: Literal["linker", "channel"], + decision: AllowlistDecision, + ) -> AuthorizationOutcome: + """Convert a post-link allowlist decision to a host outcome.""" + if decision is AllowlistDecision.ALLOW: + return Allowed(isolation_key=isolation_key, verified_claims=claims, 
claim_source=claim_source) + if decision is AllowlistDecision.DENY: + return Denied( + reason_code="allowlist_denied_post_link", + user_message="You don't have access to this bot.", + log_details={ + "channel": identity.channel, + "phase": "post_link", + "claim_source": claim_source, + }, + ) + return Denied( + reason_code="allowlist_abstained_post_link", + user_message="You don't have access to this bot.", + log_details={ + "channel": identity.channel, + "phase": "post_link", + "claim_source": claim_source, + }, + ) + + def _auto_issue_isolation_key(self, identity: ChannelIdentity) -> str: + """Auto-issue a stable isolation key for ``identity``. + + Returns the existing key when ``(channel, native_id)`` has + already been seen, or coins ``"<channel>:<native_id>"`` on + first contact. Configured :class:`IdentityLinker` instances can + return provider-backed isolation keys for flows that require + verified identity. + """ + # Look for an existing isolation_key that has already linked + # this (channel, native_id). Linear scan is fine for the + # in-process registry. Linker implementations can use their own + # indexed stores for provider-backed identities. + for isolation_key, by_channel in self._identities.items(): + existing = by_channel.get(identity.channel) + if existing is not None and existing.native_id == identity.native_id: + return isolation_key + # First contact — coin a deterministic key. + return f"{identity.channel}:{identity.native_id}" + + @property + def default_allowlist(self) -> IdentityAllowlist | None: + """Host-level fallback allowlist applied to channels with ``allowlist="inherit"``.""" + return self._default_allowlist + + @property + def runtime_mode(self) -> RuntimeMode: + """The resolved runtime mode for this host. + + Either ``"long_running"`` or ``"ephemeral"``. Resolved at + construction from the ``runtime_mode`` constructor argument or + — when unset — auto-detected from deployment environment + markers; see :func:`_detect_runtime_mode`.
Advisory: the value + drives the *defaults* selected for runtime-shape-dependent + components (today, the durable task runner) and is logged at + startup for operator visibility. + """ + return self._runtime_mode + + @property + def durable_task_runner(self) -> DurableTaskRunner: + """The durable task runner used to dispatch non-originating pushes. + + Defaults to a process-local :class:`InProcessTaskRunner` when no + runner was supplied at construction. Adapter packages may + replace this with a durable backend (e.g. Foundry-native + scheduling, ``agent-framework-hosting-durabletask``); the host + itself only relies on the :class:`DurableTaskRunner` Protocol + surface so any conforming implementation is usable. + """ + return self._durable_task_runner + + def serve( + self, + *, + host: str = "127.0.0.1", + port: int = 8000, + workers: int = 1, + **config_kwargs: Any, + ) -> None: + """Start the host on ``host:port`` using Hypercorn. + + Hypercorn is the same ASGI server the Foundry Hosted Agents + runtime uses for production deployments, so running locally with + the same server keeps dev/prod parity (Trio fallbacks, lifespan + semantics, HTTP/2 support, …). Install with the ``serve`` extra + (``pip install agent-framework-hosting[serve]``). + + Args: + host: Interface to bind. Defaults to ``127.0.0.1``. + port: TCP port to bind. Defaults to ``8000``. + workers: Number of worker processes. Defaults to ``1``; + Hypercorn's process model only kicks in for ``>1``. + **config_kwargs: Forwarded to :class:`hypercorn.config.Config` + via attribute assignment, so any documented Hypercorn + config field (e.g. ``keep_alive_timeout=...``, + ``access_log_format=...``) can be set directly. 
+ """ + try: + from hypercorn.asyncio import ( # pyright: ignore[reportMissingImports] + serve as _hypercorn_serve, # pyright: ignore[reportUnknownVariableType] + ) + from hypercorn.config import Config # pyright: ignore[reportMissingImports, reportUnknownVariableType] + except ImportError as exc: # pragma: no cover - exercised at runtime + raise RuntimeError( + "AgentFrameworkHost.serve() requires hypercorn. " + "Install with `pip install agent-framework-hosting[serve]` or `pip install hypercorn`." + ) from exc + + config = Config() # pyright: ignore[reportUnknownVariableType] + config.bind = [f"{host}:{port}"] # pyright: ignore[reportUnknownMemberType] + config.workers = workers # pyright: ignore[reportUnknownMemberType] + for key, value in config_kwargs.items(): + setattr(config, key, value) # pyright: ignore[reportUnknownArgumentType] + + # Touch ``self.app`` so the lifespan startup log fires once before + # we hand off to hypercorn — gives a single, readable banner of + # what the host is exposing without requiring channels to log + # individually. + app = self.app + self._log_startup(host=host, port=port, workers=workers) + # Mark as already logged so the lifespan startup handler does not + # double-log the same banner. + self._startup_logged = True + + # ``hypercorn.asyncio.serve`` has a complex partially-typed signature + # (multiple ASGI/WSGI app overloads) and its ``Scope`` definition + # diverges from Starlette's; cast both sides to ``Any`` to keep the + # call site readable without sprinkling per-error suppressions. + serve_callable = cast(Any, _hypercorn_serve) + asyncio.run(serve_callable(app, config)) + + def reset_session(self, isolation_key: str) -> None: + """Rotate ``isolation_key`` to a fresh session id without deleting history. + + Old turns are preserved on disk under their original session id and + remain accessible by passing that id explicitly (e.g. as + ``previous_response_id``). 
Future requests using ``isolation_key`` + get a new, empty ``AgentSession``. + """ + new_id = f"{isolation_key}#{uuid.uuid4().hex[:8]}" + self._session_aliases[isolation_key] = new_id + self._sessions.pop(isolation_key, None) + + # -- internals --------------------------------------------------------- # + + def _log_startup( + self, + *, + host: str | None = None, + port: int | None = None, + workers: int | None = None, + ) -> None: + """Emit a single human-friendly startup banner. + + Mirrors the ``AgentServerHost`` convention from + ``azure.ai.agentserver.core``: one INFO line that captures the + target type, every channel + its mount path, the bind address + (when known), whether we're running inside a Foundry Hosted + Agents container, and the worker count. Keeps log noise low + while still giving an operator a single grep-able anchor when + triaging. + + Called from both :meth:`serve` (which knows the bind triple) + and the ASGI lifespan ``startup`` phase (which does not — the + host may be embedded under any caller-managed ASGI server). + Bind fields are omitted from the log line when unknown so + operators can still spot the runtime-mode banner under + externally-managed servers. 
+ """ + target_kind = "Workflow" if isinstance(self.target, Workflow) else type(self.target).__name__ + target_name = getattr(self.target, "name", None) or target_kind + channels_repr = ", ".join( + f"{ch.name}@{ch.path or '/'}" # blank path means "mounted at root" + for ch in self.channels + ) + is_hosted = bool(os.environ.get("FOUNDRY_HOSTING_ENVIRONMENT")) + bind = f"{host}:{port}" if host is not None and port is not None else "" + logger.info( + "AgentFrameworkHost starting: target=%s (%s) bind=%s workers=%s hosted=%s " + "runtime_mode=%s (%s) runner=%s channels=[%s]", + target_name, + target_kind, + bind, + workers if workers is not None else "", + is_hosted, + self._runtime_mode, + self._runtime_mode_source, + type(self._durable_task_runner).__name__, + channels_repr or "", + ) + + def _build_app(self) -> Starlette: + context = ChannelContext(self) + routes: list[BaseRoute] = [] + on_startup: list[Callable[[], Awaitable[None]]] = [] + on_shutdown: list[Callable[[], Awaitable[None]]] = [] + + # ``/readiness`` is the standard probe path the Foundry Hosted Agents + # runtime hits to gate traffic. We expose it unconditionally — once the + # ASGI app is up the host considers itself ready (channels register + # their own startup hooks and may run before the first request, but + # readiness is intentionally cheap so the platform's probe never times + # out on transient channel work). Mounted first so a channel cannot + # accidentally shadow it. + async def _readiness(_request: Request) -> PlainTextResponse: # noqa: RUF029 + """Liveness/readiness probe handler used by Foundry Hosted Agents.""" + return PlainTextResponse("ok") + + routes.append(Route("/readiness", _readiness, methods=["GET"])) + + for channel in self.channels: + contribution = channel.contribute(context) + # Channels publish routes relative to their root; mount under channel.path. + # An empty path means "mount at the app root" — useful for single-channel hosts + # that don't want a prefix (e.g. 
ResponsesChannel exposing POST /responses directly). + if contribution.routes: + if channel.path: + routes.append(Mount(channel.path, routes=list(contribution.routes))) + else: + routes.extend(contribution.routes) + on_startup.extend(contribution.on_startup) + on_shutdown.extend(contribution.on_shutdown) + + @asynccontextmanager + async def lifespan(_app: Starlette) -> AsyncIterator[None]: + # Emit the startup banner once. ``serve()`` may have already + # logged it (it logs eagerly so the banner appears before + # control passes to hypercorn); the lifespan still logs it + # for callers that mount ``host.app`` directly under their + # own ASGI server — that path otherwise wouldn't get a + # runtime-mode banner at all. + if not self._startup_logged: + self._log_startup() + self._startup_logged = True + # Run every startup callback; collect (don't propagate) so + # one bad channel doesn't leave its peers half-initialised + # AND deny us a chance to pair-up shutdown calls. After all + # callbacks have been attempted, raise the FIRST error so + # Starlette / the ASGI server still aborts boot — and log + # every other failure so operators can see them all in one + # log scrape rather than discovering them turn-by-turn. + # (The hosting.push handler is registered eagerly in + # ``__init__`` rather than here, so ``_deliver_response`` + # can be called without first entering the lifespan — e.g. + # in tests, or by callers driving the host without an ASGI + # server.) + startup_errors: list[tuple[str, BaseException]] = [] + # Replay any persisted pending tasks first so re-scheduled + # work runs alongside fresh traffic from the moment the + # host accepts requests. Only meaningful for the host-owned + # in-process runner with disk persistence on; caller-owned + # runners manage their own replay lifecycle. 
+ if ( + self._owns_runner + and isinstance(self._durable_task_runner, InProcessTaskRunner) + and self._state_paths.get("runner") is not None + ): + try: + await self._durable_task_runner.resume() + except Exception as exc: + logger.exception("lifespan startup: durable task runner resume failed") + startup_errors.append(("InProcessTaskRunner.resume", exc)) + for cb in on_startup: + try: + await cb() + except Exception as exc: + name = getattr(cb, "__qualname__", repr(cb)) + logger.exception("lifespan startup: callback %s failed", name) + startup_errors.append((name, exc)) + if startup_errors: + _, first_exc = startup_errors[0] + if len(startup_errors) > 1: + logger.error( + "lifespan startup: %d callback(s) failed; first error re-raised, " + "remaining failures already logged above (%s)", + len(startup_errors), + ", ".join(n for n, _ in startup_errors[1:]), + ) + raise first_exc + try: + yield + finally: + # Same shape on the shutdown side: walk every callback + # so a bad one can't leave its peers leaking + # tasks/sockets/sessions, then raise the first if any + # failed so the server's exit code reflects the failure. + shutdown_errors: list[tuple[str, BaseException]] = [] + for cb in on_shutdown: + try: + await cb() + except Exception as exc: + name = getattr(cb, "__qualname__", repr(cb)) + logger.exception("lifespan shutdown: callback %s failed", name) + shutdown_errors.append((name, exc)) + # Drain the host-owned runner after channel shutdowns — + # channels may legitimately schedule a final push while + # tearing down (e.g. a goodbye message), and we want + # those tasks to get a chance to complete before we + # cancel pending work. For caller-supplied runners we + # leave lifecycle to the caller. 
+ if self._owns_runner and isinstance(self._durable_task_runner, InProcessTaskRunner): + try: + await self._durable_task_runner.shutdown(timeout=5.0) + except Exception as exc: # pragma: no cover - defensive + logger.exception("lifespan shutdown: durable task runner shutdown failed") + shutdown_errors.append(("InProcessTaskRunner.shutdown", exc)) + # Close the persisted sessions store after the runner so + # any in-flight task that touches session state during + # shutdown can still write through. + if self._sessions_store is not None: + try: + self._sessions_store.close() + except Exception as exc: # pragma: no cover - defensive + logger.exception("lifespan shutdown: sessions store close failed") + shutdown_errors.append(("SessionsStateStore.close", exc)) + if shutdown_errors: + _, first_exc = shutdown_errors[0] + if len(shutdown_errors) > 1: + logger.error( + "lifespan shutdown: %d callback(s) failed; first error re-raised, " + "remaining failures already logged above (%s)", + len(shutdown_errors), + ", ".join(n for n, _ in shutdown_errors[1:]), + ) + raise first_exc + + return Starlette( + debug=self._debug, + routes=routes, + lifespan=lifespan, + middleware=[Middleware(_FoundryIsolationASGIMiddleware)], + ) + + def _build_run_kwargs(self, request: ChannelRequest) -> dict[str, Any]: + # The full spec resolves a ChannelSession into an AgentSession here, + # honors session_mode, and consults LinkPolicy / ResponseTarget. This + # base host keys a per-isolation_key AgentSession off the channel's + # session hint so context providers (FileHistoryProvider, …) on the + # target see one session per end user. 
+ session = None + if request.session_mode != "disabled" and request.session is not None: + isolation_key = request.session.isolation_key + if isolation_key is not None and hasattr(self.target, "create_session"): + session_id = self._session_aliases.get(isolation_key, isolation_key) + session = self._sessions.get(isolation_key) + if session is None: + # Concurrency note: ``create_session`` is sync today, + # so the get/set window has no await point and CPython + # serialises us against other tasks. ``setdefault`` is + # the atomic primitive that keeps us safe even if a + # future ``create_session`` ever yields — both racers + # would see ``session is None``, both construct a new + # session, but only the first ``setdefault`` wins; the + # loser's just-built session is discarded (one + # transient orphan max per race window) instead of + # silently overwriting a peer-bound session that + # other in-flight requests are already using. + # ``create_session`` lives on agent-typed targets but not on + # ``Workflow``; the ``hasattr`` above guards the call site. + new_session = self.target.create_session( # pyright: ignore[reportAttributeAccessIssue, reportUnknownVariableType, reportUnknownMemberType] + session_id=session_id + ) + session = self._sessions.setdefault(isolation_key, new_session) # pyright: ignore[reportUnknownArgumentType] + + run_kwargs: dict[str, Any] = {} + if session is not None: + run_kwargs["session"] = session + if request.options: + run_kwargs["options"] = request.options + return run_kwargs + + def _log_incoming(self, request: ChannelRequest, *, stream: bool) -> None: + """Emit a structured INFO summary for every incoming target invocation. + + When ``debug=True`` is set on the host, also dump the channel-native + settings the channel attached to the ``ChannelRequest`` — ``options`` + (the ChatOptions-shaped fields the channel parsed from its protocol + payload, e.g. 
temperature/tools/tool_choice for Responses), plus + ``attributes`` / ``metadata`` (the channel's protocol-specific bag, + e.g. ``chat_id`` / ``callback_query_id`` for Telegram). + + Uses ``extra={...}`` so structured-logging consumers (the + Foundry hosted-agent log shipper, OpenTelemetry handlers, …) + can index per-field rather than re-parsing a template string. + """ + isolation_key = request.session.isolation_key if request.session is not None else None + logger.info( + "channel request", + extra={ + "channel": request.channel, + "operation": request.operation, + "stream": stream, + "session": isolation_key, + "session_mode": request.session_mode, + }, + ) + logger.debug( + "channel request details", + extra={ + "channel": request.channel, + "options": dict(request.options) if request.options else {}, + "attributes": dict(request.attributes) if request.attributes else {}, + "metadata": dict(request.metadata) if request.metadata else {}, + }, + ) + + def _bind_request_context(self, request: ChannelRequest) -> ExitStack: + """Bind any per-request anchors a target's context-providers expose. + + Channels announce per-request anchors (currently ``response_id`` + and ``previous_response_id``) via ``ChannelRequest.attributes``. + Some history providers — notably the Foundry hosted-agent history + provider — need to write storage under the same ``response_id`` + the channel surfaces on its envelope so the next turn's + ``previous_response_id`` walks the chain. Rather than the host + knowing about specific provider classes, we duck-type: any + context provider on the target that exposes a + ``bind_request_context(response_id=..., previous_response_id=..., + **_)`` context-manager gets it called with the request's + attribute values. Per-request platform isolation keys are handled + separately by :class:`_FoundryIsolationASGIMiddleware` (lifted + off the inbound headers into a contextvar) so providers don't + depend on channels to forward them. 
Bindings are scoped to the + returned :class:`ExitStack` which the caller must enter before + invoking the target and leave after the run completes. + """ + stack = ExitStack() + attrs = request.attributes or {} + response_id = attrs.get("response_id") + if not isinstance(response_id, str) or not response_id: + return stack + previous_response_id = attrs.get("previous_response_id") + if previous_response_id is not None and not isinstance(previous_response_id, str): + previous_response_id = None + + providers: Sequence[Any] = getattr(self.target, "context_providers", None) or () + + for provider in providers: + bind = getattr(provider, "bind_request_context", None) + if not callable(bind): + continue + stack.enter_context( + cast( + "AbstractContextManager[Any]", + bind( + response_id=response_id, + previous_response_id=previous_response_id, + ), + ) + ) + return stack + + async def _invoke(self, request: ChannelRequest) -> HostedRunResult[AgentResponse]: + self._log_incoming(request, stream=False) + self._record_identity(request) + if self._is_workflow: + # Workflow targets follow a separate path; the dedicated dispatch + # is parameterised on ``WorkflowRunResult`` so the static return + # type of ``_invoke`` itself stays the agent-shaped envelope. + return await self._invoke_workflow(request) # type: ignore[return-value] + run_kwargs = self._build_run_kwargs(request) + with self._bind_request_context(request): + # ``_is_workflow`` is False here so ``self.target`` is an + # ``Agent``-shaped target whose ``.run`` returns + # :class:`AgentResponse`. Narrow back to keep ``result.messages`` + # well-typed without conditional imports of ``Agent``. 
+ agent_target = cast("SupportsAgentRun", self.target) + result = await agent_target.run(self._wrap_input(request), **run_kwargs) + # Carry the full :class:`AgentResponse` as the typed envelope + # ``result`` so channels (and developer-supplied response hooks) + # can read ``messages``, ``value``, ``usage_details``, + # ``response_id`` … directly off the target output without the + # host pre-shaping any of it. The bound session (if any) is + # surfaced so channels that want to render session metadata + # don't have to re-resolve it. + return HostedRunResult(result, session=run_kwargs.get("session")) + + def _invoke_stream(self, request: ChannelRequest) -> ResponseStream[AgentResponseUpdate, AgentResponse]: + self._log_incoming(request, stream=True) + self._record_identity(request) + if self._is_workflow: + return self._invoke_workflow_stream(request) + run_kwargs = self._build_run_kwargs(request) + # ``run(stream=True)`` returns a ResponseStream synchronously (it is + # itself awaitable / async-iterable). We hand it back to the channel + # so the channel can drive iteration and apply its transform hook. + # Streaming flows iterate after this method returns, which is + # *outside* a sync ``with`` block — so we wrap the underlying + # stream in an adapter that holds the binding open across the + # iteration lifecycle. + binder = self._bind_request_context(request) + return _BoundResponseStream( # type: ignore[return-value] + self.target.run(self._wrap_input(request), stream=True, **run_kwargs), + binder, + ) + + def _resolve_checkpoint_storage(self, request: ChannelRequest) -> CheckpointStorage | None: + """Build (or return) the per-request checkpoint storage, or ``None``. + + Returns ``None`` when no ``checkpoint_location`` is configured or + when the request lacks a stable session key — without a key we + cannot scope checkpoints per conversation, and we'd rather skip + checkpointing than pollute a single shared store. 
+ + When ``checkpoint_location`` is a path, the per-conversation + directory is built via :func:`_checkpoint_path_for_isolation_key` + which rejects path-traversal patterns in ``isolation_key`` and + verifies the resolved directory stays under the configured root + (CWE-22 defence). Invalid keys cause the request to skip + checkpointing with a WARNING rather than escape the root or + crash the request. + """ + if self._checkpoint_location is None: + return None + if request.session is None or not request.session.isolation_key: + return None + if isinstance(self._checkpoint_location, Path): + try: + target = _checkpoint_path_for_isolation_key(self._checkpoint_location, request.session.isolation_key) + except ValueError as exc: + logger.warning( + "Skipping checkpoint storage for request: %s", + exc, + ) + return None + return FileCheckpointStorage(str(target)) + # Caller-supplied storage — used as-is; caller owns scoping. + return self._checkpoint_location + + async def _invoke_workflow(self, request: ChannelRequest) -> HostedRunResult[WorkflowRunResult]: + """Dispatch to ``Workflow.run`` and wrap the result in a typed envelope. + + The channel's ``run_hook`` is the canonical adapter for shaping + ``request.input`` into the workflow start executor's typed input + (free-form text from a Telegram message, structured ``Responses`` + ``input`` items, …). When no hook is wired, ``request.input`` is + forwarded verbatim — appropriate for workflows whose start executor + accepts the channel's native input type (commonly ``str``). + + When ``checkpoint_location`` is configured on the host, a + per-conversation checkpoint storage is resolved, the workflow is + restored from its latest checkpoint (if any) and then re-run with + the new input — mirroring the resume semantics of the Foundry + Responses host. 
+ + The full :class:`~agent_framework._workflows._workflow.WorkflowRunResult` + is carried unchanged on :attr:`HostedRunResult.result` so + destination channels can iterate :meth:`WorkflowRunResult.get_outputs`, + inspect :meth:`WorkflowRunResult.get_final_state`, or pull other + per-executor events themselves. The host intentionally does not + map outputs onto messages — channels (and developer-supplied + response hooks) own that projection because what counts as a + "renderable output" is wire-format-specific. + + Workflows do not own session state in the agent sense, so + ``HostedRunResult.session`` is ``None`` for workflow targets. + """ + # Workflows do not own session state in the agent sense and do not + # accept ``session=`` / ``options=`` kwargs. The channel's run_hook is + # the seam for any per-run customization; nothing flows through here. + workflow: Workflow = self.target # type: ignore[assignment] + storage = self._resolve_checkpoint_storage(request) + await self._restore_workflow_checkpoint(workflow, storage) + result = ( + await workflow.run(request.input, checkpoint_storage=storage) + if storage is not None + else await workflow.run(request.input) + ) + return HostedRunResult(result) + + @staticmethod + async def _restore_workflow_checkpoint( + workflow: Workflow, + storage: CheckpointStorage | None, + ) -> None: + """Rehydrate ``workflow`` from its latest checkpoint, if any. + + Shared between the blocking and streaming workflow paths so the + restore step stays in lockstep across both — both must observe + the same in-memory state when they apply the new input. + + If ``storage.get_latest`` returns ``None`` (no prior checkpoint + recorded) the call is a benign no-op. A non-``None`` checkpoint + whose stored events are empty (stale or partially-written + ``checkpoint_id``) is logged at WARNING so operators can detect + the silent-state-loss case without sifting through INFO logs. 
+ """ + if storage is None: + return + latest = await storage.get_latest(workflow_name=workflow.name) + if latest is None: + return + # The blocking restore call is a no-op invocation that just + # rehydrates state; the streaming path drains the same + # restoration stream below to achieve the same effect. + result = await workflow.run(checkpoint_id=latest.checkpoint_id, checkpoint_storage=storage) + events = getattr(result, "events", None) + if events is not None and not events: + logger.warning( + "workflow checkpoint restore produced zero events " + "(workflow=%s checkpoint_id=%s) — state may not be rehydrated", + workflow.name, + latest.checkpoint_id, + ) + + def _invoke_workflow_stream(self, request: ChannelRequest) -> ResponseStream[AgentResponseUpdate, AgentResponse]: + """Bridge ``Workflow.run(stream=True)`` to a channel-facing ``ResponseStream``. + + Wraps the workflow's ``ResponseStream[WorkflowEvent, WorkflowRunResult]`` + in a new ``ResponseStream[AgentResponseUpdate, AgentResponse]`` so + channels can iterate it identically to an agent stream and apply + their ``stream_transform_hook`` callables. + + Mapping rules: + + - ``output`` events whose ``data`` is already an + :class:`AgentResponseUpdate` (the common case for workflows + containing :class:`AgentExecutor`) pass through unchanged. + - ``output`` events with any other ``data`` are wrapped into a + single-text-content :class:`AgentResponseUpdate`. + - All other event types (``status``, ``executor_invoked``, + ``superstep_*``, lifecycle, …) are filtered out — channels only + care about user-visible text. Hooks can opt back in by inspecting + ``raw_representation`` on the produced updates. + + The original :class:`WorkflowEvent` is stashed on + ``AgentResponseUpdate.raw_representation`` so advanced consumers + (telemetry, debug UIs) can recover the full workflow timeline. 
+ + Checkpoint restoration (when ``checkpoint_location`` is set) runs + before the input stream is opened so the new turn observes the + restored state. + """ + workflow: Workflow = self.target # type: ignore[assignment] + storage = self._resolve_checkpoint_storage(request) + + async def _bridge() -> AsyncIterator[AgentResponseUpdate]: + # Same restore step the blocking path runs (see + # ``_restore_workflow_checkpoint``) — kept inside the bridge + # so the in-memory state is rehydrated lazily on first + # iteration rather than at stream-construction time. + await self._restore_workflow_checkpoint_streaming(workflow, storage) + workflow_stream = workflow.run(request.input, stream=True, checkpoint_storage=storage) + try: + async for event in workflow_stream: + update = _workflow_event_to_update(event) + if update is not None: + yield update + finally: + async with _suppress_already_consumed(): + await workflow_stream.get_final_response() + + async def _finalize(updates: Sequence[AgentResponseUpdate]) -> AgentResponse: # noqa: RUF029 + return AgentResponse.from_updates(updates) + + return ResponseStream[AgentResponseUpdate, AgentResponse](_bridge(), finalizer=_finalize) + + @staticmethod + async def _restore_workflow_checkpoint_streaming( + workflow: Workflow, + storage: CheckpointStorage | None, + ) -> None: + """Streaming-path counterpart to :meth:`_restore_workflow_checkpoint`. + + ``Workflow.run(stream=True, checkpoint_id=...)`` returns a stream + whose updates we don't care about — we just need the side-effect + of rehydration. Drained inline so the new-input run that follows + observes the restored state. + + A latest checkpoint that drains to zero events (stale or + partially-written ``checkpoint_id``) is logged at WARNING so + operators can detect the silent-state-loss case, mirroring the + blocking helper. 
+ """ + if storage is None: + return + latest = await storage.get_latest(workflow_name=workflow.name) + if latest is None: + return + drained = 0 + async for _ in workflow.run( + stream=True, + checkpoint_id=latest.checkpoint_id, + checkpoint_storage=storage, + ): + drained += 1 + if drained == 0: + logger.warning( + "workflow checkpoint restore stream produced zero events " + "(workflow=%s checkpoint_id=%s) — state may not be rehydrated", + workflow.name, + latest.checkpoint_id, + ) + + def _wrap_input(self, request: ChannelRequest) -> Message | list[Message]: + """Promote ``request.input`` to ``Message``(s) carrying channel metadata. + + Channels deliver inputs as plain text, a single ``Message``, or a list + of ``Message`` (e.g. a Responses-API request that includes a ``system`` + instruction plus the user turn). To preserve channel provenance + + identity + ``response_target`` on the persisted history record (and + make it visible to context providers, evals, audits), we attach a + ``hosting`` block under ``additional_properties``. AF's + ``Message.to_dict`` round-trips ``additional_properties`` through any + ``HistoryProvider`` that serializes via ``to_dict`` (e.g. + ``FileHistoryProvider``) and the framework explicitly does *not* + forward these fields to model providers, so they are safe to attach. + + For a list of messages we attach the metadata to the LAST message that + will be persisted (typically the user turn) — this keeps a single, + searchable record of where the inbound message came from. 
+ """ + hosting_meta: dict[str, Any] = {"channel": request.channel} + if request.identity is not None: + hosting_meta["identity"] = { + "channel": request.identity.channel, + "native_id": request.identity.native_id, + "attributes": dict(request.identity.attributes) if request.identity.attributes else {}, + } + target = request.response_target + hosting_meta["response_target"] = { + "kind": target.kind.value, + "targets": list(target.targets), + } + + raw = request.input + if isinstance(raw, Message): + raw.additional_properties = {**(raw.additional_properties or {}), "hosting": hosting_meta} + return raw + if isinstance(raw, list) and raw and all(isinstance(m, Message) for m in raw): + messages: list[Message] = [m for m in raw if isinstance(m, Message)] + last = messages[-1] + last.additional_properties = {**(last.additional_properties or {}), "hosting": hosting_meta} + return messages + # ``raw`` is typed as ``AgentRunInputs`` (str | Content | Message | Sequence[…]). + # The remaining cases are str / Content / Mapping — wrap as a single user message. + return Message( + role="user", + contents=[raw], # type: ignore[list-item] + additional_properties={"hosting": hosting_meta}, + ) + + def _record_identity(self, request: ChannelRequest) -> None: + """Update the per-``isolation_key`` identity registry + active-channel hint. + + Called on every successful resolve. ``ResponseTarget.active`` + consumes ``self._active``; ``ResponseTarget.channel(name)`` / + ``.channels([...])`` / ``.all_linked`` consume ``self._identities``. + """ + if request.identity is None or request.session is None: + return + key = request.session.isolation_key + if not key: + return + self._identities.setdefault(key, {})[request.identity.channel] = request.identity + self._active[key] = request.identity.channel + + def _build_echo_payload(self, request: ChannelRequest) -> HostedRunResult[AgentResponse]: + """Build a ``HostedRunResult`` representing the originating user message. 
+ + Used when ``ResponseTarget.echo_input`` is set so non-originating + destinations can mirror the user's turn before the agent reply + arrives. The user-facing payload is synthesised as a one-message + :class:`AgentResponse` (``role="user"``) so it flows through the + same delivery machinery as the agent's reply — channels handle + both via a single ``HostedRunResult[AgentResponse]`` shape. The + hosting metadata that ``_wrap_input`` attaches for agent + invocation is intentionally stripped: the echo is end-user-facing + and we don't leak host-internal bookkeeping onto another + channel's wire. + """ + raw = request.input + if isinstance(raw, Message): + user_messages: list[Message] = [ + Message(role="user", contents=list(raw.contents), author_name=raw.author_name), + ] + elif isinstance(raw, list) and raw and all(isinstance(m, Message) for m in raw): + user_messages = [ + Message(role="user", contents=list(m.contents), author_name=m.author_name) + for m in raw + if isinstance(m, Message) + ] + elif isinstance(raw, str): + user_messages = [Message(role="user", contents=[Content.from_text(text=raw)])] + elif isinstance(raw, Content): + user_messages = [Message(role="user", contents=[raw])] + else: + # AgentRunInputs allows other shapes (mapping, sequence of mixed + # str/Content); stringify as a defensive fallback. + user_messages = [Message(role="user", contents=[Content.from_text(text=str(raw))])] + return HostedRunResult(AgentResponse(messages=user_messages)) + + async def _deliver_payload_to_channel( + self, + channel: ChannelPush, + identity: ChannelIdentity, + payload: HostedRunResult[Any], + *, + request: ChannelRequest, + is_echo: bool, + ) -> HostedRunResult[Any]: + """Clone, run the channel's ``response_hook`` (if any), and push. + + The clone keeps fan-out free from cross-destination mutation: a + hook that rebinds ``result`` on one destination cannot leak into + the next push. 
Note that the clone is shallow — channels that + need to mutate ``result`` itself (rather than rebind it via + :meth:`HostedRunResult.replace`) are responsible for their own + deep copy. Returns the (possibly hook-shaped) payload so callers + can log post-hook diagnostics rather than the pre-hook ones. + + ``response_hook`` is duck-typed on the channel: any attribute + named ``response_hook`` that is callable participates. The + :class:`Channel` Protocol stays a small "name / path / contribute" + contract; richer surfaces stay attribute-level so adding hook + support to a new channel does not require updating the Protocol. + """ + shaped: HostedRunResult[Any] = payload.replace() + hook = cast(ChannelResponseHook | None, getattr(channel, "response_hook", None)) + if callable(hook): + ctx = ChannelResponseContext( + request=request, + channel_name=channel.name, + destination_identity=identity, + originating=False, + is_echo=is_echo, + ) + shaped = await apply_response_hook(hook, shaped, context=ctx) + await channel.push(identity, shaped) + return shaped + + async def _handle_push_task(self, payload: Mapping[str, Any]) -> None: + """Runner-side handler for ``hosting.push`` tasks. + + Unpacks a single per-destination push payload (one channel, one + identity) and runs the echo (when present) followed by the + response push. Echo failures are logged and swallowed — the + user-visible failure mode is "response delivered without + echo", *not* "no response at all". Response-push failures + re-raise so the runner can retry per the configured + :class:`RetryPolicy`. + + **Retry idempotency for the echo phase.** The payload includes a + mutable ``"echo_done"`` cursor (initialised to ``False`` at + schedule time). 
If a previous attempt already delivered the + echo but the response push then failed, the runner retries the + whole task; we observe ``echo_done == True`` and skip the + re-echo so end users on channels without server-side + deduplication don't see the same user-message echoed multiple + times. This is a best-effort guarantee for the in-process + runner — payload mutations don't survive process restarts. + Durable adapter packages SHOULD persist the cursor as part of + their task state (their replay machinery typically gives them + that primitive for free). + + Payload shape depends on the configured + :data:`DurableTaskRunner.payload_mode`: + + * Object mode (default) — live Python references: + ``channel_name``, ``identity``, ``result``, ``echo_result``, + ``echo_done``, ``request``. + * JSON mode — a single ``envelope`` produced by the + destination channel's :class:`ChannelPushCodec` plus + ``channel_name`` and ``echo_done``. The handler invokes + ``codec.decode(envelope)`` to recover the live references + before pushing. + """ + channel_name = cast(str, payload["channel_name"]) + echo_done = bool(payload.get("echo_done", False)) + + by_name = {ch.name: ch for ch in self.channels} + channel = by_name.get(channel_name) + if channel is None or not isinstance(channel, ChannelPush): + # Channel was validated at schedule time; if we ever land + # here it means the host's channel list mutated mid-flight, + # which we don't support. Log loudly and drop — re-raising + # would just cause the runner to retry forever. + logger.error( + "hosting.push: channel %r is no longer a ChannelPush; dropping task", + channel_name, + ) + return + push_channel = cast(ChannelPush, channel) + + # Recover the live references. Object-mode runners pass them + # through verbatim; JSON-mode runners persisted an envelope the + # channel's codec produced and we now ask the codec to decode + # it back. 
+ envelope = payload.get("envelope") + if envelope is not None: + codec = cast("ChannelPushCodec | None", getattr(channel, "push_codec", None)) + if codec is None: + logger.error( + "hosting.push: channel %r received a JSON envelope but has no push_codec; dropping task", + channel_name, + ) + return + result, request, identity, echo_result = await codec.decode(envelope) + else: + identity = cast(ChannelIdentity, payload["identity"]) + result = cast(HostedRunResult[Any], payload["result"]) + echo_result = cast("HostedRunResult[Any] | None", payload.get("echo_result")) + request = cast(ChannelRequest, payload["request"]) + + if echo_result is not None and not echo_done: + try: + await self._deliver_payload_to_channel( + push_channel, + identity, + echo_result, + request=request, + is_echo=True, + ) + except Exception: + logger.exception( + "hosting.push: echo push failed for channel=%s native_id=%s", + channel_name, + identity.native_id, + ) + else: + # Mutate the payload mapping so a subsequent retry of + # this task (triggered by a failure in the response + # phase below) skips the echo. The in-process runner + # reuses the same mapping object across retries — see + # ``_run_with_retry``; durable adapters persist the + # cursor as part of their task state. + if isinstance(payload, dict): + payload["echo_done"] = True + logger.info( + "hosting.push: echoed user message", + extra={"channel": channel_name, "native_id": identity.native_id}, + ) + elif echo_result is not None and echo_done: + logger.debug( + "hosting.push: skipping echo on retry (already delivered)", + extra={"channel": channel_name, "native_id": identity.native_id}, + ) + + # Response phase — raise on failure so the runner retries per + # the configured retry policy. The runner is responsible for + # terminal-failure bookkeeping. 
+ await self._deliver_payload_to_channel( + push_channel, + identity, + result, + request=request, + is_echo=False, + ) + logger.info( + "hosting.push: pushed agent response", + extra={"channel": channel_name, "native_id": identity.native_id}, + ) + + async def _deliver_response(self, request: ChannelRequest, payload: HostedRunResult[Any]) -> bool: + """Resolve ``request.response_target``, annotate audit metadata, and schedule pushes. + + Returns ``True`` when the originating channel should render the + agent reply on its own wire (the resolved target included the + originating channel either explicitly or via the host's "every + destination dropped, fall back to originating" recovery path). + Returns ``False`` when the reply is fanned out purely to + non-originating destinations (or :data:`ResponseTarget.none` + suppresses the reply entirely). + + Per SPEC-002 §"Intended targets + durable delivery": for any + non-``originating`` target, the originating channel returns an + acknowledgement and the actual agent reply is dispatched + **asynchronously** via the host's :class:`DurableTaskRunner` — + one scheduled task per destination, with the runner owning + retry / terminal-failure / replay semantics. + + **Immutable audit annotation.** Before scheduling, the host + annotates each resolved assistant ``Message`` in the payload + with the ``hosting.intended_targets`` list (and optionally + ``hosting.skipped_targets`` for destinations dropped at + resolution time). Persistence providers therefore observe the + host's *intent* from a single immutable write — mutable + per-destination delivery state is owned by the runner backend. + + When a destination cannot be resolved (no known native id), or + the destination channel doesn't implement :class:`ChannelPush`, + or no channel by that name is registered, it is dropped + synchronously and logged at WARNING. 
When every resolved + destination is dropped at resolution time, we fall back to + delivering on the originating channel so the user is never left + without a reply. + + When ``request.response_target.echo_input`` is True the echo + payload (the originating user message) is bundled into the + same per-destination task as the agent response — see + :meth:`_handle_push_task`. The echo is dispatched *before* the + response within that task; an echo failure does not abort the + response push, and a retried task skips an already-delivered + echo via the ``echo_done`` cursor. + + For JSON-mode runners the destination channel's + :class:`ChannelPushCodec` is called to project the in-memory + :class:`HostedRunResult` into a JSON-safe envelope before + scheduling. Codec failures + (:class:`PushPayloadNotSerializable`) abort the schedule for + that destination (logged and treated as skipped); other + destinations still get their chance. + + Each per-destination push (echo and response) goes through + :meth:`_deliver_payload_to_channel`, which clones the payload + and applies the channel's optional ``response_hook`` so + per-channel transforms (e.g. flatten multi-modal to text for a + text-only wire) can't leak across destinations. + """ + target = request.response_target + kind = target.kind + + # Fast paths for the trivial variants. + if kind == ResponseTargetKind.ORIGINATING: + return True + if kind == ResponseTargetKind.NONE: + # Background-only — drop the reply on the floor for now (no + # ContinuationToken in the prototype). + return False + + # Build the destination set. + include_originating = False + # Each entry is (channel_name, identity_override_or_None_to_lookup).
+ destinations: list[tuple[str, ChannelIdentity | None]] = [] + isolation_key = request.session.isolation_key if request.session is not None else None + known = self._identities.get(isolation_key or "", {}) + + if kind == ResponseTargetKind.ACTIVE: + active = self._active.get(isolation_key or "") + if active is None or active == request.channel: + # Fall back to originating when there's no other active + # channel known (matches the "first message" case). + self._annotate_intended_targets(payload, intended=(), skipped=()) + return True + destinations.append((active, known.get(active))) + + elif kind == ResponseTargetKind.ALL_LINKED: + for channel_name, identity in known.items(): + if channel_name == request.channel: + include_originating = True + continue + destinations.append((channel_name, identity)) + if not destinations and not include_originating: + # No links recorded yet — fall back. + self._annotate_intended_targets(payload, intended=(), skipped=()) + return True + + elif kind == ResponseTargetKind.IDENTITIES: + for ident in target.target_identities: + if ident.channel == request.channel: + # Pointing the originating channel at itself — fold + # into ``include_originating`` so the originating + # channel renders on its own wire rather than + # double-delivering via push. + include_originating = True + continue + destinations.append((ident.channel, ident)) + + elif kind == ResponseTargetKind.CHANNELS: + for entry in target.targets: + if entry == "originating": + include_originating = True + continue + if ":" in entry: + channel_name, _, native_id = entry.partition(":") + if channel_name == request.channel: + # Pointing the originating channel at itself with a + # specific native id — treat as "include + # originating" since the channel will reply on its + # own wire to that user anyway. 
+ include_originating = True + continue + destinations.append((channel_name, ChannelIdentity(channel=channel_name, native_id=native_id))) + else: + if entry == request.channel: + include_originating = True + continue + destinations.append((entry, known.get(entry))) + + # Schedule per-destination push tasks via the durable runner. + by_name = {ch.name: ch for ch in self.channels} + runner_mode = getattr(self._durable_task_runner, "payload_mode", DurableTaskPayloadMode.OBJECT) + intended_tokens: list[str] = [] + skipped_tokens: list[str] = [] + echo_payload = self._build_echo_payload(request) if target.echo_input else None + for channel_name, dest_identity in destinations: + channel = by_name.get(channel_name) + token = f"{channel_name}:{dest_identity.native_id}" if dest_identity is not None else channel_name + if channel is None: + logger.warning("deliver_response: no channel named %r (target=%s)", channel_name, token) + skipped_tokens.append(token) + continue + if not isinstance(channel, ChannelPush): + logger.warning( + "deliver_response: channel %r does not implement ChannelPush (target=%s)", + channel_name, + token, + ) + skipped_tokens.append(token) + continue + if dest_identity is None: + logger.warning( + "deliver_response: no known identity for isolation_key=%s on channel=%s", + isolation_key, + channel_name, + ) + skipped_tokens.append(token) + continue + + # Build the runner payload. Object-mode runners get live + # references for speed; JSON-mode runners get a fully + # encoded envelope from the channel's push codec. 
+ try: + task_payload = await self._build_push_payload( + channel=channel, + channel_name=channel_name, + identity=dest_identity, + request=request, + result=payload, + echo_payload=echo_payload, + runner_mode=runner_mode, + ) + except PushPayloadNotSerializable: + logger.exception( + "deliver_response: channel %r push codec refused payload (target=%s); skipping", + channel_name, + token, + ) + skipped_tokens.append(token) + continue + try: + await self._durable_task_runner.schedule(HOSTING_PUSH_TASK_NAME, task_payload) + except Exception: + # Schedule-time failures are a host-side outage (runner + # backend unreachable, configuration error). Log and + # treat the destination as skipped — the originating + # channel's fall-back-to-originating rule (below) keeps + # the user from being left without a reply when every + # destination dropped. + logger.exception("deliver_response: failed to schedule push for target=%s", token) + skipped_tokens.append(token) + continue + intended_tokens.append(token) + logger.info( + "deliver_response: scheduled push", + extra={"target": token, "channel": channel_name}, + ) + + if not intended_tokens and not include_originating: + # Spec policy: if every destination drops at resolution time + # (or scheduling fails universally) deliver to originating + # so the user gets a response. The runner backend still + # owns observability for any partial-failure case where at + # least one destination did get scheduled. 
+ logger.warning("deliver_response: every destination dropped — falling back to originating") + include_originating = True + + self._annotate_intended_targets( + payload, + intended=tuple(intended_tokens), + skipped=tuple(skipped_tokens), + include_originating=include_originating, + originating_channel=request.channel, + ) + + return include_originating + + async def _build_push_payload( + self, + *, + channel: ChannelPush, + channel_name: str, + identity: ChannelIdentity, + request: ChannelRequest, + result: HostedRunResult[Any], + echo_payload: HostedRunResult[Any] | None, + runner_mode: DurableTaskPayloadMode, + ) -> dict[str, Any]: + """Assemble the runner payload for a single push destination. + + For object-mode runners (the default in-process runner) we + forward live references — no serialisation cost on the hot + path. For JSON-mode runners we invoke the channel's + :class:`ChannelPushCodec` once to produce a JSON-safe envelope + for the whole push triple; the codec is the only entity that + knows how to project a :class:`HostedRunResult` plus the + channel-side request/identity context for a specific channel's + wire format. + """ + if runner_mode == DurableTaskPayloadMode.OBJECT: + return { + "channel_name": channel_name, + "identity": identity, + "result": result, + "echo_result": echo_payload, + "echo_done": False, + "request": request, + } + # JSON mode — the startup validator guarantees every push-capable + # channel has a ``push_codec``. Use ``getattr`` for the same + # duck-typed lookup pattern the validator and decoder use. 
+ codec = cast("ChannelPushCodec", getattr(channel, "push_codec")) # noqa: B009 + envelope = await codec.encode( + result=result, + request=request, + identity=identity, + echo_result=echo_payload, + ) + return { + "channel_name": channel_name, + "envelope": dict(envelope), + "echo_done": False, + } + + def _annotate_intended_targets( + self, + payload: HostedRunResult[Any], + *, + intended: tuple[str, ...], + skipped: tuple[str, ...], + include_originating: bool = False, + originating_channel: str | None = None, + ) -> None: + """Stamp ``additional_properties["hosting"]`` on every assistant message in the payload. + + The audit annotation is the spec's immutable record of the + host's delivery *intent* — persistence providers see what the + host meant to deliver from a single write, without ever + observing mutable per-destination state (the runner owns + that). Annotated fields: + + - ``intended_targets``: ``[<channel>[:<native_id>], …]`` for + every non-originating destination whose push task was + scheduled successfully. + - ``skipped_targets``: destinations dropped at resolution time + (unknown channel, no ``ChannelPush``, no known identity, or + schedule-time outage). Useful for ops triage. + - ``includes_originating``: ``True`` when the originating + channel rendered (or will render) the reply on its own wire. + + Workflow targets producing arbitrary result objects with no + ``messages`` field are left untouched — the annotation is a + best-effort augmentation of conventional agent responses.
+ """ + result_obj = payload.result + messages_raw: Any = getattr(result_obj, "messages", None) + if not isinstance(messages_raw, list): + return + hosting_meta: dict[str, Any] = { + "intended_targets": list(intended), + "includes_originating": include_originating, + } + if skipped: + hosting_meta["skipped_targets"] = list(skipped) + if include_originating and originating_channel is not None: + hosting_meta["originating_channel"] = originating_channel + for entry in cast("list[Any]", messages_raw): # type: ignore[redundant-cast] + if not isinstance(entry, Message): + continue + message: Message = entry + if getattr(message, "role", None) != "assistant": + continue + existing = message.additional_properties or {} + existing_hosting = existing.get("hosting") if isinstance(existing, Mapping) else None + if isinstance(existing_hosting, Mapping): + merged_hosting: Mapping[str, Any] = {**existing_hosting, **hosting_meta} + else: + merged_hosting = hosting_meta + message.additional_properties = {**existing, "hosting": merged_hosting} + + +__all__ = ["AgentFrameworkHost", "ChannelContext", "logger"] diff --git a/python/packages/hosting/agent_framework_hosting/_isolation.py b/python/packages/hosting/agent_framework_hosting/_isolation.py new file mode 100644 index 00000000000..53fb2f1e548 --- /dev/null +++ b/python/packages/hosting/agent_framework_hosting/_isolation.py @@ -0,0 +1,76 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Per-request isolation keys read from inbound HTTP headers. + +The Foundry Hosted Agents runtime injects two well-known headers on every +request it forwards to the user's container: + +* ``x-agent-user-isolation-key`` — opaque per-user partition key +* ``x-agent-chat-isolation-key`` — opaque per-conversation partition key + +When the headers are present we are running inside (or being driven by) the +Foundry runtime; when they are absent we are running in plain local dev. 
The +host installs an ASGI middleware in :meth:`AgentFrameworkHost._build_app` +that reads both headers off every inbound HTTP request and pushes them into +the :data:`current_isolation_keys` contextvar for the duration of the +request, then resets it. Providers that need partition-aware storage (most +notably ``FoundryHostedAgentHistoryProvider``) read the contextvar via +:func:`get_current_isolation_keys` and apply the keys to their backend +calls — so app authors don't have to wire any middleware themselves and +channels stay free of Foundry-specific header knowledge. + +The contextvar holds a plain :class:`IsolationKeys` mapping; conversion to +provider-specific types (e.g. Foundry's ``IsolationContext``) happens at +the consuming provider so this module has no provider dependencies. +""" + +from __future__ import annotations + +from contextvars import ContextVar, Token + +__all__ = [ + "ISOLATION_HEADER_CHAT", + "ISOLATION_HEADER_USER", + "IsolationKeys", + "current_isolation_keys", + "get_current_isolation_keys", + "reset_current_isolation_keys", + "set_current_isolation_keys", +] + + +ISOLATION_HEADER_USER = "x-agent-user-isolation-key" +ISOLATION_HEADER_CHAT = "x-agent-chat-isolation-key" + + +class IsolationKeys: + """Per-request Foundry isolation keys lifted off the inbound headers.""" + + def __init__(self, user_key: str | None = None, chat_key: str | None = None) -> None: + self.user_key = user_key + self.chat_key = chat_key + + @property + def is_empty(self) -> bool: + return self.user_key is None and self.chat_key is None + + +current_isolation_keys: ContextVar[IsolationKeys | None] = ContextVar( + "agent_framework_hosting_isolation_keys", + default=None, +) + + +def get_current_isolation_keys() -> IsolationKeys | None: + """Return the isolation keys bound to the current request, if any.""" + return current_isolation_keys.get() + + +def set_current_isolation_keys(keys: IsolationKeys | None) -> Token[IsolationKeys | None]: + """Bind ``keys`` to the 
current async context and return a reset token.""" + return current_isolation_keys.set(keys) + + +def reset_current_isolation_keys(token: Token[IsolationKeys | None]) -> None: + """Restore the isolation contextvar to its prior value.""" + current_isolation_keys.reset(token) diff --git a/python/packages/hosting/agent_framework_hosting/_persistence.py b/python/packages/hosting/agent_framework_hosting/_persistence.py new file mode 100644 index 00000000000..0e2ca854386 --- /dev/null +++ b/python/packages/hosting/agent_framework_hosting/_persistence.py @@ -0,0 +1,195 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Shared persistence primitives for the hosting package. + +The hosting core ships with an opt-in disk-persistence layer for the +in-process task runner and the host's session-related state. The +on-disk format is provided by the ``diskcache`` package (a small, +pure-Python, sqlite-backed dependency installed via the ``[disk]`` +optional extra). + +This module centralises: + +- :func:`load_diskcache` — lazy import that raises a helpful error when + the optional extra is missing. +- :func:`acquire_state_dir_lock` — single-owner file lock that fails + fast when a second process points at the same directory. +- :func:`normalize_state_dir` — turn the host-level ``state_dir`` + parameter (``str`` / ``PathLike`` / :class:`HostStatePaths` / + ``Mapping``) into a normalised ``dict[component_name -> Path | None]``. + +Everything in this module is internal — public callers should go +through :class:`AgentFrameworkHost` or +:class:`InProcessTaskRunner` directly. +""" + +from __future__ import annotations + +import contextlib +import os +import sys +from collections.abc import Mapping +from pathlib import Path +from typing import TYPE_CHECKING, Any + +if TYPE_CHECKING: + from ._types import HostStatePaths + +# Known component keys recognised by the host's ``state_dir`` normaliser. 
+# Adding a new component is a non-breaking change: extend this tuple and +# add the matching key to :class:`HostStatePaths` in ``_types.py``. +_KNOWN_COMPONENTS: tuple[str, ...] = ("runner", "sessions", "checkpoints", "links") + + +def load_diskcache() -> Any: + """Lazy-import :mod:`diskcache` with a helpful error when missing. + + The ``diskcache`` package is an optional dependency installed via + the ``agent-framework-hosting[disk]`` extra. Users that never set + ``state_dir`` never trigger the import. This wrapper produces a + single, consistent error message when the import is needed but the + extra was not installed. + """ + try: + import diskcache # type: ignore[import-untyped] + except ImportError as exc: # pragma: no cover - exercised via tests by monkeypatching + raise ImportError( + "agent-framework-hosting was asked to persist state to disk " + "(state_dir is set) but the optional `diskcache` dependency " + "is not installed. Install the disk extra: " + "`pip install 'agent-framework-hosting[disk]'`." + ) from exc + return diskcache + + +def acquire_state_dir_lock(component_dir: Path) -> Any: + """Acquire an exclusive single-owner lock on a component's state dir. + + Two processes pointing at the same state directory would both scan + pending records on startup and could execute the same task twice; + we therefore enforce single-owner semantics with an OS-level + advisory lock. The lock file lives at ``<component_dir>/.lock`` and + is held for the lifetime of the returned file handle. Closing the + handle (or process exit) releases it. + + On Unix this uses :func:`fcntl.flock`. On Windows it uses + :func:`msvcrt.locking`. The lock is *advisory* — the OS will not + enforce it against processes that ignore it, but no + well-behaved component of this package will. + + Raises ``RuntimeError`` if another process already holds the lock.
+ """ + component_dir.mkdir(parents=True, exist_ok=True) + lock_path = component_dir / ".lock" + # Open in append mode so we don't truncate an existing lock file + # (some monitoring tools may inspect it). + fh = open(lock_path, "a+", encoding="utf-8") # noqa: SIM115 - kept open for lifetime + try: + if sys.platform == "win32": + import msvcrt + + try: + msvcrt.locking(fh.fileno(), msvcrt.LK_NBLCK, 1) + except OSError as exc: + fh.close() + raise RuntimeError( + f"Another process already holds the hosting state lock at {lock_path}. " + "Two hosts (or two runners) pointing at the same state directory would " + "double-execute scheduled tasks; point each host at its own state_dir." + ) from exc + else: + import fcntl + + try: + fcntl.flock(fh.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB) + except OSError as exc: + fh.close() + raise RuntimeError( + f"Another process already holds the hosting state lock at {lock_path}. " + "Two hosts (or two runners) pointing at the same state directory would " + "double-execute scheduled tasks; point each host at its own state_dir." + ) from exc + except RuntimeError: + raise + except Exception: + fh.close() + raise + return fh + + +def release_state_dir_lock(handle: Any) -> None: + """Release a lock previously acquired by :func:`acquire_state_dir_lock`. + + Closing the file handle is sufficient to drop the lock on both + platforms, but we make the intent explicit so the caller doesn't + have to know which mechanism (``fcntl`` vs ``msvcrt``) is in use. + """ + if handle is None: + return + with contextlib.suppress(Exception): # close errors are not actionable + handle.close() + + +def normalize_state_dir( + state_dir: str | os.PathLike[str] | HostStatePaths | Mapping[str, str | os.PathLike[str]] | None, +) -> dict[str, Path | None]: + """Resolve the host-level ``state_dir`` parameter into a per-component map. + + Accepts any of: + + - ``None`` → all components return ``None`` (fully in-memory; today's behavior). 
+ - ``str`` / :class:`os.PathLike` → all components share a parent + directory and get an auto-allocated subfolder (``runner/``, + ``sessions/``, ``checkpoints/``, ``links/``). + - :class:`HostStatePaths` typed dict / plain ``Mapping`` → per-key + override. Components missing from the mapping fall back to ``None`` + (in-memory only). Unknown keys raise ``ValueError`` to surface + typos early. + + Returns a ``dict[component_name -> Path | None]`` covering every + component in :data:`_KNOWN_COMPONENTS`. + """ + result: dict[str, Path | None] = {name: None for name in _KNOWN_COMPONENTS} + if state_dir is None: + return result + + # Strings and PathLikes use the default subfolder layout. + if isinstance(state_dir, (str, os.PathLike)): + root = Path(os.fspath(state_dir)) + for name in _KNOWN_COMPONENTS: + result[name] = root / name + return result + + # Mappings (incl. TypedDict at runtime) get per-component overrides. + if isinstance(state_dir, Mapping): + unknown = [k for k in state_dir if k not in _KNOWN_COMPONENTS] + if unknown: + raise ValueError( + f"state_dir mapping contains unknown component key(s): {unknown!r}. " + f"Known components are: {list(_KNOWN_COMPONENTS)!r}. " + "If you are trying to use a future component, upgrade " + "agent-framework-hosting to a version that supports it." 
+ ) + for name in _KNOWN_COMPONENTS: + raw_value: Any = state_dir.get(name) + if raw_value is None: + result[name] = None + continue + if isinstance(raw_value, (str, os.PathLike)): + result[name] = Path(os.fspath(raw_value)) + else: + raise TypeError(f"state_dir[{name!r}] must be a str or PathLike — got {type(raw_value).__name__}") + return result + + raise TypeError( + f"state_dir must be a str, PathLike, HostStatePaths mapping, or None — got {type(state_dir).__name__}" + ) + + +__all__ = [ + "_KNOWN_COMPONENTS", + "acquire_state_dir_lock", + "load_diskcache", + "normalize_state_dir", + "release_state_dir_lock", +] diff --git a/python/packages/hosting/agent_framework_hosting/_runner.py b/python/packages/hosting/agent_framework_hosting/_runner.py new file mode 100644 index 00000000000..d37fa19bc0e --- /dev/null +++ b/python/packages/hosting/agent_framework_hosting/_runner.py @@ -0,0 +1,751 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""In-process implementation of :class:`DurableTaskRunner`. + +This is the default runner the host wires in when the operator does not +supply one. It runs tasks via :func:`asyncio.create_task` with a bounded +retry loop following the supplied :class:`RetryPolicy`. + +Two modes: + +* **In-memory** (``state_dir=None``, default) — pending tasks live as + ``asyncio.Task`` references in process memory. In-flight tasks are + lost on process death. Cheap, zero dependencies, suitable for unit + tests and for long-running deployments where "the process dies, + queued pushes are lost" is an acceptable failure mode. +* **Disk-persistent** (``state_dir=``) — pending tasks are + pickled into a :mod:`diskcache`-backed sqlite store before the + ``asyncio.Task`` is created. On the next startup the host calls + :meth:`InProcessTaskRunner.resume` which re-schedules every + surviving ``"pending"`` record with its persisted attempt count. + Graceful shutdown cancellations leave records in ``"pending"`` so + they replay on the next boot. 
Suitable for ``runtime_mode="long_running"`` + deployments that survive container moves / OOMs. + +For ``runtime_mode="ephemeral"`` deployments (Foundry Hosted Agent, +Azure Functions, Lambda) plug in a durable adapter package +(``agent-framework-hosting-durabletask`` for the gRPC TaskHub backend, +a future Foundry adapter, …) — they all implement the same +:class:`DurableTaskRunner` Protocol. + +See ``docs/specs/002-python-hosting-channels.md`` § "Durable task runner". +""" + +from __future__ import annotations + +import asyncio +import contextlib +import logging +import os +import pickle # noqa: S403 # nosec B403 - used only to validate user payloads round-trip +import time +import uuid +from collections.abc import Awaitable, Callable, Mapping +from pathlib import Path +from typing import Any, cast + +from ._persistence import ( + acquire_state_dir_lock, + load_diskcache, + release_state_dir_lock, +) +from ._types import ( + DurableTaskPayloadMode, + DurableTaskRunner, + PushPayloadNotPicklable, + RetryPolicy, + TaskHandle, + TaskStatus, +) + +logger = logging.getLogger(__name__) + + +# Keys used inside the per-task on-disk record. Kept as module constants +# so the schema is documented once and refactors are mechanical. +_REC_HANDLER_NAME = "handler_name" +_REC_PAYLOAD = "payload" +_REC_RETRY_POLICY = "retry_policy" +_REC_ATTEMPTS = "attempts_completed" +_REC_STATUS = "status" +_REC_CREATED_AT = "created_at" +_REC_TERMINAL_AT = "terminal_at" +_REC_NAME = "name" + +# Deque key inside the cache holding terminal task ids in insertion order. +# Used for FIFO eviction of terminal records once the bounded cap is hit. +_TERMINAL_ORDER_KEY = "__terminal_order__" + + +class _PersistedPayloadDict(dict[str, Any]): + """Drop-in :class:`dict` that mirrors mutations back to disk. + + Used by :class:`InProcessTaskRunner` when ``state_dir`` is set so + handler-side cursors (``echo_done``) survive process restarts. 
The + handler interacts with this object exactly as it would with a plain + dict; the override on :meth:`__setitem__` is the only difference. + + Held weakly by the runner so handlers that capture the dict in + long-lived closures don't keep the runner alive past its natural + lifetime. + """ + + # Type annotation for the persist callback; the actual attribute is + # assigned via the __slots__-aware ``object.__setattr__`` dance + # below so PyPy doesn't reject the assignment on a ``dict`` subclass. + _persist_cb: Callable[[Mapping[str, Any]], None] + + __slots__ = ("_persist_cb",) + + def __init__( + self, + data: Mapping[str, Any], + persist_cb: Callable[[Mapping[str, Any]], None], + ) -> None: + super().__init__(data) + # Use object.__setattr__ to bypass the __slots__ checker on + # dict subclasses (CPython is liberal here but PyPy is strict). + object.__setattr__(self, "_persist_cb", persist_cb) + + def __setitem__(self, key: str, value: Any) -> None: + super().__setitem__(key, value) + # Re-serialise after each mutation. The cache stores opaque + # pickled values, so partial-field updates aren't possible — + # we send the whole payload mapping every time. Mutations on + # the runner's hot path are rare (just the ``echo_done`` + # cursor today) so this is fine. + self._persist_cb(dict(self)) + + +class InProcessTaskRunner(DurableTaskRunner): + """In-memory or disk-persistent :class:`DurableTaskRunner`. + + Schedules each task as an :func:`asyncio.create_task` coroutine and + retries on exception up to ``RetryPolicy.max_attempts`` times with + exponential backoff. Terminal status (``succeeded`` / ``failed`` / + ``cancelled``) is reported via :meth:`get`. + + Re-registration of the same handler name after :meth:`schedule` has + been called is rejected to avoid silent re-orderings of in-flight + work; the host registers all handlers at startup, before serving + traffic. 
+ + Keyword Args: + default_retry_policy: Per-runner default :class:`RetryPolicy`; + overridable per-task at :meth:`schedule` call sites. + terminal_cache_size: Maximum number of terminal task records to + retain. Older entries are FIFO-evicted so a long-running + host can't accumulate unbounded status entries. + shutdown_grace_seconds: Window :meth:`shutdown` waits for + in-flight tasks to drain before cancelling stragglers. + state_dir: When set, the runner persists pending and terminal + task records under this directory (a :mod:`diskcache` + sqlite store at ``<state_dir>/cache.db`` and a single-owner + lock at ``<state_dir>/.lock``). Persisted pending records + survive process restarts and are replayed by :meth:`resume`. + When ``None`` (default) the runner is purely in-memory and + in-flight tasks are lost on process death. Requires the + optional ``diskcache`` dependency — install with + ``pip install 'agent-framework-hosting[disk]'``. + """ + + # Declared at class level so the ``DurableTaskRunner`` Protocol's + # ``payload_mode`` attribute resolves on instances without needing + # to assign it in ``__init__``. + payload_mode: DurableTaskPayloadMode = DurableTaskPayloadMode.OBJECT + + def __init__( + self, + *, + default_retry_policy: RetryPolicy | None = None, + terminal_cache_size: int = 1024, + shutdown_grace_seconds: float = 5.0, + state_dir: str | os.PathLike[str] | None = None, + ) -> None: + self._handlers: dict[str, Callable[[Mapping[str, Any]], Awaitable[None]]] = {} + self._default_retry_policy = default_retry_policy or RetryPolicy() + self._terminal_cache_size = terminal_cache_size + # How long ``shutdown()`` waits for in-flight tasks to finish on + # their own before cancelling them. Channels may legitimately + # schedule a final push during their own shutdown callback + # (goodbye message, telemetry flush), so the runner gives them + # this window to complete before cancellation kicks in. + self._shutdown_grace_seconds = shutdown_grace_seconds + + # Operational state.
``_pending`` holds asyncio tasks that are + # scheduled or running. ``_terminal`` is an in-memory mirror of + # the most recent terminal statuses (kept in-memory regardless of + # ``state_dir`` so ``get`` is fast and works before/without the + # cache being opened). + self._pending: dict[str, asyncio.Task[None]] = {} + self._terminal: dict[str, TaskStatus] = {} + self._terminal_order: list[str] = [] + + # Set to True on the first ``schedule``/``resume`` call so subsequent + # ``register`` calls fail loudly rather than silently swapping a + # handler out from under in-flight work. + self._started = False + + # Set to True when ``shutdown()`` starts so the retry loop's + # ``CancelledError`` handler distinguishes "the runner is going + # down, leave my record in 'pending' for resume()" from "this + # task was explicitly cancelled, mark it 'cancelled'". + self._shutting_down = False + + # Disk persistence — opt-in via ``state_dir``. ``None`` keeps + # the runner pure-memory (the default behaviour). + self._state_dir: Path | None = Path(os.fspath(state_dir)) if state_dir is not None else None + self._cache: Any = None + self._terminal_deque: Any = None + self._lock_handle: Any = None + if self._state_dir is not None: + self._open_cache() + + # ------------------------------------------------------------------ # + # Cache lifecycle + # ------------------------------------------------------------------ # + + def _open_cache(self) -> None: + """Open the disk cache and acquire the single-owner lock. + + Called from ``__init__`` when ``state_dir`` is set. Splitting it + out keeps the constructor body readable and gives tests a clean + seam for monkeypatching. + """ + if self._state_dir is None: # pragma: no cover - guarded by caller + raise RuntimeError("_open_cache called without state_dir") + diskcache = load_diskcache() + # Acquire the directory lock *before* opening the cache so two + # runners pointed at the same dir don't both try to initialise + # sqlite. 
The lock handle stays open for the runner's lifetime. + self._lock_handle = acquire_state_dir_lock(self._state_dir) + try: + self._cache = diskcache.Cache(str(self._state_dir)) + # Re-hydrate the in-memory terminal mirror so ``get`` works + # for task ids that completed in a prior process. Doing this + # here (rather than lazily) means the mirror is consistent + # the moment construction returns. + order: Any = self._cache.get(_TERMINAL_ORDER_KEY, default=[]) + if not isinstance(order, list): + # Defensive: a corrupted ordering list shouldn't take + # the host down. Reset and continue — at worst we lose + # ordering for FIFO eviction, not correctness. + logger.warning( + "InProcessTaskRunner: terminal-order entry in %s is not a list; resetting", self._state_dir + ) + order = [] + self._cache.set(_TERMINAL_ORDER_KEY, order) + self._terminal_order = [str(x) for x in cast(list[Any], order)] + for task_id in self._terminal_order: + rec_obj: Any + try: + rec_obj = self._cache.get(task_id) + except Exception: # pragma: no cover - exercised via corrupt-entry test + rec_obj = None + if not isinstance(rec_obj, dict): + continue + rec = cast(dict[str, Any], rec_obj) + status = rec.get(_REC_STATUS) + if status in {"succeeded", "failed", "cancelled"}: + self._terminal[task_id] = status + except Exception: + release_state_dir_lock(self._lock_handle) + self._lock_handle = None + raise + + # ------------------------------------------------------------------ # + # DurableTaskRunner Protocol + # ------------------------------------------------------------------ # + + def register( + self, + name: str, + handler: Callable[[Mapping[str, Any]], Awaitable[None]], + ) -> None: + if self._started: + raise RuntimeError( + f"InProcessTaskRunner.register({name!r}) called after the " + "runner started scheduling tasks — register all handlers at " + "host startup, before serving traffic, to avoid silently " + "reordering in-flight work." 
+ ) + if name in self._handlers: + logger.warning("InProcessTaskRunner: replacing handler registered under %r", name) + self._handlers[name] = handler + + async def schedule( + self, + name: str, + payload: Mapping[str, Any], + *, + retry_policy: RetryPolicy | None = None, + ) -> TaskHandle: + if name not in self._handlers: + raise KeyError( + f"InProcessTaskRunner.schedule({name!r}): no handler " + "registered under this name. Call register(name, handler) " + "at host startup before scheduling." + ) + + self._started = True + policy = retry_policy or self._default_retry_policy + task_id = uuid.uuid4().hex + handle = TaskHandle(task_id=task_id, name=name) + + # Persist the record (when state_dir is set) BEFORE we spawn the + # asyncio task — if the persistence write fails we surface it as + # a synchronous error from ``schedule`` rather than silently + # downgrading to in-memory. + if self._cache is not None: + record = self._build_record(name, dict(payload), policy) + self._validate_picklable(record) + self._cache.set(task_id, record) + + # When persisted, wrap the payload so handler-side mutations + # (e.g. ``payload["echo_done"] = True``) flow back to disk. 
+ runtime_payload: Mapping[str, Any] + if self._cache is not None: + captured_task_id = task_id + + def _persist_cb(new_payload: Mapping[str, Any]) -> None: + self._update_record_payload(captured_task_id, new_payload) + + runtime_payload = _PersistedPayloadDict(payload, _persist_cb) + else: + runtime_payload = payload + + handler = self._handlers[name] + task = asyncio.create_task( + self._run_with_retry(handle, handler, runtime_payload, policy), + name=f"hosting.task[{name}]:{task_id}", + ) + self._pending[task_id] = task + + def _on_done(_t: asyncio.Task[None], tid: str = task_id) -> None: + self._pending.pop(tid, None) + + task.add_done_callback(_on_done) + return handle + + async def get(self, handle: TaskHandle) -> TaskStatus | None: + if handle.task_id in self._pending: + task = self._pending[handle.task_id] + if task.cancelled(): + return "cancelled" + return "running" + # In-memory terminal mirror covers both pure-memory and + # disk-persistent runs (we re-hydrate on cache open). + if handle.task_id in self._terminal: + return self._terminal[handle.task_id] + # Disk fallback for very-aged task ids that left the in-memory + # mirror but still have a record on disk (extremely unlikely + # given that we re-hydrate all terminals at open, but defensive). + if self._cache is not None: + rec_obj: Any = self._cache.get(handle.task_id) + if isinstance(rec_obj, dict): + rec = cast(dict[str, Any], rec_obj) + status = rec.get(_REC_STATUS) + # Records on disk only live in one of four states: + # ``pending`` (queued or in-flight — resume picks these + # up) or one of the terminals. There is no transient + # ``running`` status; the in-flight asyncio task is + # observable via ``_pending`` only inside its own + # process. 
+ if status in {"succeeded", "failed", "cancelled", "pending"}: + return cast(TaskStatus, status) + return None + + # ------------------------------------------------------------------ # + # Resume — replay persisted pending records on startup + # ------------------------------------------------------------------ # + + async def resume(self) -> int: + """Re-schedule pending tasks persisted by a previous process. + + Walks the cache for records in ``"pending"`` status, looks up + their handler in :attr:`_handlers`, and re-creates an + :class:`asyncio.Task` for each — preserving the persisted + attempt count so retry budgets resume mid-way through their + backoff schedule. + + Records whose handler is no longer registered are marked + ``"failed"`` with a clear reason in the log; they will not be + retried again. Records that fail to deserialise (corrupted + sqlite row, schema drift, …) are quarantined: their entry is + removed from the cache and the task id is logged. Both classes + of error are non-fatal — the host should boot even when a + small number of legacy records can't be replayed. + + Returns the number of records successfully re-scheduled. + + Called automatically from :class:`AgentFrameworkHost`'s lifespan + startup hook when the runner is host-owned. Callers driving the + runner directly (tests, bespoke ASGI setups) MUST call this + once after registering handlers and before serving traffic. + """ + if self._cache is None: + return 0 + + # Mark started so subsequent register() calls fail loudly — we + # don't want handler swaps after replay begins. + self._started = True + + replayed = 0 + # iterkeys returns a live view; we copy to a list because we may + # delete entries inside the loop (quarantine / drop-on-missing-handler). 
+ task_ids: list[str] = [str(k) for k in self._cache.iterkeys() if k != _TERMINAL_ORDER_KEY] + for task_id in task_ids: + rec_obj: Any + try: + rec_obj = self._cache.get(task_id) + except Exception: + logger.exception("InProcessTaskRunner.resume: failed to read record %s; quarantining", task_id) + with contextlib.suppress(KeyError): + del self._cache[task_id] + continue + if not isinstance(rec_obj, dict) or _REC_STATUS not in rec_obj: + logger.warning("InProcessTaskRunner.resume: record %s is not a task dict; quarantining", task_id) + with contextlib.suppress(KeyError): + del self._cache[task_id] + continue + rec = cast(dict[str, Any], rec_obj) + status = rec[_REC_STATUS] + if status != "pending": + continue + + handler_name = rec.get(_REC_HANDLER_NAME) + if not isinstance(handler_name, str) or handler_name not in self._handlers: + logger.warning( + "InProcessTaskRunner.resume: no handler registered for record %s (handler=%r); marking failed", + task_id, + handler_name, + ) + self._mark_terminal(task_id, "failed") + continue + handler = self._handlers[handler_name] + + policy_value = rec.get(_REC_RETRY_POLICY) or self._default_retry_policy + if not isinstance(policy_value, RetryPolicy): + # Legacy / corrupt entry — fall back to the default rather + # than failing the whole resume. 
+ policy_value = self._default_retry_policy + policy: RetryPolicy = policy_value + payload_value: Any = rec.get(_REC_PAYLOAD) or {} + payload: dict[str, Any] + if isinstance(payload_value, dict): + payload = cast(dict[str, Any], payload_value) + elif hasattr(payload_value, "keys"): + payload = dict(cast(Mapping[str, Any], payload_value)) + else: + payload = {} + + name_value = rec.get(_REC_NAME, handler_name) + handle = TaskHandle(task_id=task_id, name=str(name_value)) + attempts_value = rec.get(_REC_ATTEMPTS, 0) + attempts_completed = int(attempts_value or 0) + + def _make_resume_persist_cb(tid: str) -> Callable[[Mapping[str, Any]], None]: + def _cb(new_payload: Mapping[str, Any]) -> None: + self._update_record_payload(tid, new_payload) + + return _cb + + runtime_payload = _PersistedPayloadDict(payload, _make_resume_persist_cb(task_id)) + + task = asyncio.create_task( + self._run_with_retry(handle, handler, runtime_payload, policy, _resume_from_attempt=attempts_completed), + name=f"hosting.task[{handle.name}]:{task_id}(resumed)", + ) + self._pending[task_id] = task + + def _on_done(_t: asyncio.Task[None], tid: str = task_id) -> None: + self._pending.pop(tid, None) + + task.add_done_callback(_on_done) + replayed += 1 + + if replayed: + logger.info( + "InProcessTaskRunner.resume: re-scheduled %d pending task(s) from %s", replayed, self._state_dir + ) + return replayed + + # ------------------------------------------------------------------ # + # Lifecycle helper (the host calls this from ``on_shutdown``) + # ------------------------------------------------------------------ # + + async def shutdown(self, *, timeout: float | None = None) -> None: + """Wait briefly for pending tasks to drain, then cancel anything still running. + + Called by the host on ``on_shutdown`` so a graceful shutdown does + not orphan in-flight push retries. Channels may legitimately + schedule a final push from their own shutdown callback (e.g. 
a + goodbye message); the runner therefore *waits* up to + ``timeout`` seconds (default: the runner's + ``shutdown_grace_seconds`` configured at construction) for the + in-flight set to finish on its own before cancelling stragglers. + Tasks that don't honour cancellation within the same window are + abandoned — the runner makes no synchronous durability claim, + so cleanup is best-effort. + + When ``state_dir`` is set, tasks that didn't drain are left in + ``"pending"`` status on disk so the next process replays them + via :meth:`resume`. The disk cache is closed and the + single-owner lock is released regardless of drain outcome. + """ + self._shutting_down = True + try: + if self._pending: + grace = timeout if timeout is not None else self._shutdown_grace_seconds + tasks = list(self._pending.values()) + # Phase 1 — wait for natural completion within the grace window. + if grace > 0: + await asyncio.wait(tasks, timeout=grace) + # Phase 2 — cancel anything still pending, then wait briefly for + # cancellation to propagate. 
+ still_pending = [t for t in tasks if not t.done()] + if still_pending: + logger.info( + "InProcessTaskRunner.shutdown: %d task(s) still running after %.2fs grace; cancelling", + len(still_pending), + grace, + ) + for task in still_pending: + task.cancel() + cancellation_window = max(grace, 1.0) + try: + await asyncio.wait_for( + asyncio.gather(*still_pending, return_exceptions=True), + timeout=cancellation_window, + ) + except (TimeoutError, asyncio.TimeoutError): + logger.warning( + "InProcessTaskRunner.shutdown: %d task(s) did not exit within %.2fs " + "of cancellation; abandoning", + sum(not t.done() for t in still_pending), + cancellation_window, + ) + finally: + # Release disk resources after the in-flight set has been + # given a chance to drain — tasks that mutate the payload + # mid-shutdown will fail to persist after this point, which + # is the correct behaviour (the next process will replay + # from whatever the last fully-committed state was). + if self._cache is not None: + try: + self._cache.close() + except Exception: # pragma: no cover - close errors aren't actionable + logger.exception("InProcessTaskRunner.shutdown: failed to close cache cleanly") + self._cache = None + if self._lock_handle is not None: + release_state_dir_lock(self._lock_handle) + self._lock_handle = None + + # ------------------------------------------------------------------ # + # Internals — retry loop + # ------------------------------------------------------------------ # + + async def _run_with_retry( + self, + handle: TaskHandle, + handler: Callable[[Mapping[str, Any]], Awaitable[None]], + payload: Mapping[str, Any], + policy: RetryPolicy, + *, + _resume_from_attempt: int = 0, + ) -> None: + delay = policy.initial_backoff_seconds + attempt = _resume_from_attempt + try: + while True: + attempt += 1 + # Persist the attempt counter BEFORE we invoke the + # handler so a crash mid-handler doesn't lose the fact + # that we tried — replay sees the bumped counter and + # respects 
the original retry budget. Trade-off: a + # crash before the external call is made still consumes + # one attempt (at-most-once semantics around the bump); + # we document this as best-effort across crashes. + self._update_record_attempts(handle.task_id, attempt) + + try: + await handler(payload) + except asyncio.CancelledError: + # On a graceful shutdown of a disk-persistent runner + # we deliberately *don't* mark the record terminal — + # ``resume()`` will pick it up on the next boot and + # replay it with the persisted attempt counter. For + # in-memory runners (no cache) there's nothing to + # resume from, so we still mark ``cancelled`` so + # callers holding the handle can observe the + # outcome. + if not (self._shutting_down and self._cache is not None): + self._mark_terminal(handle.task_id, "cancelled") + raise + except Exception as exc: + if attempt >= policy.max_attempts: + logger.exception( + "InProcessTaskRunner: task %s (%s) failed after %d attempts", + handle.name, + handle.task_id, + attempt, + ) + self._mark_terminal(handle.task_id, "failed") + return + logger.warning( + "InProcessTaskRunner: task %s (%s) attempt %d/%d failed (%s); retrying in %.2fs", + handle.name, + handle.task_id, + attempt, + policy.max_attempts, + exc, + delay, + ) + try: + await asyncio.sleep(delay) + except asyncio.CancelledError: + if not (self._shutting_down and self._cache is not None): + self._mark_terminal(handle.task_id, "cancelled") + raise + delay = min(delay * policy.backoff_multiplier, policy.max_backoff_seconds) + else: + self._mark_terminal(handle.task_id, "succeeded") + return + except asyncio.CancelledError: + # Propagate so the outer ``asyncio.Task`` records cancellation + # in its own state for any observer that holds the raw task. 
+ raise + + # ------------------------------------------------------------------ # + # Internals — record / disk helpers + # ------------------------------------------------------------------ # + + def _build_record( + self, + name: str, + payload: Mapping[str, Any], + policy: RetryPolicy, + ) -> dict[str, Any]: + """Construct the on-disk record dict for a freshly-scheduled task.""" + return { + _REC_HANDLER_NAME: name, + _REC_NAME: name, + _REC_PAYLOAD: dict(payload), + _REC_RETRY_POLICY: policy, + _REC_ATTEMPTS: 0, + _REC_STATUS: "pending", + _REC_CREATED_AT: time.time(), + } + + def _validate_picklable(self, record: Mapping[str, Any]) -> None: + """Pickle-probe a record at schedule time so misconfig is loud. + + We only do this when the cache is open (i.e. persistence is on). + The probe runs ``pickle.dumps`` on the record and raises a + framework-typed :class:`PushPayloadNotPicklable` if it fails. + Loud failure here is better than silent data loss after the + next restart. + """ + try: + pickle.dumps(record) # nosec B301 - dumps only, no untrusted load + except Exception as exc: + raise PushPayloadNotPicklable( + "InProcessTaskRunner: scheduled task payload is not picklable; " + "disk persistence (state_dir) requires payloads to round-trip " + "through pickle. Common causes: a user-supplied response that " + "embeds a live network client, asyncio.Lock, or generator. " + f"Underlying pickle error: {exc!r}" + ) from exc + + def _update_record_attempts(self, task_id: str, attempt: int) -> None: + """Bump the attempt counter on the persisted record (if any). + + Status stays ``"pending"`` while the task is in-flight — there + is no transient ``"running"`` status. This keeps the resume + contract simple: anything ``"pending"`` on disk is a candidate + for replay, whether it was never picked up or crashed mid-attempt. 
+ """ + if self._cache is None: + return + rec = self._cache.get(task_id) + if not isinstance(rec, dict): + # Record was evicted / quarantined since schedule; nothing + # to persist. The asyncio task continues — it just won't + # be resumable on next boot. + return + rec[_REC_ATTEMPTS] = attempt + try: + self._cache.set(task_id, rec) + except Exception: # pragma: no cover - cache write failures aren't actionable + logger.exception("InProcessTaskRunner: failed to persist attempt counter for %s", task_id) + + def _update_record_payload(self, task_id: str, new_payload: Mapping[str, Any]) -> None: + """Persist a handler-side payload mutation back to disk. + + Called from :class:`_PersistedPayloadDict.__setitem__`. The whole + payload mapping is re-written (the cache stores opaque pickled + values, so partial-field updates aren't possible). Handler-side + mutations on the runner's hot path are rare (today: only the + ``echo_done`` cursor) so the extra write is acceptable. + """ + if self._cache is None: + return + rec = self._cache.get(task_id) + if not isinstance(rec, dict): + return + rec[_REC_PAYLOAD] = dict(new_payload) + try: + self._cache.set(task_id, rec) + except Exception: # pragma: no cover - cache write failures aren't actionable + logger.exception("InProcessTaskRunner: failed to persist payload mutation for %s", task_id) + + def _mark_terminal(self, task_id: str, status: TaskStatus) -> None: + """Move a task to a terminal status, updating both memory and disk. + + Records are first updated on disk (so a crash between the disk + write and the in-memory write doesn't lose the terminal status), + then mirrored to the in-memory cache, then FIFO-bounded. + """ + # Disk side first. 
+ if self._cache is not None: + rec = self._cache.get(task_id) + if isinstance(rec, dict): + rec[_REC_STATUS] = status + rec[_REC_TERMINAL_AT] = time.time() + # Truncate heavy fields (payload, retry_policy) — once + # the task is terminal we never need them again, and + # keeping them around bloats disk on long-lived hosts. + rec[_REC_PAYLOAD] = None + rec[_REC_RETRY_POLICY] = None + try: + self._cache.set(task_id, rec) + except Exception: # pragma: no cover + logger.exception("InProcessTaskRunner: failed to persist terminal status for %s", task_id) + + # In-memory side. + if task_id not in self._terminal: + self._terminal_order.append(task_id) + self._terminal[task_id] = status + + # FIFO-evict from BOTH layers once we exceed the cap. + while len(self._terminal_order) > self._terminal_cache_size: + evicted = self._terminal_order.pop(0) + self._terminal.pop(evicted, None) + if self._cache is not None: + try: + del self._cache[evicted] + except KeyError: + pass + except Exception: # pragma: no cover + logger.exception("InProcessTaskRunner: failed to evict %s from disk cache", evicted) + + # Persist the new ordering list so a restart sees the same FIFO + # ordering for further eviction decisions. + if self._cache is not None: + try: + self._cache.set(_TERMINAL_ORDER_KEY, list(self._terminal_order)) + except Exception: # pragma: no cover + logger.exception("InProcessTaskRunner: failed to persist terminal-order list") + + +__all__ = ["InProcessTaskRunner"] diff --git a/python/packages/hosting/agent_framework_hosting/_state_store.py b/python/packages/hosting/agent_framework_hosting/_state_store.py new file mode 100644 index 00000000000..1ca62a0a7d5 --- /dev/null +++ b/python/packages/hosting/agent_framework_hosting/_state_store.py @@ -0,0 +1,402 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Disk-backed wrappers for the host's in-memory state dicts. 
+ +The host keeps three in-process dictionaries that need to survive a +process restart when the operator opts in to disk persistence: + +- ``_session_aliases`` (``isolation_key -> active session_id``): rotated + by :meth:`AgentFrameworkHost.reset_session`; without persistence a + restart silently re-uses the pre-rotation session_id and the user sees + history they were supposed to have walked away from. +- ``_active`` (``isolation_key -> last-seen channel name``): drives + :class:`ResponseTarget` ``.active`` fan-out; losing it on restart makes + :class:`ResponseTarget.active` raise ``"no active channel"`` for every + user the host has previously talked to. +- ``_identities`` + (``isolation_key -> {channel_name -> ChannelIdentity}``): the per-user + channel registry that powers :class:`ResponseTarget` ``.channel(name)``, + ``.channels([...])`` and ``.all_linked``; losing it on restart turns + every linked-identity push target into a not-found. + +The wrappers defined below are :class:`dict` subclasses so the rest of +the host code doesn't need to know whether persistence is on or off; the +only difference is that mutations are mirrored back to a +:mod:`diskcache`-backed sqlite store. Reads stay fast because the +in-memory copy is the source of truth — disk is purely a backing +store for write-through and re-hydration. + +Layout under ``/sessions/`` (the ``sessions`` component +chosen because all three dicts share the same per-user life cycle): + + /sessions/ + .lock # single-owner lock (advisory) + cache.db, … # diskcache sqlite files + keyed by: + "aliases:" -> str (session_id) + "active:" -> str (channel name) + "identities:" -> dict[channel_name, ChannelIdentity] + +Pickle is what diskcache uses by default; the wrappers do not impose +their own serialisation. :class:`ChannelIdentity` is a plain class +holding two strings and an ``attributes`` mapping, so it round-trips +cleanly as long as the attribute values are themselves picklable. + +Everything in this module is internal. 
Public consumers should use +:class:`AgentFrameworkHost(state_dir=...)` and let the host wire the +wrappers up. +""" + +from __future__ import annotations + +import logging +import os +from collections.abc import Mapping +from pathlib import Path +from typing import Any, TypeVar, cast + +from ._persistence import ( + acquire_state_dir_lock, + load_diskcache, + release_state_dir_lock, +) + +logger = logging.getLogger(__name__) + + +_V = TypeVar("_V") + + +# Key prefixes inside the shared sessions cache. Three logical maps live +# in one diskcache so they share a single sqlite handle and a single +# directory lock — opening multiple diskcaches against the same +# directory is supported but doubles file-handle pressure and the +# per-open lock acquisition cost. +_ALIASES_PREFIX = "aliases:" +_ACTIVE_PREFIX = "active:" +_IDENTITIES_PREFIX = "identities:" + + +class SessionsStateStore: + """One disk cache + lock shared by every host-side persisted dict. + + The host constructs one of these per ``state_dir["sessions"]`` value + and threads it into each :class:`_PersistedDict` it creates. Closing + the store releases the lock and the cache handle. + """ + + def __init__(self, sessions_dir: str | os.PathLike[str]) -> None: + self._sessions_dir: Path = Path(os.fspath(sessions_dir)) + diskcache = load_diskcache() + self._lock_handle: Any = acquire_state_dir_lock(self._sessions_dir) + try: + self._cache: Any = diskcache.Cache(str(self._sessions_dir)) + except Exception: + release_state_dir_lock(self._lock_handle) + self._lock_handle = None + raise + + @property + def cache(self) -> Any: + """Return the underlying :mod:`diskcache` Cache. + + Intended for the wrapper classes in this module only. Callers + outside the module should go through the typed wrappers — direct + cache access bypasses the key-prefix discipline that keeps the + three maps from colliding. + """ + return self._cache + + def close(self) -> None: + """Close the cache and release the directory lock. 
+ + Safe to call multiple times. The host invokes this from its + lifespan shutdown hook so a second host can re-open the same + ``state_dir`` cleanly after the first exits. + """ + if self._cache is not None: + try: + self._cache.close() + except Exception: # pragma: no cover - close errors aren't actionable + logger.exception("SessionsStateStore: failed to close cache cleanly") + self._cache = None + if self._lock_handle is not None: + release_state_dir_lock(self._lock_handle) + self._lock_handle = None + + +class _PersistedDict(dict[str, _V]): + """Drop-in :class:`dict` whose mutations mirror to a diskcache prefix. + + Used for the host's flat ``str -> V`` dicts (``_session_aliases`` + and ``_active``). The in-memory copy is the source of truth for + reads; writes update memory first and then mirror to disk so a + crash between the two leaves the in-memory state correct (which is + what subsequent reads will see anyway) and only loses the last + not-yet-flushed value on next restart. + """ + + def __init__( + self, + store: SessionsStateStore, + key_prefix: str, + initial: Mapping[str, _V] | None = None, + ) -> None: + super().__init__() + self._store = store + self._prefix = key_prefix + # Rehydrate from disk into memory exactly once at construction. + # Doing this here (rather than lazily) keeps the in-memory dict + # behaviour consistent with the non-persisted code path — + # ``len(host._session_aliases)`` reflects all known users from + # the moment the host is constructed. 
+ cache: Any = store.cache + for raw_key in cache.iterkeys(): + if not isinstance(raw_key, str) or not raw_key.startswith(key_prefix): + continue + value: Any + try: + value = cache.get(raw_key) + except Exception: + logger.exception("SessionsStateStore: failed to rehydrate %s; skipping", raw_key) + continue + logical_key = raw_key[len(key_prefix) :] + super().__setitem__(logical_key, value) + if initial: + for k, v in initial.items(): + self[k] = v + + def __setitem__(self, key: str, value: _V) -> None: + super().__setitem__(key, value) + try: + self._store.cache.set(self._prefix + key, value) + except Exception: # pragma: no cover - cache write failures aren't actionable + logger.exception("SessionsStateStore: failed to persist %s%s", self._prefix, key) + + def __delitem__(self, key: str) -> None: + super().__delitem__(key) + try: + del self._store.cache[self._prefix + key] + except KeyError: + pass + except Exception: # pragma: no cover - cache write failures aren't actionable + logger.exception("SessionsStateStore: failed to evict %s%s", self._prefix, key) + + def pop(self, key: str, *args: Any) -> _V: + # ``dict.pop`` doesn't go through ``__delitem__``, so we mirror + # the disk side here explicitly. Forward the default sentinel + # only when present so we match ``dict.pop`` semantics exactly. 
+ value: _V = super().pop(key, *args) + try: + del self._store.cache[self._prefix + key] + except KeyError: + pass + except Exception: # pragma: no cover + logger.exception("SessionsStateStore: failed to evict %s%s", self._prefix, key) + return value + + def clear(self) -> None: + keys = list(self.keys()) + super().clear() + cache = self._store.cache + for k in keys: + try: + del cache[self._prefix + k] + except KeyError: + pass + except Exception: # pragma: no cover + logger.exception("SessionsStateStore: failed to evict %s%s during clear", self._prefix, k) + + def update( # type: ignore[override] + self, + other: Mapping[str, _V] | None = None, + /, + **kwargs: _V, + ) -> None: + # Defer to __setitem__ so every entry is mirrored to disk; the + # default ``dict.update`` writes into the underlying storage + # directly and would skip our persistence hook. + if other is not None: + for k in other: + self[k] = other[k] + for k, v in kwargs.items(): + self[k] = v + + +class _PersistedNestedDict(dict[str, dict[str, _V]]): + """Disk-backed wrapper for the per-isolation-key identity map. + + The host's ``_identities`` is a nested dict + ``isolation_key -> {channel_name -> ChannelIdentity}``. The whole + inner dict for a given isolation_key is small (one entry per channel + the user has appeared on), so we persist the inner dict as a single + cache value rather than per-channel — fewer cache hits, simpler + schema, no need for a separate sub-prefix. + + To make mutations of the inner dict mirror to disk, ``__getitem__`` + returns a ``_NestedInnerProxy`` that mutates the parent's cache slot + on each ``__setitem__`` / ``__delitem__``. The wrapper is purely + additive — callers that pass a plain dict in via ``__setitem__`` get + the same write-through behaviour for free. 
+ """ + + def __init__( + self, + store: SessionsStateStore, + key_prefix: str = _IDENTITIES_PREFIX, + ) -> None: + super().__init__() + self._store = store + self._prefix = key_prefix + cache: Any = store.cache + for raw_key in cache.iterkeys(): + if not isinstance(raw_key, str) or not raw_key.startswith(key_prefix): + continue + value: Any + try: + value = cache.get(raw_key) + except Exception: + logger.exception("SessionsStateStore: failed to rehydrate %s; skipping", raw_key) + continue + if not isinstance(value, dict): + continue + inner_value = cast(dict[str, _V], value) + logical_key = raw_key[len(key_prefix) :] + # Wrap so caller-side mutations on the inner dict mirror back. + inner: _NestedInnerProxy[_V] = _NestedInnerProxy(self, logical_key, inner_value) + super().__setitem__(logical_key, inner) + + def __setitem__(self, key: str, value: dict[str, _V]) -> None: + # Wrap whatever the caller passes in so subsequent ``inner[ch] = ...`` + # mutations are mirrored to disk. We always wrap (even + # ``_NestedInnerProxy`` inputs) so the proxy's ``_outer`` link + # points at us rather than at any previous outer dict. + wrapped = _NestedInnerProxy(self, key, dict(value)) + super().__setitem__(key, wrapped) + self.persist_inner(key, dict(value)) + + def __delitem__(self, key: str) -> None: + super().__delitem__(key) + try: + del self._store.cache[self._prefix + key] + except KeyError: + pass + except Exception: # pragma: no cover + logger.exception("SessionsStateStore: failed to evict %s%s", self._prefix, key) + + def setdefault(self, key: str, default: dict[str, _V] | None = None) -> dict[str, _V]: # type: ignore[override] + if key in self: + return self[key] + if default is None: + default = {} + self[key] = default + return self[key] + + def persist_inner(self, isolation_key: str, snapshot: Mapping[str, _V]) -> None: + """Write the full inner dict for ``isolation_key`` back to disk. 
+ + Called from :class:`_NestedInnerProxy` on every mutation and by + :meth:`__setitem__` when a new outer key is added. A single + write per change keeps the schema simple — there is no + partial-row update — and is fine for the access pattern + (mutations on the host's hot path are rare: identity registry + writes are once-per-channel-per-user). + """ + try: + self._store.cache.set(self._prefix + isolation_key, snapshot) + except Exception: # pragma: no cover - cache write failures aren't actionable + logger.exception( + "SessionsStateStore: failed to persist identities for %s%s", + self._prefix, + isolation_key, + ) + + +class _NestedInnerProxy(dict[str, _V]): + """Inner-dict proxy that mirrors mutations back to its outer. + + Returned by :class:`_PersistedNestedDict.__getitem__` (via the + rehydration / ``__setitem__`` wrap). When the channel-registry code + does ``self._identities[ik][channel_name] = identity``, the + ``__setitem__`` on this proxy fires and re-writes the whole inner + dict to disk via the parent's ``persist_inner``. Behavioural + identity with ``dict`` is preserved otherwise (``len``, iteration, + ``__contains__``, …). + """ + + _outer: _PersistedNestedDict[_V] + _key: str + + __slots__ = ("_key", "_outer") + + def __init__( + self, + outer: _PersistedNestedDict[_V], + key: str, + data: Mapping[str, _V], + ) -> None: + super().__init__(data) + # ``__slots__`` on a ``dict`` subclass requires the back-door — + # CPython is lenient, PyPy is strict. 
+ object.__setattr__(self, "_outer", outer) + object.__setattr__(self, "_key", key) + + def __setitem__(self, key: str, value: _V) -> None: + super().__setitem__(key, value) + self._outer.persist_inner(self._key, dict(self)) + + def __delitem__(self, key: str) -> None: + super().__delitem__(key) + self._outer.persist_inner(self._key, dict(self)) + + def pop(self, key: str, *args: Any) -> _V: + value: _V = super().pop(key, *args) + self._outer.persist_inner(self._key, dict(self)) + return value + + def clear(self) -> None: + super().clear() + self._outer.persist_inner(self._key, dict(self)) + + def update( # type: ignore[override] + self, + other: Mapping[str, _V] | None = None, + /, + **kwargs: _V, + ) -> None: + if other is not None: + for k in other: + super().__setitem__(k, other[k]) + for k, v in kwargs.items(): + super().__setitem__(k, v) + self._outer.persist_inner(self._key, dict(self)) + + +def build_session_dicts( + store: SessionsStateStore, +) -> tuple[ + _PersistedDict[str], + _PersistedDict[str], + _PersistedNestedDict[Any], +]: + """Construct the three host-side persisted dicts against a single store. + + Returns ``(session_aliases, active, identities)`` in the order the + host assigns them, so the call site reads + ``self._session_aliases, self._active, self._identities = build_session_dicts(store)``. + """ + aliases: _PersistedDict[str] = _PersistedDict(store, _ALIASES_PREFIX) + active: _PersistedDict[str] = _PersistedDict(store, _ACTIVE_PREFIX) + identities: _PersistedNestedDict[Any] = _PersistedNestedDict(store) + return aliases, active, identities + + +# Re-export keys for tests / power users that want to inspect the cache. 
+__all__ = [ + "_ACTIVE_PREFIX", + "_ALIASES_PREFIX", + "_IDENTITIES_PREFIX", + "SessionsStateStore", + "_PersistedDict", + "_PersistedNestedDict", + "build_session_dicts", +] diff --git a/python/packages/hosting/agent_framework_hosting/_types.py b/python/packages/hosting/agent_framework_hosting/_types.py new file mode 100644 index 00000000000..30ad79cdd59 --- /dev/null +++ b/python/packages/hosting/agent_framework_hosting/_types.py @@ -0,0 +1,915 @@ +# Copyright (c) Microsoft. All rights reserved. + +# ``ChannelRequest`` is the only intentional dataclass here (callers use +# ``dataclasses.replace`` on it in run hooks). The other types are plain +# Python classes by preference, so the "could be a dataclass" lint is muted +# at the file level. +# ruff: noqa: B903 + +"""Channel-neutral request envelope and channel protocol types. + +These types form the boundary between the host and individual channels. +A channel parses its native payload, builds a :class:`ChannelRequest`, and +hands it to :class:`ChannelContext.run` (or ``run_stream``) on the host. +The host normalizes the request into a single agent invocation and either +returns the result to the originating channel or fans out via +:class:`ResponseTarget` to other channels that implement +:class:`ChannelPush`. + +See ``docs/specs/002-python-hosting-channels.md`` for the full design. 
+""" + +from __future__ import annotations + +import os +from collections.abc import Awaitable, Callable, Mapping, Sequence +from dataclasses import dataclass, field +from enum import Enum +from typing import TYPE_CHECKING, Any, Generic, Literal, Protocol, TypedDict, TypeVar, runtime_checkable + +from agent_framework import ( + AgentResponse, + AgentResponseUpdate, + AgentRunInputs, + ResponseStream, + SupportsAgentRun, + Workflow, +) +from starlette.routing import BaseRoute + +if TYPE_CHECKING: + from ._host import ChannelContext + + +# --------------------------------------------------------------------------- # +# Channel-neutral request envelope +# --------------------------------------------------------------------------- # + + +class ChannelSession: + """Channel-supplied session hint. + + The host turns this into an ``AgentSession`` keyed by ``isolation_key`` so + every distinct end user gets their own context-provider state (e.g. one + ``FileHistoryProvider`` JSONL file per user). + """ + + def __init__(self, isolation_key: str | None = None) -> None: + self.isolation_key = isolation_key + + +class ChannelIdentity: + """Channel-native identity the host sees on each request. + + Consumed by the host's identity registry. The host uses it for two things: + + 1. Recording the active channel for an ``isolation_key`` so + ``ResponseTarget.active`` resolves correctly. + 2. Telling :class:`ChannelPush` ``push`` recipients **where** in their + native namespace to deliver — Telegram uses ``native_id`` as the + chat id, Teams as the conversation/AAD id, etc. 
+ """ + + def __init__( + self, + channel: str, + native_id: str, + attributes: Mapping[str, Any] | None = None, + ) -> None: + self.channel = channel + self.native_id = native_id + self.attributes: Mapping[str, Any] = attributes if attributes is not None else dict() + + +class ResponseTargetKind(str, Enum): + """Discriminator for :class:`ResponseTarget` variants.""" + + ORIGINATING = "originating" + ACTIVE = "active" + CHANNELS = "channels" + ALL_LINKED = "all_linked" + IDENTITIES = "identities" + NONE = "none" + + +class ResponseTarget: + """Per-request directive controlling **where** the host delivers the agent reply. + + Independent of ``session_mode``. Construct via the classmethod helpers or + use the module-level singletons rather than touching ``kind`` directly. + Variants: + + - ``ResponseTarget.originating`` (default) — synchronous response on the + originating channel only. + - ``ResponseTarget.active`` — push to the channel most recently observed + for the resolved ``isolation_key``. + - ``ResponseTarget.channel("teams")`` / ``.channels([...])`` — push to + one or more named destinations. Each entry is either a bare channel + name (host resolves the native id from its identity registry) or a + ``"channel:native_id"`` token (used verbatim). The pseudo-name + ``"originating"`` includes the originating channel in the fan-out. + - ``ResponseTarget.identity(ChannelIdentity)`` / + ``.identities([ChannelIdentity, ...])`` — push to one or more + **fully-specified identities**. Preferred over the ``"channel:native_id"`` + string variant when the destination needs ``identity.attributes`` + preserved (Teams conversation/thread metadata, Slack channel+thread, + Bot Framework service-url, etc.). + - ``ResponseTarget.all_linked`` — push to every channel where the + resolved ``isolation_key`` has been observed. + - ``ResponseTarget.none`` — background-only; in the prototype this just + suppresses the originating reply (no ``ContinuationToken`` yet). 
+ + Instances are intended to be treated as immutable; the singletons are + shared across the process. + """ + + def __init__( + self, + kind: ResponseTargetKind = ResponseTargetKind.ORIGINATING, + targets: tuple[str, ...] = (), + identities: tuple[ChannelIdentity, ...] = (), + *, + echo_input: bool = False, + ) -> None: + self.kind = kind + self.targets = targets + # Stored under a non-clashing name so the ``identities`` + # *classmethod* (the public builder) can coexist with the + # value accessor (the ``identities`` property below). At + # runtime instance attributes shadow class attributes anyway, + # but type checkers see the classmethod and reject reassignment. + self._target_identities: tuple[ChannelIdentity, ...] = tuple(identities) + # When True, the host first pushes the originating user message + # to every non-originating destination (so end-user apps observing + # those channels can keep their UI in sync) before pushing the + # agent response. Defaults to False — opt-in only, because not + # every channel knows how to render ``role="user"`` content + # gracefully on its own surface. + self.echo_input = echo_input + + @property + def target_identities(self) -> tuple[ChannelIdentity, ...]: + """Destination identities for ``kind == IDENTITIES`` targets. + + Public name distinct from the :meth:`identities` classmethod + builder. Empty for non-``IDENTITIES`` kinds. 
+ """ + return self._target_identities + + # -- builders ---------------------------------------------------------- # + + @classmethod + def channel(cls, name: str, *, echo_input: bool = False) -> ResponseTarget: + """Target a single named destination channel.""" + return cls(kind=ResponseTargetKind.CHANNELS, targets=(name,), echo_input=echo_input) + + @classmethod + def channels(cls, names: Sequence[str], *, echo_input: bool = False) -> ResponseTarget: + """Target an explicit list of destination channels.""" + return cls(kind=ResponseTargetKind.CHANNELS, targets=tuple(names), echo_input=echo_input) + + @classmethod + def identity(cls, identity: ChannelIdentity, *, echo_input: bool = False) -> ResponseTarget: + """Target a single fully-specified :class:`ChannelIdentity`. + + Preferred over the ``"channel:native_id"`` string token in + :meth:`channels` when ``identity.attributes`` carries metadata the + destination channel needs (Teams conversation/thread ids and + service-url, Slack channel + thread, Bot Framework activity-locator + fields, etc.). The host pushes to the named identity verbatim + without consulting its own identity registry. + """ + return cls(kind=ResponseTargetKind.IDENTITIES, identities=(identity,), echo_input=echo_input) + + @classmethod + def identities(cls, identities: Sequence[ChannelIdentity], *, echo_input: bool = False) -> ResponseTarget: + """Target an explicit list of fully-specified :class:`ChannelIdentity` objects. + + See :meth:`identity` for the single-destination variant. + """ + return cls(kind=ResponseTargetKind.IDENTITIES, identities=tuple(identities), echo_input=echo_input) + + # -- value semantics --------------------------------------------------- # + # ``ResponseTarget`` is treated as immutable, so two instances with the + # same ``kind`` + ``targets`` + ``identities`` + ``echo_input`` are + # interchangeable. Tests and channel parsers compare instances with + # ``==`` and use them as dict keys. 
+ + def __eq__(self, other: object) -> bool: + if not isinstance(other, ResponseTarget): + return NotImplemented + return ( + self.kind is other.kind + and self.targets == other.targets + and _identities_equal(self._target_identities, other._target_identities) + and self.echo_input == other.echo_input + ) + + def __hash__(self) -> int: + # ``ChannelIdentity`` is not itself hashable (mutable attributes + # mapping); fold the identifying triple so two ``identities`` + # tuples with the same channel/native_id/attributes content hash + # the same. + identities_key = tuple( + (i.channel, i.native_id, tuple(sorted(i.attributes.items()))) for i in self._target_identities + ) + return hash((self.kind, self.targets, identities_key, self.echo_input)) + + def __repr__(self) -> str: + suffix = ", echo_input=True" if self.echo_input else "" + if self.kind is ResponseTargetKind.CHANNELS: + return f"ResponseTarget.channels({list(self.targets)!r}{suffix})" + if self.kind is ResponseTargetKind.IDENTITIES: + return f"ResponseTarget.identities({list(self._target_identities)!r}{suffix})" + return f"ResponseTarget.{self.kind.value}{suffix}" + + +def _identities_equal(left: tuple[ChannelIdentity, ...], right: tuple[ChannelIdentity, ...]) -> bool: + """Structural-equality helper for ``ResponseTarget.identities`` comparisons. + + ``ChannelIdentity`` is a plain class without ``__eq__``, so ``tuple`` / + ``list`` comparisons fall back to identity equality which is too strict + for value-typed ``ResponseTarget`` callers (two equivalent identity + tuples produced independently would otherwise compare unequal). 
+ """ + if len(left) != len(right): + return False + for a, b in zip(left, right, strict=True): + if a.channel != b.channel or a.native_id != b.native_id: + return False + if dict(a.attributes) != dict(b.attributes): + return False + return True + + +# Module-level singletons so callers can write ``ResponseTarget.originating`` +# (matching the spec's classmethod-style notation) without juggling Python's +# no-zero-arg-classmethod-property limitation. +ResponseTarget.originating = ResponseTarget(kind=ResponseTargetKind.ORIGINATING) # type: ignore[attr-defined] +ResponseTarget.active = ResponseTarget(kind=ResponseTargetKind.ACTIVE) # type: ignore[attr-defined] +ResponseTarget.all_linked = ResponseTarget(kind=ResponseTargetKind.ALL_LINKED) # type: ignore[attr-defined] +ResponseTarget.none = ResponseTarget(kind=ResponseTargetKind.NONE) # type: ignore[attr-defined] + + +@dataclass +class ChannelRequest: + """Uniform invocation envelope every channel produces from its native payload. + + Kept as a dataclass so app authors can use ``dataclasses.replace(...)`` in + run hooks to produce a modified envelope without re-listing every field. + """ + + channel: str + operation: str # e.g. "message.create", "command.invoke" + input: AgentRunInputs + session: ChannelSession | None = None + options: Mapping[str, Any] | None = None + session_mode: str = "auto" # "auto" | "required" | "disabled" + metadata: Mapping[str, Any] = field(default_factory=lambda: {}) + attributes: Mapping[str, Any] = field(default_factory=lambda: {}) + stream: bool = False + identity: ChannelIdentity | None = None + response_target: ResponseTarget = field(default_factory=lambda: ResponseTarget.originating) # type: ignore[attr-defined] + + +class ChannelCommand: + """A discoverable command a channel exposes to its users (e.g. 
``/reset``).""" + + def __init__( + self, + name: str, + description: str, + handle: Callable[[ChannelCommandContext], Awaitable[None]], + ) -> None: + self.name = name + self.description = description + self.handle = handle + + +class ChannelCommandContext: + """Context passed to a :class:`ChannelCommand` handler.""" + + def __init__( + self, + request: ChannelRequest, + reply: Callable[[str], Awaitable[None]], + ) -> None: + self.request = request + self.reply = reply + + +_EMPTY_ROUTES: tuple[BaseRoute, ...] = () +_EMPTY_COMMANDS: tuple[ChannelCommand, ...] = () +_EMPTY_LIFECYCLE: tuple[Callable[[], Awaitable[None]], ...] = () + + +class ChannelContribution: + """Routes, commands, and lifecycle hooks a channel contributes to the host.""" + + def __init__( + self, + routes: Sequence[BaseRoute] = _EMPTY_ROUTES, + commands: Sequence[ChannelCommand] = _EMPTY_COMMANDS, + on_startup: Sequence[Callable[[], Awaitable[None]]] = _EMPTY_LIFECYCLE, + on_shutdown: Sequence[Callable[[], Awaitable[None]]] = _EMPTY_LIFECYCLE, + ) -> None: + self.routes = routes + self.commands = commands + self.on_startup = on_startup + self.on_shutdown = on_shutdown + + +class _Unset: + """Sentinel for ``HostedRunResult.replace`` overrides. + + Distinguishes "caller did not pass this kwarg" from "caller passed + ``None`` explicitly" — needed because ``session`` is ``None`` in + many envelopes and we want the no-arg call to preserve it. + """ + + +_UNSET = _Unset() + + +TResult = TypeVar("TResult") + + +class HostedRunResult(Generic[TResult]): + r"""Channel-neutral envelope around the target's full-fidelity result. + + Carries the underlying execution payload **unchanged** so channels + (and developer-supplied ``response_hook``\\s) can read everything the + target produced — full multi-modal contents, structured ``value``, + ``usage_details``, ``response_id``, workflow per-executor outputs, + final ``WorkflowRunState``, etc. 
+ + ``result`` is generic in ``TResult`` so callers retain static typing: + + * Agent targets always produce + ``HostedRunResult[AgentResponse]`` — channels read + ``result.messages``, ``result.value``, ``result.usage_details``, … + directly. + * Workflow targets produce ``HostedRunResult[WorkflowRunResult]`` + today (``Workflow`` is not itself generic, so the static narrowing + is only as tight as ``Workflow.run``'s return). Channels iterate + ``result.get_outputs()`` and inspect ``result.get_final_state()`` + to render workflow-specific UX. When a host author drives the + workflow themselves and knows the final-output type, they may + narrow to ``HostedRunResult[MyOutput]`` in their own + ``response_hook`` signatures. + * The echo-input phase synthesises a ``HostedRunResult[AgentResponse]`` + wrapping the originating user turn so the same per-destination + delivery machinery applies. + + The optional ``session`` slot carries the resolved + :class:`~agent_framework.AgentSession` the host bound to this + invocation (``None`` for workflow targets, which do not own session + state in the agent sense). Channels that want to surface session + metadata (e.g. echo the resolved isolation key into a response + header) read it here. + + Treat instances as immutable: the host clones per-destination before + invoking a per-channel ``response_hook`` so one channel's transform + cannot perturb the payload another destination observes. + """ + + def __init__( + self, + result: TResult, + *, + session: Any | None = None, + ) -> None: + self.result = result + self.session = session + + def replace( + self, + *, + result: TResult | _Unset = _UNSET, + session: Any | _Unset | None = _UNSET, + ) -> HostedRunResult[TResult]: + """Return a shallow copy with the supplied fields overridden. + + Used by the host's delivery layer to clone the envelope before + applying a per-destination ``response_hook``, so one channel's + transform cannot mutate the payload another destination sees.
+ The clone is shallow — channels that need to mutate + ``result.messages`` (or any other nested mutable container) are + responsible for deep-cloning that container themselves. + """ + new: HostedRunResult[TResult] = HostedRunResult.__new__(HostedRunResult) # pyright: ignore[reportUnknownVariableType] + new.result = self.result if isinstance(result, _Unset) else result + new.session = self.session if isinstance(session, _Unset) else session + return new + + +class DurableTaskPayloadMode(str, Enum): + """How a :class:`DurableTaskRunner` consumes scheduled-task payloads. + + Used by the host's startup validator to pair a runner's persistence + expectations with the channels' push-codec capabilities. Adapter packages + pick the right value for their backing store. + + * ``OBJECT`` — the runner accepts live Python objects in the payload. + No serialization is required; the host's + :class:`InProcessTaskRunner` is the canonical example. Suitable for + ``runtime_mode="long_running"`` deployments where the runner shares + address space with the producer. + * ``JSON`` — the runner persists the payload (database, durable queue, + Foundry scheduled-task store, …) and replays it after a process + restart. Payloads MUST be JSON-serializable, which constrains what + the host can put on the wire. The host validates at construction + that every push-capable channel exposes a + :class:`ChannelPushCodec` (so :class:`HostedRunResult` payloads can + be reduced to a JSON envelope before scheduling). + """ + + OBJECT = "object" + JSON = "json" + + +# A push-codec implementation reduces the ``(result, request, identity)`` +# triple a destination channel will receive into a JSON-safe envelope that +# a durable :class:`DurableTaskRunner` can persist, and reconstructs the +# rendering inputs on the consumer side. The host *invokes* the codec +# during scheduling; the destination channel implements it (the channel +# knows what shape of payload it can render). 
+# +# Channels with no push codec are usable only with object-mode runners +# (the default :class:`InProcessTaskRunner`) — the host validates this at +# construction so the mismatch surfaces eagerly rather than on first push. +class ChannelPushCodec(Protocol): + """Optional capability: serialise the push envelope for a durable task runner. + + Implementations live on the destination channel (alongside ``push``) + as a duck-typed ``push_codec`` attribute. The host's + :meth:`_deliver_response` invokes :meth:`encode` once per scheduled + push (in JSON-mode runner deployments) to produce a JSON-safe + envelope for the runner; the handler calls :meth:`decode` + immediately before invoking :meth:`ChannelPush.push`. Object-mode + runners (the default in-process runner) bypass the codec entirely + and pass live references through verbatim. + + Encoded envelopes MUST be JSON-serialisable + (``dict``/``list``/``str``/``int``/``float``/``bool``/``None``). + Channels that cannot satisfy this for some inputs (e.g. arbitrary + workflow result objects without a stable schema) SHOULD raise a + typed :class:`PushPayloadNotSerializable` from :meth:`encode` + rather than return a best-effort representation; the host surfaces + that as a schedule-time error and the destination is treated as + skipped (other destinations still get their chance). + """ + + async def encode( + self, + *, + result: HostedRunResult[Any], + request: ChannelRequest, + identity: ChannelIdentity, + echo_result: HostedRunResult[Any] | None, + ) -> Mapping[str, Any]: + """Project the in-memory push triple into a JSON-safe envelope.""" + ... + + async def decode( + self, + envelope: Mapping[str, Any], + ) -> tuple[HostedRunResult[Any], ChannelRequest, ChannelIdentity, HostedRunResult[Any] | None]: + """Reconstruct ``(result, request, identity, echo_result)`` from an envelope.""" + ... 
+ + +class PushPayloadNotSerializable(RuntimeError): + """Raised by a :class:`ChannelPushCodec` when the payload cannot be serialised. + + Channels raise this from :meth:`ChannelPushCodec.encode` when the + inbound :class:`HostedRunResult` carries content the codec has no + JSON projection for (e.g. an arbitrary workflow result with no + declared schema). The host surfaces the error eagerly at schedule + time rather than letting the runner discover it after persisting + a half-formed envelope. + """ + + +class PushPayloadNotPicklable(RuntimeError): + """Raised when a disk-persistent runner cannot pickle a scheduled task payload. + + The in-process runner falls back to pickle when ``state_dir`` is set + so a long-running host can resume in-flight pushes across restarts. + Most :class:`HostedRunResult` payloads (thin wrappers around + :class:`AgentResponse` or workflow output) pickle without issue, but + a user-supplied workflow result or response hook may embed an + unpicklable object (live network client, ``asyncio.Lock``, generator). + The runner raises this at schedule time so the misconfiguration is loud + rather than silently downgrading to no-persistence. + """ + + +class HostStatePaths(TypedDict, total=False): + """Per-component disk paths for host-managed state. + + Pass an instance of this typed dict to + :class:`~agent_framework_hosting._host.AgentFrameworkHost`'s + ``state_dir`` parameter when you want to place individual components + on different volumes — for example, a fast local SSD for the runner + task queue and a network-attached durable volume for session state + that needs to survive container moves. + + All keys are optional (``total=False``): unset components fall back + to in-memory storage (or, for ``checkpoints``, to no checkpoint + persistence).
Pass a single ``str``/``PathLike`` to ``state_dir`` + instead to get the default subfolder layout + (``state_dir/runner/``, ``state_dir/sessions/``, + ``state_dir/checkpoints/``, ``state_dir/links/``). + + Future components (continuations, ledger) will be added as additional + keys in subsequent releases. + """ + + runner: str | os.PathLike[str] + """Where :class:`~agent_framework_hosting._runner.InProcessTaskRunner` + persists its pending-task queue and bounded terminal-status cache. + Required for in-flight push retries to survive process restarts.""" + + sessions: str | os.PathLike[str] + """Where the host persists session aliases (from + :meth:`AgentFrameworkHost.reset_session`), the per-isolation-key + identity registry, and the last-active-channel map. Required for + ``ResponseTarget.active``/``.channel``/``.all_linked`` to find + destinations after a restart, and for ``reset_session`` rotations + to survive a restart.""" + + checkpoints: str | os.PathLike[str] + """Where the host persists workflow checkpoints for ``Workflow`` + targets. Equivalent to passing ``checkpoint_location=`` + directly: the host wraps it in a per-isolation-key + :class:`~agent_framework.FileCheckpointStorage`. Ignored when the + target is a ``SupportsAgentRun`` agent (a warning is emitted if you + set it explicitly via the mapping form). Pass the legacy + ``checkpoint_location`` parameter instead when you need to supply a + :class:`~agent_framework.CheckpointStorage` instance — it takes + precedence over this key.""" + + links: str | os.PathLike[str] + """Where identity-linker implementations persist their link store: + pending link challenges/grants, channel-native identity to linked + isolation-key mappings, and verified-claim metadata. The core host + does not impose a storage format; concrete :class:`IdentityLinker` + implementations that support host-provided persistence receive this + path via ``configure_link_store_path``. 
If a linker manages its own + persistence, omit this key or configure that linker directly.""" + + +# A transform hook runs over each AgentResponseUpdate as the channel consumes +# the stream. It can return a replacement update, ``None`` to drop the update, +# or be async. Channels apply it during iteration so that channel-specific +# concerns (e.g. masking, redaction, formatting for the wire) live close to +# the channel rather than on the agent. +ChannelStreamTransformHook = Callable[ + [AgentResponseUpdate], + "AgentResponseUpdate | Awaitable[AgentResponseUpdate | None] | None", +] + + +# --------------------------------------------------------------------------- # +# Channel run hook +# --------------------------------------------------------------------------- # + + +# Run hooks accept the channel-built ``ChannelRequest`` and return a +# (possibly modified) replacement. Channels invoke the hook with the +# request as a positional argument and the channel-side context as keyword +# arguments — the call convention is +# ``await hook(request, target=..., protocol_request=...)``. +# +# The ergonomic minimum for a hook implementation is therefore a function +# accepting ``request`` positionally plus ``**kwargs`` and returning a +# (possibly mutated) :class:`ChannelRequest`. Hooks that need the agent +# target or the raw channel-native payload pull them off the keyword +# arguments by name (``target`` / ``protocol_request``). +# +# ``protocol_request`` is the raw, channel-native payload the channel +# parsed (the JSON body for Responses, the Telegram ``Update`` dict, the +# Bot Framework ``Activity`` for Teams). Use it when the hook needs a +# field the channel did not lift onto ``ChannelRequest`` (e.g. OpenAI's +# ``safety_identifier``, Teams' ``from.aadObjectId``, …).
+ChannelRunHook = Callable[..., "Awaitable[ChannelRequest] | ChannelRequest"] + + +async def apply_run_hook( + hook: ChannelRunHook, + request: ChannelRequest, + *, + target: SupportsAgentRun | Workflow, + protocol_request: Any | None, +) -> ChannelRequest: + """Channel-side helper to invoke a :data:`ChannelRunHook` with the standard kwargs. + + Channels call this rather than calling the hook directly so the + invocation convention (``request`` positional, ``target`` / + ``protocol_request`` keyword) is enforced in one place. + """ + result = hook(request, target=target, protocol_request=protocol_request) + if isinstance(result, Awaitable): + return await result + return result + + +# --------------------------------------------------------------------------- # +# Channel response hook +# --------------------------------------------------------------------------- # + + +class ChannelResponseContext: + """Per-destination context handed to a :data:`ChannelResponseHook`. + + Response hooks run on the *output* side of the host pipeline, after + the agent / workflow has produced a :class:`HostedRunResult` but + before the destination channel serialises it to its wire format. + Hooks may need to make decisions based on *where* the payload is + headed — e.g. flatten multi-modal output to text for a text-only + destination, or pick which content variant to deliver to a card- + capable channel. The context captures that information without + forcing hooks to parse stringly destination tokens. + """ + + def __init__( + self, + request: ChannelRequest, + channel_name: str, + destination_identity: ChannelIdentity | None, + originating: bool, + is_echo: bool = False, + ) -> None: + self.request = request + self.channel_name = channel_name + # ``None`` when the originating channel is rendering its own reply + # (no push identity needed for "respond on the wire you came in + # on") or when the destination is named without a known native id. 
+ self.destination_identity = destination_identity + # True when this hook invocation is for the originating channel's + # synchronous reply. False for non-originating push targets. + self.originating = originating + # True when the payload being shaped is the user-message echo + # rather than the agent response (only happens when + # ``ResponseTarget.echo_input`` is set). + self.is_echo = is_echo + + +# Response hooks accept the :class:`HostedRunResult` the host has assembled +# and return a (possibly modified) replacement. Channels invoke the hook +# with the payload as a positional argument and the per-destination +# :class:`ChannelResponseContext` as a keyword argument — the call +# convention is ``await hook(result, context=...)``. +# +# The ergonomic minimum for a hook implementation is a function accepting +# ``result`` positionally plus ``**kwargs`` and returning a (possibly +# rewritten) :class:`HostedRunResult`. Hooks that need to branch on the +# destination read it off the ``context`` keyword argument. +# +# ``HostedRunResult`` is generic in the underlying ``result`` type; the +# hook callable signature stays ``Any``-typed so a single +# ``response_hook`` attribute on a channel can serve both agent +# (``HostedRunResult[AgentResponse]``) and workflow +# (``HostedRunResult[WorkflowRunResult]``) payloads — channels narrow +# at hook entry if they need static checking. +ChannelResponseHook = Callable[..., "Awaitable[HostedRunResult[Any]] | HostedRunResult[Any]"] + + +async def apply_response_hook( + hook: ChannelResponseHook, + result: HostedRunResult[Any], + *, + context: ChannelResponseContext, +) -> HostedRunResult[Any]: + """Channel-side helper to invoke a :data:`ChannelResponseHook` with the standard kwargs. + + Channels (and the host's delivery layer) call this rather than calling + the hook directly so the invocation convention (``result`` positional, + ``context`` keyword) is enforced in one place.
+ """ + out = hook(result, context=context) + if isinstance(out, Awaitable): + return await out + return out + + +# --------------------------------------------------------------------------- # +# Channel protocols +# --------------------------------------------------------------------------- # + + +@runtime_checkable +class Channel(Protocol): + """A pluggable adapter that exposes one transport on the host. + + Channels publish their routes, commands, and lifecycle callbacks via + :meth:`contribute`. The host mounts them under the channel's ``path`` + (or at the app root when ``path == ""``) and gives the channel a + :class:`ChannelContext` so it can call back into the host to invoke + the agent target and deliver responses. + """ + + name: str + path: str # default mount path (e.g. "/responses"); use "" to mount routes at the app root + + def contribute(self, context: ChannelContext) -> ChannelContribution: ... + + +@runtime_checkable +class ChannelPush(Protocol): + r"""Optional capability: a channel that can deliver outbound messages without a prior request. + + Per SPEC-002 (req #13), channels that can do proactive delivery + (Telegram bot proactive message, Teams proactive bot message, + webhook callbacks, SSE broadcasts) implement ``push`` on top of the + base :class:`Channel` protocol. Channels without push can only be + addressed as the ``originating`` :class:`ResponseTarget`. + + Distinguishing user echoes from agent replies + --------------------------------------------- + When the originating :class:`ResponseTarget` opts in to + ``echo_input=True``, the host pushes the user's input message to + each non-originating destination **before** the agent reply. 
Both + pushes go through the same ``push(identity, payload)`` entry point; + the channel distinguishes them by inspecting the role on the + payload's underlying :class:`~agent_framework.Message`\\(s): + + * ``payload.result.messages[i].role == "user"`` → the echo phase + (originating user's turn mirrored onto this destination so the + channel's UX can stay coherent with the user's actual prompt). + Channels that cannot impersonate the user (most chat bots can + only send AS the bot) typically render echoes as a quoted / + prefixed block, drop them, or skip them via a + ``response_hook`` — see below. + * ``payload.result.messages[i].role == "assistant"`` → the agent's + reply. + + Channels that want to branch on phase WITHOUT inspecting roles can + instead expose a ``response_hook`` attribute on the channel + instance: the host calls the hook with a + :class:`ChannelResponseContext` whose ``is_echo`` flag carries the + same phase information explicitly, and the hook returns a + (possibly rewritten) :class:`HostedRunResult` that the host then + hands to ``push``. The hook seam is duck-typed and intentionally + NOT part of this Protocol so adding hook support to an existing + channel never breaks its public contract. + """ + + name: str + + async def push(self, identity: ChannelIdentity, payload: HostedRunResult[Any]) -> None: ... + + +# --------------------------------------------------------------------------- # +# Durable task runner — pluggable seam for non-originating push fan-out and +# (in v1 fast-follow) background runs. See spec §"Durable task runner". +# --------------------------------------------------------------------------- # + + +@dataclass(frozen=True) +class RetryPolicy: + """Retry contract a :class:`DurableTaskRunner` honours per scheduled task. 
+ + Defaults are deliberately conservative — five attempts on a 1s/2x/60s + exponential backoff — so a transient channel outage (Telegram returning + 502, Activity Protocol token refresh) is retried automatically without the + operator wiring anything. Adapter backends (TaskHub, Foundry durable + tasks) MAY translate this into their native retry primitive; the + in-process runner implements it directly via ``asyncio.sleep``. + """ + + max_attempts: int = 5 + initial_backoff_seconds: float = 1.0 + backoff_multiplier: float = 2.0 + max_backoff_seconds: float = 60.0 + + +@dataclass(frozen=True) +class TaskHandle: + """Opaque, runner-issued handle for a scheduled task. + + Callers receive one of these from :meth:`DurableTaskRunner.schedule` and + pass it back to :meth:`DurableTaskRunner.get` to poll status. ``task_id`` + is opaque — its shape is implementation-defined (UUID for the in-process + runner, instance id for TaskHub, scheduled-task arn for Foundry). The + ``name`` mirrors the handler name supplied to :meth:`schedule` so the + caller does not have to track it separately. + """ + + task_id: str + name: str + + +TaskStatus = Literal["scheduled", "running", "succeeded", "failed", "cancelled"] + + +@runtime_checkable +class DurableTaskRunner(Protocol): + """Pluggable seam the host uses to schedule out-of-band work. + + The host registers a single internal handler — ``"hosting.push"`` — at + startup; each non-originating push destination becomes a + ``runner.schedule("hosting.push", payload)`` call. The handler resolves + the destination channel, runs its ``response_hook`` (if any), and calls + :meth:`ChannelPush.push`. Failures inside the handler are caught by the + runner, retried per the supplied :class:`RetryPolicy`, and ultimately + marked terminal-failed when ``max_attempts`` is exhausted.
+ + Two implementations ship in the framework: an in-process default + (``InProcessTaskRunner``, asyncio + bounded retry, no cross-restart + persistence) suitable for ``runtime_mode="long_running"`` deployments, + plus adapter packages (``agent-framework-hosting-durabletask``, a future + Foundry adapter) for ``runtime_mode="ephemeral"`` deployments that need + cross-restart durability. + + Adapters MUST publish their ``payload_mode`` so the host's startup + validator can pair runner persistence expectations with channel + push-codec capabilities. Object-mode runners accept live Python + references in the payload (the in-process default does this for + speed); JSON-mode runners persist payloads across process restarts + and therefore require every push-capable channel to expose a + :class:`ChannelPushCodec`. + """ + + # Adapter classes set this explicitly; the host inspects it at + # construction time. Default is conservative ("object") so a runner + # that omits the attribute is treated as in-process-only and does + # not silently impose a JSON requirement on channels. + payload_mode: DurableTaskPayloadMode + + def register( + self, + name: str, + handler: Callable[[Mapping[str, Any]], Awaitable[None]], + ) -> None: + """Register a named handler the runner will invoke when a task fires. + + Re-registering under the same name replaces the previous handler. + Implementations SHOULD raise :class:`RuntimeError` if called after + the runner has been started, to avoid silent reorderings of in-flight + work; the in-process runner enforces this. + """ + ... + + async def schedule( + self, + name: str, + payload: Mapping[str, Any], + *, + retry_policy: RetryPolicy | None = None, + ) -> TaskHandle: + """Schedule a previously-registered handler invocation. + + ``name`` MUST match a name previously passed to :meth:`register`. The + ``payload`` is forwarded verbatim to the handler; implementations + MUST treat it as opaque (no introspection, no normalization). 
+ ``retry_policy`` overrides the runner's default for this task only; + ``None`` means "use the runner-wide default". + + Returns a :class:`TaskHandle` the caller may use with :meth:`get` to + poll status. Returning the handle MUST NOT wait for the task to run + — scheduling is fire-and-forget from the caller's perspective. + """ + ... + + async def get(self, handle: TaskHandle) -> TaskStatus | None: + """Return the current status of a scheduled task. + + Returns ``None`` if the runner no longer has any record of the task + (e.g. it was scheduled in a prior process and the runner has no + persistent backing). Otherwise one of the :data:`TaskStatus` values. + """ + ... + + +__all__ = [ + "AgentResponse", + "AgentResponseUpdate", + "Channel", + "ChannelCommand", + "ChannelCommandContext", + "ChannelContribution", + "ChannelIdentity", + "ChannelPush", + "ChannelPushCodec", + "ChannelRequest", + "ChannelResponseContext", + "ChannelResponseHook", + "ChannelRunHook", + "ChannelSession", + "ChannelStreamTransformHook", + "DurableTaskPayloadMode", + "DurableTaskRunner", + "HostStatePaths", + "HostedRunResult", + "PushPayloadNotPicklable", + "PushPayloadNotSerializable", + "ResponseStream", + "ResponseTarget", + "ResponseTargetKind", + "RetryPolicy", + "TaskHandle", + "TaskStatus", + "apply_response_hook", + "apply_run_hook", +] diff --git a/python/packages/hosting/pyproject.toml b/python/packages/hosting/pyproject.toml new file mode 100644 index 00000000000..f412c84c293 --- /dev/null +++ b/python/packages/hosting/pyproject.toml @@ -0,0 +1,110 @@ +[project] +name = "agent-framework-hosting" +description = "Multi-channel hosting for Microsoft Agent Framework agents." 
+authors = [{ name = "Microsoft", email = "af-support@microsoft.com"}] +readme = "README.md" +requires-python = ">=3.10" +version = "1.0.0a260424" +license-files = ["LICENSE"] +urls.homepage = "https://aka.ms/agent-framework" +urls.source = "https://github.com/microsoft/agent-framework/tree/main/python" +urls.release_notes = "https://github.com/microsoft/agent-framework/releases?q=tag%3Apython-1&expanded=true" +urls.issues = "https://github.com/microsoft/agent-framework/issues" +classifiers = [ + "License :: OSI Approved :: MIT License", + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", + "Programming Language :: Python :: 3.14", + "Typing :: Typed", +] +dependencies = [ + "agent-framework-core>=1.2.0,<2", + "starlette>=0.37", +] + +[project.optional-dependencies] +serve = [ + "hypercorn>=0.17", +] +disk = [ + "diskcache>=5.6", +] + +[tool.uv] +prerelease = "if-necessary-or-explicit" +environments = [ + "sys_platform == 'darwin'", + "sys_platform == 'linux'", + "sys_platform == 'win32'" +] + +[tool.uv-dynamic-versioning] +fallback-version = "0.0.0" + +[tool.pytest.ini_options] +testpaths = 'tests' +addopts = "-ra -q -r fEX" +asyncio_mode = "auto" +asyncio_default_fixture_loop_scope = "function" +filterwarnings = [] +timeout = 120 +markers = [ + "integration: marks tests as integration tests that require external services", +] + +[tool.ruff] +extend = "../../pyproject.toml" + +[tool.coverage.run] +omit = [ + "**/__init__.py" +] + +[tool.pyright] +extends = "../../pyproject.toml" +include = ["agent_framework_hosting"] +exclude = ['tests'] + +[tool.mypy] +plugins = ['pydantic.mypy'] +strict = true +python_version = "3.10" +ignore_missing_imports = true +disallow_untyped_defs = true +no_implicit_optional = true +check_untyped_defs = 
true +warn_return_any = true +show_error_codes = true +warn_unused_ignores = false +disallow_incomplete_defs = true +disallow_untyped_decorators = true + +[tool.bandit] +targets = ["agent_framework_hosting"] +exclude_dirs = ["tests"] + +[tool.poe] +executor.type = "uv" +include = "../../shared_tasks.toml" + +[tool.poe.tasks.mypy] +help = "Run MyPy for this package." +cmd = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_hosting" + +[tool.poe.tasks.test] +help = "Run the default unit test suite for this package." +cmd = 'pytest -m "not integration" --cov=agent_framework_hosting --cov-report=term-missing:skip-covered tests' + +[build-system] +requires = ["flit-core >= 3.11,<4.0"] +build-backend = "flit_core.buildapi" + +[dependency-groups] +dev = [ + "httpx>=0.28.1", +] diff --git a/python/packages/hosting/tests/__init__.py b/python/packages/hosting/tests/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/python/packages/hosting/tests/_workflow_fixtures.py b/python/packages/hosting/tests/_workflow_fixtures.py new file mode 100644 index 00000000000..f59bb8cab8e --- /dev/null +++ b/python/packages/hosting/tests/_workflow_fixtures.py @@ -0,0 +1,43 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Workflow fixtures for hosting tests. + +Defined in a module that does not use ``from __future__ import annotations`` +because the workflow handler validation reflects on real annotation objects +rather than stringified forms. 
+""" + +from agent_framework import Executor, Workflow, WorkflowBuilder, WorkflowContext, handler + + +class _UpperExecutor(Executor): + @handler + async def handle(self, text: str, ctx: WorkflowContext[str]) -> None: + await ctx.yield_output(text.upper()) + + +class _EchoExecutor(Executor): + @handler + async def handle(self, text: str, ctx: WorkflowContext[str]) -> None: + await ctx.yield_output(text) + + +def build_upper_workflow() -> Workflow: + return WorkflowBuilder(start_executor=_UpperExecutor(id="upper")).build() + + +def build_echo_workflow() -> Workflow: + return WorkflowBuilder(start_executor=_EchoExecutor(id="echo")).build() + + +class _MultiChunkExecutor(Executor): + """Yields three separate ``output`` events so streaming has something to chew on.""" + + @handler + async def handle(self, text: str, ctx: WorkflowContext[str]) -> None: + for chunk in (f"{text}-1", f"{text}-2", f"{text}-3"): + await ctx.yield_output(chunk) + + +def build_multi_chunk_workflow() -> Workflow: + return WorkflowBuilder(start_executor=_MultiChunkExecutor(id="multi")).build() diff --git a/python/packages/hosting/tests/test_authorization.py b/python/packages/hosting/tests/test_authorization.py new file mode 100644 index 00000000000..9de3f6b5cad --- /dev/null +++ b/python/packages/hosting/tests/test_authorization.py @@ -0,0 +1,580 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""Tests for the authorization and identity-linking seam.""" + +from __future__ import annotations + +from collections.abc import Collection +from typing import Any + +import pytest + +from agent_framework_hosting import ( + AgentFrameworkHost, + AllOfAllowlists, + AllowAll, + Allowed, + AllowlistDecision, + AnyOfAllowlists, + AuthorizationContext, + AuthPolicy, + CallableAllowlist, + ChannelConfigurationError, + ChannelContext, + ChannelContribution, + ChannelIdentity, + Denied, + LinkChallenge, + LinkedClaimAllowlist, + LinkedIdentity, + LinkRequired, + NativeIdAllowlist, +) + +# --------------------------------------------------------------------------- # +# Fakes # +# --------------------------------------------------------------------------- # + + +class _ChannelStub: + name: str = "stub" + path: str = "/stub" + require_link: bool = False + allowlist: Any = "inherit" + emits_verified_claims: bool = False + + def __init__( + self, + *, + name: str = "stub", + require_link: bool = False, + allowlist: Any = "inherit", + emits_verified_claims: bool = False, + ) -> None: + self.name = name + self.path = f"/{name}" + self.require_link = require_link + self.allowlist = allowlist + self.emits_verified_claims = emits_verified_claims + + def contribute(self, context: ChannelContext) -> ChannelContribution: + return ChannelContribution(routes=[]) + + +class _AgentStub: + """Bare minimum target — the validators run during ``__init__``, + not on first request, so the target is never actually invoked.""" + + async def run(self, *args: Any, **kwargs: Any) -> Any: # pragma: no cover + raise NotImplementedError + + +class _StaticLinker: + """Test linker returning either a linked identity or a challenge.""" + + def __init__(self, result: LinkedIdentity | LinkChallenge) -> None: + self.result = result + self.calls: list[ChannelIdentity] = [] + + async def resolve(self, identity: ChannelIdentity) -> LinkedIdentity | LinkChallenge: + self.calls.append(identity) + return 
self.result + + +def _ctx_pre_link(channel: str = "telegram", native_id: str = "42") -> AuthorizationContext: + return AuthorizationContext( + identity=ChannelIdentity(channel=channel, native_id=native_id), + phase="pre_link", + ) + + +def _ctx_post_link(claims: dict[str, str] | None = None) -> AuthorizationContext: + return AuthorizationContext( + identity=ChannelIdentity(channel="telegram", native_id="42"), + phase="post_link", + isolation_key="alice", + verified_claims=claims or {}, + claim_source="linker", + ) + + +# --------------------------------------------------------------------------- # +# Built-in allowlists # +# --------------------------------------------------------------------------- # + + +class TestAllowAll: + async def test_allows_both_phases(self) -> None: + a = AllowAll() + assert await a.evaluate(_ctx_pre_link()) is AllowlistDecision.ALLOW + assert await a.evaluate(_ctx_post_link()) is AllowlistDecision.ALLOW + + def test_does_not_require_linked_claims(self) -> None: + assert AllowAll().requires_linked_claims is False + + +class TestNativeIdAllowlist: + async def test_allows_listed_id(self) -> None: + a = NativeIdAllowlist({"42", "99"}) + assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW + + async def test_denies_unlisted_id(self) -> None: + a = NativeIdAllowlist({"42"}) + assert await a.evaluate(_ctx_pre_link(native_id="99")) is AllowlistDecision.DENY + + async def test_channel_filter_abstains_for_other_channels(self) -> None: + # The native-id list is scoped to "telegram" — a request from + # another channel should ABSTAIN so a combinator can give a + # parallel allowlist a chance to ALLOW. 
+ a = NativeIdAllowlist({"42"}, channel="telegram") + assert await a.evaluate(_ctx_pre_link(channel="slack", native_id="42")) is AllowlistDecision.ABSTAIN + + async def test_channel_filter_evaluates_matching_channel(self) -> None: + a = NativeIdAllowlist({"42"}, channel="telegram") + assert await a.evaluate(_ctx_pre_link(channel="telegram", native_id="42")) is AllowlistDecision.ALLOW + assert await a.evaluate(_ctx_pre_link(channel="telegram", native_id="99")) is AllowlistDecision.DENY + + async def test_async_loader_caches_after_first_call(self) -> None: + # The loader should run once; subsequent ``evaluate`` calls hit + # the cache so a slow / costly source isn't re-queried per + # message. + calls = {"n": 0} + + async def loader() -> Collection[str]: + calls["n"] += 1 + return {"42"} + + a = NativeIdAllowlist(loader) + assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW + assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW + assert calls["n"] == 1 + + +class TestLinkedClaimAllowlist: + """Claim allowlists abstain pre-link and decide once claims are available.""" + + def test_declares_requires_linked_claims(self) -> None: + a = LinkedClaimAllowlist("oid", ["abc"]) + assert a.requires_linked_claims is True + + async def test_pre_link_abstains(self) -> None: + a = LinkedClaimAllowlist("oid", ["abc"]) + assert await a.evaluate(_ctx_pre_link()) is AllowlistDecision.ABSTAIN + + async def test_post_link_allows_matching_claim(self) -> None: + a = LinkedClaimAllowlist("oid", ["abc"]) + assert await a.evaluate(_ctx_post_link({"oid": "abc"})) is AllowlistDecision.ALLOW + + async def test_post_link_allows_matching_multi_value_claim(self) -> None: + a = LinkedClaimAllowlist("groups", ["admins"]) + ctx = AuthorizationContext( + identity=ChannelIdentity(channel="telegram", native_id="42"), + phase="post_link", + isolation_key="alice", + verified_claims={"groups": ("users", "admins")}, + claim_source="linker", + ) + 
assert await a.evaluate(ctx) is AllowlistDecision.ALLOW + + async def test_post_link_denies_missing_or_nonmatching_claim(self) -> None: + a = LinkedClaimAllowlist("oid", ["abc"]) + assert await a.evaluate(_ctx_post_link({"oid": "def"})) is AllowlistDecision.DENY + assert await a.evaluate(_ctx_post_link({"tid": "abc"})) is AllowlistDecision.DENY + + +class TestAnyOfAllowlists: + async def test_any_allow_wins(self) -> None: + a = AnyOfAllowlists(NativeIdAllowlist({"42"}), NativeIdAllowlist({"99"})) + # native_id=42 → first ALLOWs, short-circuit. + assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW + + async def test_all_deny_yields_deny(self) -> None: + # Both lists deny native_id=7. + a = AnyOfAllowlists(NativeIdAllowlist({"42"}), NativeIdAllowlist({"99"})) + assert await a.evaluate(_ctx_pre_link(native_id="7")) is AllowlistDecision.DENY + + async def test_abstain_when_no_decision(self) -> None: + # Channel-scoped lists both ABSTAIN on a "slack" request. + a = AnyOfAllowlists( + NativeIdAllowlist({"42"}, channel="telegram"), + NativeIdAllowlist({"99"}, channel="teams"), + ) + assert await a.evaluate(_ctx_pre_link(channel="slack", native_id="42")) is AllowlistDecision.ABSTAIN + + async def test_empty_is_abstain(self) -> None: + # No children → ABSTAIN (not DENY) to avoid silent deny-all. 
+ a = AnyOfAllowlists() + assert await a.evaluate(_ctx_pre_link()) is AllowlistDecision.ABSTAIN + + def test_propagates_requires_linked_claims(self) -> None: + a = AnyOfAllowlists(NativeIdAllowlist({"42"}), LinkedClaimAllowlist("oid", [])) + assert a.requires_linked_claims is True + + +class TestAllOfAllowlists: + async def test_any_deny_short_circuits(self) -> None: + a = AllOfAllowlists(NativeIdAllowlist({"42"}), NativeIdAllowlist({"99"})) + assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.DENY + + async def test_all_allow_yields_allow(self) -> None: + a = AllOfAllowlists(NativeIdAllowlist({"42"}), NativeIdAllowlist({"42", "99"})) + assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW + + async def test_abstain_when_no_deny_but_no_unanimous_allow(self) -> None: + a = AllOfAllowlists( + NativeIdAllowlist({"42"}, channel="telegram"), + NativeIdAllowlist({"42"}, channel="teams"), + ) + # ABSTAIN from teams (different channel), ALLOW from telegram → ABSTAIN. 
+ assert await a.evaluate(_ctx_pre_link(channel="telegram", native_id="42")) is AllowlistDecision.ABSTAIN + + async def test_empty_is_abstain(self) -> None: + a = AllOfAllowlists() + assert await a.evaluate(_ctx_pre_link()) is AllowlistDecision.ABSTAIN + + +class TestCallableAllowlist: + async def test_wraps_async_fn(self) -> None: + async def fn(ctx: AuthorizationContext) -> AllowlistDecision: + if ctx.identity.native_id == "42": + return AllowlistDecision.ALLOW + return AllowlistDecision.DENY + + a = CallableAllowlist(fn) + assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW + assert await a.evaluate(_ctx_pre_link(native_id="99")) is AllowlistDecision.DENY + + def test_requires_linked_claims_passthrough(self) -> None: + async def fn(_: AuthorizationContext) -> AllowlistDecision: # pragma: no cover + return AllowlistDecision.ALLOW + + a = CallableAllowlist(fn, requires_linked_claims=True) + assert a.requires_linked_claims is True + + +class TestAuthPolicy: + async def test_factory_helpers_return_working_allowlists(self) -> None: + assert await AuthPolicy.open().evaluate(_ctx_pre_link()) is AllowlistDecision.ALLOW + assert await AuthPolicy.native_ids({"42"}).evaluate(_ctx_pre_link()) is AllowlistDecision.ALLOW + assert await AuthPolicy.linked_claim("oid", {"abc"}).evaluate(_ctx_post_link({"oid": "abc"})) is ( + AllowlistDecision.ALLOW + ) + + async def test_custom_factory(self) -> None: + async def fn(_: AuthorizationContext) -> AllowlistDecision: + return AllowlistDecision.ALLOW + + policy = AuthPolicy.custom(fn, requires_linked_claims=True) + assert policy.requires_linked_claims is True + assert await policy.evaluate(_ctx_pre_link()) is AllowlistDecision.ALLOW + + +# --------------------------------------------------------------------------- # +# Host configuration validator # +# --------------------------------------------------------------------------- # + + +class TestChannelAuthorizationValidator: + """The host's startup 
validator catches three classes of misconfig + so they fail at construction rather than silently denying every + user at runtime.""" + + def test_require_link_without_linker_raises(self) -> None: + # ``require_link=True`` with no linker would silently reject + # every request — caught at construction. + with pytest.raises(ChannelConfigurationError, match="identity_linker"): + AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(require_link=True)], + ) + + def test_require_link_with_linker_passes(self) -> None: + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(require_link=True)], + identity_linker=_StaticLinker(LinkedIdentity("alice", {"oid": "abc"})), + ) + assert host.runtime_mode == "long_running" + + def test_linked_claim_allowlist_without_claim_source_raises(self) -> None: + # The channel has no ``require_link=True`` AND doesn't emit + # claims natively → the allowlist would always DENY / ABSTAIN. + with pytest.raises(ChannelConfigurationError, match="verified IdP claims"): + AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(allowlist=LinkedClaimAllowlist("oid", []))], + ) + + def test_linked_claim_allowlist_with_native_claim_source_passes(self) -> None: + # When the channel declares ``emits_verified_claims=True`` + # (e.g. Activity Protocol with AAD bearer) the validator + # accepts the LinkedClaimAllowlist without needing a linker. 
+ host = AgentFrameworkHost( + target=_AgentStub(), + channels=[ + _ChannelStub( + allowlist=LinkedClaimAllowlist("oid", ["abc"]), + emits_verified_claims=True, + ) + ], + ) + assert host.default_allowlist is None + + def test_linked_claim_allowlist_with_require_link_and_linker_passes(self) -> None: + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(require_link=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], + identity_linker=_StaticLinker(LinkedIdentity("alice", {"oid": "abc"})), + ) + assert host.runtime_mode == "long_running" + + def test_native_id_allowlist_unknown_channel_raises(self) -> None: + with pytest.raises(ChannelConfigurationError, match="unknown channel 'mystery'"): + AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(allowlist=NativeIdAllowlist({"42"}, channel="mystery"))], + ) + + def test_native_id_allowlist_known_channel_passes(self) -> None: + # A channel-scoped native list pointing at a peer channel is + # the supported way to compose per-channel allowlists. + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[ + _ChannelStub(name="telegram", allowlist=NativeIdAllowlist({"42"}, channel="telegram")), + _ChannelStub(name="slack"), + ], + ) + assert host.runtime_mode == "long_running" + + def test_default_allowlist_applies_to_inheriting_channel(self) -> None: + # ``allowlist="inherit"`` (the default) picks up the host-level + # ``default_allowlist``. This is the "lock down a whole bot in + # one place" ergonomic. + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(name="telegram")], + default_allowlist=NativeIdAllowlist({"42"}), + ) + # The default flowed through; channel sees the host's allowlist. + assert host.default_allowlist is not None + + def test_explicit_none_carve_out_overrides_default(self) -> None: + # ``allowlist=None`` on a channel explicitly opts out of the + # host default — useful for a public endpoint inside an + # otherwise locked-down host. 
+ host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(name="public", allowlist=None)], + default_allowlist=NativeIdAllowlist({"42"}), + ) + # Construction succeeded; the validator did not raise. + assert host.default_allowlist is not None + + def test_combinator_with_unknown_nested_channel_raises(self) -> None: + # The validator walks ``AnyOfAllowlists`` / ``AllOfAllowlists`` + # so a typo'd channel name nested under a combinator is still + # caught at construction. + with pytest.raises(ChannelConfigurationError, match="unknown channel 'typo'"): + AgentFrameworkHost( + target=_AgentStub(), + channels=[ + _ChannelStub( + allowlist=AnyOfAllowlists( + NativeIdAllowlist({"42"}, channel="stub"), + NativeIdAllowlist({"99"}, channel="typo"), + ) + ) + ], + ) + + +# --------------------------------------------------------------------------- # +# host.authorize pipeline # +# --------------------------------------------------------------------------- # + + +class TestHostAuthorize: + """Host authorization pipeline across open, native-id, and linked-claim profiles.""" + + def _host(self) -> AgentFrameworkHost: + return AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()]) + + async def test_open_profile_returns_allowed_with_auto_isolation_key(self) -> None: + host = self._host() + outcome = await host.authorize(ChannelIdentity(channel="telegram", native_id="42")) + assert isinstance(outcome, Allowed) + assert outcome.isolation_key == "telegram:42" + + async def test_native_allowlist_allows_listed_id(self) -> None: + host = self._host() + outcome = await host.authorize( + ChannelIdentity(channel="telegram", native_id="42"), + allowlist=NativeIdAllowlist({"42"}), + ) + assert isinstance(outcome, Allowed) + assert outcome.isolation_key == "telegram:42" + + async def test_native_allowlist_denies_unlisted_id(self) -> None: + host = self._host() + outcome = await host.authorize( + ChannelIdentity(channel="telegram", native_id="99"), + 
allowlist=NativeIdAllowlist({"42"}), + ) + assert isinstance(outcome, Denied) + assert outcome.reason_code == "allowlist_denied_pre_link" + assert outcome.user_message is not None + # The bland default leaks neither tenant nor list size. + assert "telegram" not in (outcome.user_message or "") + + async def test_abstain_with_claim_requirement_yields_link_required_message(self) -> None: + # Without a linker and without channel-emitted claims, a claim-required + # allowlist cannot make progress and the host returns a safe denial. + async def abstain(_: AuthorizationContext) -> AllowlistDecision: + return AllowlistDecision.ABSTAIN + + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(emits_verified_claims=True)], + ) + outcome = await host.authorize( + ChannelIdentity(channel="telegram", native_id="42"), + allowlist=CallableAllowlist(abstain, requires_linked_claims=True), + ) + assert isinstance(outcome, Denied) + assert outcome.reason_code == "allowlist_requires_link" + + async def test_abstain_without_claim_requirement_falls_through_to_allowed(self) -> None: + async def abstain(_: AuthorizationContext) -> AllowlistDecision: + return AllowlistDecision.ABSTAIN + + host = self._host() + outcome = await host.authorize( + ChannelIdentity(channel="telegram", native_id="42"), + allowlist=CallableAllowlist(abstain), + ) + assert isinstance(outcome, Allowed) + + async def test_auto_issue_returns_existing_key_when_known(self) -> None: + # When an identity has already been observed, the auto-issued + # key matches the existing one rather than coining a fresh + # token. This is the linker-free equivalent of identity resolution. 
+ host = self._host() + host._identities["alice"] = {"telegram": ChannelIdentity(channel="telegram", native_id="42")} + outcome = await host.authorize(ChannelIdentity(channel="telegram", native_id="42")) + assert isinstance(outcome, Allowed) + assert outcome.isolation_key == "alice" + + async def test_verified_claims_propagate_to_context(self) -> None: + # Channels that natively carry verified claims (e.g. Activity + # Protocol bearer with AAD oid) pass them through to + # ``authorize`` — the allowlist sees them on the + # ``AuthorizationContext``. + seen: list[AuthorizationContext] = [] + + async def capture(ctx: AuthorizationContext) -> AllowlistDecision: + seen.append(ctx) + return AllowlistDecision.ALLOW + + host = self._host() + await host.authorize( + ChannelIdentity(channel="telegram", native_id="42"), + allowlist=CallableAllowlist(capture), + verified_claims={"oid": "abc"}, + ) + assert len(seen) == 1 + assert seen[0].claim_source == "channel" + assert dict(seen[0].verified_claims) == {"oid": "abc"} + + async def test_require_link_returns_challenge_when_unlinked(self) -> None: + challenge = LinkChallenge("c1", url="https://login.example/c1") + linker = _StaticLinker(challenge) + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(require_link=True)], + identity_linker=linker, + ) + outcome = await host.authorize( + ChannelIdentity(channel="telegram", native_id="42"), + require_link=True, + ) + assert isinstance(outcome, LinkRequired) + assert outcome.challenge is challenge + assert [call.native_id for call in linker.calls] == ["42"] + + async def test_require_link_returns_linked_identity_when_resolved(self) -> None: + linked = LinkedIdentity("entra:abc", {"oid": "abc"}) + linker = _StaticLinker(linked) + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(require_link=True)], + identity_linker=linker, + ) + outcome = await host.authorize( + ChannelIdentity(channel="telegram", native_id="42"), + 
require_link=True, + ) + assert isinstance(outcome, Allowed) + assert outcome.isolation_key == "entra:abc" + assert dict(outcome.verified_claims) == {"oid": "abc"} + assert outcome.claim_source == "linker" + # authorize() is decision-only; identity registry writes remain on + # the request execution path. + assert host._identities == {} + + async def test_linked_claim_allowlist_with_linker_allows_matching_claim(self) -> None: + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(require_link=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], + identity_linker=_StaticLinker(LinkedIdentity("entra:abc", {"oid": "abc"})), + ) + outcome = await host.authorize( + ChannelIdentity(channel="telegram", native_id="42"), + require_link=True, + allowlist=LinkedClaimAllowlist("oid", ["abc"]), + ) + assert isinstance(outcome, Allowed) + assert outcome.isolation_key == "entra:abc" + + async def test_linked_claim_allowlist_with_linker_denies_nonmatching_claim(self) -> None: + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(require_link=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], + identity_linker=_StaticLinker(LinkedIdentity("entra:def", {"oid": "def"})), + ) + outcome = await host.authorize( + ChannelIdentity(channel="telegram", native_id="42"), + require_link=True, + allowlist=LinkedClaimAllowlist("oid", ["abc"]), + ) + assert isinstance(outcome, Denied) + assert outcome.reason_code == "allowlist_denied_post_link" + + async def test_linked_claim_allowlist_with_linker_returns_challenge_when_unlinked(self) -> None: + challenge = LinkChallenge("c1") + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(require_link=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], + identity_linker=_StaticLinker(challenge), + ) + outcome = await host.authorize( + ChannelIdentity(channel="telegram", native_id="42"), + require_link=True, + allowlist=LinkedClaimAllowlist("oid", ["abc"]), + ) + assert 
isinstance(outcome, LinkRequired) + assert outcome.challenge is challenge + + async def test_linked_claim_allowlist_uses_channel_verified_claims_without_linker(self) -> None: + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub(emits_verified_claims=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], + ) + outcome = await host.authorize( + ChannelIdentity(channel="activity", native_id="aad-user"), + allowlist=LinkedClaimAllowlist("oid", ["abc"]), + verified_claims={"oid": "abc"}, + ) + assert isinstance(outcome, Allowed) + assert outcome.isolation_key == "activity:aad-user" + assert outcome.claim_source == "channel" diff --git a/python/packages/hosting/tests/test_host.py b/python/packages/hosting/tests/test_host.py new file mode 100644 index 00000000000..283992f597f --- /dev/null +++ b/python/packages/hosting/tests/test_host.py @@ -0,0 +1,1846 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Tests for :class:`AgentFrameworkHost` invocation, session, and delivery routing.""" + +from __future__ import annotations + +from collections.abc import AsyncIterator, Awaitable, Callable, Mapping, Sequence +from dataclasses import dataclass, field +from typing import Any + +import pytest +from agent_framework import AgentResponse, AgentResponseUpdate, Content, Message, ResponseStream +from starlette.requests import Request +from starlette.responses import JSONResponse +from starlette.routing import BaseRoute, Route +from starlette.testclient import TestClient + +from agent_framework_hosting import ( + AgentFrameworkHost, + Channel, + ChannelContext, + ChannelContribution, + ChannelIdentity, + ChannelPush, + ChannelRequest, + ChannelSession, + DurableTaskPayloadMode, + DurableTaskRunner, + HostedRunResult, + ResponseTarget, + RetryPolicy, + TaskHandle, + TaskStatus, +) + + +async def _ping(_request: Request) -> JSONResponse: + return JSONResponse({"ok": True}) + + +# --------------------------------------------------------------------------- 
# +# Fakes # +# --------------------------------------------------------------------------- # + + +@dataclass +class _FakeAgentSession: + session_id: str | None = None + service_session_id: str | None = None + + +@dataclass +class _FakeAgentResponse: + text: str + + @property + def messages(self) -> list[Message]: + # Real ``AgentResponse`` carries a list of messages; the host's + # ``_invoke`` forwards them on the ``HostedRunResult``. Synthesise + # a single assistant text message so tests that assert on + # ``payload.text`` keep working unchanged. + return [Message(role="assistant", contents=[Content.from_text(text=self.text)])] + + +class _FakeAgent: + """Minimal :class:`SupportsAgentRun` implementation that records invocations.""" + + def __init__(self, reply: str = "ok") -> None: + self._reply = reply + self.calls: list[dict[str, Any]] = [] + self.created_sessions: list[_FakeAgentSession] = [] + + def create_session(self, *, session_id: str | None = None) -> _FakeAgentSession: + s = _FakeAgentSession(session_id=session_id) + self.created_sessions.append(s) + return s + + async def run(self, messages: Any = None, *, stream: bool = False, session: Any = None, **kwargs: Any) -> Any: + self.calls.append({"messages": messages, "stream": stream, "session": session, "kwargs": kwargs}) + if stream: # pragma: no cover - not used by these tests + + async def _gen() -> AsyncIterator[Any]: + yield self._reply + + return _gen() + return _FakeAgentResponse(text=self._reply) + + +class _RecordingChannel: + """Minimal :class:`Channel` + :class:`ChannelPush` for routing tests.""" + + def __init__(self, name: str = "fake", path: str = "/fake", supports_push: bool = True) -> None: + self.name = name + self.path = path + self.context: ChannelContext | None = None + self.pushes: list[tuple[ChannelIdentity, HostedRunResult[Any]]] = [] + self._push_raises: Exception | None = None + self._supports_push = supports_push + # Provide a single trivial route so contribute() exercises the 
mount path. + self._routes: Sequence[BaseRoute] = (Route("/ping", _ping),) + + def contribute(self, context: ChannelContext) -> ChannelContribution: + self.context = context + return ChannelContribution(routes=self._routes) + + async def push(self, identity: ChannelIdentity, payload: HostedRunResult[Any]) -> None: + if self._push_raises is not None: + raise self._push_raises + self.pushes.append((identity, payload)) + + +class _NoPushChannel: + """A channel that does NOT implement :class:`ChannelPush`.""" + + def __init__(self, name: str = "nopush", path: str = "/nopush") -> None: + self.name = name + self.path = path + + def contribute(self, context: ChannelContext) -> ChannelContribution: + return ChannelContribution() + + +class _SyncTaskRunner(DurableTaskRunner): + """A :class:`DurableTaskRunner` that runs handlers inline. + + Tests of the delivery routing want deterministic, synchronous + behaviour. The real :class:`InProcessTaskRunner` schedules via + ``asyncio.create_task`` so push side effects only land *after* + the test has yielded control — awkward for assertions that read + a channel's recorded pushes immediately after + :meth:`ChannelContext.deliver_response` returns. + + Two knobs control failure handling: + + - ``schedule_raises``: when set, every call to :meth:`schedule` + raises this exception. Mimics a host-side outage (the durable + backend is unreachable). + - ``swallow_handler_errors`` (default ``True``): when the + handler raises, the error is recorded in + :attr:`handler_errors` but :meth:`schedule` still returns + successfully — matching the real durable contract that + "scheduled" is a separate signal from "delivered". Set to + ``False`` to surface handler exceptions through + :meth:`schedule` for the few tests that want to assert on + handler-raised failures inline. 
+ """ + + def __init__(self, *, swallow_handler_errors: bool = True) -> None: + self._handlers: dict[str, Callable[[Mapping[str, Any]], Awaitable[None]]] = {} + self.scheduled: list[tuple[str, Mapping[str, Any]]] = [] + self.handler_errors: list[BaseException] = [] + self.schedule_raises: BaseException | None = None + self.swallow_handler_errors = swallow_handler_errors + + # Default object-mode matches the real ``InProcessTaskRunner`` — + # tests that want to exercise the JSON-mode path override this on + # the instance. + payload_mode = DurableTaskPayloadMode.OBJECT + + def register( + self, + name: str, + handler: Callable[[Mapping[str, Any]], Awaitable[None]], + ) -> None: + self._handlers[name] = handler + + async def schedule( + self, + name: str, + payload: Mapping[str, Any], + *, + retry_policy: RetryPolicy | None = None, + ) -> TaskHandle: + if self.schedule_raises is not None: + raise self.schedule_raises + self.scheduled.append((name, payload)) + try: + await self._handlers[name](payload) + except Exception as exc: + self.handler_errors.append(exc) + if not self.swallow_handler_errors: + raise + return TaskHandle(task_id=f"sync-{len(self.scheduled)}", name=name) + + async def get(self, handle: TaskHandle) -> TaskStatus | None: # pragma: no cover - unused + return "succeeded" + + +def _assistant_response(text: str) -> AgentResponse: + """Build a one-message ``AgentResponse`` to use as a ``HostedRunResult.result``.""" + return AgentResponse(messages=[Message(role="assistant", contents=[Content.from_text(text=text)])]) + + +def _make_reply(text: str = "reply") -> HostedRunResult[AgentResponse]: + """Build a ``HostedRunResult[AgentResponse]`` carrying a single assistant text message. + + Test ergonomic mirroring what the host's ``_invoke`` produces for an + agent target — channels (and our delivery tests) receive a typed + envelope whose ``result`` is a real :class:`AgentResponse`. 
+ """ + return HostedRunResult(_assistant_response(text)) + + +@dataclass +class _LifecycleChannel: + name: str = "lifecycle" + path: str = "" + started: list[str] = field(default_factory=list) + stopped: list[str] = field(default_factory=list) + + def contribute(self, context: ChannelContext) -> ChannelContribution: + async def on_start() -> None: + self.started.append("up") + + async def on_stop() -> None: + self.stopped.append("down") + + return ChannelContribution(on_startup=[on_start], on_shutdown=[on_stop]) + + +# --------------------------------------------------------------------------- # +# Host wiring # +# --------------------------------------------------------------------------- # + + +class TestHostWiring: + def test_channel_is_recognized(self) -> None: + ch = _RecordingChannel() + assert isinstance(ch, Channel) + assert isinstance(ch, ChannelPush) + + def test_app_mounts_channel_routes_under_path(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel(path="/fake") + host = AgentFrameworkHost(target=agent, channels=[ch]) + + with TestClient(host.app) as client: + r = client.get("/fake/ping") + assert r.status_code == 200 + assert r.json() == {"ok": True} + + def test_app_mounts_at_root_when_path_is_empty(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel(path="") + host = AgentFrameworkHost(target=agent, channels=[ch]) + + with TestClient(host.app) as client: + r = client.get("/ping") + assert r.status_code == 200 + + def test_app_is_cached(self) -> None: + host = AgentFrameworkHost(target=_FakeAgent(), channels=[_RecordingChannel()]) + assert host.app is host.app + + def test_lifespan_invokes_startup_and_shutdown(self) -> None: + agent = _FakeAgent() + ch = _LifecycleChannel() + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app): + assert ch.started == ["up"] + assert ch.stopped == ["down"] + + def test_app_exposes_readiness_probe(self) -> None: + host = AgentFrameworkHost(target=_FakeAgent(), 
channels=[_RecordingChannel()]) + with TestClient(host.app) as client: + r = client.get("/readiness") + assert r.status_code == 200 + assert r.text == "ok" + + +# --------------------------------------------------------------------------- # +# Invoke + sessions # +# --------------------------------------------------------------------------- # + + +class TestHostInvoke: + async def test_invoke_wraps_input_with_hosting_metadata(self) -> None: + agent = _FakeAgent(reply="hello") + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + # Force ``app`` build to trigger ``contribute``. + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="responses", + operation="message.create", + input="hi", + session=ChannelSession(isolation_key="user:1"), + identity=ChannelIdentity(channel="responses", native_id="user:1"), + ) + result = await ch.context.run(req) + + assert result.result.text == "hello" + assert len(agent.calls) == 1 + msg = agent.calls[0]["messages"] + assert msg.role == "user" + assert msg.additional_properties["hosting"]["channel"] == "responses" + assert msg.additional_properties["hosting"]["identity"] == { + "channel": "responses", + "native_id": "user:1", + "attributes": {}, + } + assert msg.additional_properties["hosting"]["response_target"] == { + "kind": "originating", + "targets": [], + } + + async def test_invoke_caches_session_per_isolation_key(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req_a = ChannelRequest( + channel=ch.name, operation="op", input="1", session=ChannelSession(isolation_key="alice") + ) + req_b = ChannelRequest( + channel=ch.name, operation="op", input="2", session=ChannelSession(isolation_key="alice") + ) + req_c = ChannelRequest(channel=ch.name, operation="op", input="3", session=ChannelSession(isolation_key="bob")) + + await 
ch.context.run(req_a) + await ch.context.run(req_b) + await ch.context.run(req_c) + + # Two distinct sessions created (alice, bob) — never re-created. + assert len(agent.created_sessions) == 2 + assert agent.calls[0]["session"] is agent.calls[1]["session"] + assert agent.calls[0]["session"] is not agent.calls[2]["session"] + + async def test_session_disabled_does_not_create_session(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel=ch.name, + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + session_mode="disabled", + ) + await ch.context.run(req) + assert agent.created_sessions == [] + assert agent.calls[0]["session"] is None + + async def test_reset_session_rotates_id_and_drops_cache(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest(channel=ch.name, operation="op", input="x", session=ChannelSession(isolation_key="alice")) + await ch.context.run(req) + first_session = agent.calls[-1]["session"] + assert first_session.session_id == "alice" + + host.reset_session("alice") + await ch.context.run(req) + second_session = agent.calls[-1]["session"] + # New session, new id (alias rotation), distinct object. 
+ assert second_session is not first_session + assert second_session.session_id != "alice" + assert second_session.session_id.startswith("alice#") + + async def test_options_propagates_to_target_run(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel=ch.name, + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + options={"temperature": 0.4}, + ) + await ch.context.run(req) + assert agent.calls[0]["kwargs"]["options"] == {"temperature": 0.4} + + +# --------------------------------------------------------------------------- # +# Workflow target # +# --------------------------------------------------------------------------- # + + +class TestHostWorkflowTarget: + """The host accepts a ``Workflow`` and dispatches to ``workflow.run(...)``.""" + + async def test_invoke_workflow_collapses_outputs_to_hosted_run_result(self) -> None: + from tests._workflow_fixtures import build_upper_workflow + + workflow = build_upper_workflow() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=workflow, channels=[ch]) + _ = host.app + assert ch.context is not None + + # The channel's run_hook is the canonical adapter from a free-form input + # to a workflow's typed input; here the start executor accepts ``str`` + # already so the channel forwards ``input`` verbatim. + req = ChannelRequest(channel="fake", operation="message.create", input="hello") + result = await ch.context.run(req) + + assert list(result.result.get_outputs()) == ["HELLO"] + # No session caching for workflow targets — Workflow has no + # ``create_session`` and the host must not invent one. 
+ assert host._sessions == {} + + async def test_stream_workflow_yields_updates_and_finalizes(self) -> None: + from tests._workflow_fixtures import build_echo_workflow + + workflow = build_echo_workflow() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=workflow, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest(channel="fake", operation="message.create", input="hi") + stream = ch.context.run_stream(req) + + updates: list[AgentResponseUpdate] = [] + async for update in stream: + updates.append(update) + + # The echo workflow yields a single ``output`` event whose payload is + # the original string; the host wraps non-update payloads into a + # one-shot ``AgentResponseUpdate`` carrying the text. + assert [u.text for u in updates] == ["hi"] + # ``raw_representation`` preserves the source ``WorkflowEvent`` so + # advanced consumers (telemetry, debug UIs) can recover the full + # workflow timeline. + assert all(u.raw_representation is not None for u in updates) + + final = await stream.get_final_response() + assert final.text == "hi" + + async def test_stream_workflow_yields_one_update_per_output_event(self) -> None: + from tests._workflow_fixtures import build_multi_chunk_workflow + + workflow = build_multi_chunk_workflow() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=workflow, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest(channel="fake", operation="message.create", input="x") + stream = ch.context.run_stream(req) + + chunks: list[str] = [] + async for update in stream: + chunks.append(update.text) + # The originating ``executor_id`` is propagated via author_name so + # multi-agent workflows can route per-author rendering downstream. 
+ assert update.author_name == "multi" + + assert chunks == ["x-1", "x-2", "x-3"] + final = await stream.get_final_response() + assert final.text == "x-1x-2x-3" + + +class TestHostWorkflowCheckpointing: + """The host scopes per-conversation checkpoints when ``checkpoint_location`` is set.""" + + def test_rejects_workflow_with_existing_checkpoint_storage(self, tmp_path: Any) -> None: + from agent_framework import InMemoryCheckpointStorage, WorkflowBuilder + + from tests._workflow_fixtures import _UpperExecutor + + workflow = WorkflowBuilder( + start_executor=_UpperExecutor(id="upper"), + checkpoint_storage=InMemoryCheckpointStorage(), + ).build() + with pytest.raises(RuntimeError, match="already has checkpoint storage"): + AgentFrameworkHost( + target=workflow, + channels=[_RecordingChannel()], + checkpoint_location=tmp_path, + ) + + def test_warns_when_target_is_agent(self, tmp_path: Any, caplog: Any) -> None: + import logging as _logging + + agent = _FakeAgent() + with caplog.at_level(_logging.WARNING, logger="agent_framework.hosting"): + host = AgentFrameworkHost(target=agent, channels=[_RecordingChannel()], checkpoint_location=tmp_path) + assert host._checkpoint_location is None + assert any("checkpoint_location" in rec.message for rec in caplog.records) + + async def test_invoke_skips_checkpointing_when_no_isolation_key(self, tmp_path: Any) -> None: + from tests._workflow_fixtures import build_upper_workflow + + workflow = build_upper_workflow() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=workflow, channels=[ch], checkpoint_location=tmp_path) + _ = host.app + assert ch.context is not None + + # No session -> no scoping key -> no checkpoint storage written. 
+ req = ChannelRequest(channel="fake", operation="message.create", input="hi") + result = await ch.context.run(req) + + assert list(result.result.get_outputs()) == ["HI"] + assert list(tmp_path.iterdir()) == [] + + async def test_invoke_writes_checkpoint_under_isolation_key(self, tmp_path: Any) -> None: + from tests._workflow_fixtures import build_upper_workflow + + workflow = build_upper_workflow() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=workflow, channels=[ch], checkpoint_location=tmp_path) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="fake", + operation="message.create", + input="hi", + session=ChannelSession(isolation_key="alice"), + ) + result = await ch.context.run(req) + assert list(result.result.get_outputs()) == ["HI"] + + # FileCheckpointStorage rooted at ``tmp_path / "alice"`` should + # have produced at least one checkpoint file scoped to that user. + scoped = tmp_path / "alice" + assert scoped.exists() + assert any(scoped.iterdir()), "expected at least one checkpoint to be written under the per-user dir" + + async def test_stream_writes_checkpoint_under_isolation_key(self, tmp_path: Any) -> None: + from tests._workflow_fixtures import build_echo_workflow + + workflow = build_echo_workflow() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=workflow, channels=[ch], checkpoint_location=tmp_path) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="fake", + operation="message.create", + input="hi", + session=ChannelSession(isolation_key="bob"), + ) + stream = ch.context.run_stream(req) + async for _ in stream: + pass + await stream.get_final_response() + + scoped = tmp_path / "bob" + assert scoped.exists() + assert any(scoped.iterdir()) + + async def test_caller_supplied_checkpoint_storage_used_as_is(self, tmp_path: Any) -> None: + from agent_framework import InMemoryCheckpointStorage + + from tests._workflow_fixtures import build_upper_workflow + + storage =
InMemoryCheckpointStorage() + workflow = build_upper_workflow() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=workflow, channels=[ch], checkpoint_location=storage) + _ = host.app + assert ch.context is not None + assert host._checkpoint_location is storage + + req = ChannelRequest( + channel="fake", + operation="message.create", + input="hi", + session=ChannelSession(isolation_key="carol"), + ) + await ch.context.run(req) + + # The caller-owned storage is used directly (no per-user scoping + # applied by the host); a checkpoint should appear in it. + checkpoints = await storage.list_checkpoints(workflow_name=workflow.name) + assert checkpoints, "expected the caller-supplied storage to receive a checkpoint" + # And nothing should have been written into the tmp_path tree. + assert list(tmp_path.iterdir()) == [] + + +class TestCheckpointPathForIsolationKey: + """Path-traversal hardening for isolation keys joined into checkpoint paths.""" + + @pytest.mark.parametrize( + "isolation_key", + [ + "alice", + "telegram:42", + "entra:abc-def_0123", + "responses:user.name", + "x" * 200, + ], + ) + def test_accepts_legitimate_keys(self, tmp_path: Any, isolation_key: str) -> None: + from agent_framework_hosting._host import _checkpoint_path_for_isolation_key + + target = _checkpoint_path_for_isolation_key(tmp_path, isolation_key) + assert target == (tmp_path / isolation_key).resolve() + assert target.is_relative_to(tmp_path.resolve()) + + @pytest.mark.parametrize( + "isolation_key", + [ + "", + ".", + "..", + "...", + "../etc", + "../../etc/passwd", + "a/b", + "a\\b", + "with\x00nul", + "/abs/path", + "C:/foo", + "C:foo", + ], + ) + def test_rejects_traversal_patterns(self, tmp_path: Any, isolation_key: str) -> None: + from agent_framework_hosting._host import _checkpoint_path_for_isolation_key + + with pytest.raises(ValueError, match="isolation_key"): + _checkpoint_path_for_isolation_key(tmp_path, isolation_key) + + def test_rejects_non_string(self, tmp_path: 
Any) -> None: + from agent_framework_hosting._host import _checkpoint_path_for_isolation_key + + with pytest.raises(ValueError, match="non-empty string"): + _checkpoint_path_for_isolation_key(tmp_path, None) # type: ignore[arg-type] + + +class TestHostWorkflowCheckpointingPathTraversal: + """End-to-end: malicious isolation keys must not escape ``checkpoint_location``.""" + + async def test_traversal_key_skips_checkpointing_with_warning(self, tmp_path: Any, caplog: Any) -> None: + import logging as _logging + + from tests._workflow_fixtures import build_upper_workflow + + workflow = build_upper_workflow() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=workflow, channels=[ch], checkpoint_location=tmp_path) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="fake", + operation="message.create", + input="hi", + session=ChannelSession(isolation_key="../escape"), + ) + with caplog.at_level(_logging.WARNING, logger="agent_framework.hosting"): + result = await ch.context.run(req) + + assert list(result.result.get_outputs()) == ["HI"] + # Nothing should have been written under tmp_path. + assert list(tmp_path.iterdir()) == [] + assert any( + "Skipping checkpoint storage" in rec.message and "isolation_key" in rec.message for rec in caplog.records + ) + + async def test_separator_in_key_skips_checkpointing(self, tmp_path: Any) -> None: + from tests._workflow_fixtures import build_upper_workflow + + workflow = build_upper_workflow() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=workflow, channels=[ch], checkpoint_location=tmp_path) + _ = host.app + assert ch.context is not None + + # A literal separator in the key is a configuration smell at best + # and an attack at worst; either way it must not create a sub-path. 
+ req = ChannelRequest( + channel="fake", + operation="message.create", + input="hi", + session=ChannelSession(isolation_key="evil/sub"), + ) + result = await ch.context.run(req) + + assert list(result.result.get_outputs()) == ["HI"] + assert list(tmp_path.iterdir()) == [] + + +# --------------------------------------------------------------------------- # +# Delivery routing # +# --------------------------------------------------------------------------- # + + +def _make_host_with_two_channels( + *, + runner: DurableTaskRunner | None = None, +) -> tuple[AgentFrameworkHost, _RecordingChannel, _RecordingChannel, ChannelContext, _SyncTaskRunner]: + agent = _FakeAgent() + a = _RecordingChannel(name="responses", path="/r") + b = _RecordingChannel(name="telegram", path="/t") + sync_runner = runner if isinstance(runner, _SyncTaskRunner) else _SyncTaskRunner() + host = AgentFrameworkHost( + target=agent, + channels=[a, b], + durable_task_runner=runner or sync_runner, + ) + _ = host.app + assert a.context is not None + return host, a, b, a.context, sync_runner + + +def _record_identity_on(host: AgentFrameworkHost, isolation_key: str, channel: str, native_id: str) -> None: + """Pre-seed the host's identity registry directly, without running a request.""" + host._identities.setdefault(isolation_key, {})[channel] = ChannelIdentity(channel=channel, native_id=native_id) + host._active[isolation_key] = channel + + +class TestDeliverResponse: + """Delivery routing — the originating channel learns whether to render + on its own wire from the ``bool`` return; everything else + (scheduled tasks, schedule-time failures, skip reasons) lives in + the runner's own log.
Tests assert the bool plus observable + state on the sync runner fake (``scheduled``, ``handler_errors``) + and on the destination channels (``pushes``).""" + + async def test_originating_returns_true(self) -> None: + _, _, _, ctx, runner = _make_host_with_two_channels() + req = ChannelRequest(channel="responses", operation="op", input="x") + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is True + assert runner.scheduled == [] + + async def test_none_suppresses_everything(self) -> None: + _, _, _, ctx, runner = _make_host_with_two_channels() + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + response_target=ResponseTarget.none, # type: ignore[attr-defined] + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is False + assert runner.scheduled == [] + + async def test_active_pushes_to_other_channel(self) -> None: + host, _a, b, ctx, runner = _make_host_with_two_channels() + # Alice was last seen on telegram. + _record_identity_on(host, "alice", "telegram", "42") + # Now she sends a message via responses; ResponseTarget.active should + # push to telegram, not back to responses. 
+ req = ChannelRequest( + channel="responses", + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.active, # type: ignore[attr-defined] + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is False + assert len(runner.scheduled) == 1 + assert b.pushes and b.pushes[0][0].native_id == "42" + + async def test_active_falls_back_to_originating_when_self(self) -> None: + host, _a, _b, ctx, runner = _make_host_with_two_channels() + _record_identity_on(host, "alice", "responses", "user:1") + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.active, # type: ignore[attr-defined] + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is True + assert runner.scheduled == [] + + async def test_channels_with_unknown_identity_falls_back_to_originating(self) -> None: + _, _, _, ctx, runner = _make_host_with_two_channels() + # No prior identity seeded for telegram on alice. + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channel("telegram"), + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + # Skipped at resolution → fallback to originating so the user + # still gets a reply. 
+ assert include_originating is True + assert runner.scheduled == [] + + async def test_channels_with_explicit_native_id_token(self) -> None: + _, _, b, ctx, runner = _make_host_with_two_channels() + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + response_target=ResponseTarget.channel("telegram:99"), + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is False + assert len(runner.scheduled) == 1 + assert b.pushes[0][0].native_id == "99" + + async def test_channels_originating_pseudo_includes_origin(self) -> None: + host, _a, _b, ctx, runner = _make_host_with_two_channels() + _record_identity_on(host, "alice", "telegram", "42") + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channels(["originating", "telegram"]), + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is True + assert len(runner.scheduled) == 1 + + async def test_channels_unknown_channel_name_falls_back(self) -> None: + _, _, _, ctx, runner = _make_host_with_two_channels() + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + response_target=ResponseTarget.channel("nope"), + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is True # fallback + assert runner.scheduled == [] + + async def test_no_push_capability_falls_back(self) -> None: + agent = _FakeAgent() + a = _RecordingChannel(name="responses", path="/r") + b = _NoPushChannel(name="nopush", path="/n") + host = AgentFrameworkHost(target=agent, channels=[a, b]) + _ = host.app + assert a.context is not None + # Pre-seed identity on the no-push channel so we get past the + # identity check and hit the ChannelPush check. 
+ host._identities.setdefault("alice", {})["nopush"] = ChannelIdentity(channel="nopush", native_id="42") + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channel("nopush"), + ) + include_originating = await a.context.deliver_response(req, _make_reply("reply")) + assert include_originating is True # fallback + + async def test_all_linked_pushes_to_every_other_channel(self) -> None: + host, _a, b, ctx, runner = _make_host_with_two_channels() + # Alice on responses (originating) and telegram. + host._identities.setdefault("alice", {}) + host._identities["alice"]["responses"] = ChannelIdentity(channel="responses", native_id="user:1") + host._identities["alice"]["telegram"] = ChannelIdentity(channel="telegram", native_id="42") + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.all_linked, # type: ignore[attr-defined] + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is True + assert len(runner.scheduled) == 1 + assert b.pushes and b.pushes[0][1].result.text == "reply" + + async def test_all_linked_no_other_channels_falls_back(self) -> None: + _host, _a, _b, ctx, runner = _make_host_with_two_channels() + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.all_linked, # type: ignore[attr-defined] + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is True + assert runner.scheduled == [] + + async def test_identities_variant_preserves_attributes(self) -> None: + """``ResponseTarget.identities([...])`` plumbs full + :class:`ChannelIdentity` objects through resolution, preserving + ``attributes`` for destination channels that need 
conversation/ + thread metadata (Teams, Slack, Bot Framework).""" + _, _, b, ctx, runner = _make_host_with_two_channels() + ident = ChannelIdentity( + channel="telegram", + native_id="42", + attributes={"thread_id": "t1", "service_url": "https://x"}, + ) + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + response_target=ResponseTarget.identity(ident), + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is False + assert len(runner.scheduled) == 1 + # The destination identity arrived at push with attributes intact. + pushed_identity = b.pushes[0][0] + assert pushed_identity.native_id == "42" + assert dict(pushed_identity.attributes) == {"thread_id": "t1", "service_url": "https://x"} + + async def test_identities_pointing_to_originating_includes_origin(self) -> None: + """An identity whose channel matches the originating channel + folds into ``include_originating`` rather than double-delivering + via push.""" + _, _, _, ctx, runner = _make_host_with_two_channels() + ident = ChannelIdentity(channel="responses", native_id="user:1") + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + response_target=ResponseTarget.identities([ident]), + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is True + assert runner.scheduled == [] + + async def test_handler_exception_does_not_change_return_value(self) -> None: + """When ``ChannelPush.push`` raises *inside the runner handler* + the originating channel still sees the same return value — + ``DurableTaskRunner.schedule`` accepted the work, and downstream + delivery outcome is owned by the runner (it logs and retries + per the configured ``RetryPolicy``).""" + host, _a, b, ctx, runner = _make_host_with_two_channels() + b._push_raises = RuntimeError("boom") # type: ignore[attr-defined] + host._identities.setdefault("alice", {})["telegram"] = 
ChannelIdentity(channel="telegram", native_id="42") + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channel("telegram"), + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + # Schedule succeeded → the return value is unaffected by a + # downstream handler failure. + assert include_originating is False + assert len(runner.scheduled) == 1 + # Handler raised — runner captured the error (the real runner + # would retry it; the sync fake records it). + assert runner.handler_errors and isinstance(runner.handler_errors[0], RuntimeError) + assert str(runner.handler_errors[0]) == "boom" + + async def test_schedule_exception_falls_back_to_originating(self) -> None: + """When :meth:`DurableTaskRunner.schedule` itself raises (the + runner backend is unreachable) the destination is treated as + skipped — same outcome as any other resolution-time drop. The + host's fall-back-to-originating rule then ensures the user + still gets a reply rather than being left without one.""" + host, _a, _b, ctx, runner = _make_host_with_two_channels() + runner.schedule_raises = RuntimeError("runner backend unreachable") + host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42") + req = ChannelRequest( + channel="responses", + operation="op", + input="x", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channel("telegram"), + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + # Schedule raised → no scheduled tasks, fall back to originating. + assert runner.scheduled == [] + assert include_originating is True + + async def test_echo_input_pushes_user_message_then_response(self) -> None: + """``echo_input=True`` triggers two pushes per destination, + bundled into the same scheduled task: the originating user + message first, then the agent reply. 
Channels downstream of a + workflow that emits to multiple channels need this to keep + their UI state coherent with the user's actual prompt.""" + host, _a, b, ctx, runner = _make_host_with_two_channels() + host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42") + req = ChannelRequest( + channel="responses", + operation="op", + input="hello there", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channel("telegram", echo_input=True), + ) + include_originating = await ctx.deliver_response(req, _make_reply("reply")) + assert include_originating is False + # One scheduled task per destination; the handler does echo then response inline. + assert len(runner.scheduled) == 1 + _, payload = runner.scheduled[0] + assert payload["echo_result"] is not None + # Two pushes landed on the channel: echo first, then response. + assert len(b.pushes) == 2 + echo_identity, echo_payload = b.pushes[0] + assert echo_identity.native_id == "42" + assert echo_payload.result.text == "hello there" + assert str(echo_payload.result.messages[0].role) == "user" + resp_identity, resp_payload = b.pushes[1] + assert resp_identity.native_id == "42" + assert resp_payload.result.text == "reply" + assert str(resp_payload.result.messages[0].role) == "assistant" + + async def test_echo_input_failure_does_not_block_response(self) -> None: + """An echo push that raises inside the handler is logged and + swallowed; the response push must still be attempted on the + same destination so the user-visible failure mode is + "response delivered without echo" rather than "no response at + all".""" + agent = _FakeAgent() + a = _RecordingChannel(name="responses", path="/r") + b = _RecordingChannel(name="telegram", path="/t") + runner = _SyncTaskRunner() + host = AgentFrameworkHost(target=agent, channels=[a, b], durable_task_runner=runner) + _ = host.app + assert a.context is not None + + host._identities.setdefault("alice", 
{})["telegram"] = ChannelIdentity(channel="telegram", native_id="42") + + # Make the FIRST push (echo) raise, but the SECOND (response) succeed. + calls = {"n": 0} + real_push = b.push + + async def flaky_push(identity: ChannelIdentity, payload: HostedRunResult[Any]) -> None: + calls["n"] += 1 + if calls["n"] == 1: + raise RuntimeError("echo down") + await real_push(identity, payload) + + b.push = flaky_push # type: ignore[method-assign] + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channel("telegram", echo_input=True), + ) + include_originating = await a.context.deliver_response(req, _make_reply("reply")) + # Schedule succeeded; handler swallowed the echo failure and + # the response push landed on the channel. + assert include_originating is False + assert b.pushes and b.pushes[0][1].result.text == "reply" + # Handler did not raise (echo failure was swallowed inside + # the handler), so the runner saw no error. + assert runner.handler_errors == [] + + async def test_echo_idempotent_on_retry(self) -> None: + """When the response push fails on a retried task, the handler + must NOT re-deliver the echo if a prior attempt already + succeeded. The ``echo_done`` cursor on the payload mapping is + the host's idempotency primitive; this test invokes the + handler directly twice with the same payload to exercise the + retry semantics.""" + host, _a, b, ctx, runner = _make_host_with_two_channels() + host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42") + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channel("telegram", echo_input=True), + ) + # First scheduled invocation — echo + response both succeed. 
+ await ctx.deliver_response(req, _make_reply("reply")) + assert len(b.pushes) == 2 # echo + response + # Simulate a retry: invoke the handler again with the same + # payload mapping (the in-process runner reuses the mapping + # across retries). After the first run ``echo_done`` was + # mutated to ``True``; the second run must skip the echo. + _, payload = runner.scheduled[0] + assert payload["echo_done"] is True + await host._handle_push_task(payload) + # Only one more push (the response) — the echo was skipped. + assert len(b.pushes) == 3 + assert str(b.pushes[2][1].result.messages[0].role) == "assistant" + + +# --------------------------------------------------------------------------- # +# Response hook + multi-modal payload + clone-on-fan-out # +# --------------------------------------------------------------------------- # + + +class TestResponseHookFanOut: + async def test_response_hook_applied_per_destination(self) -> None: + """Channels with a ``response_hook`` attribute see their hook + applied before push, with a ``ChannelResponseContext`` carrying + the destination identity, the originating request, and an + ``is_echo`` flag.""" + agent = _FakeAgent() + a = _RecordingChannel(name="responses", path="/r") + b = _RecordingChannel(name="telegram", path="/t") + + seen: list[tuple[str, str, bool]] = [] + + async def telegram_hook( + result: HostedRunResult[AgentResponse], + *, + context: Any, + **_: Any, + ) -> HostedRunResult[AgentResponse]: + seen.append((context.channel_name, context.destination_identity.native_id, context.is_echo)) + return result.replace( + result=AgentResponse( + messages=[Message(role="assistant", contents=[Content.from_text("[hooked] " + result.result.text)])] + ), + ) + + b.response_hook = telegram_hook # type: ignore[attr-defined] + host = AgentFrameworkHost(target=agent, channels=[a, b], durable_task_runner=_SyncTaskRunner()) + _ = host.app + assert a.context is not None + + host._identities.setdefault("alice", {})["telegram"] = 
ChannelIdentity(channel="telegram", native_id="42") + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channel("telegram"), + ) + report = await a.context.deliver_response(req, _make_reply("reply")) + assert report is False + # The pushed payload reflects the hook's transform. + assert b.pushes[0][1].result.text == "[hooked] reply" + assert seen == [("telegram", "42", False)] + + async def test_response_hook_mutation_isolated_per_destination(self) -> None: + """A hook that rebinds ``result`` on its payload must NOT affect + the payload another destination sees. The host clones the + envelope before each hook invocation so a per-destination + :meth:`HostedRunResult.replace` cannot leak across destinations.""" + agent = _FakeAgent() + a = _RecordingChannel(name="responses", path="/r") + b = _RecordingChannel(name="telegram", path="/t") + c = _RecordingChannel(name="extra", path="/x") + + async def hook_that_rebinds(result: HostedRunResult[AgentResponse], **_: Any) -> HostedRunResult[AgentResponse]: + # Naughty hook: rebind ``result`` to a fresh AgentResponse. + # Host's per-destination clone via ``replace()`` makes this safe + # for sibling destinations. 
+ return result.replace(result=AgentResponse(messages=[])) + + b.response_hook = hook_that_rebinds # type: ignore[attr-defined] + host = AgentFrameworkHost(target=agent, channels=[a, b, c], durable_task_runner=_SyncTaskRunner()) + _ = host.app + assert a.context is not None + + host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42") + host._identities["alice"]["extra"] = ChannelIdentity(channel="extra", native_id="9") + + original = _make_reply("reply") + original_result_snapshot = original.result + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channels(["telegram", "extra"]), + ) + report = await a.context.deliver_response(req, original) + assert report is False + # The rebind on the telegram clone must not have touched the + # original envelope, nor the extra channel's view. + assert original.result is original_result_snapshot + # ``extra`` channel saw the original-shaped payload. 
+ extra_push = next(p for p in c.pushes) + assert extra_push[1].result.text == "reply" + + async def test_response_hook_fires_on_echo_with_is_echo_true(self) -> None: + """When ``echo_input`` is set, the channel's response_hook fires + TWICE per destination — once for the echo (is_echo=True), once + for the response (is_echo=False).""" + agent = _FakeAgent() + a = _RecordingChannel(name="responses", path="/r") + b = _RecordingChannel(name="telegram", path="/t") + + phases: list[bool] = [] + + async def telegram_hook( + result: HostedRunResult[AgentResponse], *, context: Any, **_: Any + ) -> HostedRunResult[AgentResponse]: + phases.append(context.is_echo) + return result + + b.response_hook = telegram_hook # type: ignore[attr-defined] + host = AgentFrameworkHost(target=agent, channels=[a, b], durable_task_runner=_SyncTaskRunner()) + _ = host.app + assert a.context is not None + + host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42") + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + session=ChannelSession(isolation_key="alice"), + response_target=ResponseTarget.channel("telegram", echo_input=True), + ) + await a.context.deliver_response(req, _make_reply("reply")) + assert phases == [True, False] + + +# --------------------------------------------------------------------------- # +# HostedRunResult — generic typed envelope # +# --------------------------------------------------------------------------- # + + +class TestHostedRunResult: + """The envelope is a thin generic wrapper around the target's + full-fidelity ``result`` plus an optional session reference. 
The + host does NOT pre-shape or flatten ``result.messages`` / + ``result.get_outputs()`` — channels read the canonical accessor on + the underlying result type themselves.""" + + def test_result_field_carries_full_fidelity_payload(self) -> None: + resp = AgentResponse( + messages=[Message(role="assistant", contents=[Content.from_text("hello")])], + response_id="r-1", + ) + env: HostedRunResult[AgentResponse] = HostedRunResult(resp) + # ``result`` is the canonical accessor; metadata like + # ``response_id`` round-trips through unchanged because the host + # never re-shapes the payload. + assert env.result is resp + assert env.result.text == "hello" + assert env.result.response_id == "r-1" + assert env.session is None + + def test_session_field_attached_and_optional(self) -> None: + resp = _assistant_response("ok") + session = _FakeAgentSession(session_id="sess-1") + env = HostedRunResult(resp, session=session) + assert env.session is session + + def test_replace_clones_envelope_without_touching_result_by_default(self) -> None: + resp = _assistant_response("orig") + original = HostedRunResult(resp, session=_FakeAgentSession(session_id="s")) + clone = original.replace() + # Clone is a distinct envelope but the inner ``result`` is the + # same object — channels that need a deep copy of ``result`` + # itself do the copy themselves. 
+ assert clone is not original + assert clone.result is original.result + assert clone.session is original.session + + def test_replace_rebinds_result_without_perturbing_original(self) -> None: + original = HostedRunResult(_assistant_response("orig")) + clone = original.replace(result=_assistant_response("shaped")) + assert original.result.text == "orig" + assert clone.result.text == "shaped" + + def test_replace_supports_explicit_none_session(self) -> None: + original = HostedRunResult(_assistant_response("x"), session=_FakeAgentSession(session_id="s")) + clone = original.replace(session=None) + assert clone.session is None + # Source envelope untouched. + assert original.session is not None + + async def test_invoke_preserves_full_agent_response_on_result(self) -> None: + """The host's ``_invoke`` carries the agent's ``AgentResponse`` + through unchanged on ``result``. Channels see image / tool / + structured content alongside text — and metadata like + ``response_id`` — without the host pre-shaping anything.""" + + class _MultiModalResponse: + def __init__(self) -> None: + self.text = "summary" + self.response_id = "resp-xyz" + self.messages = [ + Message( + role="assistant", + contents=[ + Content.from_text("summary"), + # Non-text content the host must NOT drop. + Content.from_data(data=b"\x89PNG", media_type="image/png"), + ], + ), + ] + + class _MultiModalAgent: + def create_session(self, *, session_id: str | None = None) -> _FakeAgentSession: + return _FakeAgentSession(session_id=session_id) + + async def run(self, *_args: Any, **_kwargs: Any) -> Any: + return _MultiModalResponse() + + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=_MultiModalAgent(), channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest(channel="responses", operation="op", input="hi") + env = await ch.context.run(req) + # Full agent response carried through verbatim — no flattening. 
+ assert env.result.text == "summary" + assert env.result.response_id == "resp-xyz" + assert len(env.result.messages) == 1 + types = [c.type for c in env.result.messages[0].contents] + assert "text" in types and "data" in types + + +# --------------------------------------------------------------------------- # +# Bind request context — duck-typed hook on context providers # +# --------------------------------------------------------------------------- # + + +from contextlib import contextmanager # noqa: E402 + + +class _RecordingContextProvider: + """Stand-in for a ``HistoryProvider`` that exposes the duck-typed + ``bind_request_context(response_id=..., previous_response_id=..., **_)`` + seam the host calls. Records (event, payload) pairs so tests can + assert call ordering relative to the agent run + stream lifecycle. + """ + + def __init__(self, *, name: str = "rec") -> None: + self.name = name + # (event, payload) tuples — events: "enter", "exit", "agent_start", + # "agent_end", "stream_yield", "stream_done". + self.events: list[tuple[str, Any]] = [] + + @contextmanager + def bind_request_context(self, **kwargs: Any) -> Any: + # Snapshot the call kwargs on enter (so tests can assert + # response_id / previous_response_id forwarding) and the same + # snapshot on exit so we can verify the SAME payload bracketed + # the agent run. + snapshot = dict(kwargs) + self.events.append(("enter", snapshot)) + try: + yield + finally: + self.events.append(("exit", snapshot)) + + +class _ProvidersAgent: + """Agent stand-in that exposes ``context_providers`` so the host's + ``_flat_context_providers`` finds the recording provider. + + Mirrors the real :class:`agent_framework.Agent.run` shape: a sync + ``def`` that returns either an ``Awaitable[AgentResponse]`` (for + ``stream=False``) or a :class:`ResponseStream` synchronously (for + ``stream=True``). 
The host's ``_invoke_stream`` relies on the sync + return so it can wrap the stream in ``_BoundResponseStream`` and + hand it to channels for later iteration. + """ + + def __init__(self, providers: Sequence[Any], *, reply: str = "ok") -> None: + self.context_providers = list(providers) + self._reply = reply + self.calls: list[dict[str, Any]] = [] + + def create_session(self, *, session_id: str | None = None) -> _FakeAgentSession: + return _FakeAgentSession(session_id=session_id) + + def run( + self, + messages: Any = None, + *, + stream: bool = False, + session: Any = None, + **kwargs: Any, + ) -> Any: + self.calls.append({"messages": messages, "stream": stream, "session": session, "kwargs": kwargs}) + + if stream: + providers = self.context_providers + updates = [ + AgentResponseUpdate(contents=[Content.from_text("chunk-1")], role="assistant"), + AgentResponseUpdate(contents=[Content.from_text("chunk-2")], role="assistant"), + ] + + async def _gen() -> AsyncIterator[AgentResponseUpdate]: + # ``agent_start`` is only recorded once iteration begins; + # if the channel abandons the stream without iterating + # we expect to see neither ``agent_start`` nor any + # ``stream_yield`` events. 
+ for prov in providers: + if isinstance(prov, _RecordingContextProvider): + prov.events.append(("agent_start", None)) + for u in updates: + for prov in providers: + if isinstance(prov, _RecordingContextProvider): + prov.events.append(("stream_yield", u.text)) + yield u + + async def _finalize(items: Sequence[AgentResponseUpdate]) -> AgentResponse: # noqa: RUF029 + for prov in providers: + if isinstance(prov, _RecordingContextProvider): + prov.events.append(("stream_done", len(items))) + return AgentResponse.from_updates(items) + + return ResponseStream[AgentResponseUpdate, AgentResponse](_gen(), finalizer=_finalize) + + async def _coro() -> _FakeAgentResponse: + for prov in self.context_providers: + if isinstance(prov, _RecordingContextProvider): + prov.events.append(("agent_start", None)) + prov.events.append(("agent_end", None)) + return _FakeAgentResponse(text=self._reply) + + return _coro() + + +class _ProviderWrapper: + """Wrap children in a ``providers`` attribute (mirrors the + ``ContextProviderBase`` aggregation shape).""" + + def __init__(self, providers: Sequence[Any]) -> None: + self.providers = list(providers) + + +class TestBindRequestContext: + """The host walks ``target.context_providers`` as a flat list, + without descending into aggregator wrappers, and calls + ``bind_request_context(response_id=..., previous_response_id=...)`` + on every provider that supports it. 
Foundry response-id chaining + plugs into this exact seam: a regression that mistypes the kwarg + name, binds the wrong providers, or fails to keep the binding open across + the agent run silently breaks chained writes.""" + + async def test_bind_called_with_request_attributes(self) -> None: + prov = _RecordingContextProvider() + agent = _ProvidersAgent([prov]) + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + session=ChannelSession(isolation_key="alice"), + attributes={"response_id": "resp_abc", "previous_response_id": "resp_prev"}, + ) + result = await ch.context.run(req) + assert result.result.text == "ok" + + # Bind ↔ unbind brackets the agent run. + events = [name for name, _ in prov.events] + assert events == ["enter", "agent_start", "agent_end", "exit"] + + # Both response_id and previous_response_id forwarded by name. + _, enter_payload = prov.events[0] + assert enter_payload["response_id"] == "resp_abc" + assert enter_payload["previous_response_id"] == "resp_prev" + + async def test_bind_skipped_when_no_response_id_attribute(self) -> None: + """Without a ``response_id`` attribute on the request, the host + skips the binding entirely; the contract requires one to anchor + the chain.""" + prov = _RecordingContextProvider() + agent = _ProvidersAgent([prov]) + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest(channel="responses", operation="op", input="hi") + await ch.context.run(req) + assert prov.events == [("agent_start", None), ("agent_end", None)] + + async def test_bind_does_not_descend_into_providers_attribute(self) -> None: + """The host does not introspect ``ContextProviderBase`` aggregator + wrappers. 
Aggregator providers are responsible for forwarding the + bind to their children themselves (``AggregateContextProvider`` + already does this). The host treats whatever ``agent.context_providers`` + exposes as the final, flat list.""" + prov = _RecordingContextProvider(name="inner") + wrapper = _ProviderWrapper([prov]) + agent = _ProvidersAgent([wrapper]) + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + attributes={"response_id": "resp_xyz"}, + ) + await ch.context.run(req) + # The wrapper does not implement ``bind_request_context``, so the + # inner provider must NOT have been entered by the host. + assert ("enter", {"response_id": "resp_xyz", "previous_response_id": None}) not in prov.events + + async def test_bind_held_open_until_stream_exhaustion(self) -> None: + """Streaming runs return a ``ResponseStream`` synchronously but + consumption happens later. The binding must survive that gap and + only release after the iterator drains so the provider sees + every yielded chunk under the bound context.""" + prov = _RecordingContextProvider() + agent = _ProvidersAgent([prov]) + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + stream=True, + attributes={"response_id": "resp_stream"}, + ) + stream = ch.context.run_stream(req) + + # As soon as run_stream returns, the binding must already be open + # so any provider work that happens during iteration sees it. 
+ names_after_create = [name for name, _ in prov.events] + assert names_after_create.count("enter") == 1 + assert "exit" not in names_after_create + + chunks: list[str] = [] + async for u in stream: + chunks.append(u.text) + assert chunks == ["chunk-1", "chunk-2"] + + # After exhaustion the binding must be released — exactly once. + names_after_drain = [name for name, _ in prov.events] + assert names_after_drain.count("enter") == 1 + assert names_after_drain.count("exit") == 1 + # Brackets surround every stream_yield. + enter_idx = names_after_drain.index("enter") + exit_idx = names_after_drain.index("exit") + yield_idxs = [i for i, name in enumerate(names_after_drain) if name == "stream_yield"] + assert all(enter_idx < i < exit_idx for i in yield_idxs) + + +# --------------------------------------------------------------------------- # +# Agent-target streaming — `_BoundResponseStream` adapter behaviour # +# --------------------------------------------------------------------------- # + + +class TestBoundResponseStream: + """The ``_BoundResponseStream`` adapter holds the bind-context + ``ExitStack`` open across iteration. 
Cover the iterator-finally + close, ``get_final_response`` close, double-close idempotence, + ``aclose()``, ``__getattr__`` forwarding, and the awaitable path + (which now routes through ``get_final_response`` so it doesn't + leak the binding).""" + + async def test_get_final_response_closes_binding(self) -> None: + prov = _RecordingContextProvider() + agent = _ProvidersAgent([prov]) + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + stream=True, + attributes={"response_id": "resp_get_final"}, + ) + stream = ch.context.run_stream(req) + # Skip iteration and go straight to ``get_final_response``; + # the adapter must drain the inner stream itself and close + # the binding in ``finally``. + final = await stream.get_final_response() + assert final.text == "chunk-1chunk-2" + names = [n for n, _ in prov.events] + assert names.count("enter") == 1 + assert names.count("exit") == 1 + + async def test_double_close_is_idempotent(self) -> None: + prov = _RecordingContextProvider() + agent = _ProvidersAgent([prov]) + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + stream=True, + attributes={"response_id": "resp_idem"}, + ) + stream = ch.context.run_stream(req) + async for _u in stream: + pass + # Iteration's finally already closed; an explicit ``aclose`` + # afterwards must be a no-op (no second exit event). 
+ await stream.aclose() # type: ignore[attr-defined] + await stream.aclose() # type: ignore[attr-defined] + names = [n for n, _ in prov.events] + assert names.count("exit") == 1 + + async def test_aclose_releases_binding_when_stream_abandoned(self) -> None: + """A channel that abandons the stream without iterating must + be able to call ``aclose()`` so the host-bound contextvars + don't leak for the host's lifetime.""" + prov = _RecordingContextProvider() + agent = _ProvidersAgent([prov]) + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + stream=True, + attributes={"response_id": "resp_abandon"}, + ) + stream = ch.context.run_stream(req) + await stream.aclose() # type: ignore[attr-defined] + + # Binding released without iterating. + names = [n for n, _ in prov.events] + assert names.count("enter") == 1 + assert names.count("exit") == 1 + # Agent never ran; we abandoned before iteration. + assert "agent_start" not in names + + async def test_getattr_forwards_to_inner_stream(self) -> None: + """``_BoundResponseStream.__getattr__`` forwards unknown + attributes to the inner ``ResponseStream``; channels that + call, e.g., ``stream.with_result_hook(...)`` must keep working.""" + prov = _RecordingContextProvider() + agent = _ProvidersAgent([prov]) + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + stream=True, + attributes={"response_id": "resp_getattr"}, + ) + stream = ch.context.run_stream(req) + # ``with_result_hook`` is a real method on ``ResponseStream``; + # if forwarding broke, this would raise AttributeError. 
+ try: + assert callable(stream.with_result_hook) # type: ignore[attr-defined] + finally: + await stream.aclose() # type: ignore[attr-defined] + + async def test_await_path_routes_through_get_final_response(self) -> None: + """``await stream`` is a convenience for ``await + get_final_response()``. The previous direct delegation leaked + the binding for the host's lifetime; the new routing closes the + stack in the same ``finally`` as ``get_final_response``.""" + prov = _RecordingContextProvider() + agent = _ProvidersAgent([prov]) + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + req = ChannelRequest( + channel="responses", + operation="op", + input="hi", + stream=True, + attributes={"response_id": "resp_await"}, + ) + stream = ch.context.run_stream(req) + final = await stream # exercises __await__ + assert final.text == "chunk-1chunk-2" + names = [n for n, _ in prov.events] + assert names.count("enter") == 1 + assert names.count("exit") == 1 + + +# --------------------------------------------------------------------------- # +# `_wrap_input` — list[Message] LAST-message metadata stamping # +# --------------------------------------------------------------------------- # + + +class TestWrapInputListMessages: + """The ``hosting`` block lands on the LAST message of a list — the + contract is load-bearing: the user turn (typically last) must + carry the channel provenance + identity for history correlation; + a regression stamping ``messages[0]`` instead silently breaks + every multi-message payload.""" + + async def test_metadata_lands_on_last_message_only(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + # Responses-API style: a system instruction followed by a user + # turn. Only the user turn (LAST) gets stamped. 
+ system = Message(role="system", contents=[Content.from_text("be concise")]) + user = Message(role="user", contents=[Content.from_text("hi")]) + req = ChannelRequest( + channel="responses", + operation="op", + input=[system, user], + identity=ChannelIdentity(channel="responses", native_id="user:1"), + ) + await ch.context.run(req) + + forwarded = agent.calls[0]["messages"] + assert isinstance(forwarded, list) + assert len(forwarded) == 2 + # System stays clean. + assert (system.additional_properties or {}).get("hosting") is None + # User turn carries the metadata. + hosting = forwarded[-1].additional_properties["hosting"] + assert hosting["channel"] == "responses" + assert hosting["identity"]["native_id"] == "user:1" + + async def test_single_message_payload_still_works(self) -> None: + """Regression guard: the single-``Message`` branch must be + unchanged by the LAST-of-list logic above.""" + agent = _FakeAgent() + ch = _RecordingChannel(name="responses") + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + only = Message(role="user", contents=[Content.from_text("hi")]) + req = ChannelRequest(channel="responses", operation="op", input=only) + await ch.context.run(req) + forwarded = agent.calls[0]["messages"] + assert isinstance(forwarded, Message) + assert forwarded.additional_properties["hosting"]["channel"] == "responses" + + +# --------------------------------------------------------------------------- # +# Lifespan callback aggregation # +# --------------------------------------------------------------------------- # + + +class _RaisingLifecycleChannel: + """Channel whose startup OR shutdown callback raises a controlled error.""" + + def __init__(self, name: str, *, fail_on: str) -> None: + self.name = name + self.path = "" + self._fail_on = fail_on # "startup" | "shutdown" + self.start_calls: list[str] = [] + self.stop_calls: list[str] = [] + + def contribute(self, _context: ChannelContext) -> 
ChannelContribution: + async def _start() -> None: + self.start_calls.append("up") + if self._fail_on == "startup": + raise RuntimeError(f"startup-boom-{self.name}") + + async def _stop() -> None: + self.stop_calls.append("down") + if self._fail_on == "shutdown": + raise RuntimeError(f"shutdown-boom-{self.name}") + + return ChannelContribution(on_startup=[_start], on_shutdown=[_stop]) + + +class _OkLifecycleChannel: + def __init__(self, name: str) -> None: + self.name = name + self.path = "" + self.start_calls: list[str] = [] + self.stop_calls: list[str] = [] + + def contribute(self, _context: ChannelContext) -> ChannelContribution: + async def _start() -> None: + self.start_calls.append("up") + + async def _stop() -> None: + self.stop_calls.append("down") + + return ChannelContribution(on_startup=[_start], on_shutdown=[_stop]) + + +class TestLifespanAggregation: + """One bad startup / shutdown callback must NOT abort the rest — + every channel gets a chance to wire / unwire so half-initialised + state doesn't leak. The first error is still raised so the + process exits with a failure; remaining errors are logged so + operators see them all in one log scrape.""" + + def test_shutdown_failure_does_not_skip_peer_shutdowns(self, caplog: Any) -> None: + import logging as _logging + + agent = _FakeAgent() + bad = _RaisingLifecycleChannel("bad", fail_on="shutdown") + ok1 = _OkLifecycleChannel("ok1") + ok2 = _OkLifecycleChannel("ok2") + # Order: bad first so that without aggregation, ok1+ok2 would + # never get to run their shutdown callbacks. + host = AgentFrameworkHost(target=agent, channels=[bad, ok1, ok2]) + + with caplog.at_level(_logging.ERROR, logger="agent_framework.hosting"): # noqa: SIM117 + with pytest.raises(RuntimeError, match="shutdown-boom-bad"), TestClient(host.app): + pass + + # Every channel had its shutdown attempted, even though `bad` raised. 
+ assert bad.stop_calls == ["down"] + assert ok1.stop_calls == ["down"] + assert ok2.stop_calls == ["down"] + + def test_startup_failure_aggregates_logs_and_raises_first(self, caplog: Any) -> None: + import logging as _logging + + agent = _FakeAgent() + ok1 = _OkLifecycleChannel("ok1") + bad = _RaisingLifecycleChannel("bad", fail_on="startup") + ok2 = _OkLifecycleChannel("ok2") + another_bad = _RaisingLifecycleChannel("bad2", fail_on="startup") + host = AgentFrameworkHost( + target=agent, + channels=[ok1, bad, ok2, another_bad], + ) + + with caplog.at_level(_logging.ERROR, logger="agent_framework.hosting"): # noqa: SIM117 + # The first failing callback's error is the one that + # propagates; remaining failures are logged. + with pytest.raises(RuntimeError, match="startup-boom-bad"), TestClient(host.app): + pass + + # Every startup callback ran (even ok2 / another_bad after the + # first failure) so we get a complete picture in the logs. + assert ok1.start_calls == ["up"] + assert bad.start_calls == ["up"] + assert ok2.start_calls == ["up"] + assert another_bad.start_calls == ["up"] + + # Both failures show up in operator logs. ``logger.exception`` puts + # the exception payload in ``record.exc_text``; the formatted summary + # of the second failure goes into ``record.message`` via the + # aggregate "N callback(s) failed" line. + log_messages = [rec.getMessage() for rec in caplog.records] + log_exc_texts = [rec.exc_text or "" for rec in caplog.records] + log_text = "\n".join(log_messages + log_exc_texts) + assert "startup-boom-bad" in log_text + assert "startup-boom-bad2" in log_text or "callback(s) failed" in log_text diff --git a/python/packages/hosting/tests/test_host_disk.py b/python/packages/hosting/tests/test_host_disk.py new file mode 100644 index 00000000000..5ed27b67aab --- /dev/null +++ b/python/packages/hosting/tests/test_host_disk.py @@ -0,0 +1,424 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""Tests for ``state_dir`` wired through :class:`AgentFrameworkHost`.""" + +from __future__ import annotations + +import asyncio +from pathlib import Path +from typing import Any + +import pytest + +from agent_framework_hosting import ( + AgentFrameworkHost, + ChannelContext, + ChannelContribution, + ChannelIdentity, + LinkChallenge, +) + +# Skip the whole module when the optional disk extra isn't installed. +pytest.importorskip("diskcache") + + +# --------------------------------------------------------------------------- # +# Test helpers # +# --------------------------------------------------------------------------- # + + +class _AgentStub: + """Bare-minimum SupportsAgentRun stub for host construction.""" + + async def run(self, *_args: Any, **_kwargs: Any) -> None: # pragma: no cover - unused + return None + + +class _ChannelStub: + name = "stub" + path = "/stub" + + def contribute(self, _context: ChannelContext) -> ChannelContribution: + return ChannelContribution() + + +class _NonConfigurableLinker: + async def resolve(self, _identity: ChannelIdentity) -> LinkChallenge: + return LinkChallenge("link") + + +class _ConfigurableLinker: + def __init__(self) -> None: + self.configured_path: Path | None = None + + def configure_link_store_path(self, path: str | Path) -> None: + self.configured_path = Path(path) + + async def resolve(self, _identity: ChannelIdentity) -> LinkChallenge: + return LinkChallenge("link") + + +def _close_host_disk(host: AgentFrameworkHost) -> None: + """Mirror the lifespan shutdown ordering for tests that simulate restart. + + The real shutdown order is ``runner.shutdown()`` → ``sessions_store.close()``; + both release their advisory file locks so a second host can take ownership. + """ + runner = host._durable_task_runner + try: + asyncio.get_event_loop().run_until_complete(runner.shutdown(timeout=1.0)) + except RuntimeError: + # No running loop; spin up a throw-away one. 
+ asyncio.run(runner.shutdown(timeout=1.0)) + if host._sessions_store is not None: + host._sessions_store.close() + + +# --------------------------------------------------------------------------- # +# state_dir=None preserves the in-memory contract # +# --------------------------------------------------------------------------- # + + +def test_state_dir_none_keeps_plain_dicts(tmp_path: Path) -> None: + """No store, no sessions persistence, no files written.""" + host = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()]) + try: + assert host._sessions_store is None + assert isinstance(host._session_aliases, dict) + assert isinstance(host._active, dict) + assert isinstance(host._identities, dict) + # No accidental disk writes anywhere under tmp_path. + assert list(tmp_path.iterdir()) == [] + finally: + # Nothing to close. + pass + + +# --------------------------------------------------------------------------- # +# Single string state_dir creates default subfolders # +# --------------------------------------------------------------------------- # + + +def test_string_state_dir_creates_subfolders(tmp_path: Path) -> None: + """Passing a single path expands to ``runner/`` and ``sessions/``.""" + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + state_dir=tmp_path, + ) + try: + assert host._sessions_store is not None + assert (tmp_path / "runner").is_dir() + assert (tmp_path / "sessions").is_dir() + finally: + _close_host_disk(host) + + +# --------------------------------------------------------------------------- # +# Per-component override via HostStatePaths-shaped dict # +# --------------------------------------------------------------------------- # + + +def test_per_component_paths(tmp_path: Path) -> None: + """Dict form lets the caller route components to different roots.""" + runner_dir = tmp_path / "tasks" + sessions_dir = tmp_path / "state" + host = AgentFrameworkHost( + target=_AgentStub(), + 
channels=[_ChannelStub()], + state_dir={"runner": runner_dir, "sessions": sessions_dir}, + ) + try: + assert runner_dir.is_dir() + assert sessions_dir.is_dir() + # Default subfolders should NOT exist when the caller provides + # explicit overrides. + assert not (tmp_path / "runner").is_dir() or runner_dir == (tmp_path / "runner") + assert not (tmp_path / "sessions").is_dir() or sessions_dir == (tmp_path / "sessions") + finally: + _close_host_disk(host) + + +def test_unknown_component_key_raises(tmp_path: Path) -> None: + """Misspelled keys should fail loudly so the user catches typos.""" + with pytest.raises(ValueError, match="unknown"): + AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + state_dir={"runnerr": tmp_path / "x"}, # type: ignore[dict-item] + ) + + +def test_links_state_path_configures_compatible_identity_linker(tmp_path: Path) -> None: + """``state_dir['links']`` is offered to linkers that accept host-owned persistence.""" + linker = _ConfigurableLinker() + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + identity_linker=linker, + state_dir=tmp_path, + ) + try: + assert linker.configured_path == tmp_path / "links" + finally: + _close_host_disk(host) + + +def test_explicit_links_state_path_without_linker_warns(tmp_path: Path, caplog: pytest.LogCaptureFixture) -> None: + """Explicit ``links`` path with no linker is almost certainly dead config.""" + with caplog.at_level("WARNING", logger="agent_framework.hosting"): + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + state_dir={"links": tmp_path / "links"}, + ) + try: + assert any( + "state_dir['links']" in rec.message and "no identity_linker" in rec.message for rec in caplog.records + ) + finally: + _close_host_disk(host) + + +def test_links_state_path_with_nonconfigurable_linker_warns(tmp_path: Path, caplog: pytest.LogCaptureFixture) -> None: + """A linker that owns its persistence directly gets a clear warning.""" + 
with caplog.at_level("WARNING", logger="agent_framework.hosting"): + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + identity_linker=_NonConfigurableLinker(), + state_dir={"links": tmp_path / "links"}, + ) + try: + assert any( + "state_dir['links']" in rec.message and "SupportsLinkStorePath" in rec.message for rec in caplog.records + ) + finally: + _close_host_disk(host) + + +# --------------------------------------------------------------------------- # +# Session bookkeeping survives a host restart # +# --------------------------------------------------------------------------- # + + +def test_session_aliases_survive_restart(tmp_path: Path) -> None: + """Aliases written on host #1 must be visible to host #2.""" + state_dir = tmp_path / "state" + + host1 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir) + host1._session_aliases["user-1"] = "sess-abc" + host1._session_aliases["user-2"] = "sess-def" + _close_host_disk(host1) + + host2 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir) + try: + assert host2._session_aliases["user-1"] == "sess-abc" + assert host2._session_aliases["user-2"] == "sess-def" + finally: + _close_host_disk(host2) + + +def test_active_channel_survives_restart(tmp_path: Path) -> None: + """``_active`` must round-trip through the store.""" + state_dir = tmp_path / "state" + + host1 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir) + host1._active["user-1"] = "telegram" + host1._active["user-2"] = "responses" + _close_host_disk(host1) + + host2 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir) + try: + assert host2._active["user-1"] == "telegram" + assert host2._active["user-2"] == "responses" + finally: + _close_host_disk(host2) + + +def test_identities_nested_mutation_survives_restart(tmp_path: Path) -> None: + """Setting ``self._identities[ik][channel] = 
identity`` must persist. + + This exercises the proxy-inner-dict ``__setitem__`` write-through path, + not just the outer-key replacement path. + """ + state_dir = tmp_path / "state" + + host1 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir) + ident_tg = ChannelIdentity("telegram", "tg-123", {"username": "alice"}) + ident_rsp = ChannelIdentity("responses", "rsp-456") + # Mirrors the host-internal path in ``_register_identity``. + host1._identities.setdefault("user-1", {})["telegram"] = ident_tg + host1._identities.setdefault("user-1", {})["responses"] = ident_rsp + host1._identities.setdefault("user-2", {})["telegram"] = ChannelIdentity("telegram", "tg-789") + _close_host_disk(host1) + + host2 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir) + try: + u1 = host2._identities["user-1"] + assert set(u1.keys()) == {"telegram", "responses"} + assert u1["telegram"].native_id == "tg-123" + assert u1["telegram"].attributes["username"] == "alice" + assert u1["responses"].native_id == "rsp-456" + assert host2._identities["user-2"]["telegram"].native_id == "tg-789" + finally: + _close_host_disk(host2) + + +# --------------------------------------------------------------------------- # +# Explicit durable_task_runner + state_dir['runner'] warns # +# --------------------------------------------------------------------------- # + + +def test_explicit_runner_with_runner_state_warns(tmp_path: Path, caplog: pytest.LogCaptureFixture) -> None: + """Caller-owned runner + state_dir['runner'] → ignore + warn.""" + from agent_framework_hosting import InProcessTaskRunner + + user_runner = InProcessTaskRunner() + try: + with caplog.at_level("WARNING"): + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + durable_task_runner=user_runner, + allow_in_process_runner=True, + state_dir={"runner": tmp_path / "runner"}, + ) + assert any("state_dir['runner']" in rec.message for rec in 
caplog.records) + # Sessions store wasn't requested, so still None. + assert host._sessions_store is None + finally: + # user_runner has no disk state, so nothing else to clean up. + pass + + +# --------------------------------------------------------------------------- # +# Workflow checkpoint integration # +# --------------------------------------------------------------------------- # + + +def _build_simple_workflow() -> Any: + """Build a no-op workflow for checkpoint-wiring tests.""" + from tests._workflow_fixtures import build_upper_workflow + + return build_upper_workflow() + + +def test_single_path_state_dir_wires_workflow_checkpoints(tmp_path: Path) -> None: + """``state_dir="/foo"`` + workflow target → ``/foo/checkpoints/`` is used.""" + workflow = _build_simple_workflow() + host = AgentFrameworkHost( + target=workflow, + channels=[_ChannelStub()], + state_dir=tmp_path, + ) + try: + # Checkpoint location is derived from the single state_dir. + assert host._checkpoint_location == tmp_path / "checkpoints" + finally: + _close_host_disk(host) + + +def test_mapping_state_dir_checkpoints_key_wires_workflow_checkpoints(tmp_path: Path) -> None: + """``state_dir={"checkpoints": ...}`` + workflow target → that path is used.""" + workflow = _build_simple_workflow() + ckpt_dir = tmp_path / "ck" + host = AgentFrameworkHost( + target=workflow, + channels=[_ChannelStub()], + state_dir={"checkpoints": ckpt_dir}, + ) + try: + assert host._checkpoint_location == ckpt_dir + # No diskcache components were requested. + assert host._sessions_store is None + finally: + _close_host_disk(host) + + +def test_mapping_state_dir_omits_checkpoints_for_workflow(tmp_path: Path) -> None: + """Mapping form lets workflow callers opt out of checkpoint persistence.""" + workflow = _build_simple_workflow() + host = AgentFrameworkHost( + target=workflow, + channels=[_ChannelStub()], + # No 'checkpoints' key → no checkpoint persistence even though + # other components are persisted. 
+ state_dir={"runner": tmp_path / "r", "sessions": tmp_path / "s"}, + ) + try: + assert host._checkpoint_location is None + finally: + _close_host_disk(host) + + +def test_explicit_checkpoint_location_wins_over_state_dir(tmp_path: Path, caplog: pytest.LogCaptureFixture) -> None: + """``checkpoint_location`` + ``state_dir`` → explicit param wins + warn.""" + workflow = _build_simple_workflow() + explicit = tmp_path / "explicit-ck" + with caplog.at_level("WARNING", logger="agent_framework.hosting"): + host = AgentFrameworkHost( + target=workflow, + channels=[_ChannelStub()], + checkpoint_location=explicit, + state_dir=tmp_path, + ) + try: + assert host._checkpoint_location == explicit + assert any( + "state_dir['checkpoints']" in rec.message and "checkpoint_location" in rec.message for rec in caplog.records + ) + finally: + _close_host_disk(host) + + +def test_state_dir_checkpoints_for_agent_target_silent_for_single_path(tmp_path: Path) -> None: + """Single-path state_dir + agent target → no checkpoint, no warning.""" + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + state_dir=tmp_path, + ) + try: + assert host._checkpoint_location is None + # ``checkpoints/`` subfolder is not eagerly created (no consumer). 
+ assert not (tmp_path / "checkpoints").exists() + finally: + _close_host_disk(host) + + +def test_state_dir_checkpoints_for_agent_target_warns_when_explicit( + tmp_path: Path, caplog: pytest.LogCaptureFixture +) -> None: + """Mapping form with ``checkpoints`` + agent target → warn (dead config).""" + with caplog.at_level("WARNING", logger="agent_framework.hosting"): + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + state_dir={"checkpoints": tmp_path / "ck"}, + ) + try: + assert host._checkpoint_location is None + assert any( + "state_dir['checkpoints']" in rec.message and "not a Workflow" in rec.message for rec in caplog.records + ) + finally: + _close_host_disk(host) + + +def test_state_dir_checkpoints_conflicts_with_workflow_own_storage(tmp_path: Path) -> None: + """Derived checkpoint path triggers the same conflict guard as explicit.""" + from agent_framework import InMemoryCheckpointStorage, WorkflowBuilder + + from tests._workflow_fixtures import _UpperExecutor + + workflow = WorkflowBuilder( + start_executor=_UpperExecutor(id="upper"), + checkpoint_storage=InMemoryCheckpointStorage(), + ).build() + with pytest.raises(RuntimeError, match="already has checkpoint storage"): + AgentFrameworkHost( + target=workflow, + channels=[_ChannelStub()], + state_dir=tmp_path, + ) diff --git a/python/packages/hosting/tests/test_isolation.py b/python/packages/hosting/tests/test_isolation.py new file mode 100644 index 00000000000..4dc029f07b9 --- /dev/null +++ b/python/packages/hosting/tests/test_isolation.py @@ -0,0 +1,282 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Tests for the per-request isolation contextvar surface in +:mod:`agent_framework_hosting._isolation`. + +The isolation keys are the ONLY seam Foundry-aware providers use to +find partition keys, and the host's ASGI middleware lifts them off the +two well-known headers on every inbound HTTP request. 
A regression +that drops the lookup, mistypes a header name, or fails to reset the +contextvar would silently misroute writes / leak per-request state +across requests, with zero unit-test signal — so cover the surface +fully here. +""" + +from __future__ import annotations + +import asyncio + +from starlette.requests import Request +from starlette.responses import JSONResponse +from starlette.routing import BaseRoute, Route +from starlette.testclient import TestClient + +from agent_framework_hosting import ( + Channel, + ChannelContext, + ChannelContribution, + IsolationKeys, + get_current_isolation_keys, + reset_current_isolation_keys, + set_current_isolation_keys, +) +from agent_framework_hosting._isolation import ( # pyright: ignore[reportPrivateUsage] + ISOLATION_HEADER_CHAT, + ISOLATION_HEADER_USER, + current_isolation_keys, +) + + +class TestIsolationKeys: + def test_defaults_to_none_pair(self) -> None: + keys = IsolationKeys() + assert keys.user_key is None + assert keys.chat_key is None + assert keys.is_empty is True + + def test_partial_with_only_user_is_not_empty(self) -> None: + keys = IsolationKeys(user_key="alice") + assert keys.user_key == "alice" + assert keys.chat_key is None + assert keys.is_empty is False + + def test_partial_with_only_chat_is_not_empty(self) -> None: + keys = IsolationKeys(chat_key="general") + assert keys.is_empty is False + + def test_full_pair_is_not_empty(self) -> None: + keys = IsolationKeys(user_key="alice", chat_key="general") + assert keys.is_empty is False + + +class TestContextVarHelpers: + def test_default_is_none(self) -> None: + # Each test gets a fresh contextvar value because pytest runs + # tests in fresh contexts. ``get`` returns the default. 
+ assert get_current_isolation_keys() is None + + def test_set_and_get_round_trip(self) -> None: + token = set_current_isolation_keys(IsolationKeys(user_key="alice", chat_key="general")) + try: + current = get_current_isolation_keys() + assert current is not None + assert current.user_key == "alice" + assert current.chat_key == "general" + finally: + reset_current_isolation_keys(token) + # Reset restores prior value (None in the default context). + assert get_current_isolation_keys() is None + + def test_set_with_none_clears(self) -> None: + outer = set_current_isolation_keys(IsolationKeys(user_key="alice")) + try: + inner = set_current_isolation_keys(None) + try: + assert get_current_isolation_keys() is None + finally: + reset_current_isolation_keys(inner) + # Reset surfaces the outer value again. + current = get_current_isolation_keys() + assert current is not None + assert current.user_key == "alice" + finally: + reset_current_isolation_keys(outer) + + def test_module_level_contextvar_is_the_same_instance(self) -> None: + """Direct contextvar access (used by the ASGI middleware) and the + public `get_current_isolation_keys()` helper read from the SAME + underlying contextvar. A regression that introduced a second + contextvar would silently break the middleware → provider hop.""" + token = current_isolation_keys.set(IsolationKeys(user_key="bob")) + try: + via_helper = get_current_isolation_keys() + assert via_helper is not None + assert via_helper.user_key == "bob" + finally: + current_isolation_keys.reset(token) + + +class TestHeaderConstants: + """The two header names are part of the public contract — they + match the ones the Foundry Hosted Agents runtime stamps on every + inbound request. 
A typo here would silently misroute partition + writes.""" + + def test_user_header_value(self) -> None: + assert ISOLATION_HEADER_USER == "x-agent-user-isolation-key" + + def test_chat_header_value(self) -> None: + assert ISOLATION_HEADER_CHAT == "x-agent-chat-isolation-key" + + +# --------------------------------------------------------------------------- # +# End-to-end: ASGI middleware lifts the headers into the contextvar. +# --------------------------------------------------------------------------- # + + +class _IsolationProbeChannel: + """A minimal Channel that exposes a single GET route which captures + the contextvar value INSIDE the request and returns it as JSON. + + Tests use this to exercise the full middleware → contextvar → + handler hop end-to-end. + """ + + name = "probe" + path = "" + + def __init__(self) -> None: + self.captured: list[IsolationKeys | None] = [] + + async def _handler(_request: Request) -> JSONResponse: + keys = get_current_isolation_keys() + self.captured.append(keys) + payload = ( + {"user": keys.user_key, "chat": keys.chat_key} + if keys is not None + else {"user": None, "chat": None, "_present": False} + ) + return JSONResponse(payload) + + self._routes: list[BaseRoute] = [Route("/probe", _handler)] + + def contribute(self, _context: ChannelContext) -> ChannelContribution: + return ChannelContribution(routes=self._routes) + + +def _make_host_with_probe() -> tuple[object, _IsolationProbeChannel]: + from agent_framework_hosting import AgentFrameworkHost + + class _NoopAgent: + async def run(self, *_args: object, **_kwargs: object) -> object: # pragma: no cover - never called + raise RuntimeError("not invoked") + + probe = _IsolationProbeChannel() + assert isinstance(probe, Channel) + host = AgentFrameworkHost(target=_NoopAgent(), channels=[probe]) # type: ignore[arg-type] + return host, probe + + +class TestIsolationMiddlewareEndToEnd: + def test_both_headers_lifted_into_contextvar(self) -> None: + host, probe = 
_make_host_with_probe() + with TestClient(host.app) as client: # type: ignore[attr-defined] + r = client.get( + "/probe", + headers={ + ISOLATION_HEADER_USER: "alice-uid", + ISOLATION_HEADER_CHAT: "general-cid", + }, + ) + assert r.status_code == 200 + assert r.json() == {"user": "alice-uid", "chat": "general-cid"} + assert len(probe.captured) == 1 + captured = probe.captured[0] + assert captured is not None + assert captured.user_key == "alice-uid" + assert captured.chat_key == "general-cid" + + def test_only_user_header_lifted(self) -> None: + """One-header-only branch: the middleware still binds (chat=None).""" + host, probe = _make_host_with_probe() + with TestClient(host.app) as client: # type: ignore[attr-defined] + r = client.get("/probe", headers={ISOLATION_HEADER_USER: "alice-uid"}) + assert r.status_code == 200 + assert r.json() == {"user": "alice-uid", "chat": None} + + def test_only_chat_header_lifted(self) -> None: + host, probe = _make_host_with_probe() + with TestClient(host.app) as client: # type: ignore[attr-defined] + r = client.get("/probe", headers={ISOLATION_HEADER_CHAT: "general-cid"}) + assert r.status_code == 200 + assert r.json() == {"user": None, "chat": "general-cid"} + + def test_no_headers_keeps_contextvar_none(self) -> None: + """Local-dev path: with neither header present the middleware is + a no-op and the contextvar stays at its default ``None`` — + providers see "no isolation" and route to the in-memory + fallback rather than picking up stale per-request state.""" + host, probe = _make_host_with_probe() + with TestClient(host.app) as client: # type: ignore[attr-defined] + r = client.get("/probe") + assert r.status_code == 200 + assert r.json() == {"user": None, "chat": None, "_present": False} + assert probe.captured == [None] + + def test_empty_header_value_treated_as_absent(self) -> None: + """A header that's present but empty must not bind an empty key — + ``IsolationContext`` rejects empty strings on the read side.""" + host, 
probe = _make_host_with_probe() + with TestClient(host.app) as client: # type: ignore[attr-defined] + r = client.get( + "/probe", + headers={ + ISOLATION_HEADER_USER: "", + ISOLATION_HEADER_CHAT: "general-cid", + }, + ) + assert r.status_code == 200 + # Empty user header decodes to None; chat key stays bound. + assert r.json() == {"user": None, "chat": "general-cid"} + + def test_contextvar_resets_after_request(self) -> None: + """The middleware must call ``reset_current_isolation_keys`` in + a ``finally`` so per-request state never leaks across requests + or back into the calling thread's context.""" + host, probe = _make_host_with_probe() + with TestClient(host.app) as client: # type: ignore[attr-defined] + r1 = client.get("/probe", headers={ISOLATION_HEADER_USER: "alice-uid"}) + assert r1.status_code == 200 + # Reading the contextvar OUTSIDE the request scope must see + # the default — not the value the prior request bound. + assert get_current_isolation_keys() is None + # And a follow-up request without headers gets a clean + # ``None`` rather than inheriting alice-uid. + r2 = client.get("/probe") + assert r2.json() == {"user": None, "chat": None, "_present": False} + + def test_concurrent_requests_get_isolated_contextvars(self) -> None: + """Different requests run in different async contexts; binding + from request A must NOT leak into a concurrent request B.""" + host, probe = _make_host_with_probe() + + async def _drive() -> None: + # Drive two requests from separate asyncio tasks, each through + # its own TestClient on the same host app, and assert their + # captures don't bleed into each other.
+ async def _hit(user_key: str) -> dict[str, str | None]: + with TestClient(host.app) as client: # type: ignore[attr-defined] + r = client.get("/probe", headers={ISOLATION_HEADER_USER: user_key}) + return r.json() # type: ignore[no-any-return] + + r_alice, r_bob = await asyncio.gather(_hit("alice-uid"), _hit("bob-uid")) + assert r_alice == {"user": "alice-uid", "chat": None} + assert r_bob == {"user": "bob-uid", "chat": None} + + asyncio.run(_drive()) + + +class TestNonHttpScopesPassThrough: + """The middleware intentionally only inspects ``http`` scopes; + lifespan / websocket scopes are forwarded untouched. A regression + that touched lifespan scopes here would crash boot.""" + + async def test_lifespan_scope_does_not_consult_headers(self) -> None: + # The TestClient context manager exercises the lifespan scope + # implicitly; if the middleware tried to decode headers on a + # non-http scope this would raise. Exercise it without binding + # any contextvar work. + host, _probe = _make_host_with_probe() + with TestClient(host.app): # type: ignore[attr-defined] + # Just enter / exit; no requests. + pass diff --git a/python/packages/hosting/tests/test_runner.py b/python/packages/hosting/tests/test_runner.py new file mode 100644 index 00000000000..bee7e097b0a --- /dev/null +++ b/python/packages/hosting/tests/test_runner.py @@ -0,0 +1,333 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""Tests for :class:`InProcessTaskRunner` and runtime-mode auto-detection.""" + +from __future__ import annotations + +import asyncio +from collections.abc import Mapping +from typing import Any + +import pytest + +from agent_framework_hosting import ( + AgentFrameworkHost, + ChannelContext, + ChannelContribution, + DurableTaskPayloadMode, + InProcessTaskRunner, + RetryPolicy, + TaskHandle, +) +from agent_framework_hosting._host import _detect_runtime_mode + +# --------------------------------------------------------------------------- # +# Test helpers # +# --------------------------------------------------------------------------- # + + +class _AgentStub: + """Bare-minimum SupportsAgentRun stub for host construction.""" + + async def run(self, *_args: Any, **_kwargs: Any) -> None: # pragma: no cover - unused + return None + + +class _ChannelStub: + name = "stub" + path = "/stub" + + def contribute(self, _context: ChannelContext) -> ChannelContribution: + return ChannelContribution() + + +# --------------------------------------------------------------------------- # +# Runtime-mode auto-detection # +# --------------------------------------------------------------------------- # + + +class TestRuntimeModeDetection: + """``_detect_runtime_mode`` is pure: tests pass a synthetic env so + they never depend on the test runner's environment. 
Auto-detected + mode + matched marker drive the per-host startup banner so operators + can confirm the host is running in the expected shape.""" + + def test_no_markers_defaults_to_long_running(self) -> None: + mode, marker = _detect_runtime_mode(env={}) + assert mode == "long_running" + assert marker is None + + def test_foundry_marker_selects_ephemeral(self) -> None: + mode, marker = _detect_runtime_mode(env={"FOUNDRY_HOSTING_ENVIRONMENT": "production"}) + assert mode == "ephemeral" + assert marker == "FOUNDRY_HOSTING_ENVIRONMENT" + + def test_azure_functions_marker_selects_ephemeral(self) -> None: + mode, marker = _detect_runtime_mode(env={"AZURE_FUNCTIONS_ENVIRONMENT": "Development"}) + assert mode == "ephemeral" + assert marker == "AZURE_FUNCTIONS_ENVIRONMENT" + + def test_lambda_marker_selects_ephemeral(self) -> None: + mode, marker = _detect_runtime_mode(env={"AWS_LAMBDA_FUNCTION_NAME": "my-fn"}) + assert mode == "ephemeral" + assert marker == "AWS_LAMBDA_FUNCTION_NAME" + + def test_empty_marker_value_ignored(self) -> None: + # Empty-string env var should not count as "set" — Foundry's + # template uses unset-or-empty as "not deployed". + mode, marker = _detect_runtime_mode(env={"FOUNDRY_HOSTING_ENVIRONMENT": ""}) + assert mode == "long_running" + assert marker is None + + +class TestHostRuntimeMode: + """``runtime_mode`` ctor argument overrides auto-detect; ``None`` + triggers auto-detect. The detected mode is exposed via the + ``runtime_mode`` property for operator inspection (and is logged at + startup via ``_log_startup``).""" + + def test_explicit_long_running(self) -> None: + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + runtime_mode="long_running", + ) + assert host.runtime_mode == "long_running" + + def test_explicit_ephemeral_with_default_runner_raises(self) -> None: + # Default runner is in-process and not durable. 
Ephemeral + # deployments would silently lose pushes on scale-to-zero, so + # the host refuses the combination at construction unless the + # operator opts in explicitly via ``allow_in_process_runner``. + with pytest.raises(RuntimeError, match="ephemeral"): + AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + runtime_mode="ephemeral", + ) + + def test_explicit_ephemeral_with_in_process_opt_in_warns(self, caplog: pytest.LogCaptureFixture) -> None: + # The opt-in escape hatch keeps the old warn-and-proceed + # behaviour for local-dev / smoke-test scenarios that genuinely + # want ephemeral runtime semantics without a real durable + # backend. + with caplog.at_level("WARNING", logger="agent_framework.hosting"): + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + runtime_mode="ephemeral", + allow_in_process_runner=True, + ) + assert host.runtime_mode == "ephemeral" + assert any("ephemeral" in r.getMessage() and "InProcessTaskRunner" in r.getMessage() for r in caplog.records) + + def test_explicit_ephemeral_with_supplied_runner_does_not_warn(self, caplog: pytest.LogCaptureFixture) -> None: + runner = InProcessTaskRunner() + with caplog.at_level("WARNING", logger="agent_framework.hosting"): + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + runtime_mode="ephemeral", + durable_task_runner=runner, + ) + # No warning — operator opted into a specific runner. + assert host.runtime_mode == "ephemeral" + assert host.durable_task_runner is runner + assert not any("ephemeral" in r.getMessage() for r in caplog.records) + + def test_auto_detect_ephemeral_raises_without_opt_in(self, monkeypatch: pytest.MonkeyPatch) -> None: + # Auto-detected ephemeral flows through the same strict gate. 
+ monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "production") + with pytest.raises(RuntimeError, match="ephemeral"): + AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()]) + + def test_auto_detect_ephemeral_with_opt_in_proceeds(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "production") + host = AgentFrameworkHost( + target=_AgentStub(), + channels=[_ChannelStub()], + allow_in_process_runner=True, + ) + assert host.runtime_mode == "ephemeral" + + def test_default_runner_is_in_process_task_runner(self) -> None: + host = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()]) + assert isinstance(host.durable_task_runner, InProcessTaskRunner) + + +# --------------------------------------------------------------------------- # +# InProcessTaskRunner # +# --------------------------------------------------------------------------- # + + +class TestInProcessTaskRunner: + async def test_schedule_runs_handler_and_records_succeeded(self) -> None: + runner = InProcessTaskRunner() + seen: list[Mapping[str, Any]] = [] + + async def handler(payload: Mapping[str, Any]) -> None: + seen.append(payload) + + runner.register("ping", handler) + handle = await runner.schedule("ping", {"x": 1}) + # ``schedule`` returns immediately; the task runs on the loop. + # Drain explicitly via ``shutdown`` to flush in-flight work, + # then assert. 
+ await _drain(runner, handle) + assert seen == [{"x": 1}] + assert await runner.get(handle) == "succeeded" + + async def test_unknown_handler_raises_keyerror(self) -> None: + runner = InProcessTaskRunner() + with pytest.raises(KeyError): + await runner.schedule("missing", {}) + + async def test_register_after_start_raises(self) -> None: + runner = InProcessTaskRunner() + + async def noop(_p: Mapping[str, Any]) -> None: + return None + + runner.register("x", noop) + handle = await runner.schedule("x", {}) + await _drain(runner, handle) + # Re-registering after the runner has started scheduling is + # rejected so in-flight tasks can't have their handler swapped + # out from under them. + with pytest.raises(RuntimeError, match="register"): + runner.register("y", noop) + + async def test_handler_retried_then_succeeds(self) -> None: + runner = InProcessTaskRunner() + attempts = {"n": 0} + + async def flaky(_p: Mapping[str, Any]) -> None: + attempts["n"] += 1 + if attempts["n"] < 3: + raise RuntimeError(f"attempt {attempts['n']}") + + runner.register("flaky", flaky) + # Tight retry policy so the test doesn't sleep visibly. 
+ policy = RetryPolicy(max_attempts=5, initial_backoff_seconds=0.001, max_backoff_seconds=0.005) + handle = await runner.schedule("flaky", {}, retry_policy=policy) + await _drain(runner, handle) + assert attempts["n"] == 3 + assert await runner.get(handle) == "succeeded" + + async def test_handler_failure_records_failed_after_max_attempts(self) -> None: + runner = InProcessTaskRunner() + + async def always_fails(_p: Mapping[str, Any]) -> None: + raise RuntimeError("nope") + + runner.register("doomed", always_fails) + policy = RetryPolicy(max_attempts=2, initial_backoff_seconds=0.001) + handle = await runner.schedule("doomed", {}, retry_policy=policy) + await _drain(runner, handle) + assert await runner.get(handle) == "failed" + + async def test_shutdown_cancels_pending_tasks(self) -> None: + runner = InProcessTaskRunner() + started = asyncio.Event() + cancelled = asyncio.Event() + + async def long_running(_p: Mapping[str, Any]) -> None: + started.set() + try: + # Sleep longer than the test wait so shutdown can cancel. + await asyncio.sleep(5) + except asyncio.CancelledError: + cancelled.set() + raise + + runner.register("long", long_running) + handle = await runner.schedule("long", {}) + await asyncio.wait_for(started.wait(), timeout=1.0) + await runner.shutdown(timeout=1.0) + assert cancelled.is_set() + assert await runner.get(handle) == "cancelled" + + async def test_shutdown_grace_drain_does_not_cancel_finishing_tasks(self) -> None: + """A short-lived task that completes within the grace window + must NOT receive a cancellation. 
The grace-period drain is the + graceful-shutdown contract — channels with goodbye-message + flushes rely on it.""" + runner = InProcessTaskRunner() + cancelled = asyncio.Event() + completed = asyncio.Event() + + async def quick(_p: Mapping[str, Any]) -> None: + try: + await asyncio.sleep(0.05) + except asyncio.CancelledError: + cancelled.set() + raise + completed.set() + + runner.register("quick", quick) + handle = await runner.schedule("quick", {}) + # Shutdown with a generous grace window relative to the task duration. + await runner.shutdown(timeout=1.0) + assert completed.is_set() + assert not cancelled.is_set() + assert await runner.get(handle) == "succeeded" + + async def test_get_returns_none_for_unknown_handle(self) -> None: + runner = InProcessTaskRunner() + handle = TaskHandle(task_id="never-scheduled", name="x") + assert await runner.get(handle) is None + + async def test_terminal_cache_evicts_oldest(self) -> None: + # Cache size of 2: drain three tasks in sequence, the first + # should age out by the time the third's terminal lands. + runner = InProcessTaskRunner(terminal_cache_size=2) + + async def noop(_p: Mapping[str, Any]) -> None: + return None + + runner.register("noop", noop) + h1 = await runner.schedule("noop", {}) + await _drain(runner, h1) + h2 = await runner.schedule("noop", {}) + await _drain(runner, h2) + h3 = await runner.schedule("noop", {}) + await _drain(runner, h3) + # Oldest handle's terminal status should be evicted by now. + assert await runner.get(h1) is None + assert await runner.get(h2) == "succeeded" + assert await runner.get(h3) == "succeeded" + + async def test_shutdown_is_safe_when_no_tasks_pending(self) -> None: + runner = InProcessTaskRunner() + # No-op shouldn't raise. + await runner.shutdown() + + def test_payload_mode_defaults_to_object(self) -> None: + # The in-process runner passes live Python references through + # the payload — the host wires this attribute into its codec + # validator at startup. 
Durable adapters that persist payloads + # must override this to ``JSON`` so the host refuses to ship + # un-serialisable references. + runner = InProcessTaskRunner() + assert runner.payload_mode == DurableTaskPayloadMode.OBJECT + + +# --------------------------------------------------------------------------- # +# Helpers # +# --------------------------------------------------------------------------- # + + +async def _drain(runner: InProcessTaskRunner, handle: TaskHandle, *, timeout: float = 1.0) -> None: + """Wait for ``handle`` to reach a terminal state. + + Polls ``get`` rather than reaching into runner internals so we exercise the + public surface from the test side too. + """ + deadline = asyncio.get_event_loop().time() + timeout + while True: + status = await runner.get(handle) + if status in ("succeeded", "failed", "cancelled"): + return + if asyncio.get_event_loop().time() > deadline: + raise AssertionError(f"task {handle.task_id} did not reach terminal in {timeout}s; status={status}") + await asyncio.sleep(0.01) diff --git a/python/packages/hosting/tests/test_runner_disk.py b/python/packages/hosting/tests/test_runner_disk.py new file mode 100644 index 00000000000..db566e7e3c9 --- /dev/null +++ b/python/packages/hosting/tests/test_runner_disk.py @@ -0,0 +1,278 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Tests for :class:`InProcessTaskRunner` disk persistence (``state_dir``).""" + +from __future__ import annotations + +import asyncio +from collections.abc import Mapping +from pathlib import Path +from typing import Any + +import pytest + +from agent_framework_hosting import ( + InProcessTaskRunner, + PushPayloadNotPicklable, + RetryPolicy, +) + +# Skip the whole module if the optional diskcache dependency isn't installed. 
+pytest.importorskip("diskcache") + + +# --------------------------------------------------------------------------- # +# state_dir=None preserves today's purely in-memory contract # +# --------------------------------------------------------------------------- # + + +async def test_state_dir_none_is_pure_memory(tmp_path: Path) -> None: + """No directory creation / no lock file when state_dir is omitted.""" + runner = InProcessTaskRunner() + calls: list[Mapping[str, Any]] = [] + + async def handler(payload: Mapping[str, Any]) -> None: + calls.append(payload) + + runner.register("echo", handler) + handle = await runner.schedule("echo", {"k": "v"}) + + # Wait for completion. + for _ in range(50): + if (await runner.get(handle)) == "succeeded": + break + await asyncio.sleep(0.01) + assert calls == [{"k": "v"}] + assert await runner.get(handle) == "succeeded" + # Confirm we didn't accidentally write to disk. + assert not (tmp_path / ".lock").exists() + + await runner.shutdown() + + +# --------------------------------------------------------------------------- # +# Lock contention — two runners on the same dir refuse to coexist # +# --------------------------------------------------------------------------- # + + +async def test_two_runners_one_state_dir_raise(tmp_path: Path) -> None: + """Second runner construction must fail loudly, not silently corrupt.""" + state_dir = tmp_path / "runner" + first = InProcessTaskRunner(state_dir=state_dir) + try: + with pytest.raises(RuntimeError, match="state lock"): + InProcessTaskRunner(state_dir=state_dir) + finally: + await first.shutdown() + + +# --------------------------------------------------------------------------- # +# Pickle failure raises eagerly, never silently downgrades # +# --------------------------------------------------------------------------- # + + +async def test_unpickleable_payload_raises(tmp_path: Path) -> None: + """Schedule must refuse payloads that can't survive a restart.""" + runner = 
InProcessTaskRunner(state_dir=tmp_path / "runner") + + async def handler(_: Mapping[str, Any]) -> None: ... + + runner.register("echo", handler) + # Local lambdas / closures are the canonical unpicklable values. + with pytest.raises(PushPayloadNotPicklable): + await runner.schedule("echo", {"callback": lambda: None}) + await runner.shutdown() + + +# --------------------------------------------------------------------------- # +# Resume — pending records replay on next process # +# --------------------------------------------------------------------------- # + + +async def test_pending_record_replays_on_resume(tmp_path: Path) -> None: + """Simulate a crash: first runner schedules but never starts running.""" + state_dir = tmp_path / "runner" + + # Process 1 — schedule a task, then "die" before the asyncio loop runs it. + runner1 = InProcessTaskRunner(state_dir=state_dir) + blocked = asyncio.Event() + + async def slow(_: Mapping[str, Any]) -> None: + # Sleep so the task is observably still in flight when we shutdown. + await blocked.wait() + + runner1.register("slow", slow) + handle = await runner1.schedule("slow", {"work": 1}) + # Force a hard shutdown — leaves the in-flight task in 'pending' on disk. + await runner1.shutdown(timeout=0.1) + + # Process 2 — fresh runner against same state_dir, register the handler, + # call resume. We expect the persisted record to be re-scheduled. + runner2 = InProcessTaskRunner(state_dir=state_dir) + seen: list[Mapping[str, Any]] = [] + + async def slow_resumed(payload: Mapping[str, Any]) -> None: + seen.append(dict(payload)) + + runner2.register("slow", slow_resumed) + replayed = await runner2.resume() + assert replayed == 1 + + # Give the resumed task time to run. + for _ in range(50): + if seen: + break + await asyncio.sleep(0.01) + assert seen == [{"work": 1}] + # Status is observable via the original handle. 
+ assert await runner2.get(handle) == "succeeded" + + await runner2.shutdown() + + +# --------------------------------------------------------------------------- # +# echo_done cursor survives restart # +# --------------------------------------------------------------------------- # + + +async def test_payload_mutation_survives_restart(tmp_path: Path) -> None: + """Handler-side payload mutations (echo_done) round-trip through disk.""" + state_dir = tmp_path / "runner" + runner1 = InProcessTaskRunner(state_dir=state_dir) + + # Handler sets echo_done and then blocks forever (simulating mid-flight crash). + handler_progress = asyncio.Event() + + async def half_done(payload: Mapping[str, Any]) -> None: + # Mutate the payload to mark first phase complete. + payload["echo_done"] = True # type: ignore[index] + handler_progress.set() + # Sleep indefinitely so the asyncio task is still running at shutdown. + await asyncio.Event().wait() + + runner1.register("two_phase", half_done) + handle = await runner1.schedule("two_phase", {"echo_done": False, "k": "v"}) + await handler_progress.wait() + await runner1.shutdown(timeout=0.1) + + # Process 2 — replay; the handler now sees echo_done=True from disk. + runner2 = InProcessTaskRunner(state_dir=state_dir) + observed: list[bool] = [] + + async def two_phase_resumed(payload: Mapping[str, Any]) -> None: + observed.append(bool(payload.get("echo_done"))) + + runner2.register("two_phase", two_phase_resumed) + await runner2.resume() + + for _ in range(50): + if observed: + break + await asyncio.sleep(0.01) + assert observed == [True] + # And the resumed task ran to completion. 
+ assert await runner2.get(handle) == "succeeded" + + await runner2.shutdown() + + +# --------------------------------------------------------------------------- # +# Resume gracefully handles missing handler / corrupt entries # +# --------------------------------------------------------------------------- # + + +async def test_resume_with_missing_handler_marks_failed(tmp_path: Path) -> None: + """A persisted record whose handler is no longer registered is marked failed.""" + state_dir = tmp_path / "runner" + + runner1 = InProcessTaskRunner(state_dir=state_dir) + + async def will_be_removed(_: Mapping[str, Any]) -> None: + await asyncio.Event().wait() + + runner1.register("ghost", will_be_removed) + handle = await runner1.schedule("ghost", {}) + await runner1.shutdown(timeout=0.1) + + # Process 2 — never registers "ghost". + runner2 = InProcessTaskRunner(state_dir=state_dir) + replayed = await runner2.resume() + assert replayed == 0 + # The record is moved to terminal 'failed'. + assert await runner2.get(handle) == "failed" + await runner2.shutdown() + + +async def test_resume_quarantines_corrupt_entries(tmp_path: Path) -> None: + """A non-dict on-disk entry must be quarantined, not crash resume.""" + import diskcache # noqa: PLC0415 - lazy import to keep module-import cheap + + state_dir = tmp_path / "runner" + state_dir.mkdir(parents=True, exist_ok=True) + # Pre-populate the cache with a junk entry. + cache = diskcache.Cache(str(state_dir)) + cache.set("bad-task-id", "this is not a dict") + cache.close() + + runner = InProcessTaskRunner(state_dir=state_dir) + # resume() must not raise even with a corrupt entry on disk. + replayed = await runner.resume() + assert replayed == 0 + await runner.shutdown() + + # The corrupt entry should have been removed. 
+ cache2 = diskcache.Cache(str(state_dir)) + assert "bad-task-id" not in cache2 + cache2.close() + + +# --------------------------------------------------------------------------- # +# Retry attempt counter persists across resume # +# --------------------------------------------------------------------------- # + + +async def test_attempt_counter_persists_across_resume(tmp_path: Path) -> None: + """A handler that crashes mid-attempt resumes with the consumed budget.""" + state_dir = tmp_path / "runner" + policy = RetryPolicy(max_attempts=3, initial_backoff_seconds=0.01, backoff_multiplier=1.0) + + # Process 1 — schedule, fail once, shutdown before retry settles. + runner1 = InProcessTaskRunner(state_dir=state_dir, default_retry_policy=policy) + attempts_seen_p1 = 0 + + async def flaky(_: Mapping[str, Any]) -> None: + nonlocal attempts_seen_p1 + attempts_seen_p1 += 1 + raise RuntimeError("boom-1") + + runner1.register("flaky", flaky) + handle = await runner1.schedule("flaky", {}) + # Let it attempt twice (waste 2 of 3 budgeted retries), then crash-shutdown. + for _ in range(50): + if attempts_seen_p1 >= 2: + break + await asyncio.sleep(0.01) + await runner1.shutdown(timeout=0.05) + + # Process 2 — resume; only 1 attempt left in the budget. Confirm we don't + # re-grant the full retry budget. + runner2 = InProcessTaskRunner(state_dir=state_dir, default_retry_policy=policy) + attempts_seen_p2 = 0 + + async def flaky_resumed(_: Mapping[str, Any]) -> None: + nonlocal attempts_seen_p2 + attempts_seen_p2 += 1 + raise RuntimeError("boom-2") + + runner2.register("flaky", flaky_resumed) + await runner2.resume() + # Wait for the resumed task to consume its remaining attempts and fail terminally. + for _ in range(100): + if (await runner2.get(handle)) == "failed": + break + await asyncio.sleep(0.01) + assert await runner2.get(handle) == "failed" + # Original consumed 2 attempts; we should have allowed at most max_attempts-2=1 + # more in process 2. 
+ assert attempts_seen_p2 <= 1 + await runner2.shutdown() diff --git a/python/packages/hosting/tests/test_types.py b/python/packages/hosting/tests/test_types.py new file mode 100644 index 00000000000..e502e16dca0 --- /dev/null +++ b/python/packages/hosting/tests/test_types.py @@ -0,0 +1,252 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Tests for the channel-neutral envelope types in :mod:`agent_framework_hosting._types`.""" + +from __future__ import annotations + +from typing import Any + +from agent_framework_hosting import ( + ChannelIdentity, + ChannelRequest, + ChannelSession, + DurableTaskPayloadMode, + ResponseTarget, + ResponseTargetKind, + apply_run_hook, +) + + +class TestResponseTarget: + def test_originating_default_singleton(self) -> None: + target = ResponseTarget.originating # type: ignore[attr-defined] + assert target.kind is ResponseTargetKind.ORIGINATING + assert target.targets == () + + def test_active_singleton(self) -> None: + target = ResponseTarget.active # type: ignore[attr-defined] + assert target.kind is ResponseTargetKind.ACTIVE + assert target.targets == () + + def test_all_linked_singleton(self) -> None: + target = ResponseTarget.all_linked # type: ignore[attr-defined] + assert target.kind is ResponseTargetKind.ALL_LINKED + + def test_none_singleton(self) -> None: + target = ResponseTarget.none # type: ignore[attr-defined] + assert target.kind is ResponseTargetKind.NONE + + def test_channel_builder_single(self) -> None: + target = ResponseTarget.channel("teams") + assert target.kind is ResponseTargetKind.CHANNELS + assert target.targets == ("teams",) + + def test_channels_builder_list(self) -> None: + target = ResponseTarget.channels(["teams", "telegram", "originating"]) + assert target.kind is ResponseTargetKind.CHANNELS + assert target.targets == ("teams", "telegram", "originating") + + def test_channels_builder_accepts_tuple(self) -> None: + target = ResponseTarget.channels(("a", "b")) + assert target.targets == ("a", "b") 
+ + def test_target_is_hashable(self) -> None: + # Plain class — hashing falls back to identity, which is fine here: + # the two keys below are different instances (singleton vs builder). + d = {ResponseTarget.originating: 1, ResponseTarget.channel("t"): 2} # type: ignore[attr-defined] + assert len(d) == 2 + + +class TestChannelRequest: + def test_required_fields_only(self) -> None: + req = ChannelRequest(channel="responses", operation="message.create", input="hi") + assert req.channel == "responses" + assert req.operation == "message.create" + assert req.input == "hi" + assert req.session is None + assert req.options is None + assert req.session_mode == "auto" + assert req.metadata == {} + assert req.attributes == {} + assert req.stream is False + assert req.identity is None + # Default response target is the originating singleton. + assert req.response_target.kind is ResponseTargetKind.ORIGINATING + + def test_default_response_target_is_originating_singleton(self) -> None: + # Every new request shares the module-level ``originating`` singleton + # by default — instances are intended to be treated as immutable, so + # sharing is safe and avoids per-request allocation. 
+ a = ChannelRequest(channel="a", operation="op", input="x") + b = ChannelRequest(channel="b", operation="op", input="y") + assert a.response_target is ResponseTarget.originating # type: ignore[attr-defined] + assert a.response_target is b.response_target + + def test_with_session_and_identity(self) -> None: + req = ChannelRequest( + channel="telegram", + operation="message.create", + input="hi", + session=ChannelSession(isolation_key="user:42"), + identity=ChannelIdentity(channel="telegram", native_id="42"), + response_target=ResponseTarget.active, # type: ignore[attr-defined] + ) + assert req.session is not None + assert req.session.isolation_key == "user:42" + assert req.identity is not None + assert req.identity.channel == "telegram" + assert req.identity.native_id == "42" + assert req.response_target.kind is ResponseTargetKind.ACTIVE + + +class TestChannelIdentity: + def test_attributes_default_empty_mapping(self) -> None: + ident = ChannelIdentity(channel="teams", native_id="abc") + assert dict(ident.attributes) == {} + + def test_attributes_passthrough(self) -> None: + ident = ChannelIdentity(channel="teams", native_id="abc", attributes={"role": "user"}) + assert dict(ident.attributes) == {"role": "user"} + + +class _DummyTarget: + """Stand-in for the ``SupportsAgentRun | Workflow`` arg `apply_run_hook` forwards. + + `apply_run_hook` doesn't introspect the target — it just forwards + it as a kwarg to the user's hook — so a bare class is enough. + """ + + +class TestApplyRunHook: + """`apply_run_hook` is the channel-side helper that invokes a + `ChannelRunHook` with the standard kwargs (`request` positional, + `target` / `protocol_request` keyword). Channels call this rather + than calling the hook directly so the convention is enforced in + one place. 
Cover both branching paths (sync vs async hook return) + and assert kwargs forwarding so a regression that drops `target` + or `protocol_request` is caught.""" + + async def test_sync_hook_returning_modified_request(self) -> None: + captured: dict[str, Any] = {} + + def hook(request: ChannelRequest, **kwargs: Any) -> ChannelRequest: + # Snapshot the kwargs for the assertion below, then return a + # NEW request so we also verify the helper passes the + # replacement straight through (no merging / mutation). + captured["target"] = kwargs.get("target") + captured["protocol_request"] = kwargs.get("protocol_request") + return ChannelRequest(channel=request.channel, operation="HOOK_TOUCHED", input=request.input) + + original = ChannelRequest(channel="responses", operation="op", input="hi") + target = _DummyTarget() + proto = {"raw": "payload"} + + result = await apply_run_hook(hook, original, target=target, protocol_request=proto) + + assert result is not original + assert result.operation == "HOOK_TOUCHED" + assert captured["target"] is target + assert captured["protocol_request"] is proto + + async def test_async_hook_returning_modified_request(self) -> None: + captured: dict[str, Any] = {} + + async def hook(request: ChannelRequest, **kwargs: Any) -> ChannelRequest: + captured["target"] = kwargs.get("target") + captured["protocol_request"] = kwargs.get("protocol_request") + # Return an awaitable result to exercise the async branch + # (`isinstance(result, Awaitable) → await it`). 
+ return ChannelRequest(channel=request.channel, operation="ASYNC_HOOK", input=request.input) + + original = ChannelRequest(channel="telegram", operation="op", input="hi") + target = _DummyTarget() + proto = {"update_id": 42} + + result = await apply_run_hook(hook, original, target=target, protocol_request=proto) + + assert result.operation == "ASYNC_HOOK" + assert captured["target"] is target + assert captured["protocol_request"] is proto + + async def test_protocol_request_can_be_none(self) -> None: + """Channels that don't have a raw protocol payload (e.g. CLI / test + harness invocations) pass ``protocol_request=None``; the helper + forwards it as-is so hooks can ``if protocol_request is None`` to + gate channel-specific logic.""" + captured: dict[str, Any] = {} + + async def hook(request: ChannelRequest, **kwargs: Any) -> ChannelRequest: + captured["protocol_request"] = kwargs.get("protocol_request") + captured["protocol_request_in_kwargs"] = "protocol_request" in kwargs + return request + + await apply_run_hook( + hook, + ChannelRequest(channel="x", operation="op", input="hi"), + target=_DummyTarget(), + protocol_request=None, + ) + + assert captured["protocol_request"] is None + assert captured["protocol_request_in_kwargs"] is True + + +class TestDurableTaskPayloadMode: + """``DurableTaskPayloadMode`` distinguishes object-mode (in-process, + live references) from JSON-mode (durable persistence, channel codec + required) runners. The host's startup validator uses the value to + refuse misconfigured deployments.""" + + def test_enum_values(self) -> None: + assert DurableTaskPayloadMode.OBJECT.value == "object" + assert DurableTaskPayloadMode.JSON.value == "json" + # Both members; no surprise additions until we ship a third + # adapter style. 
+ assert set(DurableTaskPayloadMode) == {DurableTaskPayloadMode.OBJECT, DurableTaskPayloadMode.JSON} + + +class TestResponseTargetIdentities: + """``ResponseTarget.identity``/``.identities`` carry full + :class:`ChannelIdentity` objects (incl. attributes) so destination + channels that need conversation/thread metadata (Teams, Slack, Bot + Framework) don't have to encode it through string tokens.""" + + def test_identity_single(self) -> None: + ident = ChannelIdentity(channel="teams", native_id="user@contoso", attributes={"tenant_id": "abc"}) + target = ResponseTarget.identity(ident) + assert target.kind is ResponseTargetKind.IDENTITIES + assert len(target.target_identities) == 1 + assert target.target_identities[0].channel == "teams" + assert target.target_identities[0].native_id == "user@contoso" + assert dict(target.target_identities[0].attributes) == {"tenant_id": "abc"} + + def test_identities_list_preserves_attributes(self) -> None: + ident_a = ChannelIdentity(channel="teams", native_id="u1", attributes={"thread": "t1"}) + ident_b = ChannelIdentity(channel="slack", native_id="u2", attributes={"channel_id": "c2"}) + target = ResponseTarget.identities([ident_a, ident_b]) + assert target.kind is ResponseTargetKind.IDENTITIES + assert len(target.target_identities) == 2 + assert dict(target.target_identities[0].attributes) == {"thread": "t1"} + assert dict(target.target_identities[1].attributes) == {"channel_id": "c2"} + + def test_identity_value_equality_matches_on_attributes(self) -> None: + # Two ``ResponseTarget.identity`` values built independently + # compare equal when the underlying ``ChannelIdentity`` content + # matches — important because tests and channel parsers use + # ``==`` on targets. 
+ ident_a = ChannelIdentity(channel="teams", native_id="u1", attributes={"thread": "t1"}) + ident_b = ChannelIdentity(channel="teams", native_id="u1", attributes={"thread": "t1"}) + assert ResponseTarget.identity(ident_a) == ResponseTarget.identity(ident_b) + # Different attributes → not equal. + ident_c = ChannelIdentity(channel="teams", native_id="u1", attributes={"thread": "t2"}) + assert ResponseTarget.identity(ident_a) != ResponseTarget.identity(ident_c) + + def test_identity_repr_includes_targets(self) -> None: + ident = ChannelIdentity(channel="teams", native_id="u1") + rep = repr(ResponseTarget.identity(ident)) + assert "ResponseTarget.identities" in rep + + def test_identity_echo_input_flag(self) -> None: + ident = ChannelIdentity(channel="teams", native_id="u1") + target = ResponseTarget.identity(ident, echo_input=True) + assert target.echo_input is True diff --git a/python/pyproject.toml b/python/pyproject.toml index 0a4e6f34a93..ca53fbf8ebd 100644 --- a/python/pyproject.toml +++ b/python/pyproject.toml @@ -87,6 +87,7 @@ agent-framework-foundry-hosting = { workspace = true } agent-framework-foundry-local = { workspace = true } agent-framework-gemini = { workspace = true } agent-framework-github-copilot = { workspace = true } +agent-framework-hosting = { workspace = true } agent-framework-hyperlight = { workspace = true } agent-framework-lab = { workspace = true } agent-framework-mem0 = { workspace = true } @@ -210,6 +211,7 @@ executionEnvironments = [ { root = "packages/foundry/tests", reportPrivateUsage = "none" }, { root = "packages/foundry_local/tests", reportPrivateUsage = "none" }, { root = "packages/github_copilot/tests", reportPrivateUsage = "none" }, + { root = "packages/hosting/tests", reportPrivateUsage = "none" }, { root = "packages/lab/gaia/tests", reportPrivateUsage = "none" }, { root = "packages/lab/lightning/tests", reportPrivateUsage = "none" }, { root = "packages/lab/tau2/tests", reportPrivateUsage = "none" }, diff --git a/python/uv.lock 
b/python/uv.lock index d17e55e1a7f..a2b7ce16f27 100644 --- a/python/uv.lock +++ b/python/uv.lock @@ -50,6 +50,7 @@ members = [ "agent-framework-foundry-local", "agent-framework-gemini", "agent-framework-github-copilot", + "agent-framework-hosting", "agent-framework-hyperlight", "agent-framework-lab", "agent-framework-mem0", @@ -612,6 +613,40 @@ requires-dist = [ { name = "github-copilot-sdk", marker = "python_full_version >= '3.11'", specifier = ">=1.0.0,<2" }, ] +[[package]] +name = "agent-framework-hosting" +version = "1.0.0a260424" +source = { editable = "packages/hosting" } +dependencies = [ + { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "starlette", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] + +[package.optional-dependencies] +disk = [ + { name = "diskcache", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] +serve = [ + { name = "hypercorn", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] + +[package.dev-dependencies] +dev = [ + { name = "httpx", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] + +[package.metadata] +requires-dist = [ + { name = "agent-framework-core", editable = "packages/core" }, + { name = "diskcache", marker = "extra == 'disk'", specifier = ">=5.6" }, + { name = "hypercorn", marker = "extra == 'serve'", specifier = ">=0.17" }, + { name = "starlette", specifier = ">=0.37" }, +] +provides-extras = ["serve", "disk"] + +[package.metadata.requires-dev] +dev = [{ name = "httpx", specifier = ">=0.28.1" }] + [[package]] name = "agent-framework-hyperlight" version = "1.0.0b260521" @@ -2120,6 +2155,15 @@ wheels = [ { url = "https://files.pythonhosted.org/packages/dc/c4/da7089cd7aa4ab554f56e18a7fb08dcfed8fd2ae91fa528f5b1be207a148/deepdiff-9.0.0-py3-none-any.whl", hash 
= "sha256:b1ae0dd86290d86a03de5fbee728fde43095c1472ae4974bdab23ab4656305bd", size = 170540, upload-time = "2026-03-30T05:52:22.008Z" }, ] +[[package]] +name = "diskcache" +version = "5.6.3" +source = { registry = "https://pypi.org/simple" } +sdist = { url = "https://files.pythonhosted.org/packages/3f/21/1c1ffc1a039ddcc459db43cc108658f32c57d271d7289a2794e401d0fdb6/diskcache-5.6.3.tar.gz", hash = "sha256:2c3a3fa2743d8535d832ec61c2054a1641f41775aa7c556758a109941e33e4fc", size = 67916, upload-time = "2023-08-31T06:12:00.316Z" } +wheels = [ + { url = "https://files.pythonhosted.org/packages/3f/27/4570e78fc0bf5ea0ca45eb1de3818a23787af9b390c0b0a0033a1b8236f9/diskcache-5.6.3-py3-none-any.whl", hash = "sha256:5e31b2d5fbad117cc363ebaf6b689474db18a1f6438bc82358b024abd4c2ca19", size = 45550, upload-time = "2023-08-31T06:11:58.822Z" }, +] + [[package]] name = "distro" version = "1.9.0" From a83a8af5a72e1b4c71b7b513139422d7f0123fc2 Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Fri, 22 May 2026 15:42:06 +0200 Subject: [PATCH 04/20] Python: refactor FoundryHostedAgentHistoryProvider onto Foundry SDK (#5637) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * refactor(foundry_hosting): build FoundryHostedAgentHistoryProvider on azure.ai.agentserver SDK Rebuilds the Foundry hosted-agent history provider on top of ``azure.ai.agentserver``'s ``FoundryStorageProvider`` instead of the in-house ``_HttpStorageBackend``. 
Splits the monolithic ``_responses.py`` into focused modules: - ``_history_provider.py`` — new ``FoundryHostedAgentHistoryProvider`` that talks to the SDK's ``FoundryStorageProvider``, threads ``response_id`` / ``previous_response_id`` through ``ContextVar``s via ``bind_request_context``, and lifts host-bound isolation keys (``x-agent-{user,chat}-isolation-key``) from the optional ``agent_framework_hosting`` package into a provider-local ``IsolationContext`` so the storage layer carries the correct partition keys without channels having to know about them. - ``_shared.py`` — extracts all SDK ``Item`` / ``OutputItem`` ↔ framework ``Message`` conversion helpers into one place so both ``_responses.py`` and the new history provider can share them. Restores ``_convert_file_data`` for inline ``input_file`` payloads, and the hosted-MCP routing for ``custom_tool_call_output`` items whose ``call_id`` carries the ``mcp_*`` prefix. - ``_ids.py`` — shared id helpers. - ``_responses.py`` — shrinks ~700 lines, re-exports converters for back-compat with existing tests. - ``tests/test_history_provider.py`` — exercises the new provider against a fake SDK backend; the host-isolation test is gated on the optional ``agent_framework_hosting`` import. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(foundry_hosting): add local_storage_root for file-based dev history Adds an optional `local_storage_root: str | Path | None` parameter to `FoundryHostedAgentHistoryProvider`. When set and the provider is running outside a Foundry Hosted Agent container, conversations are persisted to JSONL files via `agent_framework.FileHistoryProvider` laid out as: {root}/{user_key or '~none'}/{chat_key or '~none'}/{session_id}.jsonl Hosted mode (FOUNDRY_HOSTING_ENVIRONMENT set) ignores the option with a one-time INFO log so Foundry storage always wins on the platform. The in-memory fallback is unchanged when the option is omitted. 
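A minimal sketch of how the layout above could be resolved, folding in the segment-encoding rule this commit describes under "Path safety". The helper names, the regex allowlist, and the error messages are assumptions for illustration only; the real allowlist is whatever `FileHistoryProvider` uses, and the provider's actual code may differ:

```python
from __future__ import annotations

import base64
import re
from pathlib import Path

# Assumed allowlist; stands in for the one FileHistoryProvider applies to session-id stems.
_SAFE = re.compile(r"[A-Za-z0-9._-]+")
NONE_SENTINEL = "~none"   # directory name used when an isolation key is missing
ENCODED_PREFIX = "~iso-"  # reserved prefix for encoded (unsafe) keys


def encode_segment(key: str | None) -> str:
    """Turn an isolation key into a single safe directory name."""
    if not key:
        return NONE_SENTINEL
    # "~" is outside the allowlist, so real keys starting with "~" are always
    # encoded and cannot collide with the sentinel or the reserved prefix.
    # All-dot names ("." and "..") pass the character check but are unsafe.
    if _SAFE.fullmatch(key) and key.strip("."):
        return key
    raw = base64.urlsafe_b64encode(key.encode("utf-8")).decode("ascii")
    return ENCODED_PREFIX + raw.rstrip("=")


def session_file(root: Path, user_key: str | None, chat_key: str | None, session_id: str) -> Path:
    """Resolve {root}/{user}/{chat}/{session_id}.jsonl and verify containment."""
    if not (_SAFE.fullmatch(session_id) and session_id.strip(".")):
        raise ValueError(f"unsafe session id: {session_id!r}")
    base = root.resolve()
    target = (base / encode_segment(user_key) / encode_segment(chat_key) / f"{session_id}.jsonl").resolve()
    # Second line of defence: the resolved path must still sit under the root.
    if not target.is_relative_to(base):
        raise ValueError(f"{target} escapes {base}")
    return target
```

Under these assumptions, a missing user key and a chat key of `chat-1` would land in `<root>/~none/chat-1/<session_id>.jsonl`, while a key such as `..` or `a/b` is stored under an `~iso-` directory instead of being used verbatim.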
Path safety: isolation segments are validated against the same character allowlist FileHistoryProvider uses for session-id stems and base64-url-encoded with a reserved "~iso-" prefix when unsafe. "~none" sentinel for missing keys can never collide with a real isolation key (real keys starting with "~" are encoded). The resolved target dir is also re-checked to be inside the configured root. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(foundry_hosting): address PR-1 review comments - _shared.py:_capture_raw narrows `except Exception` to `except TypeError` and emits a WARNING with traceback so the lossy fallback to a synthesized round-trip is observable. Mirrors the reviewer suggestion. - _history_provider.py:save_messages narrows `except Exception` to `except FoundryStorageError` so only storage-validation failures (4xx/5xx, opaque server errors) are swallowed. Network / TLS / auth / payload-builder bugs propagate so the caller can retry / alert. Adds an instance-level `failed_writes` counter operators can poll for silent-drop visibility. - _history_provider.py id-stamping loop: drops the `contextlib.suppress(AttributeError, TypeError)` around `item.id = new_id` so SDK contract changes surface in the test suite instead of silently corrupting the chain (the storage backend rejects the entire `create_response` with HTTP 500 when synthetic prefix-based ids leak through). `import contextlib` removed. - tests: * Unit-cover `foundry_response_id` / `foundry_response_id_factory` / `foundry_item_id` so SDK `IdGenerator` contract changes are caught locally. 
* Cover the `save_messages` wire payload: required-by-storage fields (`background`, `parallel_tool_calls`, `instructions`, `agent_reference`), env-var-driven stamping (`FOUNDRY_AGENT_NAME` / `FOUNDRY_AGENT_VERSION` / `FOUNDRY_AGENT_SESSION_ID` / `MODEL_DEPLOYMENT_NAME` with `AZURE_AI_MODEL_DEPLOYMENT_NAME` fallback), and the rule that `model` / `agent_session_id` / `agent_reference.version` are omitted (not stamped to `None`) when their env vars are unset. * Cover the `FOUNDRY_AGENT_SESSION_ID` last-resort chain anchor on both the get and save paths, including the prefix gate that blocks non-`caresp_*`/`resp_*` values from reaching storage, and the precedence rule that a host binding wins over the env. * Replace the old `test_save_messages_swallows_backend_errors` with two tests asserting the new contract: storage errors are swallowed and bump `failed_writes`; everything else propagates and leaves the counter at zero. 141 unit tests pass; mypy + pyright + ruff clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(foundry_hosting): address PR-1 round-2 review comments - Hosted detection now delegates to AgentConfig.from_env().is_hosted so a future Foundry SDK rename of FOUNDRY_HOSTING_ENVIRONMENT propagates automatically; drop the local _ENV_FOUNDRY_HOSTING_ENVIRONMENT constant. - Drop the FOUNDRY_AGENT_SESSION_ID fallback in both get_messages and save_messages: per the SDK it identifies the *container instance*, not the conversation, so chaining off it would silently merge unrelated conversations across container restarts. The host-bound previous_response_id (set by ResponsesChannel) is the only authoritative anchor; the env value is still stamped into the persisted envelope's agent_session_id for operator correlation. - Update module docstring + replace TestFoundryAgentSessionIdAnchor with assertions for the new contract (env var ignored as anchor, still stamped onto persisted envelope, host binding wins). 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * refactor(foundry_hosting): reconcile with upstream main (#5851, #5666) Brings the FoundryHostedAgentHistoryProvider refactor branch back into sync with the foundry_hosting changes that have landed on upstream main since PR-1 was opened: * #5851 (path traversal in checkpoint storage, CWE-22). The workflow-host code in ``_responses.py`` builds a ``FileCheckpointStorage`` from a caller-controlled ``context_id`` (``previous_response_id`` / ``conversation_id`` / ``response_id``). Switch both call sites to route through ``_checkpoint_storage_for_context``, which rejects separators, NUL bytes, drive letters, absolute paths, and all-dot segments, and enforces ``is_relative_to(root)`` before any directory is created. * #5666 (function approval flow). Make the SDK-Item → AF-Message conversion helpers in ``_shared.py`` async and accept an optional ``approval_storage`` keyword: - ``_items_to_messages`` / ``_item_to_message`` / ``_item_to_message_inner`` - ``_output_items_to_messages`` / ``_output_item_to_message`` / ``_output_item_to_message_inner`` For ``mcp_approval_request`` / ``mcp_approval_response`` items the helpers now load the original function-call Content from the approval storage (via ``ApprovalStorage.load_approval_request``) instead of synthesising a placeholder. This matches upstream semantics and lets approval round-trips reconstruct the real payload. The ``ApprovalStorage`` Protocol moves to ``_shared.py`` so the conversion helpers can reference it without pulling in ``_responses.py`` (which would create a circular import). The concrete ``InMemoryFunctionApprovalStorage`` and ``FileBasedFunctionApprovalStorage`` stay in ``_responses.py`` next to the host that owns them, and re-export ``ApprovalStorage`` from ``_shared`` for compatibility. The workflow-host streaming path passes its own ``self._approval_storage`` into ``_to_outputs`` so approval requests are saved at emit time. 
* Bump ``_history_provider.FoundryHostedAgentHistoryProvider.get_messages`` to ``await`` the now-async ``_output_items_to_messages`` call. No public API change beyond the new keyword-only ``approval_storage`` parameter on the four conversion entry points. Validation: - uv run poe check-packages -P foundry_hosting (lint + pyright clean) - uv run poe mypy -P foundry_hosting (clean) - uv run poe test -P foundry_hosting (183 passed, 1 skipped) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../__init__.py | 21 +- .../_history_provider.py | 991 ++++++++++++ .../agent_framework_foundry_hosting/_ids.py | 72 + .../_responses.py | 1399 +++------------- .../_shared.py | 1340 +++++++++++++++ .../tests/test_history_provider.py | 1435 +++++++++++++++++ 6 files changed, 4099 insertions(+), 1159 deletions(-) create mode 100644 python/packages/foundry_hosting/agent_framework_foundry_hosting/_history_provider.py create mode 100644 python/packages/foundry_hosting/agent_framework_foundry_hosting/_ids.py create mode 100644 python/packages/foundry_hosting/agent_framework_foundry_hosting/_shared.py create mode 100644 python/packages/foundry_hosting/tests/test_history_provider.py diff --git a/python/packages/foundry_hosting/agent_framework_foundry_hosting/__init__.py b/python/packages/foundry_hosting/agent_framework_foundry_hosting/__init__.py index 81e8430783c..691353a0e16 100644 --- a/python/packages/foundry_hosting/agent_framework_foundry_hosting/__init__.py +++ b/python/packages/foundry_hosting/agent_framework_foundry_hosting/__init__.py @@ -2,6 +2,16 @@ import importlib.metadata +from ._history_provider import ( + FoundryHostedAgentHistoryProvider, + bind_request_context, + get_current_request_context, +) +from ._ids import ( + foundry_item_id, + foundry_response_id, + foundry_response_id_factory, +) from ._invocations import InvocationsHostServer from ._responses import 
ResponsesHostServer @@ -10,4 +20,13 @@ except importlib.metadata.PackageNotFoundError: __version__ = "0.0.0" -__all__ = ["InvocationsHostServer", "ResponsesHostServer"] +__all__ = [ + "FoundryHostedAgentHistoryProvider", + "InvocationsHostServer", + "ResponsesHostServer", + "bind_request_context", + "foundry_item_id", + "foundry_response_id", + "foundry_response_id_factory", + "get_current_request_context", +] diff --git a/python/packages/foundry_hosting/agent_framework_foundry_hosting/_history_provider.py b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_history_provider.py new file mode 100644 index 00000000000..5b781edb60f --- /dev/null +++ b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_history_provider.py @@ -0,0 +1,991 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Foundry Hosted Agent history provider. + +A standalone :class:`agent_framework.HistoryProvider` implementation that +sources conversation history from the Foundry Hosted Agent storage backend. + +Transport is delegated to the SDK's +:class:`azure.ai.agentserver.responses.FoundryStorageProvider` (when running +inside a Foundry Hosted Agent container) or +:class:`azure.ai.agentserver.responses.InMemoryResponseProvider` (for local +development). Both implement the same read/write surface +(``get_history_item_ids`` / ``get_items`` / ``create_response``), so this +provider's persistence logic stays backend-agnostic. + +Allowed dependencies (deliberately narrow): + +* :mod:`agent_framework` (core, for ``HistoryProvider`` / ``Message``) +* :mod:`azure.ai.agentserver.responses` (for the storage backends, + ``IsolationContext`` typing, and ``OutputItem`` deserialization) +* :mod:`azure.core.credentials_async` (typing of token credentials) + +It MUST NOT depend on any ``agent_framework_hosting*`` package at module +import time. (The host's isolation contextvar is consulted lazily via an +``import`` inside :func:`_host_isolation` so the dependency stays soft.) 
+ +Environment variables read: + +* ``FOUNDRY_HOSTING_ENVIRONMENT`` — non-empty marks "running inside Foundry" + and selects the SDK-backed storage transport. Detection is delegated to + :class:`azure.ai.agentserver.core.AgentConfig` so a future SDK rename + propagates without touching this module. +* ``FOUNDRY_PROJECT_ENDPOINT`` — base URL of the Foundry project; required + when running hosted unless an explicit ``endpoint=`` is supplied. +* ``FOUNDRY_AGENT_NAME`` / ``FOUNDRY_AGENT_VERSION`` — stamped onto the + ``agent_reference`` field of every persisted response envelope. +* ``MODEL_DEPLOYMENT_NAME`` / ``AZURE_AI_MODEL_DEPLOYMENT_NAME`` — model + field stamped on the persisted envelope (must match a real deployment). + +Note on ``FOUNDRY_AGENT_SESSION_ID``: this env var identifies the +*container instance*, not the conversation, so it is **not** consulted as +a fallback ``previous_response_id``. The host-bound +``previous_response_id`` (set by :class:`ResponsesChannel` from the +request envelope) is the authoritative anchor. The value is still +persisted into the ``agent_session_id`` envelope field for operator +correlation only. + +Local fallback: when ``FOUNDRY_HOSTING_ENVIRONMENT`` is unset, the provider +transparently falls back to :class:`InMemoryResponseProvider` so the same +agent code runs in dev. Pass ``local_storage_root`` to use a persistent +file-based store instead of in-memory; histories are then laid out as +``{root}/{user_key or "~none"}/{chat_key or "~none"}/{session_id}.jsonl`` +via :class:`agent_framework.FileHistoryProvider`. 
+""" + +from __future__ import annotations + +import logging +import os +import time +from base64 import urlsafe_b64encode +from contextlib import contextmanager +from contextvars import ContextVar +from dataclasses import dataclass +from pathlib import Path +from typing import TYPE_CHECKING, Any, ClassVar + +from agent_framework import FileHistoryProvider, HistoryProvider, Message +from azure.ai.agentserver.core import AgentConfig +from azure.ai.agentserver.responses import ( + FoundryStorageProvider, + FoundryStorageSettings, + InMemoryResponseProvider, + IsolationContext, +) +from azure.ai.agentserver.responses._id_generator import IdGenerator +from azure.ai.agentserver.responses.models import OutputItem, ResponseObject +from azure.ai.agentserver.responses.store._foundry_errors import ( # pyright: ignore[reportPrivateUsage] + FoundryBadRequestError, + FoundryResourceNotFoundError, + FoundryStorageError, +) + +from ._shared import ( + _messages_to_output_items, # pyright: ignore[reportPrivateUsage] + _output_items_to_messages, # pyright: ignore[reportPrivateUsage] +) + +if TYPE_CHECKING: + from collections.abc import Iterator, Sequence + + from azure.core.credentials_async import AsyncTokenCredential + +logger = logging.getLogger(__name__) + +# Environment variable name — re-declared (not imported) so this module +# stays decoupled from the private ``azure.ai.agentserver.core._config`` +# constants while still matching exactly. Hosted-vs-local detection is +# delegated to :class:`AgentConfig` so a future SDK rename propagates. +_ENV_FOUNDRY_PROJECT_ENDPOINT = "FOUNDRY_PROJECT_ENDPOINT" + +# Per-request isolation context. The owning Channel is expected to set this +# from the inbound request (e.g. user / tenant headers) for the duration of +# an ``agent.run(...)`` call. When unset, requests are made without +# isolation headers (matches how ``ResponseContext`` behaves with no +# ``IsolationContext``). 
+_isolation_var: ContextVar[IsolationContext | None] = ContextVar( + "agent_framework_foundry_hosting_isolation", + default=None, +) + + +def set_current_isolation(isolation: IsolationContext | None) -> Any: + """Set the per-request isolation context for downstream history calls. + + Channels that drive an agent backed by :class:`FoundryHostedAgentHistoryProvider` + should call this before invoking ``agent.run(...)`` and reset the token + afterwards. + + Args: + isolation: The isolation context to associate with the current + ``contextvars`` context, or ``None`` to clear it. + + Returns: + A token suitable for :func:`reset_current_isolation` that restores + the previous value. + """ + return _isolation_var.set(isolation) + + +def reset_current_isolation(token: Any) -> None: + """Restore a previously-saved isolation context. + + Args: + token: A token returned by :func:`set_current_isolation`. + """ + _isolation_var.reset(token) + + +def get_current_isolation() -> IsolationContext | None: + """Return the isolation context bound to the current async context, if any. + + Returns: + The :class:`IsolationContext` for the current request, or ``None`` + when no channel has set one. + """ + return _isolation_var.get() + + +@dataclass(frozen=True) +class _RequestContext: + """Per-request anchors the host binds before invoking the agent. + + ``response_id`` is the id this provider's :meth:`save_messages` call + will write under, so the channel and the storage backend agree on + one stable handle per turn (the channel surfaces the same id on the + response envelope, the next turn arrives with this value as + ``previous_response_id`` and the chain walks). + + ``previous_response_id`` is the prior turn's anchor (``None`` on + first turn). Used to seed ``history_item_ids`` on the new write so + the storage chain stays connected, and to load history without + needing to know the channel's session minting convention. 
+ + Per-request Foundry isolation keys (the + ``x-agent-{user,chat}-isolation-key`` headers) are *not* carried + here; the host's own ASGI middleware lifts them off every inbound + HTTP request into a contextvar + (:func:`agent_framework_hosting.get_current_isolation_keys`) which + this provider consults at storage-call time. Keeping the headers + out of the per-request bind means channels never have to import + Foundry-specific types and the host owns the (intentional) coupling + to those two well-known headers. + """ + + response_id: str + previous_response_id: str | None + + +_request_var: ContextVar[_RequestContext | None] = ContextVar( + "agent_framework_foundry_hosting_request", + default=None, +) + + +@contextmanager +def bind_request_context( + *, + response_id: str, + previous_response_id: str | None = None, + **_unused: Any, +) -> Iterator[None]: + """Bind the per-request response-chain anchors for this provider. + + Intended for the host (or any caller orchestrating an + ``agent.run(...)``) to call immediately before invocation, so the + provider's :meth:`save_messages` writes under a known, stable + ``response_id`` (the same one the channel surfaces to the client) + and walks ``previous_response_id`` for history continuity. Unknown + keyword arguments are accepted and ignored so the host can extend + the ``ChannelRequest.attributes`` contract without breaking existing + providers. Foundry isolation keys flow through a separate + host-installed contextvar; see the class docstring on + :class:`_RequestContext`. + + The binding is scoped to the current ``contextvars.Context``, so + concurrent requests in the same process do not interfere. 
+ """ + token = _request_var.set( + _RequestContext( + response_id=response_id, + previous_response_id=previous_response_id, + ) + ) + try: + yield + finally: + _request_var.reset(token) + + +def get_current_request_context() -> _RequestContext | None: + """Return the per-request response chain anchors, if bound.""" + return _request_var.get() + + +def _host_isolation() -> IsolationContext | None: + """Lift the host-bound isolation contextvar into our local type. + + The host installs an ASGI middleware that reads + ``x-agent-{user,chat}-isolation-key`` off every inbound HTTP request + and stores them in a generic ``IsolationKeys`` slot on a contextvar + we import from :mod:`agent_framework_hosting`. We translate it into + our :class:`IsolationContext` shape on demand so the provider stays + in charge of the storage-side type while the host stays free of any + Foundry-specific dependencies. + """ + # Soft dep: ``agent_framework_hosting`` may not be installed (this + # provider is also usable standalone). The whole block is wrapped in + # ``# pyright: ignore`` so the optional import does not block type + # checking when the package isn't on sys.path; when it is, pyright + # picks up the real types automatically. + try: + from agent_framework_hosting import ( # pyright: ignore[reportMissingImports] + get_current_isolation_keys, # pyright: ignore[reportUnknownVariableType] + ) + except ImportError: # pragma: no cover - hosting is a soft dep + return None + keys = get_current_isolation_keys() # pyright: ignore[reportUnknownVariableType] + if keys is None or keys.is_empty: # pyright: ignore[reportUnknownMemberType] + return None + return IsolationContext( + user_key=keys.user_key, # pyright: ignore[reportUnknownMemberType, reportUnknownArgumentType] + chat_key=keys.chat_key, # pyright: ignore[reportUnknownMemberType, reportUnknownArgumentType] + ) + + +# Type alias for the storage backend surface this provider depends on. 
+# Both ``FoundryStorageProvider`` and ``InMemoryResponseProvider`` from +# ``azure.ai.agentserver.responses`` expose the same +# ``get_history_item_ids`` / ``get_items`` / ``create_response`` methods. +_StorageBackend = "FoundryStorageProvider | InMemoryResponseProvider" + + +# Sentinel directory name used in place of a missing ``user_key`` / +# ``chat_key`` when laying out file-based local history. The tilde +# prefix is reserved (``_is_safe_isolation_segment`` rejects keys that +# start with one) so a real isolation key can never collide with the +# sentinel after sanitisation. +_ISOLATION_NONE_MARKER = "~none" +_ISOLATION_ENCODED_PREFIX = "~iso-" + +# Windows reserved file/directory stems. Mirrors +# ``FileHistoryProvider._WINDOWS_RESERVED_FILE_STEMS`` so the directory +# layer enforces the same portability constraints the file layer does. +_WINDOWS_RESERVED_STEMS = frozenset({ + "CON", + "PRN", + "AUX", + "NUL", + *(f"COM{i}" for i in range(1, 10)), + *(f"LPT{i}" for i in range(1, 10)), +}) + + +def _is_safe_isolation_segment(value: str) -> bool: + """Return whether ``value`` is safe to use directly as a directory name. + + Rules mirror :meth:`FileHistoryProvider._is_literal_session_file_stem_safe`, + with the additional rule that a leading tilde is reserved for our + sentinel/encoded prefixes so real keys can never collide with them. + """ + if ( + not value + or value.startswith((".", "~")) + or value.endswith((" ", ".")) + or value.upper() in _WINDOWS_RESERVED_STEMS + ): + return False + if any(ord(character) < 32 for character in value): + return False + return all(character.isalnum() or character in "._-" for character in value) + + +def _encode_isolation_segment(value: str | None) -> str: + """Encode an isolation key into a filesystem-safe directory name. + + * ``None`` / empty → ``"~none"`` sentinel. + * Already-safe values pass through unchanged. 
+ * Anything else is base64-url-encoded and prefixed with ``"~iso-"`` + so it is unambiguous and never collides with a real (safe) key. + """ + if value is None or value == "": + return _ISOLATION_NONE_MARKER + if _is_safe_isolation_segment(value): + return value + encoded = urlsafe_b64encode(value.encode("utf-8")).decode("ascii").rstrip("=") + return f"{_ISOLATION_ENCODED_PREFIX}{encoded}" + + +class FoundryHostedAgentHistoryProvider(HistoryProvider): + """``HistoryProvider`` backed by Foundry Hosted Agent storage. + + Wraps :class:`azure.ai.agentserver.responses.FoundryStorageProvider` + when running inside a Foundry Hosted Agent container, or + :class:`InMemoryResponseProvider` for local development. The + selection is driven by the ``FOUNDRY_HOSTING_ENVIRONMENT`` + environment variable. + + For local runs that need to *persist* history across process + restarts, pass ``local_storage_root``: the provider then writes + each conversation to + ``{root}/{user_key or "~none"}/{chat_key or "~none"}/{session_id}.jsonl`` + via :class:`agent_framework.FileHistoryProvider`. The Foundry + response-chain semantics (``previous_response_id`` walking, + ``caresp_*`` id stamping, ``ResponseObject`` envelopes) are + bypassed in file mode — the on-disk format is plain JSONL of + :class:`Message` payloads, identical to ``FileHistoryProvider`` + standalone usage. ``local_storage_root`` is ignored when running + hosted (Foundry storage always wins). + + ``session_id`` semantics: in hosted / in-memory mode the value + passed to :meth:`get_messages` and :meth:`save_messages` is treated + as the Responses ``previous_response_id`` (or ``conversation_id``) + whose chain to load. When omitted (and no host-bound chain anchor + is set), :meth:`get_messages` returns an empty list (a fresh + conversation). In file mode ``session_id`` is used as the literal + filename stem (``FileHistoryProvider`` sanitises unsafe values). 
+ """ + + DEFAULT_SOURCE_ID: ClassVar[str] = "foundry_hosted_agent" + + def __init__( + self, + *, + credential: AsyncTokenCredential | None = None, + endpoint: str | None = None, + history_limit: int = 100, + source_id: str = DEFAULT_SOURCE_ID, + load_messages: bool = True, + store_inputs: bool = True, + store_context_messages: bool = False, + store_context_from: set[str] | None = None, + store_outputs: bool = True, + local_storage_root: str | Path | None = None, + ) -> None: + """Initialize the provider. + + Args: + credential: Async token credential used to authenticate against + the Foundry storage API. Required when running hosted + (``FOUNDRY_HOSTING_ENVIRONMENT`` is set). Ignored in + local-mode (the in-memory / file backends need no auth). + endpoint: Foundry project endpoint URL. Defaults to the value + of the ``FOUNDRY_PROJECT_ENDPOINT`` environment variable. + Required when running hosted. + history_limit: Maximum number of history items to fetch per + ``get_messages`` call. Mirrors the agent-server runtime's + ``ResponseContext._history_limit``. Default ``100``. + Ignored in file mode (``FileHistoryProvider`` returns the + full session file each call). + source_id: Unique identifier for this provider instance, as + required by ``HistoryProvider``. + load_messages: Whether to load messages before invocation. + Default ``True``. + store_inputs: Whether to mirror input messages into Foundry + storage. Default ``True`` — the Foundry Hosted Agents + runtime does not persist Responses turns automatically, so + without this the chain would never be visible to subsequent + requests. Set ``False`` only if you know an external writer + is populating storage on your behalf. + store_context_messages: Whether to mirror context-provider + messages. Default ``False``. + store_context_from: If set, only mirror context messages from + these source IDs. + store_outputs: Whether to mirror response messages into Foundry + storage. 
Default ``True`` for the same reason as + ``store_inputs``. + local_storage_root: When set, *and* the provider is running + outside a Foundry Hosted Agent container, persist history + to JSONL files under + ``{root}/{user_key or "~none"}/{chat_key or "~none"}/{session_id}.jsonl`` + instead of using the in-memory backend. Ignored when + hosted (with a one-time INFO log). Defaults to ``None`` + (in-memory local fallback). + """ + super().__init__( + source_id=source_id, + load_messages=load_messages, + store_inputs=store_inputs, + store_context_messages=store_context_messages, + store_context_from=store_context_from, + store_outputs=store_outputs, + ) + + self._history_limit = history_limit + self._credential = credential + self._endpoint = endpoint or os.environ.get(_ENV_FOUNDRY_PROJECT_ENDPOINT) or None + self._backend: FoundryStorageProvider | InMemoryResponseProvider | None = None + + self._local_storage_root: Path | None = ( + Path(local_storage_root).resolve() if local_storage_root is not None else None + ) + # Cache one ``FileHistoryProvider`` per (user_key, chat_key) + # tuple. Bounded by the number of distinct isolation scopes the + # process sees; cleared on ``aclose``. + self._file_providers: dict[tuple[str, str], FileHistoryProvider] = {} + self._hosted_local_root_warned = False + if self._local_storage_root is not None and self.is_hosted_environment(): + self._warn_hosted_local_root_ignored() + + # Observability: number of ``save_messages`` calls dropped by + # :class:`FoundryStorageError` from ``backend.create_response``. + # Operators / health probes can read this attribute directly to + # detect silent persistence loss; never decremented. + self.failed_writes: int = 0 + + @staticmethod + def is_hosted_environment() -> bool: + """Return ``True`` when running inside a Foundry Hosted Agent container. 
+ + Delegates to :meth:`azure.ai.agentserver.core.AgentConfig.from_env` + so the detection rule stays in lockstep with the Foundry SDK; if + the platform ever renames the underlying signal (today + ``FOUNDRY_HOSTING_ENVIRONMENT``) the SDK update is picked up + automatically without a code change here. + """ + return AgentConfig.from_env().is_hosted + + def _resolve_backend(self) -> FoundryStorageProvider | InMemoryResponseProvider: + """Return the storage backend, constructing it lazily on first use. + + * If ``FOUNDRY_HOSTING_ENVIRONMENT`` is set, build a + :class:`FoundryStorageProvider` (requires ``credential`` and a + resolved ``endpoint``). + * Otherwise, fall back to a process-local + :class:`InMemoryResponseProvider` so dev/local runs work without + additional configuration. + """ + if self._backend is not None: + return self._backend + + if self.is_hosted_environment(): + if self._credential is None: + raise RuntimeError( + "FoundryHostedAgentHistoryProvider requires an async credential when running " + "inside a Foundry Hosted Agent container. Pass credential=... ." + ) + if not self._endpoint: + raise RuntimeError( + "FoundryHostedAgentHistoryProvider needs a Foundry project endpoint. Pass " + "endpoint=... or set the FOUNDRY_PROJECT_ENDPOINT environment variable." + ) + self._backend = FoundryStorageProvider( + credential=self._credential, + settings=FoundryStorageSettings.from_endpoint(self._endpoint), + ) + logger.debug( + "FoundryHostedAgentHistoryProvider using FoundryStorageProvider against %s", + self._endpoint, + ) + return self._backend + + logger.info( + "FOUNDRY_HOSTING_ENVIRONMENT is unset — FoundryHostedAgentHistoryProvider falling " + "back to InMemoryResponseProvider for local development.", + ) + self._backend = InMemoryResponseProvider() + return self._backend + + async def aclose(self) -> None: + """Release storage resources held by this provider. + + Safe to call multiple times. 
Closes the lazily-constructed + backend if one was created and drops any cached file-history + providers. ``InMemoryResponseProvider`` and + ``FileHistoryProvider`` have no ``aclose`` and are closed + implicitly on garbage collection. + """ + self._file_providers.clear() + if self._backend is None: + return + aclose = getattr(self._backend, "aclose", None) + if aclose is not None: + await aclose() + self._backend = None + + def _warn_hosted_local_root_ignored(self) -> None: + """Log (once) that ``local_storage_root`` is being ignored under hosted mode.""" + if self._hosted_local_root_warned: + return + self._hosted_local_root_warned = True + logger.info( + "FoundryHostedAgentHistoryProvider ignored local_storage_root=%s because " + "FOUNDRY_HOSTING_ENVIRONMENT is set; Foundry storage takes precedence " + "when hosted.", + self._local_storage_root, + ) + + def _resolve_local_file_provider( + self, + isolation: IsolationContext | None, + ) -> FileHistoryProvider | None: + """Return a ``FileHistoryProvider`` for the current isolation, or ``None``. + + Returns ``None`` when ``local_storage_root`` is unset *or* the + provider is running in hosted mode (in which case Foundry + storage handles persistence). Otherwise builds — and caches — + one provider per (user_key, chat_key) tuple, rooted at the + sanitised ``{root}/{user_segment}/{chat_segment}`` directory. + + Raises: + ValueError: If the resolved isolation directory escapes + ``local_storage_root`` (defence in depth — the + sanitisation should already prevent this). 
+ """ + if self._local_storage_root is None: + return None + if self.is_hosted_environment(): + self._warn_hosted_local_root_ignored() + return None + + user_key = isolation.user_key if isolation is not None else None + chat_key = isolation.chat_key if isolation is not None else None + cache_key = (user_key or "", chat_key or "") + cached = self._file_providers.get(cache_key) + if cached is not None: + return cached + + user_segment = _encode_isolation_segment(user_key) + chat_segment = _encode_isolation_segment(chat_key) + target_dir = (self._local_storage_root / user_segment / chat_segment).resolve() + if not target_dir.is_relative_to(self._local_storage_root): + raise ValueError( + "Isolation segments resolved outside of local_storage_root: " + f"user_key={user_key!r} chat_key={chat_key!r}" + ) + + provider = FileHistoryProvider( + target_dir, + source_id=f"{self.source_id}__file__{user_segment}__{chat_segment}", + load_messages=self.load_messages, + store_inputs=self.store_inputs, + store_context_messages=self.store_context_messages, + store_context_from=self.store_context_from, + store_outputs=self.store_outputs, + ) + self._file_providers[cache_key] = provider + logger.debug( + "FoundryHostedAgentHistoryProvider created file backend for isolation (user=%s, chat=%s) at %s", + user_key, + chat_key, + target_dir, + ) + return provider + + async def get_messages( + self, + session_id: str | None, + *, + state: dict[str, Any] | None = None, + **kwargs: Any, + ) -> list[Message]: + """Load conversation history for the given Foundry response chain. + + Args: + session_id: The Responses ``previous_response_id`` / + ``conversation_id`` to anchor history on. When ``None`` / + empty, an empty history is returned (fresh conversation). + state: Unused — kept for ``HistoryProvider`` compatibility. + **kwargs: Extensibility hook; ``isolation`` may be supplied + explicitly to override the contextvar. 
+ + Returns: + The conversation history materialised as a list of + :class:`agent_framework.Message`, oldest-first. + + Notes: + History anchoring follows the Foundry response-id chain. The + preferred anchor is the per-request ``previous_response_id`` + bound by the host via :func:`bind_request_context` — that's + the prior turn's resp id, written by *this* provider's + previous :meth:`save_messages` call, so the chain is + guaranteed walkable. When unbound (e.g. local dev calling + the provider directly), we fall back to the ``session_id`` + argument as long as it's ``resp_*``-shaped; opaque tokens + (such as chat-isolation-key values) are skipped because the + storage backend rejects them with HTTP 400 "Malformed + identifier". + + When ``local_storage_root`` is configured (and the provider + is running outside a Foundry Hosted Agent container), this + method instead delegates to a per-isolation + :class:`FileHistoryProvider` and ``session_id`` is used as + the literal file stem. + """ + isolation = kwargs.get("isolation") or _host_isolation() or get_current_isolation() + file_provider = self._resolve_local_file_provider(isolation) + if file_provider is not None: + return await file_provider.get_messages(session_id, state=state, **kwargs) + + bound = get_current_request_context() + # Prefer the host-bound previous_response_id over the session_id + # the framework feeds in: the bound value is the id we ourselves + # wrote on the previous turn, so we know it's storage-valid. + anchor = bound.previous_response_id if bound is not None else None + if anchor is None and session_id and session_id.startswith(("caresp_", "resp_")): + anchor = session_id + if anchor is None: + # No walkable anchor → fresh conversation, nothing to load. + # Note: we intentionally do NOT fall back to + # ``FOUNDRY_AGENT_SESSION_ID`` — per the Foundry SDK that env + # var identifies the *container instance*, not the + # conversation, so it doesn't yield a walkable response-id + # chain. 
The host-bound ``previous_response_id`` (set by + # ``ResponsesChannel`` from the request envelope) is the + # authoritative anchor. + return [] + + backend = self._resolve_backend() + + try: + item_ids = await backend.get_history_item_ids( + anchor, + None, + self._history_limit, + isolation=isolation, + ) + except (FoundryBadRequestError, FoundryResourceNotFoundError) as err: + # 400 / 404 here means the anchor isn't storage-valid — treat + # it as an empty history rather than failing the whole request. + logger.debug( + "get_messages: anchor %r rejected by storage (%s); returning empty history", + anchor, + type(err).__name__, + ) + return [] + if not item_ids: + return [] + + items = await backend.get_items(item_ids, isolation=isolation) + # ``get_items`` may return ``None`` placeholders for missing IDs. + resolved = [item for item in items if item is not None] + return await _output_items_to_messages(resolved) + + async def save_messages( + self, + session_id: str | None, + messages: Sequence[Message], + *, + state: dict[str, Any] | None = None, + **kwargs: Any, + ) -> None: + """Persist messages for ``session_id`` into Foundry storage. + + Unlike the standalone ``azure.ai.agentserver`` runtime — which + owns response orchestration end-to-end and writes turns + authoritatively — the Agent Framework hosting stack treats + ``HistoryProvider`` as the *only* persistence path. Without this + method actively writing, a deployed hosted agent would silently + drop every turn. + + Strategy: + + * Use the host-bound ``response_id`` as the envelope id (mints + a fresh ``caresp_*`` id when unbound, e.g. local dev). + * Anchor the new write to the previous turn via + ``previous_response_id``, walking the prior turn's history + item ids forward so the full transcript stays visible. + * Split items by role: ``"message"`` (user/system inputs) into + ``input_items``, everything else (assistant outputs, tool + calls, reasoning, ...) into ``response.output``. 
+ + Args: + session_id: The Responses ``previous_response_id`` / + ``conversation_id`` the messages belong to. + messages: The messages selected for persistence by the base + ``HistoryProvider`` after-run hook. + state: Unused — kept for ``HistoryProvider`` compatibility. + **kwargs: Extensibility hook; ``isolation`` may be supplied + explicitly to override the contextvar. + + Notes: + When ``local_storage_root`` is configured (and the provider + is running outside a Foundry Hosted Agent container), this + method instead delegates to a per-isolation + :class:`FileHistoryProvider` and ``session_id`` is used as + the literal file stem. The Foundry response-chain stamping + described above is bypassed entirely in that mode. + """ + if not messages: + return + + isolation = kwargs.get("isolation") or _host_isolation() or get_current_isolation() + file_provider = self._resolve_local_file_provider(isolation) + if file_provider is not None: + await file_provider.save_messages(session_id, messages, state=state, **kwargs) + return + + bound = get_current_request_context() + # Prefer the host-bound response_id so the channel envelope and + # the storage write agree on a single id per turn — which is + # what makes the next turn's ``previous_response_id`` walkable. + # Without a binding (e.g. local dev calling ``save_messages`` + # directly), fall back to a fresh Foundry-format response id. + # Free-form ``resp_`` ids carry no embedded partition key + # and the storage backend rejects writes with a server error; + # ``IdGenerator.new_response_id()`` mints a ``caresp_*`` id with + # the partition-key segment the backend expects. The chain + # walks only when ``session_id`` is itself a ``caresp_*``-shaped + # value (i.e. a previous response id), matching the prefix the + # ``ResponsesChannel`` factory uses. 
+ if bound is not None: + response_id = bound.response_id + previous_response_id = bound.previous_response_id + else: + if not session_id: + return + response_id = IdGenerator.new_response_id() + previous_response_id = session_id if session_id.startswith(("caresp_", "resp_")) else None + + # Note: we intentionally do NOT consult ``FOUNDRY_AGENT_SESSION_ID`` + # as a fallback ``previous_response_id`` here. Per the Foundry SDK + # that env var identifies the *container instance*, not the + # conversation, so chaining off it produces an unwalkable history. + # The host-bound ``previous_response_id`` (set by + # ``ResponsesChannel`` from the request envelope) is the only + # authoritative anchor; if it's missing the new turn is the start + # of a fresh chain. + + logger.debug( + "save_messages: response_id=%r previous_response_id=%r isolation=%s", + response_id, + previous_response_id, + "" if isolation else "", + ) + backend = self._resolve_backend() + + # The agentserver runtime puts INBOUND items (user/system messages + # the request sent in) in the envelope's ``input_items`` axis and + # OUTBOUND items (assistant outputs, tool calls, reasoning) in + # ``response.output``. See + # ``_resolve_input_items_for_persistence`` (orchestrator.py:61) + + # ``_extract_response_snapshot_from_events`` in + # ``azure.ai.agentserver.responses``: ``input_items`` comes from + # ``ctx.input_items`` (request inputs only); ``response.output`` + # is populated from the lifecycle event stream. + # + # Putting everything in ``input_items`` with ``response.output: []`` + # is a schema violation that the storage backend rejects with an + # opaque HTTP 500. Split by role to mirror the runtime. + all_items = _messages_to_output_items(list(messages), id_prefix=response_id) + + # Re-stamp every item id via ``IdGenerator`` so each carries a + # Foundry-format ``{type-prefix}_<18charPartitionKey><32charEntropy>`` + # identifier, with the response_id as the partition-key hint + # (co-locates each item with the response record).
Free-form + # ``{response_id}_itm_N`` ids are rejected by the storage + # backend with an opaque HTTP 500 because the partition-key + # extractor cannot parse them. ``IdGenerator.new_item_id`` + # dispatches by *Item* (input) type and returns ``None`` for + # our *OutputItem* (storage) instances, so we dispatch by the + # ``type`` discriminator string instead. + ITEM_ID_FACTORY: dict[str, Any] = { + "message": IdGenerator.new_message_item_id, + "output_message": IdGenerator.new_output_message_item_id, + "function_call": IdGenerator.new_function_call_item_id, + "function_call_output": IdGenerator.new_function_call_output_item_id, + "reasoning": IdGenerator.new_reasoning_item_id, + "file_search_call": IdGenerator.new_file_search_call_item_id, + "web_search_call": IdGenerator.new_web_search_call_item_id, + "image_generation_call": IdGenerator.new_image_gen_call_item_id, + "code_interpreter_call": IdGenerator.new_code_interpreter_call_item_id, + "computer_call": IdGenerator.new_computer_call_item_id, + "computer_call_output": IdGenerator.new_computer_call_output_item_id, + "local_shell_call": IdGenerator.new_local_shell_call_item_id, + "local_shell_call_output": IdGenerator.new_local_shell_call_output_item_id, + "mcp_call": IdGenerator.new_mcp_call_item_id, + "mcp_list_tools": IdGenerator.new_mcp_list_tools_item_id, + "mcp_approval_request": IdGenerator.new_mcp_approval_request_item_id, + "mcp_approval_response": IdGenerator.new_mcp_approval_response_item_id, + "custom_tool_call": IdGenerator.new_custom_tool_call_item_id, + "custom_tool_call_output": IdGenerator.new_custom_tool_call_output_item_id, + } + for item in all_items: + factory = ITEM_ID_FACTORY.get(getattr(item, "type", "") or "") + if factory is None: + continue + new_id = factory(response_id) + # Plain attribute assignment — the SDK ``OutputItem`` models + # are ``MutableMapping``s with ``__setattr__`` wired to dict + # set, so this is expected to succeed for every type listed + # above. 
The previous ``contextlib.suppress`` masked SDK + # contract changes (next save would silently retain the + # synthetic prefix-based id and the storage backend would + # reject the entire ``create_response`` with HTTP 500). + # Letting it raise surfaces those breakages to the test + # suite instead. + item.id = new_id # type: ignore[attr-defined] + + input_items: list[Any] = [] + output_items: list[Any] = [] + for item in all_items: + item_type = getattr(item, "type", None) + if item_type == "message": + input_items.append(item) + else: + # ``output_message``, tool calls, reasoning, etc. all + # belong to the response output stream. + output_items.append(item) + + # Walk the previous response's history chain so the new write + # carries the full transcript forward. Without this, each turn + # would only see the messages saved on that very turn. + history_item_ids: list[str] | None = None + if previous_response_id is not None: + try: + history_item_ids = await backend.get_history_item_ids( + previous_response_id, + None, + self._history_limit, + isolation=isolation, + ) + except (FoundryBadRequestError, FoundryResourceNotFoundError) as err: + # Don't let history fetch failures torpedo the write — + # we still want to persist the new turn even if the + # chain seed is unreachable for some reason. + logger.warning( + "save_messages: failed to walk previous_response_id=%r (%s); writing new turn without history seed", + previous_response_id, + type(err).__name__, + ) + + # Mirror what the agentserver runtime serialises onto the wire + # (see ``_extract_response_snapshot_from_events`` + + # ``strip_nulls`` in + # ``azure.ai.agentserver.responses.streaming._helpers``): + # + # * ``agent_reference`` (Required on the response envelope) — + # built from ``FOUNDRY_AGENT_NAME`` / ``FOUNDRY_AGENT_VERSION``, + # which the hosted platform sets per-deploy (sentinel fallback + # for local dev so the envelope stays well-formed). 
+ # * ``agent_session_id`` (S-038) — forcibly stamped by the + # runtime; sourced from ``FOUNDRY_AGENT_SESSION_ID``. + # * ``conversation`` is intentionally omitted: the (user, chat) + # isolation headers are the Foundry storage partition key, + # and the chat-isolation-key value is opaque (the API + # returns "Malformed identifier"/HTTP 400 if used as a + # body-level ``conversation_id``). + # * Per-item ``response_id`` / ``agent_reference`` are NOT + # stamped here — those B20/B21 defaults only apply to items + # inside ``response.output_item.added/done`` *events* (see + # ``_coerce_handler_event``); items inside ``input_items`` + # and ``response.output`` go through ``to_output_item`` which + # never sets these fields, and the storage validator returns + # HTTP 400 ``invalid_payload`` when extras leak in. + agent_name = os.environ.get("FOUNDRY_AGENT_NAME") or "agent-framework-host" + agent_version = os.environ.get("FOUNDRY_AGENT_VERSION") or None + agent_reference: dict[str, Any] = {"type": "agent_reference", "name": agent_name} + if agent_version: + agent_reference["version"] = agent_version + + agent_session_id = os.environ.get("FOUNDRY_AGENT_SESSION_ID") or None + # ``model`` must be a real deployed model name — the storage + # validator rejects arbitrary strings. Pull it from the + # platform-provided ``MODEL_DEPLOYMENT_NAME`` (set in agent.yaml) + # and fall back to ``AZURE_AI_MODEL_DEPLOYMENT_NAME`` for local + # dev. When neither is set we omit the field entirely (it is + # ``Optional[str]`` per the ResponseObject schema). 
+ model_deployment = ( + os.environ.get("MODEL_DEPLOYMENT_NAME") or os.environ.get("AZURE_AI_MODEL_DEPLOYMENT_NAME") or None + ) + + # Build the wire payload to match exactly what the agentserver + # runtime emits via ``_extract_response_snapshot_from_events`` + # for a synthetic ``status=completed`` snapshot: + # + # {id, object, output, created_at, [model], agent_reference, + # status, completed_at, [agent_session_id]} + # + # ``previous_response_id`` is appended when chaining; the runtime + # threads it through the same code path. + now = int(time.time()) + response_body: dict[str, Any] = { + "id": response_id, + # SDK mirror: ``streaming/_helpers.py:244`` always stamps + # ``response_id`` alongside ``id`` on the snapshot before it + # reaches ``serialize_create_request``. + "response_id": response_id, + "object": "response", + # S-040 auto-stamp: the orchestrator (``_orchestrator.py:1706``) + # echoes ``background`` from the request to every response + # envelope; storage rejects payloads that omit it. + "background": False, + # ``ResponseObject`` schema (``_models.py:13995``) declares + # ``parallel_tool_calls: bool`` as REQUIRED. The SDK's synthetic + # fallback path (``_build_events``) never sets it because it's + # only invoked for failure recovery; real handler events carry + # it through. Storage rejects payloads that omit it. + "parallel_tool_calls": False, + # Same story for ``instructions`` (``_models.py:13989``) — + # required ``str | list[Item]`` field. 
+ "instructions": "", + "output": [item.as_dict() for item in output_items], + "created_at": now, + "agent_reference": agent_reference, + "status": "completed", + "completed_at": now, + } + if model_deployment is not None: + response_body["model"] = model_deployment + if agent_session_id is not None: + response_body["agent_session_id"] = agent_session_id + if previous_response_id is not None: + response_body["previous_response_id"] = previous_response_id + response = ResponseObject(response_body) + + try: + await backend.create_response( + response, + input_items=input_items, + history_item_ids=history_item_ids, + isolation=isolation, + ) + except FoundryStorageError as exc: + # Storage-validation failures (4xx ``invalid_payload`` / + # ``not_found``, opaque 5xx) are best-effort losses: the + # caller's run already produced output and we don't want to + # crash the whole turn over a chain-write the user can't + # recover from. They are still observable: every drop bumps + # ``failed_writes`` (operators can poll it / surface in + # health probes) and the full traceback + ``response_body`` + # is logged. + # + # Network / TLS / DNS errors, expired-credential 401/403s, + # and bugs in the wire-payload builder above (e.g. a + # required-field regression) deliberately propagate so they + # surface to the caller and trigger retry / alerting paths + # instead of being silently dropped here. 
+ self.failed_writes += 1 + err_body = getattr(exc, "response_body", None) + logger.exception( + "FoundryHostedAgentHistoryProvider.save_messages: storage rejected " + "%d message(s) (response_id=%s, previous_response_id=%s, error_body=%s, " + "failed_writes=%d).", + len(messages), + response_id, + previous_response_id, + err_body, + self.failed_writes, + ) + return + logger.debug( + "FoundryHostedAgentHistoryProvider.save_messages: persisted %d message(s) " + "(response_id=%s, previous_response_id=%s).", + len(messages), + response_id, + previous_response_id, + ) + + +# Re-export ``OutputItem`` for callers that want to construct test items +# without reaching into the SDK's ``models`` namespace directly. +__all__ = [ + "FoundryHostedAgentHistoryProvider", + "OutputItem", + "bind_request_context", + "get_current_isolation", + "get_current_request_context", + "reset_current_isolation", + "set_current_isolation", +] diff --git a/python/packages/foundry_hosting/agent_framework_foundry_hosting/_ids.py b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_ids.py new file mode 100644 index 00000000000..588231d073c --- /dev/null +++ b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_ids.py @@ -0,0 +1,72 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Foundry-storage-compatible identifier helpers. + +The Foundry hosted-agent storage backend partitions records by extracting +an embedded partition-key segment from every record/item id. The id +format is ``{prefix}_{18charPartitionKey}{32charEntropy}`` (or a 48-char +legacy body). Free-form ids such as ``resp_`` carry no valid +partition key and the storage API rejects writes with an opaque +``HTTP 500 server_error``. + +These helpers wrap :class:`azure.ai.agentserver.responses._id_generator.IdGenerator` +so callers (e.g. 
the ``ResponsesChannel.response_id_factory`` argument +or :class:`FoundryHostedAgentHistoryProvider.save_messages`) can mint +ids that the storage backend accepts without leaking the SDK import +path into user code. +""" + +from __future__ import annotations + +from typing import Any + +from azure.ai.agentserver.responses._id_generator import IdGenerator + +__all__ = [ + "foundry_item_id", + "foundry_response_id", + "foundry_response_id_factory", +] + + +def foundry_response_id(previous_response_id: str | None = None) -> str: + """Mint a Foundry-storage-compatible response id (``caresp_*``). + + Args: + previous_response_id: When supplied (and shaped like a Foundry + id with an embedded partition key), the new id co-locates + with the chain by reusing that partition key. The storage + backend rejects chained writes whose new record sits in a + different partition than the prior one. + + Returns: + A new id of the form ``caresp_<18charPartitionKey><32charEntropy>``. + """ + return IdGenerator.new_response_id(previous_response_id or "") + + +def foundry_response_id_factory() -> "Any": + """Return a callable suitable for ``ResponsesChannel(response_id_factory=...)``. + + The returned callable accepts an optional ``previous_response_id`` + hint which the channel passes for chained turns so the new id + inherits the prior turn's partition key (Foundry storage requirement). + """ + return foundry_response_id + + +def foundry_item_id(item: "Any", response_id: str | None = None) -> str | None: + """Mint a Foundry-storage-compatible item id for *item*. + + Dispatches via :meth:`IdGenerator.new_item_id` so the id picks up + the right type prefix (``msg`` / ``om`` / ``fc`` / ``rs`` / ...). + When ``response_id`` is supplied it acts as a partition-key hint so + every item written under one response co-locates with the response + record (Foundry storage requirement). 
+ + Returns: + A new id of the form ``{type-prefix}_<18charPartitionKey><32charEntropy>``, + or ``None`` when *item* is an unrecognised / reference-only type + (mirrors the SDK helper's contract). + """ + return IdGenerator.new_item_id(item, response_id) diff --git a/python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py index 3f9ae41b97b..567335d2fac 100644 --- a/python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py +++ b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py @@ -3,17 +3,15 @@ from __future__ import annotations import asyncio -import base64 import json import logging import os import tempfile import threading -from collections.abc import AsyncIterable, AsyncIterator, Generator, Mapping, Sequence +from collections.abc import AsyncIterable, AsyncIterator, Generator from contextlib import AbstractAsyncContextManager, AsyncExitStack, suppress -from dataclasses import asdict, dataclass, is_dataclass from pathlib import Path -from typing import Protocol, cast +from typing import cast from agent_framework import ( ChatOptions, @@ -21,7 +19,6 @@ ContextProvider, FileCheckpointStorage, HistoryProvider, - Message, RawAgent, SupportsAgentRun, WorkflowAgent, @@ -32,78 +29,10 @@ ResponseEventStream, ResponseProviderProtocol, ResponsesServerOptions, + models, ) from azure.ai.agentserver.responses._id_generator import IdGenerator from azure.ai.agentserver.responses.hosting import ResponsesAgentServerHost -from azure.ai.agentserver.responses.models import ( - ApplyPatchToolCallItemParam, - ApplyPatchToolCallOutputItemParam, - ComputerCallOutputItemParam, - ComputerScreenshotContent, - CreateResponse, - FunctionCallOutputItemParam, - FunctionShellAction, - FunctionShellCallItemParam, - FunctionShellCallOutputContent, - FunctionShellCallOutputExitOutcome, - FunctionShellCallOutputItemParam, - Item, - ItemCodeInterpreterToolCall, - ItemComputerToolCall,
- ItemCustomToolCall, - ItemCustomToolCallOutput, - ItemFileSearchToolCall, - ItemFunctionToolCall, - ItemImageGenToolCall, - ItemLocalShellToolCall, - ItemLocalShellToolCallOutput, - ItemMcpApprovalRequest, - ItemMcpToolCall, - ItemMessage, - ItemOutputMessage, - ItemReasoningItem, - ItemWebSearchToolCall, - LocalEnvironmentResource, - MCPApprovalResponse, - MessageContent, - MessageContentInputFileContent, - MessageContentInputImageContent, - MessageContentInputTextContent, - MessageContentOutputTextContent, - MessageContentReasoningTextContent, - MessageContentRefusalContent, - MessageRole, - OAuthConsentRequestOutputItem, - OutputItem, - OutputItemApplyPatchToolCall, - OutputItemApplyPatchToolCallOutput, - OutputItemCodeInterpreterToolCall, - OutputItemComputerToolCall, - OutputItemComputerToolCallOutputResource, - OutputItemCustomToolCall, - OutputItemCustomToolCallOutput, - OutputItemFileSearchToolCall, - OutputItemFunctionShellCall, - OutputItemFunctionShellCallOutput, - OutputItemFunctionToolCall, - OutputItemImageGenToolCall, - OutputItemLocalShellToolCall, - OutputItemLocalShellToolCallOutput, - OutputItemMcpApprovalRequest, - OutputItemMcpApprovalResponseResource, - OutputItemMcpToolCall, - OutputItemMessage, - OutputItemOutputMessage, - OutputItemReasoningItem, - OutputItemWebSearchToolCall, - OutputMessageContent, - OutputMessageContentOutputTextContent, - OutputMessageContentRefusalContent, - ResponseStreamEvent, - StructuredOutputsOutputItem, - SummaryTextContent, - TextContent, -) from azure.ai.agentserver.responses.streaming._builders import ( OutputItemFunctionCallBuilder, OutputItemMcpCallBuilder, @@ -115,22 +44,43 @@ from mcp import McpError from typing_extensions import Any -logger = logging.getLogger(__name__) - -_AZURE_RESPONSES_MESSAGE_ROLE_TYPE = f"{MessageRole.__module__}:{MessageRole.__qualname__}" - +from ._shared import ( + ApprovalStorage, + _arguments_to_str, # pyright: ignore[reportPrivateUsage] + _convert_message_content, # pyright: 
ignore[reportPrivateUsage] + _convert_output_message_content, # pyright: ignore[reportPrivateUsage] + _item_to_message, # pyright: ignore[reportPrivateUsage] + _items_to_messages, # pyright: ignore[reportPrivateUsage] + _output_item_to_message, # pyright: ignore[reportPrivateUsage] + _output_items_to_messages, # pyright: ignore[reportPrivateUsage] +) -# region Approval Storage -class ApprovalStorage(Protocol): - """Storage for saving function approval requests.""" +# Re-export the conversion helpers under their historical names so existing +# tests (which import them from this module) keep working — the canonical +# definitions now live in :mod:`._shared`. +__all__ = ( + "ApprovalStorage", + "_arguments_to_str", + "_convert_message_content", + "_convert_output_message_content", + "_item_to_message", + "_items_to_messages", + "_output_item_to_message", + "_output_items_to_messages", +) - async def save_approval_request(self, approval_request_id: str, request: Content) -> None: - """Save a function approval request under the given ID.""" - ... +# Local aliases for the agent-server SDK types this module touches at the +# Python type-annotation layer. Using ``models.X`` everywhere would work but +# would noisily clutter type-only positions where the alias adds no value. +CreateResponse = models.CreateResponse +ResponseStreamEvent = models.ResponseStreamEvent +FunctionShellAction = models.FunctionShellAction +FunctionShellCallOutputContent = models.FunctionShellCallOutputContent +FunctionShellCallOutputExitOutcome = models.FunctionShellCallOutputExitOutcome +LocalEnvironmentResource = models.LocalEnvironmentResource +OAuthConsentRequestOutputItem = models.OAuthConsentRequestOutputItem - async def load_approval_request(self, approval_request_id: str) -> Content: - """Load a function approval request by its ID.""" - ... 
+logger = logging.getLogger(__name__) class InMemoryFunctionApprovalStorage: @@ -252,85 +202,35 @@ def _checkpoint_storage_for_context(root: str, context_id: str) -> FileCheckpoin storage_path = (root_path / context_id).resolve() if not storage_path.is_relative_to(root_path): raise RuntimeError(f"Invalid checkpoint context id: {context_id!r}") - return FileCheckpointStorage( - storage_path, - # Keep this provider-specific allowlist narrow. Hosted workflow - # checkpoints can persist Azure's role enum inside Message objects. - allowed_checkpoint_types=[_AZURE_RESPONSES_MESSAGE_ROLE_TYPE], - ) + return FileCheckpointStorage(storage_path) # endregion Approval Storage # Foundry Toolbox Auth integration # Consent-URL error code returned by the Foundry MCP gateway when calling `/list` -CONSENT_ERROR_CODE = -32006 - +CONSENT_ERROR_CODE = -32007 -@dataclass -class ConsentError: - name: str - consent_url: str +def consent_url_from_error(exc: BaseException) -> str | None: + """Return the consent URL when ``exc`` wraps a Foundry MCP gateway consent error. -def consent_url_from_error(exc: BaseException) -> list[ConsentError] | None: - """Return the consent URLs when ``exc`` wraps Foundry MCP gateway consent errors. + The Agent Framework MCP layer surfaces gateway consent failures by wrapping the underlying + ``McpError`` inside an :class:`AgentFrameworkException` (typically a ``ToolExecutionException`` + raised from ``MCPStreamableHTTPTool.__aenter__``). This helper inspects ``exc.args`` for a + wrapped ``McpError`` whose ``error.code`` is :data:`CONSENT_ERROR_CODE`; when found, the + consent link the gateway returned in ``error.message`` is returned. Returns ``None`` for + anything else, so callers can do ``if (url := consent_url_from_error(ex)) is None: raise``. Args: exc: The exception to inspect. Returns: - The consent URL(s) extracted from the error, or ``None`` if no consent error was found. + The consent URL if ``exc`` wraps a consent ``McpError``, otherwise ``None``. 
""" inner_exception = next((arg for arg in exc.args if isinstance(arg, McpError)), None) if inner_exception is not None and inner_exception.error.code == CONSENT_ERROR_CODE: - # Parse the error message - # The error message is structured with the following format: - # "tools/list failed for 1 tool source(s), succeeded for 0 tool source(s) {"errors":[{"name": ..." - # where the second part is a JSON string that can be deserialized into an object with the following shape: - # ruff: disable[ERA001] - # { - # "errors" : [ - # { - # "name": "Name of the MCP tool that requires consent", - # "type" : "mcp", - # "error": { - # "code": "CONSENT_REQUIRED", - # "message": consent_url, - # } - # } - # ] - # } - # ruff: enable[ERA001] - try: - consent_errors: list[ConsentError] = [] - error_message_start = inner_exception.error.message.find("{") - if error_message_start == -1: - logger.warning("Consent error message does not contain JSON: %s", inner_exception.error.message) - return None - consent_details_json = inner_exception.error.message[error_message_start:] - consent_details = json.loads(consent_details_json) - if "errors" not in consent_details or not isinstance(consent_details["errors"], list): - logger.warning("Consent error message JSON does not contain 'errors' list: %s", consent_details_json) - return None - for error in consent_details["errors"]: - if ( - isinstance(error, dict) - and error.get("type") == "mcp" # type: ignore - and "error" in error - and isinstance(error["error"], dict) - and error["error"].get("code") == "CONSENT_REQUIRED" # type: ignore - and "message" in error["error"] - ): - consent_url = error["error"]["message"] # type: ignore - if isinstance(consent_url, str): - consent_errors.append(ConsentError(name=error.get("name", "Unknown"), consent_url=consent_url)) # type: ignore - else: - logger.warning("Consent URL in error message is not a valid URL: %s", consent_url) # type: ignore - if consent_errors: - return consent_errors - except 
json.JSONDecodeError: - logger.warning("Failed to parse consent details JSON: %s", inner_exception.error.message) + return inner_exception.error.message return None @@ -461,70 +361,71 @@ async def _handle_inner_agent( context: ResponseContext, ) -> AsyncIterable[ResponseStreamEvent | dict[str, Any]]: """Handle the creation of a response for a regular (non-workflow) agent.""" + input_items = await context.get_input_items() + input_messages = await _items_to_messages(input_items, approval_storage=self._approval_storage) + + history = await context.get_history() + run_kwargs: dict[str, Any] = { + "messages": [ + *(await _output_items_to_messages(history, approval_storage=self._approval_storage)), + *input_messages, + ] + } + is_streaming_request = request.stream is not None and request.stream is True + + chat_options, are_options_set = _to_chat_options(request) + response_event_stream = ResponseEventStream(response_id=context.response_id, model=request.model) + yield response_event_stream.emit_created() yield response_event_stream.emit_in_progress() - # Track the current active output item builder for streaming; - # lazily created on matching content, closed when a different type arrives. - tracker: _OutputItemTracker | None = None + if are_options_set and not isinstance(self._agent, RawAgent): + logger.warning("Agent doesn't support runtime options. They will be ignored.") + else: + run_kwargs["options"] = chat_options + # Lazy-enter the agent (and any MCP tools it owns). The MCP client wraps gateway + # consent failures (and other connection-time errors) in AgentFrameworkException; if + # one of those is a consent error we surface the consent link to the client through + # the already-opened response stream instead of crashing the request. Other exception + # types propagate normally so the host can handle / log them. 
try: - input_items = await context.get_input_items() - input_messages = await _items_to_messages(input_items, approval_storage=self._approval_storage) - - history = await context.get_history() - run_kwargs: dict[str, Any] = { - "messages": [ - *(await _output_items_to_messages(history, approval_storage=self._approval_storage)), - *input_messages, - ] - } - is_streaming_request = request.stream is not None and request.stream is True - - chat_options, are_options_set = _to_chat_options(request) - - if are_options_set and not isinstance(self._agent, RawAgent): - logger.warning("Agent doesn't support runtime options. They will be ignored.") - else: - run_kwargs["options"] = chat_options - - # Lazy-enter the agent (and any MCP tools it owns). The MCP client wraps gateway - # consent failures (and other connection-time errors) in AgentFrameworkException; if - # one of those is a consent error we surface the consent link to the client through - # the already-opened response stream instead of failing the request. Other exception - # types fall through to the outer handler below and become ``response.failed``. 
- try: - await self._ensure_agent_ready() - except AgentFrameworkException as ex: - consent_errors = consent_url_from_error(ex) - if consent_errors is None: - raise - for consent_error in consent_errors: - logger.warning("Consent URL for tool '%s': %s", consent_error.name, consent_error.consent_url) - oauth_item = OAuthConsentRequestOutputItem( - id=IdGenerator.new_id("oacr"), - consent_link=consent_error.consent_url, - server_label=consent_error.name, - ) - builder = response_event_stream.add_output_item(oauth_item.id) - yield builder.emit_added(oauth_item) - yield builder.emit_done(oauth_item) - yield response_event_stream.emit_completed() - return + await self._ensure_agent_ready() + except AgentFrameworkException as ex: + consent_url = consent_url_from_error(ex) + if consent_url is None: + raise + logger.warning("OAuth consent required for Foundry MCP gateway.") + oauth_item = OAuthConsentRequestOutputItem( + id=IdGenerator.new_id("oacr"), + consent_link=consent_url, + server_label="Foundry Toolbox", + ) + builder = response_event_stream.add_output_item(oauth_item.id) + yield builder.emit_added(oauth_item) + yield builder.emit_done(oauth_item) + yield response_event_stream.emit_completed() + return - tracker = _OutputItemTracker(response_event_stream) if is_streaming_request else None + # Track the current active output item builder for streaming; + # lazily created on matching content, closed when a different type arrives. 
+ tracker: _OutputItemTracker | None = _OutputItemTracker(response_event_stream) if is_streaming_request else None + try: if not is_streaming_request: # Run the agent in non-streaming mode response = await self._agent.run(stream=False, **run_kwargs) # type: ignore[reportUnknownMemberType] - async for item in _to_outputs_for_messages( - response_event_stream, - response.messages, - approval_storage=self._approval_storage, - ): - yield item + for message in response.messages: + for content in message.contents: + async for item in _to_outputs( + response_event_stream, + content, + approval_storage=self._approval_storage, + ): + yield item + yield response_event_stream.emit_completed() else: if tracker is None: # pragma: no cover - defensive, set above raise RuntimeError("Streaming tracker was not initialized.") @@ -545,158 +446,160 @@ async def _handle_inner_agent( # Close any remaining active builder for event in tracker.close(): yield event - yield response_event_stream.emit_completed() - except Exception as ex: - logger.exception("Failed to produce response for agent") - for event in self._emit_failure(response_event_stream, tracker, ex): - yield event + yield response_event_stream.emit_completed() + except Exception: + # Drain any in-progress streaming builder before emitting consent + # so the resulting stream stays well-formed. + if tracker is not None: + for event in tracker.close(): + yield event + yield response_event_stream.emit_completed() + raise async def _handle_inner_workflow( self, request: CreateResponse, context: ResponseContext, ) -> AsyncIterable[ResponseStreamEvent | dict[str, Any]]: - """Handle the creation of a response for a workflow agent.""" + """Handle the creation of a response for a workflow agent. + + Why this is required: + The sandbox may be deactivated after some period of inactivity, and only data managed + by the hosting infrastructure or files will be preserved upon deactivation. 
+ """ + input_items = await context.get_input_items() + input_messages = await _items_to_messages(input_items, approval_storage=self._approval_storage) + is_streaming_request = request.stream is not None and request.stream is True + + _, are_options_set = _to_chat_options(request) + if are_options_set: + logger.warning("Workflow agent doesn't support runtime options. They will be ignored.") + + if request.previous_response_id is not None and context.conversation_id is not None: + raise RuntimeError("Previous response ID cannot be used in conjunction with conversation ID.") + context_id = request.previous_response_id or context.conversation_id + + # The following should never happen due to the checks above. + # This is for type safety and defensive programming. + if self._checkpoint_storage_path is None: + raise RuntimeError("Checkpoint storage path is not configured for workflow agent.") + if not isinstance(self._agent, WorkflowAgent): + raise RuntimeError("Agent is not a workflow agent.") + + # Workflow agents are not async context managers in any built-in path, + # but call _ensure_agent_ready for symmetry with the regular path so + # any future async resources owned by the workflow are entered here. + await self._ensure_agent_ready() + + # Determine the latest checkpoint (if any) so we can resume the + # workflow's prior state for this turn. The directory is keyed by + # the inbound context id (conversation_id when set, otherwise + # previous_response_id). Multi-turn declarative workflows need the + # workflow's internal state (e.g. Conversation.messages, + # intermediate Local.* variables) to survive across user turns; + # the only place that state lives is the workflow checkpoint, so + # on every turn we restore the latest checkpoint and feed the new + # input back into the start executor as a continuation rather than + # a fresh run. 
+ latest_checkpoint_id: str | None = None + restore_storage: FileCheckpointStorage | None = None + if context_id is not None: + restore_storage = _checkpoint_storage_for_context(self._checkpoint_storage_path, context_id) + latest_checkpoint = await restore_storage.get_latest(workflow_name=self._agent.workflow.name) + if latest_checkpoint is not None: + latest_checkpoint_id = latest_checkpoint.checkpoint_id + + # Storage that will receive checkpoints written during this turn. + # When the caller chains with previous_response_id, the next turn + # will reference the current response_id as its previous_response_id, + # so new checkpoints must land under the current response_id (or the + # conversation_id when set). When conversation_id is set, this + # matches restore_storage; when only previous_response_id was + # supplied, restore_storage points at the *prior* response's + # directory and checkpoint_storage points at the *current* response's. + write_context_id = context.conversation_id or context.response_id + checkpoint_storage = _checkpoint_storage_for_context(self._checkpoint_storage_path, write_context_id) + + # Multi-turn pattern: when we have a prior checkpoint, restore it + # first (drive the workflow back to idle with prior state intact), + # then make a separate call that delivers the new user input. This + # depends on Workflow.run preserving shared state across calls. The + # restore-only call may yield events from any pending in-flight + # work in the checkpoint; we consume those internally here so they + # don't surface to the response stream as duplicates. + # + # If the restored checkpoint had pending request_info events, the + # restore-only call replays them through + # ``WorkflowAgent._convert_workflow_event_to_agent_response_updates`` + # and populates ``self._agent.pending_requests``. 
That is the correct + # state: those requests are genuinely outstanding, and the next + # ``run(input_messages, ...)`` call may contain ``function_call_output`` + # items (carried as FunctionResult/FunctionApprovalResponse content) + # that fulfill them via :meth:`WorkflowAgent._process_pending_requests`. + if latest_checkpoint_id is not None: + if restore_storage is None: # pragma: no cover - defensive + raise RuntimeError("Checkpoint restore storage is not configured.") + if is_streaming_request: + async for _ in self._agent.run( + stream=True, + checkpoint_id=latest_checkpoint_id, + checkpoint_storage=restore_storage, + ): + pass + else: + await self._agent.run( + stream=False, + checkpoint_id=latest_checkpoint_id, + checkpoint_storage=restore_storage, + ) + + # Now run the agent with the latest input response_event_stream = ResponseEventStream(response_id=context.response_id, model=request.model) + yield response_event_stream.emit_created() yield response_event_stream.emit_in_progress() - # Track the current active output item builder for streaming; - # lazily created on matching content, closed when a different type arrives. - tracker: _OutputItemTracker | None = None - - try: - input_items = await context.get_input_items() - input_messages = await _items_to_messages(input_items, approval_storage=self._approval_storage) - is_streaming_request = request.stream is not None and request.stream is True - - _, are_options_set = _to_chat_options(request) - if are_options_set: - logger.warning("Workflow agent doesn't support runtime options. They will be ignored.") - - if request.previous_response_id is not None and context.conversation_id is not None: - raise RuntimeError("Previous response ID cannot be used in conjunction with conversation ID.") - context_id = request.previous_response_id or context.conversation_id - - # The following should never happen due to the checks above. - # This is for type safety and defensive programming. 
- if self._checkpoint_storage_path is None: - raise RuntimeError("Checkpoint storage path is not configured for workflow agent.") - if not isinstance(self._agent, WorkflowAgent): - raise RuntimeError("Agent is not a workflow agent.") - - # Workflow agents are not async context managers in any built-in path, - # but call _ensure_agent_ready for symmetry with the regular path so - # any future async resources owned by the workflow are entered here. - await self._ensure_agent_ready() + if not is_streaming_request: + # Run the agent in non-streaming mode + response = await self._agent.run(input_messages, stream=False, checkpoint_storage=checkpoint_storage) - # Determine the latest checkpoint (if any) so we can resume the - # workflow's prior state for this turn. The directory is keyed by - # the inbound context id (conversation_id when set, otherwise - # previous_response_id). Multi-turn declarative workflows need the - # workflow's internal state (e.g. Conversation.messages, - # intermediate Local.* variables) to survive across user turns; - # the only place that state lives is the workflow checkpoint, so - # on every turn we restore the latest checkpoint and feed the new - # input back into the start executor as a continuation rather than - # a fresh run. - latest_checkpoint_id: str | None = None - restore_storage: FileCheckpointStorage | None = None - if context_id is not None: - restore_storage = _checkpoint_storage_for_context(self._checkpoint_storage_path, context_id) - latest_checkpoint = await restore_storage.get_latest(workflow_name=self._agent.workflow.name) - if latest_checkpoint is not None: - latest_checkpoint_id = latest_checkpoint.checkpoint_id - - # Storage that will receive checkpoints written during this turn. - # When the caller chains with previous_response_id, the next turn - # will reference the current response_id as its previous_response_id, - # so new checkpoints must land under the current response_id (or the - # conversation_id when set). 
When conversation_id is set, this - # matches restore_storage; when only previous_response_id was - # supplied, restore_storage points at the *prior* response's - # directory and write_storage points at the *current* response's. - write_context_id = context.conversation_id or context.response_id - write_storage = _checkpoint_storage_for_context(self._checkpoint_storage_path, write_context_id) - - # Multi-turn pattern: when we have a prior checkpoint, restore it - # first (drive the workflow back to idle with prior state intact), - # then make a separate call that delivers the new user input. This - # depends on Workflow.run preserving shared state across calls. The - # restore-only call may yield events from any pending in-flight - # work in the checkpoint; we consume those internally here so they - # don't surface to the response stream as duplicates. - # - # If the restored checkpoint had pending request_info events, the - # restore-only call replays them through - # ``WorkflowAgent._convert_workflow_event_to_agent_response_updates`` - # and populates ``self._agent.pending_requests``. That is the correct - # state: those requests are genuinely outstanding, and the next - # ``run(input_messages, ...)`` call may contain ``function_call_output`` - # items (carried as FunctionResult/FunctionApprovalResponse content) - # that fulfill them via :meth:`WorkflowAgent._process_pending_requests`. 
- if latest_checkpoint_id is not None: - if is_streaming_request: - async for _ in self._agent.run( - stream=True, - checkpoint_id=latest_checkpoint_id, - checkpoint_storage=restore_storage, + for message in response.messages: + for content in message.contents: + async for item in _to_outputs( + response_event_stream, + content, + approval_storage=self._approval_storage, ): - pass - else: - await self._agent.run( - stream=False, - checkpoint_id=latest_checkpoint_id, - checkpoint_storage=restore_storage, - ) + yield item - if not is_streaming_request: - # Run the agent in non-streaming mode with the new user input. - response = await self._agent.run( - input_messages, - stream=False, - checkpoint_storage=write_storage, - ) - - async for item in _to_outputs_for_messages( - response_event_stream, - response.messages, - approval_storage=self._approval_storage, - ): - yield item + await self._delete_not_latest_checkpoints(checkpoint_storage, self._agent.workflow.name) + yield response_event_stream.emit_completed() + return - await self._delete_not_latest_checkpoints(write_storage, self._agent.workflow.name) - yield response_event_stream.emit_completed() - return + # Track the current active output item builder for streaming; + # lazily created on matching content, closed when a different type arrives. + tracker = _OutputItemTracker(response_event_stream) - tracker = _OutputItemTracker(response_event_stream) - - # Run the workflow agent in streaming mode with the new user input. 
- async for update in self._agent.run( - input_messages, - stream=True, - checkpoint_storage=write_storage, - ): - for content in update.contents: - for event in tracker.handle(content): - yield event - if tracker.needs_async: - async for item in _to_outputs( - response_event_stream, content, approval_storage=self._approval_storage - ): - yield item - tracker.needs_async = False + # Run the workflow agent in streaming mode + async for update in self._agent.run(input_messages, stream=True, checkpoint_storage=checkpoint_storage): + for content in update.contents: + for event in tracker.handle(content): + yield event + if tracker.needs_async: + async for item in _to_outputs( + response_event_stream, + content, + approval_storage=self._approval_storage, + ): + yield item + tracker.needs_async = False - # Close any remaining active builder - for event in tracker.close(): - yield event + # Close any remaining active builder + for event in tracker.close(): + yield event - await self._delete_not_latest_checkpoints(write_storage, self._agent.workflow.name) - yield response_event_stream.emit_completed() - except Exception as ex: - logger.exception("Failed to produce response for workflow agent") - for event in self._emit_failure(response_event_stream, tracker, ex): - yield event + await self._delete_not_latest_checkpoints(checkpoint_storage, self._agent.workflow.name) + yield response_event_stream.emit_completed() @staticmethod async def _delete_not_latest_checkpoints(checkpoint_storage: FileCheckpointStorage, workflow_name: str) -> None: @@ -711,29 +614,6 @@ async def _delete_not_latest_checkpoints(checkpoint_storage: FileCheckpointStora if checkpoint.checkpoint_id != latest_checkpoint.checkpoint_id: await checkpoint_storage.delete(checkpoint.checkpoint_id) - @staticmethod - def _emit_failure( - response_event_stream: ResponseEventStream, - tracker: _OutputItemTracker | None, - ex: BaseException, - ) -> Generator[ResponseStreamEvent]: - """Yield a terminal 
``response.failed`` event for ``ex``. - - Drains any in-progress streaming output item first so the resulting - SSE stream stays well-formed, then emits ``response.failed`` carrying - the exception's message (falling back to the exception type name when - ``str(ex)`` is empty). Any error raised while draining the tracker is - logged and otherwise ignored so that the original failure is always - what the client sees. - """ - if tracker is not None: - try: - yield from tracker.close() - except Exception: - logger.exception("Error while closing streaming tracker after failure") - message = str(ex) or type(ex).__name__ - yield response_event_stream.emit_failed(message=message) - # endregion ResponsesHostServer @@ -796,7 +676,7 @@ def handle(self, content: Content) -> Generator[ResponseStreamEvent]: yield self._fc_builder.emit_arguments_delta(args_str) elif content.type == "mcp_server_tool_call" and content.tool_name: - key = content.call_id or f"{content.server_name or 'default'}::{content.tool_name}" + key = f"{content.server_name or 'default'}::{content.tool_name}" if self._active_type != "mcp_server_tool_call" or self._active_id != key: yield from self._close() yield from self._open_mcp_call(content) @@ -805,24 +685,6 @@ def handle(self, content: Content) -> Generator[ResponseStreamEvent]: if self._mcp_builder is not None: yield self._mcp_builder.emit_arguments_delta(args_str) - elif ( - content.type == "mcp_server_tool_result" - and self._active_type == "mcp_server_tool_call" - and self._mcp_builder is not None - and content.call_id is not None - and content.call_id == self._mcp_builder.item_id - ): - accumulated = "".join(self._accumulated) - yield self._mcp_builder.emit_arguments_done(accumulated) - yield self._mcp_builder.emit_completed() - yield self._mcp_builder.emit_done(output=_stringify_mcp_output(content.output)) - self._mcp_builder = None - self._active_type = None - self._active_id = None - self._accumulated.clear() - self.needs_async = False - return - 
else: yield from self._close() self.needs_async = True @@ -862,10 +724,9 @@ def _open_mcp_call(self, content: Content) -> Generator[ResponseStreamEvent]: self._mcp_builder = self._stream.add_output_item_mcp_call( server_label=content.server_name or "default", name=content.tool_name or "", - item_id=content.call_id, ) self._active_type = "mcp_server_tool_call" - self._active_id = content.call_id or f"{content.server_name or 'default'}::{content.tool_name}" + self._active_id = f"{content.server_name or 'default'}::{content.tool_name}" yield self._mcp_builder.emit_added() def _close(self) -> Generator[ResponseStreamEvent]: @@ -940,696 +801,6 @@ def _to_chat_options(request: CreateResponse) -> tuple[ChatOptions, bool]: # endregion -# region Input Message Conversion - - -async def _items_to_messages( - input_items: Sequence[Item], *, approval_storage: ApprovalStorage | None = None -) -> list[Message]: - """Converts a sequence of input items to a list of Messages, one per item. - - Args: - input_items: The input items to convert. - approval_storage: An optional ApprovalStorage instance used to look up - approval requests when converting MCP approval response items. - - Returns: - A list of Messages, one per supported input item. - """ - messages: list[Message] = [] - for item in input_items: - messages.append(await _item_to_message(item, approval_storage=approval_storage)) - return messages - - -async def _item_to_message(item: Item, *, approval_storage: ApprovalStorage | None = None) -> Message: - """Converts an Item to a Message. - - Args: - item: The Item to convert. - approval_storage: An optional ApprovalStorage instance used to look up - approval requests when converting MCP approval response items. - - Returns: - The converted Message. - - Raises: - ValueError: If the Item type is not supported. 
- """ - if item.type == "message": - msg = cast(ItemMessage, item) - if isinstance(msg.content, str): - return Message(role=msg.role, contents=[Content.from_text(msg.content)]) - return Message(role=msg.role, contents=[_convert_message_content(part) for part in msg.content]) - - if item.type == "output_message": - output_msg = cast(ItemOutputMessage, item) - return Message( - role=output_msg.role, contents=[_convert_output_message_content(part) for part in output_msg.content] - ) - - if item.type == "function_call": - fc = cast(ItemFunctionToolCall, item) - return Message( - role="assistant", - contents=[Content.from_function_call(fc.call_id, fc.name, arguments=fc.arguments)], - ) - - if item.type == "function_call_output": - fco = cast(FunctionCallOutputItemParam, item) - output = fco.output if isinstance(fco.output, str) else str(fco.output) - return Message( - role="tool", - contents=[Content.from_function_result(fco.call_id, result=output)], - ) - - if item.type == "reasoning": - reasoning = cast(ItemReasoningItem, item) - reason_contents: list[Content] = [] - if reasoning.summary: - for summary in reasoning.summary: - reason_contents.append(Content.from_text(summary.text)) - return Message(role="assistant", contents=reason_contents) - - if item.type == "mcp_call": - mcp = cast(ItemMcpToolCall, item) - contents = [ - Content.from_mcp_server_tool_call( - mcp.id, - mcp.name, - server_name=mcp.server_label, - arguments=mcp.arguments, - ) - ] - if getattr(mcp, "output", None) is not None: - contents.append(Content.from_mcp_server_tool_result(call_id=mcp.id, output=mcp.output)) - return Message( - role="assistant", - contents=contents, - ) - - if item.type == "mcp_approval_request": - mcp_req = cast(ItemMcpApprovalRequest, item) - if approval_storage is not None: - function_approval_request_content = await approval_storage.load_approval_request(mcp_req.id) - else: - raise ValueError("ApprovalStorage is required to load approval request.") - return Message( - 
role="assistant", - contents=[function_approval_request_content], - ) - - if item.type == "mcp_approval_response": - mcp_resp = cast(MCPApprovalResponse, item) - if approval_storage is not None: - function_approval_request_content = await approval_storage.load_approval_request( - mcp_resp.approval_request_id - ) - else: - raise ValueError("ApprovalStorage is required to load approval request.") - return Message( - role="user", - contents=[function_approval_request_content.to_function_approval_response(mcp_resp.approve)], - ) - - if item.type == "code_interpreter_call": - ci = cast(ItemCodeInterpreterToolCall, item) - return Message( - role="assistant", - contents=[Content.from_code_interpreter_tool_call(call_id=ci.id)], - ) - - if item.type == "image_generation_call": - ig = cast(ItemImageGenToolCall, item) - return Message( - role="assistant", - contents=[Content.from_image_generation_tool_call(image_id=ig.id)], - ) - - if item.type == "shell_call": - sc = cast(FunctionShellCallItemParam, item) - return Message( - role="assistant", - contents=[ - Content.from_shell_tool_call( - call_id=sc.call_id, - commands=sc.action.commands, - status=str(sc.status), - ) - ], - ) - - if item.type == "shell_call_output": - sco = cast(FunctionShellCallOutputItemParam, item) - outputs = [ - Content.from_shell_command_output( - stdout=out.stdout or "", - stderr=out.stderr or "", - exit_code=getattr(out.outcome, "exit_code", None) if hasattr(out, "outcome") else None, - ) - for out in (sco.output or []) - ] - return Message( - role="tool", - contents=[ - Content.from_shell_tool_result( - call_id=sco.call_id, - outputs=outputs, - max_output_length=sco.max_output_length, - ) - ], - ) - - if item.type == "local_shell_call": - lsc = cast(ItemLocalShellToolCall, item) - commands = lsc.action.command if hasattr(lsc.action, "command") and lsc.action.command else [] - return Message( - role="assistant", - contents=[ - Content.from_shell_tool_call( - call_id=lsc.call_id, - commands=commands, 
- status=str(lsc.status), - ) - ], - ) - - if item.type == "local_shell_call_output": - lsco = cast(ItemLocalShellToolCallOutput, item) - return Message( - role="tool", - contents=[ - Content.from_shell_tool_result( - call_id=lsco.id, - outputs=[Content.from_shell_command_output(stdout=lsco.output)], - ) - ], - ) - - if item.type == "file_search_call": - fs = cast(ItemFileSearchToolCall, item) - return Message( - role="assistant", - contents=[ - Content.from_function_call( - fs.id, - "file_search", - arguments=json.dumps({"queries": fs.queries}), - ) - ], - ) - - if item.type == "web_search_call": - ws = cast(ItemWebSearchToolCall, item) - return Message( - role="assistant", - contents=[Content.from_function_call(ws.id, "web_search")], - ) - - if item.type == "computer_call": - cc = cast(ItemComputerToolCall, item) - return Message( - role="assistant", - contents=[ - Content.from_function_call( - cc.call_id, - "computer_use", - arguments=str(cc.action), - ) - ], - ) - - if item.type == "computer_call_output": - cco = cast(ComputerCallOutputItemParam, item) - return Message( - role="tool", - contents=[Content.from_function_result(cco.call_id, result=str(cco.output))], - ) - - if item.type == "custom_tool_call": - ct = cast(ItemCustomToolCall, item) - return Message( - role="assistant", - contents=[Content.from_function_call(ct.call_id, ct.name, arguments=ct.input)], - ) - - if item.type == "custom_tool_call_output": - cto = cast(ItemCustomToolCallOutput, item) - output = cto.output if isinstance(cto.output, str) else str(cto.output) - # Hosted-MCP results land here because the host writes them via - # `aoutput_item_custom_tool_call_output` (see `_to_outputs` for - # `mcp_server_tool_result`). The persisted `call_id` keeps its - # `mcp_*` prefix; on read, route those back to a hosted-MCP result - # Content so the chat-client serialize layer can coalesce them - # onto a single `mcp_call` input item with `output` populated. - # Issue #5546. 
- if cto.call_id and cto.call_id.startswith("mcp_"): - return Message( - role="tool", - contents=[Content.from_mcp_server_tool_result(call_id=cto.call_id, output=output)], - ) - return Message( - role="tool", - contents=[Content.from_function_result(cto.call_id, result=output)], - ) - - if item.type == "apply_patch_call": - ap = cast(ApplyPatchToolCallItemParam, item) - return Message( - role="assistant", - contents=[ - Content.from_function_call( - ap.call_id, - "apply_patch", - arguments=str(ap.operation), - ) - ], - ) - - if item.type == "apply_patch_call_output": - apo = cast(ApplyPatchToolCallOutputItemParam, item) - return Message( - role="tool", - contents=[Content.from_function_result(apo.call_id, result=apo.output or "")], - ) - - raise ValueError(f"Unsupported Item type: {item.type}") - - -async def _output_items_to_messages( - history: Sequence[OutputItem], - *, - approval_storage: ApprovalStorage | None = None, -) -> list[Message]: - """Converts a sequence of OutputItem objects to a list of Message objects. - - Args: - history (Sequence[OutputItem]): The sequence of OutputItem objects to convert. - approval_storage (ApprovalStorage | None, optional): The approval storage to use for - resolving MCP approval requests. Defaults to None. - - Returns: - list[Message]: The list of Message objects. - """ - messages: list[Message] = [] - for item in history: - messages.append(await _output_item_to_message(item, approval_storage=approval_storage)) - return messages - - -async def _output_item_to_message(item: OutputItem, *, approval_storage: ApprovalStorage | None = None) -> Message: - """Converts an OutputItem to a Message. - - Args: - item (OutputItem): The OutputItem to convert. - approval_storage (ApprovalStorage | None, optional): The approval storage to use for - resolving MCP approval requests. Defaults to None. - - Returns: - Message: The converted Message. - - Raises: - ValueError: If the OutputItem type is not supported. 
- """ - if item.type == "output_message": - output_msg = cast(OutputItemOutputMessage, item) - return Message( - role=output_msg.role, contents=[_convert_output_message_content(part) for part in output_msg.content] - ) - - if item.type == "message": - msg = cast(OutputItemMessage, item) - return Message(role=msg.role, contents=[_convert_message_content(part) for part in msg.content]) - - if item.type == "function_call": - fc = cast(OutputItemFunctionToolCall, item) - return Message( - role="assistant", - contents=[Content.from_function_call(fc.call_id, fc.name, arguments=fc.arguments)], - ) - - if item.type == "function_call_output": - fco = cast(FunctionCallOutputItemParam, item) - output = fco.output if isinstance(fco.output, str) else str(fco.output) - return Message( - role="tool", - contents=[Content.from_function_result(fco.call_id, result=output)], - ) - - if item.type == "reasoning": - reasoning = cast(OutputItemReasoningItem, item) - contents: list[Content] = [] - if reasoning.summary: - for summary in reasoning.summary: - contents.append(Content.from_text(summary.text)) - return Message(role="assistant", contents=contents) - - if item.type == "mcp_call": - mcp = cast(OutputItemMcpToolCall, item) - contents = [ - Content.from_mcp_server_tool_call( - mcp.id, - mcp.name, - server_name=mcp.server_label, - arguments=mcp.arguments, - ) - ] - if getattr(mcp, "output", None) is not None: - contents.append(Content.from_mcp_server_tool_result(call_id=mcp.id, output=mcp.output)) - return Message( - role="assistant", - contents=contents, - ) - - if item.type == "mcp_approval_request": - mcp_req = cast(OutputItemMcpApprovalRequest, item) - if approval_storage is not None: - function_approval_request_content = await approval_storage.load_approval_request(mcp_req.id) - else: - raise ValueError("ApprovalStorage is required to load approval request.") - return Message( - role="assistant", - contents=[function_approval_request_content], - ) - - if item.type == 
"mcp_approval_response": - mcp_resp = cast(OutputItemMcpApprovalResponseResource, item) - if approval_storage is not None: - function_approval_request_content = await approval_storage.load_approval_request( - mcp_resp.approval_request_id - ) - else: - raise ValueError("ApprovalStorage is required to load approval request.") - - return Message( - role="user", - contents=[function_approval_request_content.to_function_approval_response(mcp_resp.approve)], - ) - - if item.type == "code_interpreter_call": - ci = cast(OutputItemCodeInterpreterToolCall, item) - return Message( - role="assistant", - contents=[Content.from_code_interpreter_tool_call(call_id=ci.id)], - ) - - if item.type == "image_generation_call": - ig = cast(OutputItemImageGenToolCall, item) - return Message( - role="assistant", - contents=[Content.from_image_generation_tool_call(image_id=ig.id)], - ) - - if item.type == "shell_call": - sc = cast(OutputItemFunctionShellCall, item) - return Message( - role="assistant", - contents=[ - Content.from_shell_tool_call( - call_id=sc.call_id, - commands=sc.action.commands, - status=str(sc.status), - ) - ], - ) - - if item.type == "shell_call_output": - sco = cast(OutputItemFunctionShellCallOutput, item) - outputs = [ - Content.from_shell_command_output( - stdout=out.stdout or "", - stderr=out.stderr or "", - exit_code=getattr(out.outcome, "exit_code", None) if hasattr(out, "outcome") else None, - ) - for out in (sco.output or []) - ] - return Message( - role="tool", - contents=[ - Content.from_shell_tool_result( - call_id=sco.call_id, - outputs=outputs, - max_output_length=sco.max_output_length, - ) - ], - ) - - if item.type == "local_shell_call": - lsc = cast(OutputItemLocalShellToolCall, item) - commands = lsc.action.command if hasattr(lsc.action, "command") and lsc.action.command else [] - return Message( - role="assistant", - contents=[ - Content.from_shell_tool_call( - call_id=lsc.call_id, - commands=commands, - status=str(lsc.status), - ) - ], - ) - - if 
item.type == "local_shell_call_output": - lsco = cast(OutputItemLocalShellToolCallOutput, item) - return Message( - role="tool", - contents=[ - Content.from_shell_tool_result( - call_id=lsco.id, - outputs=[Content.from_shell_command_output(stdout=lsco.output)], - ) - ], - ) - - if item.type == "file_search_call": - fs = cast(OutputItemFileSearchToolCall, item) - return Message( - role="assistant", - contents=[ - Content.from_function_call( - fs.id, - "file_search", - arguments=json.dumps({"queries": fs.queries}), - ) - ], - ) - - if item.type == "web_search_call": - ws = cast(OutputItemWebSearchToolCall, item) - return Message( - role="assistant", - contents=[Content.from_function_call(ws.id, "web_search")], - ) - - if item.type == "computer_call": - cc = cast(OutputItemComputerToolCall, item) - return Message( - role="assistant", - contents=[ - Content.from_function_call( - cc.call_id, - "computer_use", - arguments=str(cc.action), - ) - ], - ) - - if item.type == "computer_call_output": - cco = cast(OutputItemComputerToolCallOutputResource, item) - return Message( - role="tool", - contents=[Content.from_function_result(cco.call_id, result=str(cco.output))], - ) - - if item.type == "custom_tool_call": - ct = cast(OutputItemCustomToolCall, item) - return Message( - role="assistant", - contents=[Content.from_function_call(ct.call_id, ct.name, arguments=ct.input)], - ) - - if item.type == "custom_tool_call_output": - cto = cast(OutputItemCustomToolCallOutput, item) - output = cto.output if isinstance(cto.output, str) else str(cto.output) - # Hosted-MCP results land here because the host writes them via - # `aoutput_item_custom_tool_call_output`. Route `mcp_*` call_ids - # back to a hosted-MCP result Content so the chat-client serialize - # layer can coalesce onto the matching `mcp_call` input item. - # Issue #5546. 
- if cto.call_id and cto.call_id.startswith("mcp_"): - return Message( - role="tool", - contents=[Content.from_mcp_server_tool_result(call_id=cto.call_id, output=output)], - ) - return Message( - role="tool", - contents=[Content.from_function_result(cto.call_id, result=output)], - ) - - if item.type == "apply_patch_call": - ap = cast(OutputItemApplyPatchToolCall, item) - return Message( - role="assistant", - contents=[ - Content.from_function_call( - ap.call_id, - "apply_patch", - arguments=str(ap.operation), - ) - ], - ) - - if item.type == "apply_patch_call_output": - apo = cast(OutputItemApplyPatchToolCallOutput, item) - return Message( - role="tool", - contents=[Content.from_function_result(apo.call_id, result=apo.output or "")], - ) - - if item.type == "oauth_consent_request": - oauth = cast(OAuthConsentRequestOutputItem, item) - return Message( - role="assistant", - contents=[Content.from_oauth_consent_request(oauth.consent_link)], - ) - - if item.type == "structured_outputs": - so = cast(StructuredOutputsOutputItem, item) - text = json.dumps(so.output) if not isinstance(so.output, str) else so.output - return Message(role="assistant", contents=[Content.from_text(text)]) - - raise ValueError(f"Unsupported OutputItem type: {item.type}") - - -def _convert_output_message_content(content: OutputMessageContent) -> Content: - """Converts an OutputMessageContent to a Content object. - - Args: - content (OutputMessageContent): The OutputMessageContent to convert. - - Returns: - Content: The converted Content object. - - Raises: - ValueError: If the OutputMessageContent type is not supported. 
- """ - if content.type == "output_text": - text_content = cast(OutputMessageContentOutputTextContent, content) - return Content.from_text(text_content.text) - if content.type == "refusal": - refusal_content = cast(OutputMessageContentRefusalContent, content) - return Content.from_text(refusal_content.refusal) - - raise ValueError(f"Unsupported OutputMessageContent type: {content.type}") - - -def _convert_file_data(data_uri: str, filename: str | None = None) -> Content: - """Convert a file_data data URI to a Content object. - - For text/* MIME types, decodes the base64 content and returns it as text. - For other types, returns a URI-based Content with the filename preserved. - """ - # Parse data URI: data:;base64, - if data_uri.startswith("data:") and ";base64," in data_uri: - header, encoded = data_uri.split(";base64,", 1) - media_type = header[len("data:") :] - if media_type.startswith("text/"): - try: - decoded_text = base64.b64decode(encoded).decode("utf-8") - except (ValueError, UnicodeDecodeError): - logger.warning( - "Failed to decode text/* file_data as UTF-8, falling through to URI passthrough.", - exc_info=True, - ) - else: - prefix = f"[File: {filename}]\n" if filename else "" - return Content.from_text(f"{prefix}{decoded_text}") - additional_properties = {"filename": filename} if filename else None - return Content.from_uri(data_uri, additional_properties=additional_properties) - - -def _convert_message_content(content: MessageContent) -> Content: - """Converts a MessageContent to a Content object. - - Args: - content (MessageContent): The MessageContent to convert. - - Returns: - Content: The converted Content object. - - Raises: - ValueError: If the MessageContent type is not supported. 
- """ - if content.type == "input_text": - input_text = cast(MessageContentInputTextContent, content) - return Content.from_text(input_text.text) - if content.type == "output_text": - output_text = cast(MessageContentOutputTextContent, content) - return Content.from_text(output_text.text) - if content.type == "text": - text = cast(TextContent, content) - return Content.from_text(text.text) - if content.type == "summary_text": - summary = cast(SummaryTextContent, content) - return Content.from_text(summary.text) - if content.type == "refusal": - refusal = cast(MessageContentRefusalContent, content) - return Content.from_text(refusal.refusal) - if content.type == "reasoning_text": - reasoning = cast(MessageContentReasoningTextContent, content) - return Content.from_text_reasoning(text=reasoning.text) - if content.type == "input_image": - image = cast(MessageContentInputImageContent, content) - if image.image_url: - if image.image_url.startswith("data:"): - return Content.from_uri(image.image_url) - return Content.from_uri(image.image_url, media_type="image/*") - if image.file_id: - return Content.from_hosted_file(image.file_id) - if content.type == "input_file": - file = cast(MessageContentInputFileContent, content) - if file.file_url: - return Content.from_uri(file.file_url) - if file.file_id: - return Content.from_hosted_file(file.file_id, name=file.filename) - if file.file_data: - return _convert_file_data(file.file_data, file.filename) - if content.type == "computer_screenshot": - screenshot = cast(ComputerScreenshotContent, content) - return Content.from_uri(screenshot.image_url) - - raise ValueError(f"Unsupported MessageContent type: {content.type}") - - -# endregion - -# region Output Item Conversion - - -def _argument_json_default(value: Any) -> Any: - if is_dataclass(value) and not isinstance(value, type): - return asdict(value) - to_dict = getattr(value, "to_dict", None) - if callable(to_dict): - return to_dict() - raise TypeError(f"Object of type 
{type(value).__name__} is not JSON serializable") - - -def _arguments_to_str(arguments: Any | None) -> str: - """Convert arguments to a JSON string. - - Args: - arguments: The arguments to convert, can be a string, JSON-like object, or None. - - Returns: - The arguments as a JSON string. - """ - if arguments is None: - return "" - if isinstance(arguments, str): - return arguments - return json.dumps(arguments, default=_argument_json_default) - - async def _to_outputs( stream: ResponseEventStream, content: Content, @@ -1675,7 +846,6 @@ async def _to_outputs( mcp_call = stream.add_output_item_mcp_call( server_label=content.server_name or "default", name=content.tool_name or "", - item_id=content.call_id, ) yield mcp_call.emit_added() async for event in mcp_call.aarguments(_arguments_to_str(content.arguments)): @@ -1750,91 +920,4 @@ async def _to_outputs( logger.warning(f"Content type '{content.type}' is not supported yet. This is usually safe to ignore.") -def _stringify_mcp_output(output: Any) -> str: - """Convert hosted MCP output payloads into the string shape expected by mcp_call.output.""" - if output is None: - return "" - if isinstance(output, str): - return output - if isinstance(output, Mapping): - text = cast(Any, output).get("text") - if isinstance(text, str): - return text - return json.dumps(output, default=str) - if isinstance(output, Sequence) and not isinstance(output, (str, bytes, bytearray)): - parts: list[str] = [] - entries = cast(Sequence[object], output) - for entry in entries: - if isinstance(entry, Content) and entry.type == "text": - parts.append(entry.text or "") - continue - parts.append(_stringify_mcp_output(entry)) - return "".join(parts) - return str(output) - - -def _emit_completed_mcp_call( - stream: ResponseEventStream, - call_content: Content, - *, - arguments: str, - output: str, -) -> Generator[ResponseStreamEvent]: - """Emit a single completed MCP call item carrying both arguments and output.""" - mcp_call = 
stream.add_output_item_mcp_call( - server_label=call_content.server_name or "default", - name=call_content.tool_name or "", - item_id=call_content.call_id, - ) - yield mcp_call.emit_added() - yield mcp_call.emit_arguments_done(arguments) - yield mcp_call.emit_completed() - yield mcp_call.emit_done(output=output) - - -async def _to_outputs_for_messages( - stream: ResponseEventStream, - messages: Sequence[Message], - *, - approval_storage: ApprovalStorage | None = None, -) -> AsyncIterator[ResponseStreamEvent]: - """Convert messages to output events with hosted-MCP call/result coalescing. - - Parse once in message/content order and emit either: - - a single canonical completed ``mcp_call`` when adjacent hosted MCP - call/result content are encountered, or - - standard output items for all other content types. - """ - pending_mcp_call: Content | None = None - - for message in messages: - for content in message.contents: - if pending_mcp_call is not None: - if content.type == "mcp_server_tool_result" and content.call_id == pending_mcp_call.call_id: - for event in _emit_completed_mcp_call( - stream, - pending_mcp_call, - arguments=_arguments_to_str(pending_mcp_call.arguments), - output=_stringify_mcp_output(content.output), - ): - yield event - pending_mcp_call = None - continue - - async for event in _to_outputs(stream, pending_mcp_call, approval_storage=approval_storage): - yield event - pending_mcp_call = None - - if content.type == "mcp_server_tool_call" and content.call_id: - pending_mcp_call = content - continue - - async for event in _to_outputs(stream, content, approval_storage=approval_storage): - yield event - - if pending_mcp_call is not None: - async for event in _to_outputs(stream, pending_mcp_call, approval_storage=approval_storage): - yield event - - # endregion diff --git a/python/packages/foundry_hosting/agent_framework_foundry_hosting/_shared.py b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_shared.py new file mode 100644 index 
00000000000..53007a979c0 --- /dev/null +++ b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_shared.py @@ -0,0 +1,1340 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Shared transformation helpers between the agent-server data model and Agent Framework. + +This module is the single home for *pure-data* conversions between the +:mod:`azure.ai.agentserver.responses.models` SDK shapes (``Item``, +``OutputItem``, ``MessageContent``, …) and the Agent Framework public types +(:class:`agent_framework.Message`, :class:`agent_framework.Content`, …). + +Why this lives in one module +---------------------------- +* The :mod:`._responses` channel adapter and the + :class:`._history_provider.FoundryHostedAgentHistoryProvider` both need the + exact same OutputItem→Message conversion. Keeping it in one place means we + only have **one** ``item.type == "..."`` dispatch table to keep up + to date when the agent-server SDK grows new item kinds. If you spot a + ``type`` value that this module raises ``ValueError`` for, that is the place + to add support — and **both** consumers benefit immediately. +* The whole module references the agent-server SDK through a single + ``from azure.ai.agentserver.responses import models`` import. Looking at the + ``models.X`` references makes it obvious which generated types we already + consume and which ones (e.g. ``models.A2AToolCall``, + ``models.AzureFunctionToolCall``, …) are not yet wired into + :func:`_output_item_to_message`. + +``additional_properties`` round-trip +------------------------------------ +Both the SDK models and :class:`agent_framework.Message` carry an extensible +extras bag — the agent-server models are +:class:`collections.abc.MutableMapping` instances that round-trip *any* key +through their JSON serialisation, and ``Message`` (and ``Content``) expose a +public ``additional_properties: dict[str, Any]`` slot. 
+ +To preserve channel-specific extras across a load/save cycle: + +* On **load** (SDK model → Message) :func:`_collect_unknown_keys` extracts + every key on the source model that is **not** part of its declared schema + (per ``_attr_to_rest_field``) and stashes it on + ``Message.additional_properties["foundry"]`` (per content node, the same + bag is attached to ``Content.additional_properties["foundry"]``). The + bag is only attached when at least one extra key is present, so messages + that didn't have extras stay byte-equal to the previous behaviour. +* On **save** (Message → SDK model) :func:`_inject_extras` writes any + previously stashed bag back as direct keys on the SDK model — Foundry + storage will round-trip them as opaque JSON. + +This means an app can stash channel-specific bookkeeping (delivery +fingerprints, the ``hosting`` envelope from the host, AG-UI ``client_state`` +snapshots, …) under a known top-level key and rely on it surviving a +write/read cycle through the Foundry response store. +""" + +from __future__ import annotations + +import base64 +import json +import logging +from collections.abc import Mapping, Sequence +from typing import Any, Protocol, cast + +from agent_framework import Content, Message +from azure.ai.agentserver.responses import models + +logger = logging.getLogger(__name__) + +# Top-level key under which round-tripped SDK extras live on +# ``Message.additional_properties`` and ``Content.additional_properties``. +# Stable on purpose: write-paths look it up by name to re-inject extras into +# outbound SDK models. +EXTRAS_KEY = "foundry" + +# Sub-key (under ``additional_properties[EXTRAS_KEY]``) that stores a +# verbatim snapshot of the original SDK ``OutputItem`` mapping captured at +# read time. 
The write path re-emits the SDK item from this snapshot when +# present, giving lossless audit/replay semantics: every declared field +# (item id, type discriminator, content array, status, …) AND every undeclared +# extra Foundry handed us survive the AF round-trip. Without this, a +# message synthesised back from ``Message.text`` alone would discard the +# original item shape. +RAW_KEY = "__raw__" + +# Top-level key on the SDK ``OutputItem`` mapping under which we round-trip +# *every* :class:`agent_framework.Message` ``additional_properties`` namespace +# **other than** :data:`EXTRAS_KEY` (the foundry-internal namespace, handled +# separately by :func:`_inject_extras`). +# +# Why a single container key instead of writing each namespace as a top-level +# extra on the SDK item: Foundry's storage backend round-trips arbitrary +# unknown keys, but on **load** :func:`_collect_unknown_keys` cannot tell +# which unknowns were AF-written namespaces (``hosting``, ``agui_state``, +# ...) vs Foundry-runtime additions. Funnelling AF namespaces under a single +# sentinel key removes that ambiguity: anything inside ``agent_framework`` +# is restored under its original namespace; anything else stays under +# :data:`EXTRAS_KEY` (preserving today's behaviour for Foundry-side extras). +# +# Concretely, this is the mechanism that gives the Hosting spec's +# ``Message.additional_properties["hosting"]`` envelope (channel / +# identity / response_target / initial-write ``deliveries[]``) durable +# round-trip semantics through the Foundry response store — see +# ``docs/specs/002-python-hosting-channels.md`` §"Channel metadata +# persisted onto stored messages". +AF_EXTRAS_KEY = "agent_framework" + +# Re-exports — these helpers are consumed by sibling modules +# (``_responses.py`` and ``_history_provider.py``); declaring them in +# ``__all__`` quiets pyright's ``reportUnusedFunction`` for module-private +# names that are intentionally part of the package-internal API. 
+__all__ = ( + "AF_EXTRAS_KEY", + "EXTRAS_KEY", + "RAW_KEY", + "ApprovalStorage", + "_arguments_to_str", + "_attach_content_extras", + "_attach_extras", + "_capture_raw", + "_collect_af_extras", + "_collect_unknown_keys", + "_convert_message_content", + "_convert_output_message_content", + "_inject_af_extras", + "_inject_extras", + "_item_to_message", + "_items_to_messages", + "_message_text", + "_message_to_output_item", + "_messages_to_output_items", + "_output_item_to_message", + "_output_items_to_messages", +) + + +class ApprovalStorage(Protocol): + """Storage for saving function approval requests.""" + + async def save_approval_request(self, approval_request_id: str, request: Content) -> None: + """Save a function approval request under the given ID.""" + ... + + async def load_approval_request(self, approval_request_id: str) -> Content: + """Load a function approval request by its ID.""" + ... + + +# region Extras helpers + + +def _collect_unknown_keys(model: Mapping[str, Any]) -> dict[str, Any]: + """Return any keys present on the SDK model that are not part of its declared schema. + + The agent-server SDK models are + :class:`collections.abc.MutableMapping` instances generated from the + Foundry REST contract; declared fields are exposed via the class-level + ``_attr_to_rest_field`` map. Any extra key on the instance therefore + represents data the Foundry runtime stored that the SDK doesn't model + explicitly — typically channel-specific extras a previous write-path + deliberately stashed there via :func:`_inject_extras`. + + Args: + model: A model instance (or any mapping) to inspect. + + Returns: + A new ``dict`` containing only the keys on ``model`` that are not + declared in the model's REST schema. Empty when the model only + carries declared fields. 
+ """ + if not isinstance(model, Mapping): + return {} + known = set(getattr(type(model), "_attr_to_rest_field", {}).keys()) + return {key: value for key, value in model.items() if key not in known} + + +def _attach_extras(message: Message, model: Mapping[str, Any]) -> Message: + """Attach SDK extras (if any) to ``message.additional_properties``. + + Two-tier restoration so the Hosting spec's namespaced envelopes + (``hosting``, ``agui_state``, …) come back under their **original** + keys while Foundry-side extras (anything the runtime layered on the + SDK item) stay under the foundry-internal :data:`EXTRAS_KEY` + namespace: + + 1. Pop :data:`AF_EXTRAS_KEY` from the unknown-keys bag and merge each + sub-key directly onto ``message.additional_properties`` — this is + how the inbound ``hosting`` envelope (channel/identity/ + response_target) and the initial-write ``deliveries[]`` snapshot + round-trip through Foundry storage. + 2. Anything remaining (Foundry-runtime extras the SDK doesn't model + explicitly) is stashed under + ``additional_properties[EXTRAS_KEY]`` for backward compatibility + and audit/replay. + + No-op when the model carries no extras — ``additional_properties`` is left + alone so callers and tests that compare ``Message`` instances for equality + by ``role``/``contents`` only continue to pass. + + Args: + message: The message to enrich. + model: The SDK model whose extras should be preserved. + + Returns: + The same ``message`` instance (returned for fluent chaining). + """ + extras = _collect_unknown_keys(model) + if not extras: + return message + af_extras = extras.pop(AF_EXTRAS_KEY, None) + if isinstance(af_extras, Mapping): + af_extras_typed = cast("Mapping[str, Any]", af_extras) + for ns_key, ns_val in af_extras_typed.items(): + # Per-namespace overwrite: a fresh load is the source of + # truth for the message we're rebuilding. 
+ message.additional_properties[ns_key] = ns_val + if extras: + message.additional_properties.setdefault(EXTRAS_KEY, {}).update(extras) + return message + + +def _capture_raw(message: Message, item: Mapping[str, Any]) -> Message: + """Snapshot the SDK item's full mapping onto the message for replay. + + Stored under ``message.additional_properties[EXTRAS_KEY][RAW_KEY]`` so + :func:`_message_to_output_item` can re-emit the byte-for-byte original + SDK shape on the write side. This is what lets the AF → + Foundry-storage round-trip preserve item ids, content variants + (citations, reasoning, tool results, …) and any extras Foundry + layered on top of the declared schema. + + Narrow ``TypeError`` is the only swallowed failure (matches the + ``Mapping`` contract precondition); ``MemoryError`` and other + ``Exception`` subclasses propagate so genuine bugs aren't masked. + A WARNING with ``exc_info`` is logged so the lossy fallback is + observable downstream — without it a regression in the SDK schema + silently drops citations / reasoning / tool-result extras on every + round-tripped message and there is no breadcrumb pointing here. + """ + try: + raw = dict(item) + except TypeError: + logger.warning( + "_capture_raw: SDK item %r is not mapping-like; round-tripping without raw snapshot", + type(item).__name__, + exc_info=True, + ) + return message + message.additional_properties.setdefault(EXTRAS_KEY, {})[RAW_KEY] = raw + return message + + +def _inject_extras(model: Any, source: Mapping[str, Any] | None) -> Any: + """Inject previously-stashed extras back onto an outbound SDK model. + + The SDK models are :class:`collections.abc.MutableMapping`; setting + arbitrary keys on them is supported and round-trips through serialisation. + Use this when **emitting** SDK shapes (e.g. when ``save_messages`` decides + to write back through the Foundry storage API). + + Args: + model: The SDK model instance to enrich. Must be mapping-like. 
+ source: The extras bag previously read from + ``Message.additional_properties[EXTRAS_KEY]`` (or any equivalent). + ``None`` is treated as an empty bag. + + Returns: + The same ``model`` instance (returned for fluent chaining). + """ + if not source: + return model + for key, value in source.items(): + # Internal sentinel — never write the raw-snapshot back as a + # storage field; it lives only inside ``additional_properties``. + if key == RAW_KEY: + continue + # Avoid clobbering declared fields — extras are never allowed to + # overwrite the schema-defined contract on the model. + model_type: Any = type(model) # pyright: ignore[reportUnknownVariableType] + known: set[str] = set(getattr(model_type, "_attr_to_rest_field", {})) + if key in known: + continue + model[key] = value + return model + + +def _collect_af_extras(message: Message) -> dict[str, Any]: + """Gather every AF-side ``additional_properties`` namespace except :data:`EXTRAS_KEY`. + + Returns the namespaces (``hosting``, ``agui_state``, …) that should + round-trip through Foundry storage as a single opaque container under + :data:`AF_EXTRAS_KEY` on the SDK item. The foundry-internal namespace + is excluded because :func:`_inject_extras` handles it separately and + its contents are AF-specific bookkeeping (raw snapshots, Foundry + runtime extras) that don't belong inside the AF container. + """ + props = message.additional_properties or {} + return {key: value for key, value in props.items() if key != EXTRAS_KEY} + + +def _inject_af_extras(model: Any, source: Mapping[str, Any] | None) -> Any: + """Write AF-side namespaces onto the SDK model under :data:`AF_EXTRAS_KEY`. + + This is the save-side counterpart to :func:`_attach_extras`'s + AF-namespace restoration. The container key collides with declared + schema fields only if Foundry decides to add an + ``agent_framework`` field to its REST contract — at which point we + rename the constant. 
+ + A non-empty ``source`` overwrites any value already at + :data:`AF_EXTRAS_KEY` on the model (e.g. a stale value baked into a + raw-snapshot replay) so the in-process :class:`Message` remains the + source of truth at write time. + """ + if not source: + return model + model[AF_EXTRAS_KEY] = dict(source) + return model + + +# endregion + + +# region Small utilities + + +def _arguments_to_str(arguments: str | Mapping[str, Any] | None) -> str: + """Convert a tool-call ``arguments`` payload to its on-the-wire JSON string form. + + Args: + arguments: The arguments to serialise. ``None`` becomes an empty + string, an existing string is returned verbatim, and any mapping + is JSON-encoded. + + Returns: + The arguments as a JSON string. + """ + if arguments is None: + return "" + if isinstance(arguments, str): + return arguments + return json.dumps(arguments) + + +# endregion + + +# region Content conversion + + +def _convert_file_data(data_uri: str, filename: str | None = None) -> Content: + """Convert a ``file_data`` data URI to a :class:`Content`. + + For ``text/*`` MIME types the base64 payload is decoded and returned as + plain text (with a ``[File: <filename>]`` prefix when a filename is known); + other media types fall through to a URI-based content with the + filename preserved as an additional property. 
+ """ + if data_uri.startswith("data:") and ";base64," in data_uri: + header, encoded = data_uri.split(";base64,", 1) + media_type = header[len("data:") :] + if media_type.startswith("text/"): + try: + decoded_text = base64.b64decode(encoded).decode("utf-8") + except (ValueError, UnicodeDecodeError): + logger.warning( + "Failed to decode text/* file_data as UTF-8, falling through to URI passthrough.", + exc_info=True, + ) + else: + prefix = f"[File: {filename}]\n" if filename else "" + return Content.from_text(f"{prefix}{decoded_text}") + additional_properties = {"filename": filename} if filename else None + return Content.from_uri(data_uri, additional_properties=additional_properties) + + +def _convert_message_content(content: models.MessageContent) -> Content: + """Convert an SDK ``MessageContent`` (input-side) into a framework ``Content``. + + Handles all input/output content variants currently understood by the + Responses channel — text, output text, summary, refusal, reasoning text, + input images, input files, computer screenshot. + + Args: + content: The SDK content node to convert. + + Returns: + The corresponding :class:`agent_framework.Content`. + + Raises: + ValueError: If the SDK content ``type`` is not yet supported by this + adapter. 
+ """ + if content.type == "input_text": + return _attach_content_extras( + Content.from_text(cast(models.MessageContentInputTextContent, content).text), content + ) + if content.type == "output_text": + return _attach_content_extras( + Content.from_text(cast(models.MessageContentOutputTextContent, content).text), content + ) + if content.type == "text": + return _attach_content_extras(Content.from_text(cast(models.TextContent, content).text), content) + if content.type == "summary_text": + return _attach_content_extras(Content.from_text(cast(models.SummaryTextContent, content).text), content) + if content.type == "refusal": + return _attach_content_extras( + Content.from_text(cast(models.MessageContentRefusalContent, content).refusal), content + ) + if content.type == "reasoning_text": + return _attach_content_extras( + Content.from_text_reasoning(text=cast(models.MessageContentReasoningTextContent, content).text), + content, + ) + if content.type == "input_image": + image = cast(models.MessageContentInputImageContent, content) + if image.image_url: + return _attach_content_extras(Content.from_uri(image.image_url), content) + if image.file_id: + return _attach_content_extras(Content.from_hosted_file(image.file_id), content) + if content.type == "input_file": + file = cast(models.MessageContentInputFileContent, content) + if file.file_url: + return _attach_content_extras(Content.from_uri(file.file_url), content) + if file.file_id: + return _attach_content_extras(Content.from_hosted_file(file.file_id, name=file.filename), content) + if file.file_data: + return _attach_content_extras(_convert_file_data(file.file_data, file.filename), content) + if content.type == "computer_screenshot": + return _attach_content_extras( + Content.from_uri(cast(models.ComputerScreenshotContent, content).image_url), content + ) + + raise ValueError(f"Unsupported MessageContent type: {content.type}") + + +def _convert_output_message_content(content: models.OutputMessageContent) -> 
Content: + """Convert an SDK ``OutputMessageContent`` (assistant output side) into a framework ``Content``. + + Handles assistant-output variants: ``output_text`` and ``refusal``. + + Args: + content: The SDK content node to convert. + + Returns: + The corresponding :class:`agent_framework.Content`. + + Raises: + ValueError: If the SDK content ``type`` is not yet supported. + """ + if content.type == "output_text": + return _attach_content_extras( + Content.from_text(cast(models.OutputMessageContentOutputTextContent, content).text), content + ) + if content.type == "refusal": + return _attach_content_extras( + Content.from_text(cast(models.OutputMessageContentRefusalContent, content).refusal), content + ) + + raise ValueError(f"Unsupported OutputMessageContent type: {content.type}") + + +def _attach_content_extras(content: Content, model: Mapping[str, Any]) -> Content: + """Round-trip SDK content extras onto :attr:`Content.additional_properties`. + + Mirror of :func:`_attach_extras` but for individual content nodes. Only + attaches the bag when at least one extra key is present, so the produced + ``Content`` stays byte-equivalent to a non-extras conversion when there is + nothing to preserve. + + Args: + content: The framework content to enrich. + model: The SDK content node whose extras should be preserved. + + Returns: + The same ``content`` instance. + """ + extras = _collect_unknown_keys(model) + if extras: + content.additional_properties.setdefault(EXTRAS_KEY, {}).update(extras) + return content + + +# endregion + + +# region Item → Message (input side) + + +async def _items_to_messages( + input_items: Sequence[models.Item], + *, + approval_storage: ApprovalStorage | None = None, +) -> list[Message]: + """Convert a sequence of input ``Item`` SDK objects to framework ``Message`` objects. + + One :class:`agent_framework.Message` per input item — fan-out is the + caller's responsibility. + + Args: + input_items: The input items to convert. 
+ + Keyword Args: + approval_storage: Optional approval storage. Required when the input + stream contains ``mcp_approval_request`` / ``mcp_approval_response`` + items so the original function-call payload can be looked up. + + Returns: + A list of messages in the same order as the input. + """ + return [await _item_to_message(item, approval_storage=approval_storage) for item in input_items] + + +async def _item_to_message( + item: models.Item, + *, + approval_storage: ApprovalStorage | None = None, +) -> Message: + """Convert a single input ``Item`` SDK object to a framework ``Message``. + + Wraps :func:`_item_to_message_inner` and stamps a :data:`RAW_KEY` + snapshot of the SDK item so the write path can rebuild the original + shape losslessly. See :func:`_capture_raw`. + """ + return _capture_raw(await _item_to_message_inner(item, approval_storage=approval_storage), item) + + +async def _item_to_message_inner( + item: models.Item, + *, + approval_storage: ApprovalStorage | None = None, +) -> Message: + """Convert a single input ``Item`` SDK object to a framework ``Message``. + + The conversion table is intentionally explicit (no auto-discovery) so it + is easy to scan for missing variants. To add support for a new item kind: + + 1. Add an ``if item.type == "...":`` branch here. + 2. Reference the corresponding ``models.ItemX`` (or + ``models.XItemParam``) type via ``cast(...)``. + 3. Map its fields onto :class:`agent_framework.Content` factory methods. + 4. Add a matching ``item.type`` branch in :func:`_output_item_to_message` + if the same kind also appears on the output side. + + Args: + item: The SDK item to convert. + + Keyword Args: + approval_storage: Optional approval storage. Required when the item is + an ``mcp_approval_request`` / ``mcp_approval_response``; ignored + otherwise. + + Returns: + The converted message, with any unknown extras round-tripped under + ``message.additional_properties[EXTRAS_KEY]``. 
+ + Raises: + ValueError: If the SDK item ``type`` is not yet supported by this + adapter. + """ + if item.type == "message": + msg = cast(models.ItemMessage, item) + if isinstance(msg.content, str): + message = Message(role=msg.role, contents=[Content.from_text(msg.content)]) + else: + message = Message(role=msg.role, contents=[_convert_message_content(part) for part in msg.content]) + return _attach_extras(message, item) + + if item.type == "output_message": + output_msg = cast(models.ItemOutputMessage, item) + return _attach_extras( + Message( + role=output_msg.role, + contents=[_convert_output_message_content(part) for part in output_msg.content], + ), + item, + ) + + if item.type == "function_call": + fc = cast(models.ItemFunctionToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[Content.from_function_call(fc.call_id, fc.name, arguments=fc.arguments)], + ), + item, + ) + + if item.type == "function_call_output": + fco = cast(models.FunctionCallOutputItemParam, item) + output = fco.output if isinstance(fco.output, str) else str(fco.output) + return _attach_extras( + Message(role="tool", contents=[Content.from_function_result(fco.call_id, result=output)]), + item, + ) + + if item.type == "reasoning": + reasoning = cast(models.ItemReasoningItem, item) + reason_contents: list[Content] = [] + if reasoning.summary: + for summary in reasoning.summary: + reason_contents.append(Content.from_text(summary.text)) + return _attach_extras(Message(role="assistant", contents=reason_contents), item) + + if item.type == "mcp_call": + mcp = cast(models.ItemMcpToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_mcp_server_tool_call( + mcp.id, + mcp.name, + server_name=mcp.server_label, + arguments=mcp.arguments, + ) + ], + ), + item, + ) + + if item.type == "mcp_approval_request": + mcp_req = cast(models.ItemMcpApprovalRequest, item) + if approval_storage is None: + raise ValueError("ApprovalStorage 
is required to load approval request.") + mcp_call_content = await approval_storage.load_approval_request(mcp_req.id) + return _attach_extras( + Message(role="assistant", contents=[mcp_call_content]), + item, + ) + + if item.type == "mcp_approval_response": + mcp_resp = cast(models.MCPApprovalResponse, item) + if approval_storage is None: + raise ValueError("ApprovalStorage is required to load approval request.") + function_approval_request_content = await approval_storage.load_approval_request(mcp_resp.approval_request_id) + return _attach_extras( + Message( + role="user", + contents=[function_approval_request_content.to_function_approval_response(mcp_resp.approve)], + ), + item, + ) + + if item.type == "code_interpreter_call": + ci = cast(models.ItemCodeInterpreterToolCall, item) + return _attach_extras( + Message(role="assistant", contents=[Content.from_code_interpreter_tool_call(call_id=ci.id)]), + item, + ) + + if item.type == "image_generation_call": + ig = cast(models.ItemImageGenToolCall, item) + return _attach_extras( + Message(role="assistant", contents=[Content.from_image_generation_tool_call(image_id=ig.id)]), + item, + ) + + if item.type == "shell_call": + sc = cast(models.FunctionShellCallItemParam, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_shell_tool_call( + call_id=sc.call_id, + commands=sc.action.commands, + status=str(sc.status), + ) + ], + ), + item, + ) + + if item.type == "shell_call_output": + sco = cast(models.FunctionShellCallOutputItemParam, item) + outputs = [ + Content.from_shell_command_output( + stdout=out.stdout or "", + stderr=out.stderr or "", + exit_code=getattr(out.outcome, "exit_code", None) if hasattr(out, "outcome") else None, + ) + for out in (sco.output or []) + ] + return _attach_extras( + Message( + role="tool", + contents=[ + Content.from_shell_tool_result( + call_id=sco.call_id, + outputs=outputs, + max_output_length=sco.max_output_length, + ) + ], + ), + item, + ) + + if 
item.type == "local_shell_call": + lsc = cast(models.ItemLocalShellToolCall, item) + commands = lsc.action.command if hasattr(lsc.action, "command") and lsc.action.command else [] + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_shell_tool_call( + call_id=lsc.call_id, + commands=commands, + status=str(lsc.status), + ) + ], + ), + item, + ) + + if item.type == "local_shell_call_output": + lsco = cast(models.ItemLocalShellToolCallOutput, item) + return _attach_extras( + Message( + role="tool", + contents=[ + Content.from_shell_tool_result( + call_id=lsco.id, + outputs=[Content.from_shell_command_output(stdout=lsco.output)], + ) + ], + ), + item, + ) + + if item.type == "file_search_call": + fs = cast(models.ItemFileSearchToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_function_call( + fs.id, + "file_search", + arguments=json.dumps({"queries": fs.queries}), + ) + ], + ), + item, + ) + + if item.type == "web_search_call": + ws = cast(models.ItemWebSearchToolCall, item) + return _attach_extras( + Message(role="assistant", contents=[Content.from_function_call(ws.id, "web_search")]), + item, + ) + + if item.type == "computer_call": + cc = cast(models.ItemComputerToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_function_call( + cc.call_id, + "computer_use", + arguments=str(cc.action), + ) + ], + ), + item, + ) + + if item.type == "computer_call_output": + cco = cast(models.ComputerCallOutputItemParam, item) + return _attach_extras( + Message(role="tool", contents=[Content.from_function_result(cco.call_id, result=str(cco.output))]), + item, + ) + + if item.type == "custom_tool_call": + ct = cast(models.ItemCustomToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[Content.from_function_call(ct.call_id, ct.name, arguments=ct.input)], + ), + item, + ) + + if item.type == "custom_tool_call_output": + cto = 
cast(models.ItemCustomToolCallOutput, item) + output = cto.output if isinstance(cto.output, str) else str(cto.output) + # Hosted-MCP results land here because the host writes them via + # ``aoutput_item_custom_tool_call_output`` (see ``_to_outputs`` for + # ``mcp_server_tool_result``). The persisted ``call_id`` keeps its + # ``mcp_*`` prefix; on read, route those back to a hosted-MCP + # result Content so the chat-client serialize layer can coalesce + # them onto a single ``mcp_call`` input item with ``output`` + # populated. Issue #5546. + if cto.call_id and cto.call_id.startswith("mcp_"): + return _attach_extras( + Message( + role="tool", + contents=[Content.from_mcp_server_tool_result(call_id=cto.call_id, output=output)], + ), + item, + ) + return _attach_extras( + Message(role="tool", contents=[Content.from_function_result(cto.call_id, result=output)]), + item, + ) + + if item.type == "apply_patch_call": + ap = cast(models.ApplyPatchToolCallItemParam, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_function_call( + ap.call_id, + "apply_patch", + arguments=str(ap.operation), + ) + ], + ), + item, + ) + + if item.type == "apply_patch_call_output": + apo = cast(models.ApplyPatchToolCallOutputItemParam, item) + return _attach_extras( + Message(role="tool", contents=[Content.from_function_result(apo.call_id, result=apo.output or "")]), + item, + ) + + raise ValueError(f"Unsupported Item type: {item.type}") + + +# endregion + + +# region OutputItem → Message (output / history side) + + +async def _output_items_to_messages( + history: Sequence[models.OutputItem], + *, + approval_storage: ApprovalStorage | None = None, +) -> list[Message]: + """Convert a sequence of ``OutputItem`` SDK objects to framework ``Message`` objects. 
+ + This is the function the :class:`._history_provider.FoundryHostedAgentHistoryProvider` + calls to materialise stored Foundry response items into the message + history the agent will see on its next turn. + + Args: + history: The output items to convert, oldest-first. + + Keyword Args: + approval_storage: Optional approval storage. Required when the + history contains ``mcp_approval_request`` / + ``mcp_approval_response`` items so the original function-call + payload can be looked up. + + Returns: + A list of messages, one per supported item, in the same order. + """ + return [await _output_item_to_message(item, approval_storage=approval_storage) for item in history] + + +async def _output_item_to_message( + item: models.OutputItem, + *, + approval_storage: ApprovalStorage | None = None, +) -> Message: + """Convert a single ``OutputItem`` SDK object to a framework ``Message``. + + Wraps :func:`_output_item_to_message_inner` and stamps a + :data:`RAW_KEY` snapshot of the SDK item onto + ``Message.additional_properties[EXTRAS_KEY]`` so the write path can + re-emit byte-for-byte. See :func:`_capture_raw` for the rationale. + """ + return _capture_raw(await _output_item_to_message_inner(item, approval_storage=approval_storage), item) + + +async def _output_item_to_message_inner( + item: models.OutputItem, + *, + approval_storage: ApprovalStorage | None = None, +) -> Message: + """Convert a single ``OutputItem`` SDK object to a framework ``Message``. + + Variant table — keep in sync with :func:`_item_to_message` when both + sides exist for the same item kind. To add a new variant: + + 1. Add an ``if item.type == "...":`` branch here. + 2. Reference the corresponding ``models.OutputItemX`` type. + 3. Map its fields to :class:`agent_framework.Content` factory methods. 
+ + Variants currently **missing** from this dispatch (visible by scanning + ``models.OutputItem*`` and comparing against the branches below): + + * ``models.OutputItemCompactionBody`` — context compaction summaries + * ``models.OutputItemMcpListTools`` — MCP server ``list_tools`` results + * ``models.WorkflowActionOutputItem`` — workflow-channel actions + * Any tool-call variant produced by Azure-specific tools + (Azure Search, Bing Grounding, SharePoint, Fabric, OpenAPI, A2A, + browser automation, memory search, …) — the ``models.*ToolCall`` + / ``models.*ToolCallOutput`` family. + + Args: + item: The SDK item to convert. + + Keyword Args: + approval_storage: Optional approval storage. Required when the item is + an ``mcp_approval_request`` / ``mcp_approval_response``; ignored + otherwise. + + Returns: + The converted message, with any unknown extras round-tripped under + ``message.additional_properties[EXTRAS_KEY]``. + + Raises: + ValueError: If the SDK item ``type`` is not yet supported. 
+ """ + if item.type == "output_message": + output_msg = cast(models.OutputItemOutputMessage, item) + return _attach_extras( + Message( + role=output_msg.role, + contents=[_convert_output_message_content(part) for part in output_msg.content], + ), + item, + ) + + if item.type == "message": + msg = cast(models.OutputItemMessage, item) + return _attach_extras( + Message(role=msg.role, contents=[_convert_message_content(part) for part in msg.content]), + item, + ) + + if item.type == "function_call": + fc = cast(models.OutputItemFunctionToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[Content.from_function_call(fc.call_id, fc.name, arguments=fc.arguments)], + ), + item, + ) + + if item.type == "function_call_output": + fco = cast(models.FunctionCallOutputItemParam, item) + output = fco.output if isinstance(fco.output, str) else str(fco.output) + return _attach_extras( + Message(role="tool", contents=[Content.from_function_result(fco.call_id, result=output)]), + item, + ) + + if item.type == "reasoning": + reasoning = cast(models.OutputItemReasoningItem, item) + contents: list[Content] = [] + if reasoning.summary: + for summary in reasoning.summary: + contents.append(Content.from_text(summary.text)) + return _attach_extras(Message(role="assistant", contents=contents), item) + + if item.type == "mcp_call": + mcp = cast(models.OutputItemMcpToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_mcp_server_tool_call( + mcp.id, + mcp.name, + server_name=mcp.server_label, + arguments=mcp.arguments, + ) + ], + ), + item, + ) + + if item.type == "mcp_approval_request": + mcp_req = cast(models.OutputItemMcpApprovalRequest, item) + if approval_storage is None: + raise ValueError("ApprovalStorage is required to load approval request.") + function_approval_request_content = await approval_storage.load_approval_request(mcp_req.id) + return _attach_extras( + Message(role="assistant", 
contents=[function_approval_request_content]), + item, + ) + + if item.type == "mcp_approval_response": + mcp_resp = cast(models.OutputItemMcpApprovalResponseResource, item) + if approval_storage is None: + raise ValueError("ApprovalStorage is required to load approval request.") + function_approval_request_content = await approval_storage.load_approval_request(mcp_resp.approval_request_id) + return _attach_extras( + Message( + role="user", + contents=[function_approval_request_content.to_function_approval_response(mcp_resp.approve)], + ), + item, + ) + + if item.type == "code_interpreter_call": + ci = cast(models.OutputItemCodeInterpreterToolCall, item) + return _attach_extras( + Message(role="assistant", contents=[Content.from_code_interpreter_tool_call(call_id=ci.id)]), + item, + ) + + if item.type == "image_generation_call": + ig = cast(models.OutputItemImageGenToolCall, item) + return _attach_extras( + Message(role="assistant", contents=[Content.from_image_generation_tool_call(image_id=ig.id)]), + item, + ) + + if item.type == "shell_call": + sc = cast(models.OutputItemFunctionShellCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_shell_tool_call( + call_id=sc.call_id, + commands=sc.action.commands, + status=str(sc.status), + ) + ], + ), + item, + ) + + if item.type == "shell_call_output": + sco = cast(models.OutputItemFunctionShellCallOutput, item) + outputs = [ + Content.from_shell_command_output( + stdout=out.stdout or "", + stderr=out.stderr or "", + exit_code=getattr(out.outcome, "exit_code", None) if hasattr(out, "outcome") else None, + ) + for out in (sco.output or []) + ] + return _attach_extras( + Message( + role="tool", + contents=[ + Content.from_shell_tool_result( + call_id=sco.call_id, + outputs=outputs, + max_output_length=sco.max_output_length, + ) + ], + ), + item, + ) + + if item.type == "local_shell_call": + lsc = cast(models.OutputItemLocalShellToolCall, item) + commands = lsc.action.command if 
hasattr(lsc.action, "command") and lsc.action.command else [] + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_shell_tool_call( + call_id=lsc.call_id, + commands=commands, + status=str(lsc.status), + ) + ], + ), + item, + ) + + if item.type == "local_shell_call_output": + lsco = cast(models.OutputItemLocalShellToolCallOutput, item) + return _attach_extras( + Message( + role="tool", + contents=[ + Content.from_shell_tool_result( + call_id=lsco.id, + outputs=[Content.from_shell_command_output(stdout=lsco.output)], + ) + ], + ), + item, + ) + + if item.type == "file_search_call": + fs = cast(models.OutputItemFileSearchToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_function_call( + fs.id, + "file_search", + arguments=json.dumps({"queries": fs.queries}), + ) + ], + ), + item, + ) + + if item.type == "web_search_call": + ws = cast(models.OutputItemWebSearchToolCall, item) + return _attach_extras( + Message(role="assistant", contents=[Content.from_function_call(ws.id, "web_search")]), + item, + ) + + if item.type == "computer_call": + cc = cast(models.OutputItemComputerToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_function_call( + cc.call_id, + "computer_use", + arguments=str(cc.action), + ) + ], + ), + item, + ) + + if item.type == "computer_call_output": + cco = cast(models.OutputItemComputerToolCallOutputResource, item) + return _attach_extras( + Message(role="tool", contents=[Content.from_function_result(cco.call_id, result=str(cco.output))]), + item, + ) + + if item.type == "custom_tool_call": + ct = cast(models.OutputItemCustomToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[Content.from_function_call(ct.call_id, ct.name, arguments=ct.input)], + ), + item, + ) + + if item.type == "custom_tool_call_output": + cto = cast(models.OutputItemCustomToolCallOutput, item) + output = cto.output if 
isinstance(cto.output, str) else str(cto.output) + # Hosted-MCP results land here because the host writes them via + # ``aoutput_item_custom_tool_call_output``. Route ``mcp_*`` + # call_ids back to a hosted-MCP result Content so the chat-client + # serialize layer can coalesce onto the matching ``mcp_call`` + # input item. Issue #5546. + if cto.call_id and cto.call_id.startswith("mcp_"): + return _attach_extras( + Message( + role="tool", + contents=[Content.from_mcp_server_tool_result(call_id=cto.call_id, output=output)], + ), + item, + ) + return _attach_extras( + Message(role="tool", contents=[Content.from_function_result(cto.call_id, result=output)]), + item, + ) + + if item.type == "apply_patch_call": + ap = cast(models.OutputItemApplyPatchToolCall, item) + return _attach_extras( + Message( + role="assistant", + contents=[ + Content.from_function_call( + ap.call_id, + "apply_patch", + arguments=str(ap.operation), + ) + ], + ), + item, + ) + + if item.type == "apply_patch_call_output": + apo = cast(models.OutputItemApplyPatchToolCallOutput, item) + return _attach_extras( + Message(role="tool", contents=[Content.from_function_result(apo.call_id, result=apo.output or "")]), + item, + ) + + if item.type == "oauth_consent_request": + oauth = cast(models.OAuthConsentRequestOutputItem, item) + return _attach_extras( + Message(role="assistant", contents=[Content.from_oauth_consent_request(oauth.consent_link)]), + item, + ) + + if item.type == "structured_outputs": + so = cast(models.StructuredOutputsOutputItem, item) + text = json.dumps(so.output) if not isinstance(so.output, str) else so.output + return _attach_extras(Message(role="assistant", contents=[Content.from_text(text)]), item) + + raise ValueError(f"Unsupported OutputItem type: {item.type}") + + +# endregion + + +# region AF Message → SDK OutputItem (write path) + + +def _message_text(message: Message) -> str: + """Collapse a :class:`Message` into a single text blob. 
+ + The Foundry storage write path only persists the user-visible text — the + same compression the Responses runtime applies on its own write side. We + walk ``contents`` rather than relying on ``Message.text`` so we get a + consistent ordering and can drop non-text parts cleanly. + """ + chunks: list[str] = [] + for content in message.contents: + text = getattr(content, "text", None) + if isinstance(text, str) and text: + chunks.append(text) + if chunks: + return "".join(chunks) + # Fallback: surface ``Message.text`` if the framework knows how to + # render the contents (covers structured contents that synthesise text). + return message.text or "" + + +def _message_to_output_item(message: Message, item_id: str) -> models.OutputItem: + """Convert a single :class:`Message` to a Foundry SDK :class:`OutputItem`. + + Two-tier strategy: + + 1. **Lossless replay** — if the message carries a previously-captured + raw SDK snapshot under ``additional_properties[EXTRAS_KEY][RAW_KEY]`` + (set by :func:`_capture_raw` on the read path), rebuild the SDK + item from that snapshot via the model registry's discriminator + dispatch (:meth:`models.OutputItem._deserialize`). The snapshot's + ``id`` is rewritten to ``item_id`` so each write turn gets a + unique storage row, but every other declared field — content + variants (citations, reasoning, tool calls, function results, + …) AND any undeclared extras Foundry layered on top — survives + intact. This is the auditable round-trip the Foundry storage + backend relies on. + + 2. **Synthesise from text** — for messages constructed in user code + (no raw snapshot), fall back to the text-only path. ``assistant`` + maps to :class:`OutputItemOutputMessage` (output_text content, + ``status="completed"``); anything else maps to + :class:`OutputItemMessage` with the role normalised onto the + enum's three accepted values (``user`` / ``system`` / + ``developer`` — ``tool`` collapses to ``user`` because the + discriminator forbids it). 
+ + In both branches: + + * ``additional_properties[EXTRAS_KEY]`` extras other than the raw + snapshot are layered onto the emitted model via + :func:`_inject_extras` so message-level Foundry annotations + round-trip. + * **Every other ``additional_properties`` namespace** (notably the + Hosting spec's ``hosting`` envelope — channel, identity, + response_target, initial-write ``deliveries[]`` — plus any future + AF namespaces) is funneled into a single + :data:`AF_EXTRAS_KEY` container key on the SDK item via + :func:`_inject_af_extras`. Foundry storage round-trips that key + as opaque JSON, and :func:`_attach_extras` peels each sub-key + back onto its original namespace on load. This is what makes the + audit/replay envelope from the Hosting spec durable across + Foundry-storage save/load cycles. + """ + extras_raw: Any = (message.additional_properties or {}).get(EXTRAS_KEY) or {} + extras: dict[str, Any] = dict(cast("Mapping[str, Any]", extras_raw)) if isinstance(extras_raw, Mapping) else {} + raw_snapshot: Any = extras.get(RAW_KEY) + af_extras = _collect_af_extras(message) + + if isinstance(raw_snapshot, Mapping): + # ``_deserialize`` does discriminator dispatch and tolerates + # extras-bearing mappings; bypassing it (constructing the + # concrete class directly) would lose the discriminator wiring + # and break round-trip for tool-call / reasoning / ... variants. + snapshot: dict[str, Any] = dict(cast("Mapping[str, Any]", raw_snapshot)) + snapshot["id"] = item_id + deserialize = cast(Any, models.OutputItem)._deserialize + item = cast("models.OutputItem", deserialize(snapshot, [])) + return cast( + "models.OutputItem", + _inject_af_extras(_inject_extras(item, extras), af_extras), + ) + + text = _message_text(message) + # ``Message.role`` is an unconstrained ``str | enum`` slot — the + # framework keeps whatever the constructor was handed (str literals + # round-trip as ``str``; converters that pass the SDK's + # ``MessageRole`` enum store the enum). 
Normalise to the enum's + # ``value`` (or the bare string) so we don't end up writing + # ``"MessageRole.USER"`` to storage. + role_str = getattr(message.role, "value", message.role) + + # Construct via the mapping overload — the SDK's keyword overload tags + # ``content`` with the abstract base type and rejects our concrete list. + if role_str == "assistant": + item = models.OutputItemOutputMessage({ + "id": item_id, + "type": "output_message", + "role": "assistant", + "status": "completed", + "content": [ + {"type": "output_text", "text": text, "annotations": [], "logprobs": []}, + ], + }) + else: + # OutputItemMessage's role enum admits "user" / "system" / + # "developer". Anything outside that set (e.g. "tool") collapses to + # "user" so we don't crash on the SDK's discriminator validation. + role_value = role_str if role_str in ("user", "system", "developer") else "user" + item = models.OutputItemMessage({ + "id": item_id, + "type": "message", + "role": role_value, + "status": "completed", + "content": [ + {"type": "input_text", "text": text}, + ], + }) + return cast("models.OutputItem", _inject_af_extras(_inject_extras(item, extras), af_extras)) + + +def _messages_to_output_items(messages: Sequence[Message], *, id_prefix: str) -> list[models.OutputItem]: + """Convert a batch of messages to Foundry SDK items with stable IDs. + + Each message gets a deterministic id of the form ``{id_prefix}_itm_{i}``. + Callers (typically :meth:`FoundryHostedAgentHistoryProvider.save_messages`) + derive ``id_prefix`` from the response id they're persisting under so + the per-item ids are unique across a conversation. 
+ """ + return [_message_to_output_item(msg, f"{id_prefix}_itm_{i}") for i, msg in enumerate(messages)] + + +# endregion diff --git a/python/packages/foundry_hosting/tests/test_history_provider.py b/python/packages/foundry_hosting/tests/test_history_provider.py new file mode 100644 index 00000000000..cfdbeccacb6 --- /dev/null +++ b/python/packages/foundry_hosting/tests/test_history_provider.py @@ -0,0 +1,1435 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Unit tests for FoundryHostedAgentHistoryProvider.""" + +from __future__ import annotations + +import os +import time +from collections.abc import Iterable +from typing import Any +from unittest.mock import AsyncMock, MagicMock + +import pytest +from agent_framework import Content, HistoryProvider, Message +from azure.ai.agentserver.responses import ( + FoundryStorageProvider, + InMemoryResponseProvider, + IsolationContext, +) +from azure.ai.agentserver.responses.models import ( + OutputItem, + OutputItemOutputMessage, + OutputMessageContentOutputTextContent, +) +from azure.ai.agentserver.responses.store._foundry_errors import ( # pyright: ignore[reportPrivateUsage] + FoundryBadRequestError, +) + +from agent_framework_foundry_hosting import FoundryHostedAgentHistoryProvider +from agent_framework_foundry_hosting._history_provider import ( # pyright: ignore[reportPrivateUsage] + get_current_isolation, + reset_current_isolation, + set_current_isolation, +) + + +def _with_backend(prov: FoundryHostedAgentHistoryProvider, backend: Any) -> FoundryHostedAgentHistoryProvider: + """Inject a fake backend into ``prov`` so ``_resolve_backend`` returns it. + + Replaces the old ``backend=`` constructor parameter that was removed + when the dual-backend model was collapsed onto ``FoundryStorageProvider``. 
+ """ + prov._backend = backend # pyright: ignore[reportPrivateUsage] + return prov + + +# region Helpers + + +def _make_text_item(item_id: str, text: str) -> OutputItemOutputMessage: + return OutputItemOutputMessage( + id=item_id, + type="output_message", + role="assistant", + status="completed", + content=[OutputMessageContentOutputTextContent(type="output_text", text=text, annotations=[])], + ) + + +def _make_fake_backend( + *, + history_ids: list[str] | None = None, + items: list[OutputItem | None] | None = None, +) -> MagicMock: + """Build a MagicMock matching the _StorageBackend protocol.""" + backend = MagicMock() + + async def _ids(*args: Any, **kwargs: Any) -> list[str]: + return list(history_ids or []) + + async def _items(item_ids: Iterable[str], *, isolation: IsolationContext | None = None) -> list[OutputItem | None]: + return list(items or []) + + backend.get_history_item_ids = AsyncMock(side_effect=_ids) + backend.get_items = AsyncMock(side_effect=_items) + backend.create_response = AsyncMock() + return backend + + +class _FakeAccessToken: + def __init__(self, token: str, *, expires_in: float = 3600.0) -> None: + self.token = token + self.expires_on = int(time.time() + expires_in) + + +class _FakeCredential: + """Minimal AsyncTokenCredential stand-in.""" + + def __init__(self, *, token: str = "fake-token", expires_in: float = 3600.0) -> None: + self._token = token + self._expires_in = expires_in + self.calls: list[tuple[str, ...]] = [] + + async def get_token(self, *scopes: str) -> _FakeAccessToken: + self.calls.append(scopes) + return _FakeAccessToken(self._token, expires_in=self._expires_in) + + +# region Construction + + +class TestConstruction: + """Constructor + class-level invariants.""" + + def test_defaults(self) -> None: + prov = _with_backend(FoundryHostedAgentHistoryProvider(), _make_fake_backend()) + assert isinstance(prov, HistoryProvider) + assert prov.source_id == FoundryHostedAgentHistoryProvider.DEFAULT_SOURCE_ID + assert 
prov.store_inputs is True + assert prov.store_outputs is True + assert prov.load_messages is True + + def test_is_hosted_environment_reads_env(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + assert FoundryHostedAgentHistoryProvider.is_hosted_environment() is False + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") + assert FoundryHostedAgentHistoryProvider.is_hosted_environment() is True + + def test_endpoint_falls_back_to_env(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("FOUNDRY_PROJECT_ENDPOINT", "https://example.foundry.azure.com") + prov = _with_backend(FoundryHostedAgentHistoryProvider(), _make_fake_backend()) + assert prov._endpoint == "https://example.foundry.azure.com" # pyright: ignore[reportPrivateUsage] + + +# region Backend resolution + + +class TestBackendResolution: + """Lazy backend construction + local fallback.""" + + def test_uses_explicit_backend(self) -> None: + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + assert prov._resolve_backend() is backend # pyright: ignore[reportPrivateUsage] + + def test_local_fallback_when_not_hosted(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + prov = FoundryHostedAgentHistoryProvider() + resolved = prov._resolve_backend() # pyright: ignore[reportPrivateUsage] + assert isinstance(resolved, InMemoryResponseProvider) + # Cached on subsequent calls. 
+ assert prov._resolve_backend() is resolved # pyright: ignore[reportPrivateUsage] + + def test_hosted_without_credential_raises(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") + monkeypatch.setenv("FOUNDRY_PROJECT_ENDPOINT", "https://x.foundry.azure.com") + prov = FoundryHostedAgentHistoryProvider() + with pytest.raises(RuntimeError, match="requires an async credential"): + prov._resolve_backend() # pyright: ignore[reportPrivateUsage] + + def test_hosted_without_endpoint_raises(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") + monkeypatch.delenv("FOUNDRY_PROJECT_ENDPOINT", raising=False) + prov = FoundryHostedAgentHistoryProvider(credential=_FakeCredential()) # type: ignore[arg-type] + with pytest.raises(RuntimeError, match="needs a Foundry project endpoint"): + prov._resolve_backend() # pyright: ignore[reportPrivateUsage] + + def test_hosted_builds_http_backend(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") + monkeypatch.setenv("FOUNDRY_PROJECT_ENDPOINT", "https://x.foundry.azure.com") + prov = FoundryHostedAgentHistoryProvider(credential=_FakeCredential()) # type: ignore[arg-type] + resolved = prov._resolve_backend() # pyright: ignore[reportPrivateUsage] + assert isinstance(resolved, FoundryStorageProvider) + + +# region get_messages + + +class TestGetMessages: + async def test_no_session_id_returns_empty(self) -> None: + backend = _make_fake_backend(history_ids=["x"], items=[_make_text_item("x", "hi")]) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + assert await prov.get_messages(None) == [] + assert await prov.get_messages("") == [] + backend.get_history_item_ids.assert_not_called() + + async def test_no_history_returns_empty(self) -> None: + backend = _make_fake_backend(history_ids=[]) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + assert 
await prov.get_messages("resp_123") == [] + backend.get_items.assert_not_called() + + async def test_loads_and_converts(self) -> None: + items: list[OutputItem | None] = [_make_text_item("itm_1", "hello"), _make_text_item("itm_2", "world")] + backend = _make_fake_backend(history_ids=["itm_1", "itm_2"], items=items) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + + messages = await prov.get_messages("resp_123") + assert len(messages) == 2 + assert all(isinstance(m, Message) for m in messages) + assert messages[0].text == "hello" + assert messages[1].text == "world" + + backend.get_history_item_ids.assert_awaited_once() + call = backend.get_history_item_ids.await_args + assert call.args[0] == "resp_123" + assert call.args[1] is None # conversation_id + assert call.args[2] == 100 # default history_limit + + async def test_drops_missing_items(self) -> None: + backend = _make_fake_backend( + history_ids=["a", "b", "c"], + items=[_make_text_item("a", "first"), None, _make_text_item("c", "third")], + ) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + messages = await prov.get_messages("resp_x") + assert [m.text for m in messages] == ["first", "third"] + + async def test_history_limit_propagates(self) -> None: + backend = _make_fake_backend(history_ids=[]) + prov = _with_backend(FoundryHostedAgentHistoryProvider(history_limit=7), backend) + # ``resp_*``-shaped session anchors directly; we expect a single + # backend call carrying the configured limit. + await prov.get_messages("resp_s") + assert backend.get_history_item_ids.await_count == 1 + assert backend.get_history_item_ids.await_args.args[2] == 7 + + async def test_non_resp_session_skips_storage_probe(self) -> None: + """Non-``resp_*`` session ids (e.g. opaque chat-isolation keys) + are not valid storage anchors — the provider must skip the + backend probe entirely so we don't hit "Malformed identifier" + HTTP 400s, returning an empty history instead. 
+ """ + backend = _make_fake_backend(history_ids=[]) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + messages = await prov.get_messages("5leZSsJ3m1UtB-JW3m3iowFd5_zqP30SE0MmGUEkcGQ") + assert messages == [] + backend.get_history_item_ids.assert_not_awaited() + + async def test_resp_probe_tolerates_400(self) -> None: + """A 400 on the storage probe must not abort ``get_messages`` — + the provider falls through to an empty history.""" + backend = _make_fake_backend() + backend.get_history_item_ids.side_effect = FoundryBadRequestError("malformed", response_body=None) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + messages = await prov.get_messages("resp_x") + assert messages == [] + + +# region IsolationContext + + +class TestIsolationContext: + async def test_explicit_isolation_kwarg_wins(self) -> None: + backend = _make_fake_backend(history_ids=[]) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + explicit = IsolationContext(user_key="u-explicit", chat_key="c-explicit") + await prov.get_messages("resp_s", isolation=explicit) + assert backend.get_history_item_ids.await_args.kwargs["isolation"] is explicit + + async def test_contextvar_picked_up(self) -> None: + backend = _make_fake_backend(history_ids=["a"], items=[_make_text_item("a", "x")]) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + ctx = IsolationContext(user_key="u-1", chat_key="c-1") + token = set_current_isolation(ctx) + try: + assert get_current_isolation() is ctx + await prov.get_messages("resp_s") + finally: + reset_current_isolation(token) + assert backend.get_history_item_ids.await_args.kwargs["isolation"] is ctx + assert backend.get_items.await_args.kwargs["isolation"] is ctx + + async def test_no_isolation_when_unset(self) -> None: + backend = _make_fake_backend(history_ids=[]) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + await prov.get_messages("resp_s") + assert 
backend.get_history_item_ids.await_args.kwargs["isolation"] is None + + async def test_host_isolation_keys_picked_up(self) -> None: + """The host's ASGI middleware lifts the + ``x-agent-{user,chat}-isolation-key`` headers into a contextvar + exposed by ``agent_framework_hosting``. The provider lifts that + into its own ``IsolationContext`` so the storage call carries + the platform partition keys without channels having to forward + anything (or even know the headers exist).""" + pytest.importorskip("agent_framework_hosting") + from agent_framework_hosting import ( + IsolationKeys, + reset_current_isolation_keys, + set_current_isolation_keys, + ) + + backend = _make_fake_backend(history_ids=["a"], items=[_make_text_item("a", "x")]) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + token = set_current_isolation_keys(IsolationKeys(user_key="u-3", chat_key="c-3")) + try: + await prov.get_messages("resp_s") + finally: + reset_current_isolation_keys(token) + applied = backend.get_history_item_ids.await_args.kwargs["isolation"] + assert applied is not None + assert applied.user_key == "u-3" + assert applied.chat_key == "c-3" + + +# region save_messages + + +class TestSaveMessages: + async def test_save_messages_writes_to_backend_when_bound(self) -> None: + """``save_messages`` writes a ``create_response`` envelope using + the host-bound response_id when present. + + The host's ``_bind_request_context`` plumbs the channel-minted + ``response_id`` (and prior turn's ``previous_response_id``) into + the provider via :func:`bind_request_context`, so the channel + envelope and the storage write share a single id per turn — + which is what makes the next turn's ``previous_response_id`` + walkable. 
+ """ + from agent_framework_foundry_hosting import bind_request_context + + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + msg = Message(role="assistant", contents=[Content.from_text("hello")]) + with bind_request_context(response_id="resp_bound_1", previous_response_id=None): + await prov.save_messages("session-x", [msg]) + + backend.create_response.assert_awaited_once() + call = backend.create_response.await_args + response = call.args[0] + assert response.id == "resp_bound_1" + # Conversation is intentionally omitted — Foundry isolation + # headers handle partitioning; cross-turn chaining is via the + # response-id chain only. + assert response.conversation is None + # Assistant outputs go on ``response.output``, not ``input_items`` + # — mirrors the agentserver runtime split (see + # ``_resolve_input_items_for_persistence``). + assert call.kwargs["input_items"] == [] + output = response.output or [] + assert len(output) == 1 + assert output[0]["type"] == "output_message" + + async def test_save_messages_falls_back_to_session_id_when_unbound(self) -> None: + """Without a host binding (e.g. local dev), ``save_messages`` + mints a fresh ``resp_*`` envelope and only chains when the + ``session_id`` is itself ``resp_*``-shaped.""" + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + msg = Message(role="user", contents=[Content.from_text("hi")]) + await prov.save_messages("resp_prev", [msg]) + + backend.create_response.assert_awaited_once() + call = backend.create_response.await_args + response = call.args[0] + assert response.id.startswith("caresp_") + # Provider walked the prior chain to seed history_item_ids; the + # fake backend returns ``[]`` so this stays empty but the call + # was made. 
+ assert backend.get_history_item_ids.await_count == 1 + assert backend.get_history_item_ids.await_args.args[0] == "resp_prev" + + async def test_save_messages_empty_short_circuits(self) -> None: + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + await prov.save_messages("s", []) + backend.create_response.assert_not_called() + + async def test_save_messages_no_session_short_circuits(self) -> None: + """No session id and no host binding → nothing to anchor against, + skip the write.""" + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + await prov.save_messages(None, [Message(role="user", contents=[Content.from_text("hi")])]) + backend.create_response.assert_not_called() + + async def test_save_messages_swallows_storage_errors(self) -> None: + """Persistence is best-effort for *Foundry storage* failures. + + Storage-validation rejections, opaque 5xx, etc. should be + swallowed (the agent run already produced output and the + caller can't recover from a chain-write failure mid-stream). + Counter is bumped for observability. + """ + backend = _make_fake_backend() + backend.create_response.side_effect = FoundryBadRequestError( + "simulated invalid_payload", + response_body={"error": {"code": "invalid_payload"}}, + ) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + # Must not raise. + await prov.save_messages("resp_session_x", [Message(role="user", contents=[Content.from_text("hi")])]) + backend.create_response.assert_awaited_once() + assert prov.failed_writes == 1 + + async def test_save_messages_propagates_non_storage_errors(self) -> None: + """Network / auth / payload-builder bugs MUST surface to the caller. + + Anything that's not a ``FoundryStorageError`` — connection + resets, expired credential 401/403s, ``AttributeError`` from a + regression in the wire-payload builder — propagates so the + caller can retry / alert. 
Counter is NOT bumped for these. + """ + backend = _make_fake_backend() + backend.create_response.side_effect = ConnectionError("simulated network failure") + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + with pytest.raises(ConnectionError, match="simulated network failure"): + await prov.save_messages( + "resp_session_x", + [Message(role="user", contents=[Content.from_text("hi")])], + ) + assert prov.failed_writes == 0 + + async def test_save_then_get_round_trip_via_in_memory_backend(self) -> None: + """End-to-end save→get round-trip through ``InMemoryResponseProvider``. + + Mirrors the host-bound multi-turn flow: turn 1 binds a fresh + response id; turn 2 binds a new response id with the prior id + as ``previous_response_id``. ``get_messages`` on turn 2 is + called with the prior anchor and must return both turns. + """ + from agent_framework_foundry_hosting import bind_request_context + + backend = InMemoryResponseProvider() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + + with bind_request_context(response_id="resp_turn1", previous_response_id=None): + await prov.save_messages( + "resp_turn1", + [Message(role="user", contents=[Content.from_text("ping")])], + ) + + with bind_request_context(response_id="resp_turn2", previous_response_id="resp_turn1"): + history = await prov.get_messages("resp_turn1") + assert [m.text for m in history] == ["ping"] + await prov.save_messages( + "resp_turn2", + [Message(role="assistant", contents=[Content.from_text("pong")])], + ) + + # Final read for turn 3: walking turn 2 must reveal both turns. 
+ with bind_request_context(response_id="resp_turn3", previous_response_id="resp_turn2"): + messages = await prov.get_messages("resp_turn2") + assert [m.text for m in messages] == ["ping", "pong"] + roles = [getattr(m.role, "value", m.role) for m in messages] + assert roles == ["user", "assistant"] + + +# region aclose + + +class TestAclose: + async def test_closes_backend_with_aclose(self) -> None: + # Provider always closes whatever backend is currently bound; + # the dual-mode (external vs owned) distinction was dropped + # along with the ``backend=`` constructor param. + backend = _make_fake_backend() + backend.aclose = AsyncMock() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + prov._resolve_backend() # pyright: ignore[reportPrivateUsage] + await prov.aclose() + backend.aclose.assert_awaited_once() + + async def test_aclose_idempotent(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + prov = FoundryHostedAgentHistoryProvider() + prov._resolve_backend() # pyright: ignore[reportPrivateUsage] + await prov.aclose() + await prov.aclose() # idempotent — second call is a no-op + + +# region Local file storage option + + +class TestLocalFileStorage: + """`local_storage_root` swaps the in-memory local fallback for a + per-isolation :class:`FileHistoryProvider` so dev runs persist + across process restarts.""" + + async def test_unset_keeps_in_memory_fallback(self, monkeypatch: pytest.MonkeyPatch, tmp_path: Any) -> None: + monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + prov = FoundryHostedAgentHistoryProvider() + assert prov._resolve_local_file_provider(None) is None # pyright: ignore[reportPrivateUsage] + assert isinstance( + prov._resolve_backend(), # pyright: ignore[reportPrivateUsage] + InMemoryResponseProvider, + ) + + async def test_creates_per_isolation_provider(self, monkeypatch: pytest.MonkeyPatch, tmp_path: Any) -> None: + 
monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + prov = FoundryHostedAgentHistoryProvider(local_storage_root=tmp_path) + iso = IsolationContext(user_key="alice", chat_key="chat-1") + + fp = prov._resolve_local_file_provider(iso) # pyright: ignore[reportPrivateUsage] + assert fp is not None + # Cached on subsequent calls for the same (user, chat). + assert prov._resolve_local_file_provider(iso) is fp # pyright: ignore[reportPrivateUsage] + # Different isolation → different provider rooted at a different dir. + other = prov._resolve_local_file_provider( # pyright: ignore[reportPrivateUsage] + IsolationContext(user_key="bob", chat_key="chat-1"), + ) + assert other is not None and other is not fp + assert fp.storage_path != other.storage_path + assert fp.storage_path == (tmp_path / "alice" / "chat-1").resolve() + + async def test_missing_isolation_uses_sentinel_dir(self, monkeypatch: pytest.MonkeyPatch, tmp_path: Any) -> None: + monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + prov = FoundryHostedAgentHistoryProvider(local_storage_root=tmp_path) + fp = prov._resolve_local_file_provider(None) # pyright: ignore[reportPrivateUsage] + assert fp is not None + assert fp.storage_path == (tmp_path / "~none" / "~none").resolve() + + async def test_unsafe_isolation_segments_are_encoded(self, monkeypatch: pytest.MonkeyPatch, tmp_path: Any) -> None: + monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + prov = FoundryHostedAgentHistoryProvider(local_storage_root=tmp_path) + iso = IsolationContext(user_key="../escape", chat_key="ok-chat") + fp = prov._resolve_local_file_provider(iso) # pyright: ignore[reportPrivateUsage] + assert fp is not None + # Encoded segment never contains a ``/`` and never escapes the root. + assert fp.storage_path.is_relative_to(tmp_path.resolve()) + assert "../" not in str(fp.storage_path) + # Encoded segments use the reserved ``~iso-`` prefix. 
+ parts = fp.storage_path.relative_to(tmp_path.resolve()).parts + assert parts[0].startswith("~iso-") + assert parts[1] == "ok-chat" + + async def test_hosted_mode_ignores_local_storage_root( + self, monkeypatch: pytest.MonkeyPatch, tmp_path: Any, caplog: pytest.LogCaptureFixture + ) -> None: + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") + with caplog.at_level("INFO", logger="agent_framework_foundry_hosting._history_provider"): + prov = FoundryHostedAgentHistoryProvider(local_storage_root=tmp_path) + # File provider is never resolved when hosted. + assert prov._resolve_local_file_provider(None) is None # pyright: ignore[reportPrivateUsage] + assert any("ignored local_storage_root" in record.message for record in caplog.records) + + async def test_get_and_save_round_trip_via_file(self, monkeypatch: pytest.MonkeyPatch, tmp_path: Any) -> None: + monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + prov = FoundryHostedAgentHistoryProvider(local_storage_root=tmp_path) + iso = IsolationContext(user_key="alice", chat_key="chat-1") + + msgs = [ + Message(role="user", contents=["hello"]), + Message(role="assistant", contents=["hi back"]), + ] + await prov.save_messages("conv-1", msgs, isolation=iso) + + # File exists at the expected nested path with session_id as stem. + expected_path = tmp_path / "alice" / "chat-1" / "conv-1.jsonl" + assert expected_path.exists() + # Two JSONL records (one per message). + assert len([line for line in expected_path.read_text().splitlines() if line.strip()]) == 2 + + loaded = await prov.get_messages("conv-1", isolation=iso) + assert [m.text for m in loaded] == ["hello", "hi back"] + + # Different isolation → different file → independent history. 
+ bob_loaded = await prov.get_messages( + "conv-1", + isolation=IsolationContext(user_key="bob", chat_key="chat-1"), + ) + assert bob_loaded == [] + + async def test_session_id_with_special_chars_is_sanitised_by_file_provider( + self, monkeypatch: pytest.MonkeyPatch, tmp_path: Any + ) -> None: + # The wrapper passes ``session_id`` through unchanged; the + # delegate ``FileHistoryProvider`` is responsible for sanitising + # it. This test just confirms the delegation works for a + # non-trivial id without raising. + monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + prov = FoundryHostedAgentHistoryProvider(local_storage_root=tmp_path) + msgs = [Message(role="user", contents=["hi"])] + await prov.save_messages("conv:with:colons", msgs) + loaded = await prov.get_messages("conv:with:colons") + assert [m.text for m in loaded] == ["hi"] + + async def test_aclose_clears_file_provider_cache(self, monkeypatch: pytest.MonkeyPatch, tmp_path: Any) -> None: + monkeypatch.delenv("FOUNDRY_HOSTING_ENVIRONMENT", raising=False) + prov = FoundryHostedAgentHistoryProvider(local_storage_root=tmp_path) + prov._resolve_local_file_provider(IsolationContext(user_key="alice")) # pyright: ignore[reportPrivateUsage] + assert prov._file_providers # pyright: ignore[reportPrivateUsage] + await prov.aclose() + assert not prov._file_providers # pyright: ignore[reportPrivateUsage] + + +# region Foundry id helpers (`_ids.py`) + + +class TestFoundryIdHelpers: + """Cover the public ``_ids`` re-exports so SDK ``IdGenerator`` + contract changes surface in unit tests rather than as opaque + HTTP 500 ``server_error`` from Foundry storage at runtime.""" + + def test_foundry_response_id_carries_partition_key(self) -> None: + """A minted ``caresp_*`` id must embed an 18-char partition key. + + Free-form ``resp_`` ids carry no parseable partition key + and Foundry storage rejects writes with HTTP 500. 
+ """ + from agent_framework_foundry_hosting import foundry_response_id + + new_id = foundry_response_id() + assert new_id.startswith("caresp_") + # ``caresp_`` (7) + 18-char partition key + 32-char entropy = 57. + # The legacy 48-char body variant is also accepted by storage, + # so just check the lower bound. + assert len(new_id) >= 7 + 18 + 32 - 8 + + def test_foundry_response_id_reuses_previous_partition_key(self) -> None: + """Chained writes co-locate by reusing the prior partition key. + + Foundry storage rejects chained writes whose new record sits in + a different partition than the prior one. Passing a ``caresp_*`` + ``previous_response_id`` should produce a new id whose partition + segment matches. + """ + from agent_framework_foundry_hosting import foundry_response_id + + prior = foundry_response_id() + # Partition key = 18 chars after the ``caresp_`` prefix. + prior_partition = prior[len("caresp_") : len("caresp_") + 18] + chained = foundry_response_id(prior) + assert chained.startswith("caresp_") + assert chained != prior + assert chained[len("caresp_") : len("caresp_") + 18] == prior_partition + + def test_foundry_response_id_factory_returns_callable(self) -> None: + """The factory wrapper used by ``ResponsesChannel`` must + delegate to :func:`foundry_response_id` so chained turns can + seed the partition key from ``previous_response_id``.""" + from agent_framework_foundry_hosting import ( + foundry_response_id, + foundry_response_id_factory, + ) + + factory = foundry_response_id_factory() + assert factory is foundry_response_id + + def test_foundry_item_id_for_known_input_type(self) -> None: + """Recognised ``Item`` types get a typed prefix and a + partition-key hint matching the response id when supplied.""" + from azure.ai.agentserver.responses.models import ( + ItemMessage, + MessageContentInputTextContent, + ) + + from agent_framework_foundry_hosting import foundry_item_id, foundry_response_id + + response_id = foundry_response_id() + partition = 
response_id[len("caresp_") : len("caresp_") + 18] + item = ItemMessage( + type="message", + role="user", + content=[MessageContentInputTextContent(type="input_text", text="hi")], + ) + new_id = foundry_item_id(item, response_id) + assert new_id is not None + # ``msg_*`` is what ``IdGenerator.new_message_item_id`` mints. + assert new_id.startswith("msg_") + assert partition in new_id + + def test_foundry_item_id_returns_none_for_unknown_type(self) -> None: + """Reference-only / unrecognised types must return ``None`` + per the SDK helper's contract — callers (e.g. + ``save_messages``'s id-stamping loop) skip these so storage + only receives ids it can parse.""" + from agent_framework_foundry_hosting import foundry_item_id + + class _UnknownItem: + pass + + assert foundry_item_id(_UnknownItem()) is None + + +# region Wire payload stamping (`save_messages`) + + +class TestSaveMessagesWirePayload: + """Storage rejects ``create_response`` payloads that omit fields + flagged as REQUIRED in ``ResponseObject`` (``parallel_tool_calls``, + ``instructions``, ``background``) or that leak extras the validator + refuses (``conversation``, ``model=None``, …). Any regression that + drops one of these silently breaks every hosted deploy with an + opaque 4xx; cover them here so the test suite catches it first.""" + + async def test_envelope_includes_required_storage_fields( + self, + monkeypatch: pytest.MonkeyPatch, + ) -> None: + """``background``, ``parallel_tool_calls``, ``instructions``, + and ``agent_reference`` MUST be present on every stamped + envelope; storage returns HTTP 400 ``invalid_payload`` if any + of them is missing.""" + from agent_framework_foundry_hosting import bind_request_context + + # Strip env so the defaults are exercised cleanly. 
+ for var in ( + "FOUNDRY_AGENT_NAME", + "FOUNDRY_AGENT_VERSION", + "FOUNDRY_AGENT_SESSION_ID", + "MODEL_DEPLOYMENT_NAME", + "AZURE_AI_MODEL_DEPLOYMENT_NAME", + ): + monkeypatch.delenv(var, raising=False) + + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + with bind_request_context(response_id="resp_envelope_1", previous_response_id=None): + await prov.save_messages( + "session-x", + [Message(role="assistant", contents=[Content.from_text("hi")])], + ) + + backend.create_response.assert_awaited_once() + response = backend.create_response.await_args.args[0] + body = response.as_dict() + + # Required-by-storage fields. + assert body["background"] is False + assert body["parallel_tool_calls"] is False + assert body["instructions"] == "" + assert body["agent_reference"] == { + "type": "agent_reference", + "name": "agent-framework-host", + } + + async def test_envelope_omits_optional_fields_when_env_unset( + self, + monkeypatch: pytest.MonkeyPatch, + ) -> None: + """``model``, ``agent_session_id``, and the ``version`` slot of + ``agent_reference`` are omitted (NOT stamped as ``None``) when + their env vars are unset — storage rejects ``model: null``.""" + from agent_framework_foundry_hosting import bind_request_context + + for var in ( + "FOUNDRY_AGENT_NAME", + "FOUNDRY_AGENT_VERSION", + "FOUNDRY_AGENT_SESSION_ID", + "MODEL_DEPLOYMENT_NAME", + "AZURE_AI_MODEL_DEPLOYMENT_NAME", + ): + monkeypatch.delenv(var, raising=False) + + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + with bind_request_context(response_id="resp_omit_1", previous_response_id=None): + await prov.save_messages( + "session-x", + [Message(role="assistant", contents=[Content.from_text("hi")])], + ) + + body = backend.create_response.await_args.args[0].as_dict() + # Either entirely absent or explicitly None — assert the field + # was NOT stamped to a non-None value. 
+ assert body.get("model") is None + assert body.get("agent_session_id") is None + # ``version`` slot inside agent_reference is omitted entirely + # (the key is absent, not set to None) when the env var is unset. + assert "version" not in body["agent_reference"] + + async def test_envelope_picks_up_env_vars(self, monkeypatch: pytest.MonkeyPatch) -> None: + """When the platform-set env vars are present they MUST land on + the envelope: ``FOUNDRY_AGENT_NAME`` / ``FOUNDRY_AGENT_VERSION`` + feed ``agent_reference``, ``FOUNDRY_AGENT_SESSION_ID`` feeds + ``agent_session_id``, and ``MODEL_DEPLOYMENT_NAME`` feeds + ``model``.""" + from agent_framework_foundry_hosting import bind_request_context + + monkeypatch.setenv("FOUNDRY_AGENT_NAME", "concierge") + monkeypatch.setenv("FOUNDRY_AGENT_VERSION", "v3") + monkeypatch.setenv("FOUNDRY_AGENT_SESSION_ID", "caresp_envsessionABCDEF") + monkeypatch.setenv("MODEL_DEPLOYMENT_NAME", "gpt-4o-mini-prod") + monkeypatch.delenv("AZURE_AI_MODEL_DEPLOYMENT_NAME", raising=False) + + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + with bind_request_context(response_id="resp_env_1", previous_response_id=None): + await prov.save_messages( + "session-x", + [Message(role="assistant", contents=[Content.from_text("hi")])], + ) + + body = backend.create_response.await_args.args[0].as_dict() + assert body["agent_reference"] == { + "type": "agent_reference", + "name": "concierge", + "version": "v3", + } + assert body["agent_session_id"] == "caresp_envsessionABCDEF" + assert body["model"] == "gpt-4o-mini-prod" + + async def test_envelope_falls_back_to_local_dev_model_var( + self, + monkeypatch: pytest.MonkeyPatch, + ) -> None: + """Local dev sets ``AZURE_AI_MODEL_DEPLOYMENT_NAME`` rather than + the platform-only ``MODEL_DEPLOYMENT_NAME``; the latter wins + when both are present, the former fills in when only it is.""" + from agent_framework_foundry_hosting import bind_request_context + + 
monkeypatch.delenv("MODEL_DEPLOYMENT_NAME", raising=False) + monkeypatch.setenv("AZURE_AI_MODEL_DEPLOYMENT_NAME", "gpt-4o-mini-dev") + for var in ("FOUNDRY_AGENT_NAME", "FOUNDRY_AGENT_VERSION", "FOUNDRY_AGENT_SESSION_ID"): + monkeypatch.delenv(var, raising=False) + + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + with bind_request_context(response_id="resp_devmodel_1", previous_response_id=None): + await prov.save_messages( + "session-x", + [Message(role="assistant", contents=[Content.from_text("hi")])], + ) + + body = backend.create_response.await_args.args[0].as_dict() + assert body["model"] == "gpt-4o-mini-dev" + + +# region FOUNDRY_AGENT_SESSION_ID chain anchor + + +class TestFoundryAgentSessionIdAnchor: + """``FOUNDRY_AGENT_SESSION_ID`` identifies the *container instance*, + not the conversation (per the Foundry SDK), so it MUST NOT be used + as a fallback ``previous_response_id`` for chain walking. The host- + bound ``previous_response_id`` (set by ``ResponsesChannel`` from the + request envelope) is the only authoritative anchor; any code that + re-introduces an env-based fallback would silently merge unrelated + conversations across container restarts.""" + + async def test_get_messages_ignores_env_session_anchor_when_unbound( + self, + monkeypatch: pytest.MonkeyPatch, + ) -> None: + """No host binding, opaque ``session_id`` and a populated + ``FOUNDRY_AGENT_SESSION_ID``: ``get_messages`` must return ``[]`` + and never call the backend (no walkable conversation anchor).""" + for var in ("MODEL_DEPLOYMENT_NAME", "AZURE_AI_MODEL_DEPLOYMENT_NAME"): + monkeypatch.delenv(var, raising=False) + monkeypatch.setenv("FOUNDRY_AGENT_SESSION_ID", "caresp_envanchor1") + + backend = _make_fake_backend( + history_ids=["msg_envanchor_1"], + items=[_make_text_item("msg_envanchor_1", "from-env-anchor")], + ) + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + + messages = await 
prov.get_messages("opaque-session") + + assert messages == [] + backend.get_history_item_ids.assert_not_called() + + async def test_save_messages_ignores_env_session_anchor_when_unbound( + self, + monkeypatch: pytest.MonkeyPatch, + ) -> None: + """When no host binding supplies a ``previous_response_id`` and + ``session_id`` is opaque, the env var must NOT be consulted as a + fallback; the new turn writes without a prior chain seed.""" + for var in ( + "FOUNDRY_AGENT_NAME", + "FOUNDRY_AGENT_VERSION", + "MODEL_DEPLOYMENT_NAME", + "AZURE_AI_MODEL_DEPLOYMENT_NAME", + ): + monkeypatch.delenv(var, raising=False) + monkeypatch.setenv("FOUNDRY_AGENT_SESSION_ID", "caresp_envchain1") + + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + # Opaque session_id, no host binding → save proceeds without + # walking any chain (no get_history_item_ids call). + await prov.save_messages( + "opaque-session", + [Message(role="assistant", contents=[Content.from_text("hi")])], + ) + + backend.get_history_item_ids.assert_not_called() + # The persisted envelope still stamps the env value into + # ``agent_session_id`` for operator correlation (see the + # docstring on the module): only the chain anchor is gated. 
+ backend.create_response.assert_awaited_once() + wire_payload = backend.create_response.await_args.args[0].as_dict() + assert wire_payload["agent_session_id"] == "caresp_envchain1" + + async def test_save_messages_env_anchor_skipped_when_host_bound( + self, + monkeypatch: pytest.MonkeyPatch, + ) -> None: + """A host-bound ``previous_response_id`` wins over any env value; + the binding is the authoritative chain seed for the request.""" + from agent_framework_foundry_hosting import bind_request_context + + for var in ( + "FOUNDRY_AGENT_NAME", + "FOUNDRY_AGENT_VERSION", + "MODEL_DEPLOYMENT_NAME", + "AZURE_AI_MODEL_DEPLOYMENT_NAME", + ): + monkeypatch.delenv(var, raising=False) + monkeypatch.setenv("FOUNDRY_AGENT_SESSION_ID", "caresp_envignored") + + backend = _make_fake_backend() + prov = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + with bind_request_context(response_id="resp_bound_2", previous_response_id="caresp_boundprev"): + await prov.save_messages( + "session-x", + [Message(role="assistant", contents=[Content.from_text("hi")])], + ) + + # Host binding wins; the env anchor is ignored for chaining. + assert backend.get_history_item_ids.await_args.args[0] == "caresp_boundprev" + + +# region Shared module re-exports + + +class TestSharedReExports: + """`_responses.py` must re-export the conversion helpers so tests and + downstream code that historically imported them keep working.""" + + def test_responses_re_exports_helpers(self) -> None: + # All of these used to live in ``_responses``; after the + # refactor they live in ``_shared`` but are re-exported. 
+ from agent_framework_foundry_hosting import ( + _responses, # pyright: ignore[reportPrivateUsage] + _shared, # pyright: ignore[reportPrivateUsage] + ) + + for name in ( + "_arguments_to_str", + "_convert_message_content", + "_convert_output_message_content", + "_item_to_message", + "_items_to_messages", + "_output_item_to_message", + "_output_items_to_messages", + ): + assert getattr(_responses, name) is getattr(_shared, name), ( + f"{name} should be re-exported from _responses for backwards compat" + ) + + +# region Full AF ↔ Foundry round-trip via InMemoryResponseProvider + + +class TestAfFoundryRoundTrip: + """Round-trip two AF :class:`Message` instances through the Foundry SDK + types and back via the real :class:`InMemoryResponseProvider` backend. + + This is the same backend the provider uses in its local-fallback path + (i.e. the one that runs whenever ``FOUNDRY_HOSTING_ENVIRONMENT`` is + unset), so this test gives us coverage of the + "AF → Foundry SDK shape → storage → Foundry SDK shape → AF" pipeline + using exactly the production conversion code in :mod:`._shared`. + """ + + @staticmethod + def _af_message(text: str, item_id: str) -> tuple[Message, OutputItem]: + """Build an AF ``Message`` and the matching Foundry ``OutputItem``. + + Both messages are assistant ``output_message`` items because that's + the only OutputItem variant we round-trip through here — this test + exercises the conversion path, not every input/output shape. 
+ """ + from agent_framework import Content + + af_message = Message(role="assistant", contents=[Content.from_text(text)]) + foundry_item = OutputItemOutputMessage( + id=item_id, + type="output_message", + role="assistant", + status="completed", + content=[OutputMessageContentOutputTextContent(type="output_text", text=text, annotations=[])], + ) + return af_message, foundry_item + + async def test_two_messages_round_trip_through_in_memory_backend(self) -> None: + from azure.ai.agentserver.responses.models import ResponseObject + + # 1. Start from two AF Messages (the "outside world" shape). + original_first, foundry_first = self._af_message("First message: 2 + 2 equals 4.", "itm_1") + original_second, foundry_second = self._af_message("Second message: 3 + 5 equals 8.", "itm_2") + + # 2. Hand the Foundry items to the real in-memory storage backend + # via the same ``create_response`` API the agent-server runtime + # uses on every successful turn. Passing them as ``input_items`` + # is enough — the in-memory backend records each item under its + # own id and exposes it via ``get_history_item_ids``. + backend = InMemoryResponseProvider() + response = ResponseObject( + id="resp_round_trip", + object="response", + status="completed", + model="test-model", + created_at=0, + ) + await backend.create_response( + response, + input_items=[foundry_first, foundry_second], + history_item_ids=None, + ) + + # 3. Wire the provider to the seeded backend (no HTTP, no + # credential needed — this exercises the local-mode contract). + provider = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + + # 4. Retrieve via the public API. Internally this fans out: + # backend.get_history_item_ids → backend.get_items + # → ``_output_items_to_messages`` from ``_shared`` → AF Messages. + retrieved = await provider.get_messages("resp_round_trip") + + # 5. Round-trip preserves role + text content for both messages. 
+ assert len(retrieved) == 2 + assert all(isinstance(m, Message) for m in retrieved) + + assert retrieved[0].role == original_first.role + assert retrieved[0].text == original_first.text == "First message: 2 + 2 equals 4." + + assert retrieved[1].role == original_second.role + assert retrieved[1].text == original_second.text == "Second message: 3 + 5 equals 8." + + async def test_additional_properties_round_trip_through_in_memory_backend(self) -> None: + """End-to-end audit/replay verification via the public provider API. + + Seeds the in-memory backend with an :class:`OutputItemOutputMessage` + carrying: + + * a non-default item id; + * declared content fields (``output_text`` with annotations); + * a non-default ``status``; + * an arbitrary, undeclared top-level key + (``"audit_trace_id": "..."``) — i.e. the kind of opaque field + Foundry might layer on for audit/replay; + * an undeclared key on a content child + (``"vendor_metadata": {...}``). + + Reads the items back through ``get_messages`` (which captures the + :data:`RAW_KEY` snapshot), then writes them via ``save_messages`` + (which re-emits via the snapshot), then reads again and asserts + every field above survives the storage → AF → storage hop. Without + the raw-snapshot path, the second read would see synthesised + text-only items with newly-minted ids and lose every audit field. 
+ """ + from azure.ai.agentserver.responses.models import ResponseObject + + from agent_framework_foundry_hosting._shared import EXTRAS_KEY, RAW_KEY # pyright: ignore[reportPrivateUsage] + + backend = InMemoryResponseProvider() + original_id = "itm_audit_001" + seed_item = OutputItemOutputMessage( + id=original_id, + type="output_message", + role="assistant", + status="completed", + content=[ + OutputMessageContentOutputTextContent( + type="output_text", + text="The final answer is 42.", + annotations=[], + ) + ], + ) + # Layer audit fields onto the SDK model directly — these are the + # "extras" that pyright would warn about but the runtime + # round-trips faithfully via as_dict(). + seed_item["audit_trace_id"] = "trace-abc-123" + seed_item.content[0]["vendor_metadata"] = {"score": 0.97, "model": "gpt-x"} + + seed_response = ResponseObject( + id="resp_audit", + object="response", + status="completed", + model="test-model", + created_at=0, + ) + await backend.create_response(seed_response, input_items=[seed_item], history_item_ids=None) + + provider = _with_backend(FoundryHostedAgentHistoryProvider(), backend) + + # 1. Read back — provider stamps the RAW_KEY snapshot onto the + # AF Message's additional_properties. + first_read = await provider.get_messages("resp_audit") + assert len(first_read) == 1 + msg = first_read[0] + raw = msg.additional_properties[EXTRAS_KEY][RAW_KEY] + assert raw["id"] == original_id + assert raw["type"] == "output_message" + assert raw["audit_trace_id"] == "trace-abc-123" + assert raw["content"][0]["text"] == "The final answer is 42." + assert raw["content"][0]["vendor_metadata"] == {"score": 0.97, "model": "gpt-x"} + + # 2. Write back — this is where the snapshot-driven write path + # matters: save_messages mints a new response_id but must + # re-emit the SDK item from the captured raw shape. 
+ from agent_framework_foundry_hosting import bind_request_context + + with bind_request_context(response_id="resp_audit_replay", previous_response_id="resp_audit"): + await provider.save_messages("resp_audit_replay", [msg]) + + # 3. Inspect what was stored. We walk the new response id and + # expect to see the prior history seeded plus the replayed + # message — proof the snapshot survived storage→AF→storage. + item_ids = await backend.get_history_item_ids( + previous_response_id="resp_audit_replay", conversation_id=None, limit=20 + ) + assert len(item_ids) >= 1 + stored_items = await backend.get_items(item_ids) + # Find the replayed item (same audit field, but a new item id). + replay = next( + dict(it) + for it in stored_items + if it is not None + and dict(it).get("type") == "output_message" + and dict(it).get("audit_trace_id") == "trace-abc-123" + and dict(it).get("id") != original_id + ) + stored_dict = replay + assert stored_dict["type"] == "output_message" + assert stored_dict["status"] == "completed" + assert stored_dict["audit_trace_id"] == "trace-abc-123" + assert stored_dict["content"][0]["text"] == "The final answer is 42." + assert stored_dict["content"][0]["vendor_metadata"] == {"score": 0.97, "model": "gpt-x"} + # The replay item id is regenerated per write turn (caller + # supplies it), so it must NOT equal the original — that's how + # we know the snapshot path didn't naively echo back the seed. + assert stored_dict["id"] != original_id + + # 4. Final read confirms the entire chain is observable through + # the public AF surface. Walking the new response id returns + # both the seeded prior item and the replayed one. + second_read = await provider.get_messages("resp_audit_replay") + assert len(second_read) >= 1 + # Find the replayed message (matched by its audit field).
+ replayed_msg = next( + m + for m in second_read + if EXTRAS_KEY in m.additional_properties + and m.additional_properties[EXTRAS_KEY].get(RAW_KEY, {}).get("audit_trace_id") == "trace-abc-123" + ) + replayed_raw = replayed_msg.additional_properties[EXTRAS_KEY][RAW_KEY] + assert replayed_raw["content"][0]["vendor_metadata"] == {"score": 0.97, "model": "gpt-x"} + + +# region Integration tests against a real Foundry project +# +# Required environment variables: +# +# * ``FOUNDRY_PROJECT_ENDPOINT`` — base URL of a real Foundry project, +# e.g. ``https://my-proj.services.ai.azure.com``. +# * Azure auth (any one of): +# - ``az login`` (recommended for local dev) +# - ``AZURE_CLIENT_ID`` + ``AZURE_CLIENT_SECRET`` + ``AZURE_TENANT_ID`` +# - Managed identity when on Azure +# The identity needs at least the ``Azure AI User`` role on the project. +# +# Optional (enables the seeded-history test): +# +# * ``FOUNDRY_HOSTING_PREVIOUS_RESPONSE_ID`` — a real response id with attached items. +# * ``FOUNDRY_HOSTING_CONVERSATION_ID`` — alternative. +# * ``FOUNDRY_HOSTING_USER_ISOLATION_KEY`` / +# ``FOUNDRY_HOSTING_CHAT_ISOLATION_KEY`` — set if your project enforces isolation. +# +# Run with: ``uv run pytest -m integration packages/foundry_hosting/tests/test_history_provider.py`` + + +_FOUNDRY_PROJECT_ENDPOINT = os.getenv("FOUNDRY_PROJECT_ENDPOINT", "") + +_skip_if_no_foundry_endpoint = pytest.mark.skipif( + not _FOUNDRY_PROJECT_ENDPOINT or _FOUNDRY_PROJECT_ENDPOINT == "https://test-project.services.ai.azure.com/", + reason=( + "FOUNDRY_PROJECT_ENDPOINT not set to a real Foundry project; " + "skipping FoundryHostedAgentHistoryProvider integration tests." 
+ ), +) + + +def _isolation_from_env() -> IsolationContext | None: + user_key = os.getenv("FOUNDRY_HOSTING_USER_ISOLATION_KEY") + chat_key = os.getenv("FOUNDRY_HOSTING_CHAT_ISOLATION_KEY") + if not user_key and not chat_key: + return None + return IsolationContext(user_key=user_key, chat_key=chat_key) + + +@pytest.fixture +async def _live_credential() -> object: + """Yield a :class:`AzureCliCredential` and close it afterwards.""" + # Imported lazily so collection still works in environments without + # ``azure-identity`` available (e.g. minimal CI matrices). + from azure.identity.aio import AzureCliCredential + + cred = AzureCliCredential() + try: + yield cred + finally: + await cred.close() + + +class TestLiveFoundryStorage: + """End-to-end tests against a real Foundry project's storage HTTP API. + + These tests are gated behind ``@pytest.mark.integration`` so the + default ``pytest -m 'not integration'`` run skips them; they are + additionally skipped unless ``FOUNDRY_PROJECT_ENDPOINT`` points at a + real project. + """ + + @pytest.mark.flaky + @pytest.mark.integration + @_skip_if_no_foundry_endpoint + async def test_get_messages_unknown_response_id_returns_empty(self, _live_credential: object) -> None: + """A brand-new previous_response_id should yield an empty history. + + The native HTTP backend treats a 404 from the storage ``item_ids`` + endpoint as "no prior history" rather than raising, so a freshly + bootstrapped client never crashes on its first request. This test + proves that contract end-to-end against the live service. 
+ """ + isolation = _isolation_from_env() + provider = FoundryHostedAgentHistoryProvider( + endpoint=_FOUNDRY_PROJECT_ENDPOINT, + credential=_live_credential, # type: ignore[arg-type] + ) + try: + messages = await provider.get_messages( + "resp_does_not_exist_integration_smoke", + isolation=isolation, + ) + finally: + await provider.aclose() + + assert messages == [] + + @pytest.mark.flaky + @pytest.mark.integration + @_skip_if_no_foundry_endpoint + @pytest.mark.skipif( + not os.getenv("FOUNDRY_HOSTING_PREVIOUS_RESPONSE_ID") and not os.getenv("FOUNDRY_HOSTING_CONVERSATION_ID"), + reason=( + "Set FOUNDRY_HOSTING_PREVIOUS_RESPONSE_ID or " + "FOUNDRY_HOSTING_CONVERSATION_ID to a real seeded conversation to " + "enable this test." + ), + ) + async def test_get_messages_returns_real_history(self, _live_credential: object) -> None: + """When pointed at a real seeded conversation we should get Messages back.""" + previous_response_id = os.getenv("FOUNDRY_HOSTING_PREVIOUS_RESPONSE_ID") or "" + conversation_id = os.getenv("FOUNDRY_HOSTING_CONVERSATION_ID") + isolation = _isolation_from_env() + + provider = FoundryHostedAgentHistoryProvider( + endpoint=_FOUNDRY_PROJECT_ENDPOINT, + credential=_live_credential, # type: ignore[arg-type] + history_limit=20, + ) + try: + # ``get_messages`` is keyed on ``session_id`` (== previous_response_id) + # so we pass that as the primary lookup; conversation_id is the + # fallback when only a conversation id is configured. 
+ messages = await provider.get_messages( + previous_response_id or (conversation_id or ""), + isolation=isolation, + ) + finally: + await provider.aclose() + + assert isinstance(messages, list) + assert messages, "Expected at least one message in the seeded history" + assert all(isinstance(m, Message) for m in messages) + + @pytest.mark.flaky + @pytest.mark.integration + @_skip_if_no_foundry_endpoint + async def test_invoke_then_read_and_write_with_isolation(self, _live_credential: object) -> None: + """Invoke a deployed Foundry hosted agent, then round-trip via storage. + + This test exercises the realistic, fully-permissioned path: + + 1. Use :class:`FoundryAgent` to invoke the deployed + ``agent-framework-hosting-sample`` (version 10) hosted agent + with an explicit ``isolation_key``. The Foundry runtime + creates the response + history items inside the storage + backend on the user's behalf. + 2. Read the resulting history back through our own native HTTP + :class:`FoundryHostedAgentHistoryProvider` using the matching + :class:`IsolationContext`. This is the production read path + that DevUI / external clients use to render conversation + transcripts. + 3. Best-effort: try to APPEND two more items to the same + response via :class:`FoundryStorageProvider` write API. The + storage write path is normally callable only from inside the + agent-server container's runtime identity (Foundry strips + the user's bearer token at the runtime boundary), so a 403 + here is expected for ordinary user principals; we skip the + write-side assertions in that case rather than failing. 
+ """ + from agent_framework_foundry import FoundryAgent + from azure.ai.agentserver.responses import ( + FoundryStorageProvider, + FoundryStorageSettings, + ) + from azure.ai.agentserver.responses.store._foundry_errors import ( # pyright: ignore[reportPrivateImportUsage] + FoundryApiError, + ) + + # Per-run-unique isolation key keeps each test run in its own + # tenant partition so concurrent runs (CI matrix, retries) don't + # collide. + isolation_key = f"af-hosting-roundtrip-{int(time.time())}" + isolation = IsolationContext(user_key=isolation_key, chat_key=isolation_key) + + # 1. Invoke the deployed hosted agent. + agent = FoundryAgent( + project_endpoint=_FOUNDRY_PROJECT_ENDPOINT, + agent_name="agent-framework-hosting-sample", + agent_version="10", + credential=_live_credential, # type: ignore[arg-type] + allow_preview=True, + default_options={"isolation_key": isolation_key}, + ) + # ``create_session()`` makes a fresh local session with no + # ``service_session_id`` set; the FoundryAgent's + # ``_prepare_run_context`` will lazily call + # ``project_client.beta.agents.create_session`` under our + # isolation key on first run. + session = agent.create_session() + prompt = "Please reply with exactly: 'Round-trip ack.'" + result = await agent.run(prompt, session=session) + + assert result.text, "FoundryAgent.run returned an empty response" + response_id = result.response_id + assert isinstance(response_id, str) and response_id, "Expected a non-empty response_id from FoundryAgent.run" + + # 2. Read history back via the native HTTP provider with the + # same isolation context. Try both the response_id and the + # service_session_id Foundry created on our behalf — depending + # on the runtime's storage layout, history may be anchored to + # either. 
+ service_session_id = session.service_session_id + candidates = [c for c in (response_id, service_session_id) if c] + + reader = FoundryHostedAgentHistoryProvider( + endpoint=_FOUNDRY_PROJECT_ENDPOINT, + credential=_live_credential, # type: ignore[arg-type] + history_limit=20, + ) + try: + messages_after_invoke: list[Message] = [] + for cand in candidates: + msgs = await reader.get_messages(cand, isolation=isolation) + if msgs: + messages_after_invoke = msgs + break + finally: + await reader.aclose() + + # The read path returning a well-typed list (possibly empty if + # Foundry compacts items out of the response chain we queried) + # is enough to confirm the isolation header path works end-to-end. + assert all(isinstance(m, Message) for m in messages_after_invoke) + + # If we got messages back, every one should carry the lossless + # raw-snapshot under additional_properties[EXTRAS_KEY][RAW_KEY] — + # this is what guarantees audit/replay round-trip through the + # storage backend. Without it, a write-back would synthesise a + # text-only item and lose every audit field. + if messages_after_invoke: + from agent_framework_foundry_hosting._shared import ( # pyright: ignore[reportPrivateUsage] + EXTRAS_KEY, + RAW_KEY, + ) + + for m in messages_after_invoke: + extras = m.additional_properties.get(EXTRAS_KEY) or {} + assert RAW_KEY in extras, f"Live read message missing raw snapshot: {m!r}" + raw = extras[RAW_KEY] + # Snapshot must carry the discriminator + id — the two + # fields save_messages relies on to rebuild the SDK item. + assert isinstance(raw, dict) + assert "type" in raw and "id" in raw + + # 3. Best-effort write: create a fresh response under the same + # isolation key carrying two known items, then read it back + # via the native HTTP provider. Skip the write-side + # assertions if Foundry rejects the call with 403 (expected + # when the runtime is the only authorised writer). 
+ from azure.ai.agentserver.responses.models import ResponseObject + + write_response_id = f"resp_af_write_{int(time.time())}" + _, foundry_first = TestAfFoundryRoundTrip._af_message( + "Appended message 1: 2 + 2 equals 4.", f"{write_response_id}_itm_1" + ) + _, foundry_second = TestAfFoundryRoundTrip._af_message( + "Appended message 2: 3 + 5 equals 8.", f"{write_response_id}_itm_2" + ) + + write_succeeded = False + writer = FoundryStorageProvider( + credential=_live_credential, # type: ignore[arg-type] + settings=FoundryStorageSettings.from_endpoint(_FOUNDRY_PROJECT_ENDPOINT), + ) + try: + await writer.create_response( + ResponseObject( + id=write_response_id, + object="response", + status="completed", + model="agent", + created_at=int(time.time()), + ), + input_items=[foundry_first, foundry_second], + history_item_ids=None, + isolation=isolation, + ) + write_succeeded = True + except FoundryApiError as exc: + if "403" not in str(exc): + raise + # Foundry strips the user bearer token at the runtime + # boundary, so external principals can't write directly to + # storage. The container's MSI is the authorised writer. + pytest.skip("Foundry rejected external storage write with 403 (expected outside container).") + finally: + await writer.aclose() + + # Re-read and verify our two appended items now show up. + if not write_succeeded: # pragma: no cover — defensive; pytest.skip already raised + return + reader2 = FoundryHostedAgentHistoryProvider( + endpoint=_FOUNDRY_PROJECT_ENDPOINT, + credential=_live_credential, # type: ignore[arg-type] + history_limit=20, + ) + try: + messages_after_write = await reader2.get_messages(write_response_id, isolation=isolation) + finally: + await reader2.aclose() + + appended_texts = {m.text for m in messages_after_write} + assert "Appended message 1: 2 + 2 equals 4." in appended_texts + assert "Appended message 2: 3 + 5 equals 8." 
in appended_texts From 7bd68f889715ecb45e4905e1c8ae2486313fdda5 Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Thu, 28 May 2026 13:56:43 +0200 Subject: [PATCH 05/20] Python: add agent-framework-hosting-responses channel (#5639) * feat(hosting-responses): add OpenAI Responses-shaped channel package New ``agent-framework-hosting-responses`` package implementing the OpenAI Responses-shaped HTTP channel for the Hosting framework. Mounts ``POST /responses`` (and a ``/responses/{response_id}`` GET) onto an ``AgentFrameworkHost`` and translates the OpenAI Responses wire shape to/from the channel-neutral ``ChannelRequest`` / ``HostedRunResult`` plumbing. Surface (re-exported from ``agent_framework_hosting_responses``): - ``ResponsesChannel`` -- concrete ``Channel`` implementation. Owns the Starlette route(s), parses inbound JSON into ``ChannelRequest``, runs the optional ``ChannelRunHook``, calls back into the ``ChannelContext`` to invoke the agent target, builds Responses envelopes (sync JSON or SSE), and respects ``DeliveryReport.include_originating`` so cross-channel push routes only ack to the originating Responses caller. - The minted ``response_id`` is propagated via the host's ContextVar machinery so storage-side history providers (e.g. ``FoundryHostedAgentHistoryProvider``) persist envelopes against the same id the channel returns. - 48 unit tests covering route wiring, parsing of each Responses input shape, hook composition, sync vs streaming paths, and originating vs non-originating delivery branches. Registers the package in ``python/pyproject.toml`` ``[tool.uv.sources]`` and adds the matching pyright ``executionEnvironments`` entry. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * review: address PR-3 round 2 feedback - consume IsolationKeys.chat_key from the host-bound contextvar instead of the raw `x-agent-chat-isolation-key` header off the wire so the host's ASGI isolation middleware (or any operator-supplied replacement) is the authoritative point at which the caller is authenticated and the bucket key is established - expand `response_id_factory` docstring to call out partition co-location vs. partition-ownership enforcement: the channel forwards `previous_response_id` as a hint to the factory; the storage layer validates the embedded partition against the bound user/chat isolation keys - on mid-stream failure, call `deliver_response` with the accumulated text before emitting `response.failed` so host-side history / push-channel state stays consistent with the partial deltas the client already saw Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(hosting-responses): fix quickstart to use current Agent API ChatAgent was renamed to Agent and ChatMessage to Message. Update the README quickstart to use client.as_agent(...) and refresh the stale docstring reference in _channel.py. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(hosting-responses): adapt to hosted run result wrapper Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(hosting-responses): add response hooks Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(hosting-responses): keep instructions in chat options Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- python/packages/hosting-responses/LICENSE | 21 + python/packages/hosting-responses/README.md | 21 + .../__init__.py | 27 ++ .../_channel.py | 427 ++++++++++++++++++ .../_parsing.py | 234 ++++++++++ .../packages/hosting-responses/pyproject.toml | 98 ++++ .../hosting-responses/tests/__init__.py | 0 .../hosting-responses/tests/test_channel.py | 287 ++++++++++++ .../hosting-responses/tests/test_parsing.py | 204 +++++++++ python/uv.lock | 18 + 10 files changed, 1337 insertions(+) create mode 100644 python/packages/hosting-responses/LICENSE create mode 100644 python/packages/hosting-responses/README.md create mode 100644 python/packages/hosting-responses/agent_framework_hosting_responses/__init__.py create mode 100644 python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py create mode 100644 python/packages/hosting-responses/agent_framework_hosting_responses/_parsing.py create mode 100644 python/packages/hosting-responses/pyproject.toml create mode 100644 python/packages/hosting-responses/tests/__init__.py create mode 100644 python/packages/hosting-responses/tests/test_channel.py create mode 100644 python/packages/hosting-responses/tests/test_parsing.py diff --git a/python/packages/hosting-responses/LICENSE b/python/packages/hosting-responses/LICENSE new file mode 100644 index 00000000000..9e841e7a26e --- /dev/null +++ b/python/packages/hosting-responses/LICENSE @@ -0,0 +1,21 @@ + MIT License + + Copyright (c) Microsoft 
Corporation. + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE diff --git a/python/packages/hosting-responses/README.md b/python/packages/hosting-responses/README.md new file mode 100644 index 00000000000..ae03d364af3 --- /dev/null +++ b/python/packages/hosting-responses/README.md @@ -0,0 +1,21 @@ +# agent-framework-hosting-responses + +OpenAI Responses-shaped channel for `agent-framework-hosting`. + +Exposes a single `POST /responses` endpoint that accepts the OpenAI +Responses API request body and returns either a Responses-shaped JSON +body or a Server-Sent-Events stream when `stream=True`. 
+ +```python +from agent_framework.openai import OpenAIChatClient +from agent_framework_hosting import AgentFrameworkHost +from agent_framework_hosting_responses import ResponsesChannel + +agent = OpenAIChatClient().as_agent(name="Assistant") + +host = AgentFrameworkHost(target=agent, channels=[ResponsesChannel()]) +host.serve(port=8000) +``` + +The base host plumbing lives in +[`agent-framework-hosting`](https://pypi.org/project/agent-framework-hosting/). diff --git a/python/packages/hosting-responses/agent_framework_hosting_responses/__init__.py b/python/packages/hosting-responses/agent_framework_hosting_responses/__init__.py new file mode 100644 index 00000000000..72b2272aecd --- /dev/null +++ b/python/packages/hosting-responses/agent_framework_hosting_responses/__init__.py @@ -0,0 +1,27 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""OpenAI Responses-shaped channel for ``agent-framework-hosting``.""" + +import importlib.metadata + +from ._channel import ResponsesChannel +from ._parsing import ( + messages_from_responses_input, + parse_response_target, + parse_responses_identity, + parse_responses_request, +) + +try: + __version__ = importlib.metadata.version(__name__) +except importlib.metadata.PackageNotFoundError: + __version__ = "0.0.0" + +__all__ = [ + "ResponsesChannel", + "__version__", + "messages_from_responses_input", + "parse_response_target", + "parse_responses_identity", + "parse_responses_request", +] diff --git a/python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py b/python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py new file mode 100644 index 00000000000..cf85cca0260 --- /dev/null +++ b/python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py @@ -0,0 +1,427 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""``ResponsesChannel`` — OpenAI Responses-shaped HTTP surface. 
+ +Exposes a single ``POST /responses`` endpoint that accepts +``{"input": "...", "stream": false}`` (and the rest of the Responses API +request body) and returns either a Responses-shaped JSON body +(``stream=False``, default) or a Server-Sent-Events stream +(``stream=True``). + +Payload construction reuses the ``openai.types.responses`` Pydantic +models so the OpenAI Python SDK ``stream=True`` consumer parses every +required field without surprises. +""" + +from __future__ import annotations + +import time +import uuid +from collections.abc import AsyncIterator, Awaitable, Callable, Mapping +from typing import Any, cast + +from agent_framework import AgentResponse, Content, Message +from agent_framework_hosting import ( + ChannelContext, + ChannelContribution, + ChannelRequest, + ChannelResponseContext, + ChannelResponseHook, + ChannelRunHook, + ChannelSession, + ChannelStreamTransformHook, + HostedRunResult, + apply_response_hook, + apply_run_hook, + get_current_isolation_keys, + logger, +) +from openai.types.responses import ( + Response as OpenAIResponse, +) +from openai.types.responses import ( + ResponseCompletedEvent, + ResponseCreatedEvent, + ResponseError, + ResponseFailedEvent, + ResponseOutputMessage, + ResponseOutputText, + ResponseTextDeltaEvent, +) +from starlette.requests import Request +from starlette.responses import JSONResponse, Response, StreamingResponse +from starlette.routing import Route + +from ._parsing import ( + parse_response_target, + parse_responses_identity, + parse_responses_request, +) + + +def _ack_text() -> str: + """Tiny acknowledgement string for the originating wire. + + Used when the agent reply is delivered out-of-band via :class:`ChannelPush`. 
+ """ + return "[delivered out-of-band]" + + +def _text_result(text: str) -> HostedRunResult[AgentResponse]: + """Build a host delivery payload from text accumulated by this channel.""" + return HostedRunResult(AgentResponse(messages=[Message(role="assistant", contents=[Content.from_text(text=text)])])) + + +class ResponsesChannel: + """Minimal OpenAI-Responses-shaped surface. + + Mounts ``POST /responses`` under the ``path`` prefix: with the default + ``path=""`` the full route is just ``/responses``; with a prefix such + as ``path="/responses"`` it becomes ``/responses/responses``. + """ + + name = "responses" + + def __init__( + self, + *, + path: str = "", + run_hook: ChannelRunHook | None = None, + response_hook: ChannelResponseHook | None = None, + stream_transform_hook: ChannelStreamTransformHook | None = None, + response_id_factory: Callable[..., str] | None = None, + ) -> None: + """Create a Responses channel. + + Keyword Args: + path: Mount prefix on the host. Default ``""`` mounts the + ``POST /responses`` route at the app root, matching the + upstream OpenAI surface. + run_hook: Optional :data:`ChannelRunHook` invoked with the + parsed :class:`ChannelRequest` before the agent target + runs. May return a replacement request. + response_hook: Optional :data:`ChannelResponseHook` invoked + before the channel serializes an originating + :class:`HostedRunResult` into a Responses envelope. The + host also invokes this hook when delivering to this + channel as a non-originating push destination. + stream_transform_hook: Optional per-update transform hook + applied while streaming Server-Sent Events. Return a + replacement update, or ``None`` to drop the update. + response_id_factory: Optional callable that mints the + per-request response id. Default produces + ``resp_`` followed by a random UUID4 hex string, which + matches the OpenAI Responses wire shape. Override when + the host's backing storage + requires a different id format (e.g.
Foundry storage, + whose partition keys are encoded in the id and which + rejects free-form ``resp_*`` ids with a server error). + The same id is used for the channel envelope and for + the host-side anchoring (``ChannelRequest.attributes``) + so storage and replay agree. + + Security note on partition co-location: when a caller + supplies ``previous_response_id`` we forward it to the + factory so id backends that embed partition keys can + co-locate the new record with the chain's existing + partition. The factory passes that hint through to the + storage layer; **partition ownership is enforced at + the storage layer**, not in the channel: the Foundry + storage provider, for example, validates the request + against the bound user/chat isolation keys and rejects + writes whose embedded partition does not match the + authenticated caller's isolation. Channel-level + forwarding is therefore a performance hint, not a + security boundary; the host's isolation middleware + must establish the caller's identity before this + route is entered. + """ + self.path = path + self._hook = run_hook + self.response_hook = response_hook + self._stream_transform_hook = stream_transform_hook + self._ctx: ChannelContext | None = None + self._response_id_factory: Callable[..., str] = ( + response_id_factory if response_id_factory is not None else (lambda *_a, **_kw: f"resp_{uuid.uuid4().hex}") + ) + + def contribute(self, context: ChannelContext) -> ChannelContribution: + """Capture the host-supplied context and register ``POST /responses``.""" + self._ctx = context + return ChannelContribution(routes=[Route("/responses", self._handle, methods=["POST"])]) + + async def _handle(self, request: Request) -> Response: + """Handle a single ``POST /responses`` call. 
+ + Parses the OpenAI Responses-shaped body into ``Message`` / + ``options`` / ``ChannelSession`` triples via :mod:`._parsing`, + applies the optional ``run_hook``, and either streams an SSE + response stream or returns a one-shot OpenAI ``Response`` envelope. + Non-originating ``response_target`` values resolve to a delivery + acknowledgement instead of echoing the agent text on this wire. + """ + if self._ctx is None: # pragma: no cover - guarded by Channel lifecycle + return JSONResponse({"error": "channel not initialized"}, status_code=500) + try: + body = await request.json() + except Exception: + return JSONResponse({"error": "invalid json"}, status_code=400) + + try: + messages, options, session = parse_responses_request(body) + except ValueError as exc: + return JSONResponse({"error": str(exc)}, status_code=422) + + # When no ``previous_response_id`` chain anchor is on the body, + # surface the isolation key the **host** lifted off the request + # (via ``_FoundryIsolationASGIMiddleware`` for the default + # Foundry-platform deployment, or whatever middleware the + # operator configured in front of the host) as the channel + # session id, so callers without an explicit anchor still get + # a stable per-conversation session id (used by non-Foundry + # history providers, routing/idempotency, etc.). + # + # Security note: we consume the host-bound contextvar set by the + # ASGI isolation middleware, NOT the raw header off the wire. + # That middleware is the operator's place to enforce auth and + # gate which callers get to set isolation. If you mount the host + # in front of a custom auth boundary, your middleware should + # validate the caller before stamping ``set_current_isolation_keys``; + # never trust raw wire headers to identify a session bucket. 
+ # The chat-iso value is *not* a valid storage anchor: the + # Foundry history provider deliberately ignores it — multi-turn + # storage chaining goes through the ``previous_response_id`` / + # bound ``response_id`` pair on ``ChannelRequest.attributes``. + bound_keys = get_current_isolation_keys() + chat_iso = bound_keys.chat_key if bound_keys is not None else None + if session is None and chat_iso: + session = ChannelSession(isolation_key=chat_iso) + + # Mint the response id once per request so the channel envelope + # (one-shot or streamed) and any host-side anchoring (e.g. the + # Foundry history provider's ``bind_request_context``) agree on + # the same handle. The next turn arrives with this value as + # ``previous_response_id`` and the storage chain walks. We pass + # both anchors via ``ChannelRequest.attributes`` so the host + # can pick them up without a channel-specific contract. + previous_response_id: str | None = None + prev_raw = body.get("previous_response_id") + if isinstance(prev_raw, str) and prev_raw: + previous_response_id = prev_raw + # Pass the previous id (if any) as a hint to the factory so id + # backends that embed partition keys (e.g. Foundry storage) can + # co-locate the new record with the chain's existing partition. + # No-arg factories continue to work via ``Callable[..., str]``. + response_id = self._response_id_factory(previous_response_id) + + attributes: dict[str, Any] = {"response_id": response_id} + if previous_response_id is not None: + attributes["previous_response_id"] = previous_response_id + + # Honor the OpenAI-Responses ``stream`` flag — non-streaming by + # default, SSE when the caller opts in. Run hooks may still flip + # this per-request (e.g. force non-streaming for a particular user). 
+ channel_request = ChannelRequest( + channel=self.name, + operation="message.create", + input=messages, + session=session, + options=options or None, + stream=bool(body.get("stream", False)), + identity=parse_responses_identity(body, self.name), + response_target=parse_response_target(body), + attributes=attributes, + ) + + if self._hook is not None: + channel_request = await apply_run_hook( + self._hook, + channel_request, + target=self._ctx.target, + protocol_request=body, + ) + + if channel_request.stream: + return StreamingResponse( + self._stream_events(channel_request, body, response_id=response_id), + media_type="text/event-stream", + headers={"Cache-Control": "no-cache", "X-Accel-Buffering": "no"}, + ) + + result = await self._ctx.run(channel_request) + include_originating = await self._ctx.deliver_response(channel_request, result) + if include_originating: + result = await self._apply_response_hook(result, channel_request) + text = result.result.text if include_originating else _ack_text() + envelope = self._build_response(body, text, status="completed", response_id=response_id) + return JSONResponse(envelope.model_dump(mode="json", exclude_none=True)) + + async def _apply_response_hook( + self, + result: HostedRunResult[AgentResponse], + request: ChannelRequest, + ) -> HostedRunResult[AgentResponse]: + """Apply the channel-level response hook for an originating reply.""" + if self.response_hook is None: + return result + context = ChannelResponseContext( + request=request, + channel_name=self.name, + destination_identity=None, + originating=True, + is_echo=False, + ) + shaped = await apply_response_hook(self.response_hook, result, context=context) + return cast("HostedRunResult[AgentResponse]", shaped) + + def _build_response( + self, + body: Mapping[str, Any], + text: str, + *, + status: str, + response_id: str | None = None, + ) -> OpenAIResponse: + """Construct an OpenAI ``Response`` for a finished (non-streaming) run. 
+ + ``status`` mirrors the set of top-level Response status values + (``in_progress`` / ``completed`` / ``failed`` / ``incomplete`` / + ``cancelled``). The nested ``ResponseOutputMessage.status`` field + only accepts ``in_progress`` / ``completed`` / ``incomplete``, so + terminal-but-non-success states collapse to ``incomplete`` there; + the failure detail still travels via the top-level ``status`` + and (for streamed errors) the ``error`` field. + + ``response_id``: the per-request id minted in :meth:`_handle`. + Passed in so envelope and storage agree on a single handle per + turn (see :meth:`_handle` notes). Falls back to a freshly minted + id from the response id factory when callers (e.g. + :meth:`_stream_events`'s skeleton path + before this argument was introduced) don't supply one. + """ + message_status = status if status in ("in_progress", "completed", "incomplete") else "incomplete" + return OpenAIResponse( + id=response_id or self._response_id_factory(None), + object="response", + created_at=time.time(), + status=status, # type: ignore[arg-type] + model=body.get("model", "agent"), + output=[ + ResponseOutputMessage( + id=f"msg_{uuid.uuid4().hex}", + type="message", + role="assistant", + status=message_status, # type: ignore[arg-type] + content=[ResponseOutputText(type="output_text", text=text, annotations=[])], + ) + ], + parallel_tool_calls=False, + tool_choice="auto", + tools=[], + metadata={}, + ) + + async def _stream_events( + self, + request: ChannelRequest, + body: Mapping[str, Any], + *, + response_id: str, + ) -> AsyncIterator[str]: + """Yield SSE events shaped like the OpenAI Responses streaming protocol. + + Emits ``response.created`` → many ``response.output_text.delta`` + → ``response.completed`` (or ``response.failed`` on error).
+ """ + if self._ctx is None: # pragma: no cover - guarded by Channel lifecycle + return + + msg_id = f"msg_{uuid.uuid4().hex}" + seq = 0 + + def next_seq() -> int: + nonlocal seq + seq += 1 + return seq + + def sse(event: Any) -> str: + return f"event: {event.type}\ndata: {event.model_dump_json(exclude_none=True)}\n\n" + + skeleton = self._build_response(body, "", status="in_progress", response_id=response_id) + yield sse(ResponseCreatedEvent(type="response.created", response=skeleton, sequence_number=next_seq())) + + accumulated = "" + try: + stream = self._ctx.run_stream(request) + async for update in stream: + if self._stream_transform_hook is not None: + transformed = self._stream_transform_hook(update) + update = await transformed if isinstance(transformed, Awaitable) else transformed + if update is None: + continue + chunk = getattr(update, "text", None) + if chunk: + accumulated += chunk + yield sse( + ResponseTextDeltaEvent( + type="response.output_text.delta", + item_id=msg_id, + output_index=0, + content_index=0, + delta=chunk, + logprobs=[], + sequence_number=next_seq(), + ) + ) + try: + # Finalize so context-provider / history hooks on the agent + # still run even though we are emitting our own SSE. + await stream.get_final_response() + except Exception: # pragma: no cover - finalize is best-effort + logger.exception("Responses stream finalize failed") + except Exception as exc: + logger.exception("Responses stream consumption failed") + # Mid-stream failure: the wire already saw partial deltas + # so host-side state must reflect that — call + # ``deliver_response`` with the accumulated text (best-effort) + # before signalling failure to the client. Without this, + # next turn's chain anchored on this ``response_id`` would + # be inconsistent with what the user actually saw, and any + # non-originating push targets would silently miss the turn. 
+ # ``deliver_response`` itself is best-effort; we swallow its + # exceptions so the failure event still reaches the client. + try: + await self._ctx.deliver_response(request, _text_result(accumulated)) + except Exception: # pragma: no cover - delivery is best-effort + logger.exception("Responses stream failure deliver_response failed") + failed = self._build_response(body, accumulated, status="failed", response_id=response_id) + failed.error = ResponseError(code="server_error", message=str(exc)) + yield sse( + ResponseFailedEvent( + type="response.failed", + response=failed, + sequence_number=next_seq(), + ) + ) + return + + completed_text = accumulated + result = _text_result(accumulated) + include_originating = await self._ctx.deliver_response(request, result) + if include_originating: + result = await self._apply_response_hook(result, request) + completed_text = result.result.text + else: + completed_text = _ack_text() + completed = self._build_response(body, completed_text, status="completed", response_id=response_id) + # Reuse the same message id we emitted deltas under. + if completed.output and isinstance(completed.output[0], ResponseOutputMessage): + completed.output[0].id = msg_id + yield sse( + ResponseCompletedEvent( + type="response.completed", + response=completed, + sequence_number=next_seq(), + ) + ) + + +__all__ = ["ResponsesChannel"] diff --git a/python/packages/hosting-responses/agent_framework_hosting_responses/_parsing.py b/python/packages/hosting-responses/agent_framework_hosting_responses/_parsing.py new file mode 100644 index 00000000000..e3e58211451 --- /dev/null +++ b/python/packages/hosting-responses/agent_framework_hosting_responses/_parsing.py @@ -0,0 +1,234 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Parsing helpers for the OpenAI Responses-API request body. + +The Responses API accepts ``input`` as either a string or a list of "input +items". 
An item is either a content part (``input_text`` / ``input_image`` +/ ``input_file``) or a message envelope ``{type: "message", role, +content: [...]}``. We translate that into an Agent Framework ``Message`` +list and split out the ChatOptions-shaped fields the API also carries. +""" + +from __future__ import annotations + +from collections.abc import Mapping +from typing import Any, cast + +from agent_framework import Content, Message +from agent_framework_hosting import ChannelIdentity, ChannelSession, ResponseTarget, logger + +# OpenAI Responses field name → Agent Framework ChatOptions field name. +_RESPONSES_OPTION_REMAP = { + "max_output_tokens": "max_tokens", + "parallel_tool_calls": "allow_multiple_tool_calls", +} +# Fields we forward to ChatOptions verbatim. ``instructions`` stays here +# because Agent Framework exposes it as a ChatOptions field; it must not be +# lifted into a synthetic system message. +_RESPONSES_OPTION_PASSTHROUGH = { + "instructions", + "temperature", + "top_p", + "metadata", + "user", + "safety_identifier", + "tool_choice", + "tools", + "store", + "response_format", + "stop", + "seed", + "frequency_penalty", + "presence_penalty", + "logit_bias", +} +# Fields the Responses transport owns; they must not be forwarded as options. +_RESPONSES_TRANSPORT_KEYS = {"input", "model", "stream", "previous_response_id", "response_target"} + + +def parse_response_target(body: Mapping[str, Any]) -> ResponseTarget: + """Translate the OpenAI Responses ``response_target`` field into a :class:`ResponseTarget`. + + Accepted shapes: + + - ``"originating"`` / ``"active"`` / ``"all_linked"`` / ``"none"`` — bare strings. + - ``"telegram"`` / ``"telegram:<native_id>"`` — single channel destination. + - ``["telegram:<native_id>", "originating"]`` — list of destinations; the + pseudo-name ``"originating"`` includes the originating channel. + - ``{"channels": [...]}`` — same list semantics with the explicit key. + - ``{"kind": "active"}`` / ``{"kind": "all_linked"}`` — explicit kind.
+ + Anything malformed is logged at WARNING and falls back to ``originating``. + """ + raw = body.get("response_target") + if raw is None: + return ResponseTarget.originating # type: ignore[attr-defined,no-any-return] + if isinstance(raw, str): + keyword = raw.strip() + if keyword == "originating": + return ResponseTarget.originating # type: ignore[attr-defined,no-any-return] + if keyword == "active": + return ResponseTarget.active # type: ignore[attr-defined,no-any-return] + if keyword == "all_linked": + return ResponseTarget.all_linked # type: ignore[attr-defined,no-any-return] + if keyword == "none": + return ResponseTarget.none # type: ignore[attr-defined,no-any-return] + # Treat any other bare string as a single channel destination. + return ResponseTarget.channel(keyword) + if isinstance(raw, list): + return _parse_channels_list(cast("list[Any]", raw)) # type: ignore[redundant-cast] + if isinstance(raw, Mapping): + raw_map = cast("Mapping[str, Any]", raw) + channels = raw_map.get("channels") + if isinstance(channels, list): + return _parse_channels_list(cast("list[Any]", channels)) # type: ignore[redundant-cast] + kind = raw_map.get("kind") + if kind == "active": + return ResponseTarget.active # type: ignore[attr-defined,no-any-return] + if kind == "all_linked": + return ResponseTarget.all_linked # type: ignore[attr-defined,no-any-return] + if kind == "none": + return ResponseTarget.none # type: ignore[attr-defined,no-any-return] + if kind == "originating": + return ResponseTarget.originating # type: ignore[attr-defined,no-any-return] + logger.warning("responses: ignoring malformed response_target=%r", cast("Any", raw)) + return ResponseTarget.originating # type: ignore[attr-defined,no-any-return] + + +def _parse_channels_list(raw: list[Any]) -> ResponseTarget: + """Build a ``ResponseTarget.channels`` from a raw list, dropping non-string entries. 
+ + An empty list (or one with no usable strings) collapses back to + ``originating`` so we never silently produce a target that nobody + will deliver to. + """ + tokens = [t for t in raw if isinstance(t, str) and t] + if len(tokens) != len(raw): + logger.warning("responses: dropping non-string entries from response_target=%r", raw) + if not tokens: + return ResponseTarget.originating # type: ignore[attr-defined,no-any-return] + return ResponseTarget.channels(tokens) + + +def parse_responses_identity(body: Mapping[str, Any], channel_name: str) -> ChannelIdentity | None: + """Surface the caller as a :class:`ChannelIdentity` so the host can record it. + + OpenAI Responses replaced ``user`` with ``safety_identifier`` — we use + that as the native id, falling back to the legacy ``user`` field. + """ + native = body.get("safety_identifier") or body.get("user") + if not isinstance(native, str) or not native: + return None + return ChannelIdentity(channel=channel_name, native_id=native) + + +def _content_from_input_item(item: Mapping[str, Any]) -> Content: + """Convert a single OpenAI Responses ``input`` item into a :class:`Content` part. + + Handles the ``input_text``/``output_text``/``text`` text variants, + ``input_image`` URL references, and ``input_file`` references via either + a public URL or a hosted ``file_id``. Raises ``ValueError`` for any + unsupported item type so the surrounding parser can return a 422. 
+ """ + item_type = item.get("type") + if item_type in ("input_text", "output_text", "text"): + return Content.from_text(text=str(item.get("text", ""))) + if item_type == "input_image": + image_url: Any = item.get("image_url") + if isinstance(image_url, Mapping): + image_url = cast("Mapping[str, Any]", image_url).get("url") + if not isinstance(image_url, str): + raise ValueError("input_image requires `image_url`") + return Content.from_uri(uri=image_url, media_type="image/*") + if item_type == "input_file": + if (uri := item.get("file_url")) and isinstance(uri, str): + return Content.from_uri(uri=uri, media_type=item.get("mime_type")) + if file_id := item.get("file_id"): + return Content(type="hosted_file", file_id=str(file_id)) + raise ValueError("input_file requires `file_url` or `file_id`") + raise ValueError(f"Unsupported Responses input content type: {item_type!r}") + + +def messages_from_responses_input(value: Any) -> list[Message]: + """Translate ``input`` (string or list of items) into :class:`Message` objects.""" + if isinstance(value, str): + return [Message("user", [Content.from_text(text=value)])] + if not isinstance(value, list) or not value: + raise ValueError("`input` must be a non-empty string or list") + + messages: list[Message] = [] + pending_user_parts: list[Content] = [] + + def flush() -> None: + """Emit any buffered loose user content as a single user message.""" + if pending_user_parts: + messages.append(Message("user", list(pending_user_parts))) + pending_user_parts.clear() + + for item in cast("list[Any]", value): # type: ignore[redundant-cast] + if not isinstance(item, Mapping): + raise ValueError("each `input` item must be an object") + item_map = cast("Mapping[str, Any]", item) + if item_map.get("type") == "message": + flush() + role = str(item_map.get("role") or "user") + content: Any = item_map.get("content") or [] + parts: list[Content] + if isinstance(content, str): + parts = [Content.from_text(text=content)] + elif 
isinstance(content, list): + parts = [ + _content_from_input_item(cast("Mapping[str, Any]", c)) + for c in cast("list[Any]", content) # type: ignore[redundant-cast] + if isinstance(c, Mapping) + ] + else: + parts = [] + messages.append(Message(role, parts)) + else: + pending_user_parts.append(_content_from_input_item(item_map)) + + flush() + if not messages: + raise ValueError("`input` produced no messages") + return messages + + +def parse_responses_request( + body: Mapping[str, Any], +) -> tuple[list[Message], dict[str, Any], ChannelSession | None]: + """Translate a Responses-API request body into Agent Framework constructs. + + Returns a triple ``(messages, options, session)`` where: + + - ``messages`` is the parsed conversation. + - ``options`` is a ``ChatOptions``-shaped dict with the model-tunable + fields the channel lifted off the body. + - ``session`` is a :class:`ChannelSession` keyed by + ``previous_response_id`` when one was supplied, else ``None``. + """ + messages = messages_from_responses_input(body.get("input")) + + options: dict[str, Any] = {} + for key, value in body.items(): + if key in _RESPONSES_TRANSPORT_KEYS or value is None: + continue + if (mapped := _RESPONSES_OPTION_REMAP.get(key)) is not None: + options[mapped] = value + elif key in _RESPONSES_OPTION_PASSTHROUGH: + options[key] = value + # silently drop everything else (truncation, reasoning, include, ...) 
+ + session: ChannelSession | None = None + if (prev := body.get("previous_response_id")) and isinstance(prev, str): + session = ChannelSession(isolation_key=prev) + + return messages, options, session + + +__all__ = [ + "messages_from_responses_input", + "parse_response_target", + "parse_responses_identity", + "parse_responses_request", +] diff --git a/python/packages/hosting-responses/pyproject.toml b/python/packages/hosting-responses/pyproject.toml new file mode 100644 index 00000000000..6606c94455a --- /dev/null +++ b/python/packages/hosting-responses/pyproject.toml @@ -0,0 +1,98 @@ +[project] +name = "agent-framework-hosting-responses" +description = "OpenAI Responses-shaped channel for agent-framework-hosting." +authors = [{ name = "Microsoft", email = "af-support@microsoft.com"}] +readme = "README.md" +requires-python = ">=3.10" +version = "1.0.0a260424" +license-files = ["LICENSE"] +urls.homepage = "https://aka.ms/agent-framework" +urls.source = "https://github.com/microsoft/agent-framework/tree/main/python" +urls.release_notes = "https://github.com/microsoft/agent-framework/releases?q=tag%3Apython-1&expanded=true" +urls.issues = "https://github.com/microsoft/agent-framework/issues" +classifiers = [ + "License :: OSI Approved :: MIT License", + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", + "Programming Language :: Python :: 3.14", + "Typing :: Typed", +] +dependencies = [ + "agent-framework-core>=1.2.0,<2", + "agent-framework-hosting==1.0.0a260424", + "openai>=1.99.0,<3", +] + +[tool.uv] +prerelease = "if-necessary-or-explicit" +environments = [ + "sys_platform == 'darwin'", + "sys_platform == 'linux'", + "sys_platform == 'win32'" +] + +[tool.uv-dynamic-versioning] +fallback-version = "0.0.0" + 
+[tool.pytest.ini_options] +testpaths = 'tests' +addopts = "-ra -q -r fEX" +asyncio_mode = "auto" +asyncio_default_fixture_loop_scope = "function" +filterwarnings = [] +timeout = 120 +markers = [ + "integration: marks tests as integration tests that require external services", +] + +[tool.ruff] +extend = "../../pyproject.toml" + +[tool.coverage.run] +omit = [ + "**/__init__.py" +] + +[tool.pyright] +extends = "../../pyproject.toml" +include = ["agent_framework_hosting_responses"] +exclude = ['tests'] + +[tool.mypy] +plugins = ['pydantic.mypy'] +strict = true +python_version = "3.10" +ignore_missing_imports = true +disallow_untyped_defs = true +no_implicit_optional = true +check_untyped_defs = true +warn_return_any = true +show_error_codes = true +warn_unused_ignores = false +disallow_incomplete_defs = true +disallow_untyped_decorators = true + +[tool.bandit] +targets = ["agent_framework_hosting_responses"] +exclude_dirs = ["tests"] + +[tool.poe] +executor.type = "uv" +include = "../../shared_tasks.toml" + +[tool.poe.tasks.mypy] +help = "Run MyPy for this package." +cmd = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_hosting_responses" + +[tool.poe.tasks.test] +help = "Run the default unit test suite for this package." +cmd = 'pytest -m "not integration" --cov=agent_framework_hosting_responses --cov-report=term-missing:skip-covered tests' + +[build-system] +requires = ["flit-core >= 3.11,<4.0"] +build-backend = "flit_core.buildapi" diff --git a/python/packages/hosting-responses/tests/__init__.py b/python/packages/hosting-responses/tests/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/python/packages/hosting-responses/tests/test_channel.py b/python/packages/hosting-responses/tests/test_channel.py new file mode 100644 index 00000000000..c76dda9c448 --- /dev/null +++ b/python/packages/hosting-responses/tests/test_channel.py @@ -0,0 +1,287 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""End-to-end tests for :class:`ResponsesChannel` via Starlette's ``TestClient``.""" + +from __future__ import annotations + +from collections.abc import AsyncIterator +from dataclasses import dataclass +from typing import Any + +from agent_framework_hosting import ( + AgentFrameworkHost, + ChannelIdentity, + HostedRunResult, +) +from starlette.testclient import TestClient + +from agent_framework_hosting_responses import ResponsesChannel + +# --------------------------------------------------------------------------- # +# Fakes # +# --------------------------------------------------------------------------- # + + +@dataclass +class _FakeAgentResponse: + text: str + + +@dataclass +class _FakeUpdate: + text: str + + +class _FakeStream: + """Minimal stand-in for AF's ``ResponseStream`` returned by ``run(stream=True)``.""" + + def __init__(self, chunks: list[str]) -> None: + self._chunks = chunks + self._final = _FakeAgentResponse(text="".join(chunks)) + + def __aiter__(self) -> AsyncIterator[_FakeUpdate]: + async def _gen() -> AsyncIterator[_FakeUpdate]: + for c in self._chunks: + yield _FakeUpdate(c) + + return _gen() + + async def get_final_response(self) -> _FakeAgentResponse: + return self._final + + +class _FakeAgent: + def __init__(self, reply: str = "hello", chunks: list[str] | None = None) -> None: + self._reply = reply + self._chunks = chunks or [reply] + self.calls: list[dict[str, Any]] = [] + + def create_session(self, *, session_id: str | None = None) -> Any: + return {"session_id": session_id} + + def run(self, messages: Any = None, *, stream: bool = False, **kwargs: Any) -> Any: + self.calls.append({"messages": messages, "stream": stream, "kwargs": kwargs}) + if stream: + return _FakeStream(self._chunks) + + async def _coro() -> _FakeAgentResponse: + return _FakeAgentResponse(text=self._reply) + + return _coro() + + +class _RecordingPushChannel: + name = "telegram" + path = "/telegram" + + def __init__(self) -> None: + self.pushes: 
list[tuple[ChannelIdentity, HostedRunResult]] = [] + + def contribute(self, _ctx: Any) -> Any: + from agent_framework_hosting import ChannelContribution + + return ChannelContribution() + + async def push(self, identity: ChannelIdentity, payload: HostedRunResult) -> None: + self.pushes.append((identity, payload)) + + +# --------------------------------------------------------------------------- # +# Tests # +# --------------------------------------------------------------------------- # + + +def _make_client(agent: _FakeAgent | None = None) -> tuple[TestClient, AgentFrameworkHost, _FakeAgent]: + agent = agent or _FakeAgent() + host = AgentFrameworkHost(target=agent, channels=[ResponsesChannel()]) + return TestClient(host.app), host, agent + + +class TestResponsesChannelNonStreaming: + def test_post_responses_returns_completed_envelope(self) -> None: + client, _host, agent = _make_client(_FakeAgent(reply="hi back")) + with client: + r = client.post("/responses", json={"input": "hi"}) + assert r.status_code == 200 + body = r.json() + assert body["status"] == "completed" + assert body["object"] == "response" + assert body["id"].startswith("resp_") + assert body["output"][0]["content"][0]["text"] == "hi back" + assert len(agent.calls) == 1 + + def test_invalid_json_returns_400(self) -> None: + client, *_ = _make_client() + with client: + r = client.post("/responses", content=b"{not json", headers={"content-type": "application/json"}) + assert r.status_code == 400 + + def test_invalid_input_returns_422(self) -> None: + client, *_ = _make_client() + with client: + r = client.post("/responses", json={"input": 42}) + assert r.status_code == 422 + + def test_options_propagate_to_target_run(self) -> None: + client, _host, agent = _make_client() + with client: + r = client.post("/responses", json={"input": "x", "temperature": 0.5, "max_output_tokens": 64}) + assert r.status_code == 200 + opts = agent.calls[0]["kwargs"]["options"] + assert opts == {"temperature": 0.5, 
"max_tokens": 64} + + def test_previous_response_id_creates_session(self) -> None: + client, _host, agent = _make_client() + with client: + client.post("/responses", json={"input": "x", "previous_response_id": "resp_42"}) + # AgentFrameworkHost converts the channel session into an AgentSession. + sess = agent.calls[0]["kwargs"].get("session") + assert sess is not None + # _FakeAgent.create_session stashes the session_id on the dict it returns. + assert sess["session_id"] == "resp_42" + + def test_chat_isolation_header_creates_session_when_no_prev_id(self) -> None: + """Foundry-style ``x-agent-chat-isolation-key`` falls back to a session anchor. + + First-turn requests have no ``previous_response_id`` (the client + doesn't have one yet), but Foundry Hosted Agents always inject + the isolation headers. The channel must derive a session from the + chat key so the host can build a stable per-conversation session + that history providers persist under. + """ + client, _host, agent = _make_client() + with client: + client.post( + "/responses", + json={"input": "x"}, + headers={"x-agent-chat-isolation-key": "chat-abc"}, + ) + sess = agent.calls[0]["kwargs"].get("session") + assert sess is not None + assert sess["session_id"] == "chat-abc" + + def test_prev_response_id_wins_over_chat_isolation_header(self) -> None: + """When both anchors are present, ``previous_response_id`` wins. + + ``previous_response_id`` is the protocol-native chain anchor; the + header fallback is only meant to bootstrap when no protocol + anchor exists. 
+ """ + client, _host, agent = _make_client() + with client: + client.post( + "/responses", + json={"input": "x", "previous_response_id": "resp_99"}, + headers={"x-agent-chat-isolation-key": "chat-abc"}, + ) + sess = agent.calls[0]["kwargs"].get("session") + assert sess is not None + assert sess["session_id"] == "resp_99" + + def test_response_target_channel_returns_ack_text_when_pushed(self) -> None: + agent = _FakeAgent(reply="real reply") + push_ch = _RecordingPushChannel() + host = AgentFrameworkHost(target=agent, channels=[ResponsesChannel(), push_ch]) + + with TestClient(host.app) as client: + r = client.post( + "/responses", + json={ + "input": "hi", + "response_target": "telegram:42", + }, + ) + assert r.status_code == 200 + body = r.json() + text = body["output"][0]["content"][0]["text"] + assert "delivered out-of-band" in text + assert push_ch.pushes and push_ch.pushes[0][1].result.text == "real reply" + assert push_ch.pushes[0][0].native_id == "42" + + def test_response_hook_can_rewrite_originating_reply(self) -> None: + contexts: list[Any] = [] + + def hook(result: HostedRunResult, **kwargs: Any) -> HostedRunResult: + contexts.append(kwargs["context"]) + return HostedRunResult(_FakeAgentResponse(text=result.result.text.upper()), session=result.session) + + agent = _FakeAgent(reply="hooked") + host = AgentFrameworkHost(target=agent, channels=[ResponsesChannel(response_hook=hook)]) + + with TestClient(host.app) as client: + r = client.post("/responses", json={"input": "hi"}) + + assert r.status_code == 200 + body = r.json() + assert body["output"][0]["content"][0]["text"] == "HOOKED" + assert contexts + assert contexts[0].channel_name == "responses" + assert contexts[0].originating is True + assert contexts[0].destination_identity is None + + +class TestResponsesChannelStreaming: + def test_sse_emits_created_delta_completed(self) -> None: + agent = _FakeAgent(reply="hello world", chunks=["hello", " ", "world"]) + host = AgentFrameworkHost(target=agent, 
channels=[ResponsesChannel()]) + with TestClient(host.app) as client: + r = client.post("/responses", json={"input": "hi", "stream": True}) + assert r.status_code == 200 + body = r.text + + # SSE event lines look like "event: <type>\ndata: <json>\n\n". + events = [line[len("event: ") :] for line in body.splitlines() if line.startswith("event: ")] + assert events[0] == "response.created" + assert events[-1] == "response.completed" + assert events.count("response.output_text.delta") == 3 + + def test_sse_transform_hook_can_rewrite_chunks(self) -> None: + agent = _FakeAgent(reply="hello", chunks=["he", "llo"]) + + def transform(update: _FakeUpdate) -> _FakeUpdate: + return _FakeUpdate(text=update.text.upper()) + + host = AgentFrameworkHost(target=agent, channels=[ResponsesChannel(stream_transform_hook=transform)]) + with TestClient(host.app) as client: + r = client.post("/responses", json={"input": "hi", "stream": True}) + + assert r.status_code == 200 + assert '"delta":"HE"' in r.text + assert '"delta":"LLO"' in r.text + assert '"text":"HELLO"' in r.text + + def test_sse_emits_failed_when_stream_raises(self) -> None: + # Regression: ResponseOutputMessage.status only accepts in_progress/ + # completed/incomplete, so building an OpenAIResponse with status="failed" + # used to crash with a pydantic ValidationError. The channel must map the + # nested message status to "incomplete" while keeping the top-level + # Response.status="failed".
+ class _BoomStream: + def __aiter__(self) -> AsyncIterator[_FakeUpdate]: + async def _gen() -> AsyncIterator[_FakeUpdate]: + yield _FakeUpdate("partial") + raise RuntimeError("upstream blew up") + + return _gen() + + async def get_final_response(self) -> _FakeAgentResponse: # pragma: no cover + return _FakeAgentResponse(text="") + + class _BoomAgent(_FakeAgent): + def run(self, messages: Any = None, *, stream: bool = False, **kwargs: Any) -> Any: + self.calls.append({"messages": messages, "stream": stream, "kwargs": kwargs}) + if stream: + return _BoomStream() + raise AssertionError("non-streaming path not exercised here") + + host = AgentFrameworkHost(target=_BoomAgent(), channels=[ResponsesChannel()]) + with TestClient(host.app) as client: + r = client.post("/responses", json={"input": "hi", "stream": True}) + assert r.status_code == 200 + body = r.text + + events = [line[len("event: ") :] for line in body.splitlines() if line.startswith("event: ")] + assert events[0] == "response.created" + assert events[-1] == "response.failed" + # The failed envelope must serialize cleanly — i.e. no ValidationError raised. + assert "upstream blew up" in body diff --git a/python/packages/hosting-responses/tests/test_parsing.py b/python/packages/hosting-responses/tests/test_parsing.py new file mode 100644 index 00000000000..b2507574bf4 --- /dev/null +++ b/python/packages/hosting-responses/tests/test_parsing.py @@ -0,0 +1,204 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""Tests for the OpenAI Responses request-body parser.""" + +from __future__ import annotations + +import pytest +from agent_framework_hosting import ResponseTarget, ResponseTargetKind + +from agent_framework_hosting_responses import ( + messages_from_responses_input, + parse_response_target, + parse_responses_identity, + parse_responses_request, +) + + +class TestMessagesFromResponsesInput: + def test_string_input_becomes_single_user_message(self) -> None: + msgs = messages_from_responses_input("hello") + assert len(msgs) == 1 + assert msgs[0].role == "user" + assert msgs[0].text == "hello" + + def test_input_text_items_collapse_into_one_user_message(self) -> None: + msgs = messages_from_responses_input([{"type": "input_text", "text": "a"}, {"type": "input_text", "text": "b"}]) + assert len(msgs) == 1 + assert msgs[0].role == "user" + assert msgs[0].text == "a b" + + def test_message_envelope_with_string_content(self) -> None: + msgs = messages_from_responses_input([ + {"type": "message", "role": "system", "content": "be brief"}, + {"type": "message", "role": "user", "content": "hi"}, + ]) + assert [m.role for m in msgs] == ["system", "user"] + assert msgs[0].text == "be brief" + + def test_message_envelope_with_content_parts(self) -> None: + msgs = messages_from_responses_input([ + { + "type": "message", + "role": "user", + "content": [{"type": "input_text", "text": "describe this"}], + } + ]) + assert msgs[0].text == "describe this" + + def test_pending_text_flushes_before_message_envelope(self) -> None: + msgs = messages_from_responses_input([ + {"type": "input_text", "text": "first"}, + {"type": "message", "role": "user", "content": "second"}, + ]) + assert len(msgs) == 2 + assert msgs[0].text == "first" + assert msgs[1].text == "second" + + def test_image_url_via_string(self) -> None: + msgs = messages_from_responses_input([{"type": "input_image", "image_url": "https://example.com/cat.png"}]) + assert len(msgs) == 1 + # Image content present. 
+ assert any(getattr(c, "uri", None) == "https://example.com/cat.png" for c in msgs[0].contents) + + def test_image_url_via_object(self) -> None: + msgs = messages_from_responses_input([ + {"type": "input_image", "image_url": {"url": "https://example.com/cat.png"}} + ]) + assert any(getattr(c, "uri", None) == "https://example.com/cat.png" for c in msgs[0].contents) + + def test_unknown_input_type_raises(self) -> None: + with pytest.raises(ValueError, match="Unsupported"): + messages_from_responses_input([{"type": "weird"}]) + + def test_empty_list_raises(self) -> None: + with pytest.raises(ValueError, match="non-empty"): + messages_from_responses_input([]) + + def test_non_string_non_list_raises(self) -> None: + with pytest.raises(ValueError): + messages_from_responses_input(42) # type: ignore[arg-type] + + def test_image_url_missing_raises(self) -> None: + with pytest.raises(ValueError, match="image_url"): + messages_from_responses_input([{"type": "input_image"}]) + + +class TestParseResponsesRequest: + def test_instructions_are_forwarded_as_chat_options(self) -> None: + msgs, opts, sess = parse_responses_request({"input": "hi", "instructions": "be brief"}) + assert len(msgs) == 1 + assert msgs[0].role == "user" + assert msgs[0].text == "hi" + assert opts["instructions"] == "be brief" + assert sess is None + + def test_options_passthrough(self) -> None: + _, opts, _ = parse_responses_request({"input": "x", "temperature": 0.4, "top_p": 0.9, "tool_choice": "auto"}) + assert opts["temperature"] == 0.4 + assert opts["top_p"] == 0.9 + assert opts["tool_choice"] == "auto" + + def test_options_remap(self) -> None: + _, opts, _ = parse_responses_request({"input": "x", "max_output_tokens": 256, "parallel_tool_calls": False}) + assert opts == {"max_tokens": 256, "allow_multiple_tool_calls": False} + + def test_transport_keys_not_forwarded(self) -> None: + _, opts, _ = parse_responses_request({ + "input": "x", + "model": "gpt-x", + "stream": True, + "previous_response_id": 
"r", + }) + for key in ("input", "model", "stream", "previous_response_id"): + assert key not in opts + + def test_unknown_keys_silently_dropped(self) -> None: + _, opts, _ = parse_responses_request({"input": "x", "truncation": "auto", "reasoning": {"effort": "low"}}) + assert opts == {} + + def test_none_values_dropped(self) -> None: + _, opts, _ = parse_responses_request({"input": "x", "temperature": None}) + assert "temperature" not in opts + + def test_previous_response_id_becomes_session(self) -> None: + _, _, sess = parse_responses_request({"input": "x", "previous_response_id": "resp_42"}) + assert sess is not None + assert sess.isolation_key == "resp_42" + + +class TestParseResponseTarget: + def test_default_originating_when_missing(self) -> None: + assert parse_response_target({}).kind is ResponseTargetKind.ORIGINATING + + @pytest.mark.parametrize( + "value,expected_kind", + [ + ("originating", ResponseTargetKind.ORIGINATING), + ("active", ResponseTargetKind.ACTIVE), + ("all_linked", ResponseTargetKind.ALL_LINKED), + ("none", ResponseTargetKind.NONE), + ], + ) + def test_bare_string_kinds(self, value: str, expected_kind: ResponseTargetKind) -> None: + assert parse_response_target({"response_target": value}).kind is expected_kind + + def test_bare_string_other_becomes_channel(self) -> None: + target = parse_response_target({"response_target": "telegram"}) + assert target == ResponseTarget.channel("telegram") + + def test_bare_string_with_native_id_becomes_channel(self) -> None: + target = parse_response_target({"response_target": "telegram:42"}) + assert target.kind is ResponseTargetKind.CHANNELS + assert target.targets == ("telegram:42",) + + def test_list_form(self) -> None: + target = parse_response_target({"response_target": ["telegram:42", "originating"]}) + assert target == ResponseTarget.channels(["telegram:42", "originating"]) + + def test_list_drops_non_strings(self) -> None: + target = parse_response_target({"response_target": ["telegram", 42, 
""]}) + assert target.targets == ("telegram",) + + def test_empty_list_falls_back_to_originating(self) -> None: + target = parse_response_target({"response_target": []}) + assert target.kind is ResponseTargetKind.ORIGINATING + + def test_dict_with_channels(self) -> None: + target = parse_response_target({"response_target": {"channels": ["a", "b"]}}) + assert target == ResponseTarget.channels(["a", "b"]) + + @pytest.mark.parametrize( + "kind,expected", + [ + ("active", ResponseTargetKind.ACTIVE), + ("all_linked", ResponseTargetKind.ALL_LINKED), + ("none", ResponseTargetKind.NONE), + ("originating", ResponseTargetKind.ORIGINATING), + ], + ) + def test_dict_kind(self, kind: str, expected: ResponseTargetKind) -> None: + assert parse_response_target({"response_target": {"kind": kind}}).kind is expected + + def test_malformed_falls_back_to_originating(self) -> None: + target = parse_response_target({"response_target": 42}) + assert target.kind is ResponseTargetKind.ORIGINATING + + +class TestParseResponsesIdentity: + def test_safety_identifier_preferred(self) -> None: + ident = parse_responses_identity({"safety_identifier": "abc", "user": "legacy"}, "responses") + assert ident is not None + assert ident.native_id == "abc" + assert ident.channel == "responses" + + def test_fallback_to_user(self) -> None: + ident = parse_responses_identity({"user": "legacy"}, "responses") + assert ident is not None + assert ident.native_id == "legacy" + + def test_returns_none_when_absent(self) -> None: + assert parse_responses_identity({}, "responses") is None + + def test_returns_none_for_non_string(self) -> None: + assert parse_responses_identity({"safety_identifier": 42}, "responses") is None diff --git a/python/uv.lock b/python/uv.lock index a2b7ce16f27..f9b7388f420 100644 --- a/python/uv.lock +++ b/python/uv.lock @@ -51,6 +51,7 @@ members = [ "agent-framework-gemini", "agent-framework-github-copilot", "agent-framework-hosting", + "agent-framework-hosting-responses", 
"agent-framework-hyperlight", "agent-framework-lab", "agent-framework-mem0", @@ -647,6 +648,23 @@ provides-extras = ["serve", "disk"] [package.metadata.requires-dev] dev = [{ name = "httpx", specifier = ">=0.28.1" }] +[[package]] +name = "agent-framework-hosting-responses" +version = "1.0.0a260424" +source = { editable = "packages/hosting-responses" } +dependencies = [ + { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "agent-framework-hosting", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "openai", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] + +[package.metadata] +requires-dist = [ + { name = "agent-framework-core", editable = "packages/core" }, + { name = "agent-framework-hosting", editable = "packages/hosting" }, + { name = "openai", specifier = ">=1.99.0,<3" }, +] + [[package]] name = "agent-framework-hyperlight" version = "1.0.0b260521" From 07717d31245c287f2826eb255896ad70d9cd2b2f Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Thu, 28 May 2026 14:08:34 +0200 Subject: [PATCH 06/20] Python: add agent-framework-hosting-invocations channel (#5640) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(hosting-invocations): add Invocations channel package New ``agent-framework-hosting-invocations`` package implementing the "Invocations" HTTP channel for the Hosting framework -- a lightweight JSON-over-HTTP shape (``POST /invocations``) for callers that want a single request/response without committing to the full OpenAI Responses envelope. Mounts onto an ``AgentFrameworkHost`` like any other channel. Surface (re-exported from ``agent_framework_hosting_invocations``): - ``InvocationsChannel`` -- concrete ``Channel`` implementation. 
  Owns the Starlette route, parses inbound JSON into a ``ChannelRequest``
  (``input`` / ``session`` / ``metadata`` / ``options``), runs the optional
  ``ChannelRunHook``, calls back into the ``ChannelContext`` to invoke the
  agent target, and returns a flat JSON envelope (or an SSE stream when
  ``stream=true``).
  - 8 unit tests covering route wiring, isolation-key passthrough, hook
    composition, sync vs streaming paths, and ack-only behaviour for
    non-originating ``DeliveryReport``s.

  Registers the package in ``python/pyproject.toml`` ``[tool.uv.sources]``
  and adds the matching pyright ``executionEnvironments`` entry.

  Independent of PR-3 (Responses); both depend only on PR-2 (Hosting core).

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* review: address PR-4 round 2 feedback

  - expand `_stream` docstring to call out the HTTP-200 + `event: error`
    SSE contract (status committed before generator runs; hard failures
    surface as the first SSE frame, not an HTTP code)
  - split chunked text on full-line terminators via `splitlines()` so
    embedded `\r` / `\r\n` no longer leak into `data:` framing on the
    wire, breaking EventSource consumers
  - on `get_final_response()` failure, emit `event: error` instead of
    silently swallowing -- finalize is what triggers history-provider
    persistence on the agent side, so a 5xx / disk-full / context-provider
    error must reach the client
  - add tests covering `stream_transform_hook` (rewrite, drop, async),
    CRLF-in-chunk framing, and the finalize-error → no-`[DONE]` contract

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* docs(hosting-invocations): rename stale ChatMessage docstring reference to Message

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(hosting-invocations): adapt to hosted run result wrapper

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* feat(hosting-invocations): add response hooks

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
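
The SSE framing described in the review notes above can be sketched in isolation. This is a minimal illustration, not the channel's actual code: `sse_frame` is a hypothetical helper name (the channel inlines an equivalent loop in `_stream`), and the optional `event` parameter is an assumption added here to show the `event: error` case.

```python
def sse_frame(text: str, event: str = "") -> str:
    """Encode ``text`` as one SSE event (hypothetical helper, for illustration).

    One ``data:`` line is emitted per source line, followed by the blank
    line that terminates the event.
    """
    parts = []
    if event:
        parts.append(f"event: {event}\n")
    # str.splitlines() breaks on \r, \n and \r\n (plus a few other Unicode
    # line boundaries), so no bare carriage return ends up inside a data: line.
    # The ``or [""]`` fallback keeps an empty payload as a single empty data line.
    for line in text.splitlines() or [""]:
        parts.append(f"data: {line}\n")
    parts.append("\n")
    return "".join(parts)
```

With this shape, a chunk such as `"line1\r\nline2"` becomes two `data:` lines in a single event, and a failure can be reported as `sse_frame(str(exc), event="error")` without a trailing `data: [DONE]`.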
--------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- python/packages/hosting-invocations/LICENSE | 21 ++ python/packages/hosting-invocations/README.md | 30 ++ .../__init__.py | 7 + .../_channel.py | 219 +++++++++++++++ .../hosting-invocations/pyproject.toml | 97 +++++++ .../hosting-invocations/tests/__init__.py | 0 .../hosting-invocations/tests/test_channel.py | 256 ++++++++++++++++++ python/pyproject.toml | 2 + python/uv.lock | 11 +- 9 files changed, 642 insertions(+), 1 deletion(-) create mode 100644 python/packages/hosting-invocations/LICENSE create mode 100644 python/packages/hosting-invocations/README.md create mode 100644 python/packages/hosting-invocations/agent_framework_hosting_invocations/__init__.py create mode 100644 python/packages/hosting-invocations/agent_framework_hosting_invocations/_channel.py create mode 100644 python/packages/hosting-invocations/pyproject.toml create mode 100644 python/packages/hosting-invocations/tests/__init__.py create mode 100644 python/packages/hosting-invocations/tests/test_channel.py diff --git a/python/packages/hosting-invocations/LICENSE b/python/packages/hosting-invocations/LICENSE new file mode 100644 index 00000000000..9e841e7a26e --- /dev/null +++ b/python/packages/hosting-invocations/LICENSE @@ -0,0 +1,21 @@ + MIT License + + Copyright (c) Microsoft Corporation. + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. 
+ + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE diff --git a/python/packages/hosting-invocations/README.md b/python/packages/hosting-invocations/README.md new file mode 100644 index 00000000000..5587a2b6365 --- /dev/null +++ b/python/packages/hosting-invocations/README.md @@ -0,0 +1,30 @@ +# agent-framework-hosting-invocations + +Minimal `POST /invoke` channel for [agent-framework-hosting](../hosting). Useful +for smoke-testing, durable-task drivers, and bespoke clients that don't speak +the OpenAI Responses protocol. + +## Wire shape + +``` +POST /invocations/invoke +{ + "message": "hello", + "session_id": "user-42", + "stream": false +} +``` + +Non-streaming response: `{"response": "...", "session_id": "..."}`. +Streaming response: `text/event-stream` of `data:` lines, terminated by +`data: [DONE]`. + +## Usage + +```python +from agent_framework_hosting import AgentFrameworkHost +from agent_framework_hosting_invocations import InvocationsChannel + +host = AgentFrameworkHost(target=my_agent, channels=[InvocationsChannel()]) +host.serve() +``` diff --git a/python/packages/hosting-invocations/agent_framework_hosting_invocations/__init__.py b/python/packages/hosting-invocations/agent_framework_hosting_invocations/__init__.py new file mode 100644 index 00000000000..2ad7b4be911 --- /dev/null +++ b/python/packages/hosting-invocations/agent_framework_hosting_invocations/__init__.py @@ -0,0 +1,7 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""Minimal ``POST /invoke`` channel for :mod:`agent_framework_hosting`.""" + +from ._channel import InvocationsChannel + +__all__ = ["InvocationsChannel"] diff --git a/python/packages/hosting-invocations/agent_framework_hosting_invocations/_channel.py b/python/packages/hosting-invocations/agent_framework_hosting_invocations/_channel.py new file mode 100644 index 00000000000..bbaf27b4959 --- /dev/null +++ b/python/packages/hosting-invocations/agent_framework_hosting_invocations/_channel.py @@ -0,0 +1,219 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Minimal ``POST /invoke`` channel. + +Inspired by ``agent-framework-foundry-hosting``'s ``InvocationsHostServer``. +A framework-agnostic surface for callers that just want to send a message and +get an answer back — no OpenAI-style envelope, no Responses item lattice. +""" + +from __future__ import annotations + +from collections.abc import AsyncIterator, Awaitable +from typing import Any, cast + +from agent_framework_hosting import ( + ChannelContext, + ChannelContribution, + ChannelRequest, + ChannelResponseContext, + ChannelResponseHook, + ChannelRunHook, + ChannelSession, + ChannelStreamTransformHook, + HostedRunResult, + apply_response_hook, + apply_run_hook, + logger, +) +from starlette.requests import Request +from starlette.responses import JSONResponse, Response, StreamingResponse +from starlette.routing import Route + + +class InvocationsChannel: + """Minimal ``POST /invoke`` surface. + + A run hook can rewrite the channel request (e.g. inject a session, add + options) before the host invokes the agent. A stream-transform hook can + rewrite or drop ``AgentResponseUpdate`` chunks before they hit the wire. 
+ """ + + name = "invocations" + + def __init__( + self, + *, + path: str = "/invocations", + run_hook: ChannelRunHook | None = None, + response_hook: ChannelResponseHook | None = None, + stream_transform_hook: ChannelStreamTransformHook | None = None, + ) -> None: + """Configure the invocations endpoint. + + ``path`` is the mount root the host prefixes when registering this + channel's routes (the actual handler is ``POST {path}/invoke``). + ``run_hook`` may rewrite the :class:`ChannelRequest` before the host + invokes the target — typically to attach session metadata or + translate the wire payload into ``Message`` instances. + ``response_hook`` may rewrite the :class:`HostedRunResult` before + the channel serializes it to JSON for the originating caller. + ``stream_transform_hook`` lets callers map or drop individual + ``AgentResponseUpdate`` chunks while streaming. + """ + self.path = path + self._hook = run_hook + self.response_hook = response_hook + self._stream_transform_hook = stream_transform_hook + self._ctx: ChannelContext | None = None + + def contribute(self, context: ChannelContext) -> ChannelContribution: + """Capture the host-supplied context and register ``POST /invoke``.""" + self._ctx = context + return ChannelContribution(routes=[Route("/invoke", self._handle, methods=["POST"])]) + + async def _handle(self, request: Request) -> Response: + """Handle a single ``POST /invoke`` call. + + Validates the JSON body shape, builds a :class:`ChannelRequest` + (optionally with a ``ChannelSession`` keyed by ``session_id``), + runs the configured ``run_hook``, and either streams SSE chunks + when ``stream`` is true or returns a single JSON ``{response, + session_id}`` envelope. 
+ """ + if self._ctx is None: # pragma: no cover - guarded by Channel lifecycle + return JSONResponse({"error": "channel not initialized"}, status_code=500) + try: + body: Any = await request.json() + except Exception: + return JSONResponse({"error": "invalid json"}, status_code=400) + + if not isinstance(body, dict): + return JSONResponse({"error": "request body must be an object"}, status_code=422) + body_map: dict[str, Any] = cast("dict[str, Any]", body) + + message = body_map.get("message") + if not isinstance(message, str) or not message: + return JSONResponse({"error": "missing or empty 'message'"}, status_code=422) + + session_id = body_map.get("session_id") + if session_id is not None and not isinstance(session_id, str): + return JSONResponse({"error": "'session_id' must be a string"}, status_code=422) + + session = ChannelSession(isolation_key=f"invocations:{session_id}") if session_id else None + + attributes: dict[str, Any] = {} + if session_id: + attributes["session_id"] = session_id + + channel_request = ChannelRequest( + channel=self.name, + operation="invoke", + input=message, + session=session, + stream=bool(body_map.get("stream")), + attributes=attributes, + ) + + if self._hook is not None: + channel_request = await apply_run_hook( + self._hook, + channel_request, + target=self._ctx.target, + protocol_request=body_map, + ) + + if channel_request.stream: + return StreamingResponse( + self._stream(channel_request), + media_type="text/event-stream", + headers={"Cache-Control": "no-cache", "X-Accel-Buffering": "no"}, + ) + + result = await self._ctx.run(channel_request) + result = await self._apply_response_hook(result, channel_request) + return JSONResponse({"response": result.result.text, "session_id": session_id}) + + async def _apply_response_hook( + self, + result: HostedRunResult[Any], + request: ChannelRequest, + ) -> HostedRunResult[Any]: + """Apply the channel-level response hook for an originating reply.""" + if self.response_hook is None: + 
return result + context = ChannelResponseContext( + request=request, + channel_name=self.name, + destination_identity=None, + originating=True, + is_echo=False, + ) + return await apply_response_hook(self.response_hook, result, context=context) + + async def _stream(self, request: ChannelRequest) -> AsyncIterator[str]: + r"""Yield bare ``data:`` SSE lines for each text chunk + a final ``[DONE]``. + + SSE protocol notes: + + * The HTTP status is committed when ASGI sends headers, before the + generator runs. Emitting a stream-opening 200 + ``text/event-stream`` + and signalling errors via ``event: error`` SSE frames is the + conventional contract — ``EventSource`` and OpenAI-style SSE + consumers treat ``event: error`` as a terminal error condition. + Hard run-acquisition failures (e.g. target rejected) therefore + surface as the first frame, not as an HTTP error code. + * The SSE spec treats ``\r``, ``\n``, and ``\r\n`` as line + terminators. Per-chunk text is split on all three so embedded + carriage returns don't corrupt ``data:`` framing on the wire. + """ + if self._ctx is None: # pragma: no cover - guarded by Channel lifecycle + yield "event: error\ndata: channel not initialized\n\n" + return + try: + stream = self._ctx.run_stream(request) + async for update in stream: + if self._stream_transform_hook is not None: + transformed = self._stream_transform_hook(update) + if isinstance(transformed, Awaitable): + transformed = await transformed + if transformed is None: + continue + update = transformed + chunk = getattr(update, "text", None) + if chunk: + # Each text chunk is its own SSE event so curl-friendly + # consumers can read it directly. Newlines inside the + # chunk are escaped per SSE spec by emitting one + # ``data:`` line per source line. ``splitlines()`` is + # used over ``split('\n')`` so embedded ``\r`` / + # ``\r\n`` don't bleed into the framing. 
+ for line in str(chunk).splitlines() or [""]: + yield f"data: {line}\n" + yield "\n" + try: + # Finalize so context-provider / history hooks on the agent + # still run even though we are emitting our own SSE. + # If finalization fails, the agent's persistence side + # effects (history-provider write, context-provider hooks) + # are unreliable — surface that to the client as an + # ``event: error`` frame so it isn't a silent drop. + await stream.get_final_response() + except Exception as finalize_exc: + logger.exception("Invocations stream finalize failed") + yield "event: error\n" + for line in f"finalize failed: {finalize_exc!s}".splitlines() or [""]: + yield f"data: {line}\n" + yield "\n" + return + except Exception as exc: + logger.exception("Invocations stream consumption failed") + yield "event: error\n" + for line in str(exc).splitlines() or [""]: + yield f"data: {line}\n" + yield "\n" + return + yield "data: [DONE]\n\n" + + +__all__ = ["InvocationsChannel"] diff --git a/python/packages/hosting-invocations/pyproject.toml b/python/packages/hosting-invocations/pyproject.toml new file mode 100644 index 00000000000..80cb40bfc17 --- /dev/null +++ b/python/packages/hosting-invocations/pyproject.toml @@ -0,0 +1,97 @@ +[project] +name = "agent-framework-hosting-invocations" +description = "Minimal POST /invoke channel for agent-framework-hosting." 
+authors = [{ name = "Microsoft", email = "af-support@microsoft.com"}] +readme = "README.md" +requires-python = ">=3.10" +version = "1.0.0a260424" +license-files = ["LICENSE"] +urls.homepage = "https://aka.ms/agent-framework" +urls.source = "https://github.com/microsoft/agent-framework/tree/main/python" +urls.release_notes = "https://github.com/microsoft/agent-framework/releases?q=tag%3Apython-1&expanded=true" +urls.issues = "https://github.com/microsoft/agent-framework/issues" +classifiers = [ + "License :: OSI Approved :: MIT License", + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", + "Programming Language :: Python :: 3.14", + "Typing :: Typed", +] +dependencies = [ + "agent-framework-core>=1.2.0,<2", + "agent-framework-hosting==1.0.0a260424", +] + +[tool.uv] +prerelease = "if-necessary-or-explicit" +environments = [ + "sys_platform == 'darwin'", + "sys_platform == 'linux'", + "sys_platform == 'win32'" +] + +[tool.uv-dynamic-versioning] +fallback-version = "0.0.0" + +[tool.pytest.ini_options] +testpaths = 'tests' +addopts = "-ra -q -r fEX" +asyncio_mode = "auto" +asyncio_default_fixture_loop_scope = "function" +filterwarnings = [] +timeout = 120 +markers = [ + "integration: marks tests as integration tests that require external services", +] + +[tool.ruff] +extend = "../../pyproject.toml" + +[tool.coverage.run] +omit = [ + "**/__init__.py" +] + +[tool.pyright] +extends = "../../pyproject.toml" +include = ["agent_framework_hosting_invocations"] +exclude = ['tests'] + +[tool.mypy] +plugins = ['pydantic.mypy'] +strict = true +python_version = "3.10" +ignore_missing_imports = true +disallow_untyped_defs = true +no_implicit_optional = true +check_untyped_defs = true +warn_return_any = true +show_error_codes = true 
+warn_unused_ignores = false +disallow_incomplete_defs = true +disallow_untyped_decorators = true + +[tool.bandit] +targets = ["agent_framework_hosting_invocations"] +exclude_dirs = ["tests"] + +[tool.poe] +executor.type = "uv" +include = "../../shared_tasks.toml" + +[tool.poe.tasks.mypy] +help = "Run MyPy for this package." +cmd = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_hosting_invocations" + +[tool.poe.tasks.test] +help = "Run the default unit test suite for this package." +cmd = 'pytest -m "not integration" --cov=agent_framework_hosting_invocations --cov-report=term-missing:skip-covered tests' + +[build-system] +requires = ["flit-core >= 3.11,<4.0"] +build-backend = "flit_core.buildapi" diff --git a/python/packages/hosting-invocations/tests/__init__.py b/python/packages/hosting-invocations/tests/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/python/packages/hosting-invocations/tests/test_channel.py b/python/packages/hosting-invocations/tests/test_channel.py new file mode 100644 index 00000000000..cdd3403850a --- /dev/null +++ b/python/packages/hosting-invocations/tests/test_channel.py @@ -0,0 +1,256 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""End-to-end tests for :class:`InvocationsChannel`.""" + +from __future__ import annotations + +from collections.abc import AsyncIterator +from dataclasses import dataclass, replace +from typing import Any + +from agent_framework_hosting import AgentFrameworkHost, ChannelRequest, HostedRunResult +from starlette.testclient import TestClient + +from agent_framework_hosting_invocations import InvocationsChannel + + +@dataclass +class _FakeAgentResponse: + text: str + + +@dataclass +class _FakeUpdate: + text: str + + +class _FakeStream: + def __init__(self, chunks: list[str]) -> None: + self._chunks = chunks + self._final = _FakeAgentResponse(text="".join(chunks)) + + def __aiter__(self) -> AsyncIterator[_FakeUpdate]: + async def _gen() -> AsyncIterator[_FakeUpdate]: + for c in self._chunks: + yield _FakeUpdate(c) + + return _gen() + + async def get_final_response(self) -> _FakeAgentResponse: + return self._final + + +class _FakeAgent: + def __init__(self, reply: str = "hi", chunks: list[str] | None = None) -> None: + self._reply = reply + self._chunks = chunks or [reply] + self.calls: list[dict[str, Any]] = [] + + def create_session(self, *, session_id: str | None = None) -> Any: + return {"session_id": session_id} + + def run(self, messages: Any = None, *, stream: bool = False, **kwargs: Any) -> Any: + self.calls.append({"messages": messages, "stream": stream, "kwargs": kwargs}) + if stream: + return _FakeStream(self._chunks) + + async def _coro() -> _FakeAgentResponse: + return _FakeAgentResponse(text=self._reply) + + return _coro() + + +def _make_client(agent: _FakeAgent | None = None) -> tuple[TestClient, _FakeAgent]: + agent = agent or _FakeAgent() + host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel()]) + return TestClient(host.app), agent + + +class TestInvocations: + def test_post_invoke_returns_response(self) -> None: + client, _agent = _make_client(_FakeAgent(reply="pong")) + with client: + r = client.post("/invocations/invoke", 
json={"message": "ping"}) + assert r.status_code == 200 + assert r.json() == {"response": "pong", "session_id": None} + + def test_session_id_propagates_to_target(self) -> None: + client, agent = _make_client() + with client: + r = client.post("/invocations/invoke", json={"message": "x", "session_id": "s1"}) + assert r.status_code == 200 + assert r.json()["session_id"] == "s1" + sess = agent.calls[0]["kwargs"].get("session") + # Host converts ChannelSession.isolation_key -> AgentSession via + # target.create_session(session_id=...). Our fake stashes that here. + assert sess is not None + assert sess["session_id"] == "invocations:s1" + + def test_invalid_json_returns_400(self) -> None: + client, _ = _make_client() + with client: + r = client.post( + "/invocations/invoke", + content=b"{not json", + headers={"content-type": "application/json"}, + ) + assert r.status_code == 400 + + def test_empty_message_returns_422(self) -> None: + client, _ = _make_client() + with client: + r = client.post("/invocations/invoke", json={"message": ""}) + assert r.status_code == 422 + + def test_non_string_session_id_returns_422(self) -> None: + client, _ = _make_client() + with client: + r = client.post("/invocations/invoke", json={"message": "x", "session_id": 1}) + assert r.status_code == 422 + + def test_non_object_body_returns_422(self) -> None: + client, _ = _make_client() + with client: + r = client.post("/invocations/invoke", json=[]) + assert r.status_code == 422 + + def test_streaming_emits_data_lines_and_done(self) -> None: + agent = _FakeAgent(chunks=["hel", "lo"]) + host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel()]) + with TestClient(host.app) as client: + r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + assert r.status_code == 200 + body = r.text + assert "data: hel" in body + assert "data: lo" in body + assert body.rstrip().endswith("data: [DONE]") + + def test_run_hook_can_rewrite_request(self) -> None: + captured: 
list[ChannelRequest] = [] + + async def hook(req: ChannelRequest, **_: Any) -> ChannelRequest: + captured.append(req) + # Force stream off even if requested. + return replace(req, stream=False) + + agent = _FakeAgent(reply="ok") + host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel(run_hook=hook)]) + with TestClient(host.app) as client: + r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + assert r.status_code == 200 + # Even though caller asked for stream=True, hook flipped it off — so + # we get JSON back, not SSE. + assert r.headers["content-type"].startswith("application/json") + assert captured and captured[0].channel == "invocations" + + def test_response_hook_can_rewrite_originating_reply(self) -> None: + contexts: list[Any] = [] + + def hook(result: HostedRunResult, **kwargs: Any) -> HostedRunResult: + contexts.append(kwargs["context"]) + return HostedRunResult(_FakeAgentResponse(text=f"hooked:{result.result.text}"), session=result.session) + + agent = _FakeAgent(reply="pong") + host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel(response_hook=hook)]) + + with TestClient(host.app) as client: + r = client.post("/invocations/invoke", json={"message": "ping"}) + + assert r.status_code == 200 + assert r.json() == {"response": "hooked:pong", "session_id": None} + assert contexts + assert contexts[0].channel_name == "invocations" + assert contexts[0].originating is True + assert contexts[0].destination_identity is None + + def test_stream_transform_hook_can_rewrite_chunks(self) -> None: + agent = _FakeAgent(chunks=["foo", "bar"]) + + def transform(update: Any) -> Any: + return _FakeUpdate(text=update.text.upper()) + + host = AgentFrameworkHost( + target=agent, + channels=[InvocationsChannel(stream_transform_hook=transform)], + ) + with TestClient(host.app) as client: + r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + assert r.status_code == 200 + body = r.text + assert 
"data: FOO" in body + assert "data: BAR" in body + assert "data: foo" not in body + + def test_stream_transform_hook_can_drop_chunks(self) -> None: + agent = _FakeAgent(chunks=["keep", "drop", "keep2"]) + + def transform(update: Any) -> Any: + return None if update.text == "drop" else update + + host = AgentFrameworkHost( + target=agent, + channels=[InvocationsChannel(stream_transform_hook=transform)], + ) + with TestClient(host.app) as client: + r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + assert r.status_code == 200 + body = r.text + assert "data: keep" in body + assert "data: keep2" in body + assert "data: drop" not in body + + def test_stream_transform_hook_supports_async(self) -> None: + agent = _FakeAgent(chunks=["aa"]) + + async def transform(update: Any) -> Any: + return _FakeUpdate(text=update.text + "!") + + host = AgentFrameworkHost( + target=agent, + channels=[InvocationsChannel(stream_transform_hook=transform)], + ) + with TestClient(host.app) as client: + r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + assert r.status_code == 200 + assert "data: aa!" in r.text + + def test_streaming_chunk_with_crlf_splits_into_separate_data_lines(self) -> None: + # Per SSE spec, ``\r``, ``\n`` and ``\r\n`` are all line terminators; + # a chunk like ``"line1\r\nline2"`` must produce two ``data:`` lines, + # not one ``data:`` line containing an embedded ``\r``. 
+ agent = _FakeAgent(chunks=["line1\r\nline2"]) + host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel()]) + with TestClient(host.app) as client: + r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + assert r.status_code == 200 + body = r.text + assert "data: line1\n" in body + assert "data: line2\n" in body + assert "\r" not in body.split("data: [DONE]")[0] + + def test_streaming_finalize_error_emits_error_frame_no_done(self) -> None: + # ``get_final_response()`` is what triggers history-provider + # persistence on the agent side; if it fails we must surface that + # to the client as ``event: error`` rather than emitting ``[DONE]`` + # as if the run completed cleanly. + class _FailingFinalStream(_FakeStream): + async def get_final_response(self) -> _FakeAgentResponse: + raise RuntimeError("history backend exploded") + + class _AgentWithFailingFinal(_FakeAgent): + def run(self, messages: Any = None, *, stream: bool = False, **kwargs: Any) -> Any: + self.calls.append({"messages": messages, "stream": stream, "kwargs": kwargs}) + if stream: + return _FailingFinalStream(["partial"]) + return super().run(messages, stream=stream, **kwargs) + + agent = _AgentWithFailingFinal() + host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel()]) + with TestClient(host.app) as client: + r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + assert r.status_code == 200 + body = r.text + assert "data: partial" in body + assert "event: error" in body + assert "history backend exploded" in body + assert "[DONE]" not in body diff --git a/python/pyproject.toml b/python/pyproject.toml index ca53fbf8ebd..5f51bf598e5 100644 --- a/python/pyproject.toml +++ b/python/pyproject.toml @@ -88,6 +88,7 @@ agent-framework-foundry-local = { workspace = true } agent-framework-gemini = { workspace = true } agent-framework-github-copilot = { workspace = true } agent-framework-hosting = { workspace = true } 
+agent-framework-hosting-invocations = { workspace = true } agent-framework-hyperlight = { workspace = true } agent-framework-lab = { workspace = true } agent-framework-mem0 = { workspace = true } @@ -212,6 +213,7 @@ executionEnvironments = [ { root = "packages/foundry_local/tests", reportPrivateUsage = "none" }, { root = "packages/github_copilot/tests", reportPrivateUsage = "none" }, { root = "packages/hosting/tests", reportPrivateUsage = "none" }, + { root = "packages/hosting-invocations/tests", reportPrivateUsage = "none" }, { root = "packages/lab/gaia/tests", reportPrivateUsage = "none" }, { root = "packages/lab/lightning/tests", reportPrivateUsage = "none" }, { root = "packages/lab/tau2/tests", reportPrivateUsage = "none" }, diff --git a/python/uv.lock b/python/uv.lock index f9b7388f420..5c9fb2686ac 100644 --- a/python/uv.lock +++ b/python/uv.lock @@ -52,6 +52,7 @@ members = [ "agent-framework-github-copilot", "agent-framework-hosting", "agent-framework-hosting-responses", + "agent-framework-hosting-invocations", "agent-framework-hyperlight", "agent-framework-lab", "agent-framework-mem0", @@ -656,7 +657,6 @@ dependencies = [ { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "agent-framework-hosting", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "openai", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, -] [package.metadata] requires-dist = [ @@ -665,6 +665,15 @@ requires-dist = [ { name = "openai", specifier = ">=1.99.0,<3" }, ] +[[package]] +name = "agent-framework-hosting-invocations" +version = "1.0.0a260424" +source = { editable = "packages/hosting-invocations" } +dependencies = [ + { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "agent-framework-hosting", marker = "sys_platform == 
'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] + [[package]] name = "agent-framework-hyperlight" version = "1.0.0b260521" From fbcf0619d550c2fe2df44028cdffc2ffd595bba3 Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Thu, 28 May 2026 14:28:30 +0200 Subject: [PATCH 07/20] Python: add agent-framework-hosting-telegram channel (#5643) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(hosting-telegram): add Telegram channel package New ``agent-framework-hosting-telegram`` package implementing the Telegram Bot API channel for the Hosting framework. Mounts a webhook endpoint (``POST /telegram/webhook``) and an in-process polling loop onto an ``AgentFrameworkHost`` and translates Telegram ``Update`` payloads to/from the channel-neutral ``ChannelRequest`` / ``HostedRunResult`` plumbing. Surface (re-exported from ``agent_framework_hosting_telegram``): - ``TelegramChannel`` -- concrete ``Channel`` implementation. Owns the webhook route + an optional ``getUpdates`` long-polling lifespan, parses Telegram ``Update``s into ``ChannelRequest`` (text, photo, document, voice, callback_query, …), runs the optional ``ChannelRunHook``, calls back into the ``ChannelContext`` to invoke the agent target, and posts the response back via ``sendMessage`` / ``sendChatAction`` / ``answerCallbackQuery`` on the Telegram Bot API. Honours ``DeliveryReport.include_originating`` so cross-channel pushes can target the originating Telegram chat without double-acking. - Native fields the channel doesn't lift onto ``ChannelRequest`` (e.g. ``chat.type``, ``message.message_id``, ``callback_query.data``) are attached to ``ChannelRequest.attributes`` so a ``ChannelRunHook`` can pick them up via the standard ``protocol_request=`` kwarg. - 13 unit tests covering route wiring, ``Update`` parsing across the common content shapes, hook composition, and originating vs non-originating delivery branches. 
Registers the package in ``python/pyproject.toml`` ``[tool.uv.sources]`` and adds the matching pyright ``executionEnvironments`` entry. Stacks on PR-2 (Hosting core); independent of PR-3 / PR-4. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(hosting-telegram): preserve in-chat ordering, ack-before-run, drain shutdown - Replace per-update task fan-out with per-chat asyncio.Queue + worker. Telegram only guarantees update ordering up to getUpdates; the previous code spawned one task per update, which broke ordering for adjacent updates in the same chat. Updates are now serialised per chat_id (so /start then "what's the weather" can't race) while different chats still process in parallel. - Webhook handler now acks (200) immediately and runs the agent in the per-chat worker. Telegram redelivers any update the webhook doesn't 200 within ~60 seconds, so a streamed agent reply that runs longer than that previously triggered a retry storm and duplicate replies. - _on_shutdown now drains everything: poll task → per-chat workers → webhook-spawned dispatcher tasks (the new ack-before-run path), then deletes the webhook + closes the HTTP client. Previously webhook tasks were not tracked at all, so an in-flight agent invocation could leak past app shutdown. - _enqueue_update extracts chat_id from message / edited_message / callback_query; updates with no resolvable chat fall back to a one-shot dispatcher task that's still tracked in _update_tasks for shutdown. - Webhook handler now also returns 400 on malformed JSON / non-object payloads instead of crashing the request. 4 new tests cover per-chat serial ordering, parallel-across-chats isolation, ack-before-run latency, and shutdown drain. 
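For reviewers, the per-chat queue + worker pattern described above can be sketched in isolation. This is a minimal illustration, not the channel's actual code: ``PerKeySerializer``, its methods, and ``demo`` are hypothetical names, and it assumes only the standard-library ``asyncio`` module.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable, Hashable


class PerKeySerializer:
    """Hypothetical helper: strict arrival order per key, parallel across keys."""

    def __init__(self, handler: Callable[[Hashable, object], Awaitable[None]]) -> None:
        self._handler = handler
        self._queues: dict[Hashable, asyncio.Queue[object]] = {}
        self._workers: dict[Hashable, asyncio.Task[None]] = {}

    def enqueue(self, key: Hashable, item: object) -> None:
        # Lazily create one queue and one worker task per key (e.g. per chat id).
        queue = self._queues.get(key)
        if queue is None:
            queue = asyncio.Queue()
            self._queues[key] = queue
            self._workers[key] = asyncio.create_task(self._worker(key, queue))
        queue.put_nowait(item)

    async def _worker(self, key: Hashable, queue: asyncio.Queue[object]) -> None:
        # One item at a time, so items for the same key cannot overtake each other.
        while True:
            item = await queue.get()
            try:
                await self._handler(key, item)
            except Exception:
                pass  # a failing item must not kill the worker
            finally:
                queue.task_done()

    async def drain(self) -> None:
        # Wait until every queued item has been handled.
        await asyncio.gather(*(q.join() for q in self._queues.values()))

    async def close(self) -> None:
        # Cancel the workers and wait for them to exit.
        for worker in self._workers.values():
            worker.cancel()
        await asyncio.gather(*self._workers.values(), return_exceptions=True)
        self._workers.clear()
        self._queues.clear()


async def demo() -> list[tuple[int, str]]:
    seen: list[tuple[int, str]] = []

    async def handler(chat_id: Hashable, text: object) -> None:
        # The first update in each chat is slow; a task-per-update fan-out
        # would let the second update finish first.
        await asyncio.sleep(0.02 if text == "/start" else 0)
        seen.append((int(str(chat_id)), str(text)))

    serializer = PerKeySerializer(handler)
    for chat_id, text in [(1, "/start"), (1, "weather"), (2, "/start"), (2, "news")]:
        serializer.enqueue(chat_id, text)
    await serializer.drain()
    await serializer.close()
    return seen
```

Running ``asyncio.run(demo())`` should show each chat's updates handled in the order they were enqueued, even though the first update per chat takes longer than the second.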
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(hosting): drop redundant @pytest.mark.asyncio decorators asyncio_mode = "auto" is configured in pyproject.toml across the hosting packages, so individual @pytest.mark.asyncio decorators are unnecessary. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(hosting-telegram): adapt push tests to hosted run result wrapper Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(hosting-telegram): add response hooks Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- python/packages/hosting-telegram/LICENSE | 21 + python/packages/hosting-telegram/README.md | 29 + .../__init__.py | 7 + .../_channel.py | 873 ++++++++++++++++++ .../packages/hosting-telegram/pyproject.toml | 107 +++ .../hosting-telegram/tests/__init__.py | 0 .../hosting-telegram/tests/test_channel.py | 389 ++++++++ python/pyproject.toml | 2 + python/uv.lock | 12 + 9 files changed, 1440 insertions(+) create mode 100644 python/packages/hosting-telegram/LICENSE create mode 100644 python/packages/hosting-telegram/README.md create mode 100644 python/packages/hosting-telegram/agent_framework_hosting_telegram/__init__.py create mode 100644 python/packages/hosting-telegram/agent_framework_hosting_telegram/_channel.py create mode 100644 python/packages/hosting-telegram/pyproject.toml create mode 100644 python/packages/hosting-telegram/tests/__init__.py create mode 100644 python/packages/hosting-telegram/tests/test_channel.py diff --git a/python/packages/hosting-telegram/LICENSE b/python/packages/hosting-telegram/LICENSE new file mode 100644 index 00000000000..9e841e7a26e --- /dev/null +++ b/python/packages/hosting-telegram/LICENSE @@ -0,0 +1,21 @@ + MIT License + + Copyright (c) Microsoft Corporation. 
+ + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE diff --git a/python/packages/hosting-telegram/README.md b/python/packages/hosting-telegram/README.md new file mode 100644 index 00000000000..aa97e074173 --- /dev/null +++ b/python/packages/hosting-telegram/README.md @@ -0,0 +1,29 @@ +# agent-framework-hosting-telegram + +Telegram channel for [agent-framework-hosting](../hosting). Supports both +**polling** (default — no public URL required, perfect for local dev) and +**webhook** transports, multi-content messages (text + media), command +registration, and end-to-end SSE-style streaming via Telegram message edits. 
+ + ## Usage + + ```python + from agent_framework_hosting import AgentFrameworkHost + from agent_framework_hosting_telegram import TelegramChannel + + host = AgentFrameworkHost( + target=my_agent, + channels=[TelegramChannel(bot_token="...")], + ) + host.serve() + ``` + + For production, configure `webhook_url="https://…"` and the channel will + register the webhook on startup and receive updates over HTTPS. + + ## Identity & sessions + + Each Telegram chat is mapped to an opaque isolation key + (`telegram:<chat_id>`) so other channels can opt into the same per-chat + session by reusing the same key. The helper `telegram_isolation_key(chat_id)` + is exported for that purpose. diff --git a/python/packages/hosting-telegram/agent_framework_hosting_telegram/__init__.py b/python/packages/hosting-telegram/agent_framework_hosting_telegram/__init__.py new file mode 100644 index 00000000000..0be4a26381b --- /dev/null +++ b/python/packages/hosting-telegram/agent_framework_hosting_telegram/__init__.py @@ -0,0 +1,7 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Telegram channel for :mod:`agent_framework_hosting`.""" + +from ._channel import TelegramChannel, telegram_isolation_key + +__all__ = ["TelegramChannel", "telegram_isolation_key"] diff --git a/python/packages/hosting-telegram/agent_framework_hosting_telegram/_channel.py b/python/packages/hosting-telegram/agent_framework_hosting_telegram/_channel.py new file mode 100644 index 00000000000..2e7c20fb937 --- /dev/null +++ b/python/packages/hosting-telegram/agent_framework_hosting_telegram/_channel.py @@ -0,0 +1,873 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Built-in channel: Telegram (polling + webhook transports). + +Inspired by PR #5393's Telegram sample. Two transports are supported: + +- ``polling`` (default when no ``webhook_url`` is set): the channel runs a + background ``getUpdates`` long-poll loop. No public URL required — + perfect for local development. This is what ``python-telegram-bot`` + uses by default.
+- ``webhook``: when ``webhook_url`` is set, the channel registers it via + ``setWebhook`` on startup and receives updates over HTTPS POSTs to the + mounted ``/webhook`` route. This is the production-recommended mode. +""" + +from __future__ import annotations + +import asyncio +import contextlib +import time +from collections.abc import Awaitable, Callable, Mapping, Sequence +from typing import Any, Literal + +import httpx +from agent_framework import ( + AgentResponse, + AgentResponseUpdate, + Content, + Message, + ResponseStream, +) +from agent_framework_hosting import ( + ChannelCommand, + ChannelCommandContext, + ChannelContext, + ChannelContribution, + ChannelIdentity, + ChannelRequest, + ChannelResponseContext, + ChannelResponseHook, + ChannelRunHook, + ChannelSession, + ChannelStreamTransformHook, + HostedRunResult, + apply_response_hook, + apply_run_hook, + logger, +) +from starlette.requests import Request +from starlette.responses import JSONResponse, Response +from starlette.routing import BaseRoute, Route + +# Telegram update parsing ------------------------------------------------------ +# +# A Telegram message can carry text, a caption, and one of several media kinds +# (photo, document, voice, audio, video). For media we resolve the file_id +# into a public bot-file URL via ``getFile`` and emit a ``Content.from_uri``; +# the agent then receives a multi-content Message with text + media side by +# side, the same as it would over the Responses API. + +_TELEGRAM_MEDIA_DEFAULT_MIMETYPE = { + "photo": "image/jpeg", + "document": "application/octet-stream", + "voice": "audio/ogg", + "audio": "audio/mpeg", + "video": "video/mp4", +} + +# Telegram's hard limit on a single message body. Past this, sendMessage / +# editMessageText return 400. We truncate interim and final edits at this +# boundary; if the agent emits more, callers can split into a follow-up +# sendMessage in their run hook. 
+_TELEGRAM_MAX_TEXT_LEN = 4096 + + +def telegram_isolation_key(chat_id: Any) -> str: + """Build the namespaced isolation key the Telegram channel writes under. + + Exposed at module scope so other channels' ``run_hook`` callbacks can opt + into the same per-chat session (e.g. a Responses caller resuming a + Telegram conversation by passing the chat id). + """ + return f"telegram:{chat_id}" + + +def _text_result(text: str) -> HostedRunResult[AgentResponse]: + """Build a host delivery payload from text accumulated by this channel.""" + return HostedRunResult(AgentResponse(messages=[Message(role="assistant", contents=[Content.from_text(text=text)])])) + + +def _telegram_media_file_id(message: Mapping[str, Any]) -> tuple[str, str] | None: + """Return ``(file_id, fallback_media_type)`` for any media on the message.""" + photo = message.get("photo") + if isinstance(photo, list) and photo: + # Telegram delivers photos as an array of progressively-larger sizes. + largest = photo[-1] + if isinstance(largest, Mapping) and (fid := largest.get("file_id")): + return str(fid), _TELEGRAM_MEDIA_DEFAULT_MIMETYPE["photo"] + for kind in ("document", "voice", "audio", "video"): + media = message.get(kind) + if media and isinstance(media, Mapping) and (fid := media.get("file_id")): + return str(fid), str(media.get("mime_type") or _TELEGRAM_MEDIA_DEFAULT_MIMETYPE[kind]) + return None + + +async def _parse_telegram_message( + message: Mapping[str, Any], + resolve_file_url: Callable[[str], Awaitable[str | None]], +) -> Message: + """Translate one Telegram ``message`` object into an Agent Framework Message.""" + parts: list[Content] = [] + if (text := message.get("text") or message.get("caption")) and isinstance(text, str): + parts.append(Content.from_text(text=text)) + + if (media := _telegram_media_file_id(message)) is not None: + file_id, media_type = media + if (uri := await resolve_file_url(file_id)) is not None: + parts.append(Content.from_uri(uri=uri, media_type=media_type)) + + if 
not parts: + # Edge case: no recognizable content — emit an empty placeholder so the + # agent contract still receives a Message and can react gracefully. + parts.append(Content.from_text(text="")) + return Message("user", parts) + + +class TelegramChannel: + """Telegram channel with both polling and webhook transports. + + Update kinds handled (both transports): + - ``message`` / ``edited_message`` — text, captions, and media + (photo/document/voice/audio/video). + - ``callback_query`` — inline-button presses; the ``data`` payload is + treated as the user's next utterance and the click is acknowledged. + + Streaming + --------- + The channel defaults to ``stream=True`` on every ``ChannelRequest``: it + drives ``ChannelContext.run_stream`` and progressively edits a single + Telegram message as ``AgentResponseUpdate`` chunks arrive (Telegram has + no native streaming primitive). Pass ``stream=False`` on the constructor + to opt out for all messages, or override per-request inside the + ``run_hook`` (set ``ChannelRequest.stream = False``). A ``stream_transform_hook`` + can rewrite or drop individual updates before they hit the wire — useful + for redaction, formatting, or merging tool-call deltas. 
+ """ + + name = "telegram" + + def __init__( + self, + *, + bot_token: str, + path: str = "/telegram", + commands: Sequence[ChannelCommand] = (), + register_native_commands: bool = True, + run_hook: ChannelRunHook | None = None, + response_hook: ChannelResponseHook | None = None, + api_base: str = "https://api.telegram.org", + webhook_url: str | None = None, + secret_token: str | None = None, + parse_mode: str | None = None, + send_typing_action: bool = True, + transport: Literal["auto", "polling", "webhook"] = "auto", + polling_timeout: int = 30, + stream: bool = True, + stream_transform_hook: ChannelStreamTransformHook | None = None, + stream_edit_min_interval: float = 0.4, + ) -> None: + self.path = path + self._token = bot_token + self._commands = list(commands) + self._register = register_native_commands + self._hook = run_hook + self.response_hook = response_hook + self._stream_default = stream + self._stream_transform_hook = stream_transform_hook + self._stream_edit_min_interval = stream_edit_min_interval + self._api = f"{api_base}/bot{bot_token}" + self._webhook_url = webhook_url + self._secret_token = secret_token + self._parse_mode = parse_mode + self._send_typing_action = send_typing_action + if transport == "auto": + transport = "webhook" if webhook_url else "polling" + if transport == "webhook" and not webhook_url: + raise ValueError("transport='webhook' requires webhook_url") + self._transport: Literal["polling", "webhook"] = transport + self._polling_timeout = polling_timeout + self._ctx: ChannelContext | None = None + self._http: httpx.AsyncClient | None = None + self._poll_task: asyncio.Task[None] | None = None + # Tracks all in-flight tasks (per-chat workers + webhook-spawned + # dispatcher tasks). Drained on shutdown. + self._update_tasks: set[asyncio.Task[None]] = set() + # Per-chat serial workers preserve in-chat ordering: each + # chat_id has its own asyncio.Queue + worker task. 
Updates for + # different chats run in parallel; updates for the same chat + # run strictly in arrival order. + self._chat_queues: dict[int, asyncio.Queue[Mapping[str, Any]]] = {} + self._chat_workers: dict[int, asyncio.Task[None]] = {} + + def contribute(self, context: ChannelContext) -> ChannelContribution: + """Register the webhook route (only in ``webhook`` transport) plus lifecycle hooks. + + Polling-mode hosts intentionally expose no HTTP route — adding one + would just confuse readers who expect inbound HTTP traffic to do + something. + """ + self._ctx = context + routes: list[BaseRoute] = [] + if self._transport == "webhook": + routes.append(Route("/webhook", self._handle, methods=["POST"])) + return ChannelContribution( + routes=routes, + commands=self._commands, + on_startup=[self._on_startup], + on_shutdown=[self._on_shutdown], + ) + + # -- lifecycle --------------------------------------------------------- # + + async def _on_startup(self) -> None: + """Open the HTTP client, optionally register slash commands, and start the transport. + + - Polling: clears any previously-set webhook (Telegram refuses + ``getUpdates`` while one is registered) and launches the + long-poll task. + - Webhook: ``setWebhook`` to the configured URL, including the + optional secret token used to authenticate inbound calls. + """ + # ``getUpdates`` blocks for up to ``polling_timeout`` seconds, so the + # client timeout has to comfortably exceed it. Skip when a client has + # been pre-injected (e.g. by tests). 
+ if self._http is None: + self._http = httpx.AsyncClient(timeout=self._polling_timeout + 15) + if self._register and self._commands: + cmd_payload: dict[str, Any] = { + "commands": [{"command": c.name, "description": c.description} for c in self._commands] + } + await self._http.post(f"{self._api}/setMyCommands", json=cmd_payload) + logger.info("Registered %d Telegram commands", len(self._commands)) + + if self._transport == "webhook": + payload: dict[str, Any] = { + "url": self._webhook_url, + "allowed_updates": ["message", "edited_message", "callback_query"], + } + if self._secret_token: + payload["secret_token"] = self._secret_token + response = await self._http.post(f"{self._api}/setWebhook", json=payload) + response.raise_for_status() + logger.info("Telegram webhook registered: %s", self._webhook_url) + else: + # Telegram refuses getUpdates while a webhook is set, so clear it. + await self._http.post(f"{self._api}/deleteWebhook", json={"drop_pending_updates": False}) + self._poll_task = asyncio.create_task(self._poll_loop(), name="telegram-poll") + logger.info("Telegram polling started (long-poll timeout=%ss)", self._polling_timeout) + + async def _on_shutdown(self) -> None: + """Stop the polling task, drain in-flight workers, drop the webhook, close HTTP. + + Drain order: + 1. Cancel the poll task so no new updates are admitted. + 2. Cancel + await per-chat worker tasks so any currently-running + agent invocations can finish before we yank the HTTP client + out from under them. + 3. Cancel + await any webhook-dispatched tasks tracked in + ``_update_tasks`` (the webhook handler returns 200 immediately + and runs the agent in a background task, which the previous + shutdown ignored entirely). + 4. Best-effort `deleteWebhook` and HTTP client close. + + Webhook teardown is best-effort — failures (e.g. revoked token at + shutdown) are logged but never raised so app shutdown can complete. 
+ """ + if self._poll_task is not None: + self._poll_task.cancel() + with contextlib.suppress(asyncio.CancelledError, Exception): + await self._poll_task + self._poll_task = None + # Cancel per-chat workers; their queues are no longer being fed. + for worker in list(self._chat_workers.values()): + worker.cancel() + for worker in list(self._chat_workers.values()): + with contextlib.suppress(asyncio.CancelledError, Exception): + await worker + self._chat_workers.clear() + self._chat_queues.clear() + # Webhook-spawned dispatcher tasks (the ack-before-run path) live + # in _update_tasks alongside any leftover poll-spawned tasks. + for task in list(self._update_tasks): + task.cancel() + for task in list(self._update_tasks): + with contextlib.suppress(asyncio.CancelledError, Exception): + await task + self._update_tasks.clear() + if self._http is not None: + if self._transport == "webhook": + try: + await self._http.post(f"{self._api}/deleteWebhook") + except Exception: # pragma: no cover - best-effort cleanup + logger.exception("deleteWebhook failed") + await self._http.aclose() + + # -- polling loop ------------------------------------------------------ # + + async def _poll_loop(self) -> None: + """Long-poll ``getUpdates`` until cancelled. + + Each batch advances the ``offset`` by the highest seen + ``update_id`` so processed updates aren't redelivered. Updates + are routed to per-chat serial workers via :meth:`_enqueue_update` + — this preserves in-chat ordering (Telegram only guarantees + ordering up to ``getUpdates``; the previous fan-out into one + task per update broke that guarantee for adjacent updates). + Different chats still process in parallel because each has its + own worker. Transient errors back off for 2 seconds before + retrying. 
+ """ + if self._http is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("telegram channel not started") + offset: int | None = None + while True: + try: + params: dict[str, Any] = { + "timeout": self._polling_timeout, + "allowed_updates": '["message","edited_message","callback_query"]', + } + if offset is not None: + params["offset"] = offset + response = await self._http.get(f"{self._api}/getUpdates", params=params) + response.raise_for_status() + payload = response.json() + if not payload.get("ok"): + logger.warning("Telegram getUpdates returned error: %s", payload) + await asyncio.sleep(1.0) + continue + for update in payload.get("result", []) or []: + update_id = update.get("update_id") + if isinstance(update_id, int): + offset = update_id + 1 + self._enqueue_update(update) + except asyncio.CancelledError: + raise + except Exception: + logger.exception("Telegram polling iteration failed; retrying in 2s") + await asyncio.sleep(2.0) + + def _chat_id_for_update(self, update: Mapping[str, Any]) -> int | None: + """Best-effort extraction of the chat id from any supported update shape.""" + message = update.get("message") or update.get("edited_message") + if isinstance(message, Mapping): + chat = message.get("chat") + if isinstance(chat, Mapping): + cid = chat.get("id") + if isinstance(cid, int): + return cid + callback = update.get("callback_query") + if isinstance(callback, Mapping): + inner = callback.get("message") + if isinstance(inner, Mapping): + chat = inner.get("chat") + if isinstance(chat, Mapping): + cid = chat.get("id") + if isinstance(cid, int): + return cid + return None + + def _enqueue_update(self, update: Mapping[str, Any]) -> None: + """Route an update to its per-chat serial worker. + + Updates with no resolvable chat_id (malformed payloads, unknown + update types) fall back to a one-shot dispatcher task so they + can't deadlock the main loop. 
+ """ + chat_id = self._chat_id_for_update(update) + if chat_id is None: + # No chat to serialise on — fire and forget, but still track + # so shutdown can drain. + task = asyncio.create_task(self._safe_process_update(update)) + self._update_tasks.add(task) + task.add_done_callback(self._update_tasks.discard) + return + queue = self._chat_queues.get(chat_id) + if queue is None: + queue = asyncio.Queue() + self._chat_queues[chat_id] = queue + worker = asyncio.create_task( + self._chat_worker(chat_id, queue), + name=f"telegram-chat-worker-{chat_id}", + ) + self._chat_workers[chat_id] = worker + # Ensure shutdown can drain this worker too. + self._update_tasks.add(worker) + worker.add_done_callback(self._update_tasks.discard) + queue.put_nowait(update) + + async def _chat_worker(self, chat_id: int, queue: asyncio.Queue[Mapping[str, Any]]) -> None: + """Drain a single chat's queue serially. + + Per-chat ordering is preserved by processing one update at a + time. Exceptions in :meth:`_safe_process_update` are already + swallowed, so the worker keeps running. The worker is cancelled + on channel shutdown. + """ + try: + while True: + update = await queue.get() + try: + await self._safe_process_update(update) + finally: + queue.task_done() + except asyncio.CancelledError: + raise + + async def _safe_process_update(self, update: Mapping[str, Any]) -> None: + """Wrap :meth:`_process_update` so a failure on one update never escapes a task.""" + try: + await self._process_update(update) + except Exception: + logger.exception("Telegram update processing failed: %s", update.get("update_id")) + + # -- request handling -------------------------------------------------- # + + async def _handle(self, request: Request) -> Response: + """Webhook endpoint — verifies the secret token then queues the update. 
+ + Telegram includes the configured secret in the + ``X-Telegram-Bot-Api-Secret-Token`` header on every webhook delivery; + we reject mismatches so leaked URLs alone aren't enough to inject + traffic. + + **Acks before running the agent.** Telegram redelivers any update + the webhook doesn't 200 within ~60 seconds, so a streamed agent + reply that runs longer than that would otherwise trigger a + retry storm and duplicate replies. We enqueue onto the + per-chat serial worker (preserving ordering with polling-mode) + and immediately return 200; the actual processing happens in + the worker task tracked by ``_update_tasks`` and drained on + shutdown. + """ + if self._secret_token is not None: + received = request.headers.get("x-telegram-bot-api-secret-token") + if received != self._secret_token: + logger.warning("Telegram webhook secret token mismatch — rejecting update") + return JSONResponse({"ok": False, "error": "invalid secret"}, status_code=401) + + try: + update = await request.json() + except Exception: + logger.warning("Telegram webhook received malformed JSON; returning 400") + return JSONResponse({"ok": False, "error": "invalid json"}, status_code=400) + if not isinstance(update, Mapping): + logger.warning("Telegram webhook received non-object payload; returning 400") + return JSONResponse({"ok": False, "error": "invalid payload"}, status_code=400) + # Ack immediately, route through per-chat worker so ordering with + # polling-mode is identical and shutdown drains all in-flight work. + self._enqueue_update(update) + return JSONResponse({"ok": True}) + + async def _process_update(self, update: Mapping[str, Any]) -> None: + """Convert one Telegram update into a :class:`ChannelRequest` and dispatch. + + Branches: + - ``callback_query`` — inline-button click; handled separately so we + can ack the click and treat the button payload as the next user + utterance. 
+ - ``message`` / ``edited_message`` — the common text-and-attachment + case; runs slash commands when present, otherwise builds a + message and dispatches to the agent. + """ + if self._ctx is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("telegram channel not started") + + # Inline-button presses: ack the click, treat the payload as input. + if (callback := update.get("callback_query")) is not None: + await self._handle_callback_query(callback) + return + + # message and edited_message share the same shape. + message = update.get("message") or update.get("edited_message") or {} + chat_id = (message.get("chat") or {}).get("id") + text = message.get("text") or message.get("caption") + has_media = any(k in message for k in ("photo", "document", "voice", "audio", "video")) + if chat_id is None or (not isinstance(text, str) and not has_media): + return # Nothing actionable. + + # Native command dispatch — bypasses the agent. + if isinstance(text, str) and text.startswith("/"): + command_name = text[1:].split()[0].split("@", 1)[0] + handler = next((c for c in self._commands if c.name == command_name), None) + if handler is not None: + channel_request = ChannelRequest( + channel=self.name, + operation="command.invoke", + input=text, + session=ChannelSession(isolation_key=telegram_isolation_key(chat_id)), + attributes={"chat_id": chat_id}, + identity=ChannelIdentity(channel=self.name, native_id=str(chat_id)), + ) + ctx = ChannelCommandContext( + request=channel_request, + reply=lambda body, cid=chat_id: self._send(cid, body), + ) + await handler.handle(ctx) + return + + # Plain message → agent run. Build a multi-content Message with the + # text/caption alongside any attached media (photo, document, ...). 
+ parsed = await _parse_telegram_message(message, self._resolve_file_url) + channel_request = ChannelRequest( + channel=self.name, + operation="message.create", + input=[parsed], + session=ChannelSession(isolation_key=telegram_isolation_key(chat_id)), + attributes={"chat_id": chat_id}, + stream=self._stream_default, + identity=ChannelIdentity(channel=self.name, native_id=str(chat_id)), + ) + if self._hook is not None: + channel_request = await apply_run_hook( + self._hook, + channel_request, + target=self._ctx.target, + protocol_request=update, + ) + + await self._dispatch(chat_id, channel_request) + + async def _handle_callback_query(self, callback: Mapping[str, Any]) -> None: + """Handle an inline-button click. + + Always answers the callback query to clear the spinner on the user's + client, then treats the button's ``data`` payload as the user's + next utterance and dispatches it as if they had typed it. + Callbacks without a chat or string ``data`` are silently dropped. + """ + if self._ctx is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("telegram channel not started") + if self._http is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("telegram channel not started") + callback_id = callback.get("id") + data = callback.get("data") + message = callback.get("message") or {} + chat_id = (message.get("chat") or {}).get("id") + + if callback_id is not None: + # Always answer to remove the loading spinner on the user's client. 
+ try: + await self._http.post(f"{self._api}/answerCallbackQuery", json={"callback_query_id": callback_id}) + except Exception: # pragma: no cover - defensive + logger.exception("answerCallbackQuery failed") + + if chat_id is None or not isinstance(data, str): + return + + channel_request = ChannelRequest( + channel=self.name, + operation="message.create", + input=data, + session=ChannelSession(isolation_key=telegram_isolation_key(chat_id)), + attributes={"chat_id": chat_id, "callback_query_id": callback_id}, + stream=self._stream_default, + identity=ChannelIdentity(channel=self.name, native_id=str(chat_id)), + ) + if self._hook is not None: + channel_request = await apply_run_hook( + self._hook, + channel_request, + target=self._ctx.target, + protocol_request=callback, + ) + + await self._dispatch(chat_id, channel_request) + + async def _resolve_file_url(self, file_id: str) -> str | None: + """Resolve a Telegram file_id into an HTTPS URL via getFile.""" + if self._http is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("telegram channel not started") + try: + response = await self._http.get(f"{self._api}/getFile", params={"file_id": file_id}) + response.raise_for_status() + file_path = response.json().get("result", {}).get("file_path") + except Exception: # pragma: no cover - defensive: bad token, network, etc. 
+ logger.exception("getFile failed for %s", file_id) + return None + return f"{self._api.replace('/bot', '/file/bot')}/{file_path}" if file_path else None + + # -- outbound helpers -------------------------------------------------- # + + async def _dispatch(self, chat_id: int, request: ChannelRequest) -> None: + """Run the request and forward results to ``chat_id``.""" + if self._ctx is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("telegram channel not started") + if not request.stream: + if self._send_typing_action: + await self._send_chat_action(chat_id, "typing") + result = await self._ctx.run(request) + include_originating = await self._ctx.deliver_response(request, result) + if include_originating: + result = await self._apply_response_hook(result, request) + await self._reply_with_result(chat_id, result.result) + return + + stream = self._ctx.run_stream(request) + await self._stream_to_chat(chat_id, request, stream) + + async def _stream_to_chat( + self, + chat_id: int, + request: ChannelRequest, + stream: ResponseStream[AgentResponseUpdate, AgentResponse], + ) -> None: + """Iterate the agent's ResponseStream and progressively edit a Telegram message. + + Smoothness recipe: + + 1. Send the placeholder message up front so the user sees instant + activity (a "…" bubble) instead of waiting for the first edit. + 2. Token consumption never awaits the network — a background + ``edit_worker`` watches an asyncio.Event, coalesces accumulated + text, rate-limits itself to ``stream_edit_min_interval`` (default + 0.4s — well under Telegram's per-chat edit limits), and only sends + when the text actually changed. + 3. Interim edits are sent as **plain text** even if a ``parse_mode`` + is configured. Partial Markdown/HTML mid-stream is invalid and + Telegram rejects it with 400 ``can't parse entities``. The final + edit re-applies the configured ``parse_mode`` so the user ends up + with formatted output. + 4. 
``sendChatAction("typing")`` is re-issued every 4s while the
+           stream is live so the typing bubble doesn't disappear on long
+           responses (Telegram clears it after ~5s).
+        """
+        if self._http is None:  # pragma: no cover - guarded by lifecycle
+            raise RuntimeError("telegram channel not started")
+        # Pin to a local so mypy narrows inside the nested closures below.
+        http = self._http
+
+        accumulated = ""
+        last_sent = ""
+        last_edit_at = 0.0
+        message_id: int | None = None
+        worker_done = asyncio.Event()
+        wake = asyncio.Event()
+
+        async def send_initial_placeholder() -> None:
+            nonlocal message_id, last_edit_at
+            try:
+                response = await http.post(
+                    f"{self._api}/sendMessage",
+                    json={"chat_id": chat_id, "text": "…"},
+                )
+                response.raise_for_status()
+                message_id = response.json().get("result", {}).get("message_id")
+                last_edit_at = time.monotonic()
+            except Exception:  # pragma: no cover - placeholder is best-effort
+                logger.exception("Telegram placeholder send failed")
+
+        async def edit_worker() -> None:
+            nonlocal last_sent, last_edit_at
+            while not (worker_done.is_set() and (message_id is None or accumulated[:_TELEGRAM_MAX_TEXT_LEN] == last_sent)):
+                await wake.wait()
+                wake.clear()
+                if message_id is None or accumulated == last_sent:
+                    continue
+                elapsed = time.monotonic() - last_edit_at
+                if elapsed < self._stream_edit_min_interval:
+                    try:
+                        await asyncio.wait_for(wake.wait(), timeout=self._stream_edit_min_interval - elapsed)
+                        wake.clear()
+                    except asyncio.TimeoutError:
+                        pass
+                snapshot = accumulated[:_TELEGRAM_MAX_TEXT_LEN]
+                if snapshot == last_sent:
+                    continue
+                # Interim edits go out as plain text — partial Markdown/HTML
+                # is invalid mid-stream and Telegram returns 400.
+ try: + await http.post( + f"{self._api}/editMessageText", + json={"chat_id": chat_id, "message_id": message_id, "text": snapshot}, + ) + except Exception: # pragma: no cover - keep streaming on error + logger.exception("Telegram interim edit failed") + last_sent = snapshot + last_edit_at = time.monotonic() + + async def typing_worker() -> None: + while not worker_done.is_set(): + await self._send_chat_action(chat_id, "typing") + try: + await asyncio.wait_for(worker_done.wait(), timeout=4.0) + except asyncio.TimeoutError: + continue + + await send_initial_placeholder() + edit_task = asyncio.create_task(edit_worker(), name="telegram-edit-worker") + typing_task = asyncio.create_task(typing_worker(), name="telegram-typing-worker") + + try: + async for update in stream: + if self._stream_transform_hook is not None: + transformed = self._stream_transform_hook(update) + if isinstance(transformed, Awaitable): + transformed = await transformed + if transformed is None: + continue + update = transformed + chunk = getattr(update, "text", None) + if chunk: + accumulated += chunk + wake.set() + except Exception: + logger.exception("Telegram streaming consumption failed") + finally: + worker_done.set() + wake.set() + try: + await edit_task + except Exception: # pragma: no cover + logger.exception("Telegram edit worker crashed") + typing_task.cancel() + with contextlib.suppress(asyncio.CancelledError, Exception): + await typing_task + + # Always finalize so context providers / history hooks run. + try: + final = await stream.get_final_response() + except Exception: # pragma: no cover - finalize is best-effort + logger.exception("Stream finalize failed") + final = None + + # Final edit applies parse_mode (if configured) to the full text. 
+ final_text = (accumulated or last_sent)[:_TELEGRAM_MAX_TEXT_LEN] + result = _text_result(final_text) if final_text else None + include_originating = True + if result is not None and self._ctx is not None: + include_originating = await self._ctx.deliver_response(request, result) + if include_originating: + result = await self._apply_response_hook(result, request) + final_text = result.result.text[:_TELEGRAM_MAX_TEXT_LEN] + if message_id is not None and final_text and final_text != last_sent: + payload: dict[str, Any] = { + "chat_id": chat_id, + "message_id": message_id, + "text": final_text, + } + if self._parse_mode: + payload["parse_mode"] = self._parse_mode + try: + response = await self._http.post(f"{self._api}/editMessageText", json=payload) + # If parse_mode rejected the final edit, retry as plain text + # so the user still sees the answer. + if response.status_code == 400 and self._parse_mode: + payload.pop("parse_mode", None) + await self._http.post(f"{self._api}/editMessageText", json=payload) + except Exception: # pragma: no cover + logger.exception("Telegram final edit failed") + + # If nothing ever streamed (no text chunks at all), fall back to the + # full result so images / tool outputs still reach the user. 
+ if not accumulated and include_originating: + if final is not None: + wrapped_final = await self._apply_response_hook(HostedRunResult(final), request) + final = wrapped_final.result + await self._reply_with_result(chat_id, final) + + async def _apply_response_hook( + self, + result: HostedRunResult[Any], + request: ChannelRequest, + ) -> HostedRunResult[Any]: + """Apply the channel-level response hook for an originating reply.""" + if self.response_hook is None: + return result + context = ChannelResponseContext( + request=request, + channel_name=self.name, + destination_identity=None, + originating=True, + is_echo=False, + ) + return await apply_response_hook(self.response_hook, result, context=context) + + async def _reply_with_result(self, chat_id: int, result: Any) -> None: + """Forward an AgentRunResponse back to Telegram. + + Sends any image attachments on the last assistant message as photos, + then the text body via ``sendMessage``. Falls back to a ``"(no + response)"`` placeholder if neither text nor images are present so + the user is never left hanging. + """ + sent_photo = False + last_message = None + messages = getattr(result, "messages", None) or [] + for msg in reversed(messages): + if getattr(msg, "role", None) == "assistant": + last_message = msg + break + + if last_message is not None: + for content in getattr(last_message, "contents", []) or []: + uri = getattr(content, "uri", None) + media_type = getattr(content, "media_type", "") or "" + if uri and isinstance(media_type, str) and media_type.startswith("image/"): + await self._send_photo(chat_id, uri) + sent_photo = True + + text = getattr(result, "text", None) + if text: + await self._send(chat_id, text) + elif not sent_photo: + await self._send(chat_id, "(no response)") + + async def _send(self, chat_id: int, text: str, **extra: Any) -> None: + """POST a ``sendMessage`` to Telegram, applying the configured ``parse_mode`` by default. 
+ + Extra kwargs are merged into the payload after ``parse_mode`` so + callers can override any field per-call (e.g. drop ``parse_mode`` + for a known-unsafe interim text). + """ + if self._http is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("telegram channel not started") + payload: dict[str, Any] = {"chat_id": chat_id, "text": text} + if self._parse_mode and "parse_mode" not in extra: + payload["parse_mode"] = self._parse_mode + payload.update(extra) + await self._http.post(f"{self._api}/sendMessage", json=payload) + + # -- ChannelPush -------------------------------------------------------- # + + async def push(self, identity: ChannelIdentity, payload: HostedRunResult[AgentResponse]) -> None: + """Proactive delivery to a Telegram chat. + + Implements :class:`host.ChannelPush` so other channels' callers can + target Telegram via ``ChannelRequest.response_target`` + (e.g. ``ResponseTarget.channels(["telegram:8741188429"])`` from a + ``/responses`` request). ``identity.native_id`` is the Telegram + chat id. + """ + try: + chat_id = int(identity.native_id) + except ValueError as exc: + raise ValueError(f"Telegram push requires an int chat_id, got {identity.native_id!r}") from exc + if self._http is None: + raise RuntimeError("TelegramChannel.push called before startup") + await self._send(chat_id, payload.result.text) + + async def _send_photo(self, chat_id: int, photo_url: str, caption: str | None = None) -> None: + """POST a ``sendPhoto`` to Telegram with an optional caption.""" + if self._http is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("telegram channel not started") + payload: dict[str, Any] = {"chat_id": chat_id, "photo": photo_url} + if caption: + payload["caption"] = caption + await self._http.post(f"{self._api}/sendPhoto", json=payload) + + async def _send_chat_action(self, chat_id: int, action: str) -> None: + """Fire a ``sendChatAction`` (typing, upload_photo, …); errors are logged and swallowed. 
+ + Chat actions are pure UX hints — Telegram clears them after ~5s + — so failures should never propagate to the caller. + """ + if self._http is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("telegram channel not started") + try: + await self._http.post(f"{self._api}/sendChatAction", json={"chat_id": chat_id, "action": action}) + except Exception: # pragma: no cover - non-critical UX + logger.exception("sendChatAction failed") + + +__all__ = ["TelegramChannel", "telegram_isolation_key"] diff --git a/python/packages/hosting-telegram/pyproject.toml b/python/packages/hosting-telegram/pyproject.toml new file mode 100644 index 00000000000..41f8d7ea0cd --- /dev/null +++ b/python/packages/hosting-telegram/pyproject.toml @@ -0,0 +1,107 @@ +[project] +name = "agent-framework-hosting-telegram" +description = "Telegram channel for agent-framework-hosting." +authors = [{ name = "Microsoft", email = "af-support@microsoft.com"}] +readme = "README.md" +requires-python = ">=3.10" +version = "1.0.0a260424" +license-files = ["LICENSE"] +urls.homepage = "https://aka.ms/agent-framework" +urls.source = "https://github.com/microsoft/agent-framework/tree/main/python" +urls.release_notes = "https://github.com/microsoft/agent-framework/releases?q=tag%3Apython-1&expanded=true" +urls.issues = "https://github.com/microsoft/agent-framework/issues" +classifiers = [ + "License :: OSI Approved :: MIT License", + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", + "Programming Language :: Python :: 3.14", + "Typing :: Typed", +] +dependencies = [ + "agent-framework-core>=1.2.0,<2", + "agent-framework-hosting==1.0.0a260424", + "httpx>=0.27,<1", +] + +[tool.uv] +prerelease = "if-necessary-or-explicit" +environments = [ + "sys_platform == 
'darwin'", + "sys_platform == 'linux'", + "sys_platform == 'win32'" +] + +[tool.uv-dynamic-versioning] +fallback-version = "0.0.0" + +[tool.pytest.ini_options] +testpaths = 'tests' +addopts = "-ra -q -r fEX" +asyncio_mode = "auto" +asyncio_default_fixture_loop_scope = "function" +filterwarnings = [] +timeout = 120 +markers = [ + "integration: marks tests as integration tests that require external services", +] + +[tool.ruff] +extend = "../../pyproject.toml" + +[tool.coverage.run] +omit = [ + "**/__init__.py" +] + +[tool.pyright] +extends = "../../pyproject.toml" +include = ["agent_framework_hosting_telegram"] +exclude = ['tests'] +# Telegram's API delivers loosely-typed JSON-ish maps (chat, message, photo, +# media, callback_query). Strict ``Unknown`` reporting on every ``.get(...)`` +# adds noise without catching real bugs — narrowing happens via runtime +# isinstance checks instead. Other type checks remain strict. +reportUnknownArgumentType = "none" +reportUnknownMemberType = "none" +reportUnknownVariableType = "none" +reportUnknownLambdaType = "none" +reportOptionalMemberAccess = "none" + +[tool.mypy] +plugins = ['pydantic.mypy'] +strict = true +python_version = "3.10" +ignore_missing_imports = true +disallow_untyped_defs = true +no_implicit_optional = true +check_untyped_defs = true +warn_return_any = true +show_error_codes = true +warn_unused_ignores = false +disallow_incomplete_defs = true +disallow_untyped_decorators = true + +[tool.bandit] +targets = ["agent_framework_hosting_telegram"] +exclude_dirs = ["tests"] + +[tool.poe] +executor.type = "uv" +include = "../../shared_tasks.toml" + +[tool.poe.tasks.mypy] +help = "Run MyPy for this package." +cmd = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_hosting_telegram" + +[tool.poe.tasks.test] +help = "Run the default unit test suite for this package." 
+cmd = 'pytest -m "not integration" --cov=agent_framework_hosting_telegram --cov-report=term-missing:skip-covered tests' + +[build-system] +requires = ["flit-core >= 3.11,<4.0"] +build-backend = "flit_core.buildapi" diff --git a/python/packages/hosting-telegram/tests/__init__.py b/python/packages/hosting-telegram/tests/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/python/packages/hosting-telegram/tests/test_channel.py b/python/packages/hosting-telegram/tests/test_channel.py new file mode 100644 index 00000000000..6b4930b48dd --- /dev/null +++ b/python/packages/hosting-telegram/tests/test_channel.py @@ -0,0 +1,389 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Unit tests for :mod:`agent_framework_hosting_telegram`. + +These tests exercise the internal parsing helpers and the webhook entry-point +without spinning up a real Telegram bot. The polling loop and HTTP-side +helpers are excluded from coverage because they require a live bot token. +""" + +from __future__ import annotations + +import asyncio +import contextlib +from collections.abc import Mapping +from dataclasses import dataclass +from typing import Any +from unittest.mock import AsyncMock, MagicMock + +from agent_framework import AgentResponse, Content, Message +from agent_framework_hosting import ( + AgentFrameworkHost, + ChannelCommand, + ChannelCommandContext, + ChannelRequest, + HostedRunResult, +) +from starlette.testclient import TestClient + +from agent_framework_hosting_telegram import TelegramChannel, telegram_isolation_key +from agent_framework_hosting_telegram._channel import ( + _parse_telegram_message, + _telegram_media_file_id, +) + +# --------------------------------------------------------------------------- # +# Pure helpers # +# --------------------------------------------------------------------------- # + + +def test_telegram_isolation_key_format() -> None: + assert telegram_isolation_key(42) == "telegram:42" + assert 
telegram_isolation_key("abc") == "telegram:abc" + + +class TestMediaFileId: + def test_no_media(self) -> None: + assert _telegram_media_file_id({"text": "hi"}) is None + + def test_photo_picks_largest(self) -> None: + assert _telegram_media_file_id({"photo": [{"file_id": "small"}, {"file_id": "large"}]}) == ( + "large", + "image/jpeg", + ) + + def test_photo_empty_list(self) -> None: + assert _telegram_media_file_id({"photo": []}) is None + + def test_document_uses_mime_type(self) -> None: + result = _telegram_media_file_id({"document": {"file_id": "f1", "mime_type": "application/pdf"}}) + assert result == ("f1", "application/pdf") + + def test_voice_default_mime(self) -> None: + result = _telegram_media_file_id({"voice": {"file_id": "v1"}}) + assert result == ("v1", "audio/ogg") + + +class TestParseTelegramMessage: + async def test_text_only(self) -> None: + async def resolve(_: str) -> str | None: + return None + + msg = await _parse_telegram_message({"text": "hello"}, resolve) + assert msg.role == "user" + assert msg.text == "hello" + + async def test_text_and_photo(self) -> None: + async def resolve(file_id: str) -> str | None: + return f"https://files.telegram.org/{file_id}" + + msg = await _parse_telegram_message({"caption": "look", "photo": [{"file_id": "p1"}]}, resolve) + assert msg.text == "look" + # Image content present. + assert any((getattr(c, "uri", None) or "").endswith("/p1") for c in msg.contents) + + async def test_unresolvable_media_falls_back_to_text(self) -> None: + async def resolve(_: str) -> str | None: + return None + + msg = await _parse_telegram_message({"text": "x", "voice": {"file_id": "v1"}}, resolve) + # Resolver returned None — the contents should still include the + # text without crashing. 
+ assert msg.text == "x" + + +# --------------------------------------------------------------------------- # +# Webhook entry point # +# --------------------------------------------------------------------------- # + + +@dataclass +class _FakeAgentResponse: + text: str + + +class _FakeAgent: + def __init__(self, reply: str = "ok") -> None: + self._reply = reply + self.runs: list[Any] = [] + + def create_session(self, *, session_id: str | None = None) -> Any: + return {"session_id": session_id} + + def run(self, messages: Any = None, *, stream: bool = False, **kwargs: Any) -> Any: + self.runs.append({"messages": messages, "stream": stream, "kwargs": kwargs}) + + async def _coro() -> _FakeAgentResponse: + return _FakeAgentResponse(text=self._reply) + + return _coro() + + +def _run_result(text: str) -> HostedRunResult[AgentResponse]: + return HostedRunResult(AgentResponse(messages=[Message(role="assistant", contents=[Content.from_text(text=text)])])) + + +def _make_telegram(stream_default: bool = False) -> tuple[TelegramChannel, _FakeAgent]: + agent = _FakeAgent("hi") + ch = TelegramChannel( + bot_token="123:abc", + webhook_url="https://example.com/hook", + secret_token="s3cr3t", + stream=stream_default, + ) + # Replace the internal HTTP client with an AsyncMock so the channel + # never tries to call the real Telegram API. + fake_http = MagicMock() + # post() returns a response object whose raise_for_status() is sync. + response_mock = MagicMock() + response_mock.json = MagicMock(return_value={"ok": True, "result": {}}) + fake_http.post = AsyncMock(return_value=response_mock) + fake_http.get = AsyncMock(return_value=response_mock) + fake_http.aclose = AsyncMock() + ch._http = fake_http + return ch, agent + + +class TestTelegramWebhook: + def test_webhook_accepts_text_message_and_dispatches_to_agent(self) -> None: + ch, agent = _make_telegram() + host = AgentFrameworkHost(target=agent, channels=[ch]) + # Skip lifespan so polling/setWebhook are not invoked. 
+ with TestClient(host.app) as client: + r = client.post( + "/telegram/webhook", + json={"update_id": 1, "message": {"chat": {"id": 99}, "text": "hello"}}, + headers={"x-telegram-bot-api-secret-token": "s3cr3t"}, + ) + assert r.status_code == 200 + assert agent.runs, "expected the agent to be invoked" + + def test_webhook_rejects_bad_secret(self) -> None: + ch, agent = _make_telegram() + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post( + "/telegram/webhook", + json={"update_id": 1, "message": {"chat": {"id": 99}, "text": "hi"}}, + headers={"x-telegram-bot-api-secret-token": "WRONG"}, + ) + assert r.status_code == 401 + assert not agent.runs + + async def test_response_hook_can_rewrite_originating_reply(self) -> None: + contexts: list[Any] = [] + + def hook(result: HostedRunResult, **kwargs: Any) -> HostedRunResult: + contexts.append(kwargs["context"]) + return HostedRunResult(_FakeAgentResponse(text=result.result.text.upper()), session=result.session) + + ch, agent = _make_telegram() + ch.response_hook = hook + + class _Ctx: + target: Any = agent + + async def run(self, _request: ChannelRequest) -> HostedRunResult: + return HostedRunResult(_FakeAgentResponse(text="hi")) + + async def deliver_response(self, _request: ChannelRequest, _payload: HostedRunResult) -> bool: + return True + + ch._ctx = _Ctx() # type: ignore[assignment] # pyright: ignore[reportPrivateUsage] + + request = ChannelRequest(channel="telegram", operation="message.create", input="hi", stream=False) + await ch._dispatch(99, request) # pyright: ignore[reportPrivateUsage] + + assert ch._http is not None + args, kwargs = ch._http.post.call_args # type: ignore[attr-defined] + assert args[0].endswith("/sendMessage") + assert kwargs["json"]["text"] == "HI" + assert contexts + assert contexts[0].channel_name == "telegram" + assert contexts[0].originating is True + assert contexts[0].destination_identity is None + + +class TestPushAndCommand: 
+ async def test_push_calls_send(self) -> None: + ch, _agent = _make_telegram() + from agent_framework_hosting import ChannelIdentity + + await ch.push(ChannelIdentity(channel="telegram", native_id="42"), _run_result("hi")) + assert ch._http is not None + ch._http.post.assert_called() # type: ignore[attr-defined] + args, kwargs = ch._http.post.call_args # type: ignore[attr-defined] + assert args[0].endswith("/sendMessage") + assert kwargs["json"]["chat_id"] in ("42", 42) + assert kwargs["json"]["text"] == "hi" + + async def test_command_handler_invoked(self) -> None: + captured: list[ChannelCommandContext] = [] + + async def handler(ctx: ChannelCommandContext) -> None: + captured.append(ctx) + await ctx.reply("pong") + + ch = TelegramChannel( + bot_token="123:abc", + webhook_url="https://example.com/hook", + commands=[ChannelCommand(name="ping", description="ping", handle=handler)], + register_native_commands=False, + ) + fake_http = MagicMock() + response_mock = MagicMock() + response_mock.json = MagicMock(return_value={"ok": True, "result": {}}) + fake_http.post = AsyncMock(return_value=response_mock) + fake_http.get = AsyncMock(return_value=response_mock) + fake_http.aclose = AsyncMock() + ch._http = fake_http + host = AgentFrameworkHost(target=_FakeAgent(), channels=[ch]) + + with TestClient(host.app) as client: + r = client.post( + "/telegram/webhook", + json={"update_id": 2, "message": {"chat": {"id": 7}, "text": "/ping"}}, + ) + assert r.status_code == 200 + assert captured and captured[0].request.operation == "command.invoke" + + +# --------------------------------------------------------------------------- # +# Per-chat serial ordering # +# --------------------------------------------------------------------------- # + + +class TestPerChatOrdering: + async def test_updates_for_same_chat_run_serially(self) -> None: + """Two updates for the same chat must process in arrival order.""" + ch, _ = _make_telegram() + order: list[int] = [] + slow_event = 
asyncio.Event() + + async def fake_process(update: Mapping[str, Any]) -> None: + uid = update.get("update_id") + assert isinstance(uid, int) + if uid == 1: + # Block the first update so the second is queued behind it. + await slow_event.wait() + order.append(uid) + + ch._process_update = fake_process # type: ignore[method-assign] + + ch._enqueue_update({"update_id": 1, "message": {"chat": {"id": 100}, "text": "first"}}) + ch._enqueue_update({"update_id": 2, "message": {"chat": {"id": 100}, "text": "second"}}) + + # Let the worker start the first update. + await asyncio.sleep(0) + assert order == [] # blocked on slow_event + slow_event.set() + # Drain. + worker = ch._chat_workers[100] + # Wait for the queue to empty. + await ch._chat_queues[100].join() + # Cleanup + worker.cancel() + with contextlib.suppress(asyncio.CancelledError): + await worker + + assert order == [1, 2] + + async def test_updates_for_different_chats_run_in_parallel(self) -> None: + """Different chats get separate workers and can interleave freely.""" + ch, _ = _make_telegram() + started: list[int] = [] + gate_a = asyncio.Event() + + async def fake_process(update: Mapping[str, Any]) -> None: + uid = update.get("update_id") + assert isinstance(uid, int) + started.append(uid) + if uid == 1: + await gate_a.wait() + + ch._process_update = fake_process # type: ignore[method-assign] + + ch._enqueue_update({"update_id": 1, "message": {"chat": {"id": 1}, "text": "a"}}) + ch._enqueue_update({"update_id": 2, "message": {"chat": {"id": 2}, "text": "b"}}) + + # Both should be admitted into their respective workers despite + # update 1 being blocked. + await asyncio.sleep(0) + # Update 2 finishes; update 1 still blocked. 
+        assert 2 in started
+        gate_a.set()
+        for cid in (1, 2):
+            await ch._chat_queues[cid].join()
+        for w in ch._chat_workers.values():
+            w.cancel()
+            with contextlib.suppress(asyncio.CancelledError):
+                await w
+
+
+# --------------------------------------------------------------------------- #
+# Webhook ack-before-run + shutdown drains workers                            #
+# --------------------------------------------------------------------------- #
+
+
+class TestWebhookAckBeforeRun:
+    async def test_webhook_returns_200_before_agent_completes(self) -> None:
+        """The webhook must ack before the agent runs, to dodge Telegram's 60s redelivery."""
+        ch, _ = _make_telegram()
+        from starlette.requests import Request
+
+        agent_started = asyncio.Event()
+        agent_release = asyncio.Event()
+
+        async def fake_process(update: Mapping[str, Any]) -> None:
+            agent_started.set()
+            await agent_release.wait()
+
+        ch._process_update = fake_process  # type: ignore[method-assign]
+
+        async def receive() -> Any:
+            payload = b'{"update_id":1,"message":{"chat":{"id":42},"text":"hi"}}'
+            return {"type": "http.request", "body": payload, "more_body": False}
+
+        scope = {
+            "type": "http",
+            "method": "POST",
+            "path": "/telegram/webhook",
+            "headers": [(b"x-telegram-bot-api-secret-token", b"s3cr3t")],
+            "query_string": b"",
+        }
+        request = Request(scope, receive=receive)
+
+        # Drive the webhook handler. Even though the agent won't complete
+        # (agent_release is still unset) the webhook must still 200 promptly.
+        resp = await ch._handle(request)
+        assert resp.status_code == 200
+        # The agent task is in flight but not finished — proves ack came first.
+        await asyncio.wait_for(agent_started.wait(), timeout=1.0)
+        assert not agent_release.is_set()
+
+        # Cleanup: release the agent and drain.
+ agent_release.set() + await ch._chat_queues[42].join() + for w in list(ch._chat_workers.values()): + w.cancel() + with contextlib.suppress(asyncio.CancelledError): + await w + + +class TestShutdownDrainsWorkers: + async def test_shutdown_cancels_in_flight_chat_workers(self) -> None: + """`_on_shutdown` must drain per-chat workers, not leak them.""" + ch, _ = _make_telegram() + forever = asyncio.Event() + + async def stuck(update: Mapping[str, Any]) -> None: + await forever.wait() + + ch._process_update = stuck # type: ignore[method-assign] + ch._enqueue_update({"update_id": 9, "message": {"chat": {"id": 1}, "text": "a"}}) + await asyncio.sleep(0) + assert ch._chat_workers and ch._update_tasks + + await ch._on_shutdown() + assert not ch._chat_workers + assert not ch._update_tasks diff --git a/python/pyproject.toml b/python/pyproject.toml index 5f51bf598e5..5a6de014657 100644 --- a/python/pyproject.toml +++ b/python/pyproject.toml @@ -89,6 +89,7 @@ agent-framework-gemini = { workspace = true } agent-framework-github-copilot = { workspace = true } agent-framework-hosting = { workspace = true } agent-framework-hosting-invocations = { workspace = true } +agent-framework-hosting-telegram = { workspace = true } agent-framework-hyperlight = { workspace = true } agent-framework-lab = { workspace = true } agent-framework-mem0 = { workspace = true } @@ -214,6 +215,7 @@ executionEnvironments = [ { root = "packages/github_copilot/tests", reportPrivateUsage = "none" }, { root = "packages/hosting/tests", reportPrivateUsage = "none" }, { root = "packages/hosting-invocations/tests", reportPrivateUsage = "none" }, + { root = "packages/hosting-telegram/tests", reportPrivateUsage = "none" }, { root = "packages/lab/gaia/tests", reportPrivateUsage = "none" }, { root = "packages/lab/lightning/tests", reportPrivateUsage = "none" }, { root = "packages/lab/tau2/tests", reportPrivateUsage = "none" }, diff --git a/python/uv.lock b/python/uv.lock index 5c9fb2686ac..62cf0e99907 100644 --- 
a/python/uv.lock +++ b/python/uv.lock @@ -53,6 +53,7 @@ members = [ "agent-framework-hosting", "agent-framework-hosting-responses", "agent-framework-hosting-invocations", + "agent-framework-hosting-telegram", "agent-framework-hyperlight", "agent-framework-lab", "agent-framework-mem0", @@ -649,6 +650,16 @@ provides-extras = ["serve", "disk"] [package.metadata.requires-dev] dev = [{ name = "httpx", specifier = ">=0.28.1" }] +[[package]] +name = "agent-framework-hosting-telegram" +version = "1.0.0a260424" +source = { editable = "packages/hosting-telegram" } +dependencies = [ + { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "agent-framework-hosting", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "httpx", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] + [[package]] name = "agent-framework-hosting-responses" version = "1.0.0a260424" @@ -672,6 +683,7 @@ source = { editable = "packages/hosting-invocations" } dependencies = [ { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "agent-framework-hosting", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "httpx", specifier = ">=0.27,<1" }, ] [[package]] From eec0714d12c7056caed742c6f254b9331b4e46d4 Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Thu, 28 May 2026 14:37:18 +0200 Subject: [PATCH 08/20] Python: add agent-framework-hosting-activity-protocol channel (#5641) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(hosting-activity-protocol): rename Bot Framework channel to ActivityProtocolChannel The existing Bot-Framework-via-Azure-Bot-Service channel was previously shipped under the name ``hosting-teams`` / ``TeamsChannel``. 
That name is misleading for what the channel actually does -- it speaks the Bot Framework Activity Protocol against Azure Bot Service, which fans out across MS Teams, Slack, Webex, Telegram-via-Bot-Service, etc., and does not provide any Teams-specific affordances. This PR renames the package atomically and frees the ``hosting-teams`` name for a future Teams-native channel built on ``microsoft-teams-apps`` (PR-5b, spec req #28).

Renames (all in one commit):
- Package: ``agent-framework-hosting-teams`` -> ``agent-framework-hosting-activity-protocol``
- Module: ``agent_framework_hosting_teams`` -> ``agent_framework_hosting_activity_protocol``
- Channel class: ``TeamsChannel`` -> ``ActivityProtocolChannel``
- Helper: ``teams_isolation_key`` -> ``activity_protocol_isolation_key`` (isolation key prefix ``teams:`` -> ``activity:``)
- Channel name: ``"teams"`` -> ``"activity"``; default mount path ``/teams`` -> ``/activity``
- Internal helper: ``_parse_teams_activity`` -> ``_parse_activity``
- Worker task name + a couple of error strings updated for consistency

Updates README.md and the module docstring to call out:
- this is the channel-neutral Activity Protocol channel,
- it surfaces what every Bot-Service-connected channel has in common (text in / text out),
- a forthcoming ``agent-framework-hosting-teams`` package will layer Teams-specific affordances (adaptive cards, message extensions, dialogs, SSO, ...) on the same Bot Service transport.

Workspace: registers ``agent-framework-hosting-activity-protocol`` in ``python/pyproject.toml`` and adds the matching pyright ``executionEnvironments`` entry.

Behavior is unchanged. Pyright + mypy clean, 11 tests pass.
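
For illustration, the isolation-key prefix change listed in the renames can be sketched as follows. This is a minimal sketch, not the package's code: the commit message confirms only the helper name ``activity_protocol_isolation_key`` and the ``activity:`` prefix. The single ``conversation_id`` parameter and the ``prefix:id`` layout are assumptions modelled on ``telegram_isolation_key``, which the Telegram tests in this series show returning ``"telegram:42"``.

```python
def activity_protocol_isolation_key(conversation_id: str) -> str:
    """Namespace a conversation id so sessions from this channel cannot collide.

    Assumption: the parameter name and the "prefix:id" layout are hypothetical,
    mirrored from the Telegram helper's observed output format.
    """
    return f"activity:{conversation_id}"


# Hypothetical conversation id, used only to show the resulting key shape.
key = activity_protocol_isolation_key("19:abc@thread.v2")
```

Keys previously stored under the ``teams:`` prefix would not match keys built this way, which is presumably why the prefix change is called out next to the helper rename.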

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* review: address PR-5 round 2 feedback

- security (#3198327004): add `service_url_allowed_hosts` constructor option (default `botframework.com` + `smba.trafficmanager.net`) and reject inbound activities whose `serviceUrl` host falls outside it with HTTP 400 — without this gate a malicious caller could redirect outbound replies (and the attached bearer token) to an attacker-controlled host
- security (#3198324219): add `inbound_auth_validator` async callback; log a loud WARNING at startup when no validator AND no operator reverse-proxy is configured so the dev-mode bypass cannot accidentally ship to production. Document the contract: prototype intentionally does not ship JWT validation (out of scope); operators must plug a validator or terminate auth in front of the channel
- retry semantics (#3198328746): distinguish transient outbound failures (httpx network errors, non-2xx from Bot Service) — return 502 so Bot Service retries — from deterministic agent failures — return 200 so Bot Service does not retry the same broken activity in a loop
- bug (#3198330424): fix the placeholder-failure deadlock. When `send_initial_placeholder` fails, `activity_id` stays `None`, the edit-worker loop exit condition (`accumulated == last_sent`) is unreachable while no PUT is possible, and the worker would deadlock on `wake.wait()` forever after `worker_done` is set.
Now: skip the worker entirely on placeholder failure and POST a single final activity at the end with whatever accumulated - tests (#3198334465, #3187178091, #3198336045): add coverage for - `_is_service_url_allowed` allow/deny matrix + webhook 400 on disallowed serviceUrl - `inbound_auth_validator` allow/deny/raises paths - outbound `Authorization: Bearer ` header presence in production mode and absence in dev mode - the streaming path (`_stream_to_conversation`): placeholder + final edit, placeholder-failure fallback (with timeout guard against deadlock regression), and empty-stream `(no response)` placeholder replacement - retry-signal differentiation: outbound `httpx.ConnectError` → 502; deterministic `ValueError` from the agent → 200 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * test(hosting): drop redundant @pytest.mark.asyncio decorators asyncio_mode = "auto" is configured in pyproject.toml across the hosting packages, so individual @pytest.mark.asyncio decorators are unnecessary. 
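For reviewers, the `serviceUrl` allow-list gate from the round 2 security item can be sketched as a standalone function. This is a minimal sketch that mirrors the `_is_service_url_allowed` method added in this patch; the names `is_service_url_allowed` and `DEFAULT_ALLOWED_HOSTS` are illustrative stand-ins, not part of the package API.

```python
from urllib.parse import urlparse

# Illustrative stand-in for the default host suffixes named in the review item.
DEFAULT_ALLOWED_HOSTS = ("botframework.com", "smba.trafficmanager.net")


def is_service_url_allowed(service_url, allowed_hosts=DEFAULT_ALLOWED_HOSTS):
    """Return True when the serviceUrl host equals, or is a subdomain of, an allowed host."""
    if not allowed_hosts:
        # An empty allow-list disables the check entirely.
        return True
    if not service_url:
        return False
    try:
        host = (urlparse(service_url).hostname or "").lower()
    except ValueError:
        # urlparse rejects some malformed URLs (for example a broken IPv6 literal).
        return False
    if not host:
        return False
    # Match on a dot boundary so "evilbotframework.com" does not pass as a suffix.
    return any(host == allowed or host.endswith("." + allowed) for allowed in allowed_hosts)
```

Matching on the dot boundary rather than a bare suffix is what keeps look-alike hosts from receiving the outbound bearer token.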
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * feat(hosting-activity-protocol): add response hooks Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * docs(hosting-activity-protocol): mark constructor keyword args Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../hosting-activity-protocol/LICENSE | 21 + .../hosting-activity-protocol/README.md | 43 + .../__init__.py | 7 + .../_channel.py | 734 ++++++++++++++++++ .../hosting-activity-protocol/pyproject.toml | 107 +++ .../tests/__init__.py | 0 .../tests/test_channel.py | 452 +++++++++++ python/pyproject.toml | 2 + python/uv.lock | 13 + 9 files changed, 1379 insertions(+) create mode 100644 python/packages/hosting-activity-protocol/LICENSE create mode 100644 python/packages/hosting-activity-protocol/README.md create mode 100644 python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/__init__.py create mode 100644 python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py create mode 100644 python/packages/hosting-activity-protocol/pyproject.toml create mode 100644 python/packages/hosting-activity-protocol/tests/__init__.py create mode 100644 python/packages/hosting-activity-protocol/tests/test_channel.py diff --git a/python/packages/hosting-activity-protocol/LICENSE b/python/packages/hosting-activity-protocol/LICENSE new file mode 100644 index 00000000000..9e841e7a26e --- /dev/null +++ b/python/packages/hosting-activity-protocol/LICENSE @@ -0,0 +1,21 @@ + MIT License + + Copyright (c) Microsoft Corporation. 
+ + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE diff --git a/python/packages/hosting-activity-protocol/README.md b/python/packages/hosting-activity-protocol/README.md new file mode 100644 index 00000000000..367dc666dab --- /dev/null +++ b/python/packages/hosting-activity-protocol/README.md @@ -0,0 +1,43 @@ +# agent-framework-hosting-activity-protocol + +Bot Framework **Activity Protocol** channel for +[agent-framework-hosting](../hosting). Connects to **Azure Bot Service** so +the same agent can be reached from Microsoft Teams, Slack, Webex, +Telegram-via-bot-channel, and any other channel Azure Bot Service +supports — without having to learn each channel's native protocol. + +> Looking for a deeper Microsoft Teams integration with adaptive cards, +> message extensions, dialogs, SSO, etc? 
See the companion +> [`agent-framework-hosting-teams`](../hosting-teams) package, which is +> built on `microsoft-teams-apps` and exposes Teams-specific affordances +> on top of (still) Azure Bot Service. + +Handles inbound `message` activities, outbound replies, mid-stream +`updateActivity` edits, typing indicators, and both client-secret and +certificate credential modes for the outbound Bot Framework token. + +## Usage + +```python +from agent_framework_hosting import AgentFrameworkHost +from agent_framework_hosting_activity_protocol import ActivityProtocolChannel + +host = AgentFrameworkHost( + target=my_agent, + channels=[ + ActivityProtocolChannel( + app_id="", + app_password="", + tenant_id="botframework.com", # or your tenant id + ) + ], +) +host.serve() +``` + +For tenants that disallow client secrets, supply `certificate_path=` (and +optionally `certificate_password=`) instead. See the docstring at the top of +`_channel.py` for the openssl one-liner that generates a usable PEM. + +In dev mode (no credentials), the channel skips outbound auth so the Bot +Framework Emulator can hit the endpoint without setup. diff --git a/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/__init__.py b/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/__init__.py new file mode 100644 index 00000000000..4c205b4f045 --- /dev/null +++ b/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/__init__.py @@ -0,0 +1,7 @@ +# Copyright (c) Microsoft. All rights reserved.
+ +"""Bot Framework Activity Protocol channel for :mod:`agent_framework_hosting`.""" + +from ._channel import ActivityProtocolChannel, activity_protocol_isolation_key + +__all__ = ["ActivityProtocolChannel", "activity_protocol_isolation_key"] diff --git a/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py b/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py new file mode 100644 index 00000000000..8952e1ef7b9 --- /dev/null +++ b/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py @@ -0,0 +1,734 @@ +# Copyright (c) Microsoft. All rights reserved. + +r"""Built-in channel: Bot Framework Activity Protocol (Azure Bot Service). + +Activity Protocol is the Bot Framework messaging shape used by Azure Bot +Service to fan one bot endpoint out across many surfaces (Microsoft +Teams, Slack, Webex, Telegram, …). An incoming ``Activity`` is POSTed to +your bot's ``/messages`` endpoint, and you reply by POSTing one or more +``Activity`` objects back to the conversation URL the inbound activity +carried in ``serviceUrl``. Auth is an OAuth2 client-credentials token +from Entra (the legacy multi-tenant ``botframework.com`` authority for +public Bot Framework channels, or your own tenant for single-tenant +bots). + +This is the channel-neutral Activity-Protocol channel — it surfaces what +every Bot-Service-connected channel has in common (text in, text out). +For deeper Microsoft Teams affordances (adaptive cards, message +extensions, dialogs, SSO, …) on the same Bot Service transport, see the +companion ``agent-framework-hosting-teams`` package. 
+ +This channel handles: + +- inbound ``message`` activities — text and attachments resolved to URIs, +- outbound replies via ``POST /v3/conversations/{id}/activities``, +- streaming via ``PUT /v3/conversations/{id}/activities/{id}`` mid-stream + edits (Teams supports updateActivity in personal chats and groups), +- typing indicators while the agent works, +- per-conversation isolation key ``activity:`` so a Responses + caller can resume a Teams conversation by passing the conversation id, +- two credential modes for the outbound token — **client secret** or + **certificate** (for tenants that disallow secrets) — both via + ``azure.identity.aio``, +- dev-mode auth bypass when no credentials are passed so the Bot Framework + Emulator can hit the endpoint with no credentials. + +Out of scope for the prototype: full JWT validation of inbound requests, +adaptive cards, file uploads, OAuth sign-in flows, and the Teams streaming +preview API (``StreamItem``). + +Generating a certificate +------------------------ +For tenants that disallow client secrets, register a certificate on your +Bot Framework / Entra app instead. Self-signed PEM (private key + cert in +one file) is what ``azure.identity.CertificateCredential`` expects:: + + # 1. Generate a 2048-bit RSA key + self-signed cert (10y), single PEM. + openssl req -x509 -newkey rsa:2048 -nodes -days 3650 \\ + -subj "/CN=my-teams-bot" \\ + -keyout teams-bot.key -out teams-bot.crt + cat teams-bot.key teams-bot.crt > teams-bot.pem + + # 2. Upload teams-bot.crt to your Entra app under + # "Certificates & secrets" → "Certificates" → "Upload certificate". + + # 3. Point the channel at the combined PEM: + ActivityProtocolChannel( + app_id="", + tenant_id="", # or "botframework.com" for legacy bots + certificate_path="teams-bot.pem", + ) + +To encrypt the private key, drop ``-nodes`` from the openssl command and +pass ``certificate_password=`` to the channel. 
+""" + +from __future__ import annotations + +import asyncio +import time +from collections.abc import Awaitable, Callable, Mapping +from typing import Any +from urllib.parse import urlparse + +import httpx +from agent_framework import ( + AgentResponse, + AgentResponseUpdate, + Content, + Message, + ResponseStream, +) +from agent_framework_hosting import ( + ChannelContext, + ChannelContribution, + ChannelRequest, + ChannelResponseContext, + ChannelResponseHook, + ChannelRunHook, + ChannelSession, + ChannelStreamTransformHook, + HostedRunResult, + apply_response_hook, + apply_run_hook, + logger, +) +from azure.core.credentials_async import AsyncTokenCredential +from azure.identity.aio import CertificateCredential, ClientSecretCredential +from starlette.requests import Request +from starlette.responses import JSONResponse, Response +from starlette.routing import Route + +# Bot Framework v4 multi-tenant authority used by the public Bot Framework +# channels (including Microsoft Teams). Single-tenant bots should override +# ``tenant_id`` with their own tenant. +_BOTFRAMEWORK_TENANT = "botframework.com" +_BOTFRAMEWORK_SCOPE = "https://api.botframework.com/.default" + +# Default allow-list of host suffixes the channel will POST a bearer token +# to. Bot Service surfaces ``serviceUrl`` per-conversation as one of these +# canonical hosts; a malicious inbound activity claiming a serviceUrl +# outside this set could otherwise exfiltrate a real Bot Framework access +# token. Operators with a private deployment (sovereign cloud, Direct Line +# only, etc.) override this via ``service_url_allowed_hosts``. +_DEFAULT_SERVICE_URL_HOSTS = ( + "botframework.com", + "smba.trafficmanager.net", +) + + +InboundAuthValidator = Callable[[Request], Awaitable[bool]] + + +def activity_protocol_isolation_key(conversation_id: Any) -> str: + """Build the namespaced isolation key the Activity Protocol channel writes under.
+ + Exposed at module scope so other channels' run hooks can opt into the + same per-conversation session (e.g. a Responses caller resuming a Teams + conversation by passing the conversation id). + """ + return f"activity:{conversation_id}" + + +class _OutboundError(RuntimeError): + """Marker for transient outbound failures that should produce 502/retry.""" + + +def _parse_activity(activity: Mapping[str, Any]) -> Message: + """Translate one Bot Framework ``message`` Activity into an Agent Framework Message. + + Pulls the activity's ``text`` plus any image/file attachments with a + ``contentType`` and resolvable URL into ``Content`` parts. If the + activity has no usable parts an empty text part is emitted so the + caller never sees a content-less message. + """ + parts: list[Content] = [] + if (text := activity.get("text")) and isinstance(text, str): + parts.append(Content.from_text(text=text)) + + for attachment in activity.get("attachments") or []: + if not isinstance(attachment, Mapping): + continue + url = attachment.get("contentUrl") or attachment.get("content") + content_type = attachment.get("contentType") + if isinstance(url, str) and isinstance(content_type, str) and "/" in content_type: + parts.append(Content.from_uri(uri=url, media_type=content_type)) + + if not parts: + parts.append(Content.from_text(text="")) + return Message("user", parts) + + +class ActivityProtocolChannel: + """Bot Framework Activity Protocol channel (Azure Bot Service webhook). + + Streaming + --------- + When ``stream=True`` (default), the channel sends an initial placeholder + activity, then edits it in place as the agent emits ``AgentResponseUpdate`` + chunks (``PUT /v3/conversations/{id}/activities/{id}``). When ``stream=False`` + it just sends the final reply. A ``stream_transform_hook`` can rewrite or + drop individual updates before they hit the wire.
+ """ + + name = "activity" + + def __init__( + self, + *, + path: str = "/activity", + app_id: str | None = None, + app_password: str | None = None, + certificate_path: str | None = None, + certificate_password: bytes | None = None, + tenant_id: str = _BOTFRAMEWORK_TENANT, + token_scope: str = _BOTFRAMEWORK_SCOPE, + credential: AsyncTokenCredential | None = None, + run_hook: ChannelRunHook | None = None, + response_hook: ChannelResponseHook | None = None, + send_typing_action: bool = True, + stream: bool = True, + stream_transform_hook: ChannelStreamTransformHook | None = None, + stream_edit_min_interval: float = 0.7, + inbound_auth_validator: InboundAuthValidator | None = None, + service_url_allowed_hosts: tuple[str, ...] = _DEFAULT_SERVICE_URL_HOSTS, + ) -> None: + """Configure the Activity Protocol channel. + + Keyword Args: + path: Mount path. The webhook lives at ``{path}/messages``. + app_id: Bot Framework / Entra application (client) id. Required + whenever any credential is supplied. + app_password: Application secret for OAuth2 client credentials. + Mutually exclusive with ``certificate_path``. + certificate_path: Path to a PEM file containing **both** the + private key and the X.509 certificate. Use this for tenants + that disallow client secrets. See the module docstring for an + ``openssl`` recipe. + certificate_password: Password for the PEM private key, if any. + tenant_id: Entra tenant. Defaults to ``"botframework.com"`` for + public Bot Framework channels; pass your tenant id for + single-tenant bots. + token_scope: OAuth2 scope to request. Defaults to the Bot + Framework resource. + credential: Bring your own ``AsyncTokenCredential`` (e.g. a + ``DefaultAzureCredential`` configured elsewhere). Overrides + ``app_password`` / ``certificate_path``. + run_hook: Optional rewrite of ``ChannelRequest`` before invocation. + response_hook: Optional rewrite of the + :class:`HostedRunResult` before the originating Activity + reply is serialized.
The host also invokes this hook when + delivering to this channel as a non-originating push + destination. + send_typing_action: Whether to send ``typing`` activities while + the agent runs. + stream: Whether to stream by default. ``run_hook`` can flip per + request. + stream_transform_hook: Optional rewrite of each + ``AgentResponseUpdate`` before it hits the wire. + stream_edit_min_interval: Seconds between successive in-place + edits. Teams is more rate-sensitive than Telegram, so default + is higher. + inbound_auth_validator: Optional async callable invoked for each + inbound webhook request **before** the activity is parsed. + Return ``True`` to allow, ``False`` to reject with HTTP 401. + The webhook endpoint accepts unauthenticated requests by + default — Bot Framework normally validates inbound calls via + the JWT in the ``Authorization`` header (see Microsoft's + bot framework auth docs). The prototype intentionally does + NOT ship a built-in JWT validator (key rotation, OpenID + config caching, etc. are out of scope); plug your own + validator here, or terminate auth in front of the channel + (e.g. APIM, Application Gateway). When no credentials AND + no validator are configured the channel logs a loud + warning at startup so the dev-mode bypass cannot + accidentally ship. + service_url_allowed_hosts: Host (or host suffix) allow-list the + channel will POST a bearer token to. Defaults to the public + Bot Framework host suffixes (``botframework.com`` and + ``smba.trafficmanager.net``). An inbound activity claiming a + ``serviceUrl`` outside this set is rejected — without this + gate a malicious caller could redirect outbound replies (and + the attached bearer token) to an attacker-controlled host. + Pass an extended tuple for sovereign clouds or private + deployments; pass ``()`` to disable the check entirely + (only safe with strong inbound auth). 
+ """ + if app_password and certificate_path: + raise ValueError("ActivityProtocolChannel: pass either app_password or certificate_path, not both.") + self.path = path + self._app_id = app_id + self._token_scope = token_scope + self._tenant_id = tenant_id + self._hook = run_hook + self.response_hook = response_hook + self._send_typing_action = send_typing_action + self._stream_default = stream + self._stream_transform_hook = stream_transform_hook + self._stream_edit_min_interval = stream_edit_min_interval + self._inbound_auth_validator = inbound_auth_validator + self._service_url_allowed_hosts = tuple(h.lower().lstrip(".") for h in service_url_allowed_hosts) + self._ctx: ChannelContext | None = None + self._http: httpx.AsyncClient | None = None + + # Build the credential up front so misconfiguration fails at construction. + self._credential: AsyncTokenCredential | None + if credential is not None: + self._credential = credential + elif app_id and certificate_path: + self._credential = CertificateCredential( + tenant_id=tenant_id, + client_id=app_id, + certificate_path=certificate_path, + password=certificate_password, + ) + elif app_id and app_password: + self._credential = ClientSecretCredential( + tenant_id=tenant_id, + client_id=app_id, + client_secret=app_password, + ) + else: + self._credential = None # dev mode + + def contribute(self, context: ChannelContext) -> ChannelContribution: + """Capture the host context and register the ``POST /messages`` webhook.""" + self._ctx = context + return ChannelContribution( + routes=[Route("/messages", self._handle, methods=["POST"])], + on_startup=[self._on_startup], + on_shutdown=[self._on_shutdown], + ) + + # -- lifecycle --------------------------------------------------------- # + + async def _on_startup(self) -> None: + """Open the outbound HTTP client and emit a startup banner. 
+ + When no Bot Framework credential is configured we log a loud warning — + outbound replies will not authenticate, which is only acceptable + against the local Bot Framework Emulator. + + When no inbound auth validator is configured we also log a loud + warning so the dev-mode bypass cannot accidentally ship to + production: Bot Framework normally validates inbound requests via + a JWT in ``Authorization``; without that gate any caller that can + reach the webhook can drive the bot. + """ + if self._http is None: + self._http = httpx.AsyncClient(timeout=30.0) + if self._credential is None: + logger.warning( + "ActivityProtocolChannel running without credentials — outbound replies " + "will not authenticate. Use only with the Bot Framework " + "Emulator for local development." + ) + else: + cred_kind = type(self._credential).__name__ + logger.info( + "ActivityProtocolChannel listening on %s/messages (auth=%s, tenant=%s)", + self.path, + cred_kind, + self._tenant_id, + ) + if self._inbound_auth_validator is None: + logger.warning( + "ActivityProtocolChannel %s/messages has no inbound_auth_validator — " + "the webhook will accept ANY caller. Plug an inbound_auth_validator " + "or terminate auth in front of the channel before exposing this " + "endpoint to a public network.", + self.path, + ) + + async def _on_shutdown(self) -> None: + """Close the HTTP client and best-effort close the credential. + + Credential ``close`` failures are logged but never raised — shutdown + must never be allowed to mask the original cause of an app exit. 
+ """ + if self._http is not None: + await self._http.aclose() + if self._credential is not None: + close = getattr(self._credential, "close", None) + if close is not None: + try: + await close() + except Exception: # pragma: no cover - best-effort + logger.exception("ActivityProtocolChannel credential close failed") + + # -- token management -------------------------------------------------- # + + async def _get_token(self) -> str | None: + """Acquire (and cache) an outbound bearer token. + + ``azure.identity`` credentials cache and refresh internally, so we + just delegate. + """ + if self._credential is None: + return None + access_token = await self._credential.get_token(self._token_scope) + return access_token.token + + def _auth_headers(self, token: str | None) -> dict[str, str]: + """Return Bot Framework auth headers, or an empty dict in dev mode.""" + return {"Authorization": f"Bearer {token}"} if token else {} + + # -- request handling -------------------------------------------------- # + + def _is_service_url_allowed(self, service_url: str | None) -> bool: + """Return ``True`` if ``service_url`` host matches the allow-list.""" + if not self._service_url_allowed_hosts: + return True + if not service_url: + return False + try: + host = (urlparse(service_url).hostname or "").lower() + except Exception: + return False + if not host: + return False + return any(host == allowed or host.endswith(f".{allowed}") for allowed in self._service_url_allowed_hosts) + + async def _handle(self, request: Request) -> Response: + """Bot Framework webhook entry point. + + Only ``message`` activities are processed; ``conversationUpdate``, + ``invoke``, ``typing`` and other activity types are silently + acknowledged. Auth-rejected requests return 401, malformed JSON + returns 400, and serviceUrl outside the allow-list returns 400. 
+ + For *transient* outbound failures (network error or non-2xx from + Bot Service) we surface 502 so Bot + Service retries the inbound activity. Non-transient failures + (parsing errors, validation errors, deterministic agent crashes) + return 200 so Bot Service does not retry the same broken + activity in a loop. + """ + if self._inbound_auth_validator is not None: + try: + allowed = await self._inbound_auth_validator(request) + except Exception: + logger.exception("ActivityProtocolChannel inbound_auth_validator raised; rejecting request") + return JSONResponse({"error": "unauthorized"}, status_code=401) + if not allowed: + return JSONResponse({"error": "unauthorized"}, status_code=401) + + try: + activity = await request.json() + except Exception: + return JSONResponse({"error": "invalid json"}, status_code=400) + + # We accept only message activities for now. ``conversationUpdate``, + # ``invoke``, ``typing`` and friends are silently ack'd. + if activity.get("type") != "message": + return JSONResponse({}, status_code=202) + + service_url = activity.get("serviceUrl") + if not self._is_service_url_allowed(service_url if isinstance(service_url, str) else None): + logger.warning( + "ActivityProtocolChannel rejecting activity with serviceUrl=%r (not in allow-list)", + service_url, + ) + return JSONResponse({"error": "serviceUrl not allowed"}, status_code=400) + + try: + await self._process_activity(activity) + except (httpx.HTTPError, _OutboundError): + # Transient outbound failure (network error or non-2xx from Bot + # Service). Surface 502 so Bot + # Service retries the inbound activity rather than dropping it.
+ logger.exception("ActivityProtocolChannel outbound transient failure — signalling Bot Service to retry") + return JSONResponse({"error": "upstream failure"}, status_code=502) + except Exception: + # Deterministic / agent-side failure: 200 so Bot Service does + # not retry the same broken activity in a loop. Operator picks + # the failure up via logs / telemetry. + logger.exception("ActivityProtocolChannel activity processing failed") + # Bot Framework expects 200 OK to dequeue the activity. + return JSONResponse({}, status_code=200) + + async def _process_activity(self, activity: Mapping[str, Any]) -> None: + """Build a :class:`ChannelRequest` from a message Activity and dispatch. + + The Teams isolation key is per-conversation so all members of a + group chat share session state. Activity metadata (``reply_to_id``, + ``recipient``) is preserved so reply-as-reaction style flows can + reconstruct the original message context. + """ + if self._ctx is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("activity channel not started") + conversation = activity.get("conversation") or {} + conversation_id = conversation.get("id") + service_url = activity.get("serviceUrl") + if not isinstance(conversation_id, str) or not isinstance(service_url, str): + logger.warning("Teams activity missing conversation.id or serviceUrl — dropping") + return + + parsed = _parse_activity(activity) + channel_request = ChannelRequest( + channel=self.name, + operation="message.create", + input=[parsed], + session=ChannelSession(isolation_key=activity_protocol_isolation_key(conversation_id)), + attributes={ + "conversation_id": conversation_id, + "service_url": service_url, + "from_id": (activity.get("from") or {}).get("id"), + "channel_id": activity.get("channelId"), + }, + metadata={"reply_to_id": activity.get("id"), "recipient": activity.get("recipient")}, + stream=self._stream_default, + ) + if self._hook is not None: + channel_request = await apply_run_hook( + 
self._hook, + channel_request, + target=self._ctx.target, + protocol_request=activity, + ) + + await self._dispatch(activity, channel_request) + + # -- outbound helpers -------------------------------------------------- # + + async def _dispatch(self, inbound: Mapping[str, Any], request: ChannelRequest) -> None: + """Run the target and ship the result back into the originating Teams conversation. + + Optionally fires a typing indicator before non-streaming runs; + streaming runs route through ``_stream_to_conversation`` which + progressively edits a single placeholder activity. + """ + if self._ctx is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("activity channel not started") + if self._send_typing_action: + await self._send_typing(inbound) + + if not request.stream: + result = await self._ctx.run(request) + include_originating = await self._ctx.deliver_response(request, result) + if include_originating: + result = await self._apply_response_hook(result, request) + text = getattr(result.result, "text", None) or "(no response)" + await self._send_message(inbound, text) + return + + stream = self._ctx.run_stream(request) + await self._stream_to_conversation(inbound, stream) + + async def _apply_response_hook( + self, + result: HostedRunResult[Any], + request: ChannelRequest, + ) -> HostedRunResult[Any]: + """Apply the channel-level response hook for an originating reply.""" + if self.response_hook is None: + return result + context = ChannelResponseContext( + request=request, + channel_name=self.name, + destination_identity=None, + originating=True, + is_echo=False, + ) + return await apply_response_hook(self.response_hook, result, context=context) + + async def _stream_to_conversation( + self, + inbound: Mapping[str, Any], + stream: ResponseStream[AgentResponseUpdate, AgentResponse], + ) -> None: + """Iterate the stream and progressively edit a single Teams activity. 
+ + If the initial placeholder POST fails we fall back to buffering + the whole stream and POSTing a single final message at the end. + Without that fallback the edit-loop's exit condition + ``accumulated == last_sent`` is unreachable while ``activity_id`` + is ``None`` (no PUT possible), and the worker would deadlock + forever on ``wake.wait()`` after ``worker_done`` is set. + """ + accumulated = "" + last_sent = "" + last_edit_at = 0.0 + activity_id: str | None = None + placeholder_ok = False + worker_done = asyncio.Event() + wake = asyncio.Event() + + async def send_initial_placeholder() -> None: + nonlocal activity_id, last_edit_at, placeholder_ok + try: + activity_id = await self._send_message(inbound, "…") + last_edit_at = time.monotonic() + placeholder_ok = activity_id is not None + except Exception: + logger.exception( + "Activity placeholder send failed — falling back to single final POST", + ) + placeholder_ok = False + + async def edit_worker() -> None: + nonlocal last_sent, last_edit_at + # When the placeholder failed we have no activity_id to PUT + # into; the loop's only useful work is exiting cleanly. Skip + # straight to that — the final flush below will POST the + # accumulated text in one shot. 
+ if not placeholder_ok: + return + while not (worker_done.is_set() and accumulated == last_sent): + await wake.wait() + wake.clear() + if accumulated == last_sent: + continue + elapsed = time.monotonic() - last_edit_at + if elapsed < self._stream_edit_min_interval: + try: + await asyncio.wait_for(wake.wait(), timeout=self._stream_edit_min_interval - elapsed) + wake.clear() + except asyncio.TimeoutError: + pass + snapshot = accumulated + if snapshot == last_sent: + continue + try: + await self._update_activity(inbound, activity_id or "", snapshot) + except Exception: # pragma: no cover + logger.exception("Activity interim edit failed") + last_sent = snapshot + last_edit_at = time.monotonic() + + await send_initial_placeholder() + edit_task = asyncio.create_task(edit_worker(), name="activity-edit-worker") + + try: + async for update in stream: + if self._stream_transform_hook is not None: + transformed = self._stream_transform_hook(update) + if isinstance(transformed, Awaitable): + transformed = await transformed + if transformed is None: + continue + update = transformed + chunk = getattr(update, "text", None) + if chunk: + accumulated += chunk + wake.set() + except Exception: + logger.exception("Activity streaming consumption failed") + finally: + worker_done.set() + wake.set() + try: + await edit_task + except Exception: # pragma: no cover + logger.exception("Activity edit worker crashed") + + try: + await stream.get_final_response() + except Exception: # pragma: no cover + logger.exception("Stream finalize failed") + + # Final flush — make sure the user sees everything that arrived after + # the worker's last edit. If the placeholder failed we POST a fresh + # activity here with whatever accumulated. 
+ if not placeholder_ok: + text = accumulated or "(no response)" + try: + await self._send_message(inbound, text) + except Exception: # pragma: no cover + logger.exception("Activity fallback final send failed") + elif activity_id is not None and accumulated and accumulated != last_sent: + try: + await self._update_activity(inbound, activity_id, accumulated) + except Exception: # pragma: no cover + logger.exception("Activity final edit failed") + elif not accumulated and activity_id is not None: + # No text streamed — replace the placeholder with a stub so the + # user isn't left staring at "…". + try: + await self._update_activity(inbound, activity_id, "(no response)") + except Exception: # pragma: no cover + logger.exception("Activity placeholder replace failed") + + # -- Bot Framework REST helpers --------------------------------------- # + + def _activity_payload(self, inbound: Mapping[str, Any], text: str) -> dict[str, Any]: + """Build the outbound Activity envelope (text-only message).""" + recipient = inbound.get("from") or {} + from_user = inbound.get("recipient") or {} + return { + "type": "message", + "from": from_user, + "recipient": recipient, + "conversation": inbound.get("conversation") or {}, + "replyToId": inbound.get("id"), + "channelId": inbound.get("channelId"), + "serviceUrl": inbound.get("serviceUrl"), + "text": text, + "textFormat": "plain", + } + + async def _send_message(self, inbound: Mapping[str, Any], text: str) -> str | None: + """POST a new Activity. 
Returns the assigned activity id.""" + if self._http is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("activity channel not started") + service_url = str(inbound.get("serviceUrl") or "").rstrip("/") + conversation_id = (inbound.get("conversation") or {}).get("id") + if not service_url or not isinstance(conversation_id, str): + return None + url = f"{service_url}/v3/conversations/{conversation_id}/activities" + token = await self._get_token() + response = await self._http.post( + url, json=self._activity_payload(inbound, text), headers=self._auth_headers(token) + ) + response.raise_for_status() + payload = response.json() if response.content else {} + return payload.get("id") if isinstance(payload, dict) else None + + async def _update_activity(self, inbound: Mapping[str, Any], activity_id: str, text: str) -> None: + """PUT-edit an existing Activity (Teams updateActivity).""" + if self._http is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("activity channel not started") + service_url = str(inbound.get("serviceUrl") or "").rstrip("/") + conversation_id = (inbound.get("conversation") or {}).get("id") + if not service_url or not isinstance(conversation_id, str): + return + url = f"{service_url}/v3/conversations/{conversation_id}/activities/{activity_id}" + token = await self._get_token() + response = await self._http.put( + url, json=self._activity_payload(inbound, text), headers=self._auth_headers(token) + ) + response.raise_for_status() + + async def _send_typing(self, inbound: Mapping[str, Any]) -> None: + """Send a Teams typing indicator; failures are logged and swallowed. + + The typing activity is purely a UX nicety — if it fails (token + expired, transient network issue, channel that doesn't support + typing) we never surface that to the user or block the actual + agent run. 
+ """ + if self._http is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("activity channel not started") + service_url = str(inbound.get("serviceUrl") or "").rstrip("/") + conversation_id = (inbound.get("conversation") or {}).get("id") + if not service_url or not isinstance(conversation_id, str): + return + url = f"{service_url}/v3/conversations/{conversation_id}/activities" + token = await self._get_token() + try: + await self._http.post( + url, + json={ + "type": "typing", + "from": inbound.get("recipient") or {}, + "recipient": inbound.get("from") or {}, + "conversation": inbound.get("conversation") or {}, + "serviceUrl": inbound.get("serviceUrl"), + }, + headers=self._auth_headers(token), + ) + except Exception: # pragma: no cover - non-critical UX + logger.exception("Teams typing send failed") + + +__all__ = ["ActivityProtocolChannel", "activity_protocol_isolation_key"] diff --git a/python/packages/hosting-activity-protocol/pyproject.toml b/python/packages/hosting-activity-protocol/pyproject.toml new file mode 100644 index 00000000000..cd18431a073 --- /dev/null +++ b/python/packages/hosting-activity-protocol/pyproject.toml @@ -0,0 +1,107 @@ +[project] +name = "agent-framework-hosting-activity-protocol" +description = "Bot Framework Activity Protocol channel for agent-framework-hosting (Teams, Slack, etc. via Azure Bot Service)." 
+authors = [{ name = "Microsoft", email = "af-support@microsoft.com"}] +readme = "README.md" +requires-python = ">=3.10" +version = "1.0.0a260424" +license-files = ["LICENSE"] +urls.homepage = "https://aka.ms/agent-framework" +urls.source = "https://github.com/microsoft/agent-framework/tree/main/python" +urls.release_notes = "https://github.com/microsoft/agent-framework/releases?q=tag%3Apython-1&expanded=true" +urls.issues = "https://github.com/microsoft/agent-framework/issues" +classifiers = [ + "License :: OSI Approved :: MIT License", + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", + "Programming Language :: Python :: 3.14", + "Typing :: Typed", +] +dependencies = [ + "agent-framework-core>=1.2.0,<2", + "agent-framework-hosting==1.0.0a260424", + "httpx>=0.27,<1", + "azure-identity>=1.20,<2", +] + +[tool.uv] +prerelease = "if-necessary-or-explicit" +environments = [ + "sys_platform == 'darwin'", + "sys_platform == 'linux'", + "sys_platform == 'win32'" +] + +[tool.uv-dynamic-versioning] +fallback-version = "0.0.0" + +[tool.pytest.ini_options] +testpaths = 'tests' +addopts = "-ra -q -r fEX" +asyncio_mode = "auto" +asyncio_default_fixture_loop_scope = "function" +filterwarnings = [] +timeout = 120 +markers = [ + "integration: marks tests as integration tests that require external services", +] + +[tool.ruff] +extend = "../../pyproject.toml" + +[tool.coverage.run] +omit = [ + "**/__init__.py" +] + +[tool.pyright] +extends = "../../pyproject.toml" +include = ["agent_framework_hosting_activity_protocol"] +exclude = ['tests'] +# Bot Framework activities arrive as loosely-typed JSON-ish maps. 
Strict +# ``Unknown`` reporting on every ``.get(...)`` adds noise without catching +# real bugs — narrowing happens via runtime isinstance checks instead. +reportUnknownArgumentType = "none" +reportUnknownMemberType = "none" +reportUnknownVariableType = "none" +reportUnknownLambdaType = "none" +reportOptionalMemberAccess = "none" + +[tool.mypy] +plugins = ['pydantic.mypy'] +strict = true +python_version = "3.10" +ignore_missing_imports = true +disallow_untyped_defs = true +no_implicit_optional = true +check_untyped_defs = true +warn_return_any = true +show_error_codes = true +warn_unused_ignores = false +disallow_incomplete_defs = true +disallow_untyped_decorators = true + +[tool.bandit] +targets = ["agent_framework_hosting_activity_protocol"] +exclude_dirs = ["tests"] + +[tool.poe] +executor.type = "uv" +include = "../../shared_tasks.toml" + +[tool.poe.tasks.mypy] +help = "Run MyPy for this package." +cmd = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_hosting_activity_protocol" + +[tool.poe.tasks.test] +help = "Run the default unit test suite for this package." +cmd = 'pytest -m "not integration" --cov=agent_framework_hosting_activity_protocol --cov-report=term-missing:skip-covered tests' + +[build-system] +requires = ["flit-core >= 3.11,<4.0"] +build-backend = "flit_core.buildapi" diff --git a/python/packages/hosting-activity-protocol/tests/__init__.py b/python/packages/hosting-activity-protocol/tests/__init__.py new file mode 100644 index 00000000000..e69de29bb2d diff --git a/python/packages/hosting-activity-protocol/tests/test_channel.py b/python/packages/hosting-activity-protocol/tests/test_channel.py new file mode 100644 index 00000000000..495a9c6dc84 --- /dev/null +++ b/python/packages/hosting-activity-protocol/tests/test_channel.py @@ -0,0 +1,452 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Unit tests for :mod:`agent_framework_hosting_activity_protocol`. 
+ +The Bot Framework outbound calls and azure-identity credentials are mocked +out so the suite never touches the network. Live token acquisition, +streaming edits and certificate paths are out of scope here. +""" + +from __future__ import annotations + +from dataclasses import dataclass +from typing import Any +from unittest.mock import AsyncMock, MagicMock + +import pytest +from agent_framework_hosting import AgentFrameworkHost, HostedRunResult +from starlette.testclient import TestClient + +from agent_framework_hosting_activity_protocol import ActivityProtocolChannel, activity_protocol_isolation_key +from agent_framework_hosting_activity_protocol._channel import _parse_activity + + +def test_activity_protocol_isolation_key_format() -> None: + assert activity_protocol_isolation_key("19:meeting_xyz@thread.v2") == "activity:19:meeting_xyz@thread.v2" + assert activity_protocol_isolation_key(123) == "activity:123" + + +class TestParseActivity: + def test_text_only(self) -> None: + msg = _parse_activity({"type": "message", "text": "hello"}) + assert msg.role == "user" + assert msg.text == "hello" + + def test_with_attachment(self) -> None: + msg = _parse_activity({ + "type": "message", + "text": "see this", + "attachments": [ + {"contentType": "image/png", "contentUrl": "https://example.com/x.png"}, + ], + }) + assert msg.text == "see this" + assert any((getattr(c, "uri", None) or "").endswith("/x.png") for c in msg.contents) + + def test_skips_invalid_attachments(self) -> None: + msg = _parse_activity({ + "type": "message", + "text": "hi", + "attachments": [ + "not-a-mapping", + {"contentType": "image/png"}, # no url + {"contentUrl": "https://example.com/y", "contentType": "no-slash"}, + ], + }) + assert msg.text == "hi" + # No URI content survived. 
+ assert not any(getattr(c, "uri", None) for c in msg.contents) + + +@dataclass +class _FakeAgentResponse: + text: str + + +class _FakeAgent: + def __init__(self, reply: str = "ok") -> None: + self._reply = reply + self.runs: list[Any] = [] + + def create_session(self, *, session_id: str | None = None) -> Any: + return {"session_id": session_id} + + def run(self, messages: Any = None, *, stream: bool = False, **kwargs: Any) -> Any: + self.runs.append({"messages": messages, "stream": stream, "kwargs": kwargs}) + + async def _coro() -> _FakeAgentResponse: + return _FakeAgentResponse(text=self._reply) + + return _coro() + + +def _make_teams(stream: bool = False) -> tuple[ActivityProtocolChannel, _FakeAgent]: + agent = _FakeAgent("hi there") + ch = ActivityProtocolChannel(stream=stream, send_typing_action=False) + fake_http = MagicMock() + response_mock = MagicMock() + response_mock.raise_for_status = MagicMock() + response_mock.json = MagicMock(return_value={"id": "act-1"}) + fake_http.post = AsyncMock(return_value=response_mock) + fake_http.put = AsyncMock(return_value=response_mock) + fake_http.aclose = AsyncMock() + ch._http = fake_http + return ch, agent + + +_VALID_ACTIVITY: dict[str, Any] = { + "type": "message", + "id": "in-1", + "text": "hello bot", + "conversation": {"id": "19:meeting_xyz@thread.v2"}, + "from": {"id": "user-1"}, + "recipient": {"id": "bot-1"}, + "channelId": "msteams", + "serviceUrl": "https://smba.trafficmanager.net/amer/", +} + + +class TestTeamsWebhook: + def test_message_activity_dispatches_to_agent(self) -> None: + ch, agent = _make_teams() + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=_VALID_ACTIVITY) + assert r.status_code == 200 + assert agent.runs, "expected the agent to be invoked" + # And the channel posted a reply back to the conversation URL. 
+ assert ch._http is not None + ch._http.post.assert_called() # type: ignore[attr-defined] + url, _ = ch._http.post.call_args[0], ch._http.post.call_args[1] # type: ignore[attr-defined] # noqa: F841 + assert "/v3/conversations/" in ch._http.post.call_args[0][0] # type: ignore[attr-defined] + body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] + assert body["text"] == "hi there" + + def test_response_hook_can_rewrite_originating_reply(self) -> None: + contexts: list[Any] = [] + + def hook(result: HostedRunResult, **kwargs: Any) -> HostedRunResult: + contexts.append(kwargs["context"]) + return HostedRunResult(_FakeAgentResponse(text=result.result.text.upper()), session=result.session) + + ch, agent = _make_teams() + ch.response_hook = hook + host = AgentFrameworkHost(target=agent, channels=[ch]) + + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=_VALID_ACTIVITY) + + assert r.status_code == 200 + assert ch._http is not None + body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] + assert body["text"] == "HI THERE" + assert contexts + assert contexts[0].channel_name == "activity" + assert contexts[0].originating is True + + def test_non_message_activities_are_acked(self) -> None: + ch, agent = _make_teams() + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post( + "/activity/messages", + json={"type": "conversationUpdate", "conversation": {"id": "x"}}, + ) + assert r.status_code == 202 + assert not agent.runs + + def test_invalid_json_returns_400(self) -> None: + ch, agent = _make_teams() + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post( + "/activity/messages", + content=b"not-json", + headers={"content-type": "application/json"}, + ) + assert r.status_code == 400 + assert not agent.runs + + def test_message_missing_serviceurl_is_dropped(self) -> None: + ch, agent = 
_make_teams() + host = AgentFrameworkHost(target=agent, channels=[ch]) + bad = dict(_VALID_ACTIVITY) + bad.pop("serviceUrl") + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=bad) + # No serviceUrl → fails the allow-list check (None doesn't match + # any allowed host suffix), surfaced as 400 so a misconfigured + # caller knows the activity was structurally invalid. + assert r.status_code == 400 + assert not agent.runs + + +class TestOutbound: + async def test_send_message_posts_to_conversation_url(self) -> None: + ch, _agent = _make_teams() + await ch._send_message(_VALID_ACTIVITY, "hi") + assert ch._http is not None + ch._http.post.assert_called() # type: ignore[attr-defined] + url = ch._http.post.call_args[0][0] # type: ignore[attr-defined] + assert "/v3/conversations/" in url + body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] + assert body["text"] == "hi" + + +class TestConfig: + def test_rejects_both_secret_and_certificate(self) -> None: + with pytest.raises(ValueError, match="not both"): + ActivityProtocolChannel( + app_id="x", + app_password="s", + certificate_path="/tmp/does-not-exist.pem", + ) + + def test_dev_mode_no_credential(self) -> None: + ch = ActivityProtocolChannel() + assert ch._credential is None + + +class TestServiceUrlAllowList: + """``serviceUrl`` is supplied by the inbound activity and the channel + POSTs a real bearer token to it — anything outside the Bot Framework + host suffixes must be rejected so a malicious caller can't redirect + outbound replies to an attacker-controlled host.""" + + def test_default_allows_smba_trafficmanager(self) -> None: + ch = ActivityProtocolChannel() + assert ch._is_service_url_allowed("https://smba.trafficmanager.net/amer/") + assert ch._is_service_url_allowed("https://emea.smba.trafficmanager.net/") + assert ch._is_service_url_allowed("https://api.botframework.com/") + + def test_default_rejects_arbitrary_host(self) -> None: + ch = 
ActivityProtocolChannel() + assert not ch._is_service_url_allowed("https://attacker.example.com/") + assert not ch._is_service_url_allowed("https://botframework.com.attacker.com/") + assert not ch._is_service_url_allowed("") + assert not ch._is_service_url_allowed(None) + + def test_custom_allowlist(self) -> None: + ch = ActivityProtocolChannel(service_url_allowed_hosts=("internal.contoso.com",)) + assert ch._is_service_url_allowed("https://internal.contoso.com/v3/") + assert ch._is_service_url_allowed("https://eu.internal.contoso.com/") + assert not ch._is_service_url_allowed("https://smba.trafficmanager.net/") + + def test_empty_allowlist_disables_check(self) -> None: + ch = ActivityProtocolChannel(service_url_allowed_hosts=()) + assert ch._is_service_url_allowed("https://anywhere.example.org/") + + def test_webhook_rejects_disallowed_serviceurl(self) -> None: + ch, agent = _make_teams() + host = AgentFrameworkHost(target=agent, channels=[ch]) + bad = dict(_VALID_ACTIVITY) + bad["serviceUrl"] = "https://attacker.example.com/v3/" + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=bad) + assert r.status_code == 400 + assert not agent.runs + # No outbound POST attempted with a bearer token. 
+ assert ch._http is not None + ch._http.post.assert_not_called() # type: ignore[attr-defined] + + +class TestInboundAuthValidator: + def test_allow_passes_through(self) -> None: + async def allow(_req: Any) -> bool: + return True + + ch, agent = _make_teams() + ch._inbound_auth_validator = allow + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=_VALID_ACTIVITY) + assert r.status_code == 200 + assert agent.runs + + def test_reject_returns_401(self) -> None: + async def deny(_req: Any) -> bool: + return False + + ch, agent = _make_teams() + ch._inbound_auth_validator = deny + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=_VALID_ACTIVITY) + assert r.status_code == 401 + assert not agent.runs + + def test_validator_raises_returns_401(self) -> None: + async def boom(_req: Any) -> bool: + raise RuntimeError("validator broke") + + ch, agent = _make_teams() + ch._inbound_auth_validator = boom + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=_VALID_ACTIVITY) + assert r.status_code == 401 + assert not agent.runs + + +class TestOutboundAuthHeader: + async def test_no_credential_sends_no_authorization_header(self) -> None: + ch, _agent = _make_teams() + # Default _make_teams has no credential — dev mode. + await ch._send_message(_VALID_ACTIVITY, "hi") + assert ch._http is not None + headers = ch._http.post.call_args[1]["headers"] # type: ignore[attr-defined] + assert "Authorization" not in headers + + async def test_with_credential_sends_bearer_token(self) -> None: + ch, _agent = _make_teams() + # Inject a fake credential with a fixed token. 
+ token_obj = MagicMock() + token_obj.token = "tok-abc123" + cred = MagicMock() + cred.get_token = AsyncMock(return_value=token_obj) + ch._credential = cred # type: ignore[assignment] + await ch._send_message(_VALID_ACTIVITY, "hi") + assert ch._http is not None + headers = ch._http.post.call_args[1]["headers"] # type: ignore[attr-defined] + assert headers.get("Authorization") == "Bearer tok-abc123" + + +class TestRetrySignal: + """Distinguish transient outbound failures (network / 5xx) — which + must surface 502 so Bot Service retries — from deterministic agent + failures (which must return 200 to avoid retry loops).""" + + def test_outbound_http_error_returns_502(self) -> None: + import httpx as _httpx + + ch, agent = _make_teams() + # Make _send_message raise a transient httpx error. + assert ch._http is not None + ch._http.post = AsyncMock(side_effect=_httpx.ConnectError("nope")) # type: ignore[attr-defined] + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=_VALID_ACTIVITY) + assert r.status_code == 502 + + def test_deterministic_agent_failure_returns_200(self) -> None: + ch, agent = _make_teams() + + def boom(messages: Any = None, *, stream: bool = False, **kwargs: Any) -> Any: + async def _coro() -> Any: + raise ValueError("agent crashed") + + return _coro() + + agent.run = boom # type: ignore[assignment] + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=_VALID_ACTIVITY) + # Deterministic failure → 200 (Bot Service does not retry the same + # broken activity in a loop). + assert r.status_code == 200 + + +class TestStreaming: + async def test_stream_sends_placeholder_and_edits(self) -> None: + ch, _agent = _make_teams(stream=True) + + # Build a fake stream that emits two text chunks then finalizes. 
+ @dataclass + class _Up: + text: str + + class _Stream: + def __init__(self) -> None: + self._chunks = ["hel", "lo"] + + def __aiter__(self) -> Any: + async def gen() -> Any: + for c in self._chunks: + yield _Up(c) + + return gen() + + async def get_final_response(self) -> Any: + return _FakeAgentResponse(text="hello") + + # Use a tight throttle so the test doesn't sit on `wait_for`. + ch._stream_edit_min_interval = 0.0 + await ch._stream_to_conversation(_VALID_ACTIVITY, _Stream()) # type: ignore[arg-type] + assert ch._http is not None + # Placeholder POST + at least one final PUT. + ch._http.post.assert_called() # type: ignore[attr-defined] + ch._http.put.assert_called() # type: ignore[attr-defined] + # Final edit body carries the full accumulated text. + last_put_body = ch._http.put.call_args[1]["json"] # type: ignore[attr-defined] + assert last_put_body["text"] == "hello" + + async def test_stream_placeholder_failure_falls_back_to_single_post(self) -> None: + # The bug: when send_initial_placeholder fails, activity_id stays + # None, the edit_worker can never reach its exit condition + # (`accumulated == last_sent` while no PUT possible) and the + # whole conversation deadlocks. After the fix we fall back to + # buffering the stream and POSTing a single final activity. + ch, _agent = _make_teams(stream=True) + # Make the FIRST POST (placeholder) raise; subsequent POST (final + # fallback) succeeds. 
+ import httpx as _httpx + + ok_response = MagicMock() + ok_response.raise_for_status = MagicMock() + ok_response.json = MagicMock(return_value={"id": "act-final"}) + ok_response.content = b"{}" + post_mock = AsyncMock(side_effect=[_httpx.HTTPError("boom"), ok_response]) + assert ch._http is not None + ch._http.post = post_mock # type: ignore[attr-defined] + + @dataclass + class _Up: + text: str + + class _Stream: + def __aiter__(self) -> Any: + async def gen() -> Any: + yield _Up("partial-1") + yield _Up("-partial-2") + + return gen() + + async def get_final_response(self) -> Any: + return _FakeAgentResponse(text="partial-1-partial-2") + + ch._stream_edit_min_interval = 0.0 + # Should NOT hang. Use asyncio.wait_for with a small timeout to + # guard the test against future regressions of the deadlock. + import asyncio as _asyncio + + await _asyncio.wait_for( + ch._stream_to_conversation(_VALID_ACTIVITY, _Stream()), # type: ignore[arg-type] + timeout=2.0, + ) + # Two POSTs total: placeholder (failed) + fallback final. + assert post_mock.await_count == 2 + # Fallback POST contains the full accumulated text. + fallback_body = post_mock.call_args[1]["json"] + assert fallback_body["text"] == "partial-1-partial-2" + + async def test_stream_with_no_text_replaces_placeholder(self) -> None: + ch, _agent = _make_teams(stream=True) + + class _EmptyStream: + def __aiter__(self) -> Any: + async def gen() -> Any: + if False: + yield None # type: ignore[unreachable] + + return gen() + + async def get_final_response(self) -> Any: + return _FakeAgentResponse(text="") + + ch._stream_edit_min_interval = 0.0 + await ch._stream_to_conversation(_VALID_ACTIVITY, _EmptyStream()) # type: ignore[arg-type] + # The placeholder PUT-replaces with "(no response)" so the user + # isn't left staring at "…". 
+ assert ch._http is not None + last_put_body = ch._http.put.call_args[1]["json"] # type: ignore[attr-defined] + assert last_put_body["text"] == "(no response)" diff --git a/python/pyproject.toml b/python/pyproject.toml index 5a6de014657..4edad656e0a 100644 --- a/python/pyproject.toml +++ b/python/pyproject.toml @@ -90,6 +90,7 @@ agent-framework-github-copilot = { workspace = true } agent-framework-hosting = { workspace = true } agent-framework-hosting-invocations = { workspace = true } agent-framework-hosting-telegram = { workspace = true } +agent-framework-hosting-activity-protocol = { workspace = true } agent-framework-hyperlight = { workspace = true } agent-framework-lab = { workspace = true } agent-framework-mem0 = { workspace = true } @@ -216,6 +217,7 @@ executionEnvironments = [ { root = "packages/hosting/tests", reportPrivateUsage = "none" }, { root = "packages/hosting-invocations/tests", reportPrivateUsage = "none" }, { root = "packages/hosting-telegram/tests", reportPrivateUsage = "none" }, + { root = "packages/hosting-activity-protocol/tests", reportPrivateUsage = "none" }, { root = "packages/lab/gaia/tests", reportPrivateUsage = "none" }, { root = "packages/lab/lightning/tests", reportPrivateUsage = "none" }, { root = "packages/lab/tau2/tests", reportPrivateUsage = "none" }, diff --git a/python/uv.lock b/python/uv.lock index 62cf0e99907..62638ffbea5 100644 --- a/python/uv.lock +++ b/python/uv.lock @@ -54,6 +54,7 @@ members = [ "agent-framework-hosting-responses", "agent-framework-hosting-invocations", "agent-framework-hosting-telegram", + "agent-framework-hosting-activity-protocol", "agent-framework-hyperlight", "agent-framework-lab", "agent-framework-mem0", @@ -676,6 +677,17 @@ requires-dist = [ { name = "openai", specifier = ">=1.99.0,<3" }, ] +[[package]] +name = "agent-framework-hosting-activity-protocol" +version = "1.0.0a260424" +source = { editable = "packages/hosting-activity-protocol" } +dependencies = [ + { name = "agent-framework-core", 
marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "agent-framework-hosting", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "azure-identity", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "httpx", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] + [[package]] name = "agent-framework-hosting-invocations" version = "1.0.0a260424" @@ -683,6 +695,7 @@ source = { editable = "packages/hosting-invocations" } dependencies = [ { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, { name = "agent-framework-hosting", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "azure-identity", specifier = ">=1.20,<2" }, { name = "httpx", specifier = ">=0.27,<1" }, ] From 3af116e48541b84ab080f7bff3eb536af5cd9a6e Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Thu, 28 May 2026 14:47:36 +0200 Subject: [PATCH 09/20] Python: add agent-framework-hosting-entra identity-link helpers (#5644) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit * feat(hosting-entra): add Entra (Azure AD) identity-linking channel New ``agent-framework-hosting-entra`` package implementing a Microsoft Entra OAuth-based identity-linking channel for the Hosting framework. Mounts a small set of routes (``/entra/login``, ``/entra/callback``, ``/entra/whoami``) that walk a user through an Entra/Azure AD authorization-code flow and stick the resulting verified identity (``oid`` / ``email`` / ``tid``) onto the host's identity table so later requests on any other channel (Responses, Telegram, …) can be linked to the same user. 
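A minimal sketch of the identity-linking idea described above, assuming a plain in-memory table. `VerifiedIdentity`, `IdentityLinkTable`, and all of their methods are hypothetical names invented for illustration; they are not the package's API, and the real host stores identities through its own plumbing. The sketch only shows how one verified Entra identity (`oid` / `email` / `tid`) could be keyed by per-channel ids so that a later request on another channel resolves to the same user.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class VerifiedIdentity:
    """Claims kept after sign-in; field names mirror the oid / email / tid claims."""

    oid: str
    email: str
    tid: str


class IdentityLinkTable:
    """Hypothetical in-memory stand-in for the host's identity table."""

    def __init__(self) -> None:
        # Maps (channel name, per-channel user id) to the verified identity.
        self._links: dict[tuple[str, str], VerifiedIdentity] = {}

    def link(self, channel: str, channel_id: str, identity: VerifiedIdentity) -> None:
        """Bind a per-channel id to a verified identity (overwrites any old link)."""
        self._links[(channel, channel_id)] = identity

    def resolve(self, channel: str, channel_id: str) -> VerifiedIdentity | None:
        """Return the identity linked to this per-channel id, if any."""
        return self._links.get((channel, channel_id))

    def channels_for(self, oid: str) -> list[tuple[str, str]]:
        """List every (channel, channel_id) pair linked to one Entra object id."""
        return sorted(key for key, ident in self._links.items() if ident.oid == oid)


# Usage: the same user links from two channels and resolves to one identity.
table = IdentityLinkTable()
alice = VerifiedIdentity(oid="oid-123", email="alice@example.com", tid="tenant-1")
table.link("telegram", "42", alice)
table.link("responses", "sess-9", alice)
```

Keying on the `(channel, channel_id)` pair rather than on the channel id alone keeps ids from different channels from colliding, and the reverse lookup by `oid` is what would let a reply on one channel be pushed to the user's other linked channels.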
Surface (re-exported from ``agent_framework_hosting_entra``): - ``EntraChannel`` -- concrete ``Channel`` implementation. Owns the three Starlette routes, signs/verifies short-lived ``state`` tokens to bind the round-trip to the originating channel, exchanges the authorization code for an ID token via MSAL, and writes the verified identity into the host's identity store via the standard ``ChannelIdentity`` plumbing so cross-channel push (e.g. send a Telegram message to the user who completed the link from Responses) works without the channels having to coordinate directly. - 14 unit tests covering route wiring, ``state`` issue / verify, callback exchange happy + failure paths, and identity-store write. Registers the package in ``python/pyproject.toml`` ``[tool.uv.sources]`` and adds the matching pyright ``executionEnvironments`` entry. Stacks on PR-2 (Hosting core); independent of PR-3 / PR-4 / PR-6. The cross-channel sample (``local_identity_link/``) that demonstrates this end-to-end alongside Responses + Telegram lands in PR-8 (samples). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * fix(hosting-entra): close IDOR + reflected-XSS + open-redirect on the OAuth flow Three SECURITY-CRITICAL fixes flagged in round-2 review. 1. IDOR on /auth/start (3198518308). Without authentication the endpoint accepted (channel, channel_id) from the query string and bound *whoever signed in* to that pair. An attacker could bind their own Entra oid to a victim's per-channel id (e.g. `telegram:`), redirecting all of the victim's future inbound traffic to the attacker's isolation key. Fix: introduce link_token_secret + mint_start_url(channel, id, ...). When set, /auth/start requires `exp` + `sig` (HMAC-SHA256 over `channel|channel_id|expires_at`) before issuing the redirect. 
Channels that hand out start URLs (a Telegram /link command after verifying the inbound webhook signature) call mint_start_url so the token proves the (channel, id) pair was authorised by the channel that owns the surface. Unsigned mode is opt-in and logs a loud WARNING at startup *and* on every accepted request. 2. Reflected XSS on /auth/callback (3198520256, 3198527896). `error`, `error_description`, channel_key (from the unauthenticated /start query), and `upn` (from a Graph response) flowed straight into the text/html response body unescaped. With the IDOR above, an attacker could stash `", + "error_description": "", + }, + ) + assert r.status_code == 400 + assert "@x"} + ) + ch._http = MagicMock(aclose=AsyncMock()) + ch._http.get = AsyncMock(return_value=graph_response) + # Mint a binding via authorize_url_for (channel-side trusted call). + ch.authorize_url_for("", "42") + state = next(iter(ch._pending.keys())) + with TestClient(self._mount(ch)) as client: + r = client.get("/auth/callback", params={"code": "abc", "state": state}) + assert r.status_code == 200 + assert "

hello there

"}, + ], + }) + assert msg.text == "hello there" + assert not any(getattr(c, "uri", None) for c in msg.contents) + + def test_skips_attachment_contenturl_without_scheme(self) -> None: + msg = _parse_activity({ + "type": "message", + "text": "hi", + "attachments": [ + {"contentType": "image/png", "contentUrl": "/relative/path.png"}, + ], + }) + assert msg.text == "hi" + assert not any(getattr(c, "uri", None) for c in msg.contents) + + +class TestCommandText: + def test_plain_text_unchanged(self) -> None: + assert _command_text({"text": "/help"}) == "/help" + + def test_non_string_text_returns_empty(self) -> None: + assert _command_text({"text": None}) == "" + assert _command_text({}) == "" + + def test_strips_bot_mention(self) -> None: + activity = { + "text": "Personal Assistant /todos", + "recipient": {"id": "bot-1"}, + "entities": [ + {"type": "mention", "text": "Personal Assistant", "mentioned": {"id": "bot-1"}}, + ], + } + assert _command_text(activity) == "/todos" + + def test_strips_bot_mention_without_space(self) -> None: + activity = { + "text": "Bot/help", + "recipient": {"id": "bot-1"}, + "entities": [{"type": "mention", "text": "Bot", "mentioned": {"id": "bot-1"}}], + } + assert _command_text(activity) == "/help" + + def test_keeps_other_user_mention(self) -> None: + activity = { + "text": "/whoami Someone", + "recipient": {"id": "bot-1"}, + "entities": [{"type": "mention", "text": "Someone", "mentioned": {"id": "user-9"}}], + } + # Another user's mention must not be stripped. 
+ assert _command_text(activity) == "/whoami Someone" + + def test_malformed_entities_are_ignored(self) -> None: + activity = { + "text": "/help", + "recipient": {"id": "bot-1"}, + "entities": ["not-a-mapping", {"type": "clientInfo"}, {"type": "mention"}], + } + assert _command_text(activity) == "/help" + @dataclass class _FakeAgentResponse: @@ -80,9 +156,11 @@ async def _coro() -> _FakeAgentResponse: return _coro() -def _make_teams(stream: bool = False) -> tuple[ActivityProtocolChannel, _FakeAgent]: +def _make_teams( + stream: bool = False, *, path: str = "/activity/messages" +) -> tuple[ActivityProtocolChannel, _FakeAgent]: agent = _FakeAgent("hi there") - ch = ActivityProtocolChannel(stream=stream, send_typing_action=False) + ch = ActivityProtocolChannel(path=path, stream=stream, send_typing_action=False) fake_http = MagicMock() response_mock = MagicMock() response_mock.raise_for_status = MagicMock() @@ -105,6 +183,11 @@ def _make_teams(stream: bool = False) -> tuple[ActivityProtocolChannel, _FakeAge "serviceUrl": "https://smba.trafficmanager.net/amer/", } +# Minimal request envelope for direct ``_stream_to_conversation`` calls. The +# channel only consults it for cross-channel fan-out, which is skipped when +# ``_ctx`` is unset (as in these unit tests). 
+_VALID_REQUEST = ChannelRequest(channel="activity", operation="message.create", input=[]) + class TestTeamsWebhook: def test_message_activity_dispatches_to_agent(self) -> None: @@ -122,6 +205,14 @@ def test_message_activity_dispatches_to_agent(self) -> None: body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] assert body["text"] == "hi there" + def test_empty_path_mounts_at_app_root(self) -> None: + ch, agent = _make_teams(path="") + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/", json=_VALID_ACTIVITY) + assert r.status_code == 200 + assert agent.runs, "expected the agent to be invoked" + def test_response_hook_can_rewrite_originating_reply(self) -> None: contexts: list[Any] = [] @@ -181,6 +272,107 @@ def test_message_missing_serviceurl_is_dropped(self) -> None: assert not agent.runs +class TestCommands: + def _make_with_commands(self, commands: list[ChannelCommand]) -> tuple[ActivityProtocolChannel, _FakeAgent]: + agent = _FakeAgent("hi there") + ch = ActivityProtocolChannel(send_typing_action=False, commands=commands) + fake_http = MagicMock() + response_mock = MagicMock() + response_mock.raise_for_status = MagicMock() + response_mock.json = MagicMock(return_value={"id": "act-1"}) + fake_http.post = AsyncMock(return_value=response_mock) + fake_http.put = AsyncMock(return_value=response_mock) + fake_http.aclose = AsyncMock() + ch._http = fake_http + return ch, agent + + def test_slash_command_bypasses_agent_and_replies(self) -> None: + seen: list[ChannelCommandContext] = [] + + async def handle(ctx: ChannelCommandContext) -> None: + seen.append(ctx) + await ctx.reply("listed") + + ch, agent = self._make_with_commands([ChannelCommand("todos", "List", handle)]) + host = AgentFrameworkHost(target=agent, channels=[ch]) + activity = dict(_VALID_ACTIVITY, text="/todos") + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=activity) + assert 
r.status_code == 200 + assert not agent.runs, "command must bypass the agent" + assert seen and seen[0].request.operation == "command.invoke" + assert seen[0].request.input == "/todos" + assert seen[0].request.session is not None + assert seen[0].request.session.isolation_key == activity_protocol_isolation_key("19:meeting_xyz@thread.v2") + assert ch._http is not None + assert ch._http.post.call_args[1]["json"]["text"] == "listed" # type: ignore[attr-defined] + + def test_command_match_is_case_insensitive(self) -> None: + ran = False + + async def handle(ctx: ChannelCommandContext) -> None: + nonlocal ran + ran = True + + ch, agent = self._make_with_commands([ChannelCommand("New", "reset", handle)]) + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=dict(_VALID_ACTIVITY, text="/new")) + assert r.status_code == 200 + assert ran + assert not agent.runs + + def test_unknown_command_falls_through_to_agent(self) -> None: + async def handle(ctx: ChannelCommandContext) -> None: # pragma: no cover - never called + raise AssertionError("should not run") + + ch, agent = self._make_with_commands([ChannelCommand("todos", "List", handle)]) + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=dict(_VALID_ACTIVITY, text="/unknown")) + assert r.status_code == 200 + assert agent.runs, "unknown /command must reach the agent" + + def test_command_failure_does_not_retry(self) -> None: + async def handle(ctx: ChannelCommandContext) -> None: + raise RuntimeError("boom") + + ch, agent = self._make_with_commands([ChannelCommand("todos", "List", handle)]) + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=dict(_VALID_ACTIVITY, text="/todos")) + # Best-effort: a failing command is swallowed and acked with 200 so Bot + # Service 
does not retry (and re-run a non-idempotent command). + assert r.status_code == 200 + assert not agent.runs + + def test_run_hook_applied_to_command_request(self) -> None: + def hook(request: ChannelRequest, **_: Any) -> ChannelRequest: + return replace(request, session=ChannelSession(isolation_key="resolved-key")) + + captured: list[str] = [] + + async def handle(ctx: ChannelCommandContext) -> None: + assert ctx.request.session is not None + captured.append(ctx.request.session.isolation_key) + + agent = _FakeAgent("hi") + ch = ActivityProtocolChannel(send_typing_action=False, commands=[ChannelCommand("todos", "x", handle)]) + ch._hook = hook + fake_http = MagicMock() + response_mock = MagicMock() + response_mock.raise_for_status = MagicMock() + response_mock.json = MagicMock(return_value={"id": "act-1"}) + fake_http.post = AsyncMock(return_value=response_mock) + fake_http.aclose = AsyncMock() + ch._http = fake_http + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=dict(_VALID_ACTIVITY, text="/todos")) + assert r.status_code == 200 + assert captured == ["resolved-key"] + + class TestOutbound: async def test_send_message_posts_to_conversation_url(self) -> None: ch, _agent = _make_teams() @@ -193,6 +385,103 @@ async def test_send_message_posts_to_conversation_url(self) -> None: assert body["text"] == "hi" +class TestPush: + """The channel implements ``host.ChannelPush`` so it can be a + non-originating destination for cross-channel fan-out / echo replay.""" + + def test_is_channel_push_instance(self) -> None: + from agent_framework_hosting import ChannelPush + + ch, _agent = _make_teams() + assert isinstance(ch, ChannelPush) + + def _identity(self) -> ChannelIdentity: + return ChannelIdentity( + channel="activity", + native_id="19:meeting_xyz@thread.v2", + attributes={ + "service_url": "https://smba.trafficmanager.net/amer/", + "conversation": {"id": 
"19:meeting_xyz@thread.v2"}, + "bot": {"id": "bot-1"}, + "user": {"id": "user-1"}, + "channel_id": "msteams", + "locale": "en-US", + }, + ) + + async def test_push_posts_proactive_activity(self) -> None: + ch, _agent = _make_teams() + await ch.push(self._identity(), _text_result("broadcast hello")) + assert ch._http is not None + ch._http.post.assert_called() # type: ignore[attr-defined] + url = ch._http.post.call_args[0][0] # type: ignore[attr-defined] + assert url == ("https://smba.trafficmanager.net/amer/v3/conversations/19:meeting_xyz@thread.v2/activities") + body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] + assert body["text"] == "broadcast hello" + # Outbound activity speaks AS the bot: inbound recipient -> from, + # inbound from -> recipient. + assert body["from"] == {"id": "bot-1"} + assert body["recipient"] == {"id": "user-1"} + assert body["conversation"] == {"id": "19:meeting_xyz@thread.v2"} + + async def test_push_requires_service_url(self) -> None: + ch, _agent = _make_teams() + identity = ChannelIdentity( + channel="activity", + native_id="conv-x", + attributes={"conversation": {"id": "conv-x"}}, + ) + with pytest.raises(ValueError, match="service_url"): + await ch.push(identity, _text_result("hi")) + + async def test_push_rejects_disallowed_service_url(self) -> None: + # ``push`` runs out-of-band against a persisted identity, so it must + # re-validate the service_url against the allow-list rather than trust + # the value captured (possibly hours) earlier. 
+ ch, _agent = _make_teams() + identity = ChannelIdentity( + channel="activity", + native_id="conv-x", + attributes={ + "service_url": "https://attacker.example.com/", + "conversation": {"id": "conv-x"}, + "bot": {"id": "bot-1"}, + "user": {"id": "user-1"}, + }, + ) + with pytest.raises(ValueError, match="not in the allowed hosts"): + await ch.push(identity, _text_result("hi")) + assert ch._http is not None + ch._http.post.assert_not_called() # type: ignore[attr-defined] + + +class TestIdentityRecording: + """``_process_activity`` must stamp the inbound conversation reference + onto ``ChannelRequest.identity`` so the host can record it for fan-out.""" + + async def test_inbound_sets_request_identity(self) -> None: + ch, agent = _make_teams() + captured: dict[str, Any] = {} + + async def hook(req: ChannelRequest, **_: Any) -> ChannelRequest: + captured["request"] = req + return req + + ch._hook = hook # type: ignore[assignment] + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post("/activity/messages", json=_VALID_ACTIVITY) + assert r.status_code == 200 + request = captured["request"] + assert request.identity is not None + assert request.identity.channel == "activity" + assert request.identity.native_id == "19:meeting_xyz@thread.v2" + attrs = request.identity.attributes + assert attrs["service_url"] == "https://smba.trafficmanager.net/amer/" + assert attrs["bot"] == {"id": "bot-1"} + assert attrs["user"] == {"id": "user-1"} + + class TestConfig: def test_rejects_both_secret_and_certificate(self) -> None: with pytest.raises(ValueError, match="not both"): @@ -371,7 +660,7 @@ async def get_final_response(self) -> Any: # Use a tight throttle so the test doesn't sit on `wait_for`. 
ch._stream_edit_min_interval = 0.0 - await ch._stream_to_conversation(_VALID_ACTIVITY, _Stream()) # type: ignore[arg-type] + await ch._stream_to_conversation(_VALID_ACTIVITY, _VALID_REQUEST, _Stream()) # type: ignore[arg-type] assert ch._http is not None # Placeholder POST + at least one final PUT. ch._http.post.assert_called() # type: ignore[attr-defined] @@ -420,7 +709,7 @@ async def get_final_response(self) -> Any: import asyncio as _asyncio await _asyncio.wait_for( - ch._stream_to_conversation(_VALID_ACTIVITY, _Stream()), # type: ignore[arg-type] + ch._stream_to_conversation(_VALID_ACTIVITY, _VALID_REQUEST, _Stream()), # type: ignore[arg-type] timeout=2.0, ) # Two POSTs total: placeholder (failed) + fallback final. @@ -444,9 +733,150 @@ async def get_final_response(self) -> Any: return _FakeAgentResponse(text="") ch._stream_edit_min_interval = 0.0 - await ch._stream_to_conversation(_VALID_ACTIVITY, _EmptyStream()) # type: ignore[arg-type] + await ch._stream_to_conversation(_VALID_ACTIVITY, _VALID_REQUEST, _EmptyStream()) # type: ignore[arg-type] # The placeholder PUT-replaces with "(no response)" so the user # isn't left staring at "…". assert ch._http is not None last_put_body = ch._http.put.call_args[1]["json"] # type: ignore[attr-defined] assert last_put_body["text"] == "(no response)" + + async def test_non_edit_channel_buffers_and_posts_single_message(self) -> None: + # Web Chat (and every non-Teams channel) does not support + # PUT /activities/{id}; the channel must buffer the stream and POST + # a single final message rather than the placeholder+edit dance. 
+ ch, _agent = _make_teams(stream=True) + webchat_activity = {**_VALID_ACTIVITY, "channelId": "webchat"} + + @dataclass + class _Up: + text: str + + class _Stream: + def __aiter__(self) -> Any: + async def gen() -> Any: + yield _Up("hel") + yield _Up("lo") + + return gen() + + async def get_final_response(self) -> Any: + return _FakeAgentResponse(text="hello") + + ch._stream_edit_min_interval = 0.0 + await ch._stream_to_conversation(webchat_activity, _VALID_REQUEST, _Stream()) # type: ignore[arg-type] + assert ch._http is not None + # No PUT (no editing); exactly one POST with the full text. + ch._http.put.assert_not_called() # type: ignore[attr-defined] + assert ch._http.post.await_count == 1 # type: ignore[attr-defined] + body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] + assert body["text"] == "hello" + + async def test_non_edit_channel_empty_stream_posts_no_response(self) -> None: + ch, _agent = _make_teams(stream=True) + webchat_activity = {**_VALID_ACTIVITY, "channelId": "directline"} + + class _EmptyStream: + def __aiter__(self) -> Any: + async def gen() -> Any: + if False: + yield None # type: ignore[unreachable] + + return gen() + + async def get_final_response(self) -> Any: + return _FakeAgentResponse(text="") + + ch._stream_edit_min_interval = 0.0 + await ch._stream_to_conversation(webchat_activity, _VALID_REQUEST, _EmptyStream()) # type: ignore[arg-type] + assert ch._http is not None + ch._http.put.assert_not_called() # type: ignore[attr-defined] + body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] + assert body["text"] == "(no response)" + + async def test_buffer_empty_stream_consults_host_and_can_suppress(self) -> None: + # Empty streamed replies must still consult the host so that + # ``ResponseTarget.none`` (deliver_response -> False) suppresses the + # originating message instead of posting "(no response)". 
+ ch, _agent = _make_teams(stream=True) + webchat_activity = {**_VALID_ACTIVITY, "channelId": "directline"} + ctx = MagicMock() + ctx.deliver_response = AsyncMock(return_value=False) + ch._ctx = ctx + + class _EmptyStream: + def __aiter__(self) -> Any: + async def gen() -> Any: + if False: + yield None # type: ignore[unreachable] + + return gen() + + async def get_final_response(self) -> Any: + return _FakeAgentResponse(text="") + + ch._stream_edit_min_interval = 0.0 + await ch._stream_to_conversation(webchat_activity, _VALID_REQUEST, _EmptyStream()) # type: ignore[arg-type] + assert ch._http is not None + ctx.deliver_response.assert_awaited_once() + ch._http.post.assert_not_called() # type: ignore[attr-defined] + ch._http.put.assert_not_called() # type: ignore[attr-defined] + + async def test_edit_empty_stream_consults_host_and_can_suppress(self) -> None: + # Same contract for the edit-capable (Teams) progressive path. + ch, _agent = _make_teams(stream=True) + ctx = MagicMock() + ctx.deliver_response = AsyncMock(return_value=False) + ch._ctx = ctx + + class _EmptyStream: + def __aiter__(self) -> Any: + async def gen() -> Any: + if False: + yield None # type: ignore[unreachable] + + return gen() + + async def get_final_response(self) -> Any: + return _FakeAgentResponse(text="") + + ch._stream_edit_min_interval = 0.0 + await ch._stream_to_conversation(_VALID_ACTIVITY, _VALID_REQUEST, _EmptyStream()) # type: ignore[arg-type] + ctx.deliver_response.assert_awaited_once() + + async def test_edit_405_falls_back_to_single_post(self) -> None: + # Defensive: a channel advertised as edit-capable that nonetheless + # rejects the PUT with 405 must stop editing and POST the final + # text as a fresh message instead of silently leaving "…". 
+ import httpx as _httpx + + ch, _agent = _make_teams(stream=True) + assert ch._http is not None + + request_405 = _httpx.Request("PUT", "https://smba.trafficmanager.net/amer/v3/x") + response_405 = _httpx.Response(405, request=request_405) + ch._http.put = AsyncMock( # type: ignore[attr-defined] + side_effect=_httpx.HTTPStatusError("405", request=request_405, response=response_405) + ) + + @dataclass + class _Up: + text: str + + class _Stream: + def __aiter__(self) -> Any: + async def gen() -> Any: + yield _Up("hel") + yield _Up("lo") + + return gen() + + async def get_final_response(self) -> Any: + return _FakeAgentResponse(text="hello") + + ch._stream_edit_min_interval = 0.0 + await ch._stream_to_conversation(_VALID_ACTIVITY, _VALID_REQUEST, _Stream()) # type: ignore[arg-type] + # Placeholder POST + fallback final POST = 2 POSTs; the final one + # carries the full text. + assert ch._http.post.await_count == 2 # type: ignore[attr-defined] + final_body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] + assert final_body["text"] == "hello" diff --git a/python/packages/hosting-discord/agent_framework_hosting_discord/_channel.py b/python/packages/hosting-discord/agent_framework_hosting_discord/_channel.py index c7125ff91f4..18da136c0ca 100644 --- a/python/packages/hosting-discord/agent_framework_hosting_discord/_channel.py +++ b/python/packages/hosting-discord/agent_framework_hosting_discord/_channel.py @@ -92,7 +92,7 @@ def __init__( public_key: str, bot_token: str | None = None, guild_id: str | None = None, - path: str = "/discord", + path: str = "/discord/interactions", agent_command: str = "ask", agent_command_description: str = "Ask the agent", agent_command_option: str = "prompt", @@ -120,8 +120,8 @@ def __init__( guild_id: Optional guild id for guild-scoped slash command registration. Recommended for development because global command registration can take a long time to propagate. - path: Host mount path. 
The interaction route is contributed as - ``/interactions`` below this path. + path: Interaction endpoint path on the host. Use ``""`` to expose + the interaction route at the app root. agent_command: Slash command name that invokes the hosted agent. agent_command_description: Description for the agent slash command. agent_command_option: String option name that carries the prompt. @@ -159,7 +159,7 @@ def __init__( self.agent_command_description = agent_command_description self.agent_command_option = agent_command_option self.register_commands = register_commands - self._commands: set[ChannelCommand] = set(commands) or {} # type: ignore + self._commands = tuple(commands or ()) self._command_by_name = {command.name: command for command in self._commands} self._run_hook = run_hook self.response_hook = response_hook @@ -184,7 +184,7 @@ def contribute(self, context: ChannelContext) -> ChannelContribution: """Register the Discord interaction route and lifecycle hooks.""" self._ctx = context return ChannelContribution( - routes=[Route("/interactions", self._handle, methods=["POST"])], + routes=[Route("/", self._handle, methods=["POST"])], commands=self._commands, on_startup=[self._on_startup], on_shutdown=[self._on_shutdown], diff --git a/python/packages/hosting-discord/tests/discord/test_channel.py b/python/packages/hosting-discord/tests/discord/test_channel.py index 868f905a441..5addcd9a065 100644 --- a/python/packages/hosting-discord/tests/discord/test_channel.py +++ b/python/packages/hosting-discord/tests/discord/test_channel.py @@ -126,9 +126,9 @@ def test_ping_requires_valid_signature_and_returns_pong() -> None: body = json.dumps({"type": 1}).encode("utf-8") with TestClient(app) as client: - ok = client.post("/interactions", content=body, headers=_headers(signing_key, body)) + ok = client.post("/", content=body, headers=_headers(signing_key, body)) bad = client.post( - "/interactions", + "/", content=body, headers={ **_headers(signing_key, body), @@ -159,11 +159,11 
@@ def test_request_validation_errors() -> None: unsupported_app = Starlette(routes=list(unsupported_channel.contribute(_FakeContext()).routes)) # type: ignore[arg-type] with TestClient(app) as client: - too_large = client.post("/interactions", content=b"{}x") - invalid_json = client.post("/interactions", content=b"{") + too_large = client.post("/", content=b"{}x") + invalid_json = client.post("/", content=b"{") with TestClient(unsupported_app) as client: - non_object = client.post("/interactions", json=[]) - unsupported = client.post("/interactions", json={"type": 99}) + non_object = client.post("/", json=[]) + unsupported = client.post("/", json={"type": 99}) assert too_large.status_code == 413 assert invalid_json.status_code == 400 diff --git a/python/packages/hosting-invocations/README.md b/python/packages/hosting-invocations/README.md index 5587a2b6365..de77dfbd40c 100644 --- a/python/packages/hosting-invocations/README.md +++ b/python/packages/hosting-invocations/README.md @@ -1,13 +1,13 @@ # agent-framework-hosting-invocations -Minimal `POST /invoke` channel for [agent-framework-hosting](../hosting). Useful +Minimal `POST /invocations` channel for [agent-framework-hosting](../hosting). Useful for smoke-testing, durable-task drivers, and bespoke clients that don't speak the OpenAI Responses protocol. ## Wire shape ``` -POST /invocations/invoke +POST /invocations { "message": "hello", "session_id": "user-42", diff --git a/python/packages/hosting-invocations/agent_framework_hosting_invocations/__init__.py b/python/packages/hosting-invocations/agent_framework_hosting_invocations/__init__.py index 2ad7b4be911..572348dec1f 100644 --- a/python/packages/hosting-invocations/agent_framework_hosting_invocations/__init__.py +++ b/python/packages/hosting-invocations/agent_framework_hosting_invocations/__init__.py @@ -1,6 +1,6 @@ # Copyright (c) Microsoft. All rights reserved. 
-"""Minimal ``POST /invoke`` channel for :mod:`agent_framework_hosting`.""" +"""Minimal ``POST /invocations`` channel for :mod:`agent_framework_hosting`.""" from ._channel import InvocationsChannel diff --git a/python/packages/hosting-invocations/agent_framework_hosting_invocations/_channel.py b/python/packages/hosting-invocations/agent_framework_hosting_invocations/_channel.py index bbaf27b4959..6f70614b82b 100644 --- a/python/packages/hosting-invocations/agent_framework_hosting_invocations/_channel.py +++ b/python/packages/hosting-invocations/agent_framework_hosting_invocations/_channel.py @@ -1,6 +1,6 @@ # Copyright (c) Microsoft. All rights reserved. -"""Minimal ``POST /invoke`` channel. +"""Minimal ``POST /invocations`` channel. Inspired by ``agent-framework-foundry-hosting``'s ``InvocationsHostServer``. A framework-agnostic surface for callers that just want to send a message and @@ -32,7 +32,7 @@ class InvocationsChannel: - """Minimal ``POST /invoke`` surface. + """Minimal ``POST /invocations`` surface. A run hook can rewrite the channel request (e.g. inject a session, add options) before the host invokes the agent. A stream-transform hook can @@ -51,8 +51,8 @@ def __init__( ) -> None: """Configure the invocations endpoint. - ``path`` is the mount root the host prefixes when registering this - channel's routes (the actual handler is ``POST {path}/invoke``). + ``path`` is the endpoint path the host uses when registering this + channel. Use ``""`` to expose the handler at the app root. ``run_hook`` may rewrite the :class:`ChannelRequest` before the host invokes the target — typically to attach session metadata or translate the wire payload into ``Message`` instances. 
@@ -68,12 +68,12 @@ def __init__( self._ctx: ChannelContext | None = None def contribute(self, context: ChannelContext) -> ChannelContribution: - """Capture the host-supplied context and register ``POST /invoke``.""" + """Capture the host-supplied context and register the endpoint route.""" self._ctx = context - return ChannelContribution(routes=[Route("/invoke", self._handle, methods=["POST"])]) + return ChannelContribution(routes=[Route("/", self._handle, methods=["POST"])]) async def _handle(self, request: Request) -> Response: - """Handle a single ``POST /invoke`` call. + """Handle a single Invocations call. Validates the JSON body shape, builds a :class:`ChannelRequest` (optionally with a ``ChannelSession`` keyed by ``session_id``), diff --git a/python/packages/hosting-invocations/pyproject.toml b/python/packages/hosting-invocations/pyproject.toml index 80cb40bfc17..8050c4e86fd 100644 --- a/python/packages/hosting-invocations/pyproject.toml +++ b/python/packages/hosting-invocations/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "agent-framework-hosting-invocations" -description = "Minimal POST /invoke channel for agent-framework-hosting." +description = "Minimal POST /invocations channel for agent-framework-hosting." 
authors = [{ name = "Microsoft", email = "af-support@microsoft.com"}] readme = "README.md" requires-python = ">=3.10" diff --git a/python/packages/hosting-invocations/tests/test_channel.py b/python/packages/hosting-invocations/tests/test_channel.py index cdd3403850a..e2e64535edd 100644 --- a/python/packages/hosting-invocations/tests/test_channel.py +++ b/python/packages/hosting-invocations/tests/test_channel.py @@ -60,9 +60,9 @@ async def _coro() -> _FakeAgentResponse: return _coro() -def _make_client(agent: _FakeAgent | None = None) -> tuple[TestClient, _FakeAgent]: +def _make_client(agent: _FakeAgent | None = None, *, path: str = "/invocations") -> tuple[TestClient, _FakeAgent]: agent = agent or _FakeAgent() - host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel()]) + host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel(path=path)]) return TestClient(host.app), agent @@ -70,14 +70,21 @@ class TestInvocations: def test_post_invoke_returns_response(self) -> None: client, _agent = _make_client(_FakeAgent(reply="pong")) with client: - r = client.post("/invocations/invoke", json={"message": "ping"}) + r = client.post("/invocations", json={"message": "ping"}) + assert r.status_code == 200 + assert r.json() == {"response": "pong", "session_id": None} + + def test_empty_path_mounts_at_app_root(self) -> None: + client, _agent = _make_client(_FakeAgent(reply="pong"), path="") + with client: + r = client.post("/", json={"message": "ping"}) assert r.status_code == 200 assert r.json() == {"response": "pong", "session_id": None} def test_session_id_propagates_to_target(self) -> None: client, agent = _make_client() with client: - r = client.post("/invocations/invoke", json={"message": "x", "session_id": "s1"}) + r = client.post("/invocations", json={"message": "x", "session_id": "s1"}) assert r.status_code == 200 assert r.json()["session_id"] == "s1" sess = agent.calls[0]["kwargs"].get("session") @@ -90,7 +97,7 @@ def 
test_invalid_json_returns_400(self) -> None: client, _ = _make_client() with client: r = client.post( - "/invocations/invoke", + "/invocations", content=b"{not json", headers={"content-type": "application/json"}, ) @@ -99,26 +106,26 @@ def test_invalid_json_returns_400(self) -> None: def test_empty_message_returns_422(self) -> None: client, _ = _make_client() with client: - r = client.post("/invocations/invoke", json={"message": ""}) + r = client.post("/invocations", json={"message": ""}) assert r.status_code == 422 def test_non_string_session_id_returns_422(self) -> None: client, _ = _make_client() with client: - r = client.post("/invocations/invoke", json={"message": "x", "session_id": 1}) + r = client.post("/invocations", json={"message": "x", "session_id": 1}) assert r.status_code == 422 def test_non_object_body_returns_422(self) -> None: client, _ = _make_client() with client: - r = client.post("/invocations/invoke", json=[]) + r = client.post("/invocations", json=[]) assert r.status_code == 422 def test_streaming_emits_data_lines_and_done(self) -> None: agent = _FakeAgent(chunks=["hel", "lo"]) host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel()]) with TestClient(host.app) as client: - r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + r = client.post("/invocations", json={"message": "x", "stream": True}) assert r.status_code == 200 body = r.text assert "data: hel" in body @@ -136,7 +143,7 @@ async def hook(req: ChannelRequest, **_: Any) -> ChannelRequest: agent = _FakeAgent(reply="ok") host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel(run_hook=hook)]) with TestClient(host.app) as client: - r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + r = client.post("/invocations", json={"message": "x", "stream": True}) assert r.status_code == 200 # Even though caller asked for stream=True, hook flipped it off — so # we get JSON back, not SSE. 
@@ -154,7 +161,7 @@ def hook(result: HostedRunResult, **kwargs: Any) -> HostedRunResult: host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel(response_hook=hook)]) with TestClient(host.app) as client: - r = client.post("/invocations/invoke", json={"message": "ping"}) + r = client.post("/invocations", json={"message": "ping"}) assert r.status_code == 200 assert r.json() == {"response": "hooked:pong", "session_id": None} @@ -174,7 +181,7 @@ def transform(update: Any) -> Any: channels=[InvocationsChannel(stream_transform_hook=transform)], ) with TestClient(host.app) as client: - r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + r = client.post("/invocations", json={"message": "x", "stream": True}) assert r.status_code == 200 body = r.text assert "data: FOO" in body @@ -192,7 +199,7 @@ def transform(update: Any) -> Any: channels=[InvocationsChannel(stream_transform_hook=transform)], ) with TestClient(host.app) as client: - r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + r = client.post("/invocations", json={"message": "x", "stream": True}) assert r.status_code == 200 body = r.text assert "data: keep" in body @@ -210,7 +217,7 @@ async def transform(update: Any) -> Any: channels=[InvocationsChannel(stream_transform_hook=transform)], ) with TestClient(host.app) as client: - r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + r = client.post("/invocations", json={"message": "x", "stream": True}) assert r.status_code == 200 assert "data: aa!" 
in r.text @@ -221,7 +228,7 @@ def test_streaming_chunk_with_crlf_splits_into_separate_data_lines(self) -> None agent = _FakeAgent(chunks=["line1\r\nline2"]) host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel()]) with TestClient(host.app) as client: - r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + r = client.post("/invocations", json={"message": "x", "stream": True}) assert r.status_code == 200 body = r.text assert "data: line1\n" in body @@ -247,7 +254,7 @@ def run(self, messages: Any = None, *, stream: bool = False, **kwargs: Any) -> A agent = _AgentWithFailingFinal() host = AgentFrameworkHost(target=agent, channels=[InvocationsChannel()]) with TestClient(host.app) as client: - r = client.post("/invocations/invoke", json={"message": "x", "stream": True}) + r = client.post("/invocations", json={"message": "x", "stream": True}) assert r.status_code == 200 body = r.text assert "data: partial" in body diff --git a/python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py b/python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py index cf85cca0260..e29431063ac 100644 --- a/python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py +++ b/python/packages/hosting-responses/agent_framework_hosting_responses/_channel.py @@ -77,7 +77,7 @@ class ResponsesChannel: Mounts ``POST /responses`` (default path ``/responses`` so the full route is ``/responses/responses`` when the channel is prefixed, - or just ``/responses`` when ``path=""``). + or just ``/`` when ``path=""``). """ name = "responses" @@ -85,7 +85,7 @@ class ResponsesChannel: def __init__( self, *, - path: str = "", + path: str = "/responses", run_hook: ChannelRunHook | None = None, response_hook: ChannelResponseHook | None = None, stream_transform_hook: ChannelStreamTransformHook | None = None, @@ -94,9 +94,9 @@ def __init__( """Create a Responses channel. 
Keyword Args: - path: Mount prefix on the host. Default ``""`` mounts the - ``POST /responses`` route at the app root, matching the - upstream OpenAI surface. + path: Endpoint path on the host. Default ``"/responses"`` matches + the upstream OpenAI surface; use ``""`` to expose this channel + at the app root. run_hook: Optional :data:`ChannelRunHook` invoked with the parsed :class:`ChannelRequest` before the agent target runs. May return a replacement request. @@ -145,12 +145,12 @@ def __init__( ) def contribute(self, context: ChannelContext) -> ChannelContribution: - """Capture the host-supplied context and register ``POST /responses``.""" + """Capture the host-supplied context and register the endpoint route.""" self._ctx = context - return ChannelContribution(routes=[Route("/responses", self._handle, methods=["POST"])]) + return ChannelContribution(routes=[Route("/", self._handle, methods=["POST"])]) async def _handle(self, request: Request) -> Response: - """Handle a single ``POST /responses`` call. + """Handle a single Responses API call. 
Parses the OpenAI Responses-shaped body into ``Message`` / ``options`` / ``ChannelSession`` triples via :mod:`._parsing`, diff --git a/python/packages/hosting-responses/tests/test_channel.py b/python/packages/hosting-responses/tests/test_channel.py index c76dda9c448..c750e1d346f 100644 --- a/python/packages/hosting-responses/tests/test_channel.py +++ b/python/packages/hosting-responses/tests/test_channel.py @@ -91,9 +91,13 @@ async def push(self, identity: ChannelIdentity, payload: HostedRunResult) -> Non # --------------------------------------------------------------------------- # -def _make_client(agent: _FakeAgent | None = None) -> tuple[TestClient, AgentFrameworkHost, _FakeAgent]: +def _make_client( + agent: _FakeAgent | None = None, + *, + path: str = "/responses", +) -> tuple[TestClient, AgentFrameworkHost, _FakeAgent]: agent = agent or _FakeAgent() - host = AgentFrameworkHost(target=agent, channels=[ResponsesChannel()]) + host = AgentFrameworkHost(target=agent, channels=[ResponsesChannel(path=path)]) return TestClient(host.app), host, agent @@ -110,6 +114,13 @@ def test_post_responses_returns_completed_envelope(self) -> None: assert body["output"][0]["content"][0]["text"] == "hi back" assert len(agent.calls) == 1 + def test_empty_path_mounts_at_app_root(self) -> None: + client, _host, _agent = _make_client(_FakeAgent(reply="hi back"), path="") + with client: + r = client.post("/", json={"input": "hi"}) + assert r.status_code == 200 + assert r.json()["output"][0]["content"][0]["text"] == "hi back" + def test_invalid_json_returns_400(self) -> None: client, *_ = _make_client() with client: diff --git a/python/packages/hosting-telegram/agent_framework_hosting_telegram/_channel.py b/python/packages/hosting-telegram/agent_framework_hosting_telegram/_channel.py index 2e7c20fb937..b2a026a9437 100644 --- a/python/packages/hosting-telegram/agent_framework_hosting_telegram/_channel.py +++ b/python/packages/hosting-telegram/agent_framework_hosting_telegram/_channel.py 
@@ -88,6 +88,21 @@ def _text_result(text: str) -> HostedRunResult[AgentResponse]: return HostedRunResult(AgentResponse(messages=[Message(role="assistant", contents=[Content.from_text(text=text)])])) +def _is_echo_payload(payload: HostedRunResult[AgentResponse]) -> bool: + """Return ``True`` when a push payload is an echoed user turn. + + Per the :class:`~agent_framework_hosting.ChannelPush` contract the host + mirrors the originating user's input as a one-or-more message + :class:`~agent_framework.AgentResponse` with every ``role == "user"``, + delivered *before* the agent's (``role == "assistant"``) reply. Treating a + payload whose messages are all user-role as an echo lets the channel pick + echo-only delivery options (e.g. silent notifications) without the host + having to thread an explicit ``is_echo`` flag through ``push``. + """ + messages = getattr(payload.result, "messages", None) or [] + return bool(messages) and all(getattr(m, "role", None) == "user" for m in messages) + + def _telegram_media_file_id(message: Mapping[str, Any]) -> tuple[str, str] | None: """Return ``(file_id, fallback_media_type)`` for any media on the message.""" photo = message.get("photo") @@ -151,7 +166,7 @@ def __init__( self, *, bot_token: str, - path: str = "/telegram", + path: str = "/telegram/webhook", commands: Sequence[ChannelCommand] = (), register_native_commands: bool = True, run_hook: ChannelRunHook | None = None, @@ -159,6 +174,7 @@ def __init__( api_base: str = "https://api.telegram.org", webhook_url: str | None = None, secret_token: str | None = None, + delete_webhook_on_shutdown: bool = False, parse_mode: str | None = None, send_typing_action: bool = True, transport: Literal["auto", "polling", "webhook"] = "auto", @@ -179,6 +195,7 @@ def __init__( self._api = f"{api_base}/bot{bot_token}" self._webhook_url = webhook_url self._secret_token = secret_token + self._delete_webhook_on_shutdown = delete_webhook_on_shutdown self._parse_mode = parse_mode self._send_typing_action 
= send_typing_action if transport == "auto": @@ -210,7 +227,7 @@ def contribute(self, context: ChannelContext) -> ChannelContribution: self._ctx = context routes: list[BaseRoute] = [] if self._transport == "webhook": - routes.append(Route("/webhook", self._handle, methods=["POST"])) + routes.append(Route("/", self._handle, methods=["POST"])) return ChannelContribution( routes=routes, commands=self._commands, @@ -258,7 +275,7 @@ async def _on_startup(self) -> None: logger.info("Telegram polling started (long-poll timeout=%ss)", self._polling_timeout) async def _on_shutdown(self) -> None: - """Stop the polling task, drain in-flight workers, drop the webhook, close HTTP. + """Stop the polling task, drain in-flight workers, close HTTP. Drain order: 1. Cancel the poll task so no new updates are admitted. @@ -269,10 +286,17 @@ async def _on_shutdown(self) -> None: ``_update_tasks`` (the webhook handler returns 200 immediately and runs the agent in a background task, which the previous shutdown ignored entirely). - 4. Best-effort `deleteWebhook` and HTTP client close. - - Webhook teardown is best-effort — failures (e.g. revoked token at - shutdown) are logged but never raised so app shutdown can complete. + 4. Close the HTTP client. + + The webhook registration is intentionally **left in place** on + shutdown. A Telegram webhook is a single global resource, so + deleting it here races rolling redeploys: the new revision calls + ``setWebhook`` on startup, then the old revision's shutdown would + delete it, silently breaking inbound delivery until the next boot. + ``setWebhook`` is overwriting/idempotent, so the next startup + re-asserts it anyway. Set ``delete_webhook_on_shutdown=True`` to opt + into best-effort teardown (e.g. for a one-off/ephemeral deployment); + failures are logged but never raised so app shutdown can complete. 
""" if self._poll_task is not None: self._poll_task.cancel() @@ -296,7 +320,7 @@ async def _on_shutdown(self) -> None: await task self._update_tasks.clear() if self._http is not None: - if self._transport == "webhook": + if self._transport == "webhook" and self._delete_webhook_on_shutdown: try: await self._http.post(f"{self._api}/deleteWebhook") except Exception: # pragma: no cover - best-effort cleanup @@ -845,7 +869,17 @@ async def push(self, identity: ChannelIdentity, payload: HostedRunResult[AgentRe raise ValueError(f"Telegram push requires an int chat_id, got {identity.native_id!r}") from exc if self._http is None: raise RuntimeError("TelegramChannel.push called before startup") - await self._send(chat_id, payload.result.text) + # The Bot API can only ever send AS the bot, so there is no way to + # impersonate the user for an echo (the MTProto ``send_as`` field is + # not exposed to bots). The next best UX is to deliver echoes + # *silently* (``disable_notification``) so a mirrored input doesn't + # buzz the user's device the way a genuine reply does. Echo phases are + # identified per the ChannelPush contract: a payload whose messages are + # all ``role == "user"`` is the originating turn mirrored here. 
+ extra: dict[str, Any] = {} + if _is_echo_payload(payload): + extra["disable_notification"] = True + await self._send(chat_id, payload.result.text, **extra) async def _send_photo(self, chat_id: int, photo_url: str, caption: str | None = None) -> None: """POST a ``sendPhoto`` to Telegram with an optional caption.""" diff --git a/python/packages/hosting-telegram/tests/test_channel.py b/python/packages/hosting-telegram/tests/test_channel.py index 6b4930b48dd..0eb5031c222 100644 --- a/python/packages/hosting-telegram/tests/test_channel.py +++ b/python/packages/hosting-telegram/tests/test_channel.py @@ -123,10 +123,13 @@ def _run_result(text: str) -> HostedRunResult[AgentResponse]: return HostedRunResult(AgentResponse(messages=[Message(role="assistant", contents=[Content.from_text(text=text)])])) -def _make_telegram(stream_default: bool = False) -> tuple[TelegramChannel, _FakeAgent]: +def _make_telegram( + stream_default: bool = False, *, path: str = "/telegram/webhook" +) -> tuple[TelegramChannel, _FakeAgent]: agent = _FakeAgent("hi") ch = TelegramChannel( bot_token="123:abc", + path=path, webhook_url="https://example.com/hook", secret_token="s3cr3t", stream=stream_default, @@ -158,6 +161,18 @@ def test_webhook_accepts_text_message_and_dispatches_to_agent(self) -> None: assert r.status_code == 200 assert agent.runs, "expected the agent to be invoked" + def test_empty_path_mounts_at_app_root(self) -> None: + ch, agent = _make_telegram(path="") + host = AgentFrameworkHost(target=agent, channels=[ch]) + with TestClient(host.app) as client: + r = client.post( + "/", + json={"update_id": 1, "message": {"chat": {"id": 99}, "text": "hello"}}, + headers={"x-telegram-bot-api-secret-token": "s3cr3t"}, + ) + assert r.status_code == 200 + assert agent.runs, "expected the agent to be invoked" + def test_webhook_rejects_bad_secret(self) -> None: ch, agent = _make_telegram() host = AgentFrameworkHost(target=agent, channels=[ch]) @@ -216,6 +231,23 @@ async def 
test_push_calls_send(self) -> None: assert args[0].endswith("/sendMessage") assert kwargs["json"]["chat_id"] in ("42", 42) assert kwargs["json"]["text"] == "hi" + # Agent replies must stay loud: no silent flag on a non-echo push. + assert "disable_notification" not in kwargs["json"] + + async def test_push_echo_is_silent(self) -> None: + ch, _agent = _make_telegram() + from agent_framework_hosting import ChannelIdentity + + echo = HostedRunResult( + AgentResponse(messages=[Message(role="user", contents=[Content.from_text(text="said via X")])]) + ) + await ch.push(ChannelIdentity(channel="telegram", native_id="42"), echo) + assert ch._http is not None + _args, kwargs = ch._http.post.call_args # type: ignore[attr-defined] + # Bots cannot impersonate the user (no MTProto send_as), so the echo is + # delivered silently instead of buzzing the device like a real reply. + assert kwargs["json"]["disable_notification"] is True + assert kwargs["json"]["text"] == "said via X" async def test_command_handler_invoked(self) -> None: captured: list[ChannelCommandContext] = [] @@ -387,3 +419,39 @@ async def stuck(update: Mapping[str, Any]) -> None: await ch._on_shutdown() assert not ch._chat_workers assert not ch._update_tasks + + +def _deletewebhook_called(http_mock: MagicMock) -> bool: + return any( + call.args and str(call.args[0]).endswith("/deleteWebhook") for call in http_mock.post.call_args_list + ) + + +class TestWebhookShutdownTeardown: + async def test_shutdown_keeps_webhook_by_default(self) -> None: + """Default: shutdown must NOT delete the webhook (avoids redeploy races).""" + ch, _ = _make_telegram() + assert ch._transport == "webhook" + await ch._on_shutdown() + assert not _deletewebhook_called(ch._http) # type: ignore[arg-type] + ch._http.aclose.assert_awaited() # type: ignore[union-attr] + + async def test_shutdown_deletes_webhook_when_opted_in(self) -> None: + """Opt-in: ``delete_webhook_on_shutdown=True`` performs best-effort teardown.""" + ch = TelegramChannel( 
+ bot_token="123:abc", + webhook_url="https://example.com/hook", + secret_token="s3cr3t", + delete_webhook_on_shutdown=True, + stream=False, + ) + fake_http = MagicMock() + response_mock = MagicMock() + response_mock.json = MagicMock(return_value={"ok": True, "result": {}}) + fake_http.post = AsyncMock(return_value=response_mock) + fake_http.get = AsyncMock(return_value=response_mock) + fake_http.aclose = AsyncMock() + ch._http = fake_http + await ch._on_shutdown() + assert _deletewebhook_called(fake_http) + fake_http.aclose.assert_awaited() diff --git a/python/packages/hosting/agent_framework_hosting/_host.py b/python/packages/hosting/agent_framework_hosting/_host.py index 05ea0fabb99..d90a9df29c3 100644 --- a/python/packages/hosting/agent_framework_hosting/_host.py +++ b/python/packages/hosting/agent_framework_hosting/_host.py @@ -48,7 +48,7 @@ from starlette.middleware import Middleware from starlette.requests import Request from starlette.responses import PlainTextResponse -from starlette.routing import BaseRoute, Mount, Route +from starlette.routing import BaseRoute, Mount, Route, WebSocketRoute from starlette.types import ASGIApp, Receive, Scope, Send from ._authorization import ( @@ -110,6 +110,21 @@ RuntimeMode = Literal["long_running", "ephemeral"] +def _exact_path_route(path: str, route: BaseRoute) -> BaseRoute | None: + """Clone a root route so ``Mount('/x', Route('/'))`` also handles ``/x`` without a redirect.""" + if isinstance(route, Route) and route.path == "/": + return Route( + path, + route.endpoint, + methods=route.methods, + name=route.name, + include_in_schema=route.include_in_schema, + ) + if isinstance(route, WebSocketRoute) and route.path == "/": + return WebSocketRoute(path, route.endpoint, name=route.name) + return None + + def _detect_runtime_mode(env: Mapping[str, str] | None = None) -> tuple[RuntimeMode, str | None]: """Inspect deployment markers and return ``(mode, matched_marker_or_None)``. 
@@ -257,7 +272,7 @@ def _workflow_event_to_update(event: WorkflowEvent[Any]) -> AgentResponseUpdate @asynccontextmanager -async def _suppress_already_consumed() -> AsyncIterator[None]: # noqa: RUF029 +async def _suppress_already_consumed() -> AsyncIterator[None]: """Yield, swallowing finalizer failures so consumer cleanup never crashes the host. The bridge stream calls ``get_final_response()`` after iterating the @@ -1233,7 +1248,7 @@ def _log_startup( Mirrors the ``AgentServerHost`` convention from ``azure.ai.agentserver.core``: one INFO line that captures the - target type, every channel + its mount path, the bind address + target type, every channel + its endpoint path, the bind address (when known), whether we're running inside a Foundry Hosted Agents container, and the worker count. Keeps log noise low while still giving an operator a single grep-able anchor when @@ -1290,11 +1305,19 @@ async def _readiness(_request: Request) -> PlainTextResponse: # noqa: RUF029 for channel in self.channels: contribution = channel.contribute(context) # Channels publish routes relative to their root; mount under channel.path. - # An empty path means "mount at the app root" — useful for single-channel hosts - # that don't want a prefix (e.g. ResponsesChannel exposing POST /responses directly). + # An empty path means "mount at the app root" — useful when an external + # platform requires the channel endpoint at "/" or at a route contributed + # by the channel. 
if contribution.routes: if channel.path: - routes.append(Mount(channel.path, routes=list(contribution.routes))) + channel_routes = list(contribution.routes) + exact_routes = [ + exact_route + for route in channel_routes + if (exact_route := _exact_path_route(channel.path, route)) is not None + ] + routes.extend(exact_routes) + routes.append(Mount(channel.path, routes=channel_routes)) else: routes.extend(contribution.routes) on_startup.extend(contribution.on_startup) diff --git a/python/packages/hosting/agent_framework_hosting/_types.py b/python/packages/hosting/agent_framework_hosting/_types.py index 64ac9258f89..93dcc437a0c 100644 --- a/python/packages/hosting/agent_framework_hosting/_types.py +++ b/python/packages/hosting/agent_framework_hosting/_types.py @@ -710,7 +710,7 @@ class Channel(Protocol): """ name: str - path: str # default mount path (e.g. "/responses"); use "" to mount routes at the app root + path: str # default endpoint path (e.g. "/responses"); use "" to mount contributed routes at the app root def contribute(self, context: ChannelContext) -> ChannelContribution: ... diff --git a/python/packages/hosting/tests/test_host.py b/python/packages/hosting/tests/test_host.py index 283992f597f..cefdf5c8481 100644 --- a/python/packages/hosting/tests/test_host.py +++ b/python/packages/hosting/tests/test_host.py @@ -96,7 +96,7 @@ def __init__(self, name: str = "fake", path: str = "/fake", supports_push: bool self.pushes: list[tuple[ChannelIdentity, HostedRunResult[Any]]] = [] self._push_raises: Exception | None = None self._supports_push = supports_push - # Provide a single trivial route so contribute() exercises the mount path. + # Provide a single trivial route so contribute() exercises the endpoint path. 
self._routes: Sequence[BaseRoute] = (Route("/ping", _ping),) def contribute(self, context: ChannelContext) -> ChannelContribution: @@ -239,6 +239,18 @@ def test_app_mounts_channel_routes_under_path(self) -> None: assert r.status_code == 200 assert r.json() == {"ok": True} + def test_app_mounts_root_route_at_exact_channel_path(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel(path="/fake") + ch._routes = (Route("/", _ping),) + host = AgentFrameworkHost(target=agent, channels=[ch]) + + with TestClient(host.app, follow_redirects=False) as client: + r = client.get("/fake") + assert r.status_code == 200 + assert r.json() == {"ok": True} + assert client.get("/fake/").status_code == 200 + def test_app_mounts_at_root_when_path_is_empty(self) -> None: agent = _FakeAgent() ch = _RecordingChannel(path="") diff --git a/python/samples/04-hosting/af-hosting/README.md b/python/samples/04-hosting/af-hosting/README.md index 369c1ae73f1..ad368b9d445 100644 --- a/python/samples/04-hosting/af-hosting/README.md +++ b/python/samples/04-hosting/af-hosting/README.md @@ -15,6 +15,7 @@ its own package (`agent-framework-hosting-responses`, | [`local_responses/`](./local_responses) | The minimal shape: one agent + one `@tool` + `ResponsesChannel` + a single `run_hook` that strips caller-supplied options and forces a `reasoning` preset. | **Local only.** Start here to learn the run-hook seam. | | [`local_responses_workflow/`](./local_responses_workflow) | A 4-step `Workflow` (typed `SloganBrief` intake → writer → legal → formatter) hosted behind **both** the Responses and Invocations channels via a shared `run_hook` that parses inbound text/JSON into the workflow's typed input. The host writes per-conversation checkpoints via `checkpoint_location=…`. Demonstrates workflow targets + structured input adaptation + multi-channel + resume-across-turns. Includes a `call_server.rest` file with REST examples for both endpoints. 
| **Local only.** | | [`foundry_hosted_agent/`](./foundry_hosted_agent) | One Foundry agent, **Responses + Invocations only** — the minimal shape that is **runtime-compatible with the Foundry Hosted Agents platform**. | Ships with `Dockerfile` + `agent.yaml` + `agent.manifest.yaml` + `azure.yaml` so the same image runs locally **or** as a Foundry Hosted Agent (`azd up`). | +| [`foundry_telegram_invocations_weather/`](./foundry_telegram_invocations_weather) | Experimental Telegram weather bot that mounts `TelegramChannel` at `POST /invocations`, registers the Foundry Hosted Agents Invocations URL as the Telegram webhook, and uses `FoundryHostedAgentHistoryProvider` for storage. | Ships with `Dockerfile` + `agent.yaml` + `agent.manifest.yaml` + `azure.yaml`; used to validate whether a non-Responses channel can run under Foundry Invocations. | | [`local_telegram/`](./local_telegram) | Adds Telegram, a `@tool`, `FileHistoryProvider`, run hooks (per-user / per-chat session keying), extra Telegram commands, and `ResponseTarget` multicast. Runs under Hypercorn with multiple workers. | **Local only.** No Dockerfile / Foundry packaging. | | [`local_identity_link/`](./local_identity_link) | Everything in `local_telegram/` plus Teams and the Entra identity-link sidecar (`/auth/start` + `/auth/callback`). Demonstrates linking a Telegram chat to an Entra user so multiple non-Entra channels can share one isolation key. | **Local only.** No Dockerfile / Foundry packaging. | diff --git a/python/samples/04-hosting/af-hosting/foundry_hosted_agent/README.md b/python/samples/04-hosting/af-hosting/foundry_hosted_agent/README.md index ca57acc89e4..2fb766ae3bd 100644 --- a/python/samples/04-hosting/af-hosting/foundry_hosted_agent/README.md +++ b/python/samples/04-hosting/af-hosting/foundry_hosted_agent/README.md @@ -3,7 +3,7 @@ Smallest end-to-end hosting sample. 
One Foundry-backed agent, two channels, no human-chat surface — and that minimal shape is the whole point: a host configured with at least the **Responses** and -**Invocations** channels under their default mount roots is +**Invocations** channels under their default endpoints is **runtime-compatible with the Foundry Hosted Agents platform**. The same container image runs locally, behind any ASGI server, or as a Hosted Agent — no protocol shim, no extra adapter. @@ -11,7 +11,7 @@ Hosted Agent — no protocol shim, no extra adapter. | Route | Channel | Used by | | ------------------------------ | -------------------- | ------------------------------------------- | | `POST /responses` | `ResponsesChannel` | OpenAI Responses clients (`call_server.py`) | -| `POST /invocations/invoke` | `InvocationsChannel` | Host-native JSON envelope (Hosted Agents) | +| `POST /invocations` | `InvocationsChannel` | Host-native JSON envelope (Hosted Agents) | ## Conversation history diff --git a/python/samples/04-hosting/af-hosting/foundry_hosted_agent/app.py b/python/samples/04-hosting/af-hosting/foundry_hosted_agent/app.py index 67bf14fdd0f..2c1eaa8e15a 100644 --- a/python/samples/04-hosting/af-hosting/foundry_hosted_agent/app.py +++ b/python/samples/04-hosting/af-hosting/foundry_hosted_agent/app.py @@ -4,7 +4,7 @@ This sample is intentionally minimal and is **runtime-compatible with the Foundry Hosted Agents platform**: a host that exposes the Responses and -Invocations channels under their default mount roots can be packaged as a +Invocations channels under their default endpoints can be packaged as a container image and deployed to Foundry Hosted Agents without any protocol shim. The same image runs locally, behind any ASGI server, or as a Hosted Agent. @@ -52,7 +52,7 @@ Routes ------ - ``POST /responses`` — OpenAI Responses-shaped surface. -- ``POST /invocations/invoke`` — host-native JSON envelope. +- ``POST /invocations`` — host-native JSON envelope. 
""" from __future__ import annotations diff --git a/python/samples/04-hosting/af-hosting/foundry_hosted_agent/call_server.py b/python/samples/04-hosting/af-hosting/foundry_hosted_agent/call_server.py index be39d68756a..9114f61f1be 100644 --- a/python/samples/04-hosting/af-hosting/foundry_hosted_agent/call_server.py +++ b/python/samples/04-hosting/af-hosting/foundry_hosted_agent/call_server.py @@ -3,7 +3,7 @@ """Call the foundry_hosted_agent server three ways. The foundry_hosted_agent host exposes ``POST /responses`` (OpenAI Responses-shaped) and -``POST /invocations/invoke`` (host-native), and that minimal contract is +``POST /invocations`` (host-native), and that minimal contract is **runtime-compatible with the Foundry Hosted Agents platform** — so the same agent code that calls the local server also calls the same image deployed as a Hosted Agent. diff --git a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/Dockerfile b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/Dockerfile new file mode 100644 index 00000000000..612ab1854d0 --- /dev/null +++ b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/Dockerfile @@ -0,0 +1,19 @@ +FROM ghcr.io/astral-sh/uv:python3.12-bookworm-slim + +WORKDIR /app + +# The sample depends on hosting packages from Git refs until they publish to +# PyPI, so the remote builder needs git available during `uv sync`. 
+RUN apt-get update \ + && apt-get install -y --no-install-recommends git \ + && rm -rf /var/lib/apt/lists/* + +COPY pyproject.toml ./ +COPY app.py ./ + +RUN uv sync --no-dev + +ENV PORT=8000 +EXPOSE 8000 + +CMD ["uv", "run", "python", "app.py"] diff --git a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/Dockerfile.dockerignore b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/Dockerfile.dockerignore new file mode 100644 index 00000000000..df8255e305b --- /dev/null +++ b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/Dockerfile.dockerignore @@ -0,0 +1,4 @@ +* +!app.py +!pyproject.toml +!Dockerfile diff --git a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/README.md b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/README.md new file mode 100644 index 00000000000..6f3eba95a39 --- /dev/null +++ b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/README.md @@ -0,0 +1,66 @@ +# foundry_telegram_invocations_weather + +Telegram weather bot sample for validating a non-Responses channel on Foundry +Hosted Agents. The sample configures `TelegramChannel(path="/invocations")` so +the webhook handler runs at the container endpoint `POST /invocations`; Foundry +exposes that route publicly as: + +```text +{FOUNDRY_PROJECT_ENDPOINT}/agents/agent-framework-telegram-invocations-weather/endpoint/protocols/invocations?api-version=2025-11-15-preview +``` + +| Route | Channel | Used by | +|---|---|---| +| `POST /responses` | `ResponsesChannel` | Quick hosted-agent sanity checks | +| `POST /invocations` | `TelegramChannel` | Telegram webhook payloads | + +The agent uses `FoundryHostedAgentHistoryProvider` and a small +`lookup_weather` tool so Telegram requests exercise model calls, tool calls, +and Foundry-hosted storage. + +## Important platform note + +This is an intentional experiment. 
Current Foundry Hosted Agents behavior +requires Entra bearer auth before a request reaches the container. Telegram +cannot attach that bearer token to webhook deliveries, so webhook registration +can succeed while live Telegram deliveries fail at the Foundry front door with +`401`. Authenticated calls to the Invocations endpoint are still useful for +validating the channel and storage behavior inside the container. + +The sample does not configure `TELEGRAM_WEBHOOK_SECRET` because prior probing +showed Foundry strips Telegram's `X-Telegram-Bot-Api-Secret-Token` header before +the request reaches the container. + +## Run locally + +```bash +export FOUNDRY_PROJECT_ENDPOINT=https://.services.ai.azure.com +export MODEL_DEPLOYMENT_NAME=gpt-5.4-nano +export TELEGRAM_BOT_TOKEN= +export TELEGRAM_WEBHOOK_URL=https:///invocations +az login + +uv sync +uv run python app.py +``` + +## Deploy + +```bash +set -a +. ../../../../.env +set +a + +azd env set TELEGRAM_BOT_TOKEN "$TELEGRAM_BOT_TOKEN" +azd env set MODEL_DEPLOYMENT_NAME "${MODEL_DEPLOYMENT_NAME:-gpt-5.4-nano}" +azd env set HOSTING_INVOCATIONS_API_VERSION 2025-11-15-preview +azd up +``` + +If you connect this sample to an existing Foundry project instead of running +`azd provision`, make sure the azd environment has `AZURE_AI_PROJECT_ID` and the +project's ACR connection values set before running `azd deploy`. + +On startup, `TelegramChannel` calls `setWebhook` using the Foundry public +Invocations URL derived from `FOUNDRY_PROJECT_ENDPOINT` and +`FOUNDRY_AGENT_NAME`. 
diff --git a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/agent.manifest.yaml b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/agent.manifest.yaml new file mode 100644 index 00000000000..900f33547c4 --- /dev/null +++ b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/agent.manifest.yaml @@ -0,0 +1,38 @@ +name: agent-framework-telegram-invocations-weather +description: > + Telegram weather bot sample hosted by Agent Framework. The Telegram webhook + handler is mounted at /invocations so the Foundry Hosted Agents Invocations + protocol endpoint can be registered as the bot's webhook URL. +metadata: + tags: + - Agent Framework + - AI Agent Hosting + - Azure AI AgentServer + - Responses Protocol + - Invocations Protocol + - Telegram +template: + name: agent-framework-telegram-invocations-weather + kind: hosted + protocols: + - protocol: responses + version: 1.0.0 + - protocol: invocations + version: 1.0.0 + environment_variables: + - name: MODEL_DEPLOYMENT_NAME + value: "{{MODEL_DEPLOYMENT_NAME}}" + - name: TELEGRAM_BOT_TOKEN + value: "{{TELEGRAM_BOT_TOKEN}}" + - name: HOSTING_INVOCATIONS_API_VERSION + value: "{{HOSTING_INVOCATIONS_API_VERSION}}" +resources: + - kind: model + id: gpt-5.4-nano + name: MODEL_DEPLOYMENT_NAME +parameters: + properties: + - name: TELEGRAM_BOT_TOKEN + secret: true + - name: HOSTING_INVOCATIONS_API_VERSION + secret: false diff --git a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/agent.yaml b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/agent.yaml new file mode 100644 index 00000000000..5206c10385c --- /dev/null +++ b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/agent.yaml @@ -0,0 +1,31 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/microsoft/AgentSchema/refs/heads/main/schemas/v1.0/ContainerAgent.yaml + +kind: hosted +name: 
agent-framework-telegram-invocations-weather +description: | + Telegram weather bot sample hosted by Agent Framework. The Telegram webhook + handler is mounted at /invocations so the Foundry Hosted Agents Invocations + protocol endpoint can be registered as the bot's webhook URL. +metadata: + tags: + - Agent Framework + - AI Agent Hosting + - Azure AI AgentServer + - Responses Protocol + - Invocations Protocol + - Telegram +protocols: + - protocol: responses + version: 1.0.0 + - protocol: invocations + version: 1.0.0 +resources: + cpu: "1" + memory: 2Gi +environment_variables: + - name: MODEL_DEPLOYMENT_NAME + value: ${MODEL_DEPLOYMENT_NAME} + - name: TELEGRAM_BOT_TOKEN + value: ${TELEGRAM_BOT_TOKEN} + - name: HOSTING_INVOCATIONS_API_VERSION + value: ${HOSTING_INVOCATIONS_API_VERSION} diff --git a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/app.py b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/app.py new file mode 100644 index 00000000000..a7c4ef4007f --- /dev/null +++ b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/app.py @@ -0,0 +1,194 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Telegram weather bot hosted behind Foundry Hosted Agents Invocations. + +This sample intentionally mounts the Telegram webhook handler at the container's +``/invocations`` route so the Foundry public Invocations protocol URL can be +registered as the Telegram webhook URL: + +``{FOUNDRY_PROJECT_ENDPOINT}/agents/{FOUNDRY_AGENT_NAME}/endpoint/protocols/invocations`` + +It uses ``FoundryHostedAgentHistoryProvider`` for conversation history and a +small weather tool to validate that a normal channel can run under the +Hosted Agents runtime. The sample also exposes Responses for a quick platform +sanity check. + +Sample output after sending "weather in Amsterdam" to the Telegram bot: +Assistant:> Amsterdam is cloudy with a high of 16 C. 
+"""
+
+from __future__ import annotations
+
+import logging
+import os
+from dataclasses import replace
+from typing import Annotated
+
+from agent_framework import Agent, tool
+from agent_framework.observability import enable_instrumentation
+from agent_framework_foundry import FoundryChatClient
+from agent_framework_foundry_hosting import FoundryHostedAgentHistoryProvider, foundry_response_id
+from agent_framework_hosting import (
+    AgentFrameworkHost,
+    ChannelCommand,
+    ChannelCommandContext,
+    ChannelRequest,
+)
+from agent_framework_hosting_responses import ResponsesChannel
+from agent_framework_hosting_telegram import TelegramChannel, telegram_isolation_key
+from azure.identity.aio import DefaultAzureCredential
+
+AGENT_NAME = "agent-framework-telegram-invocations-weather"
+DEFAULT_MODEL_DEPLOYMENT = "gpt-5.4-nano"
+DEFAULT_INVOCATIONS_API_VERSION = "2025-11-15-preview"
+
+logging.basicConfig(
+    level=os.environ.get("LOG_LEVEL", "INFO").upper(),
+    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
+)
+for _noisy in (
+    "httpx",
+    "httpcore",
+    "azure.core.pipeline.policies.http_logging_policy",
+    "urllib3",
+):
+    logging.getLogger(_noisy).setLevel(logging.WARNING)
+
+logger = logging.getLogger(__name__)
+
+
+@tool(approval_mode="never_require")
+def lookup_weather(location: Annotated[str, "The city to look up weather for."]) -> str:
+    """Return a deterministic weather report for a city."""
+    reports = {
+        "seattle": "Seattle is rainy with a high of 12 C.",
+        "amsterdam": "Amsterdam is cloudy with a high of 16 C.",
+        "tokyo": "Tokyo is clear with a high of 22 C.",
+        "london": "London is misty with a high of 11 C.",
+    }
+    normalized = location.strip().lower()
+    return reports.get(normalized, f"{location} is sunny with a high of 20 C.")
+
+
+def _foundry_invocations_webhook_url() -> str:
+    """Build the public Foundry Invocations URL used as Telegram's webhook."""
+    explicit = os.environ.get("TELEGRAM_WEBHOOK_URL")
+    if explicit:
+        return explicit
+
+    project_endpoint = os.environ["FOUNDRY_PROJECT_ENDPOINT"].rstrip("/")
+    agent_name = os.environ.get("FOUNDRY_AGENT_NAME", AGENT_NAME)
+    api_version = os.environ.get("HOSTING_INVOCATIONS_API_VERSION", DEFAULT_INVOCATIONS_API_VERSION)
+    return f"{project_endpoint}/agents/{agent_name}/endpoint/protocols/invocations?api-version={api_version}"
+
+
+def _configure_observability() -> None:
+    """Wire Azure Monitor OpenTelemetry when Foundry injects a connection string."""
+    conn_str = os.environ.get("APPLICATIONINSIGHTS_CONNECTION_STRING")
+    if not conn_str:
+        logger.info("APPLICATIONINSIGHTS_CONNECTION_STRING not set; skipping Azure Monitor export.")
+        return
+
+    from azure.monitor.opentelemetry import configure_azure_monitor  # pyright: ignore[reportUnknownVariableType]
+
+    configure_azure_monitor(connection_string=conn_str)
+    logger.info("Azure Monitor OpenTelemetry configured.")
+
+
+def telegram_hook(request: ChannelRequest, **_: object) -> ChannelRequest:
+    """Clamp request options for Telegram-originating runs."""
+    options = dict(request.options or {})
+    options.pop("store", None)
+    options["reasoning"] = {"effort": "high", "summary": "auto"}
+    return replace(request, options=options)
+
+
+def make_commands() -> list[ChannelCommand]:
+    """Create Telegram slash commands used by the sample."""
+
+    async def handle_start(ctx: ChannelCommandContext) -> None:
+        await ctx.reply("Hi! Ask me for weather in Seattle, Amsterdam, Tokyo, London, or any city.")
+
+    async def handle_help(ctx: ChannelCommandContext) -> None:
+        await ctx.reply(
+            "/weather - call the weather tool directly\n"
+            "/whoami - show your Telegram session key\n"
+            "/help - show this message"
+        )
+
+    async def handle_whoami(ctx: ChannelCommandContext) -> None:
+        await ctx.reply(f"Your session key is {telegram_isolation_key(ctx.request.attributes.get('chat_id'))}.")
+
+    async def handle_weather(ctx: ChannelCommandContext) -> None:
+        command_text = ctx.request.input if isinstance(ctx.request.input, str) else ""
+        _, _, location = command_text.partition(" ")
+        await ctx.reply(lookup_weather(location=(location.strip() or "Seattle")))
+
+    return [
+        ChannelCommand("start", "Introduce the bot", handle_start),
+        ChannelCommand("help", "List available commands", handle_help),
+        ChannelCommand("whoami", "Show the Telegram session key", handle_whoami),
+        ChannelCommand("weather", "Call the weather tool: /weather ", handle_weather),
+    ]
+
+
+def build_host() -> AgentFrameworkHost:
+    """Build the Foundry-hosted Telegram weather agent."""
+    # 1. Create a shared credential for model calls and Foundry storage.
+    credential = DefaultAzureCredential()
+    project_endpoint = os.environ["FOUNDRY_PROJECT_ENDPOINT"]
+
+    # 2. Create the agent with a simple weather tool and Foundry-backed history.
+    agent = Agent(
+        client=FoundryChatClient(
+            project_endpoint=project_endpoint,
+            model=os.environ.get("MODEL_DEPLOYMENT_NAME", DEFAULT_MODEL_DEPLOYMENT),
+            credential=credential,
+        ),
+        name="TelegramInvocationsWeatherAgent",
+        instructions=(
+            "You are a concise weather assistant. Use lookup_weather for weather questions "
+            "and answer in one short sentence."
+        ),
+        tools=[lookup_weather],
+        context_providers=[
+            FoundryHostedAgentHistoryProvider(
+                credential=credential,
+                endpoint=project_endpoint,
+            ),
+        ],
+    )
+
+    # 3. Register Telegram at /invocations and keep Responses available for sanity checks.
+    return AgentFrameworkHost(
+        target=agent,
+        allow_in_process_runner=True,
+        channels=[
+            ResponsesChannel(response_id_factory=foundry_response_id),
+            TelegramChannel(
+                bot_token=os.environ["TELEGRAM_BOT_TOKEN"],
+                path="/invocations",
+                transport="webhook",
+                webhook_url=_foundry_invocations_webhook_url(),
+                parse_mode="Markdown",
+                commands=make_commands(),
+                run_hook=telegram_hook,
+            ),
+        ],
+    )
+
+
+_configure_observability()
+enable_instrumentation(enable_sensitive_data=True)
+app = build_host().app
+
+
+if __name__ == "__main__":
+    import asyncio
+
+    import hypercorn.asyncio
+    import hypercorn.config
+
+    config = hypercorn.config.Config()
+    config.bind = [f"0.0.0.0:{int(os.environ.get('PORT', '8000'))}"]
+    asyncio.run(hypercorn.asyncio.serve(app, config))  # type: ignore[arg-type]
diff --git a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/azure.yaml b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/azure.yaml
new file mode 100644
index 00000000000..b52679db740
--- /dev/null
+++ b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/azure.yaml
@@ -0,0 +1,18 @@
+# yaml-language-server: $schema=https://raw.githubusercontent.com/Azure/azure-dev/main/schemas/v1.0/azure.yaml.json
+
+requiredVersions:
+  extensions:
+    azure.ai.agents: '>=0.1.0-preview'
+name: ai-foundry-telegram-invocations-weather
+services:
+  agent-framework-telegram-invocations-weather:
+    project: .
+    host: azure.ai.agent
+    language: docker
+    docker:
+      remoteBuild: true
+    config:
+      container:
+        resources:
+          cpu: "1"
+          memory: 2Gi
diff --git a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/pyproject.toml b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/pyproject.toml
new file mode 100644
index 00000000000..6cbddf04a21
--- /dev/null
+++ b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/pyproject.toml
@@ -0,0 +1,26 @@
+[project]
+name = "agent-framework-hosting-foundry-telegram-invocations-weather"
+version = "0.0.1"
+description = "Foundry Hosted Agents Telegram weather sample using the Invocations path."
+requires-python = ">=3.10"
+dependencies = [
+    "agent-framework-foundry",
+    "agent-framework-foundry-hosting",
+    "agent-framework-hosting",
+    "agent-framework-hosting-responses",
+    "agent-framework-hosting-telegram",
+    "azure-identity",
+    "aiohttp>=3.13.5",
+    "hypercorn>=0.17",
+    "mcp>=1.24,<2",
+    "azure-monitor-opentelemetry>=1.6",
+]
+
+[tool.uv]
+package = false
+
+[tool.uv.sources]
+agent-framework-foundry-hosting = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/foundry_hosting" }
+agent-framework-hosting = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/hosting" }
+agent-framework-hosting-responses = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/hosting-responses" }
+agent-framework-hosting-telegram = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/hosting-telegram" }
diff --git a/python/samples/04-hosting/af-hosting/local_responses_workflow/README.md b/python/samples/04-hosting/af-hosting/local_responses_workflow/README.md
index a168ef482ef..7e1a04f56d6 100644
--- a/python/samples/04-hosting/af-hosting/local_responses_workflow/README.md +++ b/python/samples/04-hosting/af-hosting/local_responses_workflow/README.md @@ -15,7 +15,7 @@ of the workflow. `Workflow` target and dispatches to `workflow.run(...)` (no `Agent.create_session(...)`). - Two channels are mounted side-by-side (`ResponsesChannel` at - `/responses`, `InvocationsChannel` at `/invocations/invoke`). Both + `/responses`, `InvocationsChannel` at `/invocations`). Both share the **same `brief_hook`** that **adapts the channel-native input into the workflow start executor's typed input** — Responses delivers a `list[Message]`, Invocations delivers a `str`, but the diff --git a/python/samples/04-hosting/af-hosting/local_responses_workflow/call_server.rest b/python/samples/04-hosting/af-hosting/local_responses_workflow/call_server.rest index 75f5b78c964..005353cfe4e 100644 --- a/python/samples/04-hosting/af-hosting/local_responses_workflow/call_server.rest +++ b/python/samples/04-hosting/af-hosting/local_responses_workflow/call_server.rest @@ -45,7 +45,7 @@ Content-Type: application/json ### # 4. Invocations API — structured brief -POST {{host}}/invocations/invoke +POST {{host}}/invocations Content-Type: application/json { @@ -55,7 +55,7 @@ Content-Type: application/json ### # 5. Invocations API — plain topic -POST {{host}}/invocations/invoke +POST {{host}}/invocations Content-Type: application/json { @@ -66,7 +66,7 @@ Content-Type: application/json ### # 6. Invocations API — resume the same session_id to reuse the # workflow's per-conversation checkpoint store. -POST {{host}}/invocations/invoke +POST {{host}}/invocations Content-Type: application/json { @@ -77,7 +77,7 @@ Content-Type: application/json ### # 7. Invocations API — streaming (SSE; one `data:` line per chunk, # terminated by `data: [DONE]`). 
-POST {{host}}/invocations/invoke +POST {{host}}/invocations Content-Type: application/json Accept: text/event-stream From 109cff4d202c8646d39489c125ae181c8409acb6 Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Fri, 12 Jun 2026 08:34:08 +0200 Subject: [PATCH 13/20] Simplify Python hosting core (#6492) Remove linking, multicast, durable delivery, and host push machinery from the v1 hosting core. Keep those scenarios in a proposed follow-up ADR and update channel packages, samples, docs, tests, and workspace metadata around the smaller host/channel contract. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/decisions/0027-hosting-channels.md | 521 +--- ...-hosting-linking-multicast-enhancements.md | 132 + docs/specs/002-python-hosting-channels.md | 2316 ++--------------- .../_channel.py | 204 +- .../tests/test_channel.py | 171 +- .../_channel.py | 105 +- .../tests/discord/test_channel.py | 205 +- python/packages/hosting-entra/LICENSE | 21 - python/packages/hosting-entra/README.md | 39 - .../agent_framework_hosting_entra/__init__.py | 15 - .../agent_framework_hosting_entra/_channel.py | 505 ---- python/packages/hosting-entra/pyproject.toml | 108 - .../packages/hosting-entra/tests/__init__.py | 0 .../hosting-entra/tests/test_channel.py | 464 ---- .../_channel.py | 64 +- .../hosting-invocations/tests/test_channel.py | 30 +- .../__init__.py | 2 - .../_channel.py | 126 +- .../_parsing.py | 69 +- .../hosting-responses/tests/test_channel.py | 68 +- .../hosting-responses/tests/test_parsing.py | 60 - .../_channel.py | 147 +- .../hosting-telegram/tests/test_channel.py | 73 +- python/packages/hosting/README.md | 187 +- .../agent_framework_hosting/__init__.py | 83 +- .../agent_framework_hosting/_authorization.py | 485 ---- .../hosting/agent_framework_hosting/_host.py | 1469 ++--------- .../agent_framework_hosting/_persistence.py | 107 +- .../agent_framework_hosting/_runner.py | 751 ------ .../agent_framework_hosting/_state_store.py | 308 
+-- .../hosting/agent_framework_hosting/_types.py | 800 +----- .../hosting/tests/test_authorization.py | 580 ----- python/packages/hosting/tests/test_host.py | 706 +---- .../packages/hosting/tests/test_host_disk.py | 247 +- .../packages/hosting/tests/test_isolation.py | 33 +- python/packages/hosting/tests/test_runner.py | 333 --- .../hosting/tests/test_runner_disk.py | 278 -- python/packages/hosting/tests/test_types.py | 288 -- python/pyproject.toml | 2 - .../samples/04-hosting/af-hosting/README.md | 11 +- .../af-hosting/foundry_hosted_agent/README.md | 6 +- .../app.py | 1 - .../af-hosting/local_identity_link/README.md | 67 - .../af-hosting/local_identity_link/app.py | 395 --- .../local_identity_link/call_server.py | 72 - .../local_identity_link/pyproject.toml | 32 - .../af-hosting/local_telegram/README.md | 7 +- .../local_telegram/call_server_multicast.py | 92 - .../af-hosting/local_telegram/pyproject.toml | 2 +- python/uv.lock | 22 - 50 files changed, 1224 insertions(+), 11585 deletions(-) create mode 100644 docs/decisions/0028-hosting-linking-multicast-enhancements.md delete mode 100644 python/packages/hosting-entra/LICENSE delete mode 100644 python/packages/hosting-entra/README.md delete mode 100644 python/packages/hosting-entra/agent_framework_hosting_entra/__init__.py delete mode 100644 python/packages/hosting-entra/agent_framework_hosting_entra/_channel.py delete mode 100644 python/packages/hosting-entra/pyproject.toml delete mode 100644 python/packages/hosting-entra/tests/__init__.py delete mode 100644 python/packages/hosting-entra/tests/test_channel.py delete mode 100644 python/packages/hosting/agent_framework_hosting/_authorization.py delete mode 100644 python/packages/hosting/agent_framework_hosting/_runner.py delete mode 100644 python/packages/hosting/tests/test_authorization.py delete mode 100644 python/packages/hosting/tests/test_runner.py delete mode 100644 python/packages/hosting/tests/test_runner_disk.py delete mode 100644 
python/samples/04-hosting/af-hosting/local_identity_link/README.md delete mode 100644 python/samples/04-hosting/af-hosting/local_identity_link/app.py delete mode 100644 python/samples/04-hosting/af-hosting/local_identity_link/call_server.py delete mode 100644 python/samples/04-hosting/af-hosting/local_identity_link/pyproject.toml delete mode 100644 python/samples/04-hosting/af-hosting/local_telegram/call_server_multicast.py diff --git a/docs/decisions/0027-hosting-channels.md b/docs/decisions/0027-hosting-channels.md index 7e8c5e36112..ffbad09d81b 100644 --- a/docs/decisions/0027-hosting-channels.md +++ b/docs/decisions/0027-hosting-channels.md @@ -1,478 +1,145 @@ --- status: proposed contact: eavanvalkenburg -date: 2026-04-24 +date: 2026-06-11 deciders: eavanvalkenburg --- -# Agent Framework hosting core with pluggable channels +# Python minimal hosting core and pluggable channels -## What are the business goals for this feature? +## Context and Problem Statement -Give Agent Framework app authors — in every supported language — one low-level hosting surface that can expose a single **hostable target** (an agent or a workflow) on **one or more channels** (Responses API, Invocations API, Telegram, future A2A, MCP-tool, Activity Protocol via Azure Bot Service — which fronts Teams, Web Chat, Slack, …— WhatsApp, optional future direct-to-Teams, custom webhooks) without requiring them to hand-build protocol routing or server glue per protocol, **and** let an end user start a conversation on one channel (e.g. Telegram on their phone) and seamlessly continue it on another (e.g. Teams at their desk via the Activity channel) against the same target and the same conversation history. +Agent Framework has several protocol-specific hosting surfaces. App authors who want one agent or workflow on multiple protocols must compose servers, routes, middleware, session handling, and lifecycle code by hand. 
-This consolidates the protocol-specific hosting layers that exist today (in Python: `agent-framework-foundry-hosting`, `-ag-ui`, `-a2a`, `-devui`; in .NET: the analogous per-protocol hosting helpers) into a shared composable model where: +We will introduce a small Python hosting core that owns the common server shape and leaves protocol details inside channel packages. The first public contract must be intentionally narrow so Python can ship a base contract before adding identity linking, proactive delivery, or multicast behavior. Other language implementations may reuse the same conceptual boundary, but this ADR records the Python decision. -- a host owns the application object and channels own protocol shape, -- the host's hostable target may be an **agent** (executed via the per-language agent execution seam) **or** a **workflow** (executed via the per-language workflow execution seam) — channels do not care which, because the channel's `run_hook` adapts the inbound `ChannelRequest` into the input shape the target needs, and -- session identity is **channel-neutral** — the host resolves a session from a channel-supplied `isolation_key` (e.g. a stable user identity) so two channels mounted on the same host can resolve to the **same** session for the same end user, and a shared session store extends that continuity across hosts and processes. -- channel-native identity is **mapped, not assumed** — every channel has its own user namespace (Telegram `chat_id`, Teams AAD object id, WhatsApp phone number, Slack user id, …). 
The host provides a first-class **identity resolver** seam that maps a channel-native identifier into the channel-neutral `isolation_key`, and a first-class **identity linker** seam that lets an end user **connect** a new channel to an existing `isolation_key` through a well-known mechanism (OAuth, MFA, signed one-time code, …) so cross-channel continuity is achievable without ad-hoc per-channel bookkeeping, and -- **response delivery is decoupled from request origin** — a target's response can be routed back to the **originating** channel (default), the user's **active** channel (the channel most recently observed for that `isolation_key`), a **specific** channel, **all linked** channels (fan-out), or **none** (background). Background/asynchronous runs are first-class: a channel can kick off a run, return a `ContinuationToken` to the caller, and the response is delivered when the user is next observed on any (or a chosen) channel — so a user can start a long task on Telegram and pick up the result on Teams. - -We know we're successful when: - -- after the target is created, a basic multi-channel sample requires only one host, channel objects, and one start call — no handwritten protocol routes and no per-protocol server bootstrap. The hosting core itself takes no dependency on the legacy protocol-specific hosts (e.g. Python's `agentserver`); individual channel packages MAY consume lower-level building blocks shipped in those packages where they ship reusable SDKs (e.g. the Foundry response-store SDK in `azure.ai.agentserver`), -- the same host construction works whether the target is an agent or a workflow — only the `run_hook` (channel-default or app-supplied) changes to adapt the input, -- a single host configured with two channels (e.g. 
Telegram + a future Activity Protocol channel — Teams via Azure Bot Service) can be exercised by one end user across both channels and observe one continuous conversation, **and** -- the same conceptual model applies in Python and .NET. - -## Problem Statement - -### How do developers solve this problem today? - -Today, every protocol surface is its own integration package with its own server. A developer who wants to expose one agent over both the Responses API and a webhook channel has to stand up two separate hosts and stitch them into one application by hand. In Python that means manually mounting two `agentserver`-based hosts into a Starlette app and calling `uvicorn.run(...)`. In .NET it means composing two protocol-specific hosting helpers into one `WebApplication` and wiring middleware twice. - -Adding a Telegram bot to the same agent today means leaving the hosting stack entirely: spinning up a separate process, installing a Telegram SDK, writing the polling/webhook loop, manually translating updates into agent calls, and wiring command handlers (`/start`, `/new`, `/cancel`, …) and native command registration (`set_my_commands(...)`) by hand — none of which is reusable across other message channels (Teams, WhatsApp, …) or across languages. - -### Why does this problem require a new hosting abstraction? +## Decision Drivers -The gap is between **owning a hostable target** (an agent or a workflow) and **operationalizing it on multiple channels**. Agent Framework already provides agents, workflows, sessions, run inputs, response/update streaming, and per-language execution seams (`SupportsAgentRun.run(...)` and the workflow execution seam in Python; `AIAgent.RunAsync(...)` and the workflow execution seam in .NET). What's missing is a generic host that: +- Keep the first host easy to explain: one app, one hostable target, one or more channels. +- Reuse Agent Framework's existing agent, workflow, session, history, and checkpoint primitives. 
+- Let channel packages own protocol parsing, protocol responses, authentication details, and native command surfaces. +- Make session continuity explicit through a channel-supplied `ChannelSession(isolation_key=...)`. +- Avoid approving cross-channel identity and delivery semantics before their safety model is reviewed. -1. Owns one application object and one set of lifecycle hooks per language. -2. Lets channels contribute routes, middleware, commands, and startup/shutdown without protocol leakage into the host. -3. Standardizes how protocol requests become target invocations (input, options, session, streaming) and how target results flow back out — independent of whether the target is an agent or a workflow. -4. **Resolves a session from a channel-neutral `isolation_key`** so two channels mounted on the same host can converge on the same session for the same end user — enabling cross-channel chat continuity (start on Telegram, continue on Teams) without per-channel session bookkeeping. -5. **Bridges channel-native identities into the shared `isolation_key` namespace** — every channel has its own user identifier (Telegram `chat_id`, Teams AAD object id, WhatsApp phone, Slack user id). The generic host needs (a) an **identity resolver** seam that maps a channel-native id to an `isolation_key` for already-known users, and (b) an **identity linker** seam that lets an end user **connect** a new channel to an existing `isolation_key` through a well-known mechanism (OAuth, MFA, signed one-time code) — without each channel reinventing the linking flow. -6. Provides a first-class extension seam for webhook/message channels with native command catalogs (per PR #5393 Telegram sample). -7. Treats the **run hook** as the developer's runtime escape hatch over a uniform request envelope. 
Every channel translates its native protocol payload (Responses JSON body, Telegram update, Invocations request, …) into the same `ChannelRequest` shape — that uniformity is what lets one host front many channels with one target. The run hook runs **after** that channel-internal translation and **before** the target is invoked, receives the channel-built `ChannelRequest`, and returns a possibly-modified `ChannelRequest`. The same seam covers, for example: reshaping a free-form chat message into the typed input a workflow target requires, removing or adding fields on `ChatOptions` (e.g. dropping `temperature`/`store` that a particular target should never see, or injecting a default `model`), enforcing app policy (rejecting requests that omit a required option), or overriding `session_mode` / `response_target`. The list is illustrative, not exhaustive — anything the channel put on the `ChannelRequest` is fair game for the hook to validate, rewrite, or strip. -8. Treats **response delivery** as a first-class, configurable concern — by default the response goes back to the originating channel synchronously, but the host must support routing the response to a different channel (the user's most recently active channel, a specific channel, or all linked channels) and **background runs** where the request returns immediately with a `ContinuationToken` and the response is delivered later via a channel push when the user is next observed (or polled by the caller). -9. Applies the same conceptual model across language ecosystems so concepts, terminology, and behavior transfer between teams and docs. +## Considered Options -The current top-level protocol-specific hosts (e.g. `ResponsesAgentServerHost`, `InvocationAgentServerHost`) are valuable prior art but sit too high in the stack — they encode protocol ownership at the host level and are duplicated per language. 
The new generic core learns from their behavior without depending on those top-level wrappers; individual channel packages may still consume the lower-level SDKs that ship alongside (notably the Foundry response-store SDK). +1. Keep only protocol-specific hosts. +2. Ship a large hosting core with identity linking, authorization, background delivery, active-channel routing, and multicast in v1. +3. Ship a minimal host/channel core now and track linking/multicast as follow-up work. -## Non-Goals / Relationship to existing hosting packages +### Keep only protocol-specific hosts -The hosting core is deliberately **not** a replacement for the existing protocol packages in their first form, and it is not a multi-agent router. It is a peer abstraction layer that lets future protocol packages share one host. +- Good: no new abstraction or package surface. +- Neutral: each protocol can continue evolving independently. +- Bad: every multi-channel app still has to compose servers, lifecycle, and session handling by hand. -| Dimension | Existing protocol packages | Hosting core | -|---|---|---| -| **Mental model** | One package = one protocol surface, owns its own server | One host owns the app; channels plug protocols in | -| **Scope** | Protocol-specific request/session/event mapping | Generic host + channel contract; protocol logic lives in channel packages | -| **Composition** | One protocol per process or per Mount | Many channels per host, shared middleware, lifecycle, session resolution | -| **Multi-agent** | Out of scope per package | **No.** One host = one agent. Future work, if desired. | -| **Cross-language** | Per language, per protocol | Same conceptual model in every implementing language | +### Ship the large cross-channel host in v1 -**Explicit non-goals:** +- Good: the richest cross-channel scenarios are available immediately. +- Neutral: the host becomes the natural place to demonstrate identity and delivery policy. 
+- Bad: v1 becomes a security-sensitive identity and delivery system before the safety model is reviewed. -- Migrating existing protocol packages (AG-UI, A2A, DevUI in Python; analogous .NET helpers) onto the new core in the first implementation. -- Standardizing a persistent session storage contract across all channels in the first phase. (Cross-channel continuity within one host is enabled by `isolation_key` resolution; cross-host/cross-process continuity requires the pluggable session store, listed as a fast follow.) -- Hosting multiple agents behind one router in this first design. -- Designing every detail of WhatsApp, the full Activity Protocol surface, or a future direct-to-Teams channel now (only Telegram is concretely targeted, informed by PR #5393; Activity Protocol via Azure Bot Service is designed-in fast follow alongside A2A and MCP-tool). Within Telegram and Activity, **broadcast Telegram Channels** (the read-only product) and **adaptive-card `Invoke` activity** flows are explicitly fast-follow scope; v1 ships group/supergroup/forum-topic and `personal` / `groupChat` / `channel` `conversationType` support. -- Shipping the **A2A** (agent-to-agent) and **MCP-tool** (exposing the agent as an MCP tool) channels in the first implementation. Both are explicitly **in scope for the overall design** — the host contract, `ChannelRequest` envelope, identity/session/response-target stack, and persisted delivery envelope must accommodate them as caller-supplied-session channels — but their concrete protocol bindings, route catalogs, and packages are **fast-follow work** after the first Telegram + Responses + Invocations release. -- Replacing protocol-specific serializers with one generic event model. -- Taking a runtime or package dependency on the legacy protocol-specific top-level hosts (e.g. `ResponsesAgentServerHost` / `InvocationAgentServerHost` in Python's `agentserver`) from the new hosting core. 
Channel packages MAY depend on lower-level building blocks shipped alongside those hosts where they provide reusable SDKs (notably the Foundry response-store SDK consumed by `FoundryHostedAgentHistoryProvider`). -- Forcing identical type names across languages — each language follows its own idioms while preserving the same concepts and terminology. +### Ship the minimal core now -**Boundary rule:** If you need protocol-specific event semantics, codecs, or signature validation, that lives in the channel package. The host owns the application object, lifecycle, session resolution, and the call into the agent's run/stream seam. +- Good: the host/channel boundary can be implemented, tested, and explained without solving linking and durable delivery at the same time. +- Neutral: apps that need richer behavior must build it locally or wait for ADR-0028 follow-up work. +- Bad: proactive delivery and multicast scenarios are deliberately absent from v1. -## Decision Drivers +## Decision Outcome -These are the design principles applied on top of the [business goals](#what-are-the-business-goals-for-this-feature) above. +Chosen option: **minimal host/channel core now, follow-up enhancements later**. -- Keep the app author experience simple for the common case (one host, channels, one start call). -- Treat agents and workflows as peer hostable targets behind one host, so the same channel ecosystem (Responses, Invocations, Telegram, Activity, …) can serve either without rework. -- Preserve room for channel-specific capabilities (signature validation, conversations, streaming, native commands, action surfaces). -- Support message-channel capabilities — native commands, command menus, action surfaces — from the start. -- Support channels that need startup/shutdown behavior (long polling, platform-side command registration) in addition to routes. -- Use the existing protocol-specific implementations as prior art **without** taking a runtime dependency on them. 
-- Keep the new core protocol-agnostic. -- Align to the per-language agent **and workflow** execution seams rather than introducing a new contract for the target. -- Follow each language's idiomatic packaging conventions rather than growing a monolithic integration package. -- Avoid forcing migration of existing protocol packages as part of the first implementation. -- Keep the abstractions language-neutral so the same conceptual model can be implemented by Python, .NET, and future language ecosystems with idiomatic code. +`AgentFrameworkHost` owns: -## Considered Options +- one application object, +- one hostable target (`SupportsAgentRun` agent-compatible object or a `Workflow`), and +- one or more channels. -- Keep the current protocol-specific hosting packages only. -- Create one monolithic `hosting` package with the host and all channels built in. -- Create a new hosting core plus new channel packages, but reimplement the channel stack from scratch with no reference to the current protocol implementations. -- Create a new hosting core plus separate channel packages, informed by the current protocol-specific implementations but without depending on them. +Channels own: -## Decision Outcome +- contributed routes, middleware, commands, and lifecycle callbacks, +- protocol-native request parsing into `ChannelRequest`, +- protocol-native rendering of the originating response, and +- any channel-specific authentication or signature validation. -Chosen option: **Create a new hosting core plus separate channel packages, informed by the current protocol-specific implementations but without depending on them.** Apply the same conceptual model in Python and .NET, with idiomatic per-language API shapes. - -### Summary - -We will introduce a new hosting core distribution package per language. The full conceptual vocabulary is defined once in [Terminology](#terminology); this section calls out only the design decisions baked into each concept. 
- -- **Host** (`AgentFrameworkHost`) — owns the application object (Starlette in Python, ASP.NET Core / Kestrel in .NET), one **hostable target**, and a sequence of channels. Exposes the underlying app as the canonical portability surface and a `serve(...)`-style convenience for the common single-process case. **Named `AgentFrameworkHost` rather than `AgentHost` because the target is not restricted to agents.** -- **Hostable target** — may be either an **agent** (per-language agent execution seam) or a **workflow** (per-language workflow execution seam). The host detects the kind and dispatches; channels are unchanged. -- **Channel**, **`ChannelContext`**, **`ChannelRequest`**, **`ChannelSession`**, **`ChannelContribution`**, **`ChannelCommand`** — the channel-authoring surface. Defined in Terminology. -- **`ChannelRunHook`** — the developer's runtime escape hatch over the uniform `ChannelRequest` envelope. Channels translate their native protocol payload into `ChannelRequest`; the hook then runs **after** that translation and **before** target invocation, receiving and returning a `ChannelRequest`. Examples (illustrative): reshaping a chat message into a workflow's typed input, dropping/injecting `ChatOptions` fields, enforcing required options, overriding `session_mode` / `response_target`. -- **`ChannelResponseHook`** — the *outbound* counterpart to `ChannelRunHook`. Applied per destination after target invocation, before the channel pushes the result onto its wire. Receives the `HostedRunResult` and a `ChannelResponseContext` (request, channel name, destination identity, originating flag, `is_echo` phase flag) and returns a (possibly transformed) `HostedRunResult`. 
Used for channel-side projections of the target's output: a text-only wire reads `result.result.text` (for agent targets) or projects `result.result.get_outputs()` into a single text turn (for workflow targets); a card-capable channel iterates the underlying contents; a workflow result with a typed final output can be rebound to a channel-friendly `AgentResponse` via `result.replace(result=...)`. Hooks are stored as a `response_hook` attribute on the channel instance — duck-typed, not part of the `Channel` Protocol, so adding hook support to a new channel package never breaks the Protocol contract. The host clones the `HostedRunResult` envelope per destination before invoking the hook so one channel's `replace(result=...)` cannot leak into another's payload. -- **`HostedRunResult[TResult]` is a generic typed envelope around the target's full-fidelity output.** For agent targets `TResult` narrows to `AgentResponse` (channels read `result.messages`, `result.value`, `result.usage_details`, `result.response_id`, … directly); for workflow targets to `WorkflowRunResult` (channels iterate `result.get_outputs()` / inspect `result.get_final_state()`). The host never collapses or pre-shapes — multi-modality and structured outputs survive end-to-end. The envelope also carries the resolved `session: AgentSession | None` (None for workflows, which do not own session state in the agent sense). Channels decide what subset their wire can carry through their `response_hook` and their native serializer. -- **`IdentityResolver`** + **`IdentityLinker`** + **`IdentityAllowlist`** — the channel-neutral identity stack. Resolver maps channel-native ids to `isolation_key`; linker runs the **link/connect ceremony** (OAuth / MFA / signed one-time code) so a new channel can join an existing `isolation_key`. The host owns the routes and short-lived state the linker needs; channels surface entry points. 
`IdentityAllowlist` is the **authorization** seam, orthogonal to linking: combined with the per-channel `require_link: bool` it produces three named profiles — **open** (default), **forced-link** (any authenticated identity), and **allowlist** (native-id list, IdP-claim list, or composition of both). Decisions are tri-state (`ALLOW` / `DENY` / `ABSTAIN`) so the host can run the allowlist twice — once with the raw channel identity and again with linker-emitted claims — and compose multiple lists (`AnyOfAllowlists`, `AllOfAllowlists`) without one list's missing information silently denying the request. The host runs a startup validator that rejects silent-deny-everyone configurations (e.g. a claim-based allowlist with no source of verified claims). Channels may declare `require_link=True` to enforce "authenticate before chatting", and the linker stores verified IdP claims (e.g. Entra ID `oid`) so subsequent channels that supply the same claim are auto-merged onto the same `isolation_key` without a second ceremony. -- **`ResponseTarget`** + **`ChannelPush`** + **`ContinuationToken`** + **active channel** — the response-delivery stack. `ResponseTarget` decouples *where* a response is delivered from *where* it originated; `ChannelPush` is the optional channel capability used for non-`originating` delivery; `ContinuationToken` makes background runs first-class with a stable id and status; the host tracks last-seen `(isolation_key, channel)` to resolve `response_target="active"`. `ResponseTarget` constructors that name destinations accept `echo_input=True` to also push the originating user message onto each non-originating destination before the agent reply — keeps the destination channel's UI coherent with the user's actual turn when the host orchestrates cross-channel delivery. -- **`confidentiality_tier`** + **`LinkPolicy`** — the multi-tier-on-one-host stack. 
`confidentiality_tier` is an opaque per-channel label; `LinkPolicy` is the host-level decision over which channel pairs may share an `isolation_key` (link) and which may push to one another (deliver). Built-in `DenyAllLinks` enforces "share a target, never share a session"; running multiple hosts is always a valid alternative. -- **Intent-only delivery envelope + pluggable `DurableTaskRunner`** — assistant messages stored by the host carry an `intended_targets[]` array on `Message.additional_properties["hosting"]` capturing the resolved destination set (after `ResponseTarget` + `LinkPolicy` filtering). The write is **immutable** — a single record of intent, never mutated post-push. Per-destination operational state (attempts, retries, last error, success timestamp, channel-issued id) lives in a pluggable `DurableTaskRunner` (`register` / `schedule` / `get`) that the host uses to fan out non-originating pushes. Built-in `InProcessTaskRunner` (asyncio + bounded retry) is the default for `long_running` deployments; adapter packages (`agent-framework-hosting-durabletask`, future Foundry adapter) plug in for `ephemeral` deployments. Replay is a property of the configured runner — native for durable adapters, not supported for in-process. This eliminates the earlier `pending`/`delivered`/`failed`/`skipped` state machine, the `SupportsDeliveryTracking` provider capability, and the Foundry `update_item` service ask. -- **Caller-supplied vs. host-tracked session carriage** — channels split into two families based on whether the upstream protocol carries a per-conversation key on every request. *Caller-supplied* channels (Responses' `previous_response_id`, Invocations, A2A, MCP) parse it into `ChannelSession.key` and let the caller branch threads by sending fresh ids. 
*Host-tracked* channels (Telegram, Activity Protocol via Azure Bot Service — Teams/Web Chat/Slack/…— WhatsApp) carry only a stable identity and rely on the host's per-`isolation_key` session alias plus a `host.reset_session(...)` `/new`-style command. The split is invisible to the agent target and explains why `reset_session` and aliasing exist at all (host-tracked channels have no other way to start a fresh thread). Anonymous vs. identified is an orthogonal axis; identity is supplied by the channel, the resolver, or both. -- **Multi-user surfaces are first-class.** Telegram groups, supergroups, forum topics, and Activity Protocol multi-user `conversationType`s (`groupChat`, `channel`) are designed-in from v1 — not retrofitted. The contract enforces a clean separation of **user identity** (`ChannelIdentity.native_id` = `from.id` / `from.aadObjectId`) and **conversation locator** (`ChannelRequest.conversation_id` = `chat.id` (+ optional `message_thread_id` / `replyToId`)). Channel implementations expose a `conversation_scope` option (`per_user`, `per_user_per_conversation` (default in groups), `per_conversation`) and an `accept_in_group` addressing rule (`mention_only` (default), `command_only`, `mention_or_command`, `all`) so the bot does not respond to every message in a group and so a single user's group context does not leak into their DM by default. Linker challenge messages (OAuth URL / one-time code) MUST redirect to the user's DM in group contexts. -- **Built-in channels** — own their protocol-defined endpoint paths (`/responses`, `/invocations`, `/telegram/webhook`) without the app author spelling those out. - -Channel implementations live in **separate distribution packages**, one per channel, with public surfaces kept stable per language. 
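The tri-state composition rule from the identity bullet above can be illustrated with a minimal, self-contained sketch. The enum name `AllowlistDecision` and the three values come from this ADR; everything else here (the `AuthorizationContext` fields, the function-based allowlists, the helper names `native_id_allowlist`, `linked_claim_allowlist`, `any_of`, `all_of`, and the precedence chosen between `DENY` and `ABSTAIN`) is an assumption made for illustration, not the proposed API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, Mapping, Optional


class AllowlistDecision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    ABSTAIN = "abstain"


@dataclass(frozen=True)
class AuthorizationContext:
    """Hypothetical context: the raw channel id plus optional verified claims."""
    native_id: str
    verified_claims: Optional[Mapping[str, str]] = None


Allowlist = Callable[[AuthorizationContext], AllowlistDecision]


def native_id_allowlist(allowed: Iterable[str]) -> Allowlist:
    allowed_ids = frozenset(allowed)

    def evaluate(ctx: AuthorizationContext) -> AllowlistDecision:
        # A native-id list always has enough information to decide.
        if ctx.native_id in allowed_ids:
            return AllowlistDecision.ALLOW
        return AllowlistDecision.DENY

    return evaluate


def linked_claim_allowlist(claim: str, allowed: Iterable[str]) -> Allowlist:
    allowed_values = frozenset(allowed)

    def evaluate(ctx: AuthorizationContext) -> AllowlistDecision:
        # Without verified claims this list cannot decide, so it abstains
        # instead of denying.
        if ctx.verified_claims is None:
            return AllowlistDecision.ABSTAIN
        if ctx.verified_claims.get(claim) in allowed_values:
            return AllowlistDecision.ALLOW
        return AllowlistDecision.DENY

    return evaluate


def any_of(*allowlists: Allowlist) -> Allowlist:
    def evaluate(ctx: AuthorizationContext) -> AllowlistDecision:
        decisions = [allowlist(ctx) for allowlist in allowlists]
        if AllowlistDecision.ALLOW in decisions:
            return AllowlistDecision.ALLOW
        # Assumption: one undecided member keeps the whole result undecided.
        if AllowlistDecision.ABSTAIN in decisions:
            return AllowlistDecision.ABSTAIN
        return AllowlistDecision.DENY

    return evaluate


def all_of(*allowlists: Allowlist) -> Allowlist:
    def evaluate(ctx: AuthorizationContext) -> AllowlistDecision:
        decisions = [allowlist(ctx) for allowlist in allowlists]
        if AllowlistDecision.DENY in decisions:
            return AllowlistDecision.DENY
        if AllowlistDecision.ABSTAIN in decisions:
            return AllowlistDecision.ABSTAIN
        return AllowlistDecision.ALLOW

    return evaluate
```

Under these assumptions, evaluating the same composed policy once with only the raw channel identity and again with linker-emitted claims gives the two-pass behavior described above: the first pass can come back `ABSTAIN` (prompt for linking) where a two-state list would have had to deny.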
- -| Concept | Python (proposed) | .NET (proposed) | -|---|---|---| -| Core | `agent-framework-hosting` → `agent_framework.hosting` | `Microsoft.Agents.AI.Hosting` | -| Responses channel | `agent-framework-hosting-responses` → `agent_framework.hosting.ResponsesChannel` (lazy) | `Microsoft.Agents.AI.Hosting.Responses` | -| Invocations channel | `agent-framework-hosting-invocations` → `agent_framework.hosting.InvocationsChannel` (lazy) | `Microsoft.Agents.AI.Hosting.Invocations` | -| Telegram channel | `agent-framework-hosting-telegram` → `agent_framework.hosting.TelegramChannel` (lazy) | `Microsoft.Agents.AI.Hosting.Telegram` | - -Each language follows its own conventions: - -- Python keeps the public import path stable at `agent_framework.hosting` via lazy imports. -- .NET keeps the public namespaces stable per package, following existing `Microsoft.Agents.AI.*` conventions. - -The new hosting core and its channel packages **must not** take a dependency on legacy protocol-specific hosts; those are prior art and parity reference only. - -The initial design target, in every implementing language, is: - -- any execution-seam-compatible target (not just the concrete `Agent`/`ChatClientAgent`), -- built-in channel designs for Responses and Invocations, -- a documented authoring model for webhook/message channels, including a first detailed Telegram design, -- conceptual alignment with existing protocol packages but no implementation or migration requirement for those in the first phase. - -### Conceptual API shape - -The top-level user experience should look the same conceptually in every language: compose one host with one agent and a list of channels, then start it. The channel-authoring seam should follow each language's idioms while preserving the same concepts. 
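As a rough illustration of "compose one host with one target and a list of channels", the sketch below uses stand-in types only. `SketchHost`, `PrefixChannel`, the string-in/string-out handler signature, and the reduced `ChannelContext` are all hypothetical simplifications; the real host builds on a Starlette app and a richer `ChannelContribution`, which this sketch does not attempt to reproduce.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Protocol, Sequence

# Simplification: a handler takes a request body and returns a response body.
Handler = Callable[[str], str]


@dataclass
class ChannelContext:
    """Hypothetical, reduced context: only exposes a way to run the target."""
    run: Handler


@dataclass
class ChannelContribution:
    """Reduced contribution: routes only (no middleware, commands, lifecycle)."""
    routes: Dict[str, Handler] = field(default_factory=dict)


class Channel(Protocol):
    name: str

    def contribute(self, context: ChannelContext) -> ChannelContribution: ...


@dataclass
class PrefixChannel:
    """Toy channel that owns one path and tags the target's output."""
    name: str
    path: str

    def contribute(self, context: ChannelContext) -> ChannelContribution:
        def handle(body: str) -> str:
            return f"[{self.name}] {context.run(body)}"

        return ChannelContribution(routes={self.path: handle})


class SketchHost:
    """Aggregates every channel's routes into one routing table."""

    def __init__(self, target: Handler, channels: Sequence[Channel]) -> None:
        self.routes: Dict[str, Handler] = {}
        context = ChannelContext(run=target)
        for channel in channels:
            for path, handler in channel.contribute(context).routes.items():
                if path in self.routes:
                    raise ValueError(f"duplicate route {path!r} from {channel.name}")
                self.routes[path] = handler

    def dispatch(self, path: str, body: str) -> str:
        return self.routes[path](body)
```

The point of the sketch is the ownership split: each channel decides its own path and wire shape, while the host is the only place that sees all contributions and can reject conflicts at construction time.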
- -| Concept | Python idiom | .NET idiom | -|---|---|---| -| Define a host | `AgentFrameworkHost(target, channels=[...])` (target = agent or workflow) | `AgentFrameworkHostBuilder` / `AddAgentFrameworkHost(target, ...)` on the host builder | -| Canonical app surface | `host.app` (Starlette `Starlette`) — supports HTTP **and** WebSocket scopes via ASGI | `WebApplication` (ASP.NET Core) — supports HTTP **and** WebSocket via `app.UseWebSockets()` / `MapWebSocket(...)` | -| Convenience start | `host.serve(host=, port=)` (lazy `uvicorn`) | `host.RunAsync()` (Kestrel) | -| Channel contract | `Channel` Protocol with `contribute(context) -> ChannelContribution` | `IChannel` interface with `Contribute(IChannelContext)` returning `ChannelContribution` | -| Per-request hook | `ChannelRunHook = Callable[..., ChannelRequest \| Awaitable[ChannelRequest]]` invoked as `hook(request, *, target=..., protocol_request=...)` | `Func>` / delegate with named extras | -| Identity resolver | `IdentityResolver = Callable[[ChannelIdentity], str \| None]` | `IIdentityResolver` (returns `isolation_key`) | -| Identity linker | `IdentityLinker` Protocol with `begin(...)` / `complete(...)` plus `routes()` for callback / verification endpoints | `IIdentityLinker` interface with begin/complete + route contributions | -| Authorization policy | `require_link: bool` + `allowlist: IdentityAllowlist \| Literal["inherit"] \| None` on each channel; built-in allowlists `AllowAll`, `NativeIdAllowlist`, `LinkedClaimAllowlist`, `AnyOfAllowlists`, `AllOfAllowlists`, `CallableAllowlist`; host seam `host.authorize(identity, *, require_link, allowlist, verified_claims) -> AuthorizationOutcome` (`Allowed` \| `LinkRequired` \| `Denied`) with tri-state `AllowlistDecision` (`ALLOW` / `DENY` / `ABSTAIN`); named factories on `AuthPolicy` (`.open()` / `.require_link()` / `.native_allowlist(...)` / `.linked_claim_allowlist(...)`) | `IIdentityAllowlist.EvaluateAsync(AuthorizationContext)` returning `AllowlistDecision`; 
built-ins `AllowAll`, `NativeIdAllowlist`, `LinkedClaimAllowlist`, `AnyOfAllowlists`, `AllOfAllowlists`; `IAgentFrameworkHost.AuthorizeAsync(...)` returning `AuthorizationOutcome` discriminated union with `Allowed` / `LinkRequired` / `Denied` variants | -| Response routing | `ChannelRequest.response_target = ResponseTarget.originating \| .active \| .channel("activity") \| .all_linked \| .none`; channels expose `ChannelPush` if they can deliver proactively | `ChannelRequest.ResponseTarget` discriminated union; `IChannelPush` interface for proactive delivery | -| Background runs | `ContinuationToken` returned by `host.run_in_background(request)`; channels may return it as their protocol response and/or expose a poll route | `ContinuationToken` record + `HostStateStore` for persistence (file-based default; pluggable Cosmos / SQL / Redis) | -| Runtime mode | `runtime_mode: Literal["long_running", "ephemeral"] \| None = None` on `AgentFrameworkHost`; `None` triggers auto-detect via deployment env markers (`FOUNDRY_HOSTING_ENVIRONMENT`, `AZURE_FUNCTIONS_ENVIRONMENT`, `AWS_LAMBDA_FUNCTION_NAME`); falls back to `"long_running"`. Advisory — drives defaults for `HostStateStore` / `DurableTaskRunner` / identity-link state. | `RuntimeMode` enum on `AgentFrameworkHostBuilder` with the same auto-detection contract | -| Durable task runner | `DurableTaskRunner` Protocol (`register` / `schedule` / `get`) on the host; built-in `InProcessTaskRunner` (asyncio + bounded retry); adapter packages plug TaskHub / Foundry / SQLite backends. Used internally for non-originating push fan-out; in v1 fast-follow shared with background-run plumbing. 
| `IDurableTaskRunner` interface with the same register/schedule/get triple; built-in in-process runner; adapter packages mirror the Python set | -| Confidentiality tier on a channel | `Channel.confidentiality_tier: str \| None` (opaque) | `IChannel.ConfidentialityTier { get; }` (opaque string) | -| Link / delivery policy | `LinkPolicy = Callable[[LinkPolicyContext], bool]` with built-ins `AllowAllLinks`, `SameConfidentialityTierOnly`, `ExplicitAllowList`, `DenyAllLinks` | `ILinkPolicy.IsAllowed(LinkPolicyContext)` with the same set of built-in implementations | -| Command descriptor | `ChannelCommand` dataclass | `ChannelCommand` record | -| Lifecycle | `on_startup` / `on_shutdown` callables | `IHostedService` integration / explicit lifecycle delegates | - -Built-in channels own the default mapping from each protocol's request model into a `ChannelRequest`, **and** expose a per-request invocation-hook seam so app authors can validate or rewrite invocation behavior before the host invokes the agent. - -The full Python API surface — exact types, fields, default routes, code samples — is specified in the companion Python spec. A future .NET spec captures the .NET-idiomatic API surface for the same model. - -#### Runtime topology - -How the pieces wire at runtime. Channels contribute routes to the host's app; inbound traffic splits at the parse step into a command dispatch (handled in the channel) or a message that flows through `host.authorize` → target invocation → response delivery. Non-originating destinations go through the configured `DurableTaskRunner`; the originating channel is rendered synchronously. - -```mermaid -graph LR - Caller[External caller /
messaging app] - - subgraph Host[AgentFrameworkHost] - direction TB - ASGI[Starlette / ASP.NET Core app] - Router[Channel router] - Parse{parse →<br/>command or<br/>message?} - Auth[host.authorize] - Resolver[IdentityResolver] - Delivery[_deliver_response] - Push[_handle_push_task] - end - - Channels[Channels<br/>Responses · Invocations ·<br/>Telegram · Activity ·<br/>IdentityLinker] - CmdHandler[CommandHandler<br/>via ChannelCommandContext] - Target[(Agent or Workflow)] - Runner[DurableTaskRunner] - StateStore[(HostStateStore)] - - Caller --> ASGI - ASGI --> Router - Router --> Parse - Parse -- /command --> CmdHandler - Parse -- message --> Auth - CmdHandler -- ctx.run --> Auth - CmdHandler -- local reply --> Channels - Auth --> Resolver - Resolver --> StateStore - Auth --> Target - Target --> Delivery - Delivery -- originating sync --> Channels - Delivery -- non-originating --> Runner - Runner --> Push - Push --> Channels - Channels --> ASGI -``` - -#### Channel contribution shape - -Every channel exposes the same three contribution slots (all optional except `routes`). The host duck-types each slot and stitches them in at construction. - -```mermaid -graph LR - subgraph C[ConcreteChannel<br/>e.g. TelegramChannel] - direction TB - Routes[routes:<br/>webhook / poller / API endpoints<br/>→ Starlette router] - Commands[commands: Sequence ChannelCommand<br/>name · description · handle ·<br/>scopes · locales · expose_in_ui] - Push[ChannelPush.push<br/>+ optional ChannelPushCodec<br/>+ optional response_hook] - end - - Host[Host] - Native[Platform native catalog<br/>Telegram set_my_commands ·<br/>Teams app manifest · …] - Dispatch[CommandHandler dispatch] - Delivery[Originating sync delivery<br/>+ runner-scheduled fan-out] - - Routes -- contribute at startup --> Host - Commands -- startup projection --> Native - Commands -- runtime dispatch --> Dispatch - Push -- driven by --> Delivery -``` - -#### Authorization decision - -`require_link` and `allowlist` are orthogonal axes. The `require_link` gate runs first against the current link state from `StateStore`; an unlinked identity on a `require_link=True` channel returns `LinkRequired` regardless of allowlist. A claim-dependent allowlist that has not yet seen claims returns `ABSTAIN` from `evaluate` and is converted into a `LinkRequired` outcome so the user gets a link prompt rather than a silent deny. - -```mermaid -flowchart TB - Start([authorize identity,<br/>require_link, allowlist]) - Linked{identity already<br/>linked?<br/>StateStore lookup} - Required{require_link?} - OpenPath{allowlist is None?} - Resolve[/isolation_key:<br/>linked → existing,<br/>else auto-issue channel:native_id/] - Evaluate[/allowlist.evaluate context/] - Decision{decision} - Abstain{requires_linked_claims?} - Allowed([Allowed isolation_key]) - DeniedPre([Denied<br/>allowlist_denied_pre_link]) - LinkReq([LinkRequired
via configured linker]) - - Start --> Linked - Linked -- yes --> OpenPath - Linked -- no --> Required - Required -- yes --> LinkReq - Required -- no --> OpenPath - OpenPath -- yes --> Resolve --> Allowed - OpenPath -- no --> Evaluate --> Decision - Decision -- ALLOW --> Resolve - Decision -- DENY --> DeniedPre - Decision -- ABSTAIN --> Abstain - Abstain -- yes --> LinkReq - Abstain -- no --> Resolve -``` - -## Terminology - -These terms are language-neutral and shared between Python and .NET implementations. Each language realizes them with idiomatic types and naming. - -- **Host**: The object that owns one application, one execution-seam-compatible target, and a sequence of channels. Provides the underlying app object (canonical portability surface) and a convenience start method. -- **Channel**: A pluggable component that contributes routes (HTTP and/or WebSocket), middleware, commands, and lifecycle hooks to a host. One channel = one external protocol surface. Used interchangeably with "head" in earlier discussions; **Channel** is the canonical name. -- **`ChannelRequest`**: The host-neutral, normalized invocation envelope produced by a channel before the host invokes the agent. Carries `input`, `options`, `session` hint, `session_mode`, and channel-specific `attributes`. Also carries a small set of **typed slots** for protocol-extension data so multiple event-rich channels (AG-UI today, future custom front-ends) can settle on a shared shape rather than smuggling fields through `attributes`: `client_state` (mutable per-request state object), `client_tools` (frontend tool catalog the agent should see but not execute), and `forwarded_props` (pass-through bag for resume/command/HITL payloads). All three are optional and channel-defined in shape; the host treats them as opaque. -- **`ChannelSession`**: A small session hint with a stable lookup key, an optional protocol-visible conversation/thread identifier, and an opaque `isolation_key`. 
The host resolves it into a framework session; storage specifics are deferred. -- **`isolation_key`**: An opaque partition boundary aligned with hosted-agent terminology — may represent a user, tenant, chat, or other scope without baking direct identity semantics into the generic host. -- **Channel-native identity**: The **user/account** identifier the channel observes from its own platform (Telegram `from.id`, Teams `from.aadObjectId`, WhatsApp phone number, Slack user id). Always per-channel; never assumed to align across channels. Distinct from the **conversation locator** (e.g. Telegram `chat.id`, Teams `conversation.id` + optional `replyToId`) which lives on `ChannelRequest.conversation_id` — in multi-user surfaces (Telegram groups, Teams group chats and team channels) the two never coincide, and the spec defines a `conversation_scope` knob (`per_user`, `per_user_per_conversation` (default in groups), `per_conversation`) plus default `mention_only` addressing so the bot does not respond to every message in a group. -- **Identity resolver**: Host-level seam that maps a channel-native identity into an `isolation_key`. Default behavior **auto-issues and persists** a fresh, stable `isolation_key` on first contact per `(channel, native_id)` so every end user automatically gets a per-user partition without app code; linking merges the second channel's auto-issued key onto the first channel's existing key. Apps that already own an identity namespace can supply a custom resolver that returns those values directly. -- **Identity linker**: Host-level seam that runs a connect ceremony — typically OAuth, MFA, or a signed one-time code — to associate a new channel-native identity with an existing `isolation_key`. Channels expose entry points (e.g. a `/link` command or button); the host owns the ceremony's routes and short-lived state. Mechanism (OAuth provider, MFA factor, code transport) is pluggable; the contract is not. 
-- **`ResponseTarget`**: Per-request directive on `ChannelRequest` that controls **where** the response is delivered: `originating` (default), `active` (the user's most recently observed channel), a specific channel, a list of channels, `all_linked`, or `none` (background-only). Independent of `session_mode`. When the target differs from the originating channel, delivery uses the destination channel's `ChannelPush` capability. -- **`ChannelPush`**: Optional channel capability for **proactive** outbound delivery (proactive Telegram message, Activity Protocol proactive message via Azure Bot Service, webhook callback, SSE broadcast). Channels that don't implement it cannot be the destination of a non-`originating` `ResponseTarget`. -- **Active channel**: The channel most recently observed for a given `isolation_key`. The host tracks last-seen `(isolation_key, channel)` so `response_target="active"` resolves to whichever channel the user is currently using. -- **`confidentiality_tier`** (channel-level): An opaque label declared on a channel (`"corp"`, `"public"`, `"internal"`, …) consumed by the host's `LinkPolicy`. Two channels with different confidentiality tiers can share an agent target on one host while remaining session-isolated. -- **`LinkPolicy`**: Host-level decision over which channel pairs may share an `isolation_key` (link) and which channel pairs may be `ResponseTarget` source/destination for one another (deliver). Built-in variants: allow-all (default), same-tier-only, explicit allow-list, deny-all (the explicit "no cross-channel continuity" mode). Running multiple hosts is always a valid alternative; the policy exists for cases where one shared host with policy-enforced isolation is preferred. -- **`ContinuationToken`**: First-class artifact for background/asynchronous runs. Carries an opaque, URL-safe `token`, current status (`queued` | `running` | `completed` | `failed`), and the resolved `isolation_key`. 
Channels may return it directly in their protocol response (e.g. an Invocations 202 with the token plus a polling URL) so the caller can poll later, while the host also pushes the result to the configured response target when ready. Persisted via the host-level `HostStateStore` (file-based default in v1) so background runs survive host restarts. -- **`session_mode`**: Per-request directive (`auto` | `required` | `disabled`) that controls whether the host resolves a session before invoking the agent. Lets channels honor protocol semantics like Responses `store=False` and lets app authors enforce extra policy. -- **`ChannelContribution`**: What a channel returns from its `contribute(...)` method — routes, middleware, commands, and startup/shutdown lifecycle hooks. The host aggregates contributions into one application. -- **`ChannelCommand`**: A transport-neutral command descriptor. Message channels project these into native command surfaces — Telegram bot commands, future Activity Protocol slash commands / adaptive cards, WhatsApp menus. -- **`ChannelRunHook`**: Per-request callable on built-in channels. Runs after the channel's default `ChannelRequest` is produced, before session resolution. The escape hatch for forcing or forbidding session use, requiring extra options, or adapting to targets like `A2AAgent`. -- **Native command registration**: The startup-time projection of `ChannelCommand` metadata into a platform's native command catalog (e.g. Telegram `set_my_commands(...)`). -- **Hostable target**: The executable object the host fronts — either an **agent** (invoked via the agent execution seam) or a **workflow** (invoked via the workflow execution seam). The host detects the kind and dispatches to the appropriate runner; channels remain unchanged. 
-- **Execution seam**: The framework's existing per-language invocation contracts — for agents, `SupportsAgentRun.run(...)` in Python and `AIAgent.RunAsync(...)` in .NET; for workflows, the equivalent per-language workflow execution seam. The host requires one of these from the hosted target. -- **`HostedStreamResult.raw_events`**: Optional passthrough seam onto the underlying agent event stream **before** update normalization, for channels whose protocol carries domain events the framework does not model (e.g. AG-UI's `StateSnapshotEvent` / `StateDeltaEvent` / `ToolCallStartEvent`). Channels that consume `raw_events` bear responsibility for the full event translation; the request still flows through `context.stream(...)` so session resolution, identity, push, and policy continue to apply. The high-level normalized `updates` stream remains the happy path for Responses, Invocations, Telegram, and most channels. -- **Per-conversation storage seam**: One public seam — `ContextProvider`. Messages flow through `HistoryProvider` (its canonical subclass); non-message per-thread state for event-rich channels (e.g. AG-UI `client_state`) flows through a channel-owned `ContextProvider` subclass that writes into the same per-source state slot. **No parallel `StateProvider` Protocol is introduced.** Host-level pluggable state (`ContinuationToken`s, identity-link grants, last-seen records) and workflow `CheckpointStorage` are deliberately separate seams because the data shapes are structurally different; all three MAY be backed by the same physical store. Per-request transport state (response ids, platform isolation keys, future signals) flows from channels via `ChannelRequest.attributes` into `ContextProvider.bind_request_context(**attrs)`, so providers consume backend-specific request signals without app authors having to wrap the host's ASGI app or install middleware. 
+The host owns: -## Consequences +- route/lifecycle aggregation, +- invocation of the target, +- `ChannelSession(isolation_key=...)` to `AgentSession` resolution and caching, +- `reset_session(isolation_key=...)`, +- host-level middleware, including Foundry isolation middleware only when the Foundry hosting environment flag is present, +- invocation of per-channel hooks (`ChannelRunHook`, `ChannelResponseHook`, `ChannelStreamUpdateHook`), and +- workflow checkpoint wiring through an explicit `checkpoint_location`. -- Good, because app authors get one consistent low-level hosting story for single- and multi-channel scenarios in each supported language. -- Good, because channel packages can stay opinionated about protocol payloads and capabilities without pushing those semantics into the core. -- Good, because the existing protocol-specific implementations provide proven prior art and behavioral guidance. -- Good, because the design supports webhook/message channels that do not look like OpenAI or Foundry APIs. -- Good, because command-capable message channels such as Telegram are first-class channels rather than special-case samples. -- Good, because architectural portability stays at the **standard web-application object** level (ASGI app in Python, `WebApplication` in .NET), so the host is not fundamentally coupled to any one server implementation even when a `serve(...)` convenience uses one. -- Good, because channels can ship sensible invocation defaults while still giving app authors a clear place to enforce extra policy or adapt to different agent implementations (e.g. `A2AAgent`). -- Good, because cross-channel chat continuity for one end user is achievable in the first phase whenever channels can produce a stable `isolation_key`, without requiring any new cross-package storage contract. -- Good, because the same conceptual model is shared across languages — concepts, terminology, and behavior transfer between Python and .NET teams and docs. 
-- Bad, because we introduce new package and namespace surface area that must be versioned and documented in each language. -- Bad, because we still need to reimplement the needed behavior in Agent Framework-owned code per language. -- Bad, because there will be a temporary overlap with the existing protocol-specific hosts until the new channel packages are implemented and stabilized. -- Neutral, because existing protocol packages remain outside the first implementation scope even though the model keeps a path open for later convergence. - -## Validation - -The decision is validated when, in each implementing language: - -1. a one-channel Responses sample and a two-channel Responses + Invocations sample can be expressed with one host, default route layouts under `/responses` and `/invocations`, and no handwritten protocol routing, -2. a Responses channel by default forwards official request parameters like `temperature` into agent options and maps `store=False` into disabled session use, -3. app authors can override that default per request with a run hook that validates or rewrites the final `ChannelRequest` (for example requiring `temperature`, ignoring `store`, or adapting for `A2AAgent`), -4. a Telegram-style message channel can express command metadata, command registration, and either webhook or polling lifecycle behavior through the new channel contract, -5. a custom webhook/message channel can be authored only against the new channel contract plus the language's web-framework primitives and lifecycle hooks, -6. two channels mounted on the same host (e.g. Telegram + a future Activity Protocol channel — Teams via Azure Bot Service) configured with a stable per-user `isolation_key` resolve to the same session for the same end user, so a conversation started on one channel can be continued on the other against the same conversation history, -7.
an end user who is known on one channel can **link a second channel to the same `isolation_key`** through a host-provided ceremony (OAuth, MFA, or a signed one-time code) without each channel reinventing the linking flow, and subsequent requests from the linked channel resolve to the same session as the original channel, -8. a request submitted on one channel can opt into **delivery on a different channel** — `response_target="active"` (whichever channel the user is currently using), a specific channel id, all linked channels, or `none` (background only) — using the destination channel's `ChannelPush` capability, without the originating channel having to know how the destination delivers, -9. **background runs are first-class**: a channel can submit a request that returns a `ContinuationToken` immediately and the response is later delivered both via channel push (when the user is next observed on the configured target channel) and via a poll route the caller can hit with the token, -10. the **same host construction** can front either an agent or a workflow target — the channel ecosystem (Responses, Invocations, Telegram, …) is unchanged, and only the `run_hook` (channel-default or app-supplied) differs to adapt the inbound `ChannelRequest` into the input shape the target requires, -11. a host configured with at least the Responses and Invocations channels can be packaged into a container image whose runtime contract (exposed routes, request/response shapes, health/lifecycle behavior) is **compatible with the Hosted Agents platform**, so the same image can be deployed to that platform without protocol shims, -12. 
a channel can contribute a **WebSocket endpoint** alongside its HTTP routes through the same `Channel` contract, the host's app object exposes it through the standard ASGI / ASP.NET Core WebSocket scope, and the built-in Responses channel exposes a WebSocket transport (default `/responses/ws`) carrying the same Responses request/event model as its HTTP+SSE transport — so the host is forward-compatible with the OpenAI Responses WebSocket transport without changing the hosting contract, -13. a host can **mix channels of different confidentiality tiers** under a `LinkPolicy` so e.g. a corporate-tier channel (Teams) and a public-tier channel (Telegram) share one agent target without sharing a session, cross-tier link attempts are refused with a typed error, cross-tier `ResponseTarget` deliveries are dropped, and the same outcome is reachable by simply running two separate hosts (validating that the policy is a convenience, not a load-bearing mechanism), and -14. the first Responses and Invocations implementations achieve parity with the important behavior of the current protocol-specific hosts without introducing a runtime dependency on them or leaking protocol-specific request models into the hosting core. - -## Pros and Cons of the Options - -### Keep the current protocol-specific hosting packages only - -- Good, because no new package or abstraction needs to be introduced. -- Good, because each protocol can move independently. -- Bad, because users still cannot host one agent on multiple channels through one shared host. -- Bad, because request/session/event bridging keeps being rebuilt at the protocol layer. -- Bad, because webhook/message channels still have no natural home. -- Bad, because the same gap exists in every language with no shared conceptual model. - -### One monolithic `hosting` package with all channels built in - -- Good, because discovery is straightforward. -- Good, because cross-channel refactoring is simpler inside one package. 
-- Bad, because every app pays the dependency and maintenance cost of every channel. -- Bad, because lifecycle and stability become coupled across unrelated channels. -- Bad, because it does not fit either ecosystem's subpackage direction. - -### New hosting core plus new channel packages, reimplemented without reference to current hosting implementations - -- Good, because the abstraction boundary can be kept very clean. -- Good, because package ownership is clear. -- Bad, because it ignores useful prior art in the current hosting implementations. -- Bad, because it increases implementation cost and migration risk. -- Bad, because it makes early channel parity harder. - -### New hosting core plus separate channel packages, informed by current protocol-specific implementations - -- Good, because it gives us a reusable host abstraction without discarding what we learned from current protocol work. -- Good, because the core stays protocol-agnostic while channel packages remain Agent Framework-owned and dependency-free with respect to the legacy protocol-specific hosts. -- Good, because it gives future channels a deeper seam than today's top-level host wrappers. -- Good, because the conceptual model can be applied uniformly in Python and .NET. -- Neutral, because some implementation details may look similar to the current hosts when they are solving the same problem. -- Bad, because the design team must still curate the boundary carefully to avoid copying protocol-specific assumptions into the generic host. - -## Open Questions - -| # | Question | Notes | -|---|---|---| -| 6 | Is "Channel" the GA name in both languages? "Head" was used interchangeably during design discussions. | Use "Channel" for now in spec, ADR, samples, and sub-package names. Other names remain on the table; revisit before public docs in either language. 
| -| 14 | For the Responses WebSocket transport, what subprotocol identifier and auth carrier should the channel adopt — `Authorization` header on the `Upgrade`, a `Sec-WebSocket-Protocol` token, or a query-string-bound short-lived token? | Wait for the upstream OpenAI Responses WS spec to land. The channel codec is intentionally swappable (the host contract does not depend on the WS framing) so the channel package can track upstream changes without touching the host. Document the swappable-codec property explicitly in the spec. | - -## Resolved Questions (decisions log) - -| # | Question | Decision | -|---|---|---| -| 1 | Final distribution package and namespace names per language. | Accept the proposed Python distribution + import names (`agent-framework-hosting` → `agent_framework.hosting`, plus per-channel `agent-framework-hosting-{responses,invocations,telegram}`). Keep the proposed .NET namespaces (`Microsoft.Agents.AI.Hosting{,.Responses,.Invocations,.Telegram}`) as the working target. | -| 2 | How tightly do Python and .NET API names need to match? | Keep concepts and terminology identical across languages; allow idiomatic naming differences (e.g. `serve` vs `RunAsync`). | -| 3 | Should generic auth helpers (HMAC signature, bearer token) live in core, in optional shared helpers, or per channel? | Per-channel auth + host-level middleware composition (current draft). No separate shared-helpers package in v1. **Cross-check the matching decision in the Python spec.** | -| 4 | Should a later phase define a pluggable session store interface, and should it be cross-language or per-language? | Per-language interface (idiomatic per ecosystem). Cross-language compatibility is **not** a v1 goal; revisit if/when concrete demand emerges. | -| 5 | Should the host support multi-target hosting (one host fronting a router across multiple agents/workflows)? | **No.** One host = one target. External routers compose multiple single-target hosts (e.g. 
via Starlette mount in Python, equivalent in .NET). Confirms the existing non-goal. | -| 7 | Should command scopes / projection metadata (private vs group, per-locale descriptions) become first-class on `ChannelCommand`? | Add **optional** `scopes` and `locales` fields on `ChannelCommand`. Channels are free to ignore them. Keeps the cross-channel surface lean while letting Telegram (and the future Activity Protocol channel) project the metadata into their native command catalog. | -| 8 | Which identity-linking mechanisms ship in the first phase? | Ship two first-party helpers in v1 fast-follow: **Entra OAuth** (preset on `OAuthIdentityLinker`) and **`OneTimeCodeIdentityLinker`** (cross-channel code exchange). **Drop `MfaIdentityLinker`** from the v1 fast-follow list. The generic `IdentityLinker` contract still admits any other linker app authors want to write. | -| 9 | Where do issued link grants live? | **File storage for v1**, leveraging Hosted Agents' isolated, persistent per-instance file storage. Resolved together with Q11. | -| 10 | Should the identity resolver be invoked per channel or once on the host with `(channel_id, native_id)`? | **Host-level resolver receiving `(channel_id, native_id)`** so cross-channel decisions stay in one place. Per-channel overrides remain a future option if real cases emerge. | -| 11 | Where does the continuation-token store live? At-rest format and TTL? | Same as Q9 — **file storage for v1** (`FileHostStateStore` under `./.af-hosting/continuations/`, atomic JSON-per-token writes, 24h default TTL on completed entries). Shares the host-level `HostStateStore` contract with link grants and last-seen records. Pluggable Cosmos / SQL / Redis adapters tracked in spec req #23. | -| 12 | Contract for `ChannelPush` failure (offline destination, opt-out, expired token)? 
| **Retry handled by the configured `DurableTaskRunner` per its `RetryPolicy`.** The host registers a single internal `"hosting.push"` handler at startup; each non-originating destination becomes a `runner.schedule("hosting.push", payload)` call. Failures inside the handler are caught by the runner, retried with backoff, and ultimately marked terminal-failed when `max_attempts` is exhausted. Downstream push outcomes live in the runner's own log — there is no per-destination return surface. The earlier `DeliveryReport` value type has been removed; the host's internal `_deliver_response` helper returns `bool` (whether any work was scheduled) for the originating channel. Per-request override via `run_hook`. See the Python spec's [Intended targets + durable delivery](../specs/002-python-hosting-channels.md#intended-targets--durable-delivery) and [Durable task runner](../specs/002-python-hosting-channels.md#durable-task-runner). | -| 13 | Should `response_target="active"` use a time window? Behavior on expiry? | Yes — configurable `active_window_seconds` on the host (suggested default **300 s**). On expiry, fall back to `originating`, then to `all_linked`. Per-request override via `run_hook`. | -| 15 | Should `Channel.confidentiality_tier` stay opaque or become an ordered enum? | **Keep as opaque string.** Apps define their own taxonomy. Built-in policies do equality / set membership checks only — no ordered-comparison policy is shipped. | -| 16 | Should authorization (per-channel allowlist) ship as a single `auth_mode` enum or as two orthogonal parameters? | **Two orthogonal parameters (`require_link: bool` + `allowlist: IdentityAllowlist`)** plus named `AuthPolicy` factories for ergonomics. A single enum cannot express the Mixed profile (native ids bypass auth, everyone else is funneled into linking) without sub-parameters that defeat the point. 
Composition uses a **tri-state `AllowlistDecision` (`ALLOW` / `DENY` / `ABSTAIN`)** so claim-based allowlists can `ABSTAIN` until claims are available without that being read as a denial. `LinkedClaimAllowlist` use without a source of verified claims is rejected at host startup via a typed `ChannelConfigurationError` — silent deny-everyone is the worst possible default and is not allowed. The core PR includes the channel-neutral pieces: `IdentityAllowlist` Protocol, `AllowlistDecision`, built-ins (`AllowAll` / `NativeIdAllowlist` / `LinkedClaimAllowlist` / combinators / `CallableAllowlist`), `IdentityLinker` Protocol, `LinkedIdentity`, `LinkChallenge`, `AuthPolicy` factories, `Allowed` / `LinkRequired` / `Denied` outcomes, `Host(default_allowlist=..., identity_linker=...)` + per-channel `allowlist`, three-rule construction validator (`require_link=True` without a linker now raises), and `host.authorize(...)` for open, native-id, and linked-claim profiles. Provider-specific linkers (for example Entra OAuth helpers) ship as separate packages. See the Python spec's [Authorization profiles section](../specs/002-python-hosting-channels.md#authorization-profiles-and-the-identityallowlist-seam) for full mechanics. | -| 17 | How does the host decide long-running vs ephemeral runtime, and is that distinction enforced? | **Single `runtime_mode` parameter, advisory, auto-detected by default.** `None` (the default) inspects known deployment markers (`FOUNDRY_HOSTING_ENVIRONMENT`, `AZURE_FUNCTIONS_ENVIRONMENT`, `AWS_LAMBDA_FUNCTION_NAME`) and picks `"ephemeral"` on the first hit; otherwise falls back to `"long_running"` (sensible local-dev / always-on default). The mode drives the *default selection* of seams that have FHA-shaped vs container-shaped defaults — `HostStateStore`, `DurableTaskRunner`, identity-link state — but every choice remains independently overridable. Detected mode is logged at startup so misdetection is visible. 
See the Python spec's [Runtime modes](../specs/002-python-hosting-channels.md#runtime-modes). | -| 18 | How does delivery to non-originating destinations actually happen, and what is the retry / replay contract? | **Out-of-band via a pluggable `DurableTaskRunner`.** The host registers an internal `"hosting.push"` handler at startup; each non-originating destination becomes a `runner.schedule("hosting.push", payload)` call. The originating destination (when `ResponseTarget` includes it) is **still rendered synchronously** on the originating channel's wire — only fan-out goes through the runner. Default runner is `InProcessTaskRunner` (asyncio + bounded retry, no cross-restart persistence — suitable for `long_running`). Durable adapter packages (`agent-framework-hosting-durabletask`, future Foundry adapter) plug into the same Protocol for `ephemeral` deployments. Replay across host restarts is a property of the configured runner (native for durable adapters; not supported for the in-process runner). See the Python spec's [Durable task runner](../specs/002-python-hosting-channels.md#durable-task-runner). | -| 19 | What is the audit shape on the assistant message — full per-destination state machine, or intent only? | **Intent only.** `Message.additional_properties["hosting"]["intended_targets"]` is a single immutable write that records the resolved destination set (after `ResponseTarget` + `LinkPolicy` filtering). Operational state — attempt count, last error, success timestamp, channel-issued id — lives in the `DurableTaskRunner` and is observed via the runner's backend. This eliminates the earlier per-destination `pending`/`delivered`/`failed`/`skipped` state machine, the `SupportsDeliveryTracking` provider capability, and the Foundry `update_item` service ask. See the Python spec's [Intended targets + durable delivery](../specs/002-python-hosting-channels.md#intended-targets--durable-delivery). 
| -| 20 | What is the wire contract between `DurableTaskRunner` and push-capable channels when the runner is out-of-process? | **A two-piece contract.** Each `DurableTaskRunner` declares its `payload_mode` (`OBJECT` for in-process pass-by-reference; `JSON` for runners that round-trip through JSON). Push-capable channels that ship non-JSON-native payloads expose a `ChannelPushCodec` (`encode` / `decode`). At construction the host validates the pairing and refuses a `JSON`-mode runner paired with codec-less push channels (`ChannelConfigurationError`). The push handler accepts both `OBJECT` and `JSON` envelope shapes so the same handler serves both runner backends. See the Python spec's [Codec contract for durable serialisation](../specs/002-python-hosting-channels.md#codec-contract-for-durable-serialisation). | -| 21 | What is the operational contract for `runtime_mode="ephemeral"` without a configured durable runner, and for clean shutdown of the in-process runner? | **Strict ephemeral by default + 2-phase drain.** `ephemeral` + default (in-process) runner raises `RuntimeError` at construction unless `allow_in_process_runner=True` is opted in (warning logged) — silently using the in-process runner in an ephemeral environment would drop in-flight pushes on the next scale-to-zero. For `long_running`, `InProcessTaskRunner` ships a `shutdown_grace_seconds` window (default `5.0`) that lets in-flight retries finish before cancellation; `CancelledError` from the cancellation phase is swallowed as the expected shutdown shape. When `echo_input=True`, the push task carries an `echo_done` cursor in runner-owned state so a retry that fires after the echo succeeded does not double-echo. See the Python spec's [Durable task runner](../specs/002-python-hosting-channels.md#durable-task-runner). 
| - -## Decisions-driven follow-ups - -These are spec-body / sample / code edits implied by the resolutions above, **out of scope for this ADR pass** but tracked here so they aren't lost: - -- **Q3** — cross-check the Python spec's auth-helpers stance against the resolved "per-channel + host middleware" decision; reconcile any drift. -- **Q7** — spec, `ChannelCommand` reference, and the Telegram channel design need optional `scopes` and `locales` fields with clear "channels free to ignore" semantics. -- **Q8** — ✅ Done in spec rev. Req #24 lists only `OAuthIdentityLinker` and `OneTimeCodeIdentityLinker`; the linker-helper table and the OAuth scenario no longer reference `MfaIdentityLinker`. -- **Q9 + Q11** — ✅ Resolved in spec rev. Spec req #23 now names the seam **`HostStateStore`** with a v1 default of `FileHostStateStore` (atomic JSON writes under `./.af-hosting/`), so continuation tokens, link grants, and last-seen records all survive single-node restarts. Pluggable Cosmos / SQL / Redis adapters remain v1 fast follow. -- **Q12** — ✅ Resolved by the durable-delivery seam (Q18). The `ChannelPush` failure narrative is now "retry per `RetryPolicy` via the runner, observe in the runner's backend"; no separate `deliveries[]` annotation required. -- **Q13** — add `active_window_seconds` (default 300 s) to the host config surface and document the `originating` → `all_linked` fallback chain. -- **Q14** — explicitly document the **swappable WS codec** property in the Responses channel section (host contract does not depend on the framing) so the spec stays valid as upstream OpenAI evolves. -- **Q15** — confirm the spec consistently treats `confidentiality_tier` as an opaque string and that no built-in policy assumes an ordered hierarchy. 
-- **Q17 / Q18 / Q19** — ✅ Spec text added: new top-level §[Runtime modes](../specs/002-python-hosting-channels.md#runtime-modes), rewritten §[Intended targets + durable delivery](../specs/002-python-hosting-channels.md#intended-targets--durable-delivery), new §[Durable task runner](../specs/002-python-hosting-channels.md#durable-task-runner). Core code lands `DurableTaskRunner` Protocol + `InProcessTaskRunner` + `runtime_mode` constructor parameter + auto-detection in this PR; durable runner adapters (`agent-framework-hosting-durabletask`, Foundry adapter) ship as separate follow-up packages. +`ChannelIdentity`, when present, is request metadata only. In v1 it is not a linking, authorization, or delivery key. -## More Information +### Trust boundary for `isolation_key` -See [Non-Goals](#non-goals--relationship-to-existing-hosting-packages) for what this ADR explicitly does **not** require in the first phase. +The host treats `ChannelSession.isolation_key` as a session partition key, not as proof of identity. Channels or host middleware must authenticate and authorize any externally supplied value before passing it to the host. For example, a Responses caller must not be allowed to choose an arbitrary `previous_response_id` or header-derived key unless the platform or middleware has already established that the caller owns that conversation. The host deliberately does not infer that trust from the string itself. -The Telegram sample proposed in PR #5393 is prior art for native command catalogs and for channels that need startup/shutdown lifecycle behavior beyond plain route registration. The same shape is expected to inform the future Activity Protocol channel (Teams/Web Chat/etc. via Azure Bot Service) and a future WhatsApp channel in both languages. 
+### Hook ownership -**Designed-in followup channels.** Three further channels are explicitly part of the overall design but are scheduled as fast-follow work after the first Responses + Invocations + Telegram release: +Channels provide hook configuration and protocol-native context. The host invokes those hooks as part of the common invocation pipeline: -- **A2A channel** — exposes the hostable target over the Agent-to-Agent protocol so other agents can consume it as a peer. Fits the existing **caller-supplied session** family (alongside Responses and Invocations): A2A's per-conversation identifier is parsed into `ChannelSession.key`, the calling agent's identity (e.g. its A2A agent card / signed JWT) flows through the standard `IdentityResolver` seam, and structured replies fit the existing `ChannelRequest` / `ResponseTarget` envelope. No new host primitives are required to support it; the work is the protocol binding and the package. -- **MCP-tool channel** — exposes the hostable target as a **Model Context Protocol tool** so MCP clients (other agents, IDE tooling, …) can invoke it. Same caller-supplied-session family: the MCP `tool/call` carries the conversation key into `ChannelSession.key`, the MCP client identity flows through `IdentityResolver`, and the tool result is the target's response. Streaming MCP tools map onto the host's existing streaming response delivery; non-streaming MCP tools map onto background runs with `ContinuationToken` if the target needs more time than a single tool-call round-trip allows. -- **Activity Protocol channel** (`ActivityChannel`) — exposes the hostable target behind **Azure Bot Service**, which fronts Teams, Web Chat, Slack-style connectors, and the rest of the Bot Framework / M365 connector ecosystem. 
Native translations from Activity Protocol objects (`Activity`, `ConversationReference`, adaptive cards, `Invoke` activities, …) onto the host's `ChannelRequest` / `ChannelResponse` types — so the contract is **explicit** rather than smuggled through a generic Invocations endpoint. Fits the **host-tracked session** family: Bot Service authenticates with a JWT carrying the AAD object id, the channel populates `ChannelIdentity` from `from.aadObjectId`, the host's per-`isolation_key` alias decides which `AgentSession` to resolve, and `host.reset_session(...)` is reachable via a Teams slash command or adaptive-card action. `ChannelPush` is implemented over Bot Service's `ConversationReference` + `continueConversationAsync` pattern. Naming this channel **Activity** rather than **Teams** keeps a `TeamsChannel` name available for any future direct-to-Teams transport that bypasses Bot Service. +- `ChannelRunHook` runs after channel parsing and before target invocation. +- `ChannelResponseHook` runs after target invocation and before the originating channel serializes its response. +- `ChannelStreamUpdateHook` is applied by the host while the channel consumes streamed updates because streaming serialization is protocol-specific. -All three channels MUST be reachable through the **same** `AgentFrameworkHost` as Responses, Invocations, and Telegram so the cross-channel `isolation_key` continuity story (start a task via MCP from an IDE, follow up on Telegram, deliver the result on Teams via the Activity channel) is coherent. Their detailed API surfaces are deferred to dedicated follow-up specs. +`ChannelStreamUpdateHook` is an update hook, not a final-response sanitizer. Channels that use it for redaction or filtering must also apply equivalent policy to any final response they render. Channels choose whether the response is streaming before run hooks execute. 
-Companion specs cover the per-language API surface, information design, and sample code: +This keeps hook call conventions centralized while leaving protocol payload parsing and response formatting in channel packages. -- [SPEC-002 Python hosting core and pluggable channels](../specs/002-python-hosting-channels.md) -- *(future)* SPEC-00X .NET hosting core and pluggable channels +### State owned by v1 -## Appendix A — Comparison with Microsoft 365 Activity Protocol +`state_dir` is limited to host-owned local files for reset-session aliases and workflow checkpoint path derivation. It does not store linked identities, active-channel state, response-routing state, continuation records, durable runner queues, or delivery attempts. Those storage concerns belong to ADR-0028. -The [Microsoft 365 Agents SDK Activity Protocol](https://learn.microsoft.com/en-us/microsoft-365/agents-sdk/activity-protocol) (and its underlying [protocol-activity spec](https://github.com/microsoft/Agents/blob/main/specs/activity/protocol-activity.md)) is the closest existing Microsoft prior art for a multi-channel hosting layer. It powers Microsoft 365 Copilot, Copilot Studio, and the M365 Agents SDK across Teams, web chat, Slack-style connectors, and so on. This appendix contrasts the two designs so future readers know which problems we deliberately solve differently and why. +## Non-goals for v1 -### Mental model +The following are deliberately **not** part of the v1 contract: -| Concept | Activity Protocol | This ADR | -|---|---|---| -| **Inbound + outbound envelope** | A single `Activity` JSON envelope used in both directions, distinguished by `type` (`message`, `event`, `invoke`, `conversationUpdate`, `typing`, …). | Asymmetric: `ChannelRequest` for inbound, `HostedRunResult` / `HostedStreamResult` / `ChannelPush` for outbound. Protocol-native bytes never leave the channel package. | -| **Channel surface** | A `ChannelID` string (e.g. 
`msteams`) on every Activity; channels are connected via Bot Framework Connector Service or M365 Agents SDK adapters. | A `Channel` Protocol contributed by an in-process Python package. Each channel owns its own routes, parsing, auth validation, and protocol model — no central connector service. | -| **Adapter** | `Adapter` / `CloudAdapter` translates channel-native protocol ↔ Activity and runs the turn. Adapters are framework-supplied. | `Channel.contribute(context) -> ChannelContribution` returns Starlette routes + lifecycle. Channels are user-extensible packages. | -| **Turn** | `TurnContext` bundles incoming `Activity`, outbound `SendActivityAsync`, `TurnState`, and adapter. Per-turn, disposed at end. | Channel handler calls `await context.run(channel_request)` / `context.stream(...)`; reply is the awaited `HostedRunResult`. No per-turn state object beyond the request itself. Earlier draft had a `ChannelRunHookContext`; that wrapper was removed in favor of `(request, **kwargs)`. | -| **Identity** | `Activity.From` + `Activity.Recipient` carry per-turn identities; cross-channel identity unification is not in protocol. | `ChannelIdentity(channel, native_id, attributes)` extracted by the channel; host-level `IdentityResolver` maps to a stable `isolation_key`; `IdentityLinker` performs cross-channel link ceremonies. | -| **Conversation context** | `Activity.Conversation.id` is the per-channel conversation key; conversation history is the agent author's responsibility. | `ChannelSession(key, isolation_key)` resolves to an `AgentSession` host-side, with cross-channel continuity when channels emit the same `isolation_key`. | -| **Routing reply target** | Reply goes to `Activity.Conversation.id` on the originating channel. Cross-channel proactive sends require manually persisting a `ConversationReference`. 
| `ResponseTarget` (`originating`, `active`, `channel(name)`, `channels([...])`, `all_linked`, `none`) is first-class on every request, resolved by the host against last-seen channel state and the identity store. | -| **Background work** | No first-class `ContinuationToken`; long work uses proactive messaging via stored `ConversationReference`. | `ContinuationToken` + `host.run_in_background(...)` + per-channel poll routes are part of the host contract; result delivery follows `ResponseTarget`. | -| **Auth** | Bot Framework Auth: JWTs signed by the Bot Connector Service, verified by the SDK adapter. | Each channel implements its own validation against the upstream protocol (Telegram secret token, Bot Service JWT for the planned `ActivityChannel`, OAuth on identity-link routes); host can layer Starlette middleware. | -| **Activity types beyond messages** | First-class `ConversationUpdate`, `Event`, `Invoke`, `Typing`, plus 20+ others — channels emit them uniformly. | `ChannelRequest.operation` is a free-form discriminator (default `"message.create"`); other categories (typing indicators, membership change, structured `invoke` request/reply) are channel-package concerns and not modeled centrally. | -| **Outbound streaming** | `SendActivityAsync(typing)` + multiple `SendActivity` calls. | `HostedStreamResult` async iterator returned to the channel; channel decides how to render onto its protocol (SSE for Responses, long messages for Telegram, etc.). 
| +- cross-channel identity linking (`IdentityLinker`, `local_identity_link`, or `agent-framework-hosting-entra`), +- identity allowlists or authorization policy (`IdentityAllowlist`, `AuthPolicy`), +- response routing beyond the originating channel (`ResponseTarget`, active channel, specific linked channel, `all_linked`), +- push or payload codecs (`ChannelPush`, `ChannelPushCodec`), +- background/continuation delivery, +- durable task runners (`DurableTaskRunner`, `InProcessTaskRunner`), +- retry/replay policy (`RetryPolicy`), +- fan-out, multicast, or all-linked delivery, +- confidentiality tiers and `LinkPolicy`, and +- a host-level multi-agent router. -### Where we deliberately diverge +These areas are follow-up enhancements covered by [ADR-0028](0028-hosting-linking-multicast-enhancements.md). They are not prerequisites for shipping or using the v1 host. -1. **Asymmetric envelopes instead of a single `Activity`.** The Activity envelope is heavyweight and tightly coupled to Bot Framework conventions (`From`/`Recipient`/`Conversation`/`ServiceUrl`). For a hosting layer that fronts the Responses HTTP API, OpenAI-style invocations, and Telegram all at once, forcing every channel through a unified envelope would either dilute it (Responses-shaped JSON wedged into `Activity.Value`) or impose Bot Framework semantics on protocols that don't carry them (Responses has no per-message `From` to fill). The cost of asymmetry is that channels write their own outbound serialization; the gain is each channel stays idiomatic to its upstream protocol. +## Consequences -2. **In-process channel packages instead of a connector service.** Activity Protocol assumes a Bot Connector Service (cloud-hosted by Microsoft for Teams/Web Chat/etc.) sits between the channel and the agent. We target a single Starlette ASGI app the developer runs anywhere, with each channel package owning its own webhook/HTTP/SSE/WS surface. 
This is critical for the Responses and Invocations channels (which **are** the upstream protocol; there is no connector to terminate them) and removes the operational dependency for self-hosted deployments. The trade-off is that scaling, auth federation, and channel-update rollout become the operator's problem instead of being centralized. **Note:** the planned `ActivityChannel` (designed-in fast follow) does deliberately sit behind Azure Bot Service so we inherit the connector model for Teams/Web Chat/Slack — that channel is the *interop* path for Activity Protocol; the contrast above is about the rest of the channel set (Responses, Invocations, Telegram, A2A, MCP-tool) where there is no equivalent service and a direct in-process binding is the only sensible option. +Positive: -3. **Cross-channel identity is first-class.** Activity Protocol has no native concept of "this Teams user is the same person as this Telegram user." Bot Framework's User Authentication / OAuth Connection Settings handle per-channel sign-in but not the merge. Our `IdentityLinker` + host-managed identity store explicitly model the link ceremony and the resulting merge so a single `AgentSession` can span channels. This is required for the multi-channel scenarios this hosting layer was created to support (Scenarios 7 and 8 in SPEC-002) and is intentionally above what the Activity Protocol contract guarantees. +- The host/channel model can be implemented and tested without designing a security-sensitive identity graph. +- Existing and new channel packages can share one Starlette app, middleware stack, lifecycle, and target invocation path. +- Session continuity is explicit and debuggable: two channels share history only when they produce the same `isolation_key`. +- Hook invocation is centralized in the host, so channels do not each invent the call convention. -4. 
**`ResponseTarget` as a request-level field instead of an out-of-band proactive-send pattern.** Activity Protocol treats proactive cross-channel delivery as a deployment exercise (persist `ConversationReference`, restore later, call `continueConversationAsync`). We elevate it to a typed field on every request, consumed by the host. This makes "submit on Telegram, deliver result on Teams" a one-line authoring change instead of a custom pipeline, but it does require that channels capable of proactive delivery implement the `ChannelPush` capability. +Negative: -5. **No central activity-type taxonomy in v1.** `ChannelRequest.operation` is intentionally free-form. Activity Protocol's `Type` discriminator (`message`, `event`, `invoke`, `conversationUpdate`, `typing`, …) is a real strength — it lets generic middleware reason about non-message events uniformly. We accept the gap in v1 because (a) the Responses + Invocations + Telegram set has effectively one "type" (a message that wants a reply), and (b) modeling the long tail of typed events properly is a design exercise that should not block hosting v1. See **Possible influence on future iterations** below. +- Apps that need OAuth linking, allowlists, proactive messages, or multicast must continue to implement those behaviors outside the v1 host. +- Some richer cross-channel scenarios from the original design move to a separate decision and validation cycle. +- The host must document `isolation_key` trust clearly because it now provides the shared session boundary. -6. **No `TurnContext`-style per-turn bag.** Earlier drafts of this ADR proposed `ChannelRunHookContext` to play a similar role to `TurnContext`. It was removed in favor of `def hook(request, **kwargs) -> ChannelRequest` because the only consumers (run hooks) don't need most of what `TurnContext` provides, and forcing a wrapper made simple hooks awkward to write inline. 
Channels that need adapter-style state can compose it inside their own `Channel` implementation. +## Validation Gates -### Where Activity Protocol could influence future iterations +Before this ADR is accepted: -- **Typed event taxonomy.** Adopting a small enum for `ChannelRequest.operation` modeled on Activity Protocol's set (`message`, `event`, `conversationUpdate`, `invoke`, `typing`) would let generic middleware (rate limit, audit, content moderation) reason about channel traffic uniformly. This is additive and could land alongside the v1.x telemetry work without breaking the free-form string field. -- **Outbound `Activity`-style envelope as a serialization target.** This is the planned `ActivityChannel` (designed-in fast follow) — it maps `HostedRunResult` ↔ `Activity` inside the channel package and forwards through Azure Bot Service. The hosting contract was designed so this binding requires no new host primitives. -- **`ConversationReference`-style proactive seed.** When `ResponseTarget.active` cannot find a recently seen channel, falling back to a stored `ConversationReference`-equivalent (last-known channel + last-known native id, persisted in the identity store) would mirror Bot Framework's proactive-message recovery story. This is implicit in the v1.x identity-store work (Open Question 9). -- **Invoke-style synchronous request/reply.** Activity Protocol's `Invoke` (`task/fetch`, `task/submit`) is a useful precedent for what a typed `InvocationsChannel.invoke()` operation could look like beyond "post one message, get one reply" — particularly for Teams adaptive-card submit flows that the `ActivityChannel` will eventually need to host. +- A sample can expose one target on multiple channels with one `AgentFrameworkHost` and no handwritten Starlette route composition. +- Built-in channel tests prove that routes, commands, startup, and shutdown callbacks are contributed by channels and aggregated by the host. 
+- Session tests prove that identical `ChannelSession.isolation_key` values resolve to the same cached `AgentSession`, and `reset_session` rotates that mapping. +- Channel tests prove that each channel renders only its own originating response; there is no host-level push, multicast, or active-channel delivery path. +- Workflow tests or samples use an explicit `checkpoint_location`. +- Foundry isolation middleware is documented and covered by integration or contract tests, including the non-Foundry case where raw isolation headers are ignored. +- The v1 API and packages do not expose the removed symbols or packages listed in [Non-goals for v1](#non-goals-for-v1). +- The Python spec is updated to match this simplified contract and uses "public", "stable", or "released" terminology for Agent Framework APIs. -### Summary +## More Information -Activity Protocol optimizes for **a single Microsoft-operated abstraction over many client surfaces**, with a uniform envelope, a connector service in the middle, and per-channel adapters supplied by the SDK. This ADR optimizes for **a self-hosted, in-process Python (and later .NET) layer that fronts both LLM-shaped HTTP protocols and human-chat channels**, with each channel owning its idiomatic protocol and the host owning identity, sessions, and cross-channel routing. The two designs solve overlapping but distinct problems; nothing in this ADR precludes a future Activity Protocol channel package, and several of Activity Protocol's primitives (typed event taxonomy, conversation reference, invoke) are tracked as candidate future enhancements. 
+- Python v1 specification: [SPEC-002](../specs/002-python-hosting-channels.md) +- Follow-up linking and multicast ADR: [ADR-0028](0028-hosting-linking-multicast-enhancements.md) diff --git a/docs/decisions/0028-hosting-linking-multicast-enhancements.md b/docs/decisions/0028-hosting-linking-multicast-enhancements.md new file mode 100644 index 00000000000..390881b7ea6 --- /dev/null +++ b/docs/decisions/0028-hosting-linking-multicast-enhancements.md @@ -0,0 +1,132 @@ +--- +status: proposed +contact: eavanvalkenburg +date: 2026-06-11 +deciders: eavanvalkenburg +--- + +# Hosting linking and multicast enhancements + +## Context and Problem Statement + +[ADR-0027](0027-hosting-channels.md) defines the minimal v1 hosting core: originating-channel responses, explicit `ChannelSession.isolation_key`, and no host-level identity linking, push, multicast, background delivery, or durable runners. + +This ADR tracks the richer cross-channel behaviors that were removed from v1. These enhancements are **follow-up work** and are **not prerequisites** for shipping, using, or stabilizing the v1 host/channel core. + +## Decision Drivers + +- Cross-channel continuity must not create accidental cross-user, cross-tenant, or cross-channel data leaks. +- Non-originating delivery must be observable, idempotent, retryable, and supportable. +- Protocol payloads must remain channel-native while still being safe to persist and replay. +- App authors need opt-in policy controls, not hidden defaults. +- The enhancement stack should layer on top of the v1 host without reshaping the minimal channel contract. + +## Enhancement Areas + +The follow-up design should cover these capabilities together because they share identity, storage, delivery, and replay concerns: + +- **Cross-channel identity linking** — a user can connect multiple `ChannelIdentity` values to one channel-neutral `isolation_key`. 
+- **Authorization and allowlist policy** — channels or hosts can require verified identity, allow specific native identities or claims, and deny unknown callers. +- **Non-originating response delivery** — a run can respond somewhere other than the request's originating protocol when explicitly configured. +- **Active-channel routing** — delivery can target the most recently observed linked channel for an `isolation_key`. +- **Multicast / all-linked delivery** — delivery can fan out to every linked channel or a selected set. +- **Background runs and continuation tokens** — long-running requests can return immediately and complete later, with a polling/status fallback. +- **Durable delivery runners** — delivery work can survive process restarts and support dead-letter handling. +- **Retry and replay semantics** — delivery attempts are bounded, deduplicated, and safe to replay. +- **Payload serialization** — channel-specific payloads can be persisted, redacted, versioned, and reconstructed without losing protocol fidelity. + +Candidate API names from the broader design (`IdentityLinker`, `IdentityAllowlist`, `AuthPolicy`, `ResponseTarget`, `ChannelPush`, `ChannelPushCodec`, `DurableTaskRunner`, `InProcessTaskRunner`, `RetryPolicy`, `LinkPolicy`) remain design vocabulary for this ADR. They are not approved v1 APIs. + +## Considered Options + +### Option A — Leave all behavior to applications + +Applications implement linking, authorization, push, retry, and serialization independently. + +- Good: the hosting core stays very small. +- Neutral: advanced apps can still build what they need. +- Bad: every app must solve the same security and delivery problems, likely inconsistently. + +### Option B — Add the full enhancement stack to v1 + +The first host release includes linking, authorization, active channel, multicast, background runs, durable runners, and codecs. + +- Good: the original cross-channel experience is available immediately. 
+- Neutral: samples can demonstrate rich end-to-end flows. +- Bad: v1 becomes security-sensitive, storage-heavy, and harder to stabilize. + +### Option C — Layer opt-in enhancement packages after v1 + +Ship the minimal host first, then add linking, authorization, and delivery packages behind explicit configuration. + +- Good: v1 remains simple while leaving room for a reviewed, supportable enhancement stack. +- Neutral: apps that need advanced delivery wait for follow-up packages. +- Bad: the first release does not satisfy proactive or all-linked scenarios. + +### Option D — Build only platform-specific integrations + +Implement linking and proactive delivery separately in Telegram, Activity Protocol, Discord, and future channels. + +- Good: each package can match its protocol exactly. +- Neutral: some shared abstractions may emerge later. +- Bad: cross-channel behavior becomes fragmented and hard to reason about. + +## Decision Outcome + +Proposed direction: **Option C — layered opt-in enhancement packages after v1**. + +The minimal host remains the foundation. Follow-up packages may add linking, authorization, delivery, and durable execution, but must be explicitly enabled and must pass the validation gates below before becoming part of the public contract. + +## Safety Requirements + +### Threat model + +The design must account for: + +- spoofed channel-native identities, +- stolen or replayed link challenges, +- cross-tenant or cross-confidentiality data leakage, +- unsolicited proactive messages, +- malicious payloads persisted for replay, +- denial-of-service through fan-out or retry storms, and +- privacy leakage through logs, metrics, or support tooling. + +Required mitigations include verified identity claims where available, signed and expiring link challenges, explicit user consent, per-channel capability checks, default-deny policy options, tenant partitioning, and uninformative denial messages on shared channels. 
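The "signed and expiring link challenges" mitigation above can be sketched with nothing but the standard library. This is an illustrative sketch, not an approved API from this ADR: `SECRET`, `issue_challenge`, and `verify_challenge` are hypothetical names, the token layout is an assumption, and a real linker would also need single-use tracking in storage to stop replay inside the validity window.

```python
from __future__ import annotations

import base64
import hashlib
import hmac
import time

# Hypothetical signing key; a real deployment would load and rotate this secret.
SECRET = b"demo-secret-rotate-in-production"


def issue_challenge(native_id: str, ttl_seconds: int, now: float | None = None) -> str:
    """Return a token binding a channel-native id to an expiry time."""
    issued = time.time() if now is None else now
    expires = int(issued) + ttl_seconds
    payload = f"{native_id}|{expires}".encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode()
        + "."
        + base64.urlsafe_b64encode(signature).decode()
    )


def verify_challenge(token: str, now: float | None = None) -> str | None:
    """Return the native id if the token is authentic and unexpired, else None."""
    current = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        signature = base64.urlsafe_b64decode(signature_b64)
    except ValueError:
        # Malformed token: wrong number of parts or invalid base64.
        return None
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(signature, expected):
        return None
    native_id, _, expires = payload.decode().rpartition("|")
    if current >= int(expires):
        return None
    return native_id
```

The verifier returns `None` for every failure mode, which lines up with the "uninformative denial messages" requirement: the caller cannot tell an expired challenge from a forged one.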
+ +### Idempotency and replay + +Exactly-once delivery is not a realistic guarantee. The design must provide: + +- stable run, continuation, and delivery-attempt identifiers, +- channel-level idempotency keys where protocols support them, +- bounded retry with jitter and explicit terminal states, +- replay windows and expiration, +- duplicate suppression for persisted attempts, and +- clear semantics for "delivered", "accepted by platform", and "observed by user". + +### Storage + +Enhancement storage must stay distinct from v1 `AgentSession` history and workflow checkpoints unless an implementation deliberately backs them with the same physical store. + +Stored data should be schema-versioned, minimized, encrypted or otherwise protected as appropriate, and partitioned by tenant/project. Link records, continuation records, active-channel state, delivery attempts, dead letters, and serialized payloads need independent TTL and deletion policies. + +### Observability and support + +The design must include structured logs, traces, and metrics for link attempts, authorization decisions, delivery scheduling, retries, replay, and dead-letter outcomes. Logs must avoid message content and sensitive identity claims by default. Operators need a way to inspect, revoke, replay, or purge stuck records safely. + +## Validation Gates + +Before these enhancements are accepted: + +- A reviewed threat model covers identity linking, authorization, non-originating delivery, multicast, and replay. +- Cross-channel linking tests prove a verified identity can link two channels and that unlink/deny paths do not leak information. +- Authorization tests cover native-id allowlists, verified-claim allowlists, default-deny behavior, and misconfiguration failures. +- Delivery tests cover originating-only, specific-channel, active-channel, selected-channel, and all-linked routing. 
+- Background/continuation tests cover polling fallback, cancellation or expiration, process restart, retry, and dead-letter behavior. +- Codec tests prove payloads are versioned, redacted where needed, backward compatible, and rejected safely when unknown. +- Multicast tests prove fan-out is bounded, independently retried, and idempotent per destination. +- Observability tests or manual validation prove support operators can correlate a request to delivery attempts without exposing sensitive content. + +## Relationship to ADR-0027 + +ADR-0027 remains valid without any of these enhancements. This ADR extends the hosting model only after the safety, storage, and support requirements above are satisfied. diff --git a/docs/specs/002-python-hosting-channels.md b/docs/specs/002-python-hosting-channels.md index af7ae6cd464..fbfaa0a8145 100644 --- a/docs/specs/002-python-hosting-channels.md +++ b/docs/specs/002-python-hosting-channels.md @@ -1,2276 +1,320 @@ --- status: proposed contact: eavanvalkenburg -date: 2026-04-24 +date: 2026-06-11 deciders: eavanvalkenburg --- # Python hosting core and pluggable channels -## What are the business goals for this feature? +## Scope -Give Python app authors one low-level, Starlette-based hosting surface that can expose a single **hostable target** — either a `SupportsAgentRun`-compatible agent **or** a `Workflow` — on one or more channels (Responses API, Invocations API, Telegram, future A2A, MCP-tool, Activity Protocol via Azure Bot Service — which fronts Teams, Web Chat, Slack, …— WhatsApp, optional future direct-to-Teams, etc.) without requiring them to hand-build protocol routing or server glue per protocol, **and** let an end user start a conversation on one channel (e.g. Telegram on their phone) and seamlessly continue it on another (e.g. Teams at their desk via the Activity channel) against the same target and the same conversation history. 
+This specification is the Python implementation plan for [ADR-0027](../decisions/0027-hosting-channels.md). It documents the simplified v1 host/channel contract only. -This consolidates the protocol-specific hosting layers that exist today (`agent-framework-foundry-hosting`, `agent-framework-ag-ui`, `agent-framework-a2a`, `agent-framework-devui`) into a shared composable model where: +The v1 contract is: -- a host owns the ASGI app and channels own protocol shape, -- session identity is **channel-neutral** — the host resolves a session from a channel-supplied `isolation_key` (e.g. a stable user identity) so two channels mounted on the same host can resolve to the **same** `AgentSession` for the same end user, and a future pluggable session store extends that continuity across hosts and processes, and -- channel-native identity is **mapped, not assumed** — the host owns a first-class `IdentityResolver` seam (channel-native id → `isolation_key`) and an `IdentityLinker` seam (well-known connect ceremony — OAuth, MFA, signed one-time code — to associate a new channel-native id with an existing `isolation_key`), so cross-channel continuity does not depend on each channel's user namespace happening to align, and -- response delivery is **decoupled from request origin** — every `ChannelRequest` carries a `ResponseTarget` (`originating` (default), `active` for the user's most recently used channel, a specific channel id, all linked channels, or `none` for background-only). Background/asynchronous runs are first-class via a `ContinuationToken` returned by `host.run_in_background(...)` so a user can submit a long-running request on one channel and receive the result on another (or poll by continuation token), and -- channels can be assigned different **confidentiality tiers** so two channels on one host can share an agent without sharing a session — e.g. 
Teams (corporate, allowed to access internal resources) and Telegram (public) can run against the same target while remaining session-isolated, with a host-level `LinkPolicy` that decides which confidentiality tiers may be linked (and includes an explicit "deny all" variant for hosts that want no cross-channel continuity at all). Running two separate hosts is always a valid alternative; the per-tier policy exists for cases where one shared host with two policy-isolated tiers is preferred, and -- **multi-user surfaces** (Telegram groups, supergroups, forum topics; Teams group chats and team channels) are first-class — the channel layer separates user identity from conversation locator, defaults to safe behavior (`mention_only` addressing, `per_user_per_conversation` session scoping, link ceremonies redirected to DMs), and exposes per-channel options to opt into shared-context modes when desired (see [Multi-user conversations](#multi-user-conversations-telegram-groups-teams-group-chats-and-channels)). +- `AgentFrameworkHost` owns one Starlette app, one hostable target, and one or more channels. +- A hostable target is either a `SupportsAgentRun`-compatible agent or a `Workflow`. +- Channels contribute routes, middleware, commands, and lifecycle callbacks. +- Channels parse protocol-native input into `ChannelRequest`. +- Channels render their own originating response. +- Session continuity is explicit: a channel supplies `ChannelSession(isolation_key=...)`, and the host resolves/caches an `AgentSession` for that key. +- The host invokes `ChannelRunHook` and `ChannelResponseHook`; channels provide hook configuration and protocol context. -We know we're successful when: +The host does not link identities, route responses to other channels, run background continuations, or multicast in v1. Those enhancements are tracked in [ADR-0028](../decisions/0028-hosting-linking-multicast-enhancements.md). 
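The session-continuity rule in the v1 contract (a channel supplies `ChannelSession(isolation_key=...)`, and the host resolves and caches an `AgentSession` for that key) can be illustrated with a minimal sketch. The classes below are stand-ins written for this example, not the framework's types: only the names `ChannelSession`, `AgentSession`, `isolation_key`, and `reset_session` come from the spec, while `SessionCache` and its method signatures are assumptions.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ChannelSession:
    """Stand-in for the spec's ChannelSession; only isolation_key is modeled."""

    isolation_key: str


@dataclass
class AgentSession:
    """Stand-in for the framework's AgentSession; holds only a session id."""

    session_id: str = field(default_factory=lambda: uuid.uuid4().hex)


class SessionCache:
    """Hypothetical host-side cache: one AgentSession per isolation_key."""

    def __init__(self) -> None:
        self._sessions: dict[str, AgentSession] = {}

    def resolve(self, channel_session: ChannelSession) -> AgentSession:
        # Identical isolation keys resolve to the same cached session,
        # regardless of which channel produced the request.
        key = channel_session.isolation_key
        if key not in self._sessions:
            self._sessions[key] = AgentSession()
        return self._sessions[key]

    def reset_session(self, isolation_key: str) -> AgentSession:
        # Rotate the mapping: later requests for this key get a fresh session.
        self._sessions[isolation_key] = AgentSession()
        return self._sessions[isolation_key]


cache = SessionCache()
from_telegram = cache.resolve(ChannelSession(isolation_key="user-42"))
from_responses = cache.resolve(ChannelSession(isolation_key="user-42"))
other_user = cache.resolve(ChannelSession(isolation_key="user-7"))
rotated = cache.reset_session("user-42")
```

Because the key is the only shared boundary in this model, whoever chooses the `isolation_key` decides who shares history, which is why the host has to document how far that key can be trusted.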
-- after the agent is created, a basic multi-channel sample requires only one `AgentFrameworkHost`, channel objects, and one `host.serve(...)` call — no handwritten protocol routes and no per-protocol server bootstrap. The hosting core itself takes no dependency on `agentserver`; individual channel packages MAY depend on it where it provides directly reusable building blocks (e.g. `agent-framework-foundry-hosting` builds on the Foundry response-store SDK that ships in `azure.ai.agentserver`), -- a single `AgentFrameworkHost` configured with two channels (e.g. Telegram + a future Activity Protocol channel — Teams via Azure Bot Service) can be exercised by one end user across both channels and observe one continuous conversation, -- an end user known on one channel can run a host-provided `link`/`connect` command on a second channel, complete an OAuth (or MFA, or one-time-code) ceremony, and see subsequent messages on the second channel resolved against the same `AgentSession` as the first, **and** -- a user can submit a long-running request on Telegram with `response_target="active"`, switch to Teams (via the Activity channel), and receive the result there as a proactive message — with a poll route as a fallback for callers that prefer polling. +## Goals -## Problem Statement +- Let an app expose one agent or workflow on multiple protocols without handwritten Starlette composition. +- Keep protocol parsing and response formatting inside channel packages. +- Provide one session-resolution path shared by all channels. +- Keep the channel authoring surface small enough for new channels to implement. +- Preserve full-fidelity agent and workflow results until a channel decides how to render them. -### How do developers solve this problem today? +## Non-goals for v1 -Today, every protocol surface is its own package with its own server. 
A developer who wants to expose one agent over both the Responses API and a webhook channel has to stand up two separate hosts and stitch them into one ASGI app by hand: +The following are removed from the v1 implementation pass: -```python -# Today: developer composes two protocol-specific hosts manually -import os -import uvicorn -from starlette.applications import Starlette -from starlette.routing import Mount - -from agent_framework import Agent -from agent_framework.openai import OpenAIChatClient -from agent_framework.foundry_hosting import ( - ResponsesHostServer, - InvocationsHostServer, -) - -agent = Agent( - name="WeatherAgent", - instructions="You are a helpful weather agent.", - client=OpenAIChatClient(model="gpt-4.1-mini"), -) - -# Two separate, protocol-specific host wrappers, each with their own -# request/session/event mapping inside. -responses_host = ResponsesHostServer(agent=agent) -invocations_host = InvocationsHostServer(agent=agent) - -# Manually mount each into a Starlette app so they share a process. -app = Starlette(routes=[ - Mount("/responses", app=responses_host.app), - Mount("/invocations", app=invocations_host.app), -]) - -# Bring up the server by hand. -if __name__ == "__main__": - uvicorn.run(app, host="localhost", port=8000) -``` - -Adding a Telegram bot to the same agent today means leaving this stack entirely: spinning up a separate process, installing a Telegram SDK, writing the polling/webhook loop, manually translating updates into agent calls, and wiring command handlers (`/start`, `/new`, `/cancel`, ...) and `set_my_commands(...)` registration by hand — none of which is reusable across other message channels. - -### Why does this problem require a new hosting abstraction? - -The gap is between **owning a hostable target** (a `SupportsAgentRun` agent or a `Workflow`) and **operationalizing it on multiple channels**. 
Agent Framework already provides agents, workflows, sessions, run inputs, response/update streaming, the `SupportsAgentRun` execution seam, and the `Workflow` execution seam. What's missing is a generic host that: - -1. Owns one Starlette app and one set of lifecycle hooks. -2. Lets channels contribute routes, middleware, commands, and startup/shutdown without protocol leakage into the host. -3. Standardizes how protocol requests become agent invocations (input, options, session, streaming) and how agent results flow back out. -4. **Resolves a session from a channel-neutral `isolation_key`** so two channels mounted on the same host can converge on the same `AgentSession` for the same end user — enabling cross-channel chat continuity (start on Telegram, continue on Teams) without per-channel session bookkeeping. -5. Provides a first-class extension seam for webhook/message channels with native command catalogs (per PR #5393 Telegram sample). +- `IdentityLinker`, `IdentityAllowlist`, `AuthPolicy`, and `LinkPolicy` +- `ResponseTarget`, active-channel routing, `all_linked`, fan-out, and multicast +- `ChannelPush` and `ChannelPushCodec` +- `DurableTaskRunner`, `InProcessTaskRunner`, and `RetryPolicy` +- continuation tokens and background delivery +- confidentiality tiers +- `agent-framework-hosting-entra` +- `local_identity_link` -The current `agentserver`-based hosts are valuable prior art but sit too high in the stack — they encode protocol ownership at the host level. The new generic core learns from their behavior without depending on them; individual channel packages may still depend on the parts of `agentserver` that ship reusable building blocks (notably the Foundry response-store SDK). +These are follow-up design topics, not hidden requirements of the v1 host. 
-## Non-Goals / Relationship to existing hosting packages +## Packages -The hosting core is deliberately **not** a replacement for the existing protocol packages in their first form, and it is not a multi-agent router. Hosting core, `ag-ui`, `a2a`, `devui`, and `foundry-hosting` solve adjacent but distinct problems: - -| Dimension | Existing protocol packages | `agent-framework-hosting` | +| Package | Import surface | Contents | |---|---|---| -| **Mental model** | One package = one protocol surface, owns its own server | One host owns ASGI app; channels plug protocols in | -| **Scope** | Protocol-specific request/session/event mapping | Generic host + channel contract; protocol logic lives in channel packages | -| **Composition** | One protocol per process or per Mount | Many channels per host, shared middleware, lifecycle, session resolution | -| **Multi-agent** | Out of scope per package | **No.** One host = one agent. Future work. | - -**Explicit non-goals:** -- Migrating `ag-ui`, `a2a`, or `devui` onto the new core in the first implementation. -- Standardizing a persistent session storage contract across all channels. -- Hosting multiple agents behind one router in this first design. -- Designing every detail of WhatsApp, the full Activity Protocol surface, or a future direct-to-Teams channel now (only Telegram is concretely targeted, informed by PR #5393; Activity Protocol via Azure Bot Service, A2A, MCP-tool, and Teams-native via `microsoft/teams.py` are designed-in fast follow — see reqs #25–#28). -- Replacing protocol-specific serializers with one generic event model. -- Taking a runtime or package dependency on the legacy protocol-specific hosts (e.g. `ResponsesAgentServerHost`, `InvocationAgentServerHost`) from the new hosting core. Channel packages MAY depend on lower-level parts of `azure.ai.agentserver` where it ships reusable building blocks (e.g. the Foundry response-store SDK consumed by `FoundryHostedAgentHistoryProvider`). 
- -**Boundary rule:** If you need protocol-specific event semantics, codecs, or signature validation, that lives in the channel package. The host owns ASGI, lifecycle, session resolution, and the call into the target's execution seam (`SupportsAgentRun.run(...)` for agents, the workflow execution seam for workflows). - -## Requirements - -After we deliver `agent-framework-hosting` and its first channel packages, users will be able to: - -1. **Compose one host with one or more channels** — instantiate `AgentFrameworkHost(target=..., channels=[...])` where `target` is either a `SupportsAgentRun`-compatible agent or a `Workflow`, and get one Starlette application with all channels mounted. -2. **Expose the Responses API** — add `ResponsesChannel()` and serve `/responses` without writing protocol handlers. -3. **Expose the Invocations API** — add `InvocationsChannel()` and serve `/invocations` without writing protocol handlers. -4. **Expose a Telegram bot** — add `TelegramChannel(bot_token=...)` with either `polling` or `webhook` transport, and register native commands declaratively with `ChannelCommand`. -5. **Override endpoint paths** — pass `path="/public/responses"` to move a channel endpoint, or `path=""` when an external platform must call the app root. -6. **Customize per-request invocation behavior** — pass a `run_hook` to any built-in channel. The hook receives the channel-produced `ChannelRequest` (the host-neutral envelope each channel builds from its own protocol parsing — see [Key Types](#key-types)) and returns a possibly-modified `ChannelRequest`. Use it to validate, rewrite, or strip channel-derived options (e.g. enforce or drop `temperature`, override `session_mode`) before the host calls the target's execution seam. It is also the **adapter** that reshapes the channel's default `ChannelRequest.input` into the typed inputs a workflow target requires. -7. 
**Control session use per request** — built-in channels set `ChannelRequest.session_mode` to `auto`, `required`, or `disabled`; the host honors that when resolving `AgentSession`. -8. **Partition sessions by isolation key** — channels populate `ChannelSession.isolation_key` (user, tenant, chat, …) using hosted-agent terminology. -9. **Resolve to the same session across channels on one host** — two channels mounted on the same `AgentFrameworkHost` that produce the same `isolation_key` (e.g. a stable user identity mapped from each channel's native identifier) resolve to the same `AgentSession`, so an end user starting a chat on Telegram can continue it on Teams against the same conversation history without per-channel session bookkeeping. -10. **Map channel-native identity into `isolation_key`** — every channel has its own user namespace (Telegram `chat_id`, Teams AAD object id, WhatsApp phone, Slack user id). The host accepts a host-level `identity_resolver` callable that maps a `ChannelIdentity(channel_id, native_id, attributes)` into an `isolation_key` (or `None` if unknown). Channels publish the native identity they observed; the resolver decides whether it maps to an existing user. -11. **Link a new channel to an existing identity through a well-known ceremony** — the host accepts a host-level `identity_linker` (e.g. `OAuthIdentityLinker(...)`, `OneTimeCodeIdentityLinker(...)`) which contributes its own routes/lifecycle and exposes a `begin(channel_identity) -> LinkChallenge` / `complete(challenge_id, proof) -> isolation_key` flow. Channels surface a `link`/`connect` `ChannelCommand` that delegates to the linker; on success the resolver subsequently maps the new channel-native identity to the existing `isolation_key`. Mechanism (OAuth provider, signed one-time code, future linker types) is pluggable; the contract is fixed. -12. 
**Route the response to a chosen channel** — `ChannelRequest.response_target` accepts `ResponseTarget.originating` (default — synchronous response on the originating channel), `ResponseTarget.active` (the channel most recently observed for the resolved `isolation_key`), `ResponseTarget.channel("activity")` (specific channel id, recipient resolved from the link store), `ResponseTarget.channels([...])` (a list), `ResponseTarget.identities([ChannelIdentity(...)])` (one or more **explicit channel-native identities** — bypasses the link store, used when the caller already knows the recipient's channel-native id), `ResponseTarget.all_linked` (every channel where this `isolation_key` is known), or `ResponseTarget.none` (background-only — caller must poll the `ContinuationToken`). When the target is not the originating channel, the host delivers via the destination channel's `ChannelPush` capability. -13. **Push proactively from a channel** — channels that can deliver outbound messages without a prior request (Telegram bot proactive message, Activity Protocol proactive message via Azure Bot Service, webhook callbacks, SSE broadcasts) implement an optional `ChannelPush` capability on top of the base `Channel` protocol. Channels without push can only be the `originating` target. -14. **Submit background runs as a first-class operation** — `host.run_in_background(request) -> ContinuationToken` returns immediately with an opaque, URL-safe `token` and a status (`queued` | `running` | `completed` | `failed`). The host invokes the target asynchronously and, when complete, both delivers the result via the configured `ResponseTarget` push **and** records it against the token so callers can poll `host.get_continuation(token)`. Built-in channels expose poll routes (`/responses/{continuation_token}`, `/invocations/{continuation_token}`) that surface this without app code. 
Continuation tokens are persisted via a `HostStateStore` (file-based by default — see [Host state storage](#host-state-storage)) so background runs survive host restarts. -15. **Track the active channel per `isolation_key`** — the host records `(isolation_key, last_seen_channel, last_seen_at)` on every successfully resolved request so `ResponseTarget.active` resolves correctly. Apps can override in the `run_hook` (e.g. force `active` to a specific channel for a particular request). -16. **Add Starlette middleware at the host level** — pass `middleware=[Middleware(CORSMiddleware, ...)]` to `AgentFrameworkHost`. -17. **Serve with one call** — call `host.serve(host="localhost", port=8000)` without manually importing `uvicorn`, while `host.app` remains the canonical ASGI surface for any other server (Hypercorn, Daphne, Granian, Gunicorn+uvicorn workers). -18. **Author new channels** — implement the `Channel` protocol, return a `ChannelContribution` with routes/middleware/commands/lifecycle hooks, and call `context.run(...)` or `context.stream(...)` to invoke the agent. -19. **Target any `SupportsAgentRun` or `Workflow`** — host an `Agent`, `A2AAgent`, or a `Workflow`; the `run_hook` is the seam for adapting the channel's default `ChannelRequest` into the target-specific input shape (free-form messages for agents, typed inputs for workflows). -20. **Contribute WebSocket endpoints from a channel** — `ChannelContribution.routes` accepts both `Route` (HTTP) and `WebSocketRoute` (WS); the channel codec is responsible for framing and the same `run_hook` / default mapping pipeline applies. Built-in `ResponsesChannel` exposes a WebSocket transport (default `/responses/ws`, controlled by `transports=("http", "websocket")`) alongside its HTTP+SSE transport, anticipating the OpenAI Responses WebSocket transport. The host requires an ASGI server with WebSocket scope support (Uvicorn, Hypercorn, Daphne, Granian). -21. 
**Mix channels of different confidentiality tiers on one host** — every `Channel` may declare an opaque `confidentiality_tier: str | None` (e.g. `"corp"`, `"public"`). The host's `LinkPolicy` decides which `(source_tier, target_tier)` pairs may share an `isolation_key` (link) and which may be `ResponseTarget` source/destination for one another (deliver). Built-in policies (`AllowAllLinks` (default), `SameConfidentialityTierOnly`, `ExplicitAllowList`, `DenyAllLinks`) and the policy contract are defined in [LinkPolicy](#linkpolicy-and-confidentiality_tier). Cross-tier link attempts are refused with a typed error; cross-tier deliveries are dropped — so two tiers can share **an agent target** on one host while remaining strictly session-isolated. -22. **Choose an authorization profile per channel** — every channel that emits a `ChannelIdentity` composes from two orthogonal parameters, `require_link: bool` and `allowlist: IdentityAllowlist | None`, producing the three named profiles **open** (default), **forced-link** (must authenticate, any authenticated identity accepted), and **allowlist** (only listed identities — keyed on either the channel-native id pre-link or on a verified IdP claim post-link). Built-in allowlists (`NativeIdAllowlist`, `LinkedClaimAllowlist`, plus `AnyOfAllowlists` / `AllOfAllowlists` combinators) and the unified host seam (`host.authorize(...)` → `AuthorizationOutcome` of `Allowed` / `LinkRequired` / `Denied`) are defined in [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam). The host applies a `default_allowlist` to every channel whose `allowlist` is left at the sentinel `"inherit"`, so app authors can lock down a whole bot in one place. Configuration combinations that would silently deny every user (e.g. `LinkedClaimAllowlist` on a channel with `require_link=False` and no native verified claims) are rejected at host startup with a typed `ChannelConfigurationError`. 
- -### v1 Fast Follow -23. **Generic auth helpers** — shared middleware for common channel auth patterns (HMAC signature, bearer token). -24. **Pluggable host state store** — interface for cross-host persistence of `ContinuationToken`s, identity-link grants, and last-seen `(isolation_key, channel)` records. Default implementation in v1 is **file-based** (`FileHostStateStore`); `InMemoryHostStateStore` is available for tests. A future `CosmosHostStateStore` / `SQLHostStateStore` would extend cross-channel chat continuity (req #9), background runs (req #14), and identity-link continuity (req #11) beyond a single host/process — but the v1 file-based default already survives host restarts on a single node. Same protocol covers session aliasing where applicable. -25. **First-party identity linker helpers** — concrete `OAuthIdentityLinker` (with provider presets) and `OneTimeCodeIdentityLinker` (cross-channel code exchange) shipped as opt-in helpers on top of the `IdentityLinker` contract. Investigation of additional first-party linker types tracked as a follow-up. -26. **`A2AChannel` package** (`agent-framework-hosting-a2a`) — exposes the hostable target over the Agent-to-Agent protocol so other agents can consume it as a peer. Caller-supplied-session family (alongside Responses and Invocations): A2A's per-conversation id maps to `ChannelSession.key`; the calling agent's identity (e.g. its A2A agent card / signed JWT) flows through `IdentityResolver`; structured replies fit the existing `ChannelRequest` + `ResponseTarget` envelope. No new host primitives required — only the protocol binding and package. -27. **`MCPToolChannel` package** (`agent-framework-hosting-mcp`) — exposes the hostable target as a **Model Context Protocol tool** so MCP clients (other agents, IDE tooling) can invoke it. 
Same caller-supplied-session family: the MCP `tool/call` carries the conversation key into `ChannelSession.key`; the MCP client identity flows through `IdentityResolver`; the tool result is the target's response. Streaming MCP tools map onto the host's existing streaming response delivery; long-running MCP tools map onto background runs with `ContinuationToken` when the work outlasts a single tool-call round-trip. -28. **`ActivityChannel` package** (`agent-framework-hosting-activity`) — exposes the hostable target behind **Azure Bot Service**, which fronts Teams, Web Chat, Slack-style connectors, and the rest of the Bot Framework / M365 connector ecosystem. Provides **native translations** between Activity Protocol objects (`Activity`, `ConversationReference`, adaptive cards, `Invoke` activities, …) and the host's `ChannelRequest` / `ChannelResponse` types — so the contract is **explicit** rather than implicit through a generic Invocations endpoint. Host-tracked-session family: Bot Service authenticates with a JWT carrying the AAD object id, the channel populates `ChannelIdentity` from `from.aadObjectId`, the host's per-`isolation_key` alias decides which `AgentSession` to resolve, and `host.reset_session(...)` is reachable via a Teams slash command or adaptive-card action. `ChannelPush` is implemented over Bot Service's `ConversationReference` + `continueConversationAsync` pattern. Naming this channel **Activity** rather than **Teams** keeps a `TeamsChannel` name available for the Teams-native channel below (req #29) and for any future direct-to-Teams transport. -29. **`TeamsChannel` package** (`agent-framework-hosting-teams`) — Teams-native channel built on the MIT-licensed [`microsoft/teams.py`](https://github.com/microsoft/teams.py) SDK (`microsoft-teams-apps`, `microsoft-teams-api`, `microsoft-teams-cards`). 
Where `ActivityChannel` (req #28) targets the **generic** Activity Protocol surface across all Bot Service-fronted channels, `TeamsChannel` exploits **Teams-specific affordances** that the generic Activity Protocol does not surface natively: - - **Adaptive Cards** via the typed `microsoft-teams-cards` builder, attached as tool side-effects through a `ContextVar`-scoped pending-cards collector consumed by the channel's result projector. - - **Streamed assistant replies** via `ctx.stream.emit(chunk)` — the channel projects `agent.run(..., stream=True)` chunks directly. - - **Teams "AI generated" badge**, **built-in feedback controls + custom feedback form**, **suggested-prompt chips** (`SuggestedActions` / `CardAction(IM_BACK)`), **inline citations** (`CitationAppearance` populated from a `FunctionMiddleware` that assigns stable positions to tool-result sources). - - **Modal Dialogs** (multi-step forms) with submission events routed through the host's normal request pipeline. - - **Message Extensions** — action commands (modal forms invoked from the compose box / message context menu), search commands (typed-ahead inline cards), and link unfurling (preview cards on URL paste). Each is exposed via the same `ChannelCommand` model as Telegram-style slash commands. - - **Proactive, targeted (ephemeral), and threaded messages** via `app.send(conversation_id, MessageActivityInput(...))`, `with_recipient(account, is_targeted=True)`, and `to_threaded_conversation_id(conversation_id, message_id)` — used by `ChannelPush` and by `ResponseTarget.identities([ChannelIdentity(channel="teams", chat_id=…)])`. - - **SSO / OAuth** via the SDK's MSAL-backed connections, surfaced through `IdentityResolver` and the channel's run hook. - - **Teams API client + Microsoft Graph client** preconfigured on the SDK's `App`, available to the run hook for Teams-specific lookups (team roster, channel metadata, …) without re-implementing auth. 
- - Mounts the SDK's `App` into the host's Starlette app via a custom `HttpServerAdapter` that defers `register_route(...)` to `ChannelContribution.routes` — the SDK does **not** start its own server; the host owns the lifecycle. Host-tracked-session family (same as `ActivityChannel`): `from.aadObjectId` populates `ChannelIdentity`. The result projector reads `AgentRunResult.messages[*].contents` and routes the rich content variants to their Teams-native renderings (`TextContent` → markdown body, `DataContent`/structured output → Adaptive Card, citation entries from `additional_properties` → `add_citation`, `ErrorContent` → typed error card). - - **Note on transport.** `TeamsChannel` **still rides on Azure Bot Service in v1** — the `microsoft/teams.py` SDK is a higher-level Pythonic wrapper over the same Activity Protocol pipeline that `ActivityChannel` exposes raw. The difference is **what the developer writes against**, not the underlying network path. A truly Bot-Service-free Teams transport is *not currently possible* and is tracked as a separate, speculative stretch item (req #31); when/if Microsoft ships one, the new transport would slot in under the same `TeamsChannel` package without changing this requirement. 
- - **`ActivityChannel` vs `TeamsChannel` — pick by audience:** - - | Channel | Built on | Audience | - |---|---|---| - | `ActivityChannel` (req #28) | Activity Protocol over HTTP, no Teams-specific helpers | Bot Service-fronted channels generically (Teams, Web Chat, Slack-style connectors, DirectLine, …); maximum portability across the Bot Framework / M365 connector ecosystem | - | `TeamsChannel` (req #29) | `microsoft/teams.py` `App` mounted via custom `HttpServerAdapter` into the host's Starlette app | Teams-first deployments that want Adaptive Cards, modal Dialogs, Message Extensions, citations, feedback, suggested-prompt chips, and SSO out-of-the-box | - - Deployments that only need plain Activity Protocol over Bot Service stick with `ActivityChannel`; `TeamsChannel` is the upgrade path when Teams-native richness is wanted. - -### Stretch -30. **WhatsApp channel package** — using the same `Channel` + `ChannelCommand` model, designed so it participates in cross-channel continuity (req #9) and can serve as a `ChannelPush` destination (req #13) when paired with a stable per-user `isolation_key`. -31. **Direct-to-Teams channel package** — *speculative*. Reserved for a future transport that connects to Teams **without going through Azure Bot Service** (and therefore without the Activity Protocol pipeline that backs both `ActivityChannel` (req #28) and `TeamsChannel` (req #29)). At the time of writing **no such transport is publicly available** — the Microsoft Graph chat APIs (`/teams/{id}/channels/{id}/messages`, `/chats/{id}/messages`) and the `microsoft/teams.py` SDK both ultimately route through Bot Service for the bot-as-conversation-participant pattern. This requirement is kept on the roadmap purely to preserve the `TeamsChannel` naming line for if/when Microsoft ships a Bot-Service-free transport (a native Teams REST/RPC, a Graph subscription strong enough to drive both inbound and outbound message flow, or similar). 
Until then, **the canonical Teams channel is `TeamsChannel` (req #29)** and `ActivityChannel` (req #28) covers the generic Bot Service surface. - -## API Surface - -### Architecture overview - -The host wires one `Agent` (or `Workflow`) to one or more channels, each contributing routes, commands, and a push back-channel. **1a** is the runtime topology — how an inbound request flows through the host. **1b** is the contribution shape — what each channel hands the host at construction. - -#### Runtime topology - -```mermaid -graph LR - Caller[External caller /
messaging app] - - subgraph Host[AgentFrameworkHost] - direction TB - ASGI[Starlette app] - Router[Channel router] - Parse{parse → command or message?} - Auth[host.authorize] - Resolver[IdentityResolver] - Delivery[_deliver_response] - Push[_handle_push_task] - Annot[_annotate_intended_targets] - end - - Channels[Channels Responses · Invocations · Telegram · Activity · IdentityLinker] - CmdHandler[CommandHandler
via ChannelCommandContext] - Target[(Agent or Workflow)] - Runner[DurableTaskRunner] - StateStore[(HostStateStore)] - - Caller --> ASGI - ASGI --> Router - Router --> Parse - Parse -- /command --> CmdHandler - Parse -- message --> Auth - CmdHandler -- ctx.run --> Auth - CmdHandler -- local reply --> Channels - Auth --> Resolver - Resolver --> StateStore - Auth --> Target - Target --> Delivery - Delivery -- originating sync --> Channels - Delivery -- non-originating --> Runner - Delivery --> Annot - Runner --> Push - Push --> Channels - Channels --> ASGI -``` +| `agent-framework-hosting` | `agent_framework_hosting` | `AgentFrameworkHost`, channel protocols, key request/result types, hooks, `reset_session`, state-path helpers. | +| `agent-framework-hosting-responses` | `agent_framework_hosting_responses` | `ResponsesChannel`. | +| `agent-framework-hosting-invocations` | `agent_framework_hosting_invocations` | `InvocationsChannel`. | +| `agent-framework-hosting-telegram` | `agent_framework_hosting_telegram` | `TelegramChannel` and Telegram command helpers. | +| `agent-framework-hosting-activity-protocol` | `agent_framework_hosting_activity_protocol` | `ActivityProtocolChannel` for Activity Protocol over Azure Bot Service. | +| `agent-framework-hosting-discord` | `agent_framework_hosting_discord` | `DiscordChannel` and Discord command/interaction helpers. | +| `agent-framework-foundry-hosting` | `agent_framework.foundry_hosting` | Foundry isolation middleware and Foundry-backed hosting helpers usable with the v1 host. | -#### Channel contribution shape +Channel packages may depend on their native SDKs. The core hosting package should not depend on channel SDKs or on top-level legacy protocol hosts. -Every channel exposes the same three contribution slots, all optional except `routes`. The host duck-types each slot and stitches them in at construction. +## Key Types -```mermaid -graph LR - subgraph C[ConcreteChannel
e.g. TelegramChannel] - direction TB - Routes[routes: webhook / poller / API endpoints → Starlette router] - Commands[commands: Sequence ChannelCommand name · description · handle · scopes · locales · expose_in_ui] - Push[ChannelPush.push + optional ChannelPushCodec + optional response_hook] - end +### `AgentFrameworkHost` - Host[Host] - Native[Platform native catalog Telegram set_my_commands · Teams app manifest · …] - Dispatch[CommandHandler dispatch] - Delivery[Originating sync delivery
+ runner-scheduled fan-out] +The host constructor accepts: - Routes -- contribute at startup --> Host - Commands -- startup projection --> Native - Commands -- runtime dispatch --> Dispatch - Push -- driven by --> Delivery -``` +- `target`: one `SupportsAgentRun`-compatible object or one `Workflow` +- `channels`: one or more `Channel` instances +- optional Starlette middleware +- optional `state_dir` +- optional workflow `checkpoint_location` -The `IdentityLinker` is itself a Channel specialisation: when one is configured, the host auto-inserts a `link` / `connect` `ChannelCommand` into every other channel's catalog (opt-out per channel via `expose_in_ui=False` or rename via metadata). +The host exposes: -### Packages +- `app`: the canonical Starlette ASGI application +- `serve(...)`: a convenience wrapper for local serving +- `reset_session(isolation_key: str)`: rotate the cached `AgentSession` for a host-tracked conversation -| Distribution package | Public import surface | Purpose | -| --- | --- | --- | -| `agent-framework-hosting` | `agent_framework.hosting` | Core Starlette host, channel contract, session/request bridge | -| `agent-framework-hosting-responses` | `agent_framework.hosting` (lazy) | `ResponsesChannel` | -| `agent-framework-hosting-invocations` | `agent_framework.hosting` (lazy) | `InvocationsChannel` | -| `agent-framework-hosting-telegram` | `agent_framework.hosting` (lazy) | `TelegramChannel` and Telegram-specific helpers | +`state_dir` is narrowed to v1 host-owned local files only: -The split is between distribution packages. The **public import path stays stable at `agent_framework.hosting`** via lazy imports, consistent with the repository's packaging conventions. +- session aliases (`isolation_key` to current `AgentSession` id), and +- workflow checkpoint paths when the app chooses the host-provided file layout. 
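
The alias behaviour behind `reset_session(isolation_key)` can be sketched with a small in-memory table. This is an illustrative stand-in under stated assumptions, not the host's implementation: `SessionAliasTable`, `resolve`, and `reset` are hypothetical names, and session ids are plain UUID strings rather than `AgentSession` objects.

```python
import uuid


class SessionAliasTable:
    """Hypothetical stand-in for the isolation_key -> session id aliases."""

    def __init__(self) -> None:
        self._aliases: dict[str, str] = {}

    def resolve(self, isolation_key: str) -> str:
        """Return the current session id for the key, creating one on first use."""
        if isolation_key not in self._aliases:
            self._aliases[isolation_key] = uuid.uuid4().hex
        return self._aliases[isolation_key]

    def reset(self, isolation_key: str) -> None:
        """Drop the alias so the next resolve() starts a new conversation."""
        self._aliases.pop(isolation_key, None)


table = SessionAliasTable()
first = table.resolve("user-42")     # first request creates the alias
again = table.resolve("user-42")     # later requests reuse it
table.reset("user-42")               # rotation, as reset_session would trigger
rotated = table.resolve("user-42")   # next request gets a fresh session id
```

Here `first` and `again` name the same session, while `rotated` names a new one. In the host, this key-to-session mapping is the kind of data that `state_dir` persists as local files.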
-### Built-in routes +It is not a store for identity links, continuations, active-channel state, delivery attempts, or multicast payloads. -For built-in channels, `path` is the configurable endpoint root. Use `path=""` when an external platform requires that channel at the app root. +Externally supplied isolation keys are trusted only after the channel or host middleware has authenticated and authorized the caller. The host uses `isolation_key` as a partition key; the string itself is not proof of identity or ownership. -| Channel | Default `path` | Default exposed route(s) | -| --- | --- | --- | -| `ResponsesChannel` | `/responses` | `/responses` | -| `InvocationsChannel` | `/invocations` | `/invocations` | -| `TelegramChannel` | `/telegram/webhook` | webhook mode: `/telegram/webhook`; polling mode: no required HTTP route | +### `Channel` -Overrides replace the endpoint path: +A channel implements a small protocol: -```python -ResponsesChannel(path="/public/responses") # -> /public/responses -InvocationsChannel(path="/internal/invocations") # -> /internal/invocations -TelegramChannel(path="/bots/telegram/webhook", bot_token=token) # -> /bots/telegram/webhook -``` +- declare a stable channel id/name, +- contribute routes, middleware, commands, and lifecycle callbacks, +- parse inbound protocol data into `ChannelRequest`, +- call the host through `ChannelContext.run(...)` or `ChannelContext.run_stream(...)`, and +- serialize the returned result to the originating protocol response. -### Key Types +Channels own protocol authentication, signature validation, native command registration, and protocol-specific error bodies. -**`AgentFrameworkHost`** — owner of the Starlette app and channel lifecycle. Fronts one **hostable target** (an agent or a workflow). 
+### `ChannelContribution` -| Field / Method | Type | Description | -|---|---|---| -| `__init__(target, *, channels, middleware=(), identity_resolver=None, identity_linker=None, debug=False)` | constructor | Composes one host from one **hostable target** (`SupportsAgentRun` or `Workflow`) and a sequence of channels. Optional `identity_resolver` and `identity_linker` provide channel-native-id → `isolation_key` mapping and a connect ceremony for linking new channels to existing identities. The host detects the target kind and dispatches to the appropriate runner. | -| `app` | `Starlette` | Canonical ASGI surface; can be handed to any ASGI server. | -| `serve(*, host="127.0.0.1", port=8000, **kwargs)` | method | Convenience wrapper around `uvicorn.run(self.app, ...)`. Lazy-imports `uvicorn`. | -| `run_in_background(request)` | `-> ContinuationToken` | Submits a `ChannelRequest` for asynchronous execution. Returns a `ContinuationToken` immediately; the result is delivered via the configured `ResponseTarget` push when ready and recorded against the token (in the configured `HostStateStore`) for later polling. Channels typically call this when their protocol response should be a 202 / acknowledgement rather than the agent reply. | -| `get_continuation(token)` | `-> ContinuationToken \| None` | Look up a previously submitted background run by its opaque token. Returns `None` when the token is unknown or has expired. Reads through the `HostStateStore` so tokens issued before the most recent restart still resolve. | +`ChannelContribution` is the channel's host-facing contribution: -**`HostableTarget`** — the union of executable targets the host can front. +- Starlette routes and optional middleware, +- native command descriptors, +- startup and shutdown callbacks, and +- any channel-local metadata needed by the package. 
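
As a rough sketch of how a host might stitch contributions together, the example below concatenates each slot in channel order. The dataclass is a simplified stand-in: the field names follow the list above and the earlier draft's table, routes are plain strings instead of Starlette routes, and `aggregate` is a hypothetical helper, not part of the package.

```python
from dataclasses import dataclass
from typing import Any, Callable, Sequence


@dataclass
class ChannelContribution:
    """Simplified, hypothetical shape of one channel's contribution."""

    routes: Sequence[Any] = ()
    middleware: Sequence[Any] = ()
    commands: Sequence[str] = ()
    on_startup: Sequence[Callable[[], None]] = ()
    on_shutdown: Sequence[Callable[[], None]] = ()


def aggregate(contributions: Sequence[ChannelContribution]) -> ChannelContribution:
    """Concatenate every slot in channel order without inspecting any payload."""
    return ChannelContribution(
        routes=[r for c in contributions for r in c.routes],
        middleware=[m for c in contributions for m in c.middleware],
        commands=[cmd for c in contributions for cmd in c.commands],
        on_startup=[cb for c in contributions for cb in c.on_startup],
        on_shutdown=[cb for c in contributions for cb in c.on_shutdown],
    )


started: list[str] = []
responses = ChannelContribution(routes=["/responses"])
telegram = ChannelContribution(
    routes=["/telegram/webhook"],
    commands=["reset"],
    on_startup=[lambda: started.append("telegram")],
)

merged = aggregate([responses, telegram])
for callback in merged.on_startup:
    callback()  # the host would run these during application startup
```

The aggregation step only collects what each channel hands over; parsing and rendering stay inside the channel.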
-| Variant | Type | Execution seam | -|---|---|---| -| Agent | `SupportsAgentRun` | `target.run(input, *, session=..., stream=...)` | -| Workflow | `Workflow` | `target.run(input, ...)` (workflow execution seam) | +The host aggregates contributions but does not interpret protocol payloads. -**`Channel`** (Protocol) — anything that contributes routes/commands/lifecycle to a host. +### `ChannelRequest` -| Field | Type | Description | -|---|---|---| -| `name` | `str` | Channel name used for routing, telemetry, and `ChannelRequest.channel`. | -| `confidentiality_tier` | `str?` | Optional opaque confidentiality tier (e.g. `"corp"`, `"public"`). Consumed by the host's `LinkPolicy` to decide which channels may be linked into the same `isolation_key` and which may be `ResponseTarget` destinations for a given originating request. `None` = single-tier (no policy filtering). See `LinkPolicy`. | -| `contribute(context: ChannelContext) -> ChannelContribution` | method | Called once at host construction; returns routes/middleware/commands/lifecycle. | +`ChannelRequest` is the host-neutral request envelope produced by a channel. It carries: -**`ChannelContext`** — host-owned bridge channels use to invoke the agent. +- target input, +- optional `ChannelSession`, +- optional `ChannelIdentity`, +- options and attributes produced by the channel, and +- request metadata useful to hooks and context providers. -| Method | Type | Description | -|---|---|---| -| `run(request: ChannelRequest)` | `-> HostedRunResult[Any]` | One-shot invocation. For agent targets `TResult` narrows to `AgentResponse`; for workflow targets to `WorkflowRunResult`. | -| `stream(request: ChannelRequest)` | `-> HostedStreamResult` | Streaming invocation. | +The host may pass attributes through to context providers and middleware. Channels should treat attributes as a documented extension bag, not as a cross-channel delivery contract. -**`ChannelContribution`** — what a channel returns from `contribute(...)`. 
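
To make the envelope concrete, here is a minimal sketch of a channel turning a Telegram-style update into a request. The dataclasses are simplified assumptions rather than the package's real types, `parse_telegram_update` is a hypothetical helper, and the `telegram:<user id>` isolation-key format is chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ChannelSession:
    isolation_key: str


@dataclass(frozen=True)
class ChannelIdentity:
    channel: str
    native_id: str


@dataclass(frozen=True)
class ChannelRequest:
    """Simplified, hypothetical version of the host-neutral envelope."""

    input: Any
    session: Optional[ChannelSession] = None
    identity: Optional[ChannelIdentity] = None
    options: Mapping[str, Any] = field(default_factory=dict)
    attributes: Mapping[str, Any] = field(default_factory=dict)
    metadata: Mapping[str, Any] = field(default_factory=dict)


def parse_telegram_update(update: Mapping[str, Any]) -> ChannelRequest:
    """Toy parser: map a Telegram-style update dict onto the envelope."""
    message = update["message"]
    user_id = str(message["from"]["id"])
    return ChannelRequest(
        input=message["text"],
        # Assumed key format; a real channel decides how keys are derived.
        session=ChannelSession(isolation_key=f"telegram:{user_id}"),
        identity=ChannelIdentity(channel="telegram", native_id=user_id),
        # Channel-specific extras stay in the documented extension bag.
        attributes={"chat_type": message["chat"]["type"]},
    )


request = parse_telegram_update(
    {"message": {"text": "hello", "from": {"id": 42}, "chat": {"type": "private"}}}
)
```

Keeping protocol details such as the chat type in `attributes` leaves the typed fields the same across channels.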
+### `ChannelSession` -| Field | Type | Description | -|---|---|---| -| `routes` | `Sequence[BaseRoute]` | Starlette routes mounted under the channel's `path`. Accepts both `Route` (HTTP) and `WebSocketRoute` (WS) — both are `BaseRoute`. | -| `middleware` | `Sequence[Middleware]` | Channel-scoped middleware. | -| `commands` | `Sequence[ChannelCommand]` | Native command catalog (e.g. Telegram bot commands). | -| `on_startup` | `Sequence[Callable]` | Lifecycle hooks for polling workers, command registration, etc. | -| `on_shutdown` | `Sequence[Callable]` | Lifecycle hooks for cleanup. | +`ChannelSession(isolation_key=...)` is the only v1 session-continuity mechanism. -**`ChannelRequest`** — normalized ingress passed to the host. +When a request contains an isolation key: -| Field | Type | Description | -|---|---|---| -| `channel` | `str` | Originating channel name. | -| `operation` | `str` | e.g. `message.create`, `command.invoke`, `approval.respond`. | -| `input` | `AgentRunInputs` | Reuses framework input types. | -| `session` | `ChannelSession?` | Session hint from the channel. | -| `options` | `ChatOptions?` | Caller-derived options (e.g. Responses `temperature`). | -| `session_mode` | `Literal["auto", "required", "disabled"]` | Whether host-managed session use is automatic, mandatory, or bypassed. | -| `metadata` | `Mapping[str, Any]` | Protocol-level metadata for telemetry. | -| `attributes` | `Mapping[str, Any]` | Channel-specific structured values (signature state, capability hints). Host code never reads this map; reserved for channel-private bookkeeping. | -| `client_state` | `Mapping[str, Any] \| None` | Bidirectional, mutable per-request state object supplied by event-rich front-ends (e.g. AG-UI). Channel-defined shape; the host treats it as opaque. Channels typically thread this into a channel-owned `ContextProvider` (see [Channel-owned per-thread state](#channel-owned-per-thread-state)) and read it back after the run to emit state-snapshot/delta events. 
| -| `client_tools` | `Sequence[ToolDescriptor] \| None` | Frontend tool catalog supplied per request. The channel forwards definitions onto the agent's `ChatOptions` so the LLM can call them, but tool *execution* returns to the originating client (the host does not invoke them). Run hooks may filter or rewrite the catalog. | -| `forwarded_props` | `Mapping[str, Any] \| None` | Pass-through bag for channel-protocol extras the run hook needs to route into the target — e.g. AG-UI `resume` / `command` / HITL response payloads that drive workflow `RequestInfo` / `RequestResponse` round-trips. Opaque to the host; the run hook decides where it lands on the rebuilt `ChannelRequest.input`. | -| `identity` | `ChannelIdentity?` | Channel-native **user** identity observed on this request — `(channel, native_id, attributes)`. Channels populate it from the inbound payload's user field (Telegram `from.id`, Teams `from.aadObjectId`, Responses `safety_identifier`, …) — **not** the chat / conversation id, which is carried separately on `conversation_id` and matters in multi-user surfaces (Telegram groups, Teams group chats and channels — see [Multi-user conversations](#multi-user-conversations-telegram-groups-teams-group-chats-and-channels)). The host records `(isolation_key, channel) → identity` on every successful resolve so `ResponseTarget.active`, `.channel(name)`, `.channels([...])`, and `.all_linked` can find a destination native id without per-request payload bookkeeping. | -| `stream` | `bool` | Whether to invoke `stream(...)` rather than `run(...)`. | -| `response_target` | `ResponseTarget` | Where the response is delivered (default: `ResponseTarget.originating`). See `ResponseTarget` below. | -| `background` | `bool` | If `True`, host returns a `ContinuationToken` immediately rather than awaiting the response. Forced `True` when `response_target == ResponseTarget.none`. | - -**`ChannelSession`** — small, host-neutral session hint. 
- -| Field | Type | Description | -|---|---|---| -| `key` | `str?` | Stable host lookup key for an `AgentSession`. **Caller-supplied** channels populate it from the wire payload (e.g. `previous_response_id`, request-body `session_id`). **Host-tracked** channels leave it `None` and let the host's per-`isolation_key` alias decide which `AgentSession` to resolve (see [Channel session-carriage models](#channel-session-carriage-models)). | -| `conversation_id` | `str?` | Protocol-visible conversation/thread identifier when one exists. | -| `isolation_key` | `str?` | Opaque isolation boundary (user, tenant, chat, …) using hosted-agent terminology. | -| `attributes` | `Mapping[str, Any]` | Channel-specific session hints. | +1. The host looks up or creates the cached `AgentSession` for that key. +2. The target runs with that `AgentSession` when the target is an agent. +3. `reset_session(isolation_key)` rotates the alias so the next request starts a new conversation. -**`ChannelRunHook`** — per-request escape hatch for built-in channels. +If two channels produce the same isolation key on the same host, they share the same cached session. If they produce different keys, they do not share session state. -```python -ChannelRunHook = Callable[..., Awaitable[ChannelRequest] | ChannelRequest] -``` - -Channels invoke the hook positionally with the channel-built `ChannelRequest` and pass named extras as keyword arguments. The minimum signature an app author needs is: - -```python -def my_hook(request: ChannelRequest, **kwargs) -> ChannelRequest: ... -``` +### `ChannelIdentity` -Hooks that want the named extras pull them out by name: +`ChannelIdentity` is optional request metadata such as channel id, native user id, tenant id, claims, or display attributes. -| Keyword | Type | Description | -|---|---|---| -| `target` | `SupportsAgentRun \| Workflow` | The hosted target (so hooks can adapt to e.g. `A2AAgent` or to a `Workflow`'s typed inputs). 
| -| `protocol_request` | `Any?` | Original channel-native protocol payload — Responses JSON body, Telegram `Update` dict, Activity Protocol `Activity` dict, Invocations body, … (loosely typed in v1). | - -Runs **after** the channel has produced its default `ChannelRequest`, **before** the host resolves session behavior and calls the target's execution seam. This is the canonical adapter point for workflow targets, where the channel's free-form input must be reshaped into the workflow's typed inputs. - -> Earlier drafts wrapped these arguments into a `ChannelRunHookContext` object. The signature was simplified so the typical hook only needs `(request, **kwargs)` — making it safe against future named extras and easier to write inline. +In v1, `ChannelIdentity` does not link channels, authorize callers, select delivery destinations, or imply that two identities should share an `AgentSession`. A channel that wants shared history must still produce the same `ChannelSession.isolation_key`. -**`ChannelIdentity`** — the channel-native identity the host sees on each request, used as the resolver/linker input. +### Hooks -| Field | Type | Description | -|---|---|---| -| `channel` | `str` | Originating channel name (matches `Channel.name`). | -| `native_id` | `str` | Channel-native **user** identifier (Telegram `from.id`, Teams `from.aadObjectId`, WhatsApp phone number, Slack user id, …). In 1:1 chats this often coincides with the chat / conversation id; in multi-user surfaces (Telegram groups, Teams group chats and channels) it is **strictly the user** — the conversation locator lives separately on `ChannelRequest.conversation_id` / `ChannelSession.conversation_id`. Always per-channel; never assumed to align across channels. | -| `attributes` | `Mapping[str, Any]` | Optional per-channel context (display name, locale, group/private chat flag, Teams `tenantId`, Telegram `chat.type`, Teams `conversationType`, …) the resolver/linker may key on. 
| +Hooks are optional and channel-owned: -**`IdentityResolver`** — host-level seam that maps a `ChannelIdentity` to an `isolation_key`. +- `ChannelRunHook`: runs after channel parsing and before host invocation; returns the `ChannelRequest` to execute. +- `ChannelResponseHook`: runs after target completion and before the originating channel renders a one-shot response. +- `ChannelStreamUpdateHook`: the host applies it to streamed updates before the originating channel serializes the stream. -```python -IdentityResolver = Callable[[ChannelIdentity], Awaitable[str | None] | (str | None)] -``` +Common uses include adapting chat text into workflow inputs, enforcing deployment-specific options, flattening rich output for text-only protocols, or filtering streamed updates for a protocol. Stream update hooks are update-only; they do not automatically sanitize `get_final_response()` output. Channels choose their response transport from the parsed protocol request before invoking run hooks. -The **default resolver auto-issues** an `isolation_key` the first time a `(channel, native_id)` is seen and persists the mapping in the host's identity store, so every end user automatically gets a stable per-user `isolation_key` on first contact through **any** channel — no per-channel boilerplate is required for the single-channel case. Returning `None` is reserved for advanced cases where the resolver wants to refuse unknown identities; the dedicated host seam for accept/reject decisions is **`IdentityAllowlist`** — see [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam) below. +### `HostedRunResult` -Cross-channel continuity is then a one-shot **merge** operation: after a successful link ceremony (Scenario 6), the host atomically rewrites the second channel's auto-issued key to point at the first channel's existing `isolation_key`. Apps never have to write per-channel mapping hooks just to get continuity to work. 
+`HostedRunResult[T]` wraps the target's full-fidelity result plus the resolved `AgentSession | None`. -Apps that already own an identity namespace (corporate user id, tenant-scoped account id) can supply a custom resolver that returns those values directly — bypassing auto-issuance. +- Agent targets produce `HostedRunResult[AgentResponse]`. +- Workflow targets produce `HostedRunResult[WorkflowRunResult]`. -**`IdentityLinker`** (Protocol) — host-level seam that runs a connect ceremony to associate a new `ChannelIdentity` with an existing `isolation_key`. The linker is a peer of `Channel` for routing purposes and contributes its own routes/lifecycle. - -| Field / Method | Type | Description | -|---|---|---| -| `name` | `str` | Linker name; used for telemetry and to namespace its routes. | -| `contribute(context: ChannelContext) -> ChannelContribution` | method | Same shape as `Channel.contribute(...)`; lets the linker publish callback/verification routes (e.g. `/identity/oauth/callback`, `/identity/verify`) and lifecycle hooks. | -| `begin(identity: ChannelIdentity, *, requested_isolation_key=None) -> LinkChallenge` | method | Starts the ceremony for a channel-native identity. Returns a `LinkChallenge` describing what the user must do (URL to visit, code to enter, MFA prompt). | -| `complete(challenge_id: str, proof: Mapping[str, Any]) -> str` | method | Verifies the proof and returns the resolved `isolation_key`. On success the host atomically records both `(channel, native_id) → isolation_key` and any verified IdP claim recovered from the proof (e.g. `(microsoft.oid, )`) so subsequent channels that supply the same claim auto-link without a second ceremony. | -| `is_linked(identity: ChannelIdentity, *, verified_claims: Mapping[str, str] = {}) -> str \| None` | method | Returns the `isolation_key` for an already-linked identity, or `None` if no link exists. Channels with `require_link=True` call this on every inbound request before invoking the agent. 
When `verified_claims` are supplied (e.g. Teams' AAD `oid` from the inbound activity bearer) and a match exists in the link store, the linker silently auto-merges the new `(channel, native_id)` onto the existing `isolation_key` and returns it — this is the "sign in once, every other channel just works" mechanism. | - -| Built-in helper | Mechanism | Notes | -|---|---|---| -| `OAuthIdentityLinker(provider, ...)` | OAuth authorization-code redirect | Contributes `/identity/oauth/{provider}/start` + `/callback`; ships with provider presets (Microsoft, Google, GitHub) as opt-in helpers. Stores the verified IdP `sub` / `oid` as a verified claim alongside the channel-native identity so channels that authenticate with the same IdP (e.g. Teams via Entra ID) auto-link on first contact. | -| `OneTimeCodeIdentityLinker(...)` | Signed short-lived code | User runs `/link` on channel A, receives a code; runs `/link ` on channel B; host verifies and merges. | - -A built-in `link` (or `connect`) `ChannelCommand` is exposed automatically when an `IdentityLinker` is configured. Its `handle` invokes `linker.begin(...)` and replies with the `LinkChallenge` payload (URL, code, instructions) projected through the channel's native rendering. Channels may opt out (`expose_in_ui=False`) or override the command's name per channel. - -**`require_link` (per-channel)** — every channel that emits a `ChannelIdentity` accepts a `require_link: bool = False` constructor argument. When `True`, the channel calls `linker.is_linked(identity, verified_claims=…)` before producing a `ChannelRequest`; un-linked identities are short-circuited to a rendered `LinkChallenge` reply (the same payload the `link` command would emit) and the agent is **not** invoked for that turn. Combined with the linker's verified-claim auto-link, this gives an "authenticate before chatting" enforcement model where the first channel forces the OAuth ceremony and subsequent channels join the same `isolation_key` silently. 
See [Scenario 6](#scenario-6-linking-a-new-channel-to-an-existing-identity-via-oauth) for the end-to-end flow. Default is `False`, which preserves the opportunistic flow (auto-issued `isolation_key`, link manually later). Channels whose protocol does not authenticate the user (e.g. anonymous Responses calls) ignore the flag. `require_link` is the **"identity must be linked"** axis; the **orthogonal "identity is on the accept list"** axis is `allowlist` — see [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam) below. - -#### Authorization profiles and the `IdentityAllowlist` seam - -`require_link` (above) and `allowlist` (below) compose into the **three named authorization profiles** the spec supports for any channel that emits a `ChannelIdentity`. The two parameters stay **orthogonal** on the channel constructor — there is no single `auth_mode` enum — but the host exposes named factories on `AuthPolicy` (`AuthPolicy.open()` / `.require_link()` / `.native_allowlist(...)` / `.linked_claim_allowlist(...)` / `.mixed(...)`) for ergonomic configuration: - -| Profile | Channel config | What gets gated | Typical use | -|---|---|---|---| -| **Open** (default) | `require_link=False`, `allowlist=None` | Nothing — every identity gets an auto-issued `isolation_key` on first contact. | Public chatbot, internal dev/demo, single-tenant deployments. | -| **Forced link** | `require_link=True`, `allowlist=None` | Identity must complete the link ceremony at least once. Any successfully authenticated identity is then allowed. | "Sign in once with your corporate account, then chat freely" style bots that gate on tenancy via the IdP rather than per-user. | -| **Native allowlist** | `require_link=False`, `allowlist=NativeIdAllowlist(...)` | Only listed channel-native ids (Telegram `chat_id`s, WhatsApp numbers, Slack user ids) get through. Pre-link, no IdP claim involved. 
| Personal bots, single-user prototypes, small fixed-membership channels. | -| **Linked-claim allowlist** | `require_link=True`, `allowlist=LinkedClaimAllowlist(...)` | Identity must (a) complete the link ceremony **and** (b) carry an IdP claim whose value is on the list (e.g. AAD `oid in {…}` or `tid == ""`). | Multi-channel corporate bot where any channel works but only specific people in a specific tenant are admitted. | -| **Mixed** | `require_link=False`, `allowlist=AnyOfAllowlists(NativeIdAllowlist(...), LinkedClaimAllowlist(...))` | Either the native id is preapproved **or** the user successfully links and matches the claim allowlist. Native-id hits bypass the link ceremony; everyone else is funneled into it. | A bot that wants ops-team Telegram ids in immediately while still letting other corp users self-onboard via OAuth. | - -The decision pipeline that produces each of those profiles: - -```mermaid -flowchart TB - Start([authorize identity,
require_link, allowlist]) - Linked{identity already
linked?
StateStore lookup} - Required{require_link?} - OpenPath{allowlist is None?} - Resolve[/isolation_key:
linked → existing,
else auto-issue channel:native_id/] - Evaluate[/allowlist.evaluate context/] - Decision{decision} - Abstain{requires_linked_claims?} - Allowed([Allowed isolation_key]) - DeniedPre([Denied
allowlist_denied_pre_link]) - LinkReq([LinkRequired
via configured linker]) - - Start --> Linked - Linked -- yes --> OpenPath - Linked -- no --> Required - Required -- yes --> LinkReq - Required -- no --> OpenPath - OpenPath -- yes --> Resolve --> Allowed - OpenPath -- no --> Evaluate --> Decision - Decision -- ALLOW --> Resolve - Decision -- DENY --> DeniedPre - Decision -- ABSTAIN --> Abstain - Abstain -- yes --> LinkReq - Abstain -- no --> Resolve -``` - -The flow shows three terminal states: `Allowed`, `LinkRequired`, `Denied`. `LinkRequired` is reachable whenever `require_link=True` and the identity has not completed the link ceremony (or an allowlist `ABSTAIN`ed and `requires_linked_claims=True`), independent of whether an allowlist is configured. - -##### `IdentityAllowlist` Protocol (tri-state) - -Allowlists are evaluated by a host-level pipeline (`host.authorize(...)`, below) that calls them twice — once with the raw channel-native identity (`phase="pre_link"`) and, if necessary, again after the link ceremony surfaces verified IdP claims (`phase="post_link"`). To make composition (`AnyOfAllowlists`, `AllOfAllowlists`) well-defined and to keep claim-based allowlists from accidentally denying everyone when claims are not yet available, the contract is **tri-state**: - -```python -class AllowlistDecision(StrEnum): - ALLOW = "allow" # accept this identity unconditionally - DENY = "deny" # reject this identity unconditionally - ABSTAIN = "abstain" # this allowlist has no opinion at this phase - # (e.g. a claim-based list during pre_link) - -@dataclass(frozen=True) -class AuthorizationContext: - identity: ChannelIdentity - phase: Literal["pre_link", "post_link"] - isolation_key: str | None # None at pre_link; resolved at post_link - verified_claims: Mapping[str, str] # {} when no claims; populated post_link - claim_source: Literal["linker", "channel", "none"] - # "channel" when the channel itself emits - # verified claims (e.g. 
Activity Protocol
-                                          # bearer with AAD oid); "linker" when the
-                                          # IdentityLinker surfaces them; "none" otherwise.
-
-class IdentityAllowlist(Protocol):
-    requires_linked_claims: bool = False  # if True, host validation rejects
-                                          # configurations where neither `require_link`
-                                          # nor a claim-emitting channel can deliver
-                                          # the claims this allowlist needs.
-
-    async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: ...
-```
+The host does not flatten, filter, or translate the result. Each channel decides how much of the result its protocol can carry.
-`ABSTAIN` is **not** a denial — it is "this allowlist has no information yet". The host's decision pipeline (below) is what turns an all-`ABSTAIN` outcome into the appropriate next step (allow when open, escalate to a link ceremony when the configuration calls for one). Boolean allowlists were rejected as part of this design pass because two-state composition cannot distinguish "claim allowlist denies you" from "claim allowlist hasn't seen any claims yet" — a critical distinction for the **Mixed** profile.
+## Host Behavior
-##### Built-in allowlists
+1. `AgentFrameworkHost` builds one Starlette app and asks each channel for its contribution.
+2. A channel route receives a protocol-native request.
+3. The channel validates/parses the native payload and creates `ChannelRequest`.
+4. The channel passes the request, optional `ChannelRunHook`, and protocol-native context to the host.
+5. The host invokes `ChannelRunHook`, if configured, and receives the prepared request.
+6. The host resolves an `AgentSession` from `ChannelSession.isolation_key` when present.
+7. The host invokes the agent or workflow target.
+8. The host wraps the result in `HostedRunResult` or the streaming equivalent.
+9. The host invokes `ChannelResponseHook`, if configured, for non-streaming/final response shaping.
+10. The host applies stream update hooks while the channel consumes streams; the channel renders the originating protocol response.
-| Helper | Pre-link behavior | Post-link behavior | Notes |
-|---|---|---|---|
-| `AllowAll()` | `ALLOW` | `ALLOW` | Explicit "open" sentinel; useful for tests and for overriding a host-level `default_allowlist`. |
-| `NativeIdAllowlist(channel=None, native_ids=...)` | `ALLOW` if `(channel, native_id)` is on the list; `DENY` if `channel` matches but `native_id` does not; `ABSTAIN` if `channel` does not match (allows mixing per-channel native lists under one `AnyOfAllowlists`). | Same as pre-link — native-id allowlists do not depend on link state. | Constructor accepts `native_ids: Collection[str] \| Callable[[], Awaitable[Collection[str]]]` so the list can be loaded asynchronously (config file, secret store). |
-| `LinkedClaimAllowlist(claim, values)` | `ABSTAIN` (no claims available yet). | `ALLOW` if `verified_claims.get(claim)` is in `values`; `DENY` otherwise. | `requires_linked_claims = True`. Host construction-time validator rejects use with `require_link=False` on a channel that does not also emit verified claims natively — this prevents the silent-deny-everyone footgun. |
-| `AnyOfAllowlists(*allowlists)` | `ALLOW` if any child `ALLOW`s; `DENY` only if **all** children `DENY`; otherwise `ABSTAIN`. | Same rule. | Composition for the **Mixed** profile. |
-| `AllOfAllowlists(*allowlists)` | `DENY` if any child `DENY`s; `ALLOW` only if **all** children `ALLOW`; otherwise `ABSTAIN`. | Same rule. | E.g. require both tenancy (`LinkedClaimAllowlist("tid", ...)`) **and** group membership (`LinkedClaimAllowlist("groups", ...)`). |
-| `CallableAllowlist(fn)` | Calls `fn(context)` and returns its result. | Same. | Escape hatch for app-specific logic; recommended only after exhausting the structured variants. |
+There is no host-level route from one channel's request to another channel's response in v1.
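The non-streaming part of the Host Behavior sequence (steps 5 through 9) can be illustrated with a toy host. This is only a sketch under stated assumptions: `SketchHost`, `echo_agent`, `strip_hook`, and every field shown on `ChannelRequest`, `ChannelSession`, and `HostedRunResult` are hypothetical stand-ins chosen for illustration, and a plain `dict` stands in for `AgentSession`. It is not the `AgentFrameworkHost` implementation.

```python
import asyncio
from dataclasses import dataclass, field, replace
from typing import Any, Awaitable, Callable, Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class ChannelSession:
    isolation_key: Optional[str] = None  # assumed field; the key may be absent


@dataclass(frozen=True)
class ChannelRequest:
    channel: str  # assumed field: name of the originating channel
    input: str    # assumed field: already-parsed native payload
    session: ChannelSession = field(default_factory=ChannelSession)


@dataclass(frozen=True)
class HostedRunResult(Generic[T]):
    result: T                # target result, passed through untouched
    session: Optional[dict]  # dict stands in for `AgentSession | None`


# Hypothetical callable shapes for the hooks and the target.
RunHook = Callable[[ChannelRequest], Awaitable[ChannelRequest]]
ResponseHook = Callable[[HostedRunResult], Awaitable[HostedRunResult]]
Target = Callable[[str, Optional[dict]], Awaitable[Any]]


class SketchHost:
    """Toy stand-in for steps 5-9 of the list above."""

    def __init__(self, target: Target) -> None:
        self._target = target
        self._sessions: dict = {}  # isolation_key -> session state

    async def run(
        self,
        request: ChannelRequest,
        run_hook: Optional[RunHook] = None,
        response_hook: Optional[ResponseHook] = None,
    ) -> HostedRunResult:
        if run_hook is not None:  # step 5: hook prepares the request
            request = await run_hook(request)
        key = request.session.isolation_key  # step 6: resolve a session
        session = self._sessions.setdefault(key, {"turns": 0}) if key else None
        output = await self._target(request.input, session)  # step 7
        result = HostedRunResult(result=output, session=session)  # step 8
        if response_hook is not None:  # step 9: final response shaping
            result = await response_hook(result)
        return result


async def echo_agent(text: str, session: Optional[dict]) -> str:
    """Hypothetical target that counts turns when a session exists."""
    if session is not None:
        session["turns"] += 1
    return f"echo: {text}"


async def strip_hook(request: ChannelRequest) -> ChannelRequest:
    """Hypothetical run hook that normalizes whitespace in the input."""
    return replace(request, input=request.input.strip())


async def demo() -> HostedRunResult:
    host = SketchHost(echo_agent)
    req = ChannelRequest("responses", "  hi  ", ChannelSession("user-1"))
    await host.run(req, run_hook=strip_hook)  # first turn
    return await host.run(req, run_hook=strip_hook)  # same isolation key
```

Calling `asyncio.run(demo())` runs two turns under one isolation key, so both turns see the same session object. A request without an isolation key runs with no session at all. The channel still owns rendering; the sketch returns the wrapped result and does nothing protocol-specific with it.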
-##### Host configuration: `default_allowlist` + explicit channel inheritance
+## Workflow Checkpoints
-Allowlists can be configured at the host level (`AgentFrameworkHost(default_allowlist=...)`) and per-channel. The channel-side default is **explicit inheritance**, not an implicit `None`:
+Workflow checkpointing is explicit. Apps either configure checkpoint storage on the workflow itself or pass a `checkpoint_location` to the host so the workflow dispatch path can use the intended file location.
-```python
-class SomeChannel:
-    def __init__(
-        self,
-        *,
-        require_link: bool = False,
-        allowlist: IdentityAllowlist | Literal["inherit"] | None = "inherit",
-    ): ...
-```
+`state_dir` may provide a conventional location for workflow checkpoint files, but checkpointing is still opt-in and separate from agent session history. Checkpoints are workflow-runtime state, not channel state and not identity-link state.
-- `allowlist="inherit"` (default) → the host's `default_allowlist` applies. If the host did not set one either, the channel is open.
-- `allowlist=None` → the channel is **explicitly open**, even if the host has a `default_allowlist`. Used to carve out a public endpoint inside an otherwise-locked-down host.
-- `allowlist=` → that allowlist applies, overriding the host default. To **add to** the host default rather than replace it, compose explicitly: `allowlist=AllOfAllowlists(host.default_allowlist, MyExtraList())`.
+## Foundry Isolation Middleware
-##### `host.authorize(...)` and `AuthorizationOutcome`
+V1 keeps Foundry isolation as middleware rather than as a channel-linking feature.
-Channels do not run the decision pipeline themselves — they call into a single host seam after extracting `ChannelIdentity` and any natively verified claims:
+The middleware is installed only when the Foundry hosting environment flag is present. In that environment it reads Foundry-provided isolation values at the trusted hosting boundary, exposes them as read-only request context for Foundry-aware history or memory providers, and rejects unsafe session resumes when the live isolation context does not match persisted session context. Outside Foundry, raw isolation headers are ignored unless an app supplies its own trusted middleware.
-```python
-@dataclass(frozen=True)
-class Allowed:
-    isolation_key: str
-
-@dataclass(frozen=True)
-class LinkRequired:
-    challenge: LinkChallenge
-
-@dataclass(frozen=True)
-class Denied:
-    reason_code: str                     # stable, machine-readable
-    user_message: str | None = None      # safe to render publicly (group-chat-safe)
-    log_details: Mapping[str, Any] = {}  # never shown to users; structured for audit
-
-AuthorizationOutcome = Allowed | LinkRequired | Denied
-
-async def host.authorize(
-    identity: ChannelIdentity,
-    *,
-    require_link: bool,
-    allowlist: IdentityAllowlist | None,
-    verified_claims: Mapping[str, str] | None = None,
-    conversation_context: ConversationContext | None = None,  # for group-chat policy
-) -> AuthorizationOutcome: ...
-```
+This middleware does not create cross-channel identity links and does not authorize non-Foundry channels.
-**Decision order** (the pipeline the host runs):
+## Current Channels
-1. Build `AuthorizationContext(phase="pre_link", verified_claims=verified_claims or {}, claim_source=…)`.
-2. `decision_pre = allowlist.evaluate(context_pre)` (defaults to `ALLOW` when `allowlist is None`).
-3. `decision_pre == DENY` → `Denied(reason_code="allowlist_denied_pre_link", ...)`.
-4. `decision_pre == ALLOW`:
- If `require_link=True` and the linker has no record yet → `LinkRequired(linker.begin(identity))`.
- Otherwise → `Allowed(resolved_or_auto_issued_isolation_key)`.
-5. `decision_pre == ABSTAIN`:
- If `require_link=True` **or** the allowlist declared `requires_linked_claims`: attempt `linker.is_linked(identity, verified_claims=…)`.
- - Not linked → `LinkRequired(linker.begin(identity))`.
- - Linked → evaluate again at `phase="post_link"` with the linker-emitted claims.
- - `ALLOW` → `Allowed(linked_isolation_key)`.
- - `DENY` → `Denied(reason_code="allowlist_denied_post_link", ...)`.
- - `ABSTAIN` post-link is a misconfiguration (no allowlist had an opinion even after linking); logged and treated as `Denied(reason_code="allowlist_abstain_after_link")`.
- Otherwise (open profile, no claim dependency): `Allowed(auto_issued_isolation_key)`.
+### Responses
-The channel **renders** the outcome — `Allowed` proceeds to `ChannelRequest`, `LinkRequired` projects the `LinkChallenge` through the channel's native UX (same path the `link` command already uses), `Denied` projects `user_message` (when set) through a short refusal. The channel **never** sees `log_details` and is responsible for not echoing `reason_code` to end users.
+`ResponsesChannel` exposes the OpenAI-compatible Responses API shape. It maps request body fields such as input, options, and conversation identifiers into `ChannelRequest`, and it renders Responses-compatible one-shot or streaming responses.
-##### Configuration validation (fail-fast)
+Responses session continuity uses a channel-selected `isolation_key`, commonly derived from a response/conversation id, caller-provided session id, Foundry isolation context, or deployment-specific request metadata.
-The host runs a startup validator across `(channel.require_link, channel.allowlist)` for every channel:
+### Invocations
-1. If `channel.allowlist` (after resolving `"inherit"`) contains any allowlist with `requires_linked_claims=True`, the channel **must** either have `require_link=True` or declare via a channel attribute that it natively emits verified claims (`Channel.emits_verified_claims: bool = False`). Otherwise: `raise ChannelConfigurationError("LinkedClaimAllowlist requires a source of verified claims; set require_link=True on or use a channel that emits them natively")`.
-2. If `channel.allowlist` contains a `LinkedClaimAllowlist` and the host has no `identity_linker` configured: same `ChannelConfigurationError`.
-3. If `channel.allowlist` contains a `NativeIdAllowlist(channel=)` whose `` is not a known channel on this host: `ChannelConfigurationError`.
+`InvocationsChannel` exposes an invocation endpoint for server-side callers and tools. It maps the request body into `ChannelRequest` and renders the invocation result on the same HTTP response.
-These errors are raised eagerly at `AgentFrameworkHost.__init__` (or `host.serve(...)` startup), not on the first inbound request — silent deny-everyone is the worst possible default and is not allowed.
+Invocations is useful for typed workflow inputs because a `ChannelRunHook` can translate the request body into the workflow's expected input type.
-##### Group chats and privacy of denial
+### Telegram
-Authorization runs **per message**, not per conversation: in a group chat, one allowlisted user invoking the bot does not authorize other group members for subsequent messages. The host also mirrors the `LinkChallenge` group-chat redirect pattern (see [Multi-user conversations](#multi-user-conversations-telegram-groups-teams-group-chats-and-channels)) for denials:
+`TelegramChannel` supports webhook or polling transport, native command registration, and message rendering back to the originating Telegram chat.
-- In a 1:1 chat, the channel may render the full `user_message` from `Denied`.
-- In a group chat, the channel renders a generic refusal in-room (e.g. "You don't have access to this bot.") and, where the channel supports it, follows up with a DM containing the longer `user_message`. The full `log_details` payload only reaches the host's structured logs / OpenTelemetry span — never the wire.
+The channel chooses a default `isolation_key` from Telegram-native data such as chat id, user id, or a configured user/chat scope. A `/new` or equivalent command may call `reset_session` for that isolation key.
-Built-in `user_message` defaults are intentionally bland and tenancy-free ("You don't have access to this bot." / "Please link your account to continue.") to avoid leaking who else is in the allowlist or which tenant gates it.
+### Activity Protocol
-##### v1 shipping surface
+`ActivityChannel` supports Activity Protocol requests, typically through Azure Bot Service for Teams, Web Chat, and other Bot Framework-fronted surfaces.
-The core PR includes the channel-neutral authorization and identity-linking seam; provider-specific linker packages (for example Entra OAuth helpers) plug into it without making the core package depend on an IdP SDK:
+The channel maps incoming `Activity` objects to `ChannelRequest` and renders a reply activity to the originating conversation. Proactive Activity delivery, active-channel routing, and all-linked fan-out are not v1 host semantics.
-- `IdentityAllowlist` Protocol + `AllowlistDecision` enum + `AuthorizationContext` dataclass.
-- `AllowAll`, `NativeIdAllowlist`, `LinkedClaimAllowlist`, `AnyOfAllowlists`, `AllOfAllowlists`, `CallableAllowlist` built-ins.
-- `IdentityLinker` Protocol, `LinkedIdentity`, and `LinkChallenge` core types. A linker resolves a channel-native identity in one call, returning either a linked identity with verified claims or a challenge for the channel to render.
-- `AuthorizationOutcome` (`Allowed` / `LinkRequired` / `Denied`) types.
-- `AuthPolicy` factory helpers on the public surface.
-- `Host(default_allowlist=..., identity_linker=...)` + per-channel `allowlist: ... | Literal["inherit"] | None` parameter and the construction-time config validator.
The validator enforces rules #1 (claim-source), **#2 (linker presence — channels with `require_link=True` must be paired with a configured `identity_linker`; otherwise a `ChannelConfigurationError` is raised at construction so misconfigurations cannot ship)**, and #3 (NativeIdAllowlist channel typo). Combinator walking (`AnyOf` / `AllOf`) is recursive so nested misconfigurations are caught at the host level. -- `host.authorize(identity, *, require_link, allowlist, verified_claims=None)` supports open, native-id allowlist, and claim allowlist profiles end-to-end. The open path returns `Allowed` with an auto-issued `:` isolation key (linear-scan registry lookup re-issues a known key when the identity has been seen before). Native-id allowlists return `Allowed`/`Denied` per the list. Claim-based allowlists use channel-emitted `verified_claims` when present; otherwise, when a linker is configured, the host returns `LinkRequired(challenge)` for unresolved identities or evaluates `LinkedClaimAllowlist` against the linker's verified claims for resolved identities. +### Discord +`DiscordChannel` supports Discord messages, slash commands, and interactions as channel-native input. +The channel maps Discord-native user, guild, channel, thread, and interaction data into `ChannelRequest` metadata and a configured `ChannelSession.isolation_key`. It renders the result to the originating Discord response path. -#### `LinkPolicy` and `confidentiality_tier` +## High-level Samples -**`LinkPolicy`** — host-level decision over which channels may share an `isolation_key` and which channels may be a `ResponseTarget` for one another. Consumed by both the `IdentityLinker` (to refuse incompatible link attempts) and the host's response-routing layer (to filter `all_linked` / `active` / specific destinations). 
+### One agent on Responses ```python -LinkPolicy = Callable[[LinkPolicyContext], bool] -``` - -`LinkPolicyContext` carries the originating `Channel` (and its `confidentiality_tier`), the prospective destination `Channel` (and its `confidentiality_tier`), and the operation kind (`"link"` or `"deliver"`). Returns `True` to allow, `False` to refuse. Refusal during `link` raises a typed error to the user; refusal during `deliver` excludes that destination from the route set (and falls back to `originating` if the route set becomes empty). - -| Built-in policy | Behavior | -|---|---| -| `AllowAllLinks()` | Default. Any pair allowed; preserves today's single-tier behavior. | -| `SameConfidentialityTierOnly()` | Only allows pairs whose `confidentiality_tier` matches (including both `None`). Most common multi-tier setup. | -| `ExplicitAllowList(allowed_pairs={("public", "corp"), ...})` | Allows only the listed `(source, target)` pairs. Useful for one-directional escalation flows. | -| `DenyAllLinks()` | Refuses every link attempt and excludes every non-`originating` destination — channels share an agent target on the host but never share sessions. Equivalent to running each channel on its own host minus the deployment overhead. | - -Confidentiality tiers are **opaque labels** — the host does not interpret them; the policy decides what they mean. Setting `confidentiality_tier=None` on every channel preserves single-tier behavior. Two separate hosts is always a valid alternative to using `LinkPolicy`; the policy exists for cases where shared deployment, shared middleware, or a shared target object are preferred over running multiple hosts. 
- -#### Multi-user conversations (Telegram groups, Teams group chats and channels) - -Telegram and Activity Protocol (Bot Service) both surface **multi-user conversations** alongside 1:1 chats — Telegram has private chats, groups, supergroups, forum topics inside supergroups, and broadcast channels; Activity Protocol has `conversationType` of `personal`, `groupChat`, and `channel` (a Teams team channel, with optional threaded `replyToId`). The hosting contract treats these uniformly, but channel implementations and host configuration both need to make a few explicit choices: - -**Identity vs. conversation are two axes, not one.** `ChannelIdentity.native_id` is always the **user** (`from.id` / `from.aadObjectId`); `ChannelRequest.conversation_id` is the **chat / channel / thread**. In 1:1 chats they collapse onto the same value (Telegram `chat.id == from.id`); in groups they don't and must not be conflated. The default `IdentityResolver` keys on `(channel, native_id)`, so a single user automatically gets one `isolation_key` whether they message in a group or in DM — that may or may not be what you want (see scoping below). - -**Conversation scoping policy.** A channel exposes a `conversation_scope` constructor option declaring how the host should derive the resolved `isolation_key` for multi-user surfaces. Three built-ins: - -| Scope | `isolation_key` derivation in multi-user conversations | When to pick it | -|---|---|---| -| `per_user` | The user's `isolation_key` from `IdentityResolver(ChannelIdentity)` only — group and DM share state. | Personal-assistant agents where the bot follows the user across surfaces and their preferences/memory should travel with them. Risky if the agent emits user-specific data in a public group. | -| `per_user_per_conversation` (default for multi-user) | `f"{user_isolation_key}:{conversation_id}"` — same user gets a different `isolation_key` per group / channel / topic / DM. | Default and safest. 
The agent's memory of a Teams team channel is separate from its memory of the same user's DM. | -| `per_conversation` | `f"_conv:{channel}:{conversation_id}"` — every member of the group shares one `isolation_key` and one `AgentSession`. The user identity is still attached to each turn (via `ChannelRequest.identity`) so the agent can address users by name, but session state is shared. | "Bot lives in this channel" deployments: meeting-notes bot, shared scratchpad, support-triage queue. | - -1:1 chats always derive `isolation_key` from the user identity alone — the per-user-per-conversation key would just include the user's own DM and add no isolation value. - -**Addressing rule.** Group surfaces typically don't want the bot replying to every message. Channels expose an `accept_in_group` constructor option: - -| Mode | Semantics | Default for | -|---|---|---| -| `mention_only` | Accept only messages that explicitly mention the bot (`@bot` for Telegram, `botname` mention entity for Teams). | Telegram groups, Teams `groupChat`, Teams team channels | -| `command_only` | Accept only registered `ChannelCommand` invocations (e.g. `/ask …`). | — | -| `mention_or_command` | Either of the above. | — | -| `all` | Accept every inbound message. | 1:1 chats; opt-in for groups when the agent really is the only conversational participant | - -Messages that don't satisfy the rule are ignored at the channel layer — no `ChannelRequest` is produced and the agent is never invoked. This is purely an inbound filter; outbound delivery (push / response routing) is unaffected. - -**Reply / `originating` routing.** The `originating` `ResponseTarget` always replies in the **same conversation** the request came from — including the same Teams team-channel thread (`replyToId`) or Telegram forum topic (`message_thread_id`). Channels carry the conversation-locator details on `ChannelRequest.conversation_id` (and additional fields on `ChannelRequest.attributes` when needed, e.g. 
`thread_id`); the channel's reply path reads them back. Channels that cannot reply in-thread (rare) fall back to a fresh top-level reply in the same conversation. - -**`ChannelPush` in groups.** When a non-`originating` `ResponseTarget` lands on a multi-user surface, the push must address a `(user, conversation)` pair: the host calls `ChannelPush.push(identity, payload)` where `identity.attributes` includes the recorded `conversation_id` (and thread/topic id when applicable) of the most recent observation under that scope. For `per_conversation` scope, every member's `ChannelIdentity` resolves to the same `isolation_key`, so the host instead picks the most recently observed `conversation_id` for that key and posts a single message to the conversation rather than fanning out to each user. - -**Linker ceremonies in groups.** OAuth and one-time-code link flows MUST NOT post the challenge URL or code into a group conversation visible to other users. Channels that support groups MUST detect group context (via `ChannelIdentity.attributes`) and, when `require_link=True` triggers a `LinkChallenge`, redirect the rendered challenge to the user's DM (Telegram: bot DM with the user; Teams: `personal` scope conversation with the same user). If a DM cannot be opened (Telegram user has not started the bot, Teams personal scope not installed), the channel returns a short prompt asking the user to DM the bot and retry. Verified-claim auto-link is unaffected — when a Teams `groupChat` request carries an AAD-verified `from.aadObjectId` that already matches an existing claim in the link store, the merge happens silently with no group-visible artifact. - -**Confidentiality tier interaction.** A Teams team channel post is visible to every member of the team; a 1:1 DM is not. Operators who care about the distinction MUST configure separate `Channel` instances (e.g. 
`ActivityChannel(scopes=["personal"], confidentiality_tier="user")` + `ActivityChannel(scopes=["channel", "groupChat"], confidentiality_tier="team")`) and apply a `LinkPolicy` so cross-tier `ResponseTarget` deliveries and identity links are filtered. The hosting layer does not infer tier from `conversationType`; it is an explicit deployment choice. - -**Telegram broadcast `Channel` (the Telegram product) and forum topics.** - -- *Broadcast Channels* — bots that are members of a Telegram broadcast Channel can post but generally do not receive user replies; treat as `ChannelPush`-only and configure with `accept_in_group="command_only"` so admin-issued commands (`/announce …`) are the only inbound trigger. Out of scope for v1; v1 ships group/supergroup support and leaves broadcast Channels for fast follow. -- *Forum topics* — supergroups with topics surface `message_thread_id`. The `TelegramChannel` populates `ChannelRequest.conversation_id` as `f"{chat_id}:{message_thread_id}"` so `per_user_per_conversation` and `per_conversation` scopes naturally separate topics from each other and from the group's general thread. - -**Activity Protocol specifics for `ActivityChannel`.** - -- `conversationType` mapping: `personal` → 1:1 (`accept_in_group="all"` rule applied), `groupChat` and `channel` → multi-user (default `mention_only`). -- Teams team channels carry both a channel id and an optional `replyToId`. The channel populates `conversation_id` as `f"{conversation.id}:{replyToId}"` when replying in-thread is desired (`per_user_per_conversation` scope makes thread-isolated sessions easy); deployments that prefer a single session per Teams channel can set `conversation_scope="per_conversation"` and the channel will key on `conversation.id` alone. -- `tenantId` is recorded on `ChannelIdentity.attributes` so multi-tenant deployments can implement an `IdentityResolver` that scopes `isolation_key` by tenant (or refuses unknown tenants). 
-- Adaptive-card submit (`Invoke` activities) flows are addressed in fast-follow alongside the `ActivityChannel` package; v1 of the host contract supports them via `ChannelRequest.forwarded_props`, so no host-level change is needed. - -**`ResponseTarget`** — directs **where** the host delivers the agent response. Independent of `session_mode`. - -| Variant | Constructor | Behavior | -|---|---|---| -| Originating | `ResponseTarget.originating` (default) | Synchronous response on the originating channel. | -| Active | `ResponseTarget.active` | Delivered to the channel most recently observed for the resolved `isolation_key`. | -| Specific channel (link-store recipient) | `ResponseTarget.channel("activity")` | Delivered via the named channel's `ChannelPush` to whichever channel-native identity is recorded for the resolved `isolation_key` in the link store. | -| Explicit identities | `ResponseTarget.identities([ChannelIdentity("telegram", native_id=""), ...])` | Delivered via each named channel's `ChannelPush` to the **caller-supplied channel-native identity** — bypasses the link store entirely. Used when the originating caller already knows the recipient's channel-native id (e.g. a server-side Responses caller relaying for a known user). The host still consults `LinkPolicy` for each delivery. Convenience alias: `ResponseTarget.identity(ChannelIdentity(...))` for the single-identity case. | -| Multiple channels | `ResponseTarget.channels(["telegram", "activity"])` | Delivered to each named channel (link-store recipient per channel). | -| All linked | `ResponseTarget.all_linked` | Delivered to every channel where the resolved `isolation_key` is known. | -| None | `ResponseTarget.none` | Background-only — caller must poll the `ContinuationToken`. Forces `background=True`. | - -`ResponseTarget` constructors that take at least one channel id (`.channel(...)`, `.channels([...])`, `.identities([...])`) accept an `echo_input: bool = False` kwarg. 
When true, the host pushes the **originating user's input** to each non-originating destination as a `HostedRunResult[AgentResponse]` whose underlying `messages[*].role == "user"` **before** the agent reply (whose `messages[*].role == "assistant"`). Used when the developer wants downstream channels to mirror what the user said so their UI stays coherent (e.g. a workflow originating on Telegram that pushes to Teams as well — the Teams transcript shows both turns). The echo and the response are bundled into the **same scheduled push task** per destination (the runner-managed unit of work — see [Intended targets + durable delivery](#intended-targets--durable-delivery)); the echo is dispatched first, and an echo-push failure is logged and swallowed inside the task so a channel that drops echoes still receives the agent reply. Both pushes go through the same `ChannelPush.push(identity, payload)` entry point — channels distinguish the echo phase from the response phase by inspecting `payload.result.messages[*].role`, or (for channels that wire a `response_hook`) by branching on `ChannelResponseContext.is_echo` directly. Channels that cannot impersonate the user on their wire (most chat bots can only send as the bot) typically render echoes as a quoted / prefixed block, drop them, or rewrite them via their `response_hook`. - -When `response_target` is anything other than `originating`, the originating channel's protocol response is the **`ContinuationToken`** (e.g. an Invocations 202 with the token in the response body and/or a polling URL header), and the actual agent response is delivered out-of-band via the destination channel(s)' `ChannelPush`. If the destination channel doesn't implement `ChannelPush`, the host falls back per the configured policy (default: deliver to `originating`; surfaces a warning in telemetry). The configured `LinkPolicy` is consulted for every destination — destinations that fail the policy (e.g. 
a corp-tier channel addressed from a public-tier originating request) are dropped, and if every destination is dropped the host falls back to `originating`. - -**`ChannelPush`** (Protocol) — optional capability for channels that can deliver outbound messages without a prior request. - -| Method | Type | Description | -|---|---|---| -| `push(identity: ChannelIdentity, payload: HostedRunResult)` | async | Proactively delivers a completed run result to the given channel-native identity (Telegram proactive message, Activity Protocol proactive message via Bot Service `continueConversation`, webhook callback, SSE broadcast). Channels implement this in addition to `Channel`; channels that cannot push omit it. | - -**`ContinuationToken`** — first-class artifact for asynchronous / background runs. - -| Field | Type | Description | -|---|---|---| -| `token` | `str` | Opaque, URL-safe continuation token. The only field channels expose to callers; all other fields are implementation detail of the host's `HostStateStore`. Stable for the lifetime of the run record (until expiry / eviction). | -| `status` | `Literal["queued", "running", "completed", "failed"]` | Current status. | -| `isolation_key` | `str?` | The resolved isolation key the run is associated with. | -| `created_at` | `datetime` | Submission time. | -| `completed_at` | `datetime?` | Set when status is `completed` or `failed`. | -| `result` | `HostedRunResult?` | Populated on `completed`. | -| `error` | `str?` | Populated on `failed`. | -| `response_target` | `ResponseTarget` | The configured delivery target (recorded for diagnostics). | - -The host stores `ContinuationToken`s through a `HostStateStore` (see [Host state storage](#host-state-storage)). The v1 default is **`FileHostStateStore`** — one JSON file per token under a configurable directory (default `./.af-hosting/continuations/`), written atomically (`.tmp` + `os.replace`) so a host crash mid-write doesn't corrupt the record. 
This means background runs **survive host restarts**: a caller that polls `/responses/{continuation_token}` after the process recycles still gets a valid status (and the result if the run had completed before the crash). Completed/failed entries are evicted by a configurable TTL (default 24h). `InMemoryHostStateStore` is available for tests / ephemeral hosts. Built-in channels expose poll routes that surface the token in their native shape (`/responses/{continuation_token}` returns a Responses-shaped object; `/invocations/{continuation_token}` returns the Invocations status envelope). - -#### Host state storage - -`HostStateStore` is the single persistence seam for **host-execution metadata** that needs to outlive a single request: continuation tokens, identity-link grants, and last-seen `(isolation_key, channel)` records. It is deliberately separate from `ContextProvider` (per-conversation context) and `CheckpointStorage` (workflow checkpoints) because the data shapes are structurally different — but a deployment MAY back all three with the same physical store. - -| Method | Purpose | -|---|---| -| `put_continuation(token: ContinuationToken)` / `get_continuation(token: str)` / `delete_continuation(token: str)` | Background-run records. | -| `put_link_grant(grant: LinkGrant)` / `get_link_grant(code: str)` / `consume_link_grant(code: str)` | Pending identity-link grants (Entra OAuth state, one-time codes). | -| `record_last_seen(isolation_key: str, channel: str, identity: ChannelIdentity, ts: datetime)` / `get_last_seen(isolation_key: str)` | Backs `ResponseTarget.active`. | - -V1 ships two implementations: - -- **`FileHostStateStore(directory: Path = "./.af-hosting/")`** — default; one JSON file per record under `continuations/`, `link_grants/`, plus a `last_seen.json` keyed by isolation key. Atomic writes; per-namespace TTL cleanup (continuations 24h, link grants 15min, last-seen 30d by default). 
Suitable for single-node hosts and dev; works in hosted-agent environments where the working directory is persisted and isolated per agent. -- **`InMemoryHostStateStore()`** — testing / ephemeral; same protocol, no persistence. - -Pluggable v1-fast-follow implementations (Cosmos, SQL, Redis) plug into the same protocol — see req #24. - -In the Python core package, the host-level `state_dir` shorthand reserves a -`links` component for this identity-link store. Passing a single path derives -`state_dir/links/`; the `HostStatePaths` mapping form accepts `links=...` for -placing link-store data on a separate volume. The core host offers that path to -identity linkers that implement `SupportsLinkStorePath`; linkers that own a -provider-specific store can ignore it and be configured directly. - -**`ChannelCommand` / `ChannelCommandContext` / `CommandHandler`** — cross-channel native command model (per PR #5393). - -| Type | Fields | Description | -|---|---|---| -| `ChannelCommand` | `name`, `description`, `handle`, `expose_in_ui=True`, `metadata={}` | Transport-neutral command descriptor. | -| `ChannelCommandContext` | `session`, `state`, `raw_event`, `reply(...)`, `run(request)` | Runtime context for command handlers. | -| `CommandHandler` | `Callable[[ChannelCommandContext], Awaitable[None] \| None]` | Command implementation; may reply locally, mutate state, or invoke the agent. | - -**`HostedRunResult` / `HostedStreamResult`** — outbound results from the host. - -| Type | Fields | Description | -|---|---|---| -| `HostedRunResult[TResult]` | `result: TResult`, `session: AgentSession \| None` | One-shot outcome. 
`result` carries the **target's full-fidelity output unchanged**: `HostedRunResult[AgentResponse]` for agent targets (channels read `result.messages`, `result.text`, `result.value`, `result.response_id`, `result.usage_details`, … directly off the underlying response), `HostedRunResult[WorkflowRunResult]` for workflow targets (channels iterate `result.get_outputs()` and inspect `result.get_final_state()`). The host never pre-shapes, flattens, or filters — multi-modality and structured outputs survive end-to-end and each channel (through its `response_hook` and its native serializer) decides what subset its wire renders. The echo-input phase synthesises a `HostedRunResult[AgentResponse]` wrapping the originating user turn so the same delivery machinery applies. `session` carries the resolved per-isolation_key `AgentSession` (`None` for workflows, which do not own session state in the agent sense). Treat instances as immutable — the host clones per-destination via `result.replace(result=...)` before invoking each channel's `response_hook`; `replace()` is shallow, so channels that need to mutate `result` itself are responsible for their own deep copy. | -| `HostedStreamResult` | `updates: ResponseStream[...]`, `raw_events: AsyncIterable[Any] \| None`, `session: AgentSession?` | Streaming outcome. `updates` is the **normalized** stream of `AgentRunResponseUpdate` (lossless for messages, function calls, usage) and is the happy path for Responses, Invocations, Telegram, and most channels. `raw_events` is an optional **passthrough seam** onto the underlying agent event stream (before update normalization) for channels whose protocol carries domain events the framework does not model — e.g. AG-UI's `StateSnapshotEvent` / `StateDeltaEvent` / `ToolCallStartEvent`. Channels that consume `raw_events` bear responsibility for the full event translation; the request still flows through `context.stream(...)` so session resolution, identity, push, and policy continue to apply.
`None` when the host has no raw upstream (e.g. a workflow-only target produced from cached events). | - -The host does **not** emit protocol events directly — channels translate `HostedRunResult`/`HostedStreamResult` into Responses events, Invocations SSE, webhook callbacks, or platform messages. - -**`ChannelResponseHook` / `ChannelResponseContext`** — dev-supplied post-processing seam applied per destination before push. - -| Type | Shape | Description | -|---|---|---| -| `ChannelResponseHook` | `Callable[[HostedRunResult[Any], *, context: ChannelResponseContext], HostedRunResult[Any] \| Awaitable[HostedRunResult[Any]]]` | Stored as a `response_hook` attribute on a channel instance — **duck-typed**, not part of the `Channel` Protocol. Receives a per-destination clone of the `HostedRunResult` and returns a (possibly rewritten) replacement. Hooks rebind `result` via `HostedRunResult.replace(result=...)` rather than mutating it in place. Common uses: flatten multi-modal output to text for a text-only wire, filter out tool-call contents, project a workflow `WorkflowRunResult` into a channel-friendly `AgentResponse` for text-only channels, attach citation entities, choose between an Adaptive Card and a plain-text presentation. The hook signature stays `Any`-typed in the envelope's `TResult` so a single channel can serve both agent (`HostedRunResult[AgentResponse]`) and workflow (`HostedRunResult[WorkflowRunResult]`) payloads; channels narrow at hook entry if they want static checking. | -| `ChannelResponseContext` | `request: ChannelRequest`, `channel_name: str`, `destination_identity: ChannelIdentity`, `originating: bool`, `is_echo: bool` | Per-destination context passed to a hook. `originating=False` for push deliveries (current scope of the host's `_deliver_response`); `is_echo=True` when this invocation is for the `ResponseTarget.echo_input` user-message phase rather than the agent reply phase.
| -| `apply_response_hook(hook, result, *, context)` | helper | Standardised invocation convention so channels (and the host's delivery layer) all call hooks the same way. | - -The host runs each destination's hook on a **cloned** `HostedRunResult`, so a hook that rebinds `result` cannot leak into the payload another destination observes. The clone is shallow — channels that need to mutate `result` itself (rather than rebind it via `replace()`) are responsible for their own deep copy. - - - -### Built-in channel constructors - -```python -class ResponsesChannel(Channel): - def __init__( - self, - *, - path: str = "/responses", - run_hook: ChannelRunHook | None = None, - expose_conversations: bool = True, - transports: Sequence[Literal["http", "websocket"]] = ("http",), - websocket_path: str = "/ws", - options: object | None = None, - ) -> None: ... - -class InvocationsChannel(Channel): - def __init__( - self, - *, - path: str = "/invocations", - run_hook: ChannelRunHook | None = None, - openapi_spec: dict[str, Any] | None = None, - ) -> None: ... - -class TelegramChannel(Channel): - def __init__( - self, - *, - bot_token: str, - transport: Literal["webhook", "polling"] = "webhook", - path: str = "/telegram", - run_hook: ChannelRunHook | None = None, - commands: Sequence[ChannelCommand] = (), - register_native_commands: bool = True, - require_link: bool = False, - ) -> None: ... -``` - -`options` on `ResponsesChannel` is intentionally loosely typed in this draft because the option-mapping boundary is still settling. If it becomes a formal type later, it should be Agent Framework-owned, not imported from `agentserver`. - -#### Conversation history for the Responses channel - -The Responses channel does **not** introduce its own history seam. Conversation history for every channel — Responses, Invocations, Telegram, Activity Protocol — flows through the agent's standard core `HistoryProvider` (`agent_framework._sessions.HistoryProvider`). 
The Responses channel is a *caller-supplied session* channel (see [Channel session-carriage models](#channel-session-carriage-models)): it parses `previous_response_id` (and/or `conversation_id`) off the inbound request and projects it into `ChannelSession.key`. The host then resolves an `AgentSession` for that key and the agent's `HistoryProvider` does the load / append exactly as it would for any other session. - -```text -POST /responses { "previous_response_id": "resp_018f…", "input": [...] } - -> ResponsesChannel parses previous_response_id - -> ChannelRequest.session = ChannelSession(key="resp_018f…") - -> host resolves AgentSession(id="resp_018f…") - -> agent.HistoryProvider.load_messages(session=…) # if load_messages=True - -> agent.run(input, session=…) - -> agent.HistoryProvider.save_messages(session=…, new_messages) - -> ResponsesChannel serializes the result with response_id="resp_018f…+1" -``` - -This means **any** AF `HistoryProvider` backs Responses out of the box — `FileHistoryProvider`, an in-memory provider, a future `CosmosHistoryProvider`, etc. The wire `previous_response_id` is just a session id with channel-defined formatting; nothing in the provider has to know "this is a Responses session". - -##### The Responses `store` parameter - -The OpenAI Responses API exposes a `store` boolean on every request. Its meaning in the official SDK is "service-side: persist this response so a later call can reference it via `previous_response_id`." In the hosting world this gets more interesting because there are **three** independent places a turn can end up persisted: - -- **Service-side** — the upstream provider's response store (e.g. OpenAI's hosted response store, accessible by `previous_response_id` against that provider directly). Controlled by the `store` flag on the agent's underlying `ChatClient` at construction time. 
-- **Hosted-agent storage** — the `HistoryProvider`(s) attached to the agent (`FileHistoryProvider`, `FoundryHostedAgentHistoryProvider`, in-memory, dual-write, …). Controlled by the host's `session_mode` directive, which `run_hook` can rewrite per request. -- **Caller-side** — the API caller keeps the `response_id` returned by the host and chains future calls with `previous_response_id`. Always available; out of host scope. - -These axes are **independent**. The same wire `store` value can land in any combination of them — or none — depending on (a) how the developer assembled the agent (`HistoryProvider` attached or not? `ChatClient` configured with its own `store=True` or not?) and (b) what the channel's `run_hook` does with the value. **The point of the matrix below is that `store` does not have a single canonical meaning at the hosted-agent layer — the developer of the hosted agent decides what it means.** - -| Caller sends | **Service-side** (underlying `ChatClient`'s own `store`) | **Hosted-agent storage** (agent's `HistoryProvider`) | **Caller-side** (caller chains `previous_response_id`) | -|---|---|---|---| -| `store=true` (or omitted; OpenAI default is `true`) | Writes **iff** the `ChatClient` was constructed to honor `store=true` against the upstream service. The host forwards the wire value into the chat client's options but does not look at it itself. | **Default:** loads and writes via the configured `HistoryProvider` (`session_mode="auto"`).
**Developer overrides** (via `run_hook`): `session_mode="disabled"` to suppress (compliance hold, ephemeral one-shots); `session_mode="required"` to fail closed if no session can be resolved instead of auto-issuing. | Always available — the host returns a chained `response_id` the caller may keep and re-send as `previous_response_id`. | -| `store=false` | Typically suppresses the service-side write — but the exact behavior depends on the `ChatClient` (some providers ignore the per-request flag, some honor it, some require a different opt-out). The host does not interpret it on the chat client's behalf. | **Default:** **still loads and writes** via the configured `HistoryProvider` — `store=false` is **not** auto-translated into a session-disable. The `HistoryProvider` is configured on the agent for app-level reasons (audit, replay, multi-channel continuity) the API caller has no business unilaterally overriding.
**Developer overrides** (via `run_hook`): `session_mode="disabled"` to **honor caller intent** (the path most apps that expose `store=false` as a real "stateless" guarantee will take); `session_mode="required"` (Scenario 3) to **ignore caller intent** and force host-managed sessions; conditional rules (e.g. honor `store=false` only from internal callers). | Always available — and the default fallback when both server-side surfaces are suppressed. | - -The same `store=false` request can therefore end up persisted in: - -- **service-side only** (chat client ignores the flag → service-side write happens; `HistoryProvider` not attached → no hosted-agent write; caller keeps `response_id`), -- **hosted-agent storage only** (chat client honors the flag → no service-side write; `HistoryProvider` attached and `run_hook` does not override → host writes anyway), -- **both** (chat client ignores the flag → service-side write happens; `HistoryProvider` attached and not overridden → hosted-agent write also happens), -- **neither** (chat client honors the flag and `run_hook` translates it into `session_mode="disabled"` → only the caller's local copy exists). - -Three design properties fall out of this: - -1. **`store` is forwarded, not auto-mapped to host policy.** The caller's `store` value is forwarded into the chat client's options (where the upstream provider's own `store` semantics apply), but it is **not** translated into a `session_mode` directive against the agent's `HistoryProvider` by default. Collapsing the two — for example to make `store=false` a real end-to-end "stateless" guarantee — is an explicit developer choice expressed in `run_hook`. -2. 
**Documenting `store` semantics is a per-deployment responsibility.** Because the resolved persistence depends on three independent developer decisions, the meaning of `store=true` / `store=false` against any given hosted agent is something the deployment **must document for its callers** — there is no framework-level guarantee beyond "the wire value is forwarded to the chat client, and the host's `HistoryProvider` runs by default unless `run_hook` says otherwise." -3. **Richer storage vocabulary via `extra_body`.** A single boolean is often too coarse to express what a deployment actually wants to offer. The OpenAI Responses request envelope supports an `extra_body` mapping (the official Python SDK exposes it on every call as a passthrough into the request JSON); the `ResponsesChannel` parses unknown body keys onto `ChannelRequest.attributes`, so `run_hook` can read deployment-specific knobs from there and translate them into `session_mode`, the chat client's `store` flag, or anything else. Examples a deployment might expose: `extra_body={"af_store": "audit_only"}` to write to the `HistoryProvider` but suppress the service-side mirror; `{"af_store": "ephemeral"}` to skip both server-side surfaces; `{"af_store": "replay_safe"}` to force `session_mode="required"` and reject calls without a resolvable session. The framework does not standardize these names — they are part of the deployment's documented contract with its callers, on top of the standard `store` flag. - -##### `FoundryHostedAgentHistoryProvider` — Foundry-backed history - -For users who want the conversation persisted in the **same Foundry response store** that `azure.ai.agentserver.responses.store._foundry_provider.FoundryStorageProvider` writes to (so e.g. 
Foundry Workbench can replay the conversation, or other Foundry tools can introspect it), a new provider is added — proposed name `FoundryHostedAgentHistoryProvider` — implementing the standard `HistoryProvider` Protocol and built **on top of** the Foundry response-store SDK that ships in `azure.ai.agentserver` (so the wire contract, auth, and isolation headers stay aligned with the SDK without re-implementation). Shipped in `agent-framework-foundry-hosting`, attached the same way any other history provider is attached to an agent: - -```python -agent = Agent( - client=client, - history_provider=FoundryHostedAgentHistoryProvider( - endpoint=os.environ["FOUNDRY_ENDPOINT"], - load_messages=True, - ), -) - -host = AgentFrameworkHost(target=agent, channels=[ResponsesChannel()]) -``` - -The provider implements the standard `HistoryProvider` interface — there is no Responses-specific Protocol in between. It is also valid for any other channel (Telegram, Invocations, …) — Foundry storage simply becomes the chosen backend. - -Foundry's storage backend keys writes off two platform-injected request headers (`x-agent-user-isolation-key`, `x-agent-chat-isolation-key`) rather than the request body. The Responses and Invocations channels parse both headers off the inbound request and forward them as an opaque mapping on `ChannelRequest.attributes["isolation"]` (`{"user_key", "chat_key"}`); the host's per-request `bind_request_context` then passes that value to `FoundryHostedAgentHistoryProvider.bind_request_context(isolation=...)`, which the provider applies to its storage calls. Channels never import `IsolationContext`; the provider accepts both an `IsolationContext` instance and a plain mapping. When the headers are absent (local dev outside the Hosted Agents runtime) the attribute is omitted and storage falls back to non-isolated reads/writes, so the same code path works in both environments. 
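The header-to-attribute mapping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the channel's actual implementation: the helper name `isolation_attributes` is hypothetical, and including only the keys whose headers are present is an assumption. Only the two header names and the `"isolation"`, `user_key`, and `chat_key` names come from the text.

```python
from collections.abc import Mapping

# Header names as given in the paragraph above.
USER_HEADER = "x-agent-user-isolation-key"
CHAT_HEADER = "x-agent-chat-isolation-key"


def isolation_attributes(headers: Mapping[str, str]) -> dict[str, dict[str, str]]:
    """Hypothetical helper: build the attributes fragment for one request."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}

    isolation: dict[str, str] = {}
    if USER_HEADER in lowered:
        isolation["user_key"] = lowered[USER_HEADER]
    if CHAT_HEADER in lowered:
        isolation["chat_key"] = lowered[CHAT_HEADER]

    # Assumption: omit the attribute entirely when neither header is present
    # (local dev), so providers fall back to non-isolated reads and writes.
    return {"isolation": isolation} if isolation else {}
```

With both headers supplied, the helper returns a mapping with a single `"isolation"` entry holding `user_key` and `chat_key`; with no headers it returns an empty mapping, which corresponds to the "attribute is omitted" case for local development.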
- -##### Multi-provider composition - -The existing AF convention applies: an agent may compose **multiple** `HistoryProvider`s, but **only one** carries `load_messages=True`. Common patterns: - -- *Single store.* `FileHistoryProvider(load_messages=True)` — local dev. Or `FoundryHostedAgentHistoryProvider(load_messages=True)` — Foundry-backed prod. -- *Audit dual-write.* `FoundryHostedAgentHistoryProvider(load_messages=True)` + `CosmosHistoryProvider(load_messages=False)` — Foundry is the source of truth used to reconstruct context for the LLM; Cosmos receives a write-only audit copy. -- *Mirror to Foundry for Workbench replay only.* Conversely, an in-house store can hold `load_messages=True` while `FoundryHostedAgentHistoryProvider(load_messages=False)` mirrors writes into Foundry purely so the conversation shows up in Foundry tooling. - -The choice of where to store, and whether to dual-write, is fully the developer's. The channel does not need to know which backing store(s) the agent is using. - -#### Channel-owned per-thread state - -Some channel protocols carry **non-message** durable state attached to the conversation — most notably AG-UI's per-thread `state` object, mutated mid-stream via `StateSnapshotEvent` / `StateDeltaEvent` (JSON-Patch-shaped) and read by the front-end on the next turn. This is *not* message history, so it does not belong on `HistoryProvider`; but it has the same lifetime, isolation, and "opaque to the host" properties as messages, so the framework already has the right primitive: **`ContextProvider`**. - -`HistoryProvider` is only one concrete `ContextProvider` (the one that uses the per-source `state: dict[str, Any]` slot to hold messages). Channels with non-message per-thread state SHOULD ship their own `ContextProvider` subclass and write into the same per-source `state` slot. 
- -Sketch (for AG-UI; the same pattern applies to any event-rich front-end): - -```python -from agent_framework import ContextProvider, ContextProviderState - -class AgUiStateProvider(ContextProvider): - """Per-thread non-message state for AG-UI front-ends. - - Persists the AG-UI ``state`` object scoped by ``source_id`` (the - AgentSession id). Reads from ``ChannelRequest.client_state`` before - the run, exposes the current value to the agent via Context, and lets - the channel diff it after the run to emit StateSnapshotEvent / - StateDeltaEvent on the wire. - """ - - state_key = "ag_ui_state" # slot in the per-source state dict - - async def before_run(self, context, *, source_id, **kw): - slot = context.state.setdefault(source_id, {}) - # If the request supplied a fresh client_state, seed/replace it. - if (incoming := context.request.client_state) is not None: - slot[self.state_key] = dict(incoming) - # Expose the live value to the agent (e.g. into context.metadata). - - async def after_run(self, context, *, source_id, **kw): - # The current value lives in context.state[source_id][self.state_key]; - # the channel reads it and emits StateSnapshotEvent / StateDeltaEvent. - ... -``` - -Composition rules are unchanged: one `HistoryProvider` carries `load_messages=True`, additional `ContextProvider`s (including `AgUiStateProvider`) attach alongside. Backing storage is whatever the user wires — in-memory for dev, the same physical store as messages for prod. **No new storage protocol is introduced for channel state**; it shares the same per-source state slot that `HistoryProvider` uses. - -#### Storage taxonomy - -To make the picture explicit: there are exactly three distinct *storage seams* in the hosting design, each with a clear scope. The first two are usually backed by the same physical store the user wires; they stay distinct as protocols because the data shapes differ. 
- -| Seam | Scope | Examples | -|---|---|---| -| **`ContextProvider`** (per-conversation) | Per-`source_id` data the agent needs at run time. Messages (via `HistoryProvider`), AG-UI per-thread state (via `AgUiStateProvider`), or any future per-conversation extension. **The only public per-conversation seam.** | `FileHistoryProvider`, `FoundryHostedAgentHistoryProvider`, `AgUiStateProvider` | -| **Host-level pluggable store** (per-host) | `ContinuationToken`s for background runs, identity-link grants, last-seen `(isolation_key, channel)` records. **File-based by default** in v1 (`FileHostStateStore`, atomic JSON writes under `./.af-hosting/`); `InMemoryHostStateStore` for tests; pluggable for Cosmos / SQL / Redis adapters in v1 fast follow (req #24). MAY be backed by the same physical store as `ContextProvider`, but the protocol is distinct because the data is host-execution metadata, not per-conversation context. | `FileHostStateStore` (v1 default), `InMemoryHostStateStore`, future Cosmos / SQL / Redis adapters | -| **`CheckpointStorage`** (workflow runtime) | Workflow executor frames so a workflow can resume after process restart. Structurally distinct from both seams above (the data is workflow-runtime state, not session/identity state). MAY share a physical backend, but the protocol stays separate. | `FileCheckpointStorage`, future `CosmosCheckpointStorage` | - -Concretely, this means an app deploying onto e.g. Foundry storage can run **all three** against the same Foundry backend and still have three orthogonal protocol surfaces — one per concern — instead of one universal store everything accidentally collides in. - -Channels surface per-request transport state (response ids, isolation keys, future signals) on `ChannelRequest.attributes`; the host's `bind_request_context` forwards those attributes as kwargs to each `ContextProvider.bind_request_context` call so providers can apply them to their reads and writes. 
Providers SHOULD accept `**_` to ignore unknown attributes for forward-compat. This keeps channel↔provider coupling to a documented attribute name (e.g. `"isolation"`) instead of requiring providers to install ASGI middleware. - -The `ResponsesChannel` exposes both an HTTP transport (`{path}/v1/...`) and an optional **WebSocket transport** (`{path}{websocket_path}`, default `/responses/ws`) controlled by `transports`. The WS transport carries the same Responses request/event model as the HTTP+SSE variant — clients open a single connection per conversation and send/receive Responses frames as JSON messages. Both transports go through the same `run_hook`, the same default mapping, and the same `ChannelRequest` shape; the channel codec is responsible for framing only. Auth is reused from the HTTP transport (Authorization header on the `Upgrade` request); subprotocol negotiation is open (see Open Questions). - -### Default invocation behavior by channel - -Each built-in channel owns a **default** mapping from its protocol request model into a `ChannelRequest`. That mapping flows through the optional `run_hook` before the host resolves session behavior and invokes the target. - -| Channel | Default mapping | -|---|---| -| `ResponsesChannel` | Forwards relevant caller settings (e.g. `temperature`, `store`) into `ChannelRequest.options` so the underlying chat client receives them; **does not** map `store=false` to `session_mode="disabled"` by default — see [The Responses store parameter](#the-responses-store-parameter) for the full matrix and the developer-override path. The same default mapping is used for both HTTP and WebSocket transports — WS frames are decoded into the same Responses request model before invocation. | -| `InvocationsChannel` | Maps the request body into `input`, `options`, and session behavior for the hosted target. | -| `TelegramChannel` | Maps incoming messages or commands into `input`, `stream`, and session defaults appropriate for the chat. 
| - -### ASGI server portability - -The hosting architecture is coupled to **ASGI/Starlette**, not to **Uvicorn** specifically. - -- `host.app` is the canonical portability surface. -- `host.serve(...)` is only the default convenience path (lazy-imports `uvicorn`). -- Because `host.app` is a standard Starlette/ASGI app, it can run on Hypercorn, Daphne, Granian, or Gunicorn-with-Uvicorn-workers. -- ASGI **WebSocket** scope/frames are first-class: any channel may contribute `WebSocketRoute`s alongside HTTP routes, and the chosen ASGI server must support the WebSocket scope (Uvicorn, Hypercorn, Daphne, and Granian all do). - -The packaging question for `uvicorn` (required dependency vs optional extra) is therefore a **convenience choice**, not an architectural constraint. See Open Questions. - -### Error Responses - -| Status | Condition | Notes | -|---|---|---| -| `400 Bad Request` | Channel-specific protocol validation failure | Owned by the channel codec. | -| `401 Unauthorized` / `403 Forbidden` | Channel-specific auth/signature validation failure | Owned by channel middleware (e.g. Telegram secret token, Invocations auth). | -| `404 Not Found` | Route not contributed by any channel | Standard Starlette behavior. | -| `409 Conflict` | Session-resolution conflict with `session_mode="required"` and no resolvable session | Host-level. | -| `422 Unprocessable Entity` | `run_hook` raised a validation error | Channel surfaces the hook's error per protocol conventions. | - -## Terminology - -- **Host** (`AgentFrameworkHost`): The Python object that owns one Starlette app, one **hostable target** (an agent or a workflow), and a sequence of channels. Provides `host.app` (canonical ASGI surface) and `host.serve(...)` (uvicorn convenience). Named `AgentFrameworkHost` rather than `AgentHost` because the target is not restricted to agents. -- **Hostable target**: The executable object the host fronts — either a `SupportsAgentRun`-compatible agent or a `Workflow`. 
The host detects the kind and dispatches to the appropriate execution seam; channels remain unchanged. -- **Channel**: A pluggable component that contributes routes, middleware, commands, and lifecycle hooks to a host. One channel = one external protocol surface (Responses, Invocations, Telegram, …). Used interchangeably with "head" in earlier discussions; **Channel** is the canonical name. -- **`ChannelRequest`**: The host-neutral, normalized invocation envelope produced by a channel before the host calls the target's execution seam. Carries `input`, `options`, `session`, `session_mode`, and channel-specific `attributes`. -- **`ChannelSession`**: A small session hint with a stable `key`, an optional protocol-visible `conversation_id`, and an opaque `isolation_key`. The host resolves it into an `AgentSession`; storage specifics are deferred. -- **`isolation_key`**: An opaque partition boundary aligned with hosted-agent terminology — may represent a user, tenant, chat, or other scope without baking direct identity semantics into the generic host. -- **Channel-native identity** (`ChannelIdentity`): The **user/account** identifier the channel observes from its own platform (Telegram `from.id`, Teams `from.aadObjectId`, WhatsApp phone number, Slack user id). Always per-channel; never assumed to align across channels. Distinct from the **conversation locator** (`ChannelRequest.conversation_id` / `ChannelSession.conversation_id`) — in multi-user surfaces (Telegram groups, Teams group chats and channels) the two never coincide. See [Multi-user conversations](#multi-user-conversations-telegram-groups-teams-group-chats-and-channels). -- **`IdentityResolver`**: Host-level callable that maps a `ChannelIdentity` to an `isolation_key`. 
The default resolver **auto-issues** a fresh, stable `isolation_key` the first time a `(channel, native_id)` pair is seen and persists it in the host's identity store, so every end user automatically gets a per-user partition on first contact through any channel — without app code. Linking (see `IdentityLinker`) **merges** the second channel's auto-issued key onto the first channel's `isolation_key`, so cross-channel continuity is a one-shot operation, not a per-channel mapping hook. Apps that already own an identity namespace (corporate user id, tenant-scoped account id) can supply a custom resolver that returns those values directly. -- **`IdentityLinker`**: Host-level component that runs a connect ceremony — typically OAuth, MFA, or a signed one-time code — to associate a new `ChannelIdentity` with an existing `isolation_key`. Contributes its own routes (e.g. OAuth callback) and lifecycle to the host. A built-in `link`/`connect` `ChannelCommand` is exposed automatically when one is configured. On successful ceremony completion, also stores any verified IdP claim recovered from the proof (e.g. Entra ID `oid`) so subsequent channels that supply the same claim can be auto-merged onto the same `isolation_key` silently. Combined with `Channel(require_link=True)`, this enables an "authenticate before chatting" enforcement model where the first channel forces the OAuth ceremony and every other channel using the same IdP joins the same session without a second `/link`. -- **`LinkChallenge`**: The protocol-neutral artifact returned by `IdentityLinker.begin(...)` describing what the user must do to complete the ceremony — typically one of: a URL to visit (OAuth), a short code to enter on the other channel (one-time code), or an MFA prompt. -- **`ResponseTarget`**: Per-request directive on `ChannelRequest` controlling **where** the response is delivered: `originating` (default), `active`, a specific channel, a list of channels, `all_linked`, or `none`. 
Independent of `session_mode`. -- **`ChannelPush`**: Optional channel capability for proactive outbound delivery — Telegram proactive message, Activity Protocol proactive message via Azure Bot Service, webhook callback, SSE broadcast. Required to be the destination of a non-`originating` `ResponseTarget`. -- **Active channel**: The channel most recently observed for a given `isolation_key`. Tracked by the host on every successfully resolved request; consumed by `ResponseTarget.active`. -- **`ContinuationToken`**: First-class artifact for background/asynchronous runs, returned immediately from `host.run_in_background(request)`. Carries an opaque, URL-safe `token` plus `status`, `isolation_key`, `result`/`error`, and the configured `response_target`. Persisted via `HostStateStore` (file-based by default in v1) so background runs survive host restarts. Host pushes the result to the response target when ready and serves it via channel poll routes. -- **Background run**: A `ChannelRequest` submitted via `host.run_in_background(request)` (or any request with `background=True`). The originating call returns a `ContinuationToken` immediately; the response is delivered later via the configured `ResponseTarget` and/or polled by token. -- **`HostStateStore`**: Single persistence seam for host-execution metadata — continuation tokens, identity-link grants, last-seen records. V1 default `FileHostStateStore` (atomic JSON writes under `./.af-hosting/`); `InMemoryHostStateStore` for tests; pluggable for Cosmos / SQL / Redis (fast follow, req #24). Distinct from `ContextProvider` (per-conversation) and `CheckpointStorage` (workflow), but a deployment MAY back all three with the same physical store. -- **`session_mode`**: Per-request directive (`auto` | `required` | `disabled`) that controls whether the host resolves a session before invoking the target. Lets `run_hook`s express explicit policy — e.g. 
translating Responses `store=false` into `session_mode="disabled"` to honor the caller's "don't store" intent at the `HistoryProvider` layer (the channel does not do this automatically — see [The Responses store parameter](#the-responses-store-parameter)). -- **`confidentiality_tier`** (channel-level): Opaque label (`"corp"`, `"public"`, `"internal"`, …) declared on a `Channel` and consumed by the host's `LinkPolicy`. Two channels with different confidentiality tiers can share an agent target on one host while remaining session-isolated. -- **`LinkPolicy`**: Host-level decision over which channel pairs may share an `isolation_key` (link) and which channel pairs may be `ResponseTarget` source/destination for one another (deliver). Built-in variants: allow-all (default), same-tier-only, explicit allow-list, deny-all. See [LinkPolicy and confidentiality_tier](#linkpolicy-and-confidentiality_tier) for the full contract and built-ins table. -- **`ChannelContribution`**: What a channel returns from `contribute(...)` — routes, middleware, commands, and `on_startup`/`on_shutdown` hooks. The host aggregates contributions into one Starlette app. -- **`ChannelCommand`**: A transport-neutral command descriptor (`name`, `description`, `handle`). Message channels project these into native command surfaces — Telegram bot commands, future Activity Protocol slash commands / adaptive cards, WhatsApp menus. -- **`ChannelRunHook`**: Per-request callable on built-in channels. Runs after the channel's default `ChannelRequest` is produced, before session resolution. The escape hatch for forcing or forbidding session use, requiring extra options, adapting to targets like `A2AAgent`, **and** reshaping a channel's free-form input into the typed inputs a `Workflow` target expects. -- **Native command registration**: The startup-time projection of `ChannelCommand` metadata into a platform's native command catalog (e.g. Telegram `set_my_commands(...)`). 
-- **`SupportsAgentRun`**: The existing framework agent execution seam (`run(..., session=..., stream=...)`) — the contract the host uses when the hostable target is an agent.
-- **`Workflow`**: The framework workflow execution seam — the contract the host uses when the hostable target is a workflow. The host wraps the workflow's outputs into the same `HostedRunResult` / `HostedStreamResult` shape so channels do not need to distinguish.
-
-## Runtime modes
-
-The host runs in one of two operational shapes, declared (or auto-detected) via a single `runtime_mode` parameter. The parameter is **advisory** — it sets defaults for the seams below; the developer can override any individual choice.
-
-```python
-AgentFrameworkHost(
-    target=my_agent,
-    channels=[...],
-    runtime_mode=None,  # None → auto-detect; "long_running" | "ephemeral" to force
-)
-```
-
-| Value | Shape | When to use |
-|---|---|---|
-| `"long_running"` | Always-on container / process. Owns its own scheduler. Survives across many requests. | Local dev, OpenClaw-style hosted deployments, classic web-app rollouts on AKS / App Service / Container Apps. |
-| `"ephemeral"` | Scale-to-zero / per-request lifecycle. Process may terminate between requests; cold-start cost on each one. | Foundry Hosted Agent, Azure Functions consumption plan, AWS Lambda, and similar serverless runtimes. |
-| `None` (default) | Auto-detect. The host inspects environment markers at construction; falls back to `"long_running"` when nothing is detected. | The default. Recommended for portable code that works locally and ships to a serverless target. |
-
-**Auto-detection.** When `runtime_mode=None`, the host checks for known deployment markers in this order and picks `"ephemeral"` on the first hit:
-
-| Marker | Meaning |
-|---|---|
-| `FOUNDRY_HOSTING_ENVIRONMENT` (env var) | Running inside Foundry Hosted Agent. |
-| `AZURE_FUNCTIONS_ENVIRONMENT` (env var) | Running inside the Azure Functions worker. |
-| `AWS_LAMBDA_FUNCTION_NAME` (env var) | Running inside an AWS Lambda. |
-
-If none of the markers match, the host defaults to `"long_running"` (a sensible local-dev / container default). Additional markers may be added without bumping the API; the list is documented and overridable via the `runtime_mode` parameter itself.
-
-**Defaults selected by mode.** The mode drives the *default selection* for these seams. Each is independently overridable:
-
-| Concern | `"long_running"` default | `"ephemeral"` default |
-|---|---|---|
-| `HostStateStore` | `InMemoryHostStateStore` (process owns state) | `FileHostStateStore` (atomic JSON under `./.af-hosting/`; survives single-node restart) |
-| `ContinuationToken` persistence | In-memory acceptable | Persistence required (file / Cosmos / Foundry) |
-| `DurableTaskRunner` | `InProcessTaskRunner` (asyncio + bounded retry) | Adapter expected (`agent-framework-hosting-durabletask`, Foundry, …); falls back to `InProcessTaskRunner` with a startup warning when none configured |
-| Background runs (req #14) | Owned by the long-running worker via `InProcessTaskRunner` | Hand off to the durable runner so the process can terminate between requests |
-| Channel polling (e.g. Telegram `getUpdates`) | Natural fit — `on_startup` spawns the poller, `on_shutdown` cancels it | Requires an external scheduled trigger or webhook transport; polling channels emit a startup warning when paired with `"ephemeral"` |
-| `IdentityLinker` short-lived grants | In-memory TTL fine | Must persist via `HostStateStore` |
-| `IdentityAllowlist` lookup | In-memory cache fine | Persisted source or external IdP claim resolution |
-| Health checks + readiness probes | First-class | Less relevant — runtime manages liveness |
-| Per-channel polling-worker isolation | Important — leaks compound over days/weeks (see [`channels_vs_openclaw.md`](../../python/.user/channels_vs_openclaw.md)) | N/A — process recycles between requests |
-| Process-recycle expectations | Days/weeks | Per-request |
-| Memory/leak concerns | Important | Less relevant |
-
-**Detection failures.** Auto-detection is best-effort. If a deployment uses a custom runtime not in the marker list, callers SHOULD set `runtime_mode="ephemeral"` (or `"long_running"`) explicitly. The host logs the detected mode at startup so misdetection is visible in normal operation.
-
-**Why advisory and not enforced.** Most knobs make sense in both modes (e.g. a developer running a "long-running" container may still want `FileHostStateStore` for state durability across deploys); enforcing strict defaults per mode would force every override to fight a config error. The selected defaults are a starting point.
-
-## Hero Code Samples
-
-> **Common prerequisite:** Every sample below calls `host.serve(...)`, which lazy-imports `uvicorn`. Install `uvicorn` (e.g. `pip install uvicorn`) — or the corresponding `agent-framework-hosting[serve]` extra if the package ships one (see Open Question #2) — alongside the per-sample dependencies listed in each scenario's **Prerequisites** block. Samples that use `host.app` directly (handed to Hypercorn/Daphne/Granian/Gunicorn+uvicorn workers) do not require `uvicorn`.
- -### Scenario 1: Expose one agent on the Responses API - -A developer has an agent and wants to expose it as the OpenAI-compatible Responses API on `localhost:8000` with no manual server bootstrap. - -> **Prerequisites:** This sample assumes: -> - `agent-framework-hosting` and `agent-framework-hosting-responses` are installed -> - An `OPENAI_API_KEY` is available in the environment - -```python -from agent_framework import Agent -from agent_framework.openai import OpenAIChatClient -from agent_framework.hosting import AgentFrameworkHost, ResponsesChannel - -agent = Agent( - name="WeatherAgent", - instructions="You are a helpful weather agent.", - client=OpenAIChatClient(model="gpt-4.1-mini"), -) - host = AgentFrameworkHost( target=agent, channels=[ResponsesChannel()], ) -if __name__ == "__main__": - host.serve(host="localhost", port=8000) +app = host.app ``` -This exposes the Responses route under `/responses`. No manual `uvicorn` import, no protocol handlers written by the user. - -### Scenario 2: Expose Responses + Invocations on one host with shared Starlette middleware - -Same agent, both protocols, with CORS applied at the host level. 
- -> **Prerequisites:** This sample assumes: -> - `agent-framework-hosting`, `-responses`, and `-invocations` are installed -> - A Foundry project with a `gpt-4.1` model deployment +### One agent on multiple channels ```python -from azure.identity import AzureCliCredential -from starlette.middleware import Middleware -from starlette.middleware.cors import CORSMiddleware - -from agent_framework import Agent -from agent_framework.foundry import FoundryChatClient -from agent_framework.hosting import AgentFrameworkHost, InvocationsChannel, ResponsesChannel - -agent = Agent( - name="TravelAgent", - instructions="Help users plan travel and keep answers concise.", - client=FoundryChatClient( - project_endpoint="https://my-project.services.ai.azure.com/api/projects/travel", - model="gpt-4.1", - credential=AzureCliCredential(), - ), -) - host = AgentFrameworkHost( target=agent, channels=[ - ResponsesChannel(), # -> /responses - InvocationsChannel(), # -> /invocations - ], - middleware=[ - Middleware( - CORSMiddleware, - allow_origins=["https://chat.contoso.com"], - allow_methods=["*"], - allow_headers=["*"], - ), + ResponsesChannel(), + InvocationsChannel(), + TelegramChannel(bot_token=os.environ["TELEGRAM_BOT_TOKEN"]), ], ) -# Hand the canonical ASGI app to any server, or use the convenience method. -app = host.app # for Hypercorn / Granian / Gunicorn+uvicorn workers -host.serve(host="localhost", port=8000) -``` - -### Scenario 3: Per-request run hook on the Responses channel - -The developer wants to enforce that every Responses call sets `temperature`, and to **harden** session handling so that `session_mode="required"` (fail if no session can be resolved) — explicitly ignoring caller `store=false` since the channel's default already keeps the agent's `HistoryProvider` active regardless of that wire flag (see [The Responses store parameter](#the-responses-store-parameter)). None of this is part of the official Responses spec, but all of it is valid app policy. 
- -> **Prerequisites:** This sample assumes: -> - The Responses channel is wired into an `AgentFrameworkHost` (see Scenario 1) - -```python -from dataclasses import replace - -from agent_framework.hosting import ( - AgentFrameworkHost, - ChannelRequest, - ResponsesChannel, -) - - -def responses_policy(request: ChannelRequest, **kwargs) -> ChannelRequest: - if request.options is None or request.options.temperature is None: - raise ValueError("This host requires temperature on every Responses call.") - - # Harden session handling: even when the caller sends store=false, keep host-managed - # sessions and fail closed instead of auto-issuing. The HistoryProvider would already - # run under the default "auto" mode; "required" upgrades that to a hard error if no - # session can be resolved (e.g. missing previous_response_id and no resolver match). - return replace(request, session_mode="required") - - -host = AgentFrameworkHost( - target=agent, - channels=[ResponsesChannel(run_hook=responses_policy)], -) host.serve(host="localhost", port=8000) ``` -The hook runs **after** the channel produces its default `ChannelRequest` and **before** the host resolves session behavior and calls `SupportsAgentRun.run(...)`. The same shape works to adapt to targets like `A2AAgent` — strip or remap channel-derived options that the target does not consume. - -### Scenario 4: Telegram channel with native command catalog (polling) - -A developer wants to expose the same agent as a Telegram bot, with first-class native commands (`/start`, `/new`, `/sessions`, …) registered into Telegram's command menu at startup. Modeled after PR #5393. 
- -> **Prerequisites:** This sample assumes: -> - `agent-framework-hosting-telegram` is installed -> - `TELEGRAM_BOT_TOKEN` is set in the environment - -```python -import os - -from agent_framework.hosting import ( - AgentFrameworkHost, - ChannelCommand, - ChannelCommandContext, - TelegramChannel, -) - - -async def handle_start(context: ChannelCommandContext) -> None: - await context.reply( - "Hi! Commands: /new, /sessions, /todo, /memories, /reminders, " - "/resume, /cancel, /reasoning, /tokens." - ) - - -async def handle_noop(context: ChannelCommandContext) -> None: - await context.reply("Command received.") - - -TELEGRAM_COMMANDS = [ - ChannelCommand("start", "Introduce the bot", handle_start), - ChannelCommand("new", "Start a new local session", handle_noop), - ChannelCommand("sessions", "List local sessions", handle_noop), - ChannelCommand("todo", "List todos for the active session", handle_noop), - ChannelCommand("memories", "List memory topics for the active session", handle_noop), - ChannelCommand("reminders", "List reminders for the active session", handle_noop), - ChannelCommand("resume", "Resume the latest pending or previous session", handle_noop), - ChannelCommand("cancel", "Cancel the active response", handle_noop), - ChannelCommand("reasoning", "Toggle the transient reasoning preview", handle_noop), - ChannelCommand("tokens", "Toggle token usage details", handle_noop), -] - -telegram = TelegramChannel( - bot_token=os.environ["TELEGRAM_BOT_TOKEN"], - transport="polling", - commands=TELEGRAM_COMMANDS, - register_native_commands=True, -) - -host = AgentFrameworkHost(target=agent, channels=[telegram]) -host.serve(host="localhost", port=8000) -``` - -This mirrors the important shape from PR #5393: command metadata is declared once, the channel registers it into Telegram's native menu at startup (`set_my_commands(...)`), and runtime command dispatch stays channel-local. 
-
-### Scenario 5: Telegram webhook mode on the same host as Responses + Invocations
-
-Same agent, three channels, one Starlette app, one process.
-
-> **Prerequisites:** Same as Scenario 4, plus a public HTTPS URL for the webhook.
-
-```python
-host = AgentFrameworkHost(
-    target=agent,
-    channels=[
-        ResponsesChannel(),  # -> /responses
-        InvocationsChannel(),  # -> /invocations
-        TelegramChannel(
-            bot_token=os.environ["TELEGRAM_BOT_TOKEN"],
-            transport="webhook",  # -> /telegram/webhook
-            commands=TELEGRAM_COMMANDS,
-        ),
-    ],
-)
-
-host.serve(host="0.0.0.0", port=8000)
-```
-
-Webhook transport contributes `/telegram/webhook` by default; the command catalog remains identical to the polling sample.
-
-### Scenario 6: Linking a new channel to an existing identity via OAuth
-
-A developer wants every Telegram chat to be **authenticated up front** via OAuth (Microsoft Entra ID) before the agent will respond, and wants Teams chats from the same Entra ID user to be **auto-linked** to the existing session — no second `/link` ceremony, just sign in once on the first channel and the rest follow automatically. This delivers cross-channel chat continuity as a side-effect of identity linking; Scenario 7 covers the alternative pattern where a trusted server-side relay supplies identity directly without a link ceremony.
-
-```mermaid
-sequenceDiagram
-    autonumber
-    actor User
-    participant Tg as TelegramChannel
-    participant Host
-    participant Linker as IdentityLinker<br/>(EntraOAuth)
-    participant IdP as OAuth provider
-    participant Store as HostStateStore<br/>(identity_links)
-    participant Act as ActivityChannel
-
-    User->>Tg: /link
-    Tg->>Host: ChannelRequest(identity=tg:12345)
-    Host->>Linker: begin(identity=tg:12345)
-    Linker-->>Host: LinkChallenge(url, state)
-    Host->>Tg: response_hook → push challenge URL
-    Tg-->>User: "click here to sign in"
-
-    User->>IdP: sign in (browser)
-    IdP-->>User: redirect with code
-    User->>Linker: /callback?code=…&state=…
-    Linker->>IdP: exchange code → tokens + claims
-    IdP-->>Linker: claims (oid, email, …)
-    Linker->>Store: persist link(tg:12345 ↔ linked_claims)
-    Linker-->>User: "linked ✅"
-
-    Note over User: later, on Teams (same Entra OID)
-
-    User->>Act: hello
-    Act->>Host: ChannelRequest(identity=teams-aad-oid)
-    Host->>Store: lookup linked_claims by oid
-    Store-->>Host: existing isolation_key (matches tg:12345)
-    Host->>Host: same session as Telegram
-```
-
-> **Prerequisites:** This sample assumes:
-> - `agent-framework-hosting`, `agent-framework-hosting-telegram`, and the (future) `agent-framework-hosting-activity` channel are installed
-> - An OAuth provider is configured (Microsoft Entra ID in this example)
-
-```python
-import os
-
-from agent_framework.hosting import (
-    AgentFrameworkHost,
-    OAuthIdentityLinker,
-    TelegramChannel,
-)
-
-
-# The OAuth linker contributes its own /identity/oauth/microsoft/{start,callback}
-# routes to the host. On successful completion, the host's built-in identity
-# store atomically records BOTH the originating channel-native identity AND the
-# verified IdP claim (Entra ID object id) so future channels that authenticate
-# the same IdP account can auto-link without a second ceremony.
-linker = OAuthIdentityLinker( - provider="microsoft", - client_id=os.environ["AAD_CLIENT_ID"], - client_secret=os.environ["AAD_CLIENT_SECRET"], -) - -host = AgentFrameworkHost( - target=agent, - identity_linker=linker, - channels=[ - # require_link=True gates the channel: any inbound message from an - # un-linked ChannelIdentity is short-circuited to a LinkChallenge reply - # instead of being dispatched to the agent. - TelegramChannel( - bot_token=os.environ["TELEGRAM_BOT_TOKEN"], - transport="webhook", - require_link=True, - ), - # ActivityChannel(app_id=..., require_link=True), # future — same flag - ], -) -host.serve(host="0.0.0.0", port=8000) -``` - -The flow: - -1. `alice` sends her first message on Telegram. The `TelegramChannel` extracts `ChannelIdentity(channel="telegram", native_id="")` and asks the linker `is_linked(...)`. It is not. Because `require_link=True`, the channel does **not** invoke the agent; instead it asks `linker.begin(channel_identity)` for a `LinkChallenge`, renders the challenge URL into Telegram (clickable button), and returns. -2. `alice` clicks the button, signs in with Microsoft Entra ID, and the OAuth callback hits the linker's route. `linker.complete(...)` verifies the authorization code and records **two things atomically** in the identity store: - - `(channel="telegram", native_id="") → isolation_key="hk_018f…a3"` - - `verified_claim("microsoft.oid", "") → isolation_key="hk_018f…a3"` -3. `alice` replies on Telegram. The channel sees the link is now present, resolves the existing `isolation_key`, and forwards the message to the agent normally. From here on, Telegram chats are routed without further ceremony. -4. The next day, `alice` opens Teams. The `ActivityChannel` extracts both the channel-native identity (`activity`, ``) **and** the verified IdP claim from the inbound activity (Teams already authenticates with Entra ID via Bot Service, so the AAD object id is trusted). It asks the linker `is_linked(...)`. 
The `(activity, )` pair is **not** in the store — but the verified claim `("microsoft.oid", "")` **is**. The linker auto-merges `(activity, ) → isolation_key="hk_018f…a3"` without any user-visible `/link` ceremony. -5. From the next turn on, both Telegram and Teams resolve to the **same** `isolation_key` and the **same** `AgentSession`. The agent sees the conversation history from both channels as one continuous thread. - -The two enabling pieces: - -- **`require_link: bool` on the channel** — when `True`, the channel checks the linker before dispatching every inbound request. Un-linked identities are short-circuited to a rendered `LinkChallenge` instead of an agent invocation. Default is `False` (the opportunistic flow below). -- **Verified IdP claims in the linker's identity store** — when an OAuth ceremony completes, the linker records the verified identity claim (e.g. `(microsoft.oid, )`) alongside the channel-native identity. Channels that can supply the same kind of verified claim from their own auth context (Teams via the AAD bearer on the activity, future M365 channels via the same bearer, …) get **auto-linked silently** on first contact when their claim matches an existing entry. This is what makes "sign in once on Telegram, Teams just works" possible without any per-channel link ceremony. - -**Variant — opportunistic linking (`require_link=False`).** Leave the flag at its default and the channel will dispatch un-linked identities straight to the agent (the host's default resolver auto-issues a fresh `isolation_key` for them). The user can later run the `link` `ChannelCommand` manually to merge that auto-issued key onto an existing one. This is the lower-friction onboarding flow at the cost of allowing pre-link conversations to exist in their own isolated session until merged. 
- -**Variant — alternative ceremony.** Swapping the linker for `OneTimeCodeIdentityLinker(...)` changes the ceremony to "complete `/link` on channel A, get a 6-digit code, run `/link 482931` on channel B"; with `require_link=True` the channel just renders the code-entry instructions instead of an OAuth URL. Apps with their own corporate identity namespace can additionally pass a custom `identity_resolver` so the post-link `isolation_key` is the corporate user id instead of the host-issued opaque key. Channels themselves are unchanged across these variants — only the linker and (optionally) the resolver change. - -### Scenario 7: Trusted server-side caller relays a Responses request and pushes the answer back to the user's Telegram chat - -A developer runs an internal application server that already knows its end users (e.g. via an SSO session) and wants to expose **two surfaces against the same agent**: the OpenAI-compatible **Responses API** (so the application backend can drive the agent programmatically on behalf of the signed-in user) and **Telegram** (so the same end user can also chat with the agent directly). When the application backend submits a Responses call, it should be possible to (a) link that call to the same `isolation_key` as the user's existing Telegram chats — so the agent sees one continuous conversation history — and optionally (b) have the agent's response pushed back to the user's Telegram chat instead of (or in addition to) being returned synchronously on the Responses HTTP call. - -This works **without** an `IdentityLinker` because the application backend is a **trusted relay**: it already authenticated the user through its own SSO and knows both the user's app-internal id and (because the user has previously connected their Telegram account in the application's own settings page) the user's Telegram `chat_id`. The host just needs to be told. 
-
-```mermaid
-sequenceDiagram
-    autonumber
-    actor Backend as Server-side backend
-    participant Resp as ResponsesChannel
-    participant Host
-    participant Hook as run_hook<br/>(responses_relay_hook)
-    participant Store as HostStateStore<br/>(continuations)
-    participant Target as Agent
-    participant Runner as DurableTaskRunner
-    participant Tg as TelegramChannel
-
-    Backend->>Resp: POST /v1/responses<br/>extra_body.hosting.push_to_telegram_chat_id=
-    Resp->>Host: ChannelRequest(...)
-    Host->>Hook: run_hook(request, context)
-    Hook->>Hook: rewrite to<br/>background=True,<br/>response_target=identities([tg:])
-    Host->>Store: write continuation(token, status=in_progress)
-    Host-->>Resp: ContinuationToken (token)
-    Resp-->>Backend: 200 with continuation token
-
-    Note over Host,Target: background task
-
-    Host->>Target: run (async)
-    Target-->>Host: AgentResponse
-    Host->>Store: continuation.complete(token, result)
-    Host->>Runner: schedule("hosting.push",<br/>payload for tg:)
-    Runner->>Host: _handle_push_task(payload)
-    Host->>Tg: response_hook → push
-    Tg-->>User: answer arrives in Telegram chat
-```
-
-> **Prerequisites:** This sample assumes:
-> - `agent-framework-hosting`, `agent-framework-hosting-responses`, and `agent-framework-hosting-telegram` are installed
-> - The application backend can attach two extra fields to its Responses call: an `app_user_id` (the user's stable id in the application's own namespace) and, optionally, a `push_to_telegram_chat_id` (the user's known Telegram chat id from the application's own database)
-
-```python
-import os
-from dataclasses import replace
-
-from agent_framework.hosting import (
-    AgentFrameworkHost,
-    ChannelIdentity,
-    ChannelRequest,
-    IdentityResolver,
-    ResponseTarget,
-    ResponsesChannel,
-    TelegramChannel,
-)
-
-
-# A custom identity resolver that promotes the app's own user id to the
-# isolation_key whenever a channel can supply one. The Telegram channel exposes
-# the chat_id (pre-registered in the application's settings page → so the
-# application maps chat_id → app_user_id and tells the host); the Responses
-# channel exposes the app_user_id directly via extra_body (see run_hook below).
-async def app_identity_resolver(identity: ChannelIdentity, **_) -> str | None:
-    # Both channels populate ChannelIdentity.attributes["app_user_id"] — see
-    # the run hooks below.
-    return identity.attributes.get("app_user_id")
-
-
-# Telegram channel maps Telegram chat_id → app_user_id from the application's
-# pre-registered chat-id table. Cached locally; in real apps this is whatever
-# lookup matches the application's own user-account schema.
-KNOWN_TELEGRAM_USERS: dict[str, str] = {
-    "": "user_alice",
-    # ...
-} - - -async def telegram_promote_app_user(request: ChannelRequest, **_) -> ChannelRequest: - chat_id = request.identity.native_id - app_user_id = KNOWN_TELEGRAM_USERS.get(chat_id) - if app_user_id is None: - return request # falls back to host's auto-issued isolation_key - return replace( - request, - identity=replace( - request.identity, - attributes={**request.identity.attributes, "app_user_id": app_user_id}, - ), - ) - - -# The application backend POSTs to /responses with -# -# { -# "model": "...", -# "input": "...", -# "extra_body": { -# "hosting": { -# "app_user_id": "user_alice", # who this request is for -# "push_to_telegram_chat_id": "", # optional -# } -# } -# } -# -# The Responses channel surfaces extra_body["hosting"] on -# ChannelRequest.attributes["hosting"]; this run_hook reads it and rewrites -# both the identity (so the request resolves to the same isolation_key as the -# user's Telegram chats) and the response_target (so the answer is pushed to -# Telegram in addition to / instead of the synchronous Responses reply). -async def responses_relay_hook(request: ChannelRequest, **_) -> ChannelRequest: - hosting = request.attributes.get("hosting", {}) - app_user_id = hosting.get("app_user_id") - push_chat_id = hosting.get("push_to_telegram_chat_id") - - if app_user_id is None: - return request # plain Responses call, no relay → keep defaults - - # Promote app_user_id onto the identity so the resolver returns it as - # isolation_key. - new_identity = replace( - request.identity, - attributes={**request.identity.attributes, "app_user_id": app_user_id}, - ) - - # If the caller also supplied a Telegram chat id, push the answer there - # via ResponseTarget.identities (explicit recipient — bypasses the link - # store, which is empty for this user since no link ceremony ran). The - # Responses HTTP call returns a ContinuationToken so the application - # backend can correlate. 
- if push_chat_id: - return replace( - request, - identity=new_identity, - response_target=ResponseTarget.identities([ - ChannelIdentity(channel="telegram", native_id=push_chat_id), - ]), - background=True, - ) - - return replace(request, identity=new_identity) - - -host = AgentFrameworkHost( - target=agent, - identity_resolver=IdentityResolver(app_identity_resolver), - channels=[ - ResponsesChannel(run_hook=responses_relay_hook), - TelegramChannel( - bot_token=os.environ["TELEGRAM_BOT_TOKEN"], - transport="webhook", - run_hook=telegram_promote_app_user, - ), - ], -) -host.serve(host="0.0.0.0", port=8000) -``` - -The flow: - -1. Alice has previously connected her Telegram account on the application's settings page; the application stored `chat_id_of_alice → user_alice` in `KNOWN_TELEGRAM_USERS` (a real deployment uses a database). -2. Alice opens the application's web UI and types a question. The application backend (signed in as `user_alice`) calls the Responses API mounted on this host with `extra_body={"hosting": {"app_user_id": "user_alice"}}` (and no `push_to_telegram_chat_id`). The `responses_relay_hook` promotes `app_user_id` onto the identity, the resolver returns `isolation_key="user_alice"`, the agent runs, and the answer is returned synchronously over HTTP. The agent's `HistoryProvider` appends both turns to the session keyed by `user_alice`. -3. Later, Alice messages the same agent on Telegram from her registered chat. The Telegram channel's `run_hook` promotes `app_user_id="user_alice"` onto the identity (because her chat_id is in the known-users table), the resolver returns the **same** `isolation_key="user_alice"`, the agent loads the **same** session — and sees the earlier turn from the web UI. **One continuous conversation across two channels, no link ceremony required, no `IdentityLinker` configured.** -4. Now Alice walks away from her desk. 
The application backend wants to fire a long-running task on her behalf and have the answer reach her on Telegram. It calls the Responses API with `extra_body={"hosting": {"app_user_id": "user_alice", "push_to_telegram_chat_id": ""}}`. The `responses_relay_hook` rewrites the request to `background=True` and `response_target=ResponseTarget.identities([ChannelIdentity("telegram", "")])`. The Responses HTTP call returns a `ContinuationToken` immediately (so the application backend can correlate); when the agent completes, the host calls `TelegramChannel.push(ChannelIdentity("telegram", ""), result)` and the answer arrives in Alice's Telegram chat. - -The two enabling pieces: - -- **`extra_body["hosting"]` as a developer-controlled relay envelope.** The Responses channel surfaces an opaque `hosting` block from `extra_body` onto `ChannelRequest.attributes["hosting"]`. The hosting core does **not** define what goes in there — the developer decides what their trusted backend may carry (here `app_user_id` and `push_to_telegram_chat_id`) and reads it in their `run_hook`. This is the same pattern the `store=` table calls out for richer per-call control. -- **`ResponseTarget.identities([...])` for explicit caller-known recipients.** This bypasses the link store and pushes to a channel-native identity the caller already knows. Use it when the originating caller is a trusted relay that authenticated the user through some other means (corporate SSO, an internal API key bound to a user) and just needs the host to dispatch. `LinkPolicy` is still consulted per delivery, so a corp-tier Responses call cannot smuggle a public-tier Telegram push if the policy disallows it. +The host owns one Starlette app. Each channel contributes its own routes and renders its own response. 
-**Variant — same scenario with an `IdentityLinker` configured.** If the host *does* have an `IdentityLinker` (Scenario 6), the application backend doesn't need to maintain its own `chat_id → app_user_id` table at all: when Alice runs `/link` once on Telegram, the linker records the channel-native identity against `isolation_key="user_alice"` (resolved from the Entra OAuth claim that matches the application's own SSO). After that, the run hook can simply use `ResponseTarget.channel("telegram")` (link-store recipient) instead of `ResponseTarget.identities([...])`. The explicit-identities variant remains useful when the application owns identity end-to-end and prefers not to delegate to a host-level linker. - -### Scenario 8: Background run with cross-channel response delivery - -A developer wants the user to start a long-running task on Telegram and pick up the response on Teams (whichever channel the user happens to be on when the result is ready). The originating Telegram message returns a `ContinuationToken` immediately; when the agent completes, the host pushes the result to the user's currently active channel via `ChannelPush`. A poll route is also exposed for callers that prefer polling. - -```mermaid -sequenceDiagram - autonumber - actor User - participant Tg as TelegramChannel - participant Host - participant Hook as run_hook - participant Store as HostStateStore<br/>(continuations · last_seen) - participant Target as Agent - participant Runner as DurableTaskRunner - participant Act as ActivityChannel - - User->>Tg: long-running ask - Tg->>Host: ChannelRequest(identity=tg:12345) - Host->>Hook: run_hook - Hook->>Hook: background=True,<br/>response_target=active - Host->>Store: write continuation(in_progress) - Host-->>Tg: ContinuationToken - Tg-->>User: "working on it…" - - Note over User: user opens Teams,<br/>last_seen updates to "activity" - - User->>Act: hello on Teams - Act->>Host: ChannelRequest(identity=teams-aad-oid) - Host->>Store: record_last_seen(isolation_key, activity, now) - - Note over Host,Target: background completes - - Target-->>Host: AgentResponse - Host->>Store: get_last_seen(isolation_key) → activity - Host->>Runner: schedule("hosting.push",<br/>payload for activity) - Runner->>Host: _handle_push_task - Host->>Act: push - Act-->>User: answer arrives on Teams -``` - -> **Prerequisites:** This sample assumes: -> - `agent-framework-hosting`, `agent-framework-hosting-telegram`, and the (future) `agent-framework-hosting-activity` channel are installed -> - The user is already linked across Telegram and Teams (Scenario 6) +### Adapting a request before execution ```python -import os from dataclasses import replace -from agent_framework.hosting import ( - AgentFrameworkHost, - ChannelRequest, - ResponseTarget, - TelegramChannel, -) - -# Override the Telegram channel default: any inbound message becomes a -# background run delivered to the user's currently active channel. -async def telegram_background(request: ChannelRequest, **kwargs) -> ChannelRequest: - return replace( - request, - background=True, - response_target=ResponseTarget.active, - ) +def enforce_options(request: ChannelRequest) -> ChannelRequest: + options = dict(request.options or {}) + options["temperature"] = 0 + return replace(request, options=options) host = AgentFrameworkHost( target=agent, - identity_linker=linker, # from Scenario 6 - channels=[ - TelegramChannel( - bot_token=os.environ["TELEGRAM_BOT_TOKEN"], - transport="webhook", - run_hook=telegram_background, - ), - # ActivityChannel(...), # future - ], + channels=[ResponsesChannel(run_hook=enforce_options)], ) -host.serve(host="0.0.0.0", port=8000) -``` - -The flow: - -1. `alice` sends a Telegram message that triggers a long-running tool. The Telegram channel produces a `ChannelRequest`; the hook flips `background=True` and sets `response_target=ResponseTarget.active`. -2. `host.run_in_background(request)` returns a `ContinuationToken(token="ct_018f…", status="queued")`. The Telegram channel acknowledges with a short "Working on it…" reply that includes the token (it could equally render a "Cancel" inline button bound to the token). -3. The host runs the target asynchronously.
When complete, it resolves `ResponseTarget.active` against the host-tracked last-seen channel for `isolation_key="alice@contoso.com"`. If `alice` is currently on Teams, the host calls `ActivityChannel.push(channel_identity, hosted_run_result)`; if she is still on Telegram, it calls `TelegramChannel.push(...)` (so the same setup gracefully degrades to "reply on Telegram if she never switched"). -4. `ContinuationToken` is updated to `status="completed"` with the populated `result`. Any caller can poll `GET /telegram/runs/{continuation_token}` (or the equivalent route the channel exposes) to retrieve the run state by id. - -Variants without changing channel code: - -- `ResponseTarget.channel("activity")` — always deliver to Teams, regardless of where the user is. -- `ResponseTarget.all_linked` — broadcast to every channel `alice` has linked. -- `ResponseTarget.none` — fully detached: caller polls `host.get_continuation(token)` (or the channel's poll route); no proactive push. -- `background=False` with `response_target=ResponseTarget.active` — synchronous wait, but result still routed away from the originating channel (rare; mostly useful for pipelines where the originating call is a programmatic trigger and the human user lives elsewhere). - -If the chosen destination channel does not implement `ChannelPush` (e.g. Responses), the host falls back to the `originating` channel and records the fallback in telemetry. This makes the Responses + background-run combo work as "submit on Responses, poll on Responses" without surprising silent drops. - -### Scenario 9: Hosting a `Workflow` instead of an agent (with checkpoint storage) - -The host shape is unchanged when the target is a `Workflow`; the result wrapper narrows to `HostedRunResult[WorkflowRunResult]` and `response_hook` carries the projection that lets text-only channels render workflow output. 
- -```mermaid -sequenceDiagram - autonumber - actor User - participant Channel - participant Host - participant Workflow - participant Store as HostStateStore<br/>(workflow_checkpoints) - participant Hook as response_hook<br/>(per-channel) - - User->>Channel: message - Channel->>Host: ChannelRequest - Host->>Workflow: run(messages) - loop per executor / event - Workflow->>Store: write checkpoint - Workflow-->>Host: WorkflowEvent - end - Workflow-->>Host: WorkflowRunResult - Host->>Host: wrap → HostedRunResult[WorkflowRunResult]<br/>(get_outputs, get_final_state, …) - - alt text-only channel - Host->>Hook: response_hook(result, context) - Hook->>Hook: result.replace(<br/>result=AgentResponse(<br/>text=workflow.get_outputs()[-1])) - Hook-->>Host: HostedRunResult[AgentResponse] - Host->>Channel: push (sync or via runner) - else card-capable channel - Host->>Hook: response_hook(result, context) - Hook->>Hook: render adaptive card<br/>from workflow get_outputs - Hook-->>Host: HostedRunResult[Any] - Host->>Channel: push - end - Channel-->>User: reply (channel-native) ``` -> **Prerequisites:** This sample assumes: -> - `agent-framework-hosting` and `agent-framework-hosting-invocations` are installed -> - A `Workflow` definition with typed inputs (`OrderIntakeInputs`) -> - A directory writable by the host process for workflow checkpoints +### Workflow with explicit checkpoints ```python -from dataclasses import dataclass, replace -from pathlib import Path - -from agent_framework import FileCheckpointStorage, WorkflowBuilder -from agent_framework.hosting import ( - AgentFrameworkHost, - ChannelRequest, - InvocationsChannel, -) - - -@dataclass -class OrderIntakeInputs: - customer_id: str - sku: str - quantity: int - - -# Build the workflow with a CheckpointStorage so individual executor frames -# are persisted as the workflow runs. FileCheckpointStorage writes one file -# per checkpoint under the configured directory; survives host restarts. -checkpoint_storage = FileCheckpointStorage(directory=Path("./.af-hosting/checkpoints/")) - -workflow = ( - WorkflowBuilder(checkpoint_storage=checkpoint_storage) - .add_executor(...) # application-defined - .build() -) - - -def adapt_to_workflow_inputs(request: ChannelRequest, *, protocol_request=None, **kwargs) -> ChannelRequest: - # The channel produces a default ChannelRequest with text input. The workflow - # needs typed OrderIntakeInputs — the hook is the adapter point. The same - # hook is the place to surface a caller-supplied checkpoint id (to resume - # an interrupted run) by promoting it onto request.attributes; the host's - # workflow dispatch reads it on the way to Workflow.run(...).
- payload = protocol_request # raw Invocations request body - inputs = OrderIntakeInputs( - customer_id=payload["customer_id"], - sku=payload["sku"], - quantity=int(payload["quantity"]), - ) - new_attrs = dict(request.attributes) - if checkpoint_id := payload.get("resume_from_checkpoint"): - new_attrs["workflow.checkpoint_id"] = checkpoint_id - return replace(request, input=inputs, attributes=new_attrs) - - host = AgentFrameworkHost( target=workflow, - channels=[ - InvocationsChannel(run_hook=adapt_to_workflow_inputs), - ], + channels=[InvocationsChannel(run_hook=adapt_to_workflow_input)], + checkpoint_location=Path("./.af-hosting/workflow_checkpoints"), ) -host.serve(host="localhost", port=8000) ``` -The host detects that `target` is a `Workflow` and dispatches the resulting `ChannelRequest.input` to `Workflow.run(...)` instead of `SupportsAgentRun.run(...)`. The channel does not need to know which kind of target it is fronting — `HostedRunResult` and `HostedStreamResult` are normalized across both seams. The same workflow target could equally be exposed on Telegram or a Responses channel by supplying the appropriate `run_hook` to translate inbound chat messages into typed workflow inputs. +The hook adapts channel-native input to the workflow's typed input. Checkpoints use the explicit workflow checkpoint location, not identity-link or delivery storage. -**Checkpoint storage** is wired onto the workflow itself (via `WorkflowBuilder(checkpoint_storage=...)` or per-run via `Workflow.run(..., checkpoint_storage=...)`), **not** on the host. The host treats it as workflow-runtime state — structurally distinct from the `HostStateStore` (which persists `ContinuationToken`s, identity-link grants, and last-seen records — host-execution metadata, not workflow internals) and from `ContextProvider` (per-conversation context). All three protocols stay separate, but a deployment MAY back them with the same physical store. 
When `request.attributes["workflow.checkpoint_id"]` is set (as the run hook does above when the caller supplies `resume_from_checkpoint`), the host's workflow dispatch path passes it through to `Workflow.run(checkpoint_id=...)` so the workflow resumes from that frame instead of running from scratch — useful for long-running intake flows that survive host restarts or retries. - -### Scenario 10: Authoring a new channel package - -The shape any new channel follows: parse external protocol → produce default `ChannelRequest` → optionally apply hook → `context.run(...)` / `context.stream(...)` → serialize back. +### Message channel reset command ```python -from starlette.requests import Request -from starlette.responses import JSONResponse -from starlette.routing import Route - -from agent_framework.hosting import ( - Channel, - ChannelContext, - ChannelContribution, - ChannelRequest, - ChannelSession, -) - - -class MyWebhookChannel: - name = "mywebhook" - - def __init__(self, *, path: str = "/mywebhook") -> None: - self._path = path - - def contribute(self, context: ChannelContext) -> ChannelContribution: - async def endpoint(request: Request) -> JSONResponse: - payload = await request.json() - channel_request = ChannelRequest( - channel=self.name, - operation="message.create", - input=payload["text"], - session=ChannelSession( - key=payload["thread_id"], - isolation_key=payload["account_id"], - ), - ) - result = await context.run(channel_request) - # See "Result is rich, not just text" below — `result.text` is the - # plain-text projection; this channel chooses to also surface - # citations and any tool-call traces it cares about. The exact - # serialization is the channel's call. 
- return JSONResponse(_render_for_mywebhook(result)) - - return ChannelContribution(routes=[Route(f"{self._path}/inbound", endpoint, methods=["POST"])]) +async def new_chat(context): + if context.request.session is not None: + await context.host.reset_session(context.request.session.isolation_key) + await context.reply("Started a new conversation.") ``` -**Result is rich, not just text.** `result` here is a `HostedRunResult[TResult]` — a thin generic envelope around the target's **full-fidelity output**. For agent targets `TResult` narrows to `AgentResponse`, so channels read everything the target produced directly off `result.result`: - -- the full `messages: list[ChatMessage]` thread the agent produced this turn — each message holds an ordered list of typed `Contents` (see [`Contents` in core](https://github.com/microsoft/agent-framework/blob/main/python/packages/core/agent_framework/_types.py)): `TextContent`, `DataContent` (inline base64 blobs), `UriContent` (URLs to images/audio/files), `FunctionCallContent` and `FunctionResultContent` (tool-call traces), `HostedFileContent` / `HostedVectorStoreContent` (provider-side file/vector references), `UsageContent` (token usage), `ErrorContent`, `TextReasoningContent` (reasoning traces), and channel-extensible custom content kinds. Each content also has `additional_properties` for provider-specific extensions (citations, image alt text, source spans, …), -- `value: T | None` — the typed structured output when the agent returned one (e.g. via response-format / structured-output features), -- `response_id`, `usage_details: UsageDetails | None`, `raw_representation`, and per-message `additional_properties` carrying provider-native extras. - -For workflow targets `TResult` is `WorkflowRunResult`, so `result.result.get_outputs()` iterates the per-executor output payloads and `result.result.get_final_state()` exposes terminal-state info. 
The host never collapses or pre-shapes workflow outputs — channels (and developer-supplied `response_hook`s) own the projection, since "what counts as a renderable output" is wire-format-specific. - -A channel author is free to project this into **whatever the channel's native shape supports**. Examples: - -- The built-in **Telegram channel** renders `text` segments with Telegram's `MarkdownV2` parse mode (escaping the special set), uploads `DataContent` images via `sendPhoto` and audio via `sendAudio` as separate Telegram messages in the same chat, and emits inline-button keyboards from `FunctionCallContent` traces when the channel is configured to surface tool calls as user-confirmable actions. Citations attached to a `TextContent.additional_properties["citations"]` slot are rendered as numbered footnote links the user can tap. -- The built-in **Responses channel** preserves the full content-list shape on the wire — every `ChatMessage` round-trips as a Responses-shaped output item so callers can inspect the typed mix of text, function-call traces, image/file outputs, reasoning, and structured-output `value`s exactly as the agent produced them. There is no lossy collapse to a single text field. -- A channel fronting a **chat UI** can render `TextContent` as full GitHub-Flavored Markdown / HTML (tables, code fences with syntax highlighting, math), `DataContent` and `UriContent` as inline images/audio/video players, `FunctionCallContent` / `FunctionResultContent` as collapsible "tool ran" cards, and `TextReasoningContent` as a collapsible reasoning panel — all from the same `result`. -- A **voice channel** can route `TextContent` through TTS, play `DataContent(audio/*)` directly, and surface `FunctionCallContent` only as audio earcons (or skip them entirely) — the same `result` object drives a completely different surface. 
-- A **richly-typed RPC channel** can return `result.result.value` (the structured output) directly when the workflow / agent produced one, and fall back to a text projection only when no typed output is available. - -The host imposes no projection — channels read `result.result.text` for a convenience plain-text rollup on agent targets, but are encouraged to lean on the full underlying payload when their protocol supports more. - -## Information Design - -### Canonical flow - -The default request/response shape — single channel, originating response, no fan-out. Authorization runs before `run_hook`; `response_hook` runs per-destination (here just one). - -```mermaid -sequenceDiagram - autonumber - actor User - participant Channel as Channel<br/>(inbound) - participant Host - participant Auth as host.authorize - participant Target as Agent / Workflow - participant Annot as _annotate_intended_targets - - User->>Channel: native payload (webhook / poll / HTTP body) - Channel->>Channel: parse → ChannelRequest<br/>(identity, conversation_id, content, response_target=originating) - Channel->>Host: dispatch(ChannelRequest) - Host->>Auth: authorize(identity, require_link, allowlist) - alt Denied / LinkRequired - Auth-->>Host: Denied(reason_code, user_message)<br/>or LinkRequired(challenge) - Host-->>Channel: render denial / link challenge<br/>(channel-appropriate UX) - Channel-->>User: short refusal in-room - else Allowed(isolation_key) - Host->>Host: resolve session via StateStore - Host->>Host: run_hook(request, context) - Host->>Target: target.run(messages, session=...) - Target-->>Host: AgentResponse / WorkflowRunResult - Host->>Host: wrap → HostedRunResult[TResult] - Host->>Annot: write hosting.intended_targets<br/>onto assistant message - Host->>Channel: response_hook(result, context) - Channel->>Channel: shape to native payload - Channel-->>User: reply on originating wire - end -``` - -The textual trace of the same flow (showing more of the per-step bookkeeping): - -```text -external request/event - -> channel-specific parsing + validation - -> ChannelIdentity extraction (per-channel native id) - -> default channel invocation mapping - -> optional run_hook (dev-supplied; default no-op) - -> ChannelRequest (carries response_target, background, echo_input) - -> AgentFrameworkHost / ChannelContext - -> identity_resolver(ChannelIdentity) -> isolation_key - -> host records (isolation_key, channel, now) as last-seen (for ResponseTarget.active) - -> AgentSession resolution (per session_mode, scoped by isolation_key) - -> target execution seam (Agent.run / Workflow.run) - -> HostedRunResult[AgentResponse] | HostedRunResult[WorkflowRunResult] - (full-fidelity result carried unchanged; no pre-shaping by the host) - -> [foreground] fan-out: - for each destination resolved from ResponseTarget: - -> clone HostedRunResult envelope (per-destination isolation; shallow copy) - -> optional channel response_hook (dev-supplied; default = identity) - -> hook receives ChannelResponseContext(request, channel_name, destination_identity, originating, is_echo) - -> hook may rebind result via HostedRunResult.replace(result=...) - (e.g. project a WorkflowRunResult to an AgentResponse for a text-only wire) - -> channel-native serialization (channel chooses what content types / outputs it can render) - -> channel.push(identity, shaped_payload) | originating return value - if ResponseTarget.echo_input is True: - each non-originating destination receives the user's input first - (synthesised as a HostedRunResult[AgentResponse] with a role="user" message), - then the agent reply.
Both pushes execute inside the same scheduled - push task; an echo-push failure is logged and swallowed so the - response push on the same destination is still attempted. - -> [background or response_target != originating] - -> ContinuationToken returned immediately to originating channel - -> target executes asynchronously - -> on completion, the same fan-out (clone + response_hook + push) applies - -> ContinuationToken updated; available via host.get_continuation(token) and channel poll routes -``` - -**Full-fidelity contract.** The host never collapses agent / workflow output. `HostedRunResult[TResult]` carries the target output unchanged: agent targets see the full `AgentResponse` (multi-modal `messages`, `value`, `usage_details`, `response_id`, …); workflow targets see the full `WorkflowRunResult` (per-executor outputs via `get_outputs()`, terminal state via `get_final_state()`). Each channel — through its `response_hook` and its own serializer — decides what subset its wire can carry. A text-only channel iterates `result.result.messages` (or projects the workflow's outputs into a single text turn via a response hook); a card-capable channel inspects the underlying contents directly. - -**Per-destination cloning.** Before invoking a channel's `response_hook`, the host clones the `HostedRunResult` envelope so one channel's `replace(result=...)` cannot leak into the payload another destination observes. The clone is shallow — channels that need to mutate `result` itself (rather than rebind it) own the deep copy. - -**`response_hook` is a channel-level convention, not part of the `Channel` Protocol.** Channels expose a `response_hook` attribute (callable accepting `(result, *, context: ChannelResponseContext) -> HostedRunResult[Any] | Awaitable[HostedRunResult[Any]]`). The host duck-types this attribute. Adding hook support to an existing channel package does not break the public `Channel` Protocol. 
- - -A parallel **link ceremony flow** runs out-of-band when a user invokes the host-provided `link`/`connect` command on a channel: - -```text -channel /link command - -> linker.begin(ChannelIdentity) -> LinkChallenge - -> channel-specific rendering (URL, code, MFA prompt) - -> user completes the ceremony out-of-band (browser, second channel, MFA app) - -> linker callback/verification route - -> linker.complete(challenge_id, proof) -> isolation_key - -> host atomically associates (channel, native_id) -> isolation_key - -> subsequent requests resolve to the linked AgentSession -``` - -### Inbound ownership - -| Concern | Owned by | Notes | -|---|---|---| -| HTTP / WebSocket route shape | Channel package | e.g. `/responses`, `/responses/ws`, `/invocations`, `/telegram/webhook` — channels may contribute either or both | -| Protocol request model | Channel package | e.g. Responses items (HTTP body or WS frames), Invocations body, Telegram webhook payload | -| Signature/auth validation | Channel package or host middleware | channel-specific unless generic Starlette middleware | -| Request-to-agent invocation mapping | Channel package + optional `run_hook` | forwards caller parameters into `ChannelRequest.options`, chooses `session_mode`, can enforce extra app policy | -| Native command catalog | Channel package using host-defined `ChannelCommand` | e.g. Telegram bot commands, future Activity Protocol slash-command / adaptive-card surfaces, WhatsApp menus | -| Command registration at startup | Channel package | e.g. 
Telegram `set_my_commands(...)` | -| Command dispatch | Channel package | commands may reply locally, manipulate channel-owned state, or invoke the agent | -| Normalized input to the agent | Host core | `ChannelRequest.input` reuses `AgentRunInputs` | -| Session resolution | Host core | based on `ChannelSession` + `ChannelRequest.session_mode`; storage specifics deferred | -| Channel-native identity extraction | Channel package | populates `ChannelIdentity(channel, native_id, attributes)` per request | -| Identity resolution (`native_id` → `isolation_key`) | Host core via `IdentityResolver` | default **auto-issues and persists** a per-user `isolation_key` on first contact per `(channel, native_id)`; user-supplied resolver can return app-owned identities directly | -| Identity store (`(channel, native_id) → isolation_key`) | Host core via `HostStateStore` | file-based by default in v1 (`FileHostStateStore`); pluggable for Cosmos / SQL / Redis in fast follow (req #24). Owns auto-issuance and atomic merge-on-link. 
| -| Identity link ceremony (OAuth / MFA / one-time code) | Host core via `IdentityLinker` | linker contributes its own routes + lifecycle; channels surface a built-in `link`/`connect` command | -| Authorization (allowlist + link enforcement) | Host core via `host.authorize(...)` + per-channel `IdentityAllowlist` | tri-state allowlist evaluated pre- and post-link; combines with `require_link` to produce one of three named profiles (open / forced-link / allowlist); see [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam) | -| Link & delivery policy across confidentiality tiers | Host core via `LinkPolicy` | consulted at link time (refuse incompatible link attempts) and at delivery time (drop incompatible `ResponseTarget` destinations); built-in policies cover all-allow, same-tier, explicit allow-list, deny-all | -| Active-channel tracking | Host core | updated on every successfully resolved request; consumed by `ResponseTarget.active` | -| Response-target resolution | Host core | translates `ResponseTarget` (originating, active, specific, list, all_linked, none) into an ordered set of `(channel, ChannelIdentity)` deliveries | -| Proactive outbound delivery | Channel package via optional `ChannelPush` capability | channels that can push (Telegram, Activity Protocol via Bot Service, webhook, SSE) implement `push(identity, result)`; channels that can't are only valid as `originating` targets | -| Per-delivery audit + replay state | Host core writes intent-only — the resolved destination set onto the assistant `Message.additional_properties["hosting"]["intended_targets"]` (immutable, single write). Operational state (attempts, retries, last error, success timestamp) lives in the `DurableTaskRunner` and is observed via the runner's own backend. | Replay across host restarts is a property of the configured runner (native for durable adapters; not supported for `InProcessTaskRunner`). 
See [Intended targets + durable delivery](#intended-targets--durable-delivery) and [Durable task runner](#durable-task-runner). | -| Background-run lifecycle | Host core | owns `ContinuationToken` issuance, async execution, completion notification; persists via `HostStateStore` (file-based default — survives restarts) | -| Run poll routes | Channel package | each channel exposes its own protocol-shaped poll route (`/responses/{continuation_token}`, `/invocations/{continuation_token}`) backed by `host.get_continuation(token)` | -| Conversation history (all channels — Responses, Invocations, Telegram, Activity Protocol, …) | Agent's core `HistoryProvider` (`agent_framework._sessions.HistoryProvider`) | Channels project their wire id (`previous_response_id`, `conversation_id`, request body `session_id`, host-tracked alias, …) into `ChannelSession.key`; the host resolves an `AgentSession` and the agent's `HistoryProvider` does the load / append. No channel-specific history seam. Multi-provider composition (with a single `load_messages=True`) is the standard AF convention; see [Conversation history for the Responses channel](#conversation-history-for-the-responses-channel) for the Foundry-backed variant. | -| Channel-owned non-message per-thread state (e.g. AG-UI `client_state`) | Channel-shipped `ContextProvider` subclass written into the same per-source state slot | Reuses the existing `ContextProvider` seam — *not* a new storage protocol. Channel reads `ChannelRequest.client_state` in `before_run`, lets the agent observe/mutate the slot, then reads the post-run value in `after_run` to emit channel-specific events (e.g. AG-UI `StateSnapshotEvent` / `StateDeltaEvent`). Composition rules unchanged (one `HistoryProvider` carries `load_messages=True`; additional `ContextProvider`s attach alongside). See [Channel-owned per-thread state](#channel-owned-per-thread-state). 
| -| Agent invocation | Host core | always through the target's execution seam — `SupportsAgentRun.run(...)` for agent targets, `Workflow.run(...)` for workflow targets | -| Protocol response/event model | Channel package | core returns agent results; channel serializes them | -| ASGI server bootstrap | Host core convenience | `host.serve(...)` for default uvicorn path; `host.app` for custom hosting | - -### Channel session-carriage models - -Channels split into two families based on **who owns the session identifier across requests**. This distinction is invisible to the agent target, but it changes which host-side mechanisms are load-bearing for that channel. - -| Model | Examples | `ChannelSession.key` source | How a caller starts a new thread | -|---|---|---|---| -| **Caller-supplied session** | Responses (`previous_response_id` / `conversation_id`), Invocations, A2A, MCP — generally any HTTP/RPC-shaped channel | The wire payload carries it; the channel parses it into `ChannelSession.key`. `None` means "ephemeral / fresh thread". | Omit the previous id (or send a fresh one). The caller is in control. | -| **Host-tracked session** | Telegram, Activity Protocol via Azure Bot Service (Teams/Web Chat/Slack/…), WhatsApp — generally any chat surface whose protocol carries identity (`chat_id`, AAD oid, `from.id`) but no per-conversation key | The channel leaves `ChannelSession.key = None` and lets the host's per-`isolation_key` alias decide which `AgentSession` to resolve (rule #8 below). | The channel surfaces a `/new`-style command (a `ChannelCommand`) that calls `host.reset_session(isolation_key)`; the host's session-id alias rotates. There is no in-band way for the user to address a specific past thread. | - -Identity is an **orthogonal axis** (anonymous vs. identified). The realized cells in v1 are: - -| | Anonymous | Identified | -|---|---|---| -| **Caller-supplied session** | ✓ — bare `curl /responses` + `previous_response_id`. 
The id effectively *is* the identity (the resolver may project `previous_response_id` into the `isolation_key` for that turn). | ✓ — Responses + `safety_identifier`, or any caller-supplied channel behind a JWT/OAuth bearer that the resolver maps to an `isolation_key`. | -| **Host-tracked session** | n/a in v1 | ✓ — Telegram / Activity Protocol (Bot Service) / WhatsApp. The channel always authenticates; the resolver maps `(channel, native_id)` to `isolation_key`. | - -**Channel-author guidance.** When implementing a new channel: - -- If your upstream protocol carries a per-conversation identifier on every request, populate `ChannelSession.key` from it. You are a **caller-supplied** channel. `host.reset_session(...)` is **not** the right primitive for your `/new`-equivalent (your callers control that by simply omitting the previous id). Cross-channel linking via `IdentityLinker` is opt-in and depends on whether you also extract a stable identity (header, JWT, etc.) into `ChannelIdentity`. -- If your upstream protocol carries identity but **no** per-conversation key, leave `ChannelSession.key = None`. You are a **host-tracked** channel. To support "start a fresh thread", expose a channel-native command (Telegram `/new`, Teams adaptive-card button, …) that invokes `host.reset_session(isolation_key)` — the host alias rotation does the rest, and prior history remains addressable under its previous session id. You are the canonical case for cross-channel linking; populate `ChannelIdentity` faithfully so `IdentityLinker` and `ResponseTarget.active`/`.all_linked` can find your users. - -**Mixing on one host.** A single `AgentFrameworkHost` can mount channels of both families. 
A user can chat on Telegram (host-tracked) and have it linked via `IdentityLinker` to a Responses-channel session keyed by `previous_response_id`; in that case the linker's identity merge collapses both sides onto the same `isolation_key` and the host-tracked channel's alias becomes a peer of the caller-supplied `previous_response_id` for the same `AgentSession`. This is the v1 mechanism for "agent built on Responses, exposed to humans on Telegram, with continuity across both". - -### Session resolution rules - -1. If `ChannelRequest.session_mode == "disabled"`, the host bypasses session resolution and calls the target with `session=None`. -2. If `session_mode == "auto"`, the host resolves `ChannelSession.key` to an `AgentSession`, scoped by `isolation_key` when supplied. -3. If `session_mode == "auto"` and no key is supplied, the host may create an ephemeral session. -4. If `session_mode == "required"`, the host must resolve or create a usable session before invoking the target. -5. **Cross-channel resolution rule:** when two channels mounted on the same `AgentFrameworkHost` produce the same `isolation_key` (and either both omit `key` or both produce equivalent keys derived from `isolation_key`), the host resolves them to the **same** `AgentSession`. This is the v1 mechanism for cross-channel chat continuity (e.g. Telegram → Teams against the same conversation history). The **canonical** path for translating a channel's native per-channel identifier (Telegram `chat_id`, Teams AAD object id, …) into the stable `isolation_key` is the host-level `IdentityResolver` (per-channel `run_hook` mapping is supported as a lower-level alternative). When the channel-native identity is not yet linked, the `IdentityLinker` runs a connect ceremony (OAuth, MFA, signed one-time code) to associate it with an existing `isolation_key`. -6. 
The first spec does **not** standardize a cross-package storage API; cross-host/cross-process continuity is deferred to the pluggable session store (req #24), which also persists identity-link grants beyond the host process lifetime. -7. Responses and other conversation-aware channels may still own protocol-specific conversation/item storage above this layer. -8. **Session rotation (`reset_session`).** The host exposes `reset_session(isolation_key)` so **host-tracked** channels (see [Channel session-carriage models](#channel-session-carriage-models)) can implement "start a fresh thread" commands (e.g. Telegram `/new`). The default behavior **rotates the active session id alias** (`` → `#`) rather than deleting on-disk history: prior history remains addressable by its original session id while subsequent runs for that `isolation_key` resolve to a brand-new `AgentSession`. Apps that want destructive reset can layer that on top by calling into their own `HistoryProvider`. **Caller-supplied** channels do not call `reset_session`; their callers branch threads by sending a fresh / no `previous_response_id` (or equivalent) on the next request. - -### Channel metadata persisted onto stored messages - -When the host invokes the target, it does **not** pass the raw `ChannelRequest.input` directly. It first wraps the input into a `Message(role="user", contents=[...])` whose `additional_properties["hosting"]` carries an envelope describing where the message came from and where its response should go. This makes the resulting conversation history self-describing for any `HistoryProvider` (`FileHistoryProvider`, future Cosmos/Foundry providers, …) without that provider having to know anything channel-specific. 
- -```jsonc -{ - "channel": "telegram", // ChannelRequest.channel - "identity": { // populated from ChannelRequest.identity - "channel": "telegram", - "native_id": "", - "attributes": { /* channel-specific */ } - }, - "response_target": { // populated from ChannelRequest.response_target - "kind": "originating", - "targets": [] // [(channel, native_id), ...] for explicit targets - } -} -``` - -Round-trip is guaranteed by `Message.to_dict()` / `Message.from_dict()`. Future providers that key on protocol shape (e.g. a Responses `previous_response_id`-keyed store) can read this envelope to reconstruct cross-channel context without needing a separate channel-metadata sidecar. - -`FoundryHostedAgentHistoryProvider` round-trips the entire `additional_properties["hosting"]` namespace (and any other AF-side namespace) through the Foundry response store via a single opaque `agent_framework` container key written onto each `OutputItem`. Because the schema is now **intent-only** (no per-destination mutation after the initial write — see [Intended targets + durable delivery](#intended-targets--durable-delivery)), no service-side additions to the Foundry storage SDK are required for it to round-trip. - -### Intended targets + durable delivery - -The inbound envelope above captures the caller's **intent**. The assistant `Message` produced by the host carries a parallel envelope that records the *resolved destination set* — what the host actually intended to deliver to, after `ResponseTarget` resolution and `LinkPolicy` filtering. **This is a single write, never mutated.** Operational state for each push attempt (status, attempts, retries, last error, channel-issued id) lives in the [`DurableTaskRunner`](#durable-task-runner) — not on the message — because the runner is the component that performs and (when durable) retries the push. 
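As a rough illustration of the resolution step described above, the sketch below partitions candidate destinations into an intended set and a skipped set before the single write. It is not the host's implementation: `Destination`, `resolve_targets`, the `link_allowed` predicate, and the `push_capable` set are hypothetical names, and the order of the two checks is an assumption. Only the two skip reasons (`link_policy`, `no_push_capability`) come from the spec.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Destination:
    # Hypothetical value type mirroring the (channel, native_id) pairs in the spec.
    channel: str
    native_id: str


def resolve_targets(
    candidates: Iterable[Destination],
    link_allowed: Callable[[Destination], bool],
    push_capable: set[str],
) -> tuple[list[dict], list[dict]]:
    """Split candidates into intended and skipped entries (illustrative only)."""
    intended: list[dict] = []
    skipped: list[dict] = []
    for dest in candidates:
        entry = {"destination": {"channel": dest.channel, "native_id": dest.native_id}}
        if not link_allowed(dest):
            # Assumed to be checked first; the spec does not fix the order.
            skipped.append({**entry, "reason": "link_policy"})
        elif dest.channel not in push_capable:
            skipped.append({**entry, "reason": "no_push_capability"})
        else:
            intended.append(entry)
    return intended, skipped


candidates = [
    Destination("telegram", "12345"),
    Destination("activity", "29:abc"),
    Destination("corp-only", "u-1"),
]
intended, skipped = resolve_targets(
    candidates,
    link_allowed=lambda d: d.channel != "corp-only",
    push_capable={"telegram", "activity"},
)
print([e["destination"]["channel"] for e in intended])
print([(e["destination"]["channel"], e["reason"]) for e in skipped])
```

Both lists are computed before anything is persisted, which is what allows the assistant message to be written once and never mutated afterwards.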
- -In every multi-target scenario (`all_linked`, `active`, `channels([...])`, `identities([...])`) the fan-out has the same shape: synchronous on the originating wire, scheduled via the runner for every non-originating destination. - -```mermaid -sequenceDiagram - autonumber - actor User - participant Tg as TelegramChannel<br/>(originating) - participant Host - participant Target as Agent - participant Runner as DurableTaskRunner - participant Annot as _annotate_intended_targets - participant Act as ActivityChannel<br/>(linked) - participant Resp as ResponsesChannel<br/>(linked) - - User->>Tg: message - Tg->>Host: ChannelRequest(<br/>identity=tg:12345,<br/>response_target=all_linked) - Host->>Host: resolve isolation_key - Host->>Target: run - Target-->>Host: AgentResponse - Host->>Annot: hosting.intended_targets =<br/>[tg, activity, responses] - - par originating (synchronous) - Host->>Tg: response_hook → push (sync) - Tg-->>User: reply on Telegram - and non-originating (durable) - Host->>Runner: schedule("hosting.push",<br/>payload for activity) - Runner->>Host: _handle_push_task(payload) - Host->>Act: response_hook → push - Act-->>User: reply in Teams (or wherever) - and - Host->>Runner: schedule("hosting.push",
payload for responses) - Runner->>Host: _handle_push_task(payload) - Host->>Resp: response_hook → push - end -``` - -Schema on `Message.additional_properties["hosting"]` for a host-produced assistant message: - -```jsonc -{ - "originating": { // mirror of the inbound envelope above - "channel": "telegram", - "identity": { "channel": "telegram", "native_id": "12345", "attributes": {} }, - "response_target": { "kind": "all_linked", "targets": [] } - }, - "intended_targets": [ - { "destination": { "channel": "activity", "native_id": "29:abc..." } }, - { "destination": { "channel": "telegram", "native_id": "12345" } } - ], - "skipped_targets": [ // optional — present only when LinkPolicy excluded something - { - "destination": { "channel": "corp-only", "native_id": "..." }, - "reason": "link_policy" // link_policy | no_push_capability - } - ] -} -``` - -Lifecycle the host follows: - -1. After `ResponseTarget` resolution and `LinkPolicy` filtering, the host writes the assistant `Message` **once**, with the resolved `intended_targets[]` (every destination it will attempt) and an optional `skipped_targets[]` for destinations dropped at resolution time (so audit can show *why* a resolved-by-`ResponseTarget` destination did not receive the message — `link_policy` or `no_push_capability`). This write is immutable. -2. For each non-originating destination, the host schedules a `"hosting.push"` task via the configured [`DurableTaskRunner`](#durable-task-runner). The runner is responsible for attempting, retrying per its `RetryPolicy`, and (for durable runners) surviving host restarts. The push handler resolves the channel, runs the channel's `response_hook`, and calls `ChannelPush.push(...)`. -3. Operational delivery state — attempt count, last error, success timestamp, channel-issued message id — lives in the runner's own log. Replay across host restarts is a property of the runner (native for durable runners; not supported for the in-process runner). 
Operators who want a queryable delivery dashboard can read it from their runner backend's observability surface (TaskHub, Foundry durable tasks, …) — the host does not project it back onto the message. - -The originating destination (when `ResponseTarget` includes it) is **not** routed through the runner. It is rendered synchronously on the originating channel's wire; the host-internal `_deliver_response` helper returns `bool` (`True` if any push was scheduled / delivered, `False` otherwise) for the channel's own bookkeeping. Per-destination delivery outcomes are not collated back to the caller — durable runners surface them in their own logs / dashboards, and the in-process runner logs failures with structured fields. See [Built-in routes](#built-in-routes) for the synchronous return contract. - -> **Why intent-only on the message, with operational state in the runner?** A single immutable write keeps the message store as the source of truth for "what the host intended", without requiring providers to implement in-place mutation (no `SupportsDeliveryTracking` capability, no Foundry `update_item` service ask). Per-destination retry, replay, and failure surfacing become responsibilities of the runner, which is the right component because it owns the work queue. Operators who already use a durable runner (TaskHub, Foundry durable tasks) get observability through the runner's existing tooling rather than through a parallel ETL on the message store. +Telegram, Activity Protocol, and Discord can expose equivalent native commands when their protocols support them. -### Durable task runner +## Follow-up Enhancements -The host delegates non-originating push fan-out — and, in v1 fast-follow, background runs — to a pluggable `DurableTaskRunner`. The runner is the component that owns "this work needs to happen; retry on failure; survive (or don't survive) restarts depending on which runner you chose". 
Channel packages never see it directly; they just implement `ChannelPush.push(...)`. +See [ADR-0028](../decisions/0028-hosting-linking-multicast-enhancements.md) for the deferred design covering: -```python -from typing import Protocol, Callable, Awaitable, Mapping, Any, Literal -from dataclasses import dataclass - -@dataclass(frozen=True) -class RetryPolicy: - max_attempts: int = 5 - initial_backoff_seconds: float = 1.0 - backoff_multiplier: float = 2.0 - max_backoff_seconds: float = 60.0 - -@dataclass(frozen=True) -class TaskHandle: - task_id: str # opaque, runner-issued - name: str # the registered handler name - -TaskStatus = Literal["scheduled", "running", "succeeded", "failed", "cancelled"] - -class DurableTaskRunner(Protocol): - def register( - self, - name: str, - handler: Callable[[Mapping[str, Any]], Awaitable[None]], - ) -> None: ... - - async def schedule( - self, - name: str, - payload: Mapping[str, Any], - *, - retry_policy: RetryPolicy | None = None, - ) -> TaskHandle: ... - - async def get(self, handle: TaskHandle) -> TaskStatus | None: ... -``` - -The host registers an internal handler `"hosting.push"` at startup. Each non-originating destination becomes a single `runner.schedule("hosting.push", payload)` call. The handler: - -1. Resolves the channel from `payload["channel_id"]`. -2. Clones the `HostedRunResult` and runs the channel's `response_hook` (if any). -3. Calls `ChannelPush.push(identity, shaped_result)`. -4. Returns normally on success. On exception, the runner records the failure and either schedules a retry per `RetryPolicy` or marks the task `failed` (terminal). 
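To make the retry step concrete, here is a minimal sketch of how a runner might turn a `RetryPolicy` into a delay schedule and drive a handler with it. The spec does not define the backoff formula; capped exponential growth is an assumption, and `backoff_delays` and `run_with_retry` are hypothetical helpers rather than part of the `DurableTaskRunner` contract.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Mapping


@dataclass(frozen=True)
class RetryPolicy:
    # Same fields and defaults as the spec's RetryPolicy.
    max_attempts: int = 5
    initial_backoff_seconds: float = 1.0
    backoff_multiplier: float = 2.0
    max_backoff_seconds: float = 60.0


def backoff_delays(policy: RetryPolicy) -> list[float]:
    """Delays between consecutive attempts (assumed: capped exponential)."""
    return [
        min(
            policy.initial_backoff_seconds * policy.backoff_multiplier**n,
            policy.max_backoff_seconds,
        )
        for n in range(policy.max_attempts - 1)
    ]


async def run_with_retry(
    handler: Callable[[Mapping[str, Any]], Awaitable[None]],
    payload: Mapping[str, Any],
    policy: RetryPolicy,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> str:
    """Return "succeeded" or "failed" after at most max_attempts calls."""
    delays = backoff_delays(policy)
    for attempt in range(policy.max_attempts):
        try:
            await handler(payload)
            return "succeeded"
        except Exception:
            if attempt == policy.max_attempts - 1:
                return "failed"  # terminal: attempts exhausted
            await sleep(delays[attempt])
    return "failed"


calls: list[str] = []


async def flaky(payload: Mapping[str, Any]) -> None:
    # Hypothetical push handler that fails twice, then succeeds.
    calls.append(payload["channel_id"])
    if len(calls) < 3:
        raise ConnectionError("transient")


async def no_sleep(_seconds: float) -> None:
    return None


status = asyncio.run(
    run_with_retry(flaky, {"channel_id": "activity"}, RetryPolicy(), sleep=no_sleep)
)
print(backoff_delays(RetryPolicy()), status, len(calls))
```

Injecting `sleep` keeps the loop testable without real waiting. A durable runner would persist the attempt count instead of holding it in a local variable.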
- -Built-in runner shipped in core: - -| Runner | Persistence | Replay across restarts | Default for | -|---|---|---|---| -| `InProcessTaskRunner` | None — `asyncio.create_task` + in-process retry | No (in-flight tasks lost on process death) | `runtime_mode="long_running"` | - -Adapter packages (deferred to v1 Fast Follow; no runtime dep from core): - -| Package | Backend | Notes | -|---|---|---| -| `agent-framework-hosting-durabletask` | `agent-framework-durabletask` (gRPC TaskHub) | Suits `ephemeral` deployments that already run a Durable Task sidecar. | -| `agent-framework-hosting-foundry` (extension) | Foundry durable-task API | Deferred until the FHA durable-task surface is finalized. | -| (possibly) SQLite-outbox runner | SQLite under the existing `HostStateStore` root | Lowest-dep "survives single-node restart" option for ephemeral hosts without an external sidecar. | - -Default selection follows [Runtime modes](#runtime-modes). `long_running` defaults to `InProcessTaskRunner`. `ephemeral` is **strict**: if `durable_task_runner` is not configured and `allow_in_process_runner=True` is not opted in, the host raises `RuntimeError` at construction — falling back to the in-process runner in an ephemeral environment would silently drop in-flight pushes on the next scale-to-zero. The `allow_in_process_runner=True` escape hatch is intentionally noisy (warning) and meant for local dev / smoke tests. - -#### Codec contract for durable serialisation - -When a `DurableTaskRunner` is configured for a deployment that uses out-of-process scheduling (e.g. a sidecar / gRPC TaskHub), task payloads must be **JSON-serialisable** end to end. Two pieces of the contract enforce this: - -- **`DurableTaskRunner.payload_mode`** — a class-level attribute declared by each runner implementation: - - `OBJECT` — the in-process runner; payloads pass Python objects by reference. No serialisation required. - - `JSON` — out-of-process runners; payloads must round-trip through JSON. 
-- **`ChannelPushCodec`** — a Protocol exposed by push-capable channels whose payloads are not natively JSON-serialisable. The codec defines `encode(payload) -> Mapping[str, Any]` / `decode(envelope) -> Any` so the channel owns the over-the-wire shape of its push payloads. Channels without exotic payloads can leave the codec unset and rely on the host's default `dataclasses.asdict`-style encode. - -At construction the host runs `_validate_runner_codec_pairing`: if the configured runner declares `payload_mode == JSON` and any push-capable channel does not expose a codec, the host raises `ChannelConfigurationError` so the misconfiguration is caught before traffic. On the consumer side `_handle_push_task` accepts both `OBJECT`-mode (in-memory object) and `JSON`-mode (`{"type": "push", ...}` envelope) shapes so the same handler serves both runner backends. - -```mermaid -sequenceDiagram - autonumber - participant Host - participant Codec as ChannelPushCodec
(on the push channel) - participant Runner as Durable runner<br/>(payload_mode=JSON) - participant Worker as Runner worker<br/>(may run after host scaled to zero) - participant Channel as Push channel - participant External as External service<br/>(Telegram / Bot Framework / …) - - Note over Host,Runner: construction-time:<br/>_validate_runner_codec_pairing<br/>(refuse JSON runner + codec-less channel) - - Host->>Codec: encode(payload) → JSON-safe Mapping - Host->>Runner: schedule("hosting.push",<br/>{"type": "push", "channel": "tg",<br/>"payload": }) - Runner->>Runner: persist task - Runner-->>Host: TaskHandle - Host-->>Host: synchronous return path<br/>(originating already delivered) - - Note over Worker: ... host may scale to zero ... - - Worker->>Runner: dequeue task - Worker->>Host: invoke "hosting.push" handler<br/>(JSON envelope) - Host->>Host: _handle_push_task<br/>detect envelope shape (OBJECT or JSON) - Host->>Codec: decode(payload) → in-memory object - Host->>Channel: ChannelPush.push(identity, result) - Channel->>External: native API call - alt success - External-->>Channel: ok - Channel-->>Worker: handler returns - Worker->>Runner: mark task succeeded - else transient failure - External-->>Channel: 5xx - Channel-->>Worker: raise - Worker->>Runner: retry per RetryPolicy - else terminal failure - Worker->>Runner: max_attempts → mark failed
(log only) - end -``` - -#### In-process runner shutdown drain - -`InProcessTaskRunner` ships a two-phase shutdown driven by `shutdown_grace_seconds` (default `5.0`): +- cross-channel identity linking, +- authorization and allowlists, +- non-originating response delivery, +- active-channel routing, +- multicast and all-linked delivery, +- background runs and continuation tokens, +- durable delivery runners, +- retry/replay semantics, and +- payload serialization. -1. After lifespan shutdown signals, in-flight `"hosting.push"` tasks are given the grace period to finish — during which retries keep happening — so a clean Ctrl-C does not abandon work that is one network call away from completing. -2. When the grace expires, remaining tasks are cancelled and their `CancelledError` is swallowed (not logged as a failure — it is the expected shutdown shape). +Those enhancements must layer on top of this v1 contract without requiring v1 users to adopt them. -This is purely operational hygiene for the `long_running` default; durable adapters get this behaviour for free from their backends. +## Validation Gates -#### Echo idempotency on retry +The Python implementation should be considered complete when: -When `ResponseTarget.channel(name, echo_input=True)` is set, the host packages an echo (`role="user"`) push *and* the agent reply (`role="assistant"`) into the same `"hosting.push"` task per non-originating destination. The handler tracks an `echo_done` cursor on the task state and short-circuits the echo phase on retry: a retry that fires after the echo succeeded but before the response push completed will not double-echo the user's message. The cursor lives on the runner-owned task state, not the message — same principle as the broader "intent only on the message, operational state in the runner" rule. - -```mermaid -sequenceDiagram - autonumber - participant Runner - participant Host - participant Channel - participant External - - Runner->>Host: _handle_push_task(
echo, response,<br/>state={echo_done: False}) - Host->>Host: read echo_done → False - Host->>Channel: response_hook(echo, is_echo=True) - Channel->>External: push user message - External-->>Channel: ok - Host->>Runner: persist state.echo_done = True - - Host->>Channel: response_hook(response, is_echo=False) - Channel->>External: push assistant reply - External-->>Channel: 5xx (transient) - Channel-->>Host: raise - - Runner->>Runner: retry per RetryPolicy<br/>(backoff) - - Runner->>Host: _handle_push_task(<br/>echo, response,
state={echo_done: True}) - Host->>Host: read echo_done → True (skip echo) - Host->>Channel: response_hook(response, is_echo=False) - Channel->>External: push assistant reply - External-->>Channel: ok - Host-->>Runner: handler returns → succeeded -``` - -## Reference and Parity Plan - -The new core sits **below** the conceptual boundary of today's top-level Responses/Invocations host wrappers but is implemented in Agent Framework-owned code. Existing top-level `agentserver` hosts inform behavior, naming, and parity targets — **without** becoming runtime dependencies of the hosting core. Individual channel packages MAY consume lower-level building blocks shipped in `azure.ai.agentserver` (e.g. `FoundryHostedAgentHistoryProvider` builds on the Foundry response-store SDK). - -| Existing code area | Proposed treatment | Why | -|---|---|---| -| `SupportsAgentRun.run(..., session=..., stream=...)` | Reuse directly in core for agent targets | Already the correct Python execution seam | -| `Workflow.run(...)` and workflow streaming events | Reuse directly in core for workflow targets; normalize outputs into `HostedRunResult`/`HostedStreamResult` | Lets channels stay target-agnostic | -| Session resolution logic in current hosting layers | Implement in core, using current behavior as reference | Host behavior, not protocol behavior | -| Starlette app assembly and route aggregation | Implement in core, referencing current servers | Needed by every channel | -| PR #5393 Telegram `BOT_COMMANDS`, `CommandHandler(...)`, `set_my_commands(...)` | Reference for the generic `ChannelCommand` capability | Clearest current prior art for native command catalogs + runtime dispatch | -| `agent_framework_foundry_hosting._to_chat_options` | Inspiration for Responses channel-owned mapping | Still protocol-specific | -| `agent_framework_foundry_hosting._items_to_messages` / `_output_item_to_message` | Inspiration / parity reference in Responses channel codec | Useful, not generic hosting | -| 
`agent_framework_foundry_hosting._to_outputs` and `ResponseEventStream` | Inspiration for Responses event mapping; the new Responses channel owns its own AF-native serialization rather than reusing top-level `agentserver` host wrappers | Responses-specific serialization | -| `azure.ai.agentserver.responses.ResponseContext.get_history()` + `Store` | Folded into the agent's normal core `HistoryProvider` flow. The Responses channel projects `previous_response_id` / `conversation_id` into `ChannelSession.key`; the agent's `HistoryProvider` does the load / append exactly as for any other session. No Responses-specific history Protocol. | One uniform history seam across channels — the developer chooses where to store, and may compose multiple providers under the standard "single `load_messages=True`" rule. | -| `azure.ai.agentserver.responses.store._foundry_provider.FoundryStorageProvider` (HTTP-backed Foundry storage with `IsolationContext` user/chat headers) | Wrapped by a native `FoundryHostedAgentHistoryProvider` in `agent-framework-foundry-hosting` that **builds on top of** the SDK and exposes the standard core `HistoryProvider` Protocol. Agents attach it the same way they attach `FileHistoryProvider`. | Lets the Foundry response store back conversations driven through the new host, while keeping the channel agnostic to the storage backend. The provider owns a runtime dependency on `azure.ai.agentserver` (for the storage SDK) so it stays aligned with the SDK's wire contract, auth, and isolation headers without duplication. Same provider also works for non-Responses channels (Telegram, Invocations, …) so the choice is "where do I want history persisted" rather than "which channel am I exposing". 
| -| `agent_framework_foundry_hosting._invocations.InvocationsHostServer._sessions` (in-process `dict[str, AgentSession]`) | Replace with the host's normal `ChannelSession.key → AgentSession` resolution; agent history flows through its own (optional) core `HistoryProvider(load_messages=True)` | Invocations does **not** need a protocol-shaped history seam — confirmed by today's foundry hosting which keeps no `Store` on the Invocations side | -| `ResponsesAgentServerHost` / `InvocationAgentServerHost` top-level wrappers | Conceptual prior art only | Sit too high; encode protocol ownership | -| Workflow checkpoint behavior in current Responses hosting | Defer; reference only for future work | Needs separate design if it becomes shared | - -## Dependencies & Commitment Status - -| Dependency | Team | DRI | Status | -|---|---|---|---| -| `SupportsAgentRun` execution seam | Agent Framework Core (Python) | TBD | Committed (existing) | -| `Workflow` execution seam | Agent Framework Core (Python) | TBD | Committed (existing); host wraps workflow outputs into `HostedRunResult`/`HostedStreamResult` | -| `AgentSession` / conversation primitives | Agent Framework Core (Python) | TBD | Committed (existing); cross-package storage standardization deferred | -| Starlette | External (BSD-licensed) | n/a | Committed; required runtime dep of `agent-framework-hosting` | -| Uvicorn | External (BSD-licensed) | n/a | Open Question — required dep vs optional extra (see Open Questions) | -| `agent-framework-foundry-hosting` parity reference | Agent Framework Hosting | TBD | Reference-only, no runtime dependency | -| `FoundryHostedAgentHistoryProvider` (in `agent-framework-foundry-hosting`, built on `azure.ai.agentserver.responses.store._foundry_provider.FoundryStorageProvider`) | Agent Framework Foundry | TBD | Proposed v1 deliverable so Foundry-defined (and any other) agents can use Foundry's response store as a `HistoryProvider` through the new host. 
Implements the standard core `HistoryProvider` Protocol — usable from any channel, no Responses-specific Protocol. Owns a runtime dep on `azure.ai.agentserver` for the storage SDK. | -| PR #5393 Telegram sample (commands, polling/webhook patterns) | Agent Framework | PR author | Reference-only; informs `ChannelCommand` and `TelegramChannel` design | -| Telegram Bot API SDK | External | n/a | Committed (runtime dep of `agent-framework-hosting-telegram`) | -| `microsoft/teams.py` SDK (`microsoft-teams-apps`, `microsoft-teams-api`, `microsoft-teams-cards`) | External (MIT, Microsoft) | n/a | Proposed runtime dep of `agent-framework-hosting-teams` (req #29). The SDK already ships a "Build an agent using Microsoft Agent Framework" guide and a pluggable `HttpServerAdapter`, so the hosting package mounts the SDK's `App` into the host's Starlette app and reuses its Adaptive Cards / Streaming / Citations / Feedback / Suggested-prompts / Dialogs / Message-Extensions / SSO surface instead of re-implementing them. | -| `agent-framework-ag-ui`, `-a2a`, `-devui` | Agent Framework | various | Out of scope for first implementation; future convergence kept as a possibility | - -## Open Questions - -| # | Question | On Point | Notes | -|---|---|---|---| -| 5 | How much of the Responses Conversations API should the Responses channel own vs a future shared conversation utility? | Eng / PM | Tied to whether session storage gets standardized. | -| 6 | Should a later phase define a pluggable session store interface? | Eng | Needs to be designed **holistically across all storage axes** — sessions, messages, identity links, run-state / continuation tokens, workflow checkpoints — rather than per-axis. Tracked as v1 fast-follow / requirement #23. | -| 8 | Should command scopes / projection metadata become first-class — e.g. private-chat-only vs group-chat-visible commands, or per-locale descriptions? 
| Eng / PM | Telegram's `BotCommandScope` and `language_code` would need to be representable cross-channel. | -| 10 | Is "Channel" the GA name? "Head" was used interchangeably during design discussions. | PM | "Channel" chosen for the spec; confirm before public docs. | -| 12 | Should `ChannelRequest.session_mode` grow additional values (e.g. `"shared"` for multi-channel session sharing) or stay closed at three? | Eng | The taxonomy needs a **dedicated design exercise** covering all known channel session-shape patterns; revisit after that exercise. | -| 14 | Where do issued link grants live — short-lived in-memory state on the host, the same pluggable session store (#24), or a separate identity store? | Eng | Resolved as part of the **`HostStateStore`** seam (see [Host state storage](#host-state-storage)). Link grants live alongside continuation tokens and last-seen records in the v1 file-based default (`FileHostStateStore` → `link_grants/` namespace, 15min TTL). Pluggable Cosmos / SQL / Redis adapters tracked in req #24. **→ Move to Resolved Questions in next pass.** | -| 17 | Should `ResponseTarget.active` honor a configurable **time window** (last seen within N minutes) and what is the fallback when the window has expired before the response is ready — `originating`, `all_linked`, drop with `ContinuationToken` `status="failed"`? | PM / Eng | Likely yes with sensible default (e.g. 24h fall back to `originating`); per-request override via the run hook. | -| 22 | For the Responses WebSocket transport, what subprotocol identifier (if any) should be advertised on the `Upgrade` and how is auth conveyed — `Authorization` header on the upgrade, a `Sec-WebSocket-Protocol` token, or a query-string-bound short-lived token? | Eng / PM | Aligning with whatever OpenAI ships for Responses WS is preferable; keep the codec swappable so the channel can track upstream changes without breaking the host contract. 
| - -### Resolved Questions (decisions log) - -Original numbering preserved so external references (checkpoints, ADR cross-links) still resolve. Decisions captured here may imply spec-body changes elsewhere — see [Decisions-driven follow-ups](#decisions-driven-follow-ups) below. - -| # | Question | Decision | -|---|---|---| -| 1 | Final distribution package names? | `agent-framework-hosting` with suffixes (`-responses`, `-invocations`, `-telegram`, …). Public imports stay at `agent_framework.hosting`. | -| 2 | `uvicorn` required vs optional extra? | Use **hypercorn** instead of uvicorn; the `serve` extra remains optional. `host.app` is still the canonical server-agnostic ASGI surface. | -| 3 | Keep `HostedRunResult` wrapper or return `AgentResponse` directly? | **Keep `HostedRunResult`,** now shaped as a **generic typed envelope `HostedRunResult[TResult]`** (see Q31). It wraps both `AgentResponse` *and* `WorkflowRunResult`, and carries host-run metadata (resolved `session`) alongside the full-fidelity target output. | -| 4 | Where do generic auth helpers live? | Only the **mechanisms** live in core. Concrete implementations sit in their own packages when they pull dependencies; dep-free helpers may live in `hosting`. | -| 7 | `protocol_request` typed (`Any`) or typed kwargs? | **Keep `Any`.** | -| 9 | Allow nested routers / `path=""`? | **Yes.** The host developer is responsible for ensuring routes do not overlap. | -| 11 | Should the host support multiple targets? | **No** — final. Solve a layer above (an external router that owns multiple single-target hosts). | -| 13 | Which identity linkers ship in phase 1? | **Entra linker** (in the Entra package) + **one-time-code linker** (in core). Drop MFA for now; investigating additional linkers tracked as a follow-up. | -| 15 | Identity resolver invoked once on host vs per channel? | **Once on the host** with `ChannelIdentity(channel, native_id, ...)`. 
| -| 16 | Should `IdentityLinker` and `Channel` share a base `Contributor` protocol? | **A linker *is* a Channel — specialised.** Use the single Channel-shaped contract; collapse `IdentityLinker` into a Channel specialisation. | -| 18 | Contract for `ChannelPush` failures? | **The `DurableTaskRunner` owns retry and final-failure semantics**, per its `RetryPolicy`. Push handler exceptions are caught by the runner, which retries with backoff and ultimately marks the task `failed` when `max_attempts` is exhausted. Downstream push outcomes live in the runner's own log — there is no per-destination status surfaced on the message and no synchronous failure object returned to the caller. The host's internal `_deliver_response` helper returns `bool` (whether any work was scheduled) for the originating channel; observability for downstream pushes comes from the runner backend (TaskHub, Foundry durable tasks, log fields on `InProcessTaskRunner`). The earlier `DeliveryReport` value type has been removed. See [Intended targets + durable delivery](#intended-targets--durable-delivery) and [Durable task runner](#durable-task-runner). | -| 19 | `host.run_in_background(...)` `notify` callback? | Programmatic non-channel delivery will be expressed via the **`continuation_token`** mechanism (see Q20), not a separate `notify` callback. | -| 20 | Storage / TTL of `ContinuationToken`s? | **Done in this revision.** `ContinuationToken` is the type, with an opaque `token: str` field that channels surface to callers; equivalent continuation-token support is added to the **Invocations channel** alongside the existing Responses behaviour. Push-capable channels can still use it; default behaviour remains "push on completion", but the developer can choose other UX (poll-after-push, hybrid, …). Persistence is the **`HostStateStore`** seam — v1 default is **`FileHostStateStore`** (atomic JSON writes, 24h TTL on completed entries), so background runs survive host restarts. 
| -| 21 | Partial-failure surfacing for `all_linked`? | **Runner-only.** Originating-destination outcome is rendered synchronously on the originating channel's wire; the host's `_deliver_response` helper returns `bool` for the channel's own bookkeeping. Non-originating destinations are scheduled as `"hosting.push"` tasks on the `DurableTaskRunner`; per-task outcome (success / retried / terminal-failure) is observable via the runner's backend (TaskHub, Foundry durable tasks, structured log fields on `InProcessTaskRunner`). The host does not collate per-destination status back onto the message and no longer emits a `DeliveryReport`. | -| 23 | Share one backing store contract for host-level vs `ContextProvider`? | **Stay separate protocols** (current draft direction confirmed). A deployment may still bind both onto the same physical backend. | -| 24 | Where does the Foundry history provider live? | Tentative name **`FoundryHostedAgentHistoryProvider`**, in the **`foundry-hosting`** package (shares the dependency). Confirm with Foundry package owners before launch. | -| 25 | `Channel.confidentiality_tier` opaque vs enum? | Keep as `str?` for now; can revisit before Release. | -| 26 | Where does the delivery-replay mechanism live? | **In the `DurableTaskRunner`.** Durable adapters (TaskHub, Foundry durable tasks) provide retry-with-backoff and survive host restarts natively — replay is "the runner keeps retrying until `max_attempts` is exhausted or the push succeeds". The built-in `InProcessTaskRunner` retries within the process but does **not** survive restarts (in-flight tasks are lost). Operator-driven replay (`host.replay(task_handle)`) is out of scope for v1; the runner's own surface is sufficient for the common case. | -| 28 | Should the host collapse agent / workflow output to text? 
| **No.** `HostedRunResult[TResult]` carries the target output **unchanged** — full `AgentResponse` (with its multi-modal `messages`, `value`, `usage_details`) for agent targets, full `WorkflowRunResult` (with its `get_outputs()` / `get_final_state()`) for workflow targets. Channels decide what subset their wire renders; a `response_hook` may rebind `result` (e.g. project a workflow output into an `AgentResponse` for a text-only wire) via `HostedRunResult.replace(result=...)`. The host never loses fidelity it has, and never restricts modality. | -| 29 | How do channels do per-destination post-processing (text flattening, card rendering, citation attachment) without breaking the `Channel` Protocol? | **Channels expose a `response_hook` instance attribute** (callable accepting `(result, *, context: ChannelResponseContext) -> HostedRunResult[Any] \| Awaitable[HostedRunResult[Any]]`). The host duck-types this attribute and applies it on a per-destination clone of the `HostedRunResult` envelope before push. The `Channel` Protocol stays a small `name / path / contribute` contract — adding hook support to a new channel does not require Protocol changes. | -| 30 | Should non-originating destinations also see the user's input message, not just the agent reply? | **Opt-in via `ResponseTarget.channel(name, echo_input=True)`** (and the same kwarg on `.channels([...])` / `.identities([...])`). The host synthesises a `HostedRunResult[AgentResponse]` wrapping the user's input as a `role="user"` message and bundles it into the same scheduled push task as the agent reply per non-originating destination; the echo is dispatched first inside the task and an echo-push failure is logged and swallowed so the response push on the same destination is still attempted. Channels can transform or drop echoes via their `response_hook` (which receives `is_echo=True` for the echo phase). | -| 31 | Should `HostedRunResult` be flattened (text / messages) or carry the full target output? 
| **Carry the full target output, generically typed.** `HostedRunResult[TResult]` exposes a single `result: TResult` field — `AgentResponse` for agent targets, `WorkflowRunResult` for workflow targets — plus an optional `session: AgentSession \| None`. Earlier drafts carried a flattened `messages: list[Message]` projection alongside `raw_response`; this lost workflow-specific affordances (`get_outputs()`, `get_final_state()`, structured per-executor payloads) and forced the host to pre-shape data only some channels needed. The generic envelope keeps the host modality-agnostic, lets channels read the canonical accessor on the underlying type (`result.messages`, `result.value`, `result.get_outputs()`, …), and gives channel authors static typing where they want it. | -| 32 | Should authorization (per-channel allowlist) ship as a single `auth_mode` enum or as two orthogonal parameters? | **Two orthogonal parameters (`require_link: bool` + `allowlist: IdentityAllowlist \| Literal["inherit"] \| None = "inherit"`)** plus named `AuthPolicy` factories for the three common combinations. A single enum collapses `require_link` and `allowlist` into one axis and cannot express the Mixed profile (`AnyOfAllowlists(NativeIdAllowlist, LinkedClaimAllowlist)` with `require_link=False` — native ids bypass auth, everyone else is funneled into linking) without re-introducing per-value sub-parameters that would defeat the point. Composition is built on a **tri-state `AllowlistDecision` (`ALLOW` / `DENY` / `ABSTAIN`)** rather than a boolean, because boolean composition cannot distinguish "claim allowlist denies you" from "claim allowlist hasn't seen any claims yet" — a critical distinction for the Mixed profile. `LinkedClaimAllowlist` is rejected at host startup if no source of verified claims is available (config validator, fail-fast), preventing the silent-deny-everyone footgun. 
Group-chat denials apply the same DM-redirect pattern as `LinkChallenge` (short generic refusal in-room, fuller `user_message` in DM, structured `log_details` only in logs). Shipping in two waves: the Protocol + `NativeIdAllowlist` + config validator ship with the next core PR; full `host.authorize(...)` pipeline + `LinkedClaimAllowlist` enforcement land with the `IdentityLinker` core PR. See [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam). | -| 33 | How does the host decide whether it is running long-running vs ephemeral? | **Single `runtime_mode` parameter on `AgentFrameworkHost`**, defaulting to `None` for auto-detection. Auto-detect inspects known deployment markers (`FOUNDRY_HOSTING_ENVIRONMENT`, `AZURE_FUNCTIONS_ENVIRONMENT`, `AWS_LAMBDA_FUNCTION_NAME`) and picks `"ephemeral"` on the first hit; otherwise falls back to `"long_running"` (sensible local-dev / always-on default). The mode is **advisory** — it drives *defaults* for `HostStateStore`, `DurableTaskRunner`, identity-link state, and similar seams, but every individual choice remains overridable. Detected mode is logged at startup so misdetection is visible. See [Runtime modes](#runtime-modes). | -| 34 | How does delivery to non-originating destinations actually happen — synchronously in the originating request handler, or out-of-band? | **Out-of-band via a `DurableTaskRunner`.** The host registers an internal handler `"hosting.push"` at startup; each non-originating destination becomes a single `runner.schedule("hosting.push", payload)` call. The originating destination (when `ResponseTarget` includes it) is **still rendered synchronously** on the originating channel's wire — only fan-out goes through the runner. Default runner is `InProcessTaskRunner` (asyncio + bounded retry, no cross-restart persistence — suitable for `long_running`). 
Durable adapter packages (`agent-framework-hosting-durabletask`, future Foundry adapter) plug into the same Protocol for `ephemeral` deployments. See [Durable task runner](#durable-task-runner). | -| 35 | What is the audit shape on the assistant message — full per-destination state machine, or intent only? | **Intent only.** `Message.additional_properties["hosting"]["intended_targets"]` is a single immutable write that records the resolved destination set (after `ResponseTarget` + `LinkPolicy` filtering). Operational state — attempt count, last error, success timestamp, channel-issued id — lives in the `DurableTaskRunner` and is observed via the runner's backend. This eliminates the previous `deliveries[]` status state machine (`pending`/`delivered`/`failed`/`skipped`), the `SupportsDeliveryTracking` provider capability, and the Foundry `update_item` service ask. See [Intended targets + durable delivery](#intended-targets--durable-delivery). | -| 36 | What happens when `runtime_mode="ephemeral"` and no `durable_task_runner` is configured? | **Raise at construction.** Silently falling back to `InProcessTaskRunner` in an ephemeral environment would drop every in-flight push on the next scale-to-zero — a footgun. The host raises `RuntimeError` unless `allow_in_process_runner=True` is opted in (warning logged). The opt-in is intended for local-dev / smoke tests where the developer accepts the in-flight loss. See [Durable task runner](#durable-task-runner). | -| 37 | What is the wire contract for push payloads under a durable (out-of-process) runner? | **A two-piece contract.** Each `DurableTaskRunner` declares its `payload_mode` (`OBJECT` for in-process pass-by-reference; `JSON` for runners that round-trip through JSON). Channels that ship non-JSON-native payloads expose a `ChannelPushCodec` (`encode` / `decode`). At construction the host runs `_validate_runner_codec_pairing` and refuses a `JSON`-mode runner paired with codec-less push channels. 
The push handler accepts both `OBJECT` and `JSON` envelope shapes so the same handler serves both runner backends. See [Codec contract for durable serialisation](#codec-contract-for-durable-serialisation). | -| 38 | Should `DeliveryReport` remain as a per-destination return value? | **No — removed.** Operational state lives in the runner; observability comes from the runner's backend (TaskHub, Foundry durable tasks, structured log fields on `InProcessTaskRunner`). The host's internal `_deliver_response` helper now returns `bool` (whether any work was scheduled / delivered) for the originating channel's own bookkeeping. Removing the value type collapses the public surface and removes a coupling point that would have needed a "schedule-time failure" subtype to round-trip durable failures back to the caller — failures live where they originate (the runner), not on a parallel object passed back through the synchronous return. | -| 39 | How is double-echo avoided when a push task retries after the echo phase succeeded but the response phase failed? | **An `echo_done` cursor on the runner-owned task state.** When `echo_input=True`, the `"hosting.push"` handler packages both the echo (`role="user"`) and the assistant reply into the same task; on the first attempt the handler dispatches the echo, sets `echo_done=True` on the task state, and then dispatches the reply. A retry that fires after the echo succeeded but the reply failed reads the cursor and short-circuits the echo phase. The cursor lives in the runner — same principle as the broader "intent only on the message, operational state in the runner" rule. See [Echo idempotency on retry](#echo-idempotency-on-retry). | -| 40 | What happens to in-flight `"hosting.push"` tasks on a clean `InProcessTaskRunner` shutdown? 
| **Two-phase drain.** A `shutdown_grace_seconds` window (default `5.0`) lets in-flight retries finish; remaining tasks are then cancelled and `CancelledError` is swallowed (not logged as a failure — it is the expected shutdown shape). Operators with longer worst-case retry chains can extend the grace via the constructor. Durable adapters get equivalent behaviour from their backends. See [In-process runner shutdown drain](#in-process-runner-shutdown-drain). | - -### Decisions-driven follow-ups - -The following resolutions imply prose / API edits elsewhere in the spec body (not just the table above). Captured here so they aren't lost; the edits themselves are deferred to a separate pass. - -- **Q2** — Switch all install / `host.serve()` references from `uvicorn` to `hypercorn`. -- **Q3** — ✅ Done. `HostedRunResult[TResult]` is now generic over the target output type; see Q31 below for the rationale. -- **Q11** — Strip any remaining "multi-target hedge" language from the spec body. -- **Q13** — Update the linker catalogue: Entra (in Entra package) + one-time-code (in core); remove MFA references. -- **Q16** — Collapse `IdentityLinker` into a Channel specialisation in the spec body (architecture diagrams, contracts, examples). -- **Q20** — ✅ Done. `ContinuationToken` type carries an opaque `token: str`; routes use `/{continuation_token}`; Invocations channel gets equivalent continuation-token support; persistence via `HostStateStore` (v1 default file-based). -- **Q32** — Spec text added (see [Authorization profiles and the IdentityAllowlist seam](#authorization-profiles-and-the-identityallowlist-seam) and req #22). 
The core PR includes `IdentityAllowlist` Protocol, `AllowlistDecision` enum, `AuthorizationContext`, `AllowAll` / `NativeIdAllowlist` / `LinkedClaimAllowlist` / `AnyOfAllowlists` / `AllOfAllowlists` / `CallableAllowlist` built-ins, `IdentityLinker` Protocol, `LinkedIdentity`, `LinkChallenge`, `AuthPolicy` factories, `Allowed` / `LinkRequired` / `Denied` outcomes, `Host(default_allowlist=..., identity_linker=...)` + per-channel `allowlist` parameter, construction-time validator (rules #1 + #2 + #3 — `require_link=True` without `identity_linker` now raises), and `host.authorize(...)` for open, native-id, and linked-claim profiles. Provider-specific linkers (for example Entra OAuth helpers) are separate channel/helper packages. -- **Q36 / Q37 / Q38 / Q39 / Q40** — Spec text added: strict-ephemeral default + `allow_in_process_runner` opt-in in §[Durable task runner](#durable-task-runner); new sub-sections [Codec contract for durable serialisation](#codec-contract-for-durable-serialisation), [In-process runner shutdown drain](#in-process-runner-shutdown-drain), [Echo idempotency on retry](#echo-idempotency-on-retry); `DeliveryReport` references purged from §[Intended targets + durable delivery](#intended-targets--durable-delivery) and Qs 18 / 21. Code lands in this core PR: `DurableTaskPayloadMode` + `ChannelPushCodec` + `PushPayloadNotSerializable` exception in `_types.py`; `_validate_runner_codec_pairing` + dual-mode `_handle_push_task` + `_build_push_payload` + `echo_done` cursor + `_annotate_intended_targets` in `_host.py`; `shutdown_grace_seconds` + 2-phase drain in `_runner.py`. -- **Q33 / Q34 / Q35** — Spec text added: new top-level §[Runtime modes](#runtime-modes), rewritten §[Intended targets + durable delivery](#intended-targets--durable-delivery), new §[Durable task runner](#durable-task-runner). Code lands in this core PR: `DurableTaskRunner` Protocol + `InProcessTaskRunner` + `runtime_mode` constructor parameter + auto-detection. 
Durable runner adapters (`agent-framework-hosting-durabletask`, Foundry adapter) are separate follow-up packages tracked under §[Decisions-driven follow-ups](#decisions-driven-follow-ups). Bumping req #14 (background runs) to share the same runner is a non-goal of this PR — the `ContinuationToken` machinery and the runner can be wired together in a later pass without re-shaping either contract. +- a sample uses one `AgentFrameworkHost` with multiple channels and no manual Starlette route composition, +- each current channel has contract tests for route contribution, lifecycle, request parsing, hooks, and originating response rendering, +- session tests prove shared `isolation_key` values share an `AgentSession` and `reset_session` rotates it, +- workflow tests or samples use explicit `checkpoint_location`, +- Foundry isolation middleware is covered by integration or contract tests, +- no v1 package exposes the removed linking, multicast, durable-runner, or continuation APIs, and +- this spec and ADR-0027 remain aligned. 
diff --git a/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py b/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py index 4fbca0003a6..0c7e6237907 100644 --- a/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py +++ b/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py @@ -90,14 +90,10 @@ ChannelContribution, ChannelIdentity, ChannelRequest, - ChannelResponseContext, ChannelResponseHook, ChannelRunHook, ChannelSession, - ChannelStreamTransformHook, - HostedRunResult, - apply_response_hook, - apply_run_hook, + ChannelStreamUpdateHook, logger, ) from azure.core.credentials_async import AsyncTokenCredential @@ -151,11 +147,6 @@ class _OutboundError(RuntimeError): """Marker for transient outbound failures that should produce 502/retry.""" -def _text_result(text: str) -> HostedRunResult[AgentResponse]: - """Wrap plain text in a ``HostedRunResult`` for streaming fan-out delivery.""" - return HostedRunResult(AgentResponse(messages=[Message(role="assistant", contents=[Content.from_text(text=text)])])) - - def _parse_activity(activity: Mapping[str, Any]) -> Message: """Translate one Bot Framework ``message`` Activity into an Agent Framework Message. @@ -231,7 +222,7 @@ class ActivityProtocolChannel: When ``stream=True`` (default), the channel sends an initial placeholder activity, then edits it in place as the agent emits ``AgentResponseUpdate`` chunks (``PUT /v3/conversations/{id}/activities/{id}``). When ``stream=False`` - it just sends the final reply. A ``stream_transform_hook`` can rewrite or + it just sends the final reply. A ``stream_update_hook`` can rewrite or drop individual updates before they hit the wire. 
""" @@ -253,7 +244,7 @@ def __init__( response_hook: ChannelResponseHook | None = None, send_typing_action: bool = True, stream: bool = True, - stream_transform_hook: ChannelStreamTransformHook | None = None, + stream_update_hook: ChannelStreamUpdateHook | None = None, stream_edit_min_interval: float = 0.7, inbound_auth_validator: InboundAuthValidator | None = None, service_url_allowed_hosts: tuple[str, ...] = _DEFAULT_SERVICE_URL_HOSTS, @@ -290,17 +281,15 @@ def __init__( Unknown ``/foo`` text falls through to the agent. Handlers reply via ``ChannelCommandContext.reply``; surface them to users with a Teams manifest ``commandLists`` entry. - run_hook: Optional rewrite of ``ChannelRequest`` before invocation. + run_hook: Optional rewrite of ``ChannelRequest`` before invocation; + the host owns invocation of this hook. response_hook: Optional rewrite of the :class:`HostedRunResult` before the originating Activity - reply is serialized. The host also invokes this hook when - delivering to this channel as a non-originating push - destination. + reply is serialized; the host owns invocation of this hook. send_typing_action: Whether to send ``typing`` activities while the agent runs. - stream: Whether to stream by default. ``run_hook`` can flip per - request. - stream_transform_hook: Optional rewrite of each + stream: Whether to stream by default. + stream_update_hook: Optional rewrite of each ``AgentResponseUpdate`` before it hits the wire. stream_edit_min_interval: Seconds between successive in-place edits. 
Teams is more rate-sensitive than Telegram, so default @@ -341,7 +330,7 @@ def __init__( self.response_hook = response_hook self._send_typing_action = send_typing_action self._stream_default = stream - self._stream_transform_hook = stream_transform_hook + self._stream_update_hook = stream_update_hook self._stream_edit_min_interval = stream_edit_min_interval self._inbound_auth_validator = inbound_auth_validator self._service_url_allowed_hosts = tuple(h.lower().lstrip(".") for h in service_url_allowed_hosts) @@ -557,11 +546,10 @@ async def _process_activity(self, activity: Mapping[str, Any]) -> None: return parsed = _parse_activity(activity) - # Store a Bot Framework conversation reference on the identity so the - # host can proactively ``push`` to this conversation later (fan-out - # from another channel). Recording the identity also registers this - # channel under the isolation key so ``ResponseTarget.all_linked`` / - # ``.active`` can resolve it. + # Store a Bot Framework conversation reference on the identity so + # channel hooks and command handlers can inspect it. Cross-channel + # proactive delivery is a follow-up enhancement outside the v1 host + # contract. 
identity = ChannelIdentity( channel=self.name, native_id=conversation_id, @@ -591,14 +579,6 @@ async def _process_activity(self, activity: Mapping[str, Any]) -> None: metadata={"reply_to_id": activity.get("id"), "recipient": activity.get("recipient")}, stream=self._stream_default, ) - if self._hook is not None: - channel_request = await apply_run_hook( - self._hook, - channel_request, - target=self._ctx.target, - protocol_request=activity, - ) - await self._dispatch(activity, channel_request) async def _invoke_command( @@ -647,13 +627,6 @@ async def _invoke_command( }, metadata={"reply_to_id": activity.get("id"), "recipient": activity.get("recipient")}, ) - if self._hook is not None: - request = await apply_run_hook( - self._hook, - request, - target=self._ctx.target, - protocol_request=activity, - ) async def _reply(body: str) -> None: await self._send_message(activity, body) @@ -679,33 +652,26 @@ async def _dispatch(self, inbound: Mapping[str, Any], request: ChannelRequest) - await self._send_typing(inbound) if not request.stream: - result = await self._ctx.run(request) - include_originating = await self._ctx.deliver_response(request, result) - if include_originating: - result = await self._apply_response_hook(result, request) - text = getattr(result.result, "text", None) or "(no response)" - await self._send_message(inbound, text) + result = await self._ctx.run( + request, + run_hook=self._hook, + protocol_request=inbound, + response_hook=self.response_hook, + channel_name=self.name, + ) + text = getattr(result.result, "text", None) or "(no response)" + await self._send_message(inbound, text) return - stream = self._ctx.run_stream(request) - await self._stream_to_conversation(inbound, request, stream) - - async def _apply_response_hook( - self, - result: HostedRunResult[Any], - request: ChannelRequest, - ) -> HostedRunResult[Any]: - """Apply the channel-level response hook for an originating reply.""" - if self.response_hook is None: - return result - context = 
ChannelResponseContext( - request=request, + stream = await self._ctx.run_stream( + request, + run_hook=self._hook, + protocol_request=inbound, + stream_update_hook=self._stream_update_hook, + response_hook=self.response_hook, channel_name=self.name, - destination_identity=None, - originating=True, - is_echo=False, ) - return await apply_response_hook(self.response_hook, result, context=context) + await self._stream_to_conversation(inbound, request, stream) async def _stream_to_conversation( self, @@ -799,13 +765,6 @@ async def edit_worker() -> None: try: async for update in stream: - if self._stream_transform_hook is not None: - transformed = self._stream_transform_hook(update) - if isinstance(transformed, Awaitable): - transformed = await transformed - if transformed is None: - continue - update = transformed chunk = getattr(update, "text", None) if chunk: accumulated += chunk @@ -821,39 +780,28 @@ async def edit_worker() -> None: logger.exception("Activity edit worker crashed") try: - await stream.get_final_response() + final = await stream.get_final_response() except Exception: # pragma: no cover logger.exception("Stream finalize failed") - - # Fan the final reply out to any non-originating linked destinations - # (e.g. ``ResponseTarget.all_linked``) and learn whether this channel - # should still render on its own wire. For the default - # ``ResponseTarget.originating`` this is a no-op that returns True. - # Always consult the host even when nothing streamed so that - # ``ResponseTarget.none`` is honoured and non-originating targets are - # still fanned out for empty replies. - include_originating = True - if self._ctx is not None: - include_originating = await self._ctx.deliver_response(request, _text_result(accumulated)) - if not include_originating: - return + final = None + final_text = getattr(final, "text", None) or accumulated # Final flush — make sure the user sees everything that arrived after # the worker's last edit. 
If the placeholder failed, or the channel # turned out not to support edits (405), POST a fresh activity here # with whatever accumulated rather than PUT-editing the placeholder. if not placeholder_ok or edit_unsupported: - text = accumulated or "(no response)" + text = final_text or "(no response)" try: await self._send_message(inbound, text) except Exception: # pragma: no cover logger.exception("Activity fallback final send failed") - elif activity_id is not None and accumulated and accumulated != last_sent: + elif activity_id is not None and final_text and final_text != last_sent: try: - await self._update_activity(inbound, activity_id, accumulated) + await self._update_activity(inbound, activity_id, final_text) except Exception: # pragma: no cover logger.exception("Activity final edit failed") - elif not accumulated and activity_id is not None: + elif not final_text and activity_id is not None: # No text streamed — replace the placeholder with a stub so the # user isn't left staring at "…". try: @@ -875,19 +823,12 @@ async def _buffer_and_send( ``PUT /v3/conversations/{id}/activities/{id}``, so the progressive in-place edit cannot be used; we buffer the stream and ``POST`` a single message at the end. Mirrors the non-streaming path's - fan-out + response-hook semantics so behaviour is consistent - regardless of whether the target streamed. + response-hook semantics so behaviour is consistent regardless of + whether the target streamed. 
""" accumulated = "" try: async for update in stream: - if self._stream_transform_hook is not None: - transformed = self._stream_transform_hook(update) - if isinstance(transformed, Awaitable): - transformed = await transformed - if transformed is None: - continue - update = transformed chunk = getattr(update, "text", None) if chunk: accumulated += chunk @@ -895,23 +836,11 @@ async def _buffer_and_send( logger.exception("Activity streaming consumption failed") try: - await stream.get_final_response() + final = await stream.get_final_response() except Exception: # pragma: no cover logger.exception("Stream finalize failed") - - # Fan the final reply out to any non-originating linked destinations - # and learn whether this channel should still render on its own wire. - # Always consult the host even when nothing streamed so that - # ``ResponseTarget.none`` is honoured and non-originating targets are - # still fanned out for empty replies. - include_originating = True - if self._ctx is not None: - include_originating = await self._ctx.deliver_response(request, _text_result(accumulated)) - if not include_originating: - return - - result = await self._apply_response_hook(_text_result(accumulated), request) - text = getattr(result.result, "text", None) or "(no response)" + final = None + text = getattr(final, "text", None) or accumulated or "(no response)" try: await self._send_message(inbound, text) except Exception: # pragma: no cover @@ -998,56 +927,5 @@ async def _send_typing(self, inbound: Mapping[str, Any]) -> None: except Exception: # pragma: no cover - non-critical UX logger.exception("Teams typing send failed") - # -- ChannelPush -------------------------------------------------------- # - - async def push(self, identity: ChannelIdentity, payload: HostedRunResult[Any]) -> None: - """Proactively deliver an out-of-band message into a Bot Framework conversation. 
- - Implements :class:`host.ChannelPush` so this channel can be a - non-originating destination for ``ChannelRequest.response_target`` - (e.g. ``ResponseTarget.all_linked`` fan-out from Telegram/Discord, or - ``echo_input`` replay). The conversation reference is reconstructed - from ``identity.attributes`` captured on the inbound activity: - ``service_url``, ``conversation``, ``bot`` (outbound ``from``), - ``user`` (outbound ``recipient``), and ``channel_id``. - - Echo payloads (the user's mirrored input) carry ``role="user"`` - messages; Bot Service channels can only send AS the bot, so the text - is delivered as a normal bot message. - """ - if self._http is None: - raise RuntimeError("ActivityProtocolChannel.push called before startup") - attrs = identity.attributes - service_url = str(attrs.get("service_url") or "").rstrip("/") - conversation = dict(attrs.get("conversation") or {"id": identity.native_id}) - conversation_id = conversation.get("id") or identity.native_id - if not service_url: - raise ValueError("ActivityProtocolChannel.push requires 'service_url' in identity attributes") - # Re-validate the persisted ``service_url`` against the allow-list. The - # identity may have been recorded hours earlier (push runs out-of-band), - # so the allow-list could have narrowed or the store been tampered with - # since; never send a bearer token to a now-disallowed host. 
- if not self._is_service_url_allowed(service_url): - raise ValueError(f"ActivityProtocolChannel.push: service_url {service_url!r} is not in the allowed hosts") - - text = getattr(payload.result, "text", None) or "(no response)" - activity = { - "type": "message", - "from": dict(attrs.get("bot") or {}), - "recipient": dict(attrs.get("user") or {}), - "conversation": conversation, - "channelId": attrs.get("channel_id"), - "serviceUrl": attrs.get("service_url"), - "text": text, - "textFormat": "markdown", - } - if attrs.get("locale"): - activity["locale"] = attrs["locale"] - - url = f"{service_url}/v3/conversations/{conversation_id}/activities" - token = await self._get_token() - response = await self._http.post(url, json=activity, headers=self._auth_headers(token)) - response.raise_for_status() - __all__ = ["ActivityProtocolChannel", "activity_protocol_isolation_key"] diff --git a/python/packages/hosting-activity-protocol/tests/test_channel.py b/python/packages/hosting-activity-protocol/tests/test_channel.py index 0f3d1bf0bd2..002003faf36 100644 --- a/python/packages/hosting-activity-protocol/tests/test_channel.py +++ b/python/packages/hosting-activity-protocol/tests/test_channel.py @@ -9,7 +9,7 @@ from __future__ import annotations -from dataclasses import dataclass, replace +from dataclasses import dataclass from typing import Any from unittest.mock import AsyncMock, MagicMock @@ -18,15 +18,13 @@ AgentFrameworkHost, ChannelCommand, ChannelCommandContext, - ChannelIdentity, ChannelRequest, - ChannelSession, HostedRunResult, ) from starlette.testclient import TestClient from agent_framework_hosting_activity_protocol import ActivityProtocolChannel, activity_protocol_isolation_key -from agent_framework_hosting_activity_protocol._channel import _command_text, _parse_activity, _text_result +from agent_framework_hosting_activity_protocol._channel import _command_text, _parse_activity def test_activity_protocol_isolation_key_format() -> None: @@ -139,6 +137,26 @@ class 
_FakeAgentResponse: text: str +@dataclass +class _FakeUpdate: + text: str + + +class _FakeStream: + def __init__(self, chunks: list[str]) -> None: + self._chunks = chunks + + def __aiter__(self) -> Any: + async def gen() -> Any: + for chunk in self._chunks: + yield _FakeUpdate(chunk) + + return gen() + + async def get_final_response(self) -> _FakeAgentResponse: + return _FakeAgentResponse(text="".join(self._chunks)) + + class _FakeAgent: def __init__(self, reply: str = "ok") -> None: self._reply = reply @@ -149,6 +167,8 @@ def create_session(self, *, session_id: str | None = None) -> Any: def run(self, messages: Any = None, *, stream: bool = False, **kwargs: Any) -> Any: self.runs.append({"messages": messages, "stream": stream, "kwargs": kwargs}) + if stream: + return _FakeStream([self._reply]) async def _coro() -> _FakeAgentResponse: return _FakeAgentResponse(text=self._reply) @@ -183,9 +203,7 @@ def _make_teams( "serviceUrl": "https://smba.trafficmanager.net/amer/", } -# Minimal request envelope for direct ``_stream_to_conversation`` calls. The -# channel only consults it for cross-channel fan-out, which is skipped when -# ``_ctx`` is unset (as in these unit tests). +# Minimal request envelope for direct ``_stream_to_conversation`` calls. 
_VALID_REQUEST = ChannelRequest(channel="activity", operation="message.create", input=[]) @@ -214,10 +232,10 @@ def test_empty_path_mounts_at_app_root(self) -> None: assert agent.runs, "expected the agent to be invoked" def test_response_hook_can_rewrite_originating_reply(self) -> None: - contexts: list[Any] = [] + seen_kwargs: list[dict[str, Any]] = [] def hook(result: HostedRunResult, **kwargs: Any) -> HostedRunResult: - contexts.append(kwargs["context"]) + seen_kwargs.append(dict(kwargs)) return HostedRunResult(_FakeAgentResponse(text=result.result.text.upper()), session=result.session) ch, agent = _make_teams() @@ -231,9 +249,8 @@ def hook(result: HostedRunResult, **kwargs: Any) -> HostedRunResult: assert ch._http is not None body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] assert body["text"] == "HI THERE" - assert contexts - assert contexts[0].channel_name == "activity" - assert contexts[0].originating is True + assert seen_kwargs + assert seen_kwargs[0]["channel_name"] == "activity" def test_non_message_activities_are_acked(self) -> None: ch, agent = _make_teams() @@ -346,10 +363,7 @@ async def handle(ctx: ChannelCommandContext) -> None: assert r.status_code == 200 assert not agent.runs - def test_run_hook_applied_to_command_request(self) -> None: - def hook(request: ChannelRequest, **_: Any) -> ChannelRequest: - return replace(request, session=ChannelSession(isolation_key="resolved-key")) - + def test_command_request_uses_activity_session(self) -> None: captured: list[str] = [] async def handle(ctx: ChannelCommandContext) -> None: @@ -358,7 +372,6 @@ async def handle(ctx: ChannelCommandContext) -> None: agent = _FakeAgent("hi") ch = ActivityProtocolChannel(send_typing_action=False, commands=[ChannelCommand("todos", "x", handle)]) - ch._hook = hook fake_http = MagicMock() response_mock = MagicMock() response_mock.raise_for_status = MagicMock() @@ -370,7 +383,7 @@ async def handle(ctx: ChannelCommandContext) -> None: with 
TestClient(host.app) as client: r = client.post("/activity/messages", json=dict(_VALID_ACTIVITY, text="/todos")) assert r.status_code == 200 - assert captured == ["resolved-key"] + assert captured == [activity_protocol_isolation_key("19:meeting_xyz@thread.v2")] class TestOutbound: @@ -385,79 +398,9 @@ async def test_send_message_posts_to_conversation_url(self) -> None: assert body["text"] == "hi" -class TestPush: - """The channel implements ``host.ChannelPush`` so it can be a - non-originating destination for cross-channel fan-out / echo replay.""" - - def test_is_channel_push_instance(self) -> None: - from agent_framework_hosting import ChannelPush - - ch, _agent = _make_teams() - assert isinstance(ch, ChannelPush) - - def _identity(self) -> ChannelIdentity: - return ChannelIdentity( - channel="activity", - native_id="19:meeting_xyz@thread.v2", - attributes={ - "service_url": "https://smba.trafficmanager.net/amer/", - "conversation": {"id": "19:meeting_xyz@thread.v2"}, - "bot": {"id": "bot-1"}, - "user": {"id": "user-1"}, - "channel_id": "msteams", - "locale": "en-US", - }, - ) - - async def test_push_posts_proactive_activity(self) -> None: - ch, _agent = _make_teams() - await ch.push(self._identity(), _text_result("broadcast hello")) - assert ch._http is not None - ch._http.post.assert_called() # type: ignore[attr-defined] - url = ch._http.post.call_args[0][0] # type: ignore[attr-defined] - assert url == ("https://smba.trafficmanager.net/amer/v3/conversations/19:meeting_xyz@thread.v2/activities") - body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] - assert body["text"] == "broadcast hello" - # Outbound activity speaks AS the bot: inbound recipient -> from, - # inbound from -> recipient. 
- assert body["from"] == {"id": "bot-1"} - assert body["recipient"] == {"id": "user-1"} - assert body["conversation"] == {"id": "19:meeting_xyz@thread.v2"} - - async def test_push_requires_service_url(self) -> None: - ch, _agent = _make_teams() - identity = ChannelIdentity( - channel="activity", - native_id="conv-x", - attributes={"conversation": {"id": "conv-x"}}, - ) - with pytest.raises(ValueError, match="service_url"): - await ch.push(identity, _text_result("hi")) - - async def test_push_rejects_disallowed_service_url(self) -> None: - # ``push`` runs out-of-band against a persisted identity, so it must - # re-validate the service_url against the allow-list rather than trust - # the value captured (possibly hours) earlier. - ch, _agent = _make_teams() - identity = ChannelIdentity( - channel="activity", - native_id="conv-x", - attributes={ - "service_url": "https://attacker.example.com/", - "conversation": {"id": "conv-x"}, - "bot": {"id": "bot-1"}, - "user": {"id": "user-1"}, - }, - ) - with pytest.raises(ValueError, match="not in the allowed hosts"): - await ch.push(identity, _text_result("hi")) - assert ch._http is not None - ch._http.post.assert_not_called() # type: ignore[attr-defined] - - class TestIdentityRecording: """``_process_activity`` must stamp the inbound conversation reference - onto ``ChannelRequest.identity`` so the host can record it for fan-out.""" + onto ``ChannelRequest.identity`` so hooks and commands can inspect it.""" async def test_inbound_sets_request_identity(self) -> None: ch, agent = _make_teams() @@ -793,56 +736,6 @@ async def get_final_response(self) -> Any: body = ch._http.post.call_args[1]["json"] # type: ignore[attr-defined] assert body["text"] == "(no response)" - async def test_buffer_empty_stream_consults_host_and_can_suppress(self) -> None: - # Empty streamed replies must still consult the host so that - # ``ResponseTarget.none`` (deliver_response -> False) suppresses the - # originating message instead of posting "(no 
response)". - ch, _agent = _make_teams(stream=True) - webchat_activity = {**_VALID_ACTIVITY, "channelId": "directline"} - ctx = MagicMock() - ctx.deliver_response = AsyncMock(return_value=False) - ch._ctx = ctx - - class _EmptyStream: - def __aiter__(self) -> Any: - async def gen() -> Any: - if False: - yield None # type: ignore[unreachable] - - return gen() - - async def get_final_response(self) -> Any: - return _FakeAgentResponse(text="") - - ch._stream_edit_min_interval = 0.0 - await ch._stream_to_conversation(webchat_activity, _VALID_REQUEST, _EmptyStream()) # type: ignore[arg-type] - assert ch._http is not None - ctx.deliver_response.assert_awaited_once() - ch._http.post.assert_not_called() # type: ignore[attr-defined] - ch._http.put.assert_not_called() # type: ignore[attr-defined] - - async def test_edit_empty_stream_consults_host_and_can_suppress(self) -> None: - # Same contract for the edit-capable (Teams) progressive path. - ch, _agent = _make_teams(stream=True) - ctx = MagicMock() - ctx.deliver_response = AsyncMock(return_value=False) - ch._ctx = ctx - - class _EmptyStream: - def __aiter__(self) -> Any: - async def gen() -> Any: - if False: - yield None # type: ignore[unreachable] - - return gen() - - async def get_final_response(self) -> Any: - return _FakeAgentResponse(text="") - - ch._stream_edit_min_interval = 0.0 - await ch._stream_to_conversation(_VALID_ACTIVITY, _VALID_REQUEST, _EmptyStream()) # type: ignore[arg-type] - ctx.deliver_response.assert_awaited_once() - async def test_edit_405_falls_back_to_single_post(self) -> None: # Defensive: a channel advertised as edit-capable that nonetheless # rejects the PUT with 405 must stop editing and POST the final diff --git a/python/packages/hosting-discord/agent_framework_hosting_discord/_channel.py b/python/packages/hosting-discord/agent_framework_hosting_discord/_channel.py index 18da136c0ca..f0297898b01 100644 --- a/python/packages/hosting-discord/agent_framework_hosting_discord/_channel.py +++ 
b/python/packages/hosting-discord/agent_framework_hosting_discord/_channel.py @@ -9,11 +9,11 @@ import logging import re import time -from collections.abc import Awaitable, Callable, Coroutine, Mapping, Sequence +from collections.abc import Callable, Coroutine, Mapping, Sequence from typing import Any, cast import httpx -from agent_framework import AgentResponse, AgentResponseUpdate, Content, Message, ResponseStream +from agent_framework import AgentResponse, AgentResponseUpdate, ResponseStream from agent_framework_hosting import ( ChannelCommand, ChannelCommandContext, @@ -24,10 +24,8 @@ ChannelResponseHook, ChannelRunHook, ChannelSession, - ChannelStreamTransformHook, + ChannelStreamUpdateHook, HostedRunResult, - apply_channel_response_hook, - apply_run_hook, ) from nacl.exceptions import BadSignatureError from nacl.signing import VerifyKey @@ -75,11 +73,6 @@ def _default_isolation_key(interaction: DiscordInteraction) -> str: return discord_isolation_key(guild_id, channel_id, user_id) -def _text_result(text: str) -> HostedRunResult[AgentResponse]: - """Build a host delivery payload from text accumulated by this channel.""" - return HostedRunResult(AgentResponse(messages=[Message(role="assistant", contents=[Content.from_text(text=text)])])) - - class DiscordChannel: """Discord channel backed by signed HTTP Interactions.""" @@ -100,7 +93,7 @@ def __init__( commands: Sequence[ChannelCommand] | None = None, run_hook: ChannelRunHook | None = None, response_hook: ChannelResponseHook | None = None, - stream_transform_hook: ChannelStreamTransformHook | None = None, + stream_update_hook: ChannelStreamUpdateHook | None = None, streaming: bool = False, isolation_key_factory: DiscordIsolationKeyFactory | None = None, skip_signature_verification: bool = False, @@ -133,7 +126,7 @@ def __init__( it reaches the host. response_hook: Optional hook that can rewrite the hosted result before the originating Discord response is serialized. 
- stream_transform_hook: Optional per-update transform hook applied + stream_update_hook: Optional per-update hook applied while streaming. streaming: Whether the agent command should call ``run_stream`` and edit the original interaction response as deltas arrive. @@ -163,7 +156,7 @@ def __init__( self._command_by_name = {command.name: command for command in self._commands} self._run_hook = run_hook self.response_hook = response_hook - self._stream_transform_hook = stream_transform_hook + self._stream_update_hook = stream_update_hook self._streaming = streaming self._isolation_key_factory = isolation_key_factory or _default_isolation_key self._skip_signature_verification = skip_signature_verification @@ -190,25 +183,6 @@ def contribute(self, context: ChannelContext) -> ChannelContribution: on_shutdown=[self._on_shutdown], ) - async def push(self, identity: ChannelIdentity, payload: HostedRunResult[Any]) -> None: - """Push a hosted result to a Discord channel. - - Args: - identity: Destination identity. ``identity.attributes`` must carry - ``channel_id``. - payload: Hosted run result to render as Discord message text. - - Raises: - RuntimeError: If the channel has no bot token for Discord REST. - ValueError: If ``channel_id`` is missing from the identity. 
- """ - channel_id = _string_or_none(identity.attributes.get("channel_id")) - if channel_id is None: - raise ValueError("Discord push requires identity.attributes['channel_id']") - if self.bot_token is None: - raise RuntimeError("DiscordChannel.push requires bot_token to send channel messages") - await self._send_channel_messages(channel_id, _payload_text(payload)) - async def _on_startup(self) -> None: """Open the Discord REST client and optionally register slash commands.""" self._ensure_http() @@ -297,23 +271,17 @@ async def _run_agent_command(self, interaction: DiscordInteraction, token: str) input_value=prompt, stream=self._streaming, ) - if self._run_hook is not None: - request = await apply_run_hook( - self._run_hook, - request, - target=self._ctx.target, - protocol_request=interaction, - ) if request.stream: - await self._run_streaming(request, token) + await self._run_streaming(request, token, protocol_request=interaction) return - result = await self._ctx.run(request) - include_originating = await self._ctx.deliver_response(request, result) - if include_originating: - result = await apply_channel_response_hook(self, result, request=request, originating=True) - await self._edit_original_with_result(token, result) - else: - await self._edit_original(token, "Sent.") + result = await self._ctx.run( + request=request, + run_hook=self._run_hook, + protocol_request=interaction, + response_hook=self.response_hook, + channel_name=self.name, + ) + await self._edit_original_with_result(token, result) async def _run_channel_command( self, @@ -333,23 +301,23 @@ async def _run_channel_command( if not reply.sent: await self._edit_original(token, "Done.") - async def _run_streaming(self, request: ChannelRequest, token: str) -> None: + async def _run_streaming( + self, request: ChannelRequest, token: str, *, protocol_request: DiscordInteraction | None = None + ) -> None: if self._ctx is None: raise RuntimeError("DiscordChannel was not contributed to a host.") - stream: 
ResponseStream[AgentResponseUpdate, AgentResponse] = self._ctx.run_stream(request) + stream: ResponseStream[AgentResponseUpdate, AgentResponse] = await self._ctx.run_stream( + request, + run_hook=self._run_hook, + protocol_request=protocol_request, + stream_update_hook=self._stream_update_hook, + response_hook=self.response_hook, + channel_name=self.name, + ) accumulated: list[str] = [] last_edit = 0.0 async for update in stream: - transformed: AgentResponseUpdate | None = update - if self._stream_transform_hook is not None: - maybe = self._stream_transform_hook(update) - if isinstance(maybe, Awaitable): - transformed = await cast("Awaitable[AgentResponseUpdate | None]", maybe) - else: - transformed = maybe - if transformed is None: - continue - chunk = _update_text(transformed) + chunk = _update_text(update) if not chunk: continue accumulated.append(chunk) @@ -358,13 +326,8 @@ async def _run_streaming(self, request: ChannelRequest, token: str) -> None: await self._edit_original(token, _stream_preview_content("".join(accumulated))) last_edit = now - final = _text_result("".join(accumulated)) - include_originating = await self._ctx.deliver_response(request, final) - if include_originating: - final = await apply_channel_response_hook(self, final, request=request, originating=True) - await self._edit_original_with_result(token, final) - else: - await self._edit_original(token, "Sent.") + final_response = await stream.get_final_response() + await self._edit_original_with_result(token, HostedRunResult(final_response)) def _build_request( self, @@ -478,16 +441,6 @@ async def _send_followup(self, token: str, content: str) -> None: ) _raise_for_discord_error(response, "send interaction follow-up") - async def _send_channel_messages(self, channel_id: str, content: str) -> None: - http = self._ensure_http() - for chunk in _split_content(content): - response = await http.post( - f"/channels/{channel_id}/messages", - headers=self._bot_headers(), - json={"content": chunk}, - ) 
- _raise_for_discord_error(response, "send channel message") - def _bot_headers(self) -> dict[str, str]: if self.bot_token is None: raise RuntimeError("Discord bot token is required for this operation") diff --git a/python/packages/hosting-discord/tests/discord/test_channel.py b/python/packages/hosting-discord/tests/discord/test_channel.py index 5addcd9a065..f402fde90fb 100644 --- a/python/packages/hosting-discord/tests/discord/test_channel.py +++ b/python/packages/hosting-discord/tests/discord/test_channel.py @@ -3,7 +3,7 @@ from __future__ import annotations import json -from collections.abc import AsyncIterator +from collections.abc import AsyncIterator, Awaitable from typing import Any import httpx @@ -13,7 +13,6 @@ ChannelCommand, ChannelCommandContext, ChannelRequest, - ChannelResponseContext, HostedRunResult, ) from nacl.signing import SigningKey @@ -60,39 +59,95 @@ def _headers(signing_key: SigningKey, body: bytes) -> dict[str, str]: class _FakeContext: - def __init__(self, *, text: str = "agent reply", include_originating: bool = True) -> None: + def __init__(self, *, text: str = "agent reply") -> None: self.target = object() self.text = text - self.include_originating = include_originating self.requests: list[ChannelRequest] = [] - self.delivered: list[tuple[ChannelRequest, HostedRunResult[Any]]] = [] - self.stream: _FakeStream | None = None - - async def run(self, request: ChannelRequest) -> HostedRunResult[AgentResponse]: + self.fake_stream: _FakeStream | None = None + + async def run( + self, + request: ChannelRequest, + *, + run_hook: Any | None = None, + protocol_request: Any | None = None, + response_hook: Any | None = None, + channel_name: str | None = None, + ) -> HostedRunResult[AgentResponse]: + if run_hook is not None: + maybe_request = run_hook(request, target=self.target, protocol_request=protocol_request) + if isinstance(maybe_request, Awaitable): + request = await maybe_request + else: + request = maybe_request self.requests.append(request) 
- return _run_result(self.text) - - def run_stream(self, request: ChannelRequest) -> _FakeStream: + result = _run_result(self.text) + if response_hook is not None: + maybe_result = response_hook(result, request=request, channel_name=channel_name or request.channel) + if isinstance(maybe_result, Awaitable): + return await maybe_result + return maybe_result + return result + + async def run_stream( + self, + request: ChannelRequest, + *, + run_hook: Any | None = None, + protocol_request: Any | None = None, + stream_update_hook: Any | None = None, + response_hook: Any | None = None, + channel_name: str | None = None, + ) -> _FakeStream: + if run_hook is not None: + maybe_request = run_hook(request, target=self.target, protocol_request=protocol_request) + if isinstance(maybe_request, Awaitable): + request = await maybe_request + else: + request = maybe_request self.requests.append(request) - if self.stream is None: - self.stream = _FakeStream(["a", "b"]) - return self.stream - - async def deliver_response(self, request: ChannelRequest, payload: HostedRunResult[Any]) -> bool: - self.delivered.append((request, payload)) - return self.include_originating + if self.fake_stream is None: + self.fake_stream = _FakeStream(["a", "b"]) + if stream_update_hook is not None: + self.fake_stream.transform = stream_update_hook + if response_hook is not None: + self.fake_stream.response_hook = response_hook + self.fake_stream.request = request + self.fake_stream.channel_name = channel_name or request.channel + return self.fake_stream class _FakeStream: def __init__(self, chunks: list[str]) -> None: self._chunks = chunks + self.transform: Any | None = None + self.response_hook: Any | None = None + self.request: ChannelRequest | None = None + self.channel_name: str | None = None def __aiter__(self) -> AsyncIterator[AgentResponseUpdate]: return self._iter() async def _iter(self) -> AsyncIterator[AgentResponseUpdate]: for chunk in self._chunks: - yield 
AgentResponseUpdate(contents=[Content.from_text(text=chunk)], role="assistant") + update = AgentResponseUpdate(contents=[Content.from_text(text=chunk)], role="assistant") + if self.transform is not None: + transformed = self.transform(update) + if isinstance(transformed, Awaitable): + transformed = await transformed + if transformed is None: + continue + update = transformed + yield update + + async def get_final_response(self) -> AgentResponse: + result = _run_result("".join(self._chunks)) + if self.response_hook is None: + return result.result + shaped = self.response_hook(result, request=self.request, channel_name=self.channel_name) + if isinstance(shaped, Awaitable): + shaped = await shaped + return shaped.result class _DiscordRecorder: @@ -212,7 +267,6 @@ async def test_agent_command_runs_host_and_edits_original_response() -> None: assert context.requests[0].identity is not None assert context.requests[0].identity.native_id == "user-1" assert context.requests[0].identity.attributes["channel_id"] == "channel-1" - assert len(context.delivered) == 1 assert recorder.requests[0].method == "PATCH" assert recorder.requests[0].url.path == "/webhooks/app-1/token/messages/@original" assert recorder.json_payloads[0] == {"content": "agent says hi"} @@ -232,7 +286,6 @@ async def hook(request: ChannelRequest, **_: Any) -> ChannelRequest: attributes=request.attributes, stream=request.stream, identity=request.identity, - response_target=request.response_target, ) channel = DiscordChannel( @@ -254,9 +307,9 @@ async def test_response_hook_rewrites_originating_reply() -> None: recorder = _DiscordRecorder() context = _FakeContext(text="original") - async def hook(result: HostedRunResult[Any], *, context: ChannelResponseContext) -> HostedRunResult[Any]: - assert context.originating is True + async def hook(result: HostedRunResult[Any], **kwargs: Any) -> HostedRunResult[Any]: assert result.result.text == "original" + assert kwargs["channel_name"] == "discord" return 
_run_result("rewritten") channel = DiscordChannel( @@ -274,23 +327,6 @@ async def hook(result: HostedRunResult[Any], *, context: ChannelResponseContext) assert recorder.json_payloads[-1] == {"content": "rewritten"} -async def test_deliver_response_false_acknowledges_without_originating_payload() -> None: - recorder = _DiscordRecorder() - context = _FakeContext(text="fanout only", include_originating=False) - channel = DiscordChannel( - application_id="app-1", - public_key=SigningKey.generate().verify_key.encode().hex(), - register_commands=False, - api_base_url="https://discord.test", - ) - channel.contribute(context) # type: ignore[arg-type] - channel._http = httpx.AsyncClient(base_url="https://discord.test", transport=recorder.transport()) - - await channel._run_agent_command(_interaction(), "token") - - assert recorder.json_payloads[-1] == {"content": "Sent."} - - async def test_missing_prompt_edits_original_without_calling_host() -> None: recorder = _DiscordRecorder() context = _FakeContext(text="should not run") @@ -513,75 +549,6 @@ async def test_originating_reply_sends_followup_chunks() -> None: assert [len(payload["content"]) for payload in recorder.json_payloads] == [2000, 1] -async def test_push_requires_channel_id_and_sends_chunked_messages() -> None: - recorder = _DiscordRecorder() - channel = DiscordChannel( - application_id="app-1", - public_key=SigningKey.generate().verify_key.encode().hex(), - bot_token="bot-token", - register_commands=False, - api_base_url="https://discord.test", - ) - channel._http = httpx.AsyncClient(base_url="https://discord.test", transport=recorder.transport()) - - await channel.push( - identity=channel._identity_from_interaction(_interaction()), # pyright: ignore[reportPrivateUsage] - payload=_run_result("a" * 2001), - ) - - assert [request.url.path for request in recorder.requests] == [ - "/channels/channel-1/messages", - "/channels/channel-1/messages", - ] - assert [len(payload["content"]) for payload in 
recorder.json_payloads] == [2000, 1] - - -async def test_push_renders_no_response_for_unknown_payload_shape() -> None: - recorder = _DiscordRecorder() - channel = DiscordChannel( - application_id="app-1", - public_key=SigningKey.generate().verify_key.encode().hex(), - bot_token="bot-token", - register_commands=False, - api_base_url="https://discord.test", - ) - channel._http = httpx.AsyncClient(base_url="https://discord.test", transport=recorder.transport()) - - await channel.push( - identity=channel._identity_from_interaction(_interaction()), # pyright: ignore[reportPrivateUsage] - payload=HostedRunResult(object()), - ) - - assert recorder.json_payloads == [{"content": "(no response)"}] - - -async def test_push_requires_bot_token_and_channel_id() -> None: - identity = DiscordChannel( - application_id="app-1", - public_key=SigningKey.generate().verify_key.encode().hex(), - register_commands=False, - )._identity_from_interaction(_interaction()) # pyright: ignore[reportPrivateUsage] - no_bot_token = DiscordChannel( - application_id="app-1", - public_key=SigningKey.generate().verify_key.encode().hex(), - register_commands=False, - ) - no_channel_id = DiscordChannel( - application_id="app-1", - public_key=SigningKey.generate().verify_key.encode().hex(), - bot_token="bot-token", - register_commands=False, - ) - - with pytest.raises(RuntimeError, match="bot_token"): - await no_bot_token.push(identity=identity, payload=_run_result("hello")) - with pytest.raises(ValueError, match="channel_id"): - await no_channel_id.push( - identity=type(identity)(channel=identity.channel, native_id=identity.native_id, attributes={}), - payload=_run_result("hello"), - ) - - async def test_streaming_edits_original_and_delivers_final_response() -> None: recorder = _DiscordRecorder() context = _FakeContext() @@ -599,14 +566,12 @@ async def test_streaming_edits_original_and_delivers_final_response() -> None: await channel._run_agent_command(_interaction(), "token") assert [payload["content"] 
for payload in recorder.json_payloads] == ["a", "ab", "ab"] - assert len(context.delivered) == 1 - assert context.delivered[0][1].result.text == "ab" async def test_streaming_preview_is_limited_and_final_reply_is_chunked() -> None: recorder = _DiscordRecorder() context = _FakeContext() - context.stream = _FakeStream(["a" * 2001]) + context.fake_stream = _FakeStream(["a" * 2001]) channel = DiscordChannel( application_id="app-1", public_key=SigningKey.generate().verify_key.encode().hex(), @@ -622,12 +587,11 @@ async def test_streaming_preview_is_limited_and_final_reply_is_chunked() -> None assert [request.method for request in recorder.requests] == ["PATCH", "PATCH", "POST"] assert [len(payload["content"]) for payload in recorder.json_payloads] == [2000, 2000, 1] - assert len(context.delivered[0][1].result.text) == 2001 -async def test_stream_transform_hook_can_drop_updates_and_disable_originating_reply() -> None: +async def test_stream_update_hook_can_drop_updates() -> None: recorder = _DiscordRecorder() - context = _FakeContext(include_originating=False) + context = _FakeContext() async def hook(update: AgentResponseUpdate) -> AgentResponseUpdate | None: if update.text == "a": @@ -639,7 +603,7 @@ async def hook(update: AgentResponseUpdate) -> AgentResponseUpdate | None: public_key=SigningKey.generate().verify_key.encode().hex(), register_commands=False, streaming=True, - stream_transform_hook=hook, + stream_update_hook=hook, edit_interval=0, api_base_url="https://discord.test", ) @@ -648,11 +612,10 @@ async def hook(update: AgentResponseUpdate) -> AgentResponseUpdate | None: await channel._run_agent_command(_interaction(), "token") - assert [payload["content"] for payload in recorder.json_payloads] == ["b", "Sent."] - assert context.delivered[0][1].result.text == "b" + assert [payload["content"] for payload in recorder.json_payloads] == ["b", "ab"] -async def test_stream_transform_hook_can_synchronously_rewrite_updates() -> None: +async def 
test_stream_update_hook_can_synchronously_rewrite_updates() -> None: recorder = _DiscordRecorder() context = _FakeContext() @@ -664,7 +627,7 @@ def hook(_update: AgentResponseUpdate) -> AgentResponseUpdate: public_key=SigningKey.generate().verify_key.encode().hex(), register_commands=False, streaming=True, - stream_transform_hook=hook, + stream_update_hook=hook, edit_interval=0, api_base_url="https://discord.test", ) @@ -673,7 +636,7 @@ def hook(_update: AgentResponseUpdate) -> AgentResponseUpdate: await channel._run_agent_command(_interaction(), "token") - assert [payload["content"] for payload in recorder.json_payloads] == ["x", "xx", "xx"] + assert [payload["content"] for payload in recorder.json_payloads] == ["x", "xx", "ab"] async def _noop() -> None: diff --git a/python/packages/hosting-entra/LICENSE b/python/packages/hosting-entra/LICENSE deleted file mode 100644 index 9e841e7a26e..00000000000 --- a/python/packages/hosting-entra/LICENSE +++ /dev/null @@ -1,21 +0,0 @@ - MIT License - - Copyright (c) Microsoft Corporation. - - Permission is hereby granted, free of charge, to any person obtaining a copy - of this software and associated documentation files (the "Software"), to deal - in the Software without restriction, including without limitation the rights - to use, copy, modify, merge, publish, distribute, sublicense, and/or sell - copies of the Software, and to permit persons to whom the Software is - furnished to do so, subject to the following conditions: - - The above copyright notice and this permission notice shall be included in all - copies or substantial portions of the Software. - - THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR - IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, - FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL THE - AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER - LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, - OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE - SOFTWARE diff --git a/python/packages/hosting-entra/README.md b/python/packages/hosting-entra/README.md deleted file mode 100644 index 6e0073812b1..00000000000 --- a/python/packages/hosting-entra/README.md +++ /dev/null @@ -1,39 +0,0 @@ -# agent-framework-hosting-entra - -Microsoft Entra (Azure AD) identity-linking sidecar channel for -[agent-framework-hosting](../hosting). Owns the OAuth 2.0 Authorization Code -flow that binds a per-channel id (e.g. a Telegram chat id) to the user's -Entra object id, so multiple non-Entra channels can share a single -`entra:` isolation key. - -## Usage - -```python -from pathlib import Path -from agent_framework_hosting import AgentFrameworkHost -from agent_framework_hosting_entra import ( - EntraIdentityLinkChannel, - EntraIdentityStore, -) - -store = EntraIdentityStore(Path("./identity_links.json")) - -host = AgentFrameworkHost( - target=my_agent, - channels=[ - EntraIdentityLinkChannel( - store=store, - tenant_id="", - client_id="", - client_secret="", - public_base_url="https://your.host", - ), - # ... other channels whose run hooks call store.lookup(...) - ], -) -host.serve() -``` - -For tenants that disallow client secrets, pass `certificate_path=` (and -optionally `certificate_password=`) instead of `client_secret`. The PEM -layout matches the one used by `agent-framework-hosting-teams`. diff --git a/python/packages/hosting-entra/agent_framework_hosting_entra/__init__.py b/python/packages/hosting-entra/agent_framework_hosting_entra/__init__.py deleted file mode 100644 index 6e1bba53b88..00000000000 --- a/python/packages/hosting-entra/agent_framework_hosting_entra/__init__.py +++ /dev/null @@ -1,15 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. 
- -"""Microsoft Entra (Azure AD) identity channel for :mod:`agent_framework_hosting`.""" - -from ._channel import ( - EntraIdentityLinkChannel, - EntraIdentityStore, - entra_isolation_key, -) - -__all__ = [ - "EntraIdentityLinkChannel", - "EntraIdentityStore", - "entra_isolation_key", -] diff --git a/python/packages/hosting-entra/agent_framework_hosting_entra/_channel.py b/python/packages/hosting-entra/agent_framework_hosting_entra/_channel.py deleted file mode 100644 index 7e6075638c2..00000000000 --- a/python/packages/hosting-entra/agent_framework_hosting_entra/_channel.py +++ /dev/null @@ -1,505 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. - -"""Microsoft Entra (Azure AD) identity-linking sidecar channel. - -Implements the OAuth 2.0 Authorization Code flow against Entra so users on -non-Entra channels (Telegram, Responses callers without a verified token, -etc.) can bind their per-channel id to a stable ``entra:`` isolation -key. Once the link is established, channel run-hooks can call -:meth:`EntraIdentityStore.lookup` and rewrite the request to use the Entra -key instead of the channel-native id. - -Two credential modes are supported: - -* ``client_secret`` — confidential-client secret. -* ``certificate_path`` — PEM bundle (private key + cert) for tenants that - disallow secrets. The Teams channel uses the same PEM layout; see - :mod:`agent_framework_hosting_teams` for the openssl recipe. 
-""" - -from __future__ import annotations - -import asyncio -import hashlib -import hmac -import html -import json -import secrets -import time -from dataclasses import dataclass -from pathlib import Path -from typing import Any -from urllib.parse import urlencode, urlparse - -import httpx -import msal -from agent_framework_hosting import ( - ChannelContext, - ChannelContribution, - logger, -) -from cryptography import x509 -from cryptography.hazmat.primitives import hashes, serialization -from starlette.requests import Request -from starlette.responses import HTMLResponse, RedirectResponse, Response -from starlette.routing import Route - - -def entra_isolation_key(oid: str) -> str: - """Canonical isolation key for a user identified by Entra object id.""" - return f"entra:{oid}" - - -class EntraIdentityStore: - """Tiny JSON-backed mapping ``: → entra:``. - - Production deployments should swap this for a real KV store. Single-file - JSON is fine for samples because writes are infrequent (only during the - OAuth callback) and we serialize them under an asyncio lock. - """ - - def __init__(self, path: Path) -> None: - """Open an identity store backed by ``path``. - - Loads any existing JSON document; an unreadable or corrupt file is - logged and replaced with an empty in-memory map so callers always - get a usable store. - """ - self._path = path - self._lock = asyncio.Lock() - self._data: dict[str, str] = {} - if path.exists(): - try: - self._data = json.loads(path.read_text()) - except Exception: - logger.exception("identity store load failed; starting empty") - - def lookup(self, channel_key: str) -> str | None: - """Return the linked ``entra:`` key for a per-channel id, or ``None``.""" - return self._data.get(channel_key) - - async def link(self, channel_key: str, oid: str) -> None: - """Bind ``channel_key`` (e.g. ``telegram:123``) to the Entra ``oid`` and persist. 
- - Overwrites any existing mapping for ``channel_key`` and rewrites the - backing JSON file under the lock so concurrent callers cannot race. - """ - async with self._lock: - self._data[channel_key] = entra_isolation_key(oid) - self._path.write_text(json.dumps(self._data, indent=2, sort_keys=True)) - - async def unlink(self, channel_key: str) -> None: - """Remove the mapping for ``channel_key``; no-op if absent. - - The file is only rewritten when an entry actually existed so we - don't churn disk on idempotent unlink calls. - """ - async with self._lock: - if self._data.pop(channel_key, None) is not None: - self._path.write_text(json.dumps(self._data, indent=2, sort_keys=True)) - - -@dataclass -class _PendingAuth: - """In-memory record of an authorize redirect waiting for its OAuth callback.""" - - channel: str - channel_id: str - expires_at: float - return_to: str | None = None - - -def _link_html(body: str, *, status: int = 200) -> HTMLResponse: - """Wrap ``body`` in a minimal HTML shell suitable for browser link UIs.""" - return HTMLResponse( - f"{body}", - status_code=status, - ) - - -def _load_certificate_credential(certificate_path: str | Path, certificate_password: bytes | None) -> dict[str, str]: - """Build the ``msal`` certificate credential dict from a PEM bundle. - - Expects ``certificate_path`` to point at a single PEM containing the - private key followed by the X.509 certificate (the layout produced by - ``cat key.pem cert.pem > combined.pem``). 
- """ - pem_bytes = Path(certificate_path).read_bytes() - private_key = serialization.load_pem_private_key(pem_bytes, password=certificate_password) - cert = x509.load_pem_x509_certificate(pem_bytes) - - private_key_pem = private_key.private_bytes( - encoding=serialization.Encoding.PEM, - format=serialization.PrivateFormat.PKCS8, - encryption_algorithm=serialization.NoEncryption(), - ).decode() - public_cert_pem = cert.public_bytes(serialization.Encoding.PEM).decode() - # SHA-1 thumbprint is required by the Entra ``client_assertion`` spec for cert auth — not a security choice. - thumbprint = cert.fingerprint(hashes.SHA1()).hex() # noqa: S303 - return { - "private_key": private_key_pem, - "thumbprint": thumbprint, - "public_certificate": public_cert_pem, - } - - -class EntraIdentityLinkChannel: - """Sidecar Channel exposing ``GET /auth/start`` and ``GET /auth/callback``. - - Demonstrates that ``Channel`` is a general extensibility point — not just - for chat surfaces. Owns the Entra OAuth Authorization Code flow used to - bind a per-channel id (e.g. Telegram chat id) to the user's Entra object - id. - - Two credential modes are supported (mutually exclusive): - - * ``client_secret`` — classic confidential-client secret. - * ``certificate_path`` — PEM bundle (private key + certificate) for - tenants that disallow secrets. See ``teams.py`` module docstring for - an ``openssl`` recipe; the same PEM works here. - - Flow (OAuth 2.0 Authorization Code, confidential client): - - 1. ``GET /auth/start?channel=&id=`` mints a one-shot - ``state`` token and 302s to the Entra ``authorize`` endpoint. - 2. User signs in; Entra calls ``GET /auth/callback?code=...&state=...``. - 3. We exchange the code for a token (via ``msal`` so secret + cert auth - look identical at the call site), call Microsoft Graph ``/me`` to - read ``id`` (oid), persist ``: → entra:``, and - respond with a friendly HTML page (or 302 to ``return_to``). 
- - Tokens never leave the host process; only the ``oid`` claim is stored. - """ - - name = "identity" - path = "/auth" - - _AUTHORITY_TEMPLATE = "https://login.microsoftonline.com/{tenant}" - _GRAPH_ME = "https://graph.microsoft.com/v1.0/me" - _PENDING_TTL_SECONDS = 600 # 10 minutes - - def __init__( - self, - *, - store: EntraIdentityStore, - tenant_id: str, - client_id: str, - public_base_url: str, - client_secret: str | None = None, - certificate_path: str | Path | None = None, - certificate_password: bytes | None = None, - scope: str = "openid profile User.Read", - link_token_secret: str | None = None, - link_token_ttl_seconds: int = 600, - ) -> None: - if bool(client_secret) == bool(certificate_path): - raise ValueError("IdentityLinkChannel: pass exactly one of client_secret or certificate_path.") - if certificate_path is not None: - credential: str | dict[str, str] = _load_certificate_credential(certificate_path, certificate_password) - self._auth_kind = "certificate" - else: - credential = client_secret # type: ignore[assignment] - self._auth_kind = "client_secret" - - self._store = store - self._tenant_id = tenant_id - self._client_id = client_id - self._public_base_url = public_base_url.rstrip("/") - self._scopes = [s for s in scope.split() if s and s.lower() not in {"openid", "profile", "offline_access"}] - # MSAL ConfidentialClientApplication is sync; we wrap blocking calls - # in ``asyncio.to_thread`` because token endpoint calls do real I/O. - self._msal_app = msal.ConfidentialClientApplication( - client_id=client_id, - authority=self._AUTHORITY_TEMPLATE.format(tenant=tenant_id), - client_credential=credential, - ) - self._pending: dict[str, _PendingAuth] = {} - self._http: httpx.AsyncClient | None = None - # ``link_token_secret`` is the HMAC key that gates ``/auth/start``. 
- # Without it any open-internet caller can mint a binding for an - # arbitrary ``(channel, channel_id)`` pair and IDOR the victim's - # isolation key (see PR review on 0026 for the threat model). - # Optional only so dev-mode samples without the integration in - # place don't have to scramble for a secret; unsigned mode logs - # a loud warning at startup and wire-time. - self._link_token_secret = link_token_secret.encode("utf-8") if link_token_secret else None - self._link_token_ttl = link_token_ttl_seconds - # Allowed redirect-back hosts: relative paths and same-origin only. - # ``return_to`` from the unauthenticated /start query string is - # otherwise an open redirect (auth-host phishing vector). - parsed = urlparse(self._public_base_url) - self._allowed_return_host = parsed.netloc.lower() if parsed.netloc else None - - @property - def redirect_uri(self) -> str: - """The fully-qualified OAuth redirect URI registered with Entra ID. - - Computed from ``public_base_url`` plus the channel's mount path so - operators can copy it straight into the app registration's reply URLs. - """ - return f"{self._public_base_url}{self.path}/callback" - - def contribute(self, context: "ChannelContext") -> "ChannelContribution": - """Mount the ``/start`` and ``/callback`` routes plus lifecycle hooks.""" - return ChannelContribution( - routes=[ - Route("/start", self._handle_start, methods=["GET"]), - Route("/callback", self._handle_callback, methods=["GET"]), - ], - on_startup=[self._on_startup], - on_shutdown=[self._on_shutdown], - ) - - async def _on_startup(self) -> None: - """Open the shared HTTP client used for Microsoft Graph calls.""" - self._http = httpx.AsyncClient(timeout=15.0) - if self._link_token_secret is None: - logger.warning( - "EntraIdentityLinkChannel running WITHOUT link_token_secret. 
" - "GET /auth/start accepts unauthenticated (channel, id) pairs, " - "which means any open-internet caller can bind their Entra " - "account to a victim's per-channel id (IDOR on the identity " - "store). Pass link_token_secret=, mint URLs via " - "mint_start_url(...), and gate /start in front of the " - "channel that issues those URLs." - ) - logger.info( - "IdentityLinkChannel ready (auth=%s, signed_start=%s); redirect_uri=%s", - self._auth_kind, - self._link_token_secret is not None, - self.redirect_uri, - ) - - async def _on_shutdown(self) -> None: - """Close the Graph HTTP client; safe to call when never started.""" - if self._http is not None: - await self._http.aclose() - - # -- link-token helpers ----------------------------------------------- # - - def _sign_link_token(self, channel: str, channel_id: str, expires_at: int) -> str: - """Sign ``(channel, channel_id, expires_at)`` with HMAC-SHA256.""" - if self._link_token_secret is None: # pragma: no cover - guarded by callers - raise RuntimeError("link_token_secret is required to mint link tokens") - msg = f"{channel}|{channel_id}|{expires_at}".encode() - return hmac.new(self._link_token_secret, msg, hashlib.sha256).hexdigest() - - def _verify_link_token(self, channel: str, channel_id: str, expires_at: int, signature: str) -> bool: - """Constant-time verify the link-token signature and TTL.""" - if self._link_token_secret is None: # pragma: no cover - guarded by callers - return False - if expires_at < int(time.time()): - return False - expected = self._sign_link_token(channel, channel_id, expires_at) - return hmac.compare_digest(expected, signature) - - def mint_start_url(self, channel: str, channel_id: str, return_to: str | None = None) -> str: - """Return a one-shot signed URL for ``GET /auth/start``. - - Required when ``link_token_secret`` is set. Channels that issue - these URLs (e.g. 
a Telegram ``/link`` command after verifying the - inbound webhook signature) call this helper so the resulting URL - proves the caller authorised the ``(channel, channel_id)`` binding. - - Without this layer ``GET /auth/start`` is an IDOR vector: any - anonymous caller can bind a victim's per-channel id to their own - Entra ``oid``. - """ - if self._link_token_secret is None: - raise RuntimeError("mint_start_url requires link_token_secret in the constructor") - if return_to is not None: - self._validate_return_to(return_to) # fail fast at mint time - expires_at = int(time.time()) + self._link_token_ttl - sig = self._sign_link_token(channel, str(channel_id), expires_at) - params = { - "channel": channel, - "id": str(channel_id), - "exp": str(expires_at), - "sig": sig, - } - if return_to: - params["return_to"] = return_to - return f"{self._public_base_url}{self.path}/start?{urlencode(params)}" - - def _validate_return_to(self, return_to: str) -> None: - """Reject open-redirect targets. - - Allows: relative paths starting with ``/``, or absolute URLs whose - host equals the configured ``public_base_url``'s host. Rejects - everything else with ``ValueError``. - """ - if return_to.startswith("/") and not return_to.startswith("//"): - return # relative path, safe. 
- parsed = urlparse(return_to) - if not parsed.netloc: - return - if self._allowed_return_host and parsed.netloc.lower() == self._allowed_return_host: - return - raise ValueError( - f"return_to must be a relative path or same-origin URL " - f"(public_base_url host={self._allowed_return_host!r}); got {return_to!r}" - ) - - def authorize_url_for(self, channel: str, channel_id: str, return_to: str | None = None) -> str: - """Mint a one-shot authorize URL the user can visit to bind their account.""" - state = secrets.token_urlsafe(24) - self._gc_pending() - self._pending[state] = _PendingAuth( - channel=channel, - channel_id=str(channel_id), - expires_at=time.monotonic() + self._PENDING_TTL_SECONDS, - return_to=return_to, - ) - return str( - self._msal_app.get_authorization_request_url( - scopes=self._scopes, - redirect_uri=self.redirect_uri, - state=state, - prompt="select_account", - ) - ) - - def _gc_pending(self) -> None: - """Drop expired pending-auth entries so the in-memory map cannot grow unbounded.""" - now = time.monotonic() - for key, entry in list(self._pending.items()): - if entry.expires_at < now: - self._pending.pop(key, None) - - async def _handle_start(self, request: Request) -> Response: - """``GET /start?channel=&id=&return_to=&exp=&sig=`` — redirect to Entra to sign in. - - **Security model.** When ``link_token_secret`` is set the - request must include ``exp`` + ``sig`` — an HMAC over - ``(channel, channel_id, expires_at)`` minted by - :meth:`mint_start_url`. Without that gate, any open-internet - caller can bind a victim's per-channel id (e.g. - ``telegram:``) to their own Entra ``oid``: the - callback would persist - ``"telegram:" -> "entra:"`` and any - future inbound message from the victim would resolve to the - attacker's isolation key. We make the unsigned mode opt-in - with a loud startup warning so the dev-mode default doesn't - ship to production. 
- - ``return_to`` is validated against the configured - ``public_base_url`` host (or restricted to relative paths) to - prevent open-redirect phishing on a successful sign-in. - """ - channel = request.query_params.get("channel") - channel_id = request.query_params.get("id") - return_to = request.query_params.get("return_to") - if not channel or not channel_id: - return _link_html("Missing 'channel' or 'id' query parameter.", status=400) - - if self._link_token_secret is not None: - sig = request.query_params.get("sig") - exp_raw = request.query_params.get("exp") - try: - exp = int(exp_raw) if exp_raw else 0 - except ValueError: - exp = 0 - if not sig or not exp or not self._verify_link_token(channel, channel_id, exp, sig): - logger.warning( - "EntraIdentityLinkChannel /start rejected: missing/invalid signed link-token (channel=%s, id=%s)", - channel, - channel_id, - ) - return _link_html("Invalid or expired sign-in link.", status=403) - else: - # See _on_startup warning. Logged on every wire access so - # operators can't miss the IDOR exposure in their access logs. - logger.warning( - "EntraIdentityLinkChannel /start accepted UNSIGNED request " - "for (channel=%s, id=%s) — set link_token_secret to require " - "HMAC-signed link tokens minted via mint_start_url().", - channel, - channel_id, - ) - if return_to is not None: - try: - self._validate_return_to(return_to) - except ValueError as exc: - logger.warning("EntraIdentityLinkChannel /start invalid return_to: %s", exc) - return _link_html("Invalid return_to URL.", status=400) - url = self.authorize_url_for(channel, channel_id, return_to=return_to) - return RedirectResponse(url, status_code=302) - - async def _handle_callback(self, request: Request) -> Response: - """``GET /callback`` — finish the OAuth flow and persist the link. 
- - Exchanges the authorization code for a token, reads the user's - ``id``/``userPrincipalName`` from Microsoft Graph, then stores the - ``channel:channel_id -> entra:`` mapping in the identity store. - Renders a small HTML page so a browser-based flow has something to - show; if ``return_to`` was supplied (and validated at /start time - against the same-origin allowlist) it appears as a deep link. - - All values that flow into HTML output (``error``, ``error_description``, - ``channel_key``, ``upn``) are passed through :func:`html.escape` to - avoid reflected XSS — both the OAuth-error path and the - sign-in-success body would otherwise execute attacker-controlled - markup on the auth host's origin. - """ - if self._http is None: # pragma: no cover - guarded by lifecycle - raise RuntimeError("entra identity channel not started") - if error := request.query_params.get("error"): - description = request.query_params.get("error_description", "") - return _link_html( - f"Sign-in failed: {html.escape(error)}
{html.escape(description)}", - status=400, - ) - - code = request.query_params.get("code") - state = request.query_params.get("state") - pending = self._pending.pop(state or "", None) - if not code or pending is None or pending.expires_at < time.monotonic(): - return _link_html("Invalid or expired sign-in state. Please retry.", status=400) - - # MSAL handles client_secret vs client_assertion (cert) under the hood. - result: dict[str, Any] = await asyncio.to_thread( - self._msal_app.acquire_token_by_authorization_code, - code, - scopes=self._scopes, - redirect_uri=self.redirect_uri, - ) - if "access_token" not in result: - logger.warning("Entra token exchange failed: %s", result) - err_text = result.get("error_description") or result.get("error") or "unknown error" - return _link_html( - f"Token exchange failed: {html.escape(str(err_text))}", - status=502, - ) - access_token = result["access_token"] - - me = await self._http.get(self._GRAPH_ME, headers={"Authorization": f"Bearer {access_token}"}) - if me.status_code != 200: - return _link_html("Could not read user profile from Microsoft Graph.", status=502) - profile = me.json() - oid = profile.get("id") - upn = profile.get("userPrincipalName") or profile.get("displayName") or oid - if not oid: - return _link_html("Profile response missing 'id'.", status=502) - - channel_key = f"{pending.channel}:{pending.channel_id}" - await self._store.link(channel_key, oid) - logger.info("Linked %s → entra:%s (%s)", channel_key, oid, upn) - - if pending.return_to: - # ``return_to`` was already validated at /start time against - # the allowlist (relative path or same-origin only). Re-check - # defensively to harden against any future code path that - # bypasses the /start gate. 
- try: - self._validate_return_to(pending.return_to) - return RedirectResponse(pending.return_to, status_code=302) - except ValueError: - logger.warning( - "EntraIdentityLinkChannel /callback dropping invalid return_to: %s", - pending.return_to, - ) - return _link_html( - f"

Linked

{html.escape(channel_key)} is now bound to " - f"{html.escape(str(upn))}.

" - "

You can close this window and return to your chat.

" - ) diff --git a/python/packages/hosting-entra/pyproject.toml b/python/packages/hosting-entra/pyproject.toml deleted file mode 100644 index 45264741d54..00000000000 --- a/python/packages/hosting-entra/pyproject.toml +++ /dev/null @@ -1,108 +0,0 @@ -[project] -name = "agent-framework-hosting-entra" -description = "Microsoft Entra (Azure AD) OAuth-based identity-linking channel for agent-framework-hosting." -authors = [{ name = "Microsoft", email = "af-support@microsoft.com"}] -readme = "README.md" -requires-python = ">=3.10" -version = "1.0.0a260424" -license-files = ["LICENSE"] -urls.homepage = "https://aka.ms/agent-framework" -urls.source = "https://github.com/microsoft/agent-framework/tree/main/python" -urls.release_notes = "https://github.com/microsoft/agent-framework/releases?q=tag%3Apython-1&expanded=true" -urls.issues = "https://github.com/microsoft/agent-framework/issues" -classifiers = [ - "License :: OSI Approved :: MIT License", - "Development Status :: 3 - Alpha", - "Intended Audience :: Developers", - "Programming Language :: Python :: 3", - "Programming Language :: Python :: 3.10", - "Programming Language :: Python :: 3.11", - "Programming Language :: Python :: 3.12", - "Programming Language :: Python :: 3.13", - "Programming Language :: Python :: 3.14", - "Typing :: Typed", -] -dependencies = [ - "agent-framework-core>=1.2.0,<2", - "agent-framework-hosting==1.0.0a260424", - "httpx>=0.27,<1", - "msal>=1.28,<2", - "cryptography>=42", -] - -[tool.uv] -prerelease = "if-necessary-or-explicit" -environments = [ - "sys_platform == 'darwin'", - "sys_platform == 'linux'", - "sys_platform == 'win32'" -] - -[tool.uv-dynamic-versioning] -fallback-version = "0.0.0" - -[tool.pytest.ini_options] -testpaths = 'tests' -addopts = "-ra -q -r fEX" -asyncio_mode = "auto" -asyncio_default_fixture_loop_scope = "function" -filterwarnings = [] -timeout = 120 -markers = [ - "integration: marks tests as integration tests that require external services", -] - -[tool.ruff] 
-extend = "../../pyproject.toml" - -[tool.coverage.run] -omit = [ - "**/__init__.py" -] - -[tool.pyright] -extends = "../../pyproject.toml" -include = ["agent_framework_hosting_entra"] -exclude = ['tests'] -# Bot Framework activities arrive as loosely-typed JSON-ish maps. Strict -# ``Unknown`` reporting on every ``.get(...)`` adds noise without catching -# real bugs — narrowing happens via runtime isinstance checks instead. -reportUnknownArgumentType = "none" -reportUnknownMemberType = "none" -reportUnknownVariableType = "none" -reportUnknownLambdaType = "none" -reportOptionalMemberAccess = "none" - -[tool.mypy] -plugins = ['pydantic.mypy'] -strict = true -python_version = "3.10" -ignore_missing_imports = true -disallow_untyped_defs = true -no_implicit_optional = true -check_untyped_defs = true -warn_return_any = true -show_error_codes = true -warn_unused_ignores = false -disallow_incomplete_defs = true -disallow_untyped_decorators = true - -[tool.bandit] -targets = ["agent_framework_hosting_entra"] -exclude_dirs = ["tests"] - -[tool.poe] -executor.type = "uv" -include = "../../shared_tasks.toml" - -[tool.poe.tasks.mypy] -help = "Run MyPy for this package." -cmd = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_hosting_entra" - -[tool.poe.tasks.test] -help = "Run the default unit test suite for this package." 
-cmd = 'pytest -m "not integration" --cov=agent_framework_hosting_entra --cov-report=term-missing:skip-covered tests' - -[build-system] -requires = ["flit-core >= 3.11,<4.0"] -build-backend = "flit_core.buildapi" diff --git a/python/packages/hosting-entra/tests/__init__.py b/python/packages/hosting-entra/tests/__init__.py deleted file mode 100644 index e69de29bb2d..00000000000 diff --git a/python/packages/hosting-entra/tests/test_channel.py b/python/packages/hosting-entra/tests/test_channel.py deleted file mode 100644 index 19aa43d9b0d..00000000000 --- a/python/packages/hosting-entra/tests/test_channel.py +++ /dev/null @@ -1,464 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. - -"""Unit tests for :mod:`agent_framework_hosting_entra`. - -The MSAL ``ConfidentialClientApplication`` and Microsoft Graph calls are -mocked out so no network access is required. Live OAuth, certificate auth, -and full webhook flow are out of scope here. -""" - -from __future__ import annotations - -import json -from pathlib import Path -from typing import Any -from unittest.mock import AsyncMock, MagicMock, patch - -import pytest -from starlette.applications import Starlette -from starlette.testclient import TestClient - -from agent_framework_hosting_entra import ( - EntraIdentityLinkChannel, - EntraIdentityStore, - entra_isolation_key, -) - - -def test_entra_isolation_key_format() -> None: - assert entra_isolation_key("abc123") == "entra:abc123" - - -class TestEntraIdentityStore: - async def test_link_writes_entra_namespaced_value(self, tmp_path: Path) -> None: - store = EntraIdentityStore(tmp_path / "links.json") - await store.link("telegram:42", "oid-xyz") - assert store.lookup("telegram:42") == "entra:oid-xyz" - # Persisted to disk. 
- saved = json.loads((tmp_path / "links.json").read_text()) - assert saved == {"telegram:42": "entra:oid-xyz"} - - async def test_unlink_removes_entry(self, tmp_path: Path) -> None: - store = EntraIdentityStore(tmp_path / "links.json") - await store.link("telegram:42", "oid") - await store.unlink("telegram:42") - assert store.lookup("telegram:42") is None - assert json.loads((tmp_path / "links.json").read_text()) == {} - - async def test_unlink_unknown_is_noop(self, tmp_path: Path) -> None: - store = EntraIdentityStore(tmp_path / "links.json") - await store.unlink("telegram:never") # must not raise - assert not (tmp_path / "links.json").exists() - - def test_loads_existing_file(self, tmp_path: Path) -> None: - path = tmp_path / "links.json" - path.write_text(json.dumps({"telegram:1": "entra:abc"})) - store = EntraIdentityStore(path) - assert store.lookup("telegram:1") == "entra:abc" - - def test_corrupt_file_starts_empty(self, tmp_path: Path) -> None: - path = tmp_path / "links.json" - path.write_text("not-json") - store = EntraIdentityStore(path) - assert store.lookup("anything") is None - - -class TestEntraIdentityLinkChannelConfig: - def test_rejects_neither_credential(self, tmp_path: Path) -> None: - with pytest.raises(ValueError, match="exactly one"): - EntraIdentityLinkChannel( - store=EntraIdentityStore(tmp_path / "x.json"), - tenant_id="t", - client_id="c", - public_base_url="https://example.com", - ) - - def test_rejects_both_credentials(self, tmp_path: Path) -> None: - with pytest.raises(ValueError, match="exactly one"): - EntraIdentityLinkChannel( - store=EntraIdentityStore(tmp_path / "x.json"), - tenant_id="t", - client_id="c", - public_base_url="https://example.com", - client_secret="s", - certificate_path="/tmp/does-not-exist.pem", - ) - - def test_redirect_uri_strips_trailing_slash(self, tmp_path: Path) -> None: - with patch( - "agent_framework_hosting_entra._channel.msal.ConfidentialClientApplication", - MagicMock(), - ): - ch = 
EntraIdentityLinkChannel( - store=EntraIdentityStore(tmp_path / "x.json"), - tenant_id="t", - client_id="c", - public_base_url="https://example.com/", - client_secret="s", - ) - assert ch.redirect_uri == "https://example.com/auth/callback" - - -class TestEntraIdentityLinkChannelRoutes: - def _make_channel(self, tmp_path: Path, msal_app: MagicMock) -> tuple[EntraIdentityLinkChannel, EntraIdentityStore]: - store = EntraIdentityStore(tmp_path / "links.json") - with patch( - "agent_framework_hosting_entra._channel.msal.ConfidentialClientApplication", - return_value=msal_app, - ): - ch = EntraIdentityLinkChannel( - store=store, - tenant_id="tenant-1", - client_id="client-1", - public_base_url="https://example.com", - client_secret="s", - ) - return ch, store - - def _mount_app(self, ch: EntraIdentityLinkChannel) -> Starlette: - # We don't depend on AgentFrameworkHost here — wire the routes - # directly so we can exercise the channel in isolation. - from starlette.routing import Mount - - contribution = ch.contribute(MagicMock()) - return Starlette(routes=[Mount(ch.path, routes=contribution.routes)]) - - def test_start_missing_params_returns_400(self, tmp_path: Path) -> None: - msal_app = MagicMock() - ch, _ = self._make_channel(tmp_path, msal_app) - with TestClient(self._mount_app(ch)) as client: - r = client.get("/auth/start", follow_redirects=False) - assert r.status_code == 400 - - def test_start_redirects_to_authorize_url(self, tmp_path: Path) -> None: - msal_app = MagicMock() - msal_app.get_authorization_request_url.return_value = ( - "https://login.microsoftonline.com/tenant-1/oauth2/v2.0/authorize?state=X" - ) - ch, _ = self._make_channel(tmp_path, msal_app) - with TestClient(self._mount_app(ch)) as client: - r = client.get( - "/auth/start", - params={"channel": "telegram", "id": "42"}, - follow_redirects=False, - ) - assert r.status_code == 302 - assert "login.microsoftonline.com" in r.headers["location"] - - def test_callback_invalid_state_returns_400(self, 
tmp_path: Path) -> None: - msal_app = MagicMock() - ch, _ = self._make_channel(tmp_path, msal_app) - ch._http = MagicMock(aclose=AsyncMock()) - with TestClient(self._mount_app(ch)) as client: - r = client.get("/auth/callback", params={"code": "c", "state": "unknown"}) - assert r.status_code == 400 - - def test_callback_links_oid_on_success(self, tmp_path: Path) -> None: - msal_app = MagicMock() - msal_app.get_authorization_request_url.return_value = ( - "https://login.microsoftonline.com/tenant-1/authorize?state=X" - ) - msal_app.acquire_token_by_authorization_code.return_value = {"access_token": "t"} - ch, store = self._make_channel(tmp_path, msal_app) - - # Fake the Graph /me call. - graph_response = MagicMock() - graph_response.status_code = 200 - graph_response.json = MagicMock(return_value={"id": "oid-xyz", "userPrincipalName": "user@x"}) - ch._http = MagicMock() - ch._http.get = AsyncMock(return_value=graph_response) - ch._http.aclose = AsyncMock() - - # Mint a real state via the public API so the pending dict is populated. 
- ch.authorize_url_for("telegram", "42") - state = next(iter(ch._pending.keys())) - - with TestClient(self._mount_app(ch)) as client: - r = client.get("/auth/callback", params={"code": "abc", "state": state}) - assert r.status_code == 200 - assert store.lookup("telegram:42") == "entra:oid-xyz" - - def test_callback_token_failure_returns_502(self, tmp_path: Path) -> None: - msal_app = MagicMock() - msal_app.get_authorization_request_url.return_value = "https://x" - msal_app.acquire_token_by_authorization_code.return_value = { - "error": "invalid_grant", - "error_description": "expired", - } - ch, store = self._make_channel(tmp_path, msal_app) - ch._http = MagicMock(aclose=AsyncMock()) - ch.authorize_url_for("telegram", "42") - state = next(iter(ch._pending.keys())) - with TestClient(self._mount_app(ch)) as client: - r = client.get("/auth/callback", params={"code": "c", "state": state}) - assert r.status_code == 502 - assert store.lookup("telegram:42") is None - - -# --------------------------------------------------------------------------- # -# Round-2 security hardening # -# --------------------------------------------------------------------------- # - - -class TestSignedLinkToken: - """`/auth/start` must reject unsigned/forged requests when secret is set.""" - - def _make_signed_channel( - self, tmp_path: Path, msal_app: MagicMock, *, secret: str = "test-secret" - ) -> EntraIdentityLinkChannel: - store = EntraIdentityStore(tmp_path / "links.json") - with patch( - "agent_framework_hosting_entra._channel.msal.ConfidentialClientApplication", - return_value=msal_app, - ): - return EntraIdentityLinkChannel( - store=store, - tenant_id="tenant-1", - client_id="client-1", - public_base_url="https://example.com", - client_secret="s", - link_token_secret=secret, - ) - - def _mount(self, ch: EntraIdentityLinkChannel) -> Starlette: - from starlette.routing import Mount - - contribution = ch.contribute(MagicMock()) - return Starlette(routes=[Mount(ch.path, 
routes=contribution.routes)]) - - def test_start_rejects_unsigned_request_when_secret_set(self, tmp_path: Path) -> None: - msal_app = MagicMock() - ch = self._make_signed_channel(tmp_path, msal_app) - with TestClient(self._mount(ch)) as client: - r = client.get( - "/auth/start", - params={"channel": "telegram", "id": "42"}, - follow_redirects=False, - ) - assert r.status_code == 403 - - def test_start_rejects_forged_signature(self, tmp_path: Path) -> None: - msal_app = MagicMock() - ch = self._make_signed_channel(tmp_path, msal_app) - with TestClient(self._mount(ch)) as client: - r = client.get( - "/auth/start", - params={ - "channel": "telegram", - "id": "42", - "exp": "9999999999", - "sig": "deadbeef", - }, - follow_redirects=False, - ) - assert r.status_code == 403 - - def test_start_accepts_valid_signed_url(self, tmp_path: Path) -> None: - msal_app = MagicMock() - msal_app.get_authorization_request_url.return_value = ( - "https://login.microsoftonline.com/tenant-1/authorize?state=X" - ) - ch = self._make_signed_channel(tmp_path, msal_app) - url = ch.mint_start_url("telegram", "42") - # Strip the host prefix to call via the in-process client. - path_and_query = url.split("https://example.com", 1)[1] - with TestClient(self._mount(ch)) as client: - r = client.get(path_and_query, follow_redirects=False) - assert r.status_code == 302 - - def test_start_rejects_expired_signed_url(self, tmp_path: Path) -> None: - import time as time_module - from urllib.parse import urlencode - - msal_app = MagicMock() - ch = self._make_signed_channel(tmp_path, msal_app) - # Hand-craft an expired-but-otherwise-valid token. 
- expired = int(time_module.time()) - 60 - sig = ch._sign_link_token("telegram", "42", expired) # type: ignore[attr-defined] # pyright: ignore[reportPrivateUsage] - params = {"channel": "telegram", "id": "42", "exp": str(expired), "sig": sig} - with TestClient(self._mount(ch)) as client: - r = client.get(f"/auth/start?{urlencode(params)}", follow_redirects=False) - assert r.status_code == 403 - - def test_mint_start_url_requires_secret(self, tmp_path: Path) -> None: - import pytest - - msal_app = MagicMock() - store = EntraIdentityStore(tmp_path / "links.json") - with patch( - "agent_framework_hosting_entra._channel.msal.ConfidentialClientApplication", - return_value=msal_app, - ): - ch = EntraIdentityLinkChannel( - store=store, - tenant_id="tenant-1", - client_id="client-1", - public_base_url="https://example.com", - client_secret="s", - ) - with pytest.raises(RuntimeError, match="link_token_secret"): - ch.mint_start_url("telegram", "42") - - def test_unsigned_mode_logs_warning_at_startup(self, tmp_path: Path, caplog: Any) -> None: - import asyncio as asyncio_mod - import logging - - msal_app = MagicMock() - store = EntraIdentityStore(tmp_path / "links.json") - with patch( - "agent_framework_hosting_entra._channel.msal.ConfidentialClientApplication", - return_value=msal_app, - ): - ch = EntraIdentityLinkChannel( - store=store, - tenant_id="tenant-1", - client_id="client-1", - public_base_url="https://example.com", - client_secret="s", - ) - with caplog.at_level(logging.WARNING, logger="agent_framework.hosting"): - asyncio_mod.run(ch._on_startup()) # pyright: ignore[reportPrivateUsage] - asyncio_mod.run(ch._on_shutdown()) # pyright: ignore[reportPrivateUsage] - assert any("WITHOUT link_token_secret" in r.message for r in caplog.records) - - -class TestXssEscaping: - """All inbound query/profile values must be HTML-escaped before output.""" - - def _setup(self, tmp_path: Path) -> tuple[EntraIdentityLinkChannel, EntraIdentityStore, MagicMock]: - store = 
EntraIdentityStore(tmp_path / "links.json") - msal_app = MagicMock() - msal_app.get_authorization_request_url.return_value = "https://x" - with patch( - "agent_framework_hosting_entra._channel.msal.ConfidentialClientApplication", - return_value=msal_app, - ): - ch = EntraIdentityLinkChannel( - store=store, - tenant_id="tenant-1", - client_id="client-1", - public_base_url="https://example.com", - client_secret="s", - ) - return ch, store, msal_app - - def _mount(self, ch: EntraIdentityLinkChannel) -> Starlette: - from starlette.routing import Mount - - contribution = ch.contribute(MagicMock()) - return Starlette(routes=[Mount(ch.path, routes=contribution.routes)]) - - def test_callback_error_param_is_escaped(self, tmp_path: Path) -> None: - ch, _, _ = self._setup(tmp_path) - ch._http = MagicMock(aclose=AsyncMock()) - with TestClient(self._mount(ch)) as client: - r = client.get( - "/auth/callback", - params={ - "error": "", - "error_description": "", - }, - ) - assert r.status_code == 400 - assert "@x"} - ) - ch._http = MagicMock(aclose=AsyncMock()) - ch._http.get = AsyncMock(return_value=graph_response) - # Mint a binding via authorize_url_for (channel-side trusted call). - ch.authorize_url_for("", "42") - state = next(iter(ch._pending.keys())) - with TestClient(self._mount(ch)) as client: - r = client.get("/auth/callback", params={"code": "abc", "state": state}) - assert r.status_code == 200 - assert "
messaging app] - - subgraph Host[AgentFrameworkHost] - direction TB - ASGI[Starlette app] - Router[Channel router] - Parse{parse → command or message?} - Auth[host.authorize] - Resolver[IdentityResolver] - Delivery[_deliver_response] - Push[_handle_push_task] - end - - Channels[Channels: Responses · Invocations · Telegram · Activity · IdentityLinker] - CmdHandler[CommandHandler
via ChannelCommandContext] - Target[(Agent or Workflow)] - Runner[DurableTaskRunner] - StateStore[(HostStateStore)] - - Caller --> ASGI - ASGI --> Router - Router --> Parse - Parse -- /command --> CmdHandler - Parse -- message --> Auth - CmdHandler -- ctx.run --> Auth - CmdHandler -- local reply --> Channels - Auth --> Resolver - Resolver --> StateStore - Auth --> Target - Target --> Delivery - Delivery -- originating sync --> Channels - Delivery -- non-originating --> Runner - Runner --> Push - Push --> Channels - Channels --> ASGI -``` - -For a richer set of flow diagrams — identity linking, multi-channel -fan-out, server-side relays, background runs, durable-runner codec -envelopes, echo idempotency, workflow targets — see the -[Python hosting spec](https://github.com/microsoft/agent-framework/blob/main/docs/specs/002-python-hosting-channels.md). +| `agent-framework-hosting-activity-protocol` | Bot Framework Activity Protocol | +| `agent-framework-hosting-discord` | Discord HTTP Interactions | ## Install ```bash pip install agent-framework-hosting agent-framework-hosting-responses -# or with uvicorn pre-installed for the demo `host.serve(...)` helper +# or with Hypercorn pre-installed for the demo `host.serve(...)` helper pip install "agent-framework-hosting[serve]" agent-framework-hosting-responses -# add the [disk] extra to opt in to on-disk persistence (see below) +# add the [disk] extra to persist reset-session aliases pip install "agent-framework-hosting[disk]" ``` @@ -109,84 +58,46 @@ host = AgentFrameworkHost(target=agent, channels=channels) host.serve(port=8000) ``` -See the [hosting samples](https://github.com/microsoft/agent-framework/tree/main/python/samples/04-hosting/af-hosting) -for richer multi-channel apps (Telegram + Teams + Responses fan-out, -identity linking, `ResponseTarget` routing, etc.). 
- -## Optional disk persistence (`state_dir`) - -By default the host keeps everything in memory: the durable-task runner's -pending push queue, the per-isolation-key session aliases, the active-channel -map, and the per-channel `ChannelIdentity` map. That is the right shape for -**ephemeral** runtimes (Foundry Hosted Agents et al.) where the host is -restarted per request and persistence lives behind a service like the Foundry -response store, and for short-lived local dev. - -For **long-running** deployments (an always-on container, a local dev server -you restart often, a single-VM bot) opt in to disk persistence by passing -`state_dir` to `AgentFrameworkHost`. The runner queue and the session -bookkeeping use [`diskcache`](https://grantjenks.com/docs/diskcache/) -(installed via the `[disk]` extra) protected by an OS-level advisory file -lock so two hosts pointed at the same directory can't double-execute -scheduled pushes. Workflow checkpoints (when the target is a `Workflow`) -use the framework's `FileCheckpointStorage` — no extra dependency. The -identity-link store path is offered to linkers that implement -`SupportsLinkStorePath`; linkers that manage persistence themselves should -be configured directly. +## Session state and workflow checkpoints -```python -from agent_framework_hosting import AgentFrameworkHost +By default the host keeps live `AgentSession` objects and reset-session aliases +in memory. Channels opt into continuity by setting +`ChannelRequest.session = ChannelSession(isolation_key=...)`; requests with the +same isolation key reuse the same host-created session. -# Single path → host auto-derives `runner/`, `sessions/`, `links/`, and -# (for workflow targets) `checkpoints/` subpaths. 
+For long-running deployments that need `reset_session(...)` aliases to survive +restart, pass `state_dir`: + +```python host = AgentFrameworkHost( target=agent, channels=channels, state_dir="./.host-state", ) +``` + +This creates `./.host-state/sessions/` and stores only lightweight alias +bookkeeping. Live `AgentSession` objects are still rehydrated lazily by the +configured history provider on the next turn. -# Or route components to different roots — use the HostStatePaths TypedDict -# (or a plain dict with the same keys) for editor autocomplete on the keys. -# Omit a key to opt that component out of persistence. +For workflow targets, `checkpoint_location=...` is the clearest way to enable +checkpoint persistence. As a convenience, `state_dir="./.host-state"` also +derives `./.host-state/checkpoints/` for workflow targets. Use the mapping form +when you want only one component: + +```python from agent_framework_hosting import HostStatePaths host = AgentFrameworkHost( target=workflow, channels=channels, state_dir=HostStatePaths( - runner="/var/lib/myapp/tasks", - sessions="/var/lib/myapp/state", + sessions="/var/lib/myapp/sessions", checkpoints="/var/lib/myapp/checkpoints", - links="/var/lib/myapp/links", ), ) ``` -What survives a restart: - -- **Pending durable-task records** — scheduled but not-yet-completed push - deliveries replay on the next host startup via `runner.resume()`. Records - that crashed mid-attempt resume with their already-consumed retry budget. -- **`_session_aliases`** — per-isolation-key session-id rewrites (via the - reset-session command). -- **`_active`** — the most recently active channel for each isolation key - (consumed by `ResponseTarget.active`). -- **`_identities`** — channel-native `ChannelIdentity` rows used by - `ResponseTarget.channels([...])` / `.all_linked` fan-out. 
-- **Workflow checkpoints** — when the target is a `Workflow`, the host wraps - the `checkpoints` path in a per-isolation-key `FileCheckpointStorage` - (equivalent to passing `checkpoint_location=...` directly; the explicit - parameter takes precedence and emits a warning when both are set). -- **Identity-link store** — when the configured linker implements - `SupportsLinkStorePath`, the host passes the `links` path to it so pending - challenges, linked identities, and verified claims can survive restarts. - -What doesn't: - -- Live `AgentSession` objects (rehydrated lazily by the history provider on the - next turn). -- The `ContinuationToken` store (separate concern, plug in your own). - -Unpicklable push payloads raise `PushPayloadNotPicklable` *eagerly* from -`schedule()` so issues surface at the call site, not on the next restart. - +Cross-channel identity linking, multicast delivery, background runs, +continuation tokens, and durable delivery runners are follow-up enhancements, +not part of this v1 host contract. 
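The session-continuity rule described in the README hunk above (requests with the same isolation key reuse the same session, and `reset_session(...)` only records a lightweight alias) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the host's actual implementation: `SketchSession`, `SketchSessionCache`, and the alias format are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class SketchSession:
    """Hypothetical stand-in for a live session object (not AgentSession)."""

    session_id: str
    turns: list[str] = field(default_factory=list)


class SketchSessionCache:
    """Illustrative per-isolation-key session cache with reset aliases."""

    def __init__(self) -> None:
        # session_id -> live session (kept in memory only)
        self._live: dict[str, SketchSession] = {}
        # isolation_key -> current session_id (the only part worth persisting)
        self._aliases: dict[str, str] = {}

    def get(self, isolation_key: str) -> SketchSession:
        # Until a reset happens, the isolation key doubles as the session id.
        session_id = self._aliases.get(isolation_key, isolation_key)
        if session_id not in self._live:
            self._live[session_id] = SketchSession(session_id)
        return self._live[session_id]

    def reset(self, isolation_key: str) -> str:
        # Point the key at a fresh session id; the old session is left untouched.
        new_id = f"{isolation_key}:{uuid4().hex}"
        self._aliases[isolation_key] = new_id
        return new_id


cache = SketchSessionCache()
first = cache.get("telegram:42")
first.turns.append("hello")
same = cache.get("telegram:42")      # same key, same session object
new_id = cache.reset("telegram:42")  # later turns start from a clean session
fresh = cache.get("telegram:42")
```

In this sketch only the `_aliases` mapping would need to survive a restart, which mirrors the README's point that `state_dir` stores alias bookkeeping while live sessions are rebuilt lazily on the next turn.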
diff --git a/python/packages/hosting/agent_framework_hosting/__init__.py b/python/packages/hosting/agent_framework_hosting/__init__.py index aa91f654b04..ff03530dd45 100644 --- a/python/packages/hosting/agent_framework_hosting/__init__.py +++ b/python/packages/hosting/agent_framework_hosting/__init__.py @@ -13,30 +13,7 @@ import importlib.metadata -from ._authorization import ( - AllOfAllowlists, - AllowAll, - Allowed, - AllowlistDecision, - AnyOfAllowlists, - AuthorizationContext, - AuthorizationOutcome, - AuthPolicy, - CallableAllowlist, - ChannelConfigurationError, - ClaimValue, - Denied, - IdentityAllowlist, - IdentityLinker, - LinkChallenge, - LinkedClaimAllowlist, - LinkedIdentity, - LinkRequired, - LinkResolution, - NativeIdAllowlist, - SupportsLinkStorePath, -) -from ._host import AgentFrameworkHost, ChannelContext, RuntimeMode, logger +from ._host import AgentFrameworkHost, ChannelContext, logger from ._isolation import ( ISOLATION_HEADER_CHAT, ISOLATION_HEADER_USER, @@ -45,35 +22,19 @@ reset_current_isolation_keys, set_current_isolation_keys, ) -from ._runner import InProcessTaskRunner from ._types import ( Channel, ChannelCommand, ChannelCommandContext, ChannelContribution, ChannelIdentity, - ChannelPush, - ChannelPushCodec, ChannelRequest, - ChannelResponseContext, ChannelResponseHook, ChannelRunHook, ChannelSession, - ChannelStreamTransformHook, - DurableTaskPayloadMode, - DurableTaskRunner, + ChannelStreamUpdateHook, HostedRunResult, HostStatePaths, - PushPayloadNotPicklable, - PushPayloadNotSerializable, - ResponseTarget, - ResponseTargetKind, - RetryPolicy, - TaskHandle, - TaskStatus, - apply_channel_response_hook, - apply_response_hook, - apply_run_hook, ) try: @@ -85,59 +46,21 @@ "ISOLATION_HEADER_CHAT", "ISOLATION_HEADER_USER", "AgentFrameworkHost", - "AllOfAllowlists", - "AllowAll", - "Allowed", - "AllowlistDecision", - "AnyOfAllowlists", - "AuthPolicy", - "AuthorizationContext", - "AuthorizationOutcome", - "CallableAllowlist", "Channel", 
"ChannelCommand", "ChannelCommandContext", - "ChannelConfigurationError", "ChannelContext", "ChannelContribution", "ChannelIdentity", - "ChannelPush", - "ChannelPushCodec", "ChannelRequest", - "ChannelResponseContext", "ChannelResponseHook", "ChannelRunHook", "ChannelSession", - "ChannelStreamTransformHook", - "ClaimValue", - "Denied", - "DurableTaskPayloadMode", - "DurableTaskRunner", + "ChannelStreamUpdateHook", "HostStatePaths", "HostedRunResult", - "IdentityAllowlist", - "IdentityLinker", - "InProcessTaskRunner", "IsolationKeys", - "LinkChallenge", - "LinkRequired", - "LinkResolution", - "LinkedClaimAllowlist", - "LinkedIdentity", - "NativeIdAllowlist", - "PushPayloadNotPicklable", - "PushPayloadNotSerializable", - "ResponseTarget", - "ResponseTargetKind", - "RetryPolicy", - "RuntimeMode", - "SupportsLinkStorePath", - "TaskHandle", - "TaskStatus", "__version__", - "apply_channel_response_hook", - "apply_response_hook", - "apply_run_hook", "get_current_isolation_keys", "logger", "reset_current_isolation_keys", diff --git a/python/packages/hosting/agent_framework_hosting/_authorization.py b/python/packages/hosting/agent_framework_hosting/_authorization.py deleted file mode 100644 index 882dad18cc0..00000000000 --- a/python/packages/hosting/agent_framework_hosting/_authorization.py +++ /dev/null @@ -1,485 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. - -"""Authorization seam — :class:`IdentityAllowlist`, :class:`IdentityLinker`, and outcomes. - -Channels that emit a :class:`ChannelIdentity` compose authorization from -two **orthogonal** parameters set per channel: - -- ``require_link: bool`` — "identity must be linked to an IdP claim". The - host delegates this to the configured :class:`IdentityLinker`; pairing - ``require_link=True`` with no linker is rejected at construction - (silent-deny-everyone is the worst possible default). -- ``allowlist: IdentityAllowlist | Literal["inherit"] | None`` — "identity - is on the accept list". 
The host evaluates the allowlist on every - inbound message via :func:`AgentFrameworkHost.authorize`. - -The two axes compose into the three named profiles **open** (no gate), -**forced-link** (any authenticated identity), and **allowlist** (only -listed identities, keyed either on the channel-native id pre-link or on -a verified IdP claim post-link). See -``docs/specs/002-python-hosting-channels.md`` § -"Authorization profiles and the IdentityAllowlist seam". - -This module ships the channel-neutral core pieces. Provider-specific -linking channels (for example Entra OAuth helpers) can implement -:class:`IdentityLinker` without the core package taking a dependency on -their transport or identity-provider SDKs. -""" - -from __future__ import annotations - -import os -from collections.abc import Awaitable, Callable, Collection, Mapping, Sequence -from dataclasses import dataclass, field -from datetime import datetime -from enum import Enum -from typing import Any, Literal, Protocol, TypeAlias, runtime_checkable - -from ._types import ChannelIdentity - - -class AllowlistDecision(str, Enum): - """Tri-state allowlist evaluation outcome. - - ``ABSTAIN`` is **not** a denial — it means "this allowlist has no - information yet" (typically a claim-based allowlist evaluated at - ``pre_link``). The host's :meth:`AgentFrameworkHost.authorize` - pipeline is what turns an all-``ABSTAIN`` outcome into the next - step (allow when open, escalate to a link ceremony when the config - calls for one). Boolean composition cannot distinguish "claim - allowlist denies you" from "claim allowlist hasn't seen any claims - yet" — a critical distinction for the **Mixed** profile. 
- """ - - ALLOW = "allow" - DENY = "deny" - ABSTAIN = "abstain" - - -ClaimValue: TypeAlias = str | Sequence[str] -"""Verified claim value shape understood by :class:`LinkedClaimAllowlist`.""" - - -def _empty_claim_mapping() -> Mapping[str, ClaimValue]: - return {} - - -def _empty_any_mapping() -> Mapping[str, Any]: - return {} - - -@dataclass(frozen=True) -class AuthorizationContext: - """Inputs to a single :meth:`IdentityAllowlist.evaluate` call.""" - - identity: ChannelIdentity - phase: Literal["pre_link", "post_link"] - isolation_key: str | None = None - verified_claims: Mapping[str, ClaimValue] = field(default_factory=_empty_claim_mapping) - claim_source: Literal["linker", "channel", "none"] = "none" - - -@runtime_checkable -class IdentityAllowlist(Protocol): - """Per-channel accept/deny gate evaluated by the host. - - ``requires_linked_claims`` declares that this allowlist's - :meth:`evaluate` cannot ``ALLOW`` until verified claims are - available — the host's construction-time validator rejects - configurations that would silently deny everyone (e.g. a - :class:`LinkedClaimAllowlist` on a channel that neither has - ``require_link=True`` nor natively emits verified claims). - """ - - requires_linked_claims: bool - - async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: ... - - -class AllowAll: - """Explicit "open" sentinel. - - Useful for tests, sample code, and for **overriding** a host-level - ``default_allowlist`` on a specific channel that should be public - inside an otherwise locked-down host. - """ - - requires_linked_claims: bool = False - - async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: - return AllowlistDecision.ALLOW - - -class NativeIdAllowlist: - """Accept only listed channel-native ids. - - Telegram ``chat_id``, WhatsApp number, Slack user id, etc. The - list can be a plain collection or an async loader so allowlist - sources can be config files, secret stores, or feature flags. 
- Pre-link and post-link behaviour is identical — native-id - allowlists do not depend on link state. - - When ``channel`` is set, the allowlist participates in - :class:`AnyOfAllowlists` composition by returning ``ABSTAIN`` for - requests from other channels — this lets per-channel native lists - coexist under a single combinator without one channel's ``DENY`` - masking another channel's ``ALLOW``. - - Keyword Args: - native_ids: A static collection of ids, or an async loader. - channel: When set, only requests whose - ``ChannelIdentity.channel`` matches participate; others - ``ABSTAIN``. - """ - - requires_linked_claims: bool = False - - def __init__( - self, - native_ids: Collection[str] | Callable[[], Awaitable[Collection[str]]], - *, - channel: str | None = None, - ) -> None: - self._native_ids: Collection[str] | None - self._loader: Callable[[], Awaitable[Collection[str]]] | None - if callable(native_ids): - self._native_ids = None - self._loader = native_ids - else: - self._native_ids = frozenset(native_ids) - self._loader = None - self.channel = channel - - async def _resolve(self) -> Collection[str]: - if self._native_ids is not None: - return self._native_ids - loader = self._loader - if loader is None: # pragma: no cover - defensive - raise RuntimeError("NativeIdAllowlist: loader missing after cache miss") - loaded = await loader() - # Cache the resolved set so subsequent calls avoid re-loading. - self._native_ids = frozenset(loaded) - self._loader = None - return self._native_ids - - async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: - if self.channel is not None and context.identity.channel != self.channel: - return AllowlistDecision.ABSTAIN - ids = await self._resolve() - if context.identity.native_id in ids: - return AllowlistDecision.ALLOW - return AllowlistDecision.DENY - - -class LinkedClaimAllowlist: - """Accept only identities whose verified IdP claim is on the list. 
- - ``evaluate`` returns ``ABSTAIN`` at ``pre_link`` (no claims yet) - and ``ALLOW``/``DENY`` at ``post_link``. Claim values may be plain - strings or a sequence of strings (for multi-valued claims such as - group ids); any intersection with ``values`` allows the identity. - - Keyword Args: - claim: The verified-claim key to inspect (e.g. ``"oid"``, - ``"tid"``, ``"groups"``). - values: Accepted values. - """ - - requires_linked_claims: bool = True - - def __init__(self, claim: str, values: Collection[str]) -> None: - self.claim = claim - self.values = frozenset(values) - - async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: - if context.phase == "pre_link": - return AllowlistDecision.ABSTAIN - value = context.verified_claims.get(self.claim) - if value is None: - return AllowlistDecision.DENY - if isinstance(value, str): - return AllowlistDecision.ALLOW if value in self.values else AllowlistDecision.DENY - return AllowlistDecision.ALLOW if any(item in self.values for item in value) else AllowlistDecision.DENY - - -class AnyOfAllowlists: - """Combinator: any child ``ALLOW`` wins; ``DENY`` only if all children ``DENY``. - - Use this for the **Mixed** profile (native id OR linked claim). - Returns ``ABSTAIN`` when no child decides. - """ - - def __init__(self, *allowlists: IdentityAllowlist) -> None: - self._children = allowlists - self.requires_linked_claims = any(getattr(a, "requires_linked_claims", False) for a in allowlists) - - async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: - any_abstain = False - all_deny = True - for child in self._children: - decision = await child.evaluate(context) - if decision is AllowlistDecision.ALLOW: - return AllowlistDecision.ALLOW - if decision is AllowlistDecision.ABSTAIN: - any_abstain = True - all_deny = False - # DENY contributes to all_deny without short-circuit. 
- if all_deny and self._children: - return AllowlistDecision.DENY - if any_abstain: - return AllowlistDecision.ABSTAIN - # No children — treat as ABSTAIN to avoid surprise DENY. - return AllowlistDecision.ABSTAIN - - -class AllOfAllowlists: - """Combinator: any child ``DENY`` wins; ``ALLOW`` only if all children ``ALLOW``. - - Use this to require multiple conditions (e.g. tenancy - **and** group membership). Returns ``ABSTAIN`` when no child - denies but at least one ``ABSTAIN``s. - """ - - def __init__(self, *allowlists: IdentityAllowlist) -> None: - self._children = allowlists - self.requires_linked_claims = any(getattr(a, "requires_linked_claims", False) for a in allowlists) - - async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: - any_abstain = False - for child in self._children: - decision = await child.evaluate(context) - if decision is AllowlistDecision.DENY: - return AllowlistDecision.DENY - if decision is AllowlistDecision.ABSTAIN: - any_abstain = True - if not self._children: - return AllowlistDecision.ABSTAIN - if any_abstain: - return AllowlistDecision.ABSTAIN - return AllowlistDecision.ALLOW - - -class CallableAllowlist: - """Escape hatch: wrap an arbitrary async function as an allowlist. - - Recommended only after exhausting the structured variants — - composition is harder to reason about with opaque callables. 
- """ - - def __init__( - self, - fn: Callable[[AuthorizationContext], Awaitable[AllowlistDecision]], - *, - requires_linked_claims: bool = False, - ) -> None: - self._fn = fn - self.requires_linked_claims = requires_linked_claims - - async def evaluate(self, context: AuthorizationContext) -> AllowlistDecision: - return await self._fn(context) - - -# --------------------------------------------------------------------------- # -# Outcome types # -# --------------------------------------------------------------------------- # - - -@dataclass(frozen=True) -class LinkChallenge: - """Challenge a channel can render to complete an identity link. - - Attributes: - challenge_id: Opaque linker-owned id for correlating the challenge - with the later completion callback. - url: Optional URL (OAuth authorization URL, device-flow URL, etc.) - the user should open. - expires_at: Optional challenge expiry time. - message: Optional safe text a channel may render with the challenge. - attributes: Linker-specific structured metadata. Channels should - only use keys documented by the concrete linker they integrate. - """ - - challenge_id: str - url: str | None = None - expires_at: datetime | None = None - message: str | None = None - attributes: Mapping[str, Any] = field(default_factory=_empty_any_mapping) - - -@dataclass(frozen=True) -class LinkedIdentity: - """Resolved IdP-backed identity returned by :class:`IdentityLinker`. - - Attributes: - isolation_key: Stable key the host should use for the linked user. - verified_claims: Claims verified by the linker or by a channel that - natively authenticates the user. - claim_source: Where the claims came from. 
- """ - - isolation_key: str - verified_claims: Mapping[str, ClaimValue] = field(default_factory=_empty_claim_mapping) - claim_source: Literal["linker", "channel"] = "linker" - - -LinkResolution: TypeAlias = LinkedIdentity | LinkChallenge -"""Result returned by :meth:`IdentityLinker.resolve`.""" - - -class IdentityLinker(Protocol): - """Resolve a channel-native identity or return a challenge to link it. - - Concrete linker packages own the storage, OAuth/device-code routes, and - provider-specific claim mapping. The core host only consumes the single - resolution call so authorization can be a one-round-trip decision. - """ - - async def resolve(self, identity: ChannelIdentity) -> LinkResolution: - """Return a linked identity or the challenge needed to create one.""" - ... - - -@runtime_checkable -class SupportsLinkStorePath(Protocol): - """Optional protocol for linkers that accept host-provided persistence. - - When ``AgentFrameworkHost(state_dir=...)`` derives a ``links`` path, the - host calls this hook on identity linkers that implement it. Linkers that - manage their own persistence can ignore this protocol and should be - configured directly by the application. - """ - - def configure_link_store_path(self, path: str | os.PathLike[str]) -> None: - """Configure where the linker should persist its link store.""" - ... - - -@dataclass(frozen=True) -class Allowed: - """The identity is authorized; ``isolation_key`` is its stable key.""" - - isolation_key: str - verified_claims: Mapping[str, ClaimValue] = field(default_factory=_empty_claim_mapping) - claim_source: Literal["linker", "channel", "none"] = "none" - - -@dataclass(frozen=True) -class LinkRequired: - """The identity must complete the link ceremony before proceeding. - - Channels render ``challenge`` through their native UX (the same - path the ``link`` command uses). - """ - - challenge: LinkChallenge - - -@dataclass(frozen=True) -class Denied: - """The identity is rejected. 
- - Attributes: - reason_code: Stable, machine-readable token (e.g. - ``"allowlist_denied_pre_link"``). Never echoed to end - users. - user_message: Safe to render publicly (group-chat-safe); - ``None`` falls back to a bland default ("You don't have - access to this bot."). - log_details: Structured payload for audit/observability; - **never** shown to users. - """ - - reason_code: str - user_message: str | None = None - log_details: Mapping[str, Any] = field(default_factory=_empty_any_mapping) - - -AuthorizationOutcome = Allowed | LinkRequired | Denied -"""Result of :func:`AgentFrameworkHost.authorize`. Channels render -each variant through their native UX.""" - - -class AuthPolicy: - """Factory helpers for common authorization policies. - - These helpers are thin wrappers over the concrete allowlist types; they - exist so application code can describe authorization intent without - importing each building block separately. - """ - - @staticmethod - def open() -> AllowAll: - """Allow every identity.""" - return AllowAll() - - @staticmethod - def native_ids( - native_ids: Collection[str] | Callable[[], Awaitable[Collection[str]]], - *, - channel: str | None = None, - ) -> NativeIdAllowlist: - """Allow listed channel-native ids.""" - return NativeIdAllowlist(native_ids, channel=channel) - - @staticmethod - def linked_claim(claim: str, values: Collection[str]) -> LinkedClaimAllowlist: - """Allow identities whose verified claim matches one of ``values``.""" - return LinkedClaimAllowlist(claim, values) - - @staticmethod - def any_of(*allowlists: IdentityAllowlist) -> AnyOfAllowlists: - """Allow when any child allowlist allows.""" - return AnyOfAllowlists(*allowlists) - - @staticmethod - def all_of(*allowlists: IdentityAllowlist) -> AllOfAllowlists: - """Allow only when every child allowlist allows.""" - return AllOfAllowlists(*allowlists) - - @staticmethod - def custom( - fn: Callable[[AuthorizationContext], Awaitable[AllowlistDecision]], - *, - requires_linked_claims: 
bool = False, - ) -> CallableAllowlist: - """Wrap a custom async allowlist function.""" - return CallableAllowlist(fn, requires_linked_claims=requires_linked_claims) - - -# --------------------------------------------------------------------------- # -# Configuration error # -# --------------------------------------------------------------------------- # - - -class ChannelConfigurationError(ValueError): - """Raised at host construction for authorization config that would deny all users. - - The host validator runs three rules (see spec §"Configuration - validation"); any failure is reported here rather than letting - the misconfigured host start up and reject every request. - """ - - -__all__ = [ - "AllOfAllowlists", - "AllowAll", - "Allowed", - "AllowlistDecision", - "AnyOfAllowlists", - "AuthPolicy", - "AuthorizationContext", - "AuthorizationOutcome", - "CallableAllowlist", - "ChannelConfigurationError", - "ClaimValue", - "Denied", - "IdentityAllowlist", - "IdentityLinker", - "LinkChallenge", - "LinkRequired", - "LinkResolution", - "LinkedClaimAllowlist", - "LinkedIdentity", - "NativeIdAllowlist", - "SupportsLinkStorePath", -] diff --git a/python/packages/hosting/agent_framework_hosting/_host.py b/python/packages/hosting/agent_framework_hosting/_host.py index d90a9df29c3..c0a6a8e461b 100644 --- a/python/packages/hosting/agent_framework_hosting/_host.py +++ b/python/packages/hosting/agent_framework_hosting/_host.py @@ -2,23 +2,21 @@ """The :class:`AgentFrameworkHost` and its :class:`ChannelContext` bridge. -The host is a tiny Starlette wrapper: +The host is a small Starlette wrapper: - ``__init__`` accepts a hostable target (``SupportsAgentRun`` agent or ``Workflow``) and a sequence of channels. - :meth:`AgentFrameworkHost.app` lazily builds a Starlette app by calling every channel's ``contribute`` and mounting the returned routes under the channel's ``path`` (empty path → mount at the app root). 
-- :class:`ChannelContext` exposes ``run`` / ``run_stream`` / - ``deliver_response`` for channels to invoke; the host handles - per-``isolation_key`` session caching, identity tracking, and - :class:`ResponseTarget` fan-out. +- :class:`ChannelContext` exposes ``run`` / ``run_stream`` for channels to + invoke; the host handles hook invocation and per-``isolation_key`` session + caching. Per SPEC-002 (and ADR-0026), the host is intentionally thin so the bulk of channel-specific behaviour stays in the channel package. Identity -linking, link policies, response targets, background runs, and the like -are pluggable extensions that the future identity/foundry packages will -contribute on top of this surface. +linking, multicast delivery, background runs, and durable delivery are +follow-up enhancements layered outside this v1 host contract. """ from __future__ import annotations @@ -30,7 +28,7 @@ from collections.abc import AsyncIterator, Awaitable, Callable, Mapping, Sequence from contextlib import AbstractContextManager, ExitStack, asynccontextmanager from pathlib import Path -from typing import TYPE_CHECKING, Any, Literal, cast +from typing import TYPE_CHECKING, Any, cast from agent_framework import ( AgentResponse, @@ -51,20 +49,6 @@ from starlette.routing import BaseRoute, Mount, Route, WebSocketRoute from starlette.types import ASGIApp, Receive, Scope, Send -from ._authorization import ( - Allowed, - AllowlistDecision, - AuthorizationContext, - AuthorizationOutcome, - ChannelConfigurationError, - ClaimValue, - Denied, - IdentityAllowlist, - IdentityLinker, - LinkChallenge, - LinkRequired, - SupportsLinkStorePath, -) from ._isolation import ( ISOLATION_HEADER_CHAT, ISOLATION_HEADER_USER, @@ -73,21 +57,15 @@ set_current_isolation_keys, ) from ._persistence import normalize_state_dir -from ._runner import InProcessTaskRunner -from ._state_store import SessionsStateStore, build_session_dicts +from ._state_store import SessionsStateStore, build_session_aliases from 
._types import ( Channel, - ChannelIdentity, - ChannelPush, - ChannelPushCodec, ChannelRequest, - DurableTaskPayloadMode, - DurableTaskRunner, + ChannelResponseHook, + ChannelRunHook, + ChannelStreamUpdateHook, HostedRunResult, HostStatePaths, - PushPayloadNotSerializable, - ResponseTargetKind, - apply_channel_response_hook, ) if TYPE_CHECKING: @@ -96,20 +74,6 @@ logger = logging.getLogger("agent_framework.hosting") -# Environment markers that auto-detect ``runtime_mode="ephemeral"``. Order -# matters only for telemetry — the first match wins and is logged at -# startup. Adding a new marker is a non-breaking change; consumers can -# always override via the ``runtime_mode`` constructor parameter. -_EPHEMERAL_RUNTIME_MARKERS: tuple[str, ...] = ( - "FOUNDRY_HOSTING_ENVIRONMENT", - "AZURE_FUNCTIONS_ENVIRONMENT", - "AWS_LAMBDA_FUNCTION_NAME", -) - - -RuntimeMode = Literal["long_running", "ephemeral"] - - def _exact_path_route(path: str, route: BaseRoute) -> BaseRoute | None: """Clone a root route so ``Mount('/x', Route('/'))`` also handles ``/x`` without a redirect.""" if isinstance(route, Route) and route.path == "/": @@ -125,46 +89,6 @@ def _exact_path_route(path: str, route: BaseRoute) -> BaseRoute | None: return None -def _detect_runtime_mode(env: Mapping[str, str] | None = None) -> tuple[RuntimeMode, str | None]: - """Inspect deployment markers and return ``(mode, matched_marker_or_None)``. - - Pure / side-effect-free so the host can call it once at construction - and tests can pass a synthetic env. ``env`` defaults to - :data:`os.environ`. Returns ``"long_running"`` when nothing matches — - that's the sensible default for local dev and always-on container - deployments. - """ - source = env if env is not None else os.environ - for marker in _EPHEMERAL_RUNTIME_MARKERS: - if source.get(marker): - return ("ephemeral", marker) - return ("long_running", None) - - -# Internal name the host uses when registering the push handler on the -# durable task runner. 
Exposed as a module constant so adapter packages
-# (and the future background-run wiring under req #14) can use the same
-# name for cross-runner observability.
-HOSTING_PUSH_TASK_NAME = "hosting.push"
-
-
-def _flatten_allowlists(allowlist: IdentityAllowlist) -> tuple[IdentityAllowlist, ...]:
-    """Walk an allowlist tree to expose nested :class:`IdentityAllowlist` instances.
-
-    Used by :meth:`AgentFrameworkHost._validate_channel_authorization`
-    to inspect every leaf so type-checks like
-    ``NativeIdAllowlist(channel=)`` can be detected even
-    when buried inside :class:`AnyOfAllowlists` / :class:`AllOfAllowlists`.
-    """
-    children = getattr(allowlist, "_children", None)
-    if children:
-        flat: list[IdentityAllowlist] = [allowlist]
-        for child in children:
-            flat.extend(_flatten_allowlists(child))
-        return tuple(flat)
-    return (allowlist,)
-
-
 def _checkpoint_path_for_isolation_key(root: Path, isolation_key: str) -> Path:
     r"""Return ``root / isolation_key`` after rejecting path-traversal patterns.
@@ -236,6 +160,34 @@ def _workflow_output_to_text(value: Any) -> str:
     return str(value)
 
 
+async def _apply_run_hook(
+    hook: ChannelRunHook,
+    request: ChannelRequest,
+    *,
+    target: SupportsAgentRun | Workflow,
+    protocol_request: Any | None,
+) -> ChannelRequest:
+    """Invoke a run hook with the host-owned calling convention."""
+    result = hook(request, target=target, protocol_request=protocol_request)
+    if isinstance(result, Awaitable):
+        return await result
+    return result
+
+
+async def _apply_response_hook(
+    hook: ChannelResponseHook,
+    result: HostedRunResult[Any],
+    *,
+    request: ChannelRequest,
+    channel_name: str | None,
+) -> HostedRunResult[Any]:
+    """Invoke a response hook with the host-owned calling convention."""
+    out = hook(result, request=request, channel_name=channel_name or request.channel)
+    if isinstance(out, Awaitable):
+        return await out
+    return out
+
+
 def _workflow_event_to_update(event: WorkflowEvent[Any]) -> AgentResponseUpdate | None:
     """Map a :class:`WorkflowEvent` to a channel-friendly :class:`AgentResponseUpdate`.
 
@@ -272,7 +224,7 @@ def _workflow_event_to_update(event: WorkflowEvent[Any]) -> AgentResponseUpdate
 
 
 @asynccontextmanager
-async def _suppress_already_consumed() -> AsyncIterator[None]:
+async def _suppress_already_consumed() -> AsyncIterator[None]:  # noqa: RUF029
     """Yield, swallowing finalizer failures so consumer cleanup never crashes the host.
 
     The bridge stream calls ``get_final_response()`` after iterating the
@@ -386,6 +338,63 @@ def __getattr__(self, name: str) -> Any:
         return getattr(self._inner, name)
 
 
+class _HostResponseStream:
+    """Adapter that applies host-owned stream and final-response hooks."""
+
+    def __init__(
+        self,
+        inner: Any,
+        *,
+        request: ChannelRequest,
+        stream_update_hook: ChannelStreamUpdateHook | None = None,
+        response_hook: ChannelResponseHook | None = None,
+        channel_name: str | None = None,
+    ) -> None:
+        self._inner = inner
+        self._request = request
+        self._stream_update_hook = stream_update_hook
+        self._response_hook = response_hook
+        self._channel_name = channel_name
+
+    def __await__(self) -> Any:
+        return self.get_final_response().__await__()
+
+    def __aiter__(self) -> AsyncIterator[Any]:
+        return self._wrap()
+
+    async def _wrap(self) -> AsyncIterator[Any]:
+        async for update in self._inner:
+            if self._stream_update_hook is None:
+                yield update
+                continue
+            transformed = self._stream_update_hook(update)
+            if isinstance(transformed, Awaitable):
+                transformed = await transformed
+            if transformed is None:
+                continue
+            yield transformed
+
+    async def get_final_response(self) -> Any:
+        result = await self._inner.get_final_response()
+        if self._response_hook is None:
+            return result
+        shaped = await _apply_response_hook(
+            self._response_hook,
+            HostedRunResult(result),
+            request=self._request,
+            channel_name=self._channel_name,
+        )
+        return shaped.result
+
+    async def aclose(self) -> None:
+        close = getattr(self._inner, "aclose", None)
+        if close is not None:
+            await close()
+
+    def __getattr__(self, name: str) -> Any:
+        return getattr(self._inner, name)
+
+
 class ChannelContext:
     """Host-owned bridge that channels call to invoke the target."""
 
@@ -393,8 +402,8 @@ def __init__(self, host: AgentFrameworkHost) -> None:
         """Bind the context to its owning :class:`AgentFrameworkHost`.
 
         The host instance is the source of truth for the target, registered
-        channels, identity stores, sessions, and lifecycle state. Channels
-        only ever receive a context; they never see the host directly.
+        channels, sessions, and lifecycle state. Channels only ever receive a
+        context; they never see the host directly.
         """
         self._host = host
 
@@ -403,7 +412,15 @@ def target(self) -> SupportsAgentRun | Workflow:
         """The hostable target the channel should invoke."""
         return self._host.target
 
-    async def run(self, request: ChannelRequest) -> HostedRunResult[Any]:
+    async def run(
+        self,
+        request: ChannelRequest,
+        *,
+        run_hook: ChannelRunHook | None = None,
+        protocol_request: Any | None = None,
+        response_hook: ChannelResponseHook | None = None,
+        channel_name: str | None = None,
+    ) -> HostedRunResult[Any]:
         """Invoke the target for ``request`` and return a channel-neutral result.
 
         For agent targets the return type narrows to
@@ -412,45 +429,78 @@ async def run(self, request: ChannelRequest) -> HostedRunResult[Any]:
         as ``HostedRunResult[Any]`` because :class:`ChannelContext` is
         agnostic to which target shape the host was constructed with;
         channels narrow at the call site if they need it.
-        """
-        return await self._host._invoke(request)  # pyright: ignore[reportPrivateUsage]
 
-    def run_stream(self, request: ChannelRequest) -> ResponseStream[AgentResponseUpdate, AgentResponse]:
-        """Invoke the target with ``stream=True`` and return the agent's ResponseStream.
-
-        Channels iterate the stream directly (it acts like an AsyncGenerator)
-        and are responsible for delivering updates to their wire protocol.
-        Apply per-channel ``transform_hook`` callables during iteration to
-        rewrite or drop individual updates before they hit the wire.
+        Args:
+            request: The channel-built request envelope.
+
+ response_hook: Optional channel-supplied hook the host applies to + the completed result before returning it. + channel_name: Channel name passed to ``response_hook``. Defaults + to ``request.channel``. """ - return self._host._invoke_stream(request) # pyright: ignore[reportPrivateUsage] + prepared = await self._host._apply_run_hook( # pyright: ignore[reportPrivateUsage] + request, + hook=run_hook, + protocol_request=protocol_request, + ) + result = await self._host._invoke(prepared) # pyright: ignore[reportPrivateUsage] + return await self._host._apply_response_hook( # pyright: ignore[reportPrivateUsage] + result, + request=prepared, + hook=response_hook, + channel_name=channel_name, + ) - async def deliver_response( + async def run_stream( self, request: ChannelRequest, - payload: HostedRunResult[Any], - ) -> bool: - """Resolve ``request.response_target`` and push ``payload`` to each destination. - - Returns ``True`` when the originating channel should render the - agent reply on its own wire (i.e. the resolved target included - the originating channel — explicitly via - ``ResponseTarget.originating``, implicitly via - ``ResponseTarget.channels(["originating", ...])``, or as the - host's "every destination dropped, fall back to originating" - recovery path). Returns ``False`` when the reply is fanned out - purely to non-originating destinations (or - :data:`ResponseTarget.none` suppresses the reply entirely) — in - which case the originating channel typically responds with a - bare ack. - - Per-destination push outcomes (scheduled, retried, terminally - failed) live in the durable task runner's own log; this method - emits structured log entries for every resolution-time skip and - every schedule-time outage so operators have a single grep - anchor for "where did my reply go?". 
+        *,
+        run_hook: ChannelRunHook | None = None,
+        protocol_request: Any | None = None,
+        stream_update_hook: ChannelStreamUpdateHook | None = None,
+        response_hook: ChannelResponseHook | None = None,
+        channel_name: str | None = None,
+    ) -> ResponseStream[AgentResponseUpdate, AgentResponse]:
+        """Apply host-owned hooks and invoke the target with ``stream=True``.
+
+        Channels iterate the stream directly (it acts like an AsyncGenerator)
+        and are responsible for delivering updates to their wire protocol.
+        When ``stream_update_hook`` is supplied, the host applies it during
+        iteration to rewrite or drop individual updates before they hit the wire.
+
+        Args:
+            request: The channel-built request envelope.
+
+        Keyword Args:
+            run_hook: Optional channel-supplied hook the host applies before
+                opening the target stream.
+            protocol_request: Raw channel-native payload passed to
+                ``run_hook``.
+            stream_update_hook: Optional host-applied update transform.
+            response_hook: Optional host-applied final-response transform.
+            channel_name: Channel name passed to ``response_hook``. Defaults
+                to ``request.channel``.
         """
-        return await self._host._deliver_response(request, payload)  # pyright: ignore[reportPrivateUsage]
+        prepared = await self._host._apply_run_hook(  # pyright: ignore[reportPrivateUsage]
+            request,
+            hook=run_hook,
+            protocol_request=protocol_request,
+        )
+        stream = self._host._invoke_stream(prepared)  # pyright: ignore[reportPrivateUsage]
+        if stream_update_hook is None and response_hook is None:
+            return stream
+        return _HostResponseStream(
+            stream,
+            request=prepared,
+            stream_update_hook=stream_update_hook,
+            response_hook=response_hook,
+            channel_name=channel_name,
+        )  # type: ignore[return-value]
 
 
 class _FoundryIsolationASGIMiddleware:
@@ -502,11 +552,6 @@ def __init__(
         channels: Sequence[Channel],
         debug: bool = False,
         checkpoint_location: str | os.PathLike[str] | CheckpointStorage | None = None,
-        runtime_mode: RuntimeMode | None = None,
-        durable_task_runner: DurableTaskRunner | None = None,
-        allow_in_process_runner: bool = False,
-        default_allowlist: IdentityAllowlist | None = None,
-        identity_linker: IdentityLinker | None = None,
         state_dir: str | os.PathLike[str] | HostStatePaths | Mapping[str, str | os.PathLike[str]] | None = None,
     ) -> None:
         """Create a host for ``target`` and its channels.
@@ -544,73 +589,21 @@ def __init__(
                 ``state_dir['checkpoints']`` (or the auto-derived
                 ``state_dir/checkpoints/`` subfolder); a warning surfaces
                 the double-configuration.
-            runtime_mode: Hint that drives the *defaults* for runtime-shape
-                dependent components (currently the durable task runner,
-                and — by extension — anything that wants to know whether
-                the process is expected to outlive a single request).
-                ``"long_running"`` (containers, OpenClaw-style always-on
-                deployments, local dev) → in-process / in-memory defaults.
-                ``"ephemeral"`` (Foundry Hosted Agents, Azure Functions,
-                AWS Lambda) → the host expects a durable runner to be
-                supplied via ``durable_task_runner`` and logs a warning
-                otherwise.
``None`` (the default) auto-detects from - deployment environment markers (currently - ``FOUNDRY_HOSTING_ENVIRONMENT``, ``AZURE_FUNCTIONS_ENVIRONMENT``, - ``AWS_LAMBDA_FUNCTION_NAME``); falls back to - ``"long_running"``. - durable_task_runner: The runner used to dispatch - non-originating push fan-out. Defaults to a process-local - :class:`InProcessTaskRunner` (asyncio + bounded retry, no - persistence) — appropriate for ``runtime_mode="long_running"`` - deployments. Ephemeral deployments should pass a durable - adapter (e.g. ``agent-framework-hosting-durabletask``, - or a Foundry-native adapter once available) so scheduled - pushes survive process restarts. - allow_in_process_runner: Opt-in escape hatch that allows - ``runtime_mode="ephemeral"`` to be paired with the - default in-process runner. Without this flag, the host - refuses to start in ephemeral mode without an explicit - ``durable_task_runner`` because the failure mode — - non-originating pushes silently lost on process recycle — - is the worst class of production bug (works in light - testing, drops work under load / lifecycle events). - Useful for local dev that wants to exercise ephemeral - code paths without standing up a durable backend; **not** - appropriate for production. - default_allowlist: Host-level fallback applied to every - channel that leaves ``allowlist="inherit"``. ``None`` - (the default) means the channel is open unless it sets - its own ``allowlist``. Channels can opt out of the host - default by setting ``allowlist=None`` explicitly. - identity_linker: Optional :class:`IdentityLinker` used to - resolve channel-native identities into verified IdP-backed - identities, or to return a :class:`LinkChallenge` the - channel can render when the user still needs to sign in. - Channels with ``require_link=True`` require this to be - configured unless they provide their own native verified - claims. state_dir: Opt-in disk persistence for host-managed state. 
- When set, the host writes the in-process task runner's - pending queue and the session-related dicts - (``_session_aliases``, ``_active``, ``_identities``) to - a :mod:`diskcache`-backed store under ``state_dir`` and - replays the runner queue on next startup. When the - target is a :class:`Workflow`, the auto-derived + When set, the host writes session aliases created by + :meth:`reset_session` to a :mod:`diskcache`-backed store + under ``state_dir``. When the target is a + :class:`Workflow`, the auto-derived ``state_dir/checkpoints/`` subfolder (or the ``checkpoints`` key of the mapping form) is also used as the workflow checkpoint location (equivalent to - passing ``checkpoint_location`` directly). The - auto-derived ``state_dir/links/`` subfolder (or the - ``links`` key of the mapping form) is offered to - identity linkers that implement - :class:`SupportsLinkStorePath`. Accepts: + passing ``checkpoint_location`` directly). Accepts: * ``None`` (default) — everything stays in memory; the process owns its state and loses it on exit. Matches today's behaviour exactly. * ``str`` / :class:`os.PathLike` — the host derives - default subpaths ``state_dir/runner/``, - ``state_dir/sessions/``, ``state_dir/links/``, and + default subpaths ``state_dir/sessions/`` and (for workflow targets) ``state_dir/checkpoints/``. Recommended for most long-running-host deployments — one path, no extra @@ -621,50 +614,32 @@ def __init__( * :class:`HostStatePaths` typed dict / plain ``Mapping`` — per-component overrides for callers that want each component on a different volume (fast local - SSD for the runner, network-attached volume for - sessions, …). Components missing from the mapping - fall back to in-memory (or, for ``checkpoints``, to - no checkpoint persistence). Unknown keys raise + SSD for checkpoints, network-attached volume for + sessions, …). Components missing from the mapping fall + back to in-memory (or, for ``checkpoints``, to no + checkpoint persistence). 
Unknown keys raise ``ValueError`` to surface typos early. - The ``runner`` and ``sessions`` components require the + The ``sessions`` component requires the optional ``diskcache`` dependency (install with ``pip install 'agent-framework-hosting[disk]'``); ``checkpoints`` uses the core :class:`~agent_framework.FileCheckpointStorage` and has - no extra dependency. Each disk-cache-backed component - acquires an OS-level advisory lock on its directory; a - second host pointed at the same paths raises + no extra dependency. The disk-cache-backed sessions + component acquires an OS-level advisory lock on its + directory; a second host pointed at the same path raises :class:`RuntimeError` at construction so two processes - do not double-execute queued tasks. When - ``durable_task_runner`` is supplied explicitly, the - ``runner`` sub-path is ignored — the caller owns the - runner's persistence story. When ``checkpoint_location`` - is supplied explicitly, the ``checkpoints`` sub-path is - ignored. When an ``identity_linker`` does not implement - :class:`SupportsLinkStorePath`, the ``links`` sub-path is - ignored and the linker must be configured directly. + do not race session-alias writes. When + ``checkpoint_location`` is supplied explicitly, the + ``checkpoints`` sub-path is ignored. """ self.target: SupportsAgentRun | Workflow = target self._is_workflow = isinstance(target, Workflow) self.channels = list(channels) self._debug = debug self._app: Starlette | None = None - # Disk persistence — normalise the per-component map up front so - # the runner, session-store, and checkpoint paths are resolved - # before any consumer (including ``checkpoint_location``) is - # built. ``None`` (default) means everything stays in memory. 
self._state_paths: dict[str, Path | None] = normalize_state_dir(state_dir) - # Track whether the user passed the mapping form so we can - # distinguish "auto-derived from single path" (silent ignore for - # non-workflow targets) from "explicit mapping key" (warn for - # non-workflow targets, since that's almost certainly dead config). checkpoints_explicit_in_mapping = isinstance(state_dir, Mapping) and "checkpoints" in state_dir - links_explicit_in_mapping = isinstance(state_dir, Mapping) and "links" in state_dir - # Resolve the effective workflow checkpoint location: the - # explicit ``checkpoint_location`` argument wins; otherwise we - # fall back to ``state_dir['checkpoints']`` (single-path form - # auto-derives ``state_dir/checkpoints/``). derived_checkpoint_path = self._state_paths.get("checkpoints") self._checkpoint_location: Path | CheckpointStorage | None = None effective_checkpoint_source: str | os.PathLike[str] | CheckpointStorage | None = checkpoint_location @@ -685,7 +660,7 @@ def __init__( "(state_dir['checkpoints']=%s); the explicit checkpoint_location " "takes precedence and the state_dir sub-path is ignored. " "Use the HostStatePaths mapping form and omit 'checkpoints' to " - "configure runner/sessions persistence without also enabling " + "configure session-alias persistence without also enabling " "host-managed workflow checkpointing.", derived_checkpoint_path, ) @@ -713,114 +688,19 @@ def __init__( # ``CheckpointStorage`` is a non-runtime-checkable Protocol, # so we cannot ``isinstance``-check it directly. self._checkpoint_location = effective_checkpoint_source - # Runtime mode + durable task runner. We resolve mode first - # because the warning-on-ephemeral-without-runner only fires - # when both are at their defaults. 
- if runtime_mode is None: - resolved_mode, matched_marker = _detect_runtime_mode() - self._runtime_mode: RuntimeMode = resolved_mode - self._runtime_mode_source: str = ( - f"auto-detected from {matched_marker}" if matched_marker is not None else "auto-detected default" - ) - else: - self._runtime_mode = runtime_mode - self._runtime_mode_source = "explicit" - if durable_task_runner is None: - if self._runtime_mode == "ephemeral" and not allow_in_process_runner: - raise RuntimeError( - "AgentFrameworkHost is running in ephemeral runtime mode " - f"({self._runtime_mode_source}) without a durable_task_runner. " - "Non-originating push deliveries would be lost on process " - "recycle. Pass `durable_task_runner=...` (e.g. an " - "agent-framework-hosting-durabletask runner) for production, " - "or set `allow_in_process_runner=True` to opt out of this " - "check (e.g. for local dev exercising ephemeral code paths)." - ) - # When state_dir["runner"] is set, the default in-process - # runner persists its queue to disk so a long-running host - # can replay in-flight pushes after a crash / restart. - self._durable_task_runner: DurableTaskRunner = InProcessTaskRunner( - state_dir=self._state_paths.get("runner"), - ) - self._owns_runner = True - if self._runtime_mode == "ephemeral": - logger.warning( - "AgentFrameworkHost is running in ephemeral runtime mode " - "with the default InProcessTaskRunner (allow_in_process_runner=True). " - "Non-originating push deliveries will be lost if the process is " - "recycled mid-flight — this configuration is intended for local dev only." - ) - else: - self._durable_task_runner = durable_task_runner - self._owns_runner = False - if self._state_paths.get("runner") is not None: - # The caller supplied both a runner and a runner state - # path. The path would only have applied to the default - # in-process runner; surface the misconfig so it doesn't - # silently become a no-op. 
- logger.warning( - "state_dir['runner'] is set but a durable_task_runner was " - "supplied explicitly; the runner sub-path is ignored — " - "configure persistence on the runner instance directly." - ) - # Validate the runner / push-codec pairing eagerly: a JSON-mode - # durable runner cannot persist payloads for a push-capable - # channel that has no codec. Failing here makes the misconfig - # visible at process start rather than on first push. - self._validate_runner_codec_pairing() - # Register the internal push handler eagerly so it is available - # whether callers invoke ``_deliver_response`` directly (e.g. - # tests) or through the lifespan-managed ASGI app. Doing this - # in ``__init__`` is safe because runner handler registration - # has no I/O — it only associates a name with a callable. - self._durable_task_runner.register(HOSTING_PUSH_TASK_NAME, self._handle_push_task) - # Per-isolation_key session cache. The real spec backs this with a - # pluggable session store; this base host keeps it in-process. - # NOTE: live ``AgentSession`` objects are NOT persisted to disk - # — the history provider rehydrates them from its own store on - # the next turn. ``state_dir`` only persists the lightweight - # pickle-friendly bookkeeping below. self._sessions: dict[str, Any] = {} - # Open the disk-backed sessions store first when persistence is - # on; the three persisted dicts share the same cache + lock to - # minimise file handles and acquisition cost. sessions_path = self._state_paths.get("sessions") self._sessions_store: SessionsStateStore | None if sessions_path is not None: self._sessions_store = SessionsStateStore(sessions_path) - # ``isolation_key -> active session_id``. Normally identical to the - # isolation_key, but ``reset_session`` rotates this to a fresh id so - # the next turn starts a new ``AgentSession`` while the old history - # remains on disk under its original session_id. Persisted so a - # rotation survives a restart. 
- aliases_dict, active_dict, identities_dict = build_session_dicts(self._sessions_store) - self._session_aliases: dict[str, str] = aliases_dict - # (isolation_key -> last-seen channel name) for ResponseTarget.active. - self._active: dict[str, str] = active_dict - # Per-isolation_key identity registry: which channels we've seen this - # user on, and which native_id they used on each. Powers - # ResponseTarget.active / .channel(name) / .channels([...]) / - # .all_linked. - # Shape: { isolation_key: { channel_name: ChannelIdentity } }. - self._identities: dict[str, dict[str, ChannelIdentity]] = identities_dict + self._session_aliases: dict[str, str] = build_session_aliases(self._sessions_store) else: self._sessions_store = None self._session_aliases = {} - self._active = {} - self._identities = {} # Set by ``serve()`` so the lifespan startup handler doesn't # double-log the banner; remains ``False`` when callers mount # ``host.app`` under their own ASGI server. self._startup_logged: bool = False - # Authorization seam: allowlists, optional identity linker, and - # construction-time validation for fail-fast misconfigurations. 
- self._default_allowlist: IdentityAllowlist | None = default_allowlist - self._identity_linker: IdentityLinker | None = identity_linker - self._configure_identity_linker_state( - self._state_paths.get("links"), - explicit=links_explicit_in_mapping, - ) - self._validate_channel_authorization() @property def app(self) -> Starlette: @@ -829,340 +709,6 @@ def app(self) -> Starlette: self._app = self._build_app() return self._app - def _configure_identity_linker_state(self, links_path: Path | None, *, explicit: bool) -> None: - """Offer the derived ``state_dir['links']`` path to compatible linkers.""" - if links_path is None: - return - linker = self._identity_linker - if linker is None: - if explicit: - logger.warning("state_dir['links'] is set but no identity_linker is configured; ignoring.") - return - if isinstance(linker, SupportsLinkStorePath): - linker.configure_link_store_path(links_path) - return - logger.warning( - "state_dir['links'] is set but the configured identity_linker does not implement " - "SupportsLinkStorePath; configure link-store persistence on the linker directly." - ) - - def _validate_runner_codec_pairing(self) -> None: - """Refuse to start when a JSON-mode runner is paired with codec-less push channels. - - A JSON-mode durable runner (``payload_mode=JSON``) persists every - scheduled task's payload so it survives process restarts. The - host's ``hosting.push`` payload includes a - :class:`HostedRunResult` containing the full agent / workflow - output, which cannot be JSON-serialised without help from the - destination channel. Push-capable channels therefore must - declare a :class:`ChannelPushCodec` (a duck-typed - ``push_codec`` attribute on the channel) when paired with a - JSON-mode runner. - - Object-mode runners (the default in-process runner) accept live - Python references and skip this check. 
- """ - mode = getattr(self._durable_task_runner, "payload_mode", DurableTaskPayloadMode.OBJECT) - if mode != DurableTaskPayloadMode.JSON: - return - missing: list[str] = [] - for channel in self.channels: - if not isinstance(channel, ChannelPush): - # Channels that don't implement push are never scheduled, - # so a missing codec is fine. - continue - codec = getattr(channel, "push_codec", None) - if codec is None: - missing.append(channel.name) - if missing: - raise RuntimeError( - "Durable task runner declares payload_mode=JSON, but the following " - "push-capable channels have no `push_codec` attribute and cannot " - "be serialised for persistence: " - f"{', '.join(missing)}. Add a ChannelPushCodec to each channel " - "or switch to an object-mode runner (e.g. InProcessTaskRunner)." - ) - - def _resolve_channel_allowlist(self, channel: Channel) -> IdentityAllowlist | None: - """Apply the ``"inherit"`` / ``None`` / explicit semantics. - - - ``"inherit"`` (default) → host's ``default_allowlist``. - - ``None`` → explicitly open (carve-out inside a locked host). - - any other value → use as-is. - """ - raw: Any = getattr(channel, "allowlist", "inherit") - if raw == "inherit": - return self._default_allowlist - # ``None`` and concrete allowlists both pass through unchanged; - # the caller (``authorize``) treats ``None`` as "open". - return cast("IdentityAllowlist | None", raw) - - def _validate_channel_authorization(self) -> None: - """Reject configurations that would silently deny every user. - - Runs three rules (see spec § "Configuration validation"): - - 1. If a channel's resolved allowlist declares - ``requires_linked_claims=True``, the channel must either set - ``require_link=True`` or declare - ``emits_verified_claims=True`` — otherwise no verified - claims will ever reach :meth:`evaluate` and the allowlist - would always ``ABSTAIN`` / ``DENY``. - 2. If any channel has ``require_link=True``, an - ``identity_linker`` must be configured. 
Silent - deny-everyone is the worst possible default. - 3. ``NativeIdAllowlist(channel=)`` must reference a - channel name that exists on this host — typo-detection. - """ - known_channels = {c.name for c in self.channels} - for channel in self.channels: - allowlist = self._resolve_channel_allowlist(channel) - require_link = bool(getattr(channel, "require_link", False)) - emits_claims = bool(getattr(channel, "emits_verified_claims", False)) - # Rule #2: require_link without a linker. - if require_link and self._identity_linker is None: - raise ChannelConfigurationError( - f"Channel '{channel.name}' has require_link=True but no " - "identity_linker is configured on the host. Configure one or " - "remove require_link=True (silent deny-everyone is rejected)." - ) - if allowlist is None: - continue - # Rule #1: claim-dependent allowlist needs a claim source. - if getattr(allowlist, "requires_linked_claims", False) and not (require_link or emits_claims): - raise ChannelConfigurationError( - f"Channel '{channel.name}' has an allowlist that requires " - "verified IdP claims (requires_linked_claims=True) but the " - "channel neither sets require_link=True nor emits verified " - "claims natively. Configure a source of verified claims for " - "the allowlist (silent deny-everyone is rejected)." - ) - # Rule #3: native-id allowlists pointing at unknown channels. - for nested in _flatten_allowlists(allowlist): - target = getattr(nested, "channel", None) - if target is not None and target not in known_channels: - raise ChannelConfigurationError( - f"NativeIdAllowlist on channel '{channel.name}' references " - f"unknown channel '{target}'. Known channels: " - f"{sorted(known_channels)}." 
- ) - - async def authorize( - self, - identity: ChannelIdentity, - *, - require_link: bool = False, - allowlist: IdentityAllowlist | None = None, - verified_claims: Mapping[str, ClaimValue] | None = None, - ) -> AuthorizationOutcome: - """Evaluate authorization for ``identity`` against ``allowlist``. - - Channels should call this **before** producing a - :class:`ChannelRequest` so a denied identity never reaches the - agent. The host's run path also re-checks authorization for - defense-in-depth, but channels that surface :class:`Denied` or - :class:`LinkRequired` themselves can render the outcome - through their native UX (refusal message, link challenge) - rather than a generic error. - - Supports open, native-id allowlist, and verified-claim allowlist - profiles. ``require_link=True`` or claim-based allowlists use - the configured :class:`IdentityLinker`; channels that natively - authenticate users may pass ``verified_claims`` directly. - - Returns: - One of :class:`Allowed`, :class:`LinkRequired`, or - :class:`Denied`. - """ - claims: Mapping[str, ClaimValue] = verified_claims or {} - claim_source: Literal["linker", "channel", "none"] = "channel" if claims else "none" - auto_isolation_key = self._auto_issue_isolation_key(identity) - if allowlist is None: - # Open profile (or explicitly carved-out channel). 
- if require_link: - return await self._resolve_required_link(identity) - return Allowed(isolation_key=auto_isolation_key, verified_claims=claims, claim_source=claim_source) - pre_context = AuthorizationContext( - identity=identity, - phase="pre_link", - isolation_key=None, - verified_claims=claims, - claim_source=claim_source, - ) - decision = await allowlist.evaluate(pre_context) - if decision is AllowlistDecision.ALLOW: - if require_link: - return await self._resolve_required_link(identity) - return Allowed(isolation_key=auto_isolation_key, verified_claims=claims, claim_source=claim_source) - if decision is AllowlistDecision.DENY: - return Denied( - reason_code="allowlist_denied_pre_link", - user_message="You don't have access to this bot.", - log_details={ - "channel": identity.channel, - "phase": "pre_link", - }, - ) - # ABSTAIN: claim-dependent allowlists need a post-link / - # verified-claim evaluation. Non-claim allowlists can fall - # through to the open path, while still honoring require_link. 
- if getattr(allowlist, "requires_linked_claims", False): - if claims: - post_context = AuthorizationContext( - identity=identity, - phase="post_link", - isolation_key=auto_isolation_key, - verified_claims=claims, - claim_source="channel", - ) - post_decision = await allowlist.evaluate(post_context) - return self._authorization_outcome_from_post_link( - identity=identity, - isolation_key=auto_isolation_key, - claims=claims, - claim_source="channel", - decision=post_decision, - ) - return await self._resolve_and_evaluate_claim_allowlist(identity, allowlist) - if require_link: - return await self._resolve_required_link(identity) - return Allowed(isolation_key=auto_isolation_key, verified_claims=claims, claim_source=claim_source) - - async def _resolve_required_link(self, identity: ChannelIdentity) -> AuthorizationOutcome: - """Resolve ``identity`` through the configured linker or request linking.""" - linker = self._identity_linker - if linker is None: - # Defensive: the construction-time validator should catch this. 
- return Denied( - reason_code="link_required_without_linker", - user_message="Sign-in is not configured for this bot.", - log_details={"channel": identity.channel}, - ) - resolution = await linker.resolve(identity) - if isinstance(resolution, LinkChallenge): - return LinkRequired(challenge=resolution) - return Allowed( - isolation_key=resolution.isolation_key, - verified_claims=resolution.verified_claims, - claim_source=resolution.claim_source, - ) - - async def _resolve_and_evaluate_claim_allowlist( - self, - identity: ChannelIdentity, - allowlist: IdentityAllowlist, - ) -> AuthorizationOutcome: - """Resolve identity, then run a claim-dependent allowlist post-link.""" - linker = self._identity_linker - if linker is None: - return Denied( - reason_code="allowlist_requires_link", - user_message="Please link your account to continue.", - log_details={"channel": identity.channel, "phase": "pre_link"}, - ) - resolution = await linker.resolve(identity) - if isinstance(resolution, LinkChallenge): - return LinkRequired(challenge=resolution) - post_context = AuthorizationContext( - identity=identity, - phase="post_link", - isolation_key=resolution.isolation_key, - verified_claims=resolution.verified_claims, - claim_source=resolution.claim_source, - ) - post_decision = await allowlist.evaluate(post_context) - return self._authorization_outcome_from_post_link( - identity=identity, - isolation_key=resolution.isolation_key, - claims=resolution.verified_claims, - claim_source=resolution.claim_source, - decision=post_decision, - ) - - def _authorization_outcome_from_post_link( - self, - *, - identity: ChannelIdentity, - isolation_key: str, - claims: Mapping[str, ClaimValue], - claim_source: Literal["linker", "channel"], - decision: AllowlistDecision, - ) -> AuthorizationOutcome: - """Convert a post-link allowlist decision to a host outcome.""" - if decision is AllowlistDecision.ALLOW: - return Allowed(isolation_key=isolation_key, verified_claims=claims, 
claim_source=claim_source) - if decision is AllowlistDecision.DENY: - return Denied( - reason_code="allowlist_denied_post_link", - user_message="You don't have access to this bot.", - log_details={ - "channel": identity.channel, - "phase": "post_link", - "claim_source": claim_source, - }, - ) - return Denied( - reason_code="allowlist_abstained_post_link", - user_message="You don't have access to this bot.", - log_details={ - "channel": identity.channel, - "phase": "post_link", - "claim_source": claim_source, - }, - ) - - def _auto_issue_isolation_key(self, identity: ChannelIdentity) -> str: - """Auto-issue a stable isolation key for ``identity``. - - Returns the existing key when ``(channel, native_id)`` has - already been seen, or coins ``"<channel>:<native_id>"`` on - first contact. Configured :class:`IdentityLinker` instances can - return provider-backed isolation keys for flows that require - verified identity. - """ - # Look for an existing isolation_key that has already linked - # this (channel, native_id). Linear scan is fine for the - # in-process registry. Linker implementations can use their own - # indexed stores for provider-backed identities. - for isolation_key, by_channel in self._identities.items(): - existing = by_channel.get(identity.channel) - if existing is not None and existing.native_id == identity.native_id: - return isolation_key - # First contact — coin a deterministic key. - return f"{identity.channel}:{identity.native_id}" - - @property - def default_allowlist(self) -> IdentityAllowlist | None: - """Host-level fallback allowlist applied to channels with ``allowlist="inherit"``.""" - return self._default_allowlist - - @property - def runtime_mode(self) -> RuntimeMode: - """The resolved runtime mode for this host. - - Either ``"long_running"`` or ``"ephemeral"``. Resolved at - construction from the ``runtime_mode`` constructor argument or - — when unset — auto-detected from deployment environment - markers; see :func:`_detect_runtime_mode`.
Advisory: the value - drives the *defaults* selected for runtime-shape-dependent - components (today, the durable task runner) and is logged at - startup for operator visibility. - """ - return self._runtime_mode - - @property - def durable_task_runner(self) -> DurableTaskRunner: - """The durable task runner used to dispatch non-originating pushes. - - Defaults to a process-local :class:`InProcessTaskRunner` when no - runner was supplied at construction. Adapter packages may - replace this with a durable backend (e.g. Foundry-native - scheduling, ``agent-framework-hosting-durabletask``); the host - itself only relies on the :class:`DurableTaskRunner` Protocol - surface so any conforming implementation is usable. - """ - return self._durable_task_runner - def serve( self, *, @@ -1257,9 +803,7 @@ def _log_startup( Called from both :meth:`serve` (which knows the bind triple) and the ASGI lifespan ``startup`` phase (which does not — the host may be embedded under any caller-managed ASGI server). - Bind fields are omitted from the log line when unknown so - operators can still spot the runtime-mode banner under - externally-managed servers. + Bind fields are omitted from the log line when unknown. 
""" target_kind = "Workflow" if isinstance(self.target, Workflow) else type(self.target).__name__ target_name = getattr(self.target, "name", None) or target_kind @@ -1270,16 +814,12 @@ def _log_startup( is_hosted = bool(os.environ.get("FOUNDRY_HOSTING_ENVIRONMENT")) bind = f"{host}:{port}" if host is not None and port is not None else "" logger.info( - "AgentFrameworkHost starting: target=%s (%s) bind=%s workers=%s hosted=%s " - "runtime_mode=%s (%s) runner=%s channels=[%s]", + "AgentFrameworkHost starting: target=%s (%s) bind=%s workers=%s hosted=%s channels=[%s]", target_name, target_kind, bind, workers if workers is not None else "", is_hosted, - self._runtime_mode, - self._runtime_mode_source, - type(self._durable_task_runner).__name__, channels_repr or "", ) @@ -1329,8 +869,7 @@ async def lifespan(_app: Starlette) -> AsyncIterator[None]: # logged it (it logs eagerly so the banner appears before # control passes to hypercorn); the lifespan still logs it # for callers that mount ``host.app`` directly under their - # own ASGI server — that path otherwise wouldn't get a - # runtime-mode banner at all. + # own ASGI server. if not self._startup_logged: self._log_startup() self._startup_logged = True @@ -1341,27 +880,7 @@ async def lifespan(_app: Starlette) -> AsyncIterator[None]: # Starlette / the ASGI server still aborts boot — and log # every other failure so operators can see them all in one # log scrape rather than discovering them turn-by-turn. - # (The hosting.push handler is registered eagerly in - # ``__init__`` rather than here, so ``_deliver_response`` - # can be called without first entering the lifespan — e.g. - # in tests, or by callers driving the host without an ASGI - # server.) startup_errors: list[tuple[str, BaseException]] = [] - # Replay any persisted pending tasks first so re-scheduled - # work runs alongside fresh traffic from the moment the - # host accepts requests. 
Only meaningful for the host-owned - # in-process runner with disk persistence on; caller-owned - # runners manage their own replay lifecycle. - if ( - self._owns_runner - and isinstance(self._durable_task_runner, InProcessTaskRunner) - and self._state_paths.get("runner") is not None - ): - try: - await self._durable_task_runner.resume() - except Exception as exc: - logger.exception("lifespan startup: durable task runner resume failed") - startup_errors.append(("InProcessTaskRunner.resume", exc)) for cb in on_startup: try: await cb() @@ -1394,21 +913,6 @@ async def lifespan(_app: Starlette) -> AsyncIterator[None]: name = getattr(cb, "__qualname__", repr(cb)) logger.exception("lifespan shutdown: callback %s failed", name) shutdown_errors.append((name, exc)) - # Drain the host-owned runner after channel shutdowns — - # channels may legitimately schedule a final push while - # tearing down (e.g. a goodbye message), and we want - # those tasks to get a chance to complete before we - # cancel pending work. For caller-supplied runners we - # leave lifecycle to the caller. - if self._owns_runner and isinstance(self._durable_task_runner, InProcessTaskRunner): - try: - await self._durable_task_runner.shutdown(timeout=5.0) - except Exception as exc: # pragma: no cover - defensive - logger.exception("lifespan shutdown: durable task runner shutdown failed") - shutdown_errors.append(("InProcessTaskRunner.shutdown", exc)) - # Close the persisted sessions store after the runner so - # any in-flight task that touches session state during - # shutdown can still write through. 
if self._sessions_store is not None: try: self._sessions_store.close() @@ -1426,17 +930,18 @@ async def lifespan(_app: Starlette) -> AsyncIterator[None]: ) raise first_exc + middleware = ( + [Middleware(_FoundryIsolationASGIMiddleware)] if os.environ.get("FOUNDRY_HOSTING_ENVIRONMENT") else [] + ) return Starlette( debug=self._debug, routes=routes, lifespan=lifespan, - middleware=[Middleware(_FoundryIsolationASGIMiddleware)], + middleware=middleware, ) def _build_run_kwargs(self, request: ChannelRequest) -> dict[str, Any]: - # The full spec resolves a ChannelSession into an AgentSession here, - # honors session_mode, and consults LinkPolicy / ResponseTarget. This - # base host keys a per-isolation_key AgentSession off the channel's + # The host keys a per-isolation_key AgentSession off the channel's # session hint so context providers (FileHistoryProvider, …) on the # target see one session per end user. session = None @@ -1471,6 +976,36 @@ def _build_run_kwargs(self, request: ChannelRequest) -> dict[str, Any]: run_kwargs["options"] = request.options return run_kwargs + async def _apply_run_hook( + self, + request: ChannelRequest, + *, + hook: ChannelRunHook | None, + protocol_request: Any | None, + ) -> ChannelRequest: + """Apply a channel-supplied run hook under host ownership.""" + if hook is None: + return request + return await _apply_run_hook( + hook, + request, + target=self.target, + protocol_request=protocol_request, + ) + + async def _apply_response_hook( + self, + result: HostedRunResult[Any], + *, + request: ChannelRequest, + hook: ChannelResponseHook | None, + channel_name: str | None, + ) -> HostedRunResult[Any]: + """Apply a channel-supplied response hook under host ownership.""" + if hook is None: + return result + return await _apply_response_hook(hook, result, request=request, channel_name=channel_name) + def _log_incoming(self, request: ChannelRequest, *, stream: bool) -> None: """Emit a structured INFO summary for every incoming target 
invocation. @@ -1554,7 +1089,6 @@ def _bind_request_context(self, request: ChannelRequest) -> ExitStack: async def _invoke(self, request: ChannelRequest) -> HostedRunResult[AgentResponse]: self._log_incoming(request, stream=False) - self._record_identity(request) if self._is_workflow: # Workflow targets follow a separate path; the dedicated dispatch # is parameterised on ``WorkflowRunResult`` so the static return @@ -1579,7 +1113,6 @@ async def _invoke(self, request: ChannelRequest) -> HostedRunResult[AgentRespons def _invoke_stream(self, request: ChannelRequest) -> ResponseStream[AgentResponseUpdate, AgentResponse]: self._log_incoming(request, stream=True) - self._record_identity(request) if self._is_workflow: return self._invoke_workflow_stream(request) run_kwargs = self._build_run_kwargs(request) @@ -1711,7 +1244,7 @@ def _invoke_workflow_stream(self, request: ChannelRequest) -> ResponseStream[Age Wraps the workflow's ``ResponseStream[WorkflowEvent, WorkflowRunResult]`` in a new ``ResponseStream[AgentResponseUpdate, AgentResponse]`` so channels can iterate it identically to an agent stream and apply - their ``stream_transform_hook`` callables. + their ``stream_update_hook`` callables. Mapping rules: @@ -1799,10 +1332,10 @@ def _wrap_input(self, request: ChannelRequest) -> Message | list[Message]: Channels deliver inputs as plain text, a single ``Message``, or a list of ``Message`` (e.g. a Responses-API request that includes a ``system`` - instruction plus the user turn). To preserve channel provenance + - identity + ``response_target`` on the persisted history record (and - make it visible to context providers, evals, audits), we attach a - ``hosting`` block under ``additional_properties``. AF's + instruction plus the user turn). To preserve channel provenance and + optional identity metadata on the persisted history record (and make it + visible to context providers, evals, audits), we attach a ``hosting`` + block under ``additional_properties``. 
AF's ``Message.to_dict`` round-trips ``additional_properties`` through any ``HistoryProvider`` that serializes via ``to_dict`` (e.g. ``FileHistoryProvider``) and the framework explicitly does *not* @@ -1819,12 +1352,6 @@ def _wrap_input(self, request: ChannelRequest) -> Message | list[Message]: "native_id": request.identity.native_id, "attributes": dict(request.identity.attributes) if request.identity.attributes else {}, } - target = request.response_target - hosting_meta["response_target"] = { - "kind": target.kind.value, - "targets": list(target.targets), - } - raw = request.input if isinstance(raw, Message): raw.additional_properties = {**(raw.additional_properties or {}), "hosting": hosting_meta} @@ -1842,531 +1369,5 @@ def _wrap_input(self, request: ChannelRequest) -> Message | list[Message]: additional_properties={"hosting": hosting_meta}, ) - def _record_identity(self, request: ChannelRequest) -> None: - """Update the per-``isolation_key`` identity registry + active-channel hint. - - Called on every successful resolve. ``ResponseTarget.active`` - consumes ``self._active``; ``ResponseTarget.channel(name)`` / - ``.channels([...])`` / ``.all_linked`` consume ``self._identities``. - """ - if request.identity is None or request.session is None: - return - key = request.session.isolation_key - if not key: - return - self._identities.setdefault(key, {})[request.identity.channel] = request.identity - self._active[key] = request.identity.channel - - def _build_echo_payload(self, request: ChannelRequest) -> HostedRunResult[AgentResponse]: - """Build a ``HostedRunResult`` representing the originating user message. - - Used when ``ResponseTarget.echo_input`` is set so non-originating - destinations can mirror the user's turn before the agent reply - arrives. 
The user-facing payload is synthesised as a one-message - :class:`AgentResponse` (``role="user"``) so it flows through the - same delivery machinery as the agent's reply — channels handle - both via a single ``HostedRunResult[AgentResponse]`` shape. The - hosting metadata that ``_wrap_input`` attaches for agent - invocation is intentionally stripped: the echo is end-user-facing - and we don't leak host-internal bookkeeping onto another - channel's wire. - """ - raw = request.input - if isinstance(raw, Message): - user_messages: list[Message] = [ - Message(role="user", contents=list(raw.contents), author_name=raw.author_name), - ] - elif isinstance(raw, list) and raw and all(isinstance(m, Message) for m in raw): - user_messages = [ - Message(role="user", contents=list(m.contents), author_name=m.author_name) - for m in raw - if isinstance(m, Message) - ] - elif isinstance(raw, str): - user_messages = [Message(role="user", contents=[Content.from_text(text=raw)])] - elif isinstance(raw, Content): - user_messages = [Message(role="user", contents=[raw])] - else: - # AgentRunInputs allows other shapes (mapping, sequence of mixed - # str/Content); stringify as a defensive fallback. - user_messages = [Message(role="user", contents=[Content.from_text(text=str(raw))])] - return HostedRunResult(AgentResponse(messages=user_messages)) - - async def _deliver_payload_to_channel( - self, - channel: ChannelPush, - identity: ChannelIdentity, - payload: HostedRunResult[Any], - *, - request: ChannelRequest, - is_echo: bool, - ) -> HostedRunResult[Any]: - """Clone, run the channel's ``response_hook`` (if any), and push. - - The clone keeps fan-out free from cross-destination mutation: a - hook that rebinds ``result`` on one destination cannot leak into - the next push. Note that the clone is shallow — channels that - need to mutate ``result`` itself (rather than rebind it via - :meth:`HostedRunResult.replace`) are responsible for their own - deep copy. 
Returns the (possibly hook-shaped) payload so callers - can log post-hook diagnostics rather than the pre-hook ones. - - ``response_hook`` is duck-typed on the channel: any attribute - named ``response_hook`` that is callable participates. The - :class:`Channel` Protocol stays a small "name / path / contribute" - contract; richer surfaces stay attribute-level so adding hook - support to a new channel does not require updating the Protocol. - """ - shaped = await apply_channel_response_hook( - channel, - payload, - request=request, - destination_identity=identity, - originating=False, - is_echo=is_echo, - clone=True, - ) - await channel.push(identity, shaped) - return shaped - - async def _handle_push_task(self, payload: Mapping[str, Any]) -> None: - """Runner-side handler for ``hosting.push`` tasks. - - Unpacks a single per-destination push payload (one channel, one - identity) and runs the echo (when present) followed by the - response push. Echo failures are logged and swallowed — the - user-visible failure mode is "response delivered without - echo", *not* "no response at all". Response-push failures - re-raise so the runner can retry per the configured - :class:`RetryPolicy`. - - **Retry idempotency for the echo phase.** The payload includes a - mutable ``"echo_done"`` cursor (initialised to ``False`` at - schedule time). If a previous attempt already delivered the - echo but the response push then failed, the runner retries the - whole task; we observe ``echo_done == True`` and skip the - re-echo so end users on channels without server-side - deduplication don't see the same user-message echoed multiple - times. This is a best-effort guarantee for the in-process - runner — payload mutations don't survive process restarts. - Durable adapter packages SHOULD persist the cursor as part of - their task state (their replay machinery typically gives them - that primitive for free). 
- - Payload shape depends on the configured - :data:`DurableTaskRunner.payload_mode`: - - * Object mode (default) — live Python references: - ``channel_name``, ``identity``, ``result``, ``echo_result``, - ``echo_done``, ``request``. - * JSON mode — a single ``envelope`` produced by the - destination channel's :class:`ChannelPushCodec` plus - ``channel_name`` and ``echo_done``. The handler invokes - ``codec.decode(envelope)`` to recover the live references - before pushing. - """ - channel_name = cast(str, payload["channel_name"]) - echo_done = bool(payload.get("echo_done", False)) - - by_name = {ch.name: ch for ch in self.channels} - channel = by_name.get(channel_name) - if channel is None or not isinstance(channel, ChannelPush): - # Channel was validated at schedule time; if we ever land - # here it means the host's channel list mutated mid-flight, - # which we don't support. Log loudly and drop — re-raising - # would just cause the runner to retry forever. - logger.error( - "hosting.push: channel %r is no longer a ChannelPush; dropping task", - channel_name, - ) - return - push_channel = cast(ChannelPush, channel) - - # Recover the live references. Object-mode runners pass them - # through verbatim; JSON-mode runners persisted an envelope the - # channel's codec produced and we now ask the codec to decode - # it back. 
- envelope = payload.get("envelope") - if envelope is not None: - codec = cast("ChannelPushCodec | None", getattr(channel, "push_codec", None)) - if codec is None: - logger.error( - "hosting.push: channel %r received a JSON envelope but has no push_codec; dropping task", - channel_name, - ) - return - result, request, identity, echo_result = await codec.decode(envelope) - else: - identity = cast(ChannelIdentity, payload["identity"]) - result = cast(HostedRunResult[Any], payload["result"]) - echo_result = cast("HostedRunResult[Any] | None", payload.get("echo_result")) - request = cast(ChannelRequest, payload["request"]) - - if echo_result is not None and not echo_done: - try: - await self._deliver_payload_to_channel( - push_channel, - identity, - echo_result, - request=request, - is_echo=True, - ) - except Exception: - logger.exception( - "hosting.push: echo push failed for channel=%s native_id=%s", - channel_name, - identity.native_id, - ) - else: - # Mutate the payload mapping so a subsequent retry of - # this task (triggered by a failure in the response - # phase below) skips the echo. The in-process runner - # reuses the same mapping object across retries — see - # ``_run_with_retry``; durable adapters persist the - # cursor as part of their task state. - if isinstance(payload, dict): - payload["echo_done"] = True - logger.info( - "hosting.push: echoed user message", - extra={"channel": channel_name, "native_id": identity.native_id}, - ) - elif echo_result is not None and echo_done: - logger.debug( - "hosting.push: skipping echo on retry (already delivered)", - extra={"channel": channel_name, "native_id": identity.native_id}, - ) - - # Response phase — raise on failure so the runner retries per - # the configured retry policy. The runner is responsible for - # terminal-failure bookkeeping. 
- await self._deliver_payload_to_channel( - push_channel, - identity, - result, - request=request, - is_echo=False, - ) - logger.info( - "hosting.push: pushed agent response", - extra={"channel": channel_name, "native_id": identity.native_id}, - ) - - async def _deliver_response(self, request: ChannelRequest, payload: HostedRunResult[Any]) -> bool: - """Resolve ``request.response_target``, annotate audit metadata, and schedule pushes. - - Returns ``True`` when the originating channel should render the - agent reply on its own wire (the resolved target included the - originating channel either explicitly or via the host's "every - destination dropped, fall back to originating" recovery path). - Returns ``False`` when the reply is fanned out purely to - non-originating destinations (or :data:`ResponseTarget.none` - suppresses the reply entirely). - - Per SPEC-002 §"Intended targets + durable delivery": for any - non-``originating`` target, the originating channel returns an - acknowledgement and the actual agent reply is dispatched - **asynchronously** via the host's :class:`DurableTaskRunner` — - one scheduled task per destination, with the runner owning - retry / terminal-failure / replay semantics. - - **Immutable audit annotation.** Before scheduling, the host - annotates each resolved assistant ``Message`` in the payload - with the ``hosting.intended_targets`` list (and optionally - ``hosting.skipped_targets`` for destinations dropped at - resolution time). Persistence providers therefore observe the - host's *intent* from a single immutable write — mutable - per-destination delivery state is owned by the runner backend. - - When a destination cannot be resolved (no known native id), or - the destination channel doesn't implement :class:`ChannelPush`, - or no channel by that name is registered, it is dropped - synchronously and logged at WARNING. 
When the only resolved - destinations all drop at resolution time we fall back to - delivering on the originating channel so the user is never left - without a reply. - - When ``request.response_target.echo_input`` is True the echo - payload (the originating user message) is bundled into the - same per-destination task as the agent response — see - :meth:`_handle_push_task`. The echo is dispatched *before* the - response within that task; an echo failure does not abort the - response push, and a retried task skips an already-delivered - echo via the ``echo_done`` cursor. - - For JSON-mode runners the destination channel's - :class:`ChannelPushCodec` is called to project the in-memory - :class:`HostedRunResult` into a JSON-safe envelope before - scheduling. Codec failures - (:class:`PushPayloadNotSerializable`) abort the schedule for - that destination (logged and treated as skipped); other - destinations still get their chance. - - Each per-destination push (echo and response) goes through - :meth:`_deliver_payload_to_channel`, which clones the payload - and applies the channel's optional ``response_hook`` so - per-channel transforms (e.g. flatten multi-modal to text for a - text-only wire) can't leak across destinations. - """ - target = request.response_target - kind = target.kind - - # Fast paths for the trivial variants. - if kind == ResponseTargetKind.ORIGINATING: - return True - if kind == ResponseTargetKind.NONE: - # Background-only — drop the reply on the floor for now (no - # ContinuationToken in the prototype). - return False - - # Build the destination set. - include_originating = False - # Each entry is (channel_name, identity_override_or_None_to_lookup). 
- destinations: list[tuple[str, ChannelIdentity | None]] = [] - isolation_key = request.session.isolation_key if request.session is not None else None - known = self._identities.get(isolation_key or "", {}) - - if kind == ResponseTargetKind.ACTIVE: - active = self._active.get(isolation_key or "") - if active is None or active == request.channel: - # Fall back to originating when there's no other active - # channel known (matches the "first message" case). - self._annotate_intended_targets(payload, intended=(), skipped=()) - return True - destinations.append((active, known.get(active))) - - elif kind == ResponseTargetKind.ALL_LINKED: - for channel_name, identity in known.items(): - if channel_name == request.channel: - include_originating = True - continue - destinations.append((channel_name, identity)) - if not destinations and not include_originating: - # No links recorded yet — fall back. - self._annotate_intended_targets(payload, intended=(), skipped=()) - return True - - elif kind == ResponseTargetKind.IDENTITIES: - for ident in target.target_identities: - if ident.channel == request.channel: - # Pointing the originating channel at itself — fold - # into ``include_originating`` so the originating - # channel renders on its own wire rather than - # double-delivering via push. - include_originating = True - continue - destinations.append((ident.channel, ident)) - - elif kind == ResponseTargetKind.CHANNELS: - for entry in target.targets: - if entry == "originating": - include_originating = True - continue - if ":" in entry: - channel_name, _, native_id = entry.partition(":") - if channel_name == request.channel: - # Pointing the originating channel at itself with a - # specific native id — treat as "include - # originating" since the channel will reply on its - # own wire to that user anyway. 
- include_originating = True - continue - destinations.append((channel_name, ChannelIdentity(channel=channel_name, native_id=native_id))) - else: - if entry == request.channel: - include_originating = True - continue - destinations.append((entry, known.get(entry))) - - # Schedule per-destination push tasks via the durable runner. - by_name = {ch.name: ch for ch in self.channels} - runner_mode = getattr(self._durable_task_runner, "payload_mode", DurableTaskPayloadMode.OBJECT) - intended_tokens: list[str] = [] - skipped_tokens: list[str] = [] - echo_payload = self._build_echo_payload(request) if target.echo_input else None - for channel_name, dest_identity in destinations: - channel = by_name.get(channel_name) - token = f"{channel_name}:{dest_identity.native_id}" if dest_identity is not None else channel_name - if channel is None: - logger.warning("deliver_response: no channel named %r (target=%s)", channel_name, token) - skipped_tokens.append(token) - continue - if not isinstance(channel, ChannelPush): - logger.warning( - "deliver_response: channel %r does not implement ChannelPush (target=%s)", - channel_name, - token, - ) - skipped_tokens.append(token) - continue - if dest_identity is None: - logger.warning( - "deliver_response: no known identity for isolation_key=%s on channel=%s", - isolation_key, - channel_name, - ) - skipped_tokens.append(token) - continue - - # Build the runner payload. Object-mode runners get live - # references for speed; JSON-mode runners get a fully - # encoded envelope from the channel's push codec. 
- try: - task_payload = await self._build_push_payload( - channel=channel, - channel_name=channel_name, - identity=dest_identity, - request=request, - result=payload, - echo_payload=echo_payload, - runner_mode=runner_mode, - ) - except PushPayloadNotSerializable: - logger.exception( - "deliver_response: channel %r push codec refused payload (target=%s); skipping", - channel_name, - token, - ) - skipped_tokens.append(token) - continue - try: - await self._durable_task_runner.schedule(HOSTING_PUSH_TASK_NAME, task_payload) - except Exception: - # Schedule-time failures are a host-side outage (runner - # backend unreachable, configuration error). Log and - # treat the destination as skipped — the originating - # channel's fall-back-to-originating rule (below) keeps - # the user from being left without a reply when every - # destination dropped. - logger.exception("deliver_response: failed to schedule push for target=%s", token) - skipped_tokens.append(token) - continue - intended_tokens.append(token) - logger.info( - "deliver_response: scheduled push", - extra={"target": token, "channel": channel_name}, - ) - - if not intended_tokens and not include_originating: - # Spec policy: if every destination drops at resolution time - # (or scheduling fails universally) deliver to originating - # so the user gets a response. The runner backend still - # owns observability for any partial-failure case where at - # least one destination did get scheduled. 
- logger.warning("deliver_response: every destination dropped — falling back to originating") - include_originating = True - - self._annotate_intended_targets( - payload, - intended=tuple(intended_tokens), - skipped=tuple(skipped_tokens), - include_originating=include_originating, - originating_channel=request.channel, - ) - - return include_originating - - async def _build_push_payload( - self, - *, - channel: ChannelPush, - channel_name: str, - identity: ChannelIdentity, - request: ChannelRequest, - result: HostedRunResult[Any], - echo_payload: HostedRunResult[Any] | None, - runner_mode: DurableTaskPayloadMode, - ) -> dict[str, Any]: - """Assemble the runner payload for a single push destination. - - For object-mode runners (the default in-process runner) we - forward live references — no serialisation cost on the hot - path. For JSON-mode runners we invoke the channel's - :class:`ChannelPushCodec` once to produce a JSON-safe envelope - for the whole push triple; the codec is the only entity that - knows how to project a :class:`HostedRunResult` plus the - channel-side request/identity context for a specific channel's - wire format. - """ - if runner_mode == DurableTaskPayloadMode.OBJECT: - return { - "channel_name": channel_name, - "identity": identity, - "result": result, - "echo_result": echo_payload, - "echo_done": False, - "request": request, - } - # JSON mode — the startup validator guarantees every push-capable - # channel has a ``push_codec``. Use ``getattr`` for the same - # duck-typed lookup pattern the validator and decoder use. 
- codec = cast("ChannelPushCodec", getattr(channel, "push_codec")) # noqa: B009 - envelope = await codec.encode( - result=result, - request=request, - identity=identity, - echo_result=echo_payload, - ) - return { - "channel_name": channel_name, - "envelope": dict(envelope), - "echo_done": False, - } - - def _annotate_intended_targets( - self, - payload: HostedRunResult[Any], - *, - intended: tuple[str, ...], - skipped: tuple[str, ...], - include_originating: bool = False, - originating_channel: str | None = None, - ) -> None: - """Stamp ``additional_properties["hosting"]`` on every assistant message in the payload. - - The audit annotation is the spec's immutable record of the - host's delivery *intent* — persistence providers see what the - host meant to deliver from a single write, without ever - observing mutable per-destination state (the runner owns - that). Annotated fields: - - - ``intended_targets``: ``[[:], …]`` for - every non-originating destination whose push task was - scheduled successfully. - - ``skipped_targets``: destinations dropped at resolution time - (unknown channel, no ``ChannelPush``, no known identity, or - schedule-time outage). Useful for ops triage. - - ``includes_originating``: ``True`` when the originating - channel rendered (or will render) the reply on its own wire. - - Workflow targets producing arbitrary result objects with no - ``messages`` field are left untouched — the annotation is a - best-effort augmentation of conventional agent responses. 
- """ - result_obj = payload.result - messages_raw: Any = getattr(result_obj, "messages", None) - if not isinstance(messages_raw, list): - return - hosting_meta: dict[str, Any] = { - "intended_targets": list(intended), - "includes_originating": include_originating, - } - if skipped: - hosting_meta["skipped_targets"] = list(skipped) - if include_originating and originating_channel is not None: - hosting_meta["originating_channel"] = originating_channel - for entry in cast("list[Any]", messages_raw): # type: ignore[redundant-cast] - if not isinstance(entry, Message): - continue - message: Message = entry - if getattr(message, "role", None) != "assistant": - continue - existing = message.additional_properties or {} - existing_hosting = existing.get("hosting") if isinstance(existing, Mapping) else None - if isinstance(existing_hosting, Mapping): - merged_hosting: Mapping[str, Any] = {**existing_hosting, **hosting_meta} - else: - merged_hosting = hosting_meta - message.additional_properties = {**existing, "hosting": merged_hosting} - __all__ = ["AgentFrameworkHost", "ChannelContext", "logger"] diff --git a/python/packages/hosting/agent_framework_hosting/_persistence.py b/python/packages/hosting/agent_framework_hosting/_persistence.py index 0e2ca854386..9ecc4dd23e0 100644 --- a/python/packages/hosting/agent_framework_hosting/_persistence.py +++ b/python/packages/hosting/agent_framework_hosting/_persistence.py @@ -2,25 +2,10 @@ """Shared persistence primitives for the hosting package. -The hosting core ships with an opt-in disk-persistence layer for the -in-process task runner and the host's session-related state. The -on-disk format is provided by the ``diskcache`` package (a small, -pure-Python, sqlite-backed dependency installed via the ``[disk]`` -optional extra). - -This module centralises: - -- :func:`load_diskcache` — lazy import that raises a helpful error when - the optional extra is missing. 
-- :func:`acquire_state_dir_lock` — single-owner file lock that fails - fast when a second process points at the same directory. -- :func:`normalize_state_dir` — turn the host-level ``state_dir`` - parameter (``str`` / ``PathLike`` / :class:`HostStatePaths` / - ``Mapping``) into a normalised ``dict[component_name -> Path | None]``. - -Everything in this module is internal — public callers should go -through :class:`AgentFrameworkHost` or -:class:`InProcessTaskRunner` directly. +The simplified hosting core keeps disk persistence only for session aliases +created by :meth:`AgentFrameworkHost.reset_session` and for workflow +checkpoint path derivation. The on-disk session-alias store uses the optional +``diskcache`` package installed via the ``[disk]`` extra. """ from __future__ import annotations @@ -35,29 +20,19 @@ if TYPE_CHECKING: from ._types import HostStatePaths -# Known component keys recognised by the host's ``state_dir`` normaliser. -# Adding a new component is a non-breaking change: extend this tuple and -# add the matching key to :class:`HostStatePaths` in ``_types.py``. -_KNOWN_COMPONENTS: tuple[str, ...] = ("runner", "sessions", "checkpoints", "links") +_KNOWN_COMPONENTS: tuple[str, ...] = ("sessions", "checkpoints") def load_diskcache() -> Any: - """Lazy-import :mod:`diskcache` with a helpful error when missing. - - The ``diskcache`` package is an optional dependency installed via - the ``agent-framework-hosting[disk]`` extra. Users that never set - ``state_dir`` never trigger the import. This wrapper produces a - single, consistent error message when the import is needed but the - extra was not installed. 
- """ + """Lazy-import :mod:`diskcache` with a helpful error when missing.""" try: import diskcache # type: ignore[import-untyped] except ImportError as exc: # pragma: no cover - exercised via tests by monkeypatching raise ImportError( - "agent-framework-hosting was asked to persist state to disk " - "(state_dir is set) but the optional `diskcache` dependency " + "agent-framework-hosting was asked to persist session aliases to disk " + "(state_dir['sessions'] is set) but the optional `diskcache` dependency " "is not installed. Install the disk extra: " - "`pip install 'agent-framework-hosting[disk]'`." + "`pip install 'agent-framework-hosting[disk]'`." ) from exc return diskcache @@ -65,24 +40,11 @@ def load_diskcache() -> Any: def acquire_state_dir_lock(component_dir: Path) -> Any: """Acquire an exclusive single-owner lock on a component's state dir. - Two processes pointing at the same state directory would both scan - pending records on startup and could execute the same task twice; - we therefore enforce single-owner semantics with an OS-level - advisory lock. The lock file lives at ``/.lock`` and - is held for the lifetime of the returned file handle. Closing the - handle (or process exit) releases it. - - On Unix this uses :func:`fcntl.flock`. On Windows it uses - :func:`msvcrt.locking`. The lock is *advisory* — the OS will not - enforce it against processes that ignore it, but no - well-behaved component of this package will. - - Raises ``RuntimeError`` if another process already holds the lock. + Raises: + RuntimeError: If another process already holds the lock. """ component_dir.mkdir(parents=True, exist_ok=True) lock_path = component_dir / ".lock" - # Open in append mode so we don't truncate an existing lock file - # (some monitoring tools may inspect it).
fh = open(lock_path, "a+", encoding="utf-8") # noqa: SIM115 - kept open for lifetime try: if sys.platform == "win32": @@ -94,8 +56,7 @@ def acquire_state_dir_lock(component_dir: Path) -> Any: fh.close() raise RuntimeError( f"Another process already holds the hosting state lock at {lock_path}. " - "Two hosts (or two runners) pointing at the same state directory would " - "double-execute scheduled tasks; point each host at its own state_dir." + "Point each host at its own state_dir." ) from exc else: import fcntl @@ -106,8 +67,7 @@ def acquire_state_dir_lock(component_dir: Path) -> Any: fh.close() raise RuntimeError( f"Another process already holds the hosting state lock at {lock_path}. " - "Two hosts (or two runners) pointing at the same state directory would " - "double-execute scheduled tasks; point each host at its own state_dir." + "Point each host at its own state_dir." ) from exc except RuntimeError: raise @@ -118,15 +78,10 @@ def acquire_state_dir_lock(component_dir: Path) -> Any: def release_state_dir_lock(handle: Any) -> None: - """Release a lock previously acquired by :func:`acquire_state_dir_lock`. - - Closing the file handle is sufficient to drop the lock on both - platforms, but we make the intent explicit so the caller doesn't - have to know which mechanism (``fcntl`` vs ``msvcrt``) is in use. - """ + """Release a lock previously acquired by :func:`acquire_state_dir_lock`.""" if handle is None: return - with contextlib.suppress(Exception): # close errors are not actionable + with contextlib.suppress(Exception): handle.close() @@ -135,40 +90,27 @@ def normalize_state_dir( ) -> dict[str, Path | None]: """Resolve the host-level ``state_dir`` parameter into a per-component map. - Accepts any of: - - - ``None`` → all components return ``None`` (fully in-memory; today's behavior). - - ``str`` / :class:`os.PathLike` → all components share a parent - directory and get an auto-allocated subfolder (``runner/``, - ``sessions/``, ``checkpoints/``, ``links/``). 
- - :class:`HostStatePaths` typed dict / plain ``Mapping`` → per-key - override. Components missing from the mapping fall back to ``None`` - (in-memory only). Unknown keys raise ``ValueError`` to surface - typos early. - - Returns a ``dict[component_name -> Path | None]`` covering every - component in :data:`_KNOWN_COMPONENTS`. + Accepts ``None``, a single root path, or a mapping with ``sessions`` and + ``checkpoints`` keys. Unknown keys raise ``ValueError`` so obsolete + ``runner`` / ``links`` configuration is rejected instead of silently + doing nothing. """ result: dict[str, Path | None] = {name: None for name in _KNOWN_COMPONENTS} if state_dir is None: return result - # Strings and PathLikes use the default subfolder layout. if isinstance(state_dir, (str, os.PathLike)): root = Path(os.fspath(state_dir)) for name in _KNOWN_COMPONENTS: result[name] = root / name return result - # Mappings (incl. TypedDict at runtime) get per-component overrides. if isinstance(state_dir, Mapping): unknown = [k for k in state_dir if k not in _KNOWN_COMPONENTS] if unknown: raise ValueError( f"state_dir mapping contains unknown component key(s): {unknown!r}. " - f"Known components are: {list(_KNOWN_COMPONENTS)!r}. " - "If you are trying to use a future component, upgrade " - "agent-framework-hosting to a version that supports it." + f"Known components are: {list(_KNOWN_COMPONENTS)!r}." 
) for name in _KNOWN_COMPONENTS: raw_value: Any = state_dir.get(name) @@ -184,12 +126,3 @@ def normalize_state_dir( raise TypeError( f"state_dir must be a str, PathLike, HostStatePaths mapping, or None — got {type(state_dir).__name__}" ) - - -__all__ = [ - "_KNOWN_COMPONENTS", - "acquire_state_dir_lock", - "load_diskcache", - "normalize_state_dir", - "release_state_dir_lock", -] diff --git a/python/packages/hosting/agent_framework_hosting/_runner.py b/python/packages/hosting/agent_framework_hosting/_runner.py deleted file mode 100644 index d37fa19bc0e..00000000000 --- a/python/packages/hosting/agent_framework_hosting/_runner.py +++ /dev/null @@ -1,751 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. - -"""In-process implementation of :class:`DurableTaskRunner`. - -This is the default runner the host wires in when the operator does not -supply one. It runs tasks via :func:`asyncio.create_task` with a bounded -retry loop following the supplied :class:`RetryPolicy`. - -Two modes: - -* **In-memory** (``state_dir=None``, default) — pending tasks live as - ``asyncio.Task`` references in process memory. In-flight tasks are - lost on process death. Cheap, zero dependencies, suitable for unit - tests and for long-running deployments where "the process dies, - queued pushes are lost" is an acceptable failure mode. -* **Disk-persistent** (``state_dir=``) — pending tasks are - pickled into a :mod:`diskcache`-backed sqlite store before the - ``asyncio.Task`` is created. On the next startup the host calls - :meth:`InProcessTaskRunner.resume` which re-schedules every - surviving ``"pending"`` record with its persisted attempt count. - Graceful shutdown cancellations leave records in ``"pending"`` so - they replay on the next boot. Suitable for ``runtime_mode="long_running"`` - deployments that survive container moves / OOMs. 
- -For ``runtime_mode="ephemeral"`` deployments (Foundry Hosted Agent, -Azure Functions, Lambda) plug in a durable adapter package -(``agent-framework-hosting-durabletask`` for the gRPC TaskHub backend, -a future Foundry adapter, …) — they all implement the same -:class:`DurableTaskRunner` Protocol. - -See ``docs/specs/002-python-hosting-channels.md`` § "Durable task runner". -""" - -from __future__ import annotations - -import asyncio -import contextlib -import logging -import os -import pickle # noqa: S403 # nosec B403 - used only to validate user payloads round-trip -import time -import uuid -from collections.abc import Awaitable, Callable, Mapping -from pathlib import Path -from typing import Any, cast - -from ._persistence import ( - acquire_state_dir_lock, - load_diskcache, - release_state_dir_lock, -) -from ._types import ( - DurableTaskPayloadMode, - DurableTaskRunner, - PushPayloadNotPicklable, - RetryPolicy, - TaskHandle, - TaskStatus, -) - -logger = logging.getLogger(__name__) - - -# Keys used inside the per-task on-disk record. Kept as module constants -# so the schema is documented once and refactors are mechanical. -_REC_HANDLER_NAME = "handler_name" -_REC_PAYLOAD = "payload" -_REC_RETRY_POLICY = "retry_policy" -_REC_ATTEMPTS = "attempts_completed" -_REC_STATUS = "status" -_REC_CREATED_AT = "created_at" -_REC_TERMINAL_AT = "terminal_at" -_REC_NAME = "name" - -# Deque key inside the cache holding terminal task ids in insertion order. -# Used for FIFO eviction of terminal records once the bounded cap is hit. -_TERMINAL_ORDER_KEY = "__terminal_order__" - - -class _PersistedPayloadDict(dict[str, Any]): - """Drop-in :class:`dict` that mirrors mutations back to disk. - - Used by :class:`InProcessTaskRunner` when ``state_dir`` is set so - handler-side cursors (``echo_done``) survive process restarts. The - handler interacts with this object exactly as it would with a plain - dict; the override on :meth:`__setitem__` is the only difference. 
- - Held weakly by the runner so handlers that capture the dict in - long-lived closures don't keep the runner alive past its natural - lifetime. - """ - - # Type annotation for the persist callback; the actual attribute is - # assigned via the __slots__-aware ``object.__setattr__`` dance - # below so PyPy doesn't reject the assignment on a ``dict`` subclass. - _persist_cb: Callable[[Mapping[str, Any]], None] - - __slots__ = ("_persist_cb",) - - def __init__( - self, - data: Mapping[str, Any], - persist_cb: Callable[[Mapping[str, Any]], None], - ) -> None: - super().__init__(data) - # Use object.__setattr__ to bypass the __slots__ checker on - # dict subclasses (CPython is liberal here but PyPy is strict). - object.__setattr__(self, "_persist_cb", persist_cb) - - def __setitem__(self, key: str, value: Any) -> None: - super().__setitem__(key, value) - # Re-serialise after each mutation. The cache stores opaque - # pickled values, so partial-field updates aren't possible — - # we send the whole payload mapping every time. Mutations on - # the runner's hot path are rare (just the ``echo_done`` - # cursor today) so this is fine. - self._persist_cb(dict(self)) - - -class InProcessTaskRunner(DurableTaskRunner): - """In-memory or disk-persistent :class:`DurableTaskRunner`. - - Schedules each task as an :func:`asyncio.create_task` coroutine and - retries on exception up to ``RetryPolicy.max_attempts`` times with - exponential backoff. Terminal status (``succeeded`` / ``failed`` / - ``cancelled``) is reported via :meth:`get`. - - Re-registration of the same handler name after :meth:`schedule` has - been called is rejected to avoid silent re-orderings of in-flight - work; the host registers all handlers at startup, before serving - traffic. - - Keyword Args: - default_retry_policy: Per-runner default :class:`RetryPolicy`; - overridable per-task at :meth:`schedule` call sites. - terminal_cache_size: Maximum number of terminal task records to - retain. 
Older entries are FIFO-evicted so a long-running - host can't accumulate unbounded status entries. - shutdown_grace_seconds: Window :meth:`shutdown` waits for - in-flight tasks to drain before cancelling stragglers. - state_dir: When set, the runner persists pending and terminal - task records under this directory (a :mod:`diskcache` - sqlite store at ``/cache.db`` and a single-owner - lock at ``/.lock``). Persisted pending records - survive process restarts and are replayed by :meth:`resume`. - When ``None`` (default) the runner is purely in-memory and - in-flight tasks are lost on process death. Requires the - optional ``diskcache`` dependency — install with - ``pip install 'agent-framework-hosting[disk]'``. - """ - - # Declared at class level so the ``DurableTaskRunner`` Protocol's - # ``payload_mode`` attribute resolves on instances without needing - # to assign it in ``__init__``. - payload_mode: DurableTaskPayloadMode = DurableTaskPayloadMode.OBJECT - - def __init__( - self, - *, - default_retry_policy: RetryPolicy | None = None, - terminal_cache_size: int = 1024, - shutdown_grace_seconds: float = 5.0, - state_dir: str | os.PathLike[str] | None = None, - ) -> None: - self._handlers: dict[str, Callable[[Mapping[str, Any]], Awaitable[None]]] = {} - self._default_retry_policy = default_retry_policy or RetryPolicy() - self._terminal_cache_size = terminal_cache_size - # How long ``shutdown()`` waits for in-flight tasks to finish on - # their own before cancelling them. Channels may legitimately - # schedule a final push during their own shutdown callback - # (goodbye message, telemetry flush), so the runner gives them - # this window to complete before cancellation kicks in. - self._shutdown_grace_seconds = shutdown_grace_seconds - - # Operational state. ``_pending`` holds asyncio tasks that are - # scheduled or running. 
``_terminal`` is an in-memory mirror of - # the most recent terminal statuses (kept in-memory regardless of - # ``state_dir`` so ``get`` is fast and works before/without the - # cache being opened). - self._pending: dict[str, asyncio.Task[None]] = {} - self._terminal: dict[str, TaskStatus] = {} - self._terminal_order: list[str] = [] - - # Set to True on the first ``schedule``/``resume`` call so subsequent - # ``register`` calls fail loudly rather than silently swapping a - # handler out from under in-flight work. - self._started = False - - # Set to True when ``shutdown()`` starts so the retry loop's - # ``CancelledError`` handler distinguishes "the runner is going - # down, leave my record in 'pending' for resume()" from "this - # task was explicitly cancelled, mark it 'cancelled'". - self._shutting_down = False - - # Disk persistence — opt-in via ``state_dir``. ``None`` keeps - # the runner pure-memory (the default behaviour). - self._state_dir: Path | None = Path(os.fspath(state_dir)) if state_dir is not None else None - self._cache: Any = None - self._terminal_deque: Any = None - self._lock_handle: Any = None - if self._state_dir is not None: - self._open_cache() - - # ------------------------------------------------------------------ # - # Cache lifecycle - # ------------------------------------------------------------------ # - - def _open_cache(self) -> None: - """Open the disk cache and acquire the single-owner lock. - - Called from ``__init__`` when ``state_dir`` is set. Splitting it - out keeps the constructor body readable and gives tests a clean - seam for monkeypatching. - """ - if self._state_dir is None: # pragma: no cover - guarded by caller - raise RuntimeError("_open_cache called without state_dir") - diskcache = load_diskcache() - # Acquire the directory lock *before* opening the cache so two - # runners pointed at the same dir don't both try to initialise - # sqlite. The lock handle stays open for the runner's lifetime. 
- self._lock_handle = acquire_state_dir_lock(self._state_dir) - try: - self._cache = diskcache.Cache(str(self._state_dir)) - # Re-hydrate the in-memory terminal mirror so ``get`` works - # for task ids that completed in a prior process. Doing this - # here (rather than lazily) means the mirror is consistent - # the moment construction returns. - order: Any = self._cache.get(_TERMINAL_ORDER_KEY, default=[]) - if not isinstance(order, list): - # Defensive: a corrupted ordering list shouldn't take - # the host down. Reset and continue — at worst we lose - # ordering for FIFO eviction, not correctness. - logger.warning( - "InProcessTaskRunner: terminal-order entry in %s is not a list; resetting", self._state_dir - ) - order = [] - self._cache.set(_TERMINAL_ORDER_KEY, order) - self._terminal_order = [str(x) for x in cast(list[Any], order)] - for task_id in self._terminal_order: - rec_obj: Any - try: - rec_obj = self._cache.get(task_id) - except Exception: # pragma: no cover - exercised via corrupt-entry test - rec_obj = None - if not isinstance(rec_obj, dict): - continue - rec = cast(dict[str, Any], rec_obj) - status = rec.get(_REC_STATUS) - if status in {"succeeded", "failed", "cancelled"}: - self._terminal[task_id] = status - except Exception: - release_state_dir_lock(self._lock_handle) - self._lock_handle = None - raise - - # ------------------------------------------------------------------ # - # DurableTaskRunner Protocol - # ------------------------------------------------------------------ # - - def register( - self, - name: str, - handler: Callable[[Mapping[str, Any]], Awaitable[None]], - ) -> None: - if self._started: - raise RuntimeError( - f"InProcessTaskRunner.register({name!r}) called after the " - "runner started scheduling tasks — register all handlers at " - "host startup, before serving traffic, to avoid silently " - "reordering in-flight work." 
- ) - if name in self._handlers: - logger.warning("InProcessTaskRunner: replacing handler registered under %r", name) - self._handlers[name] = handler - - async def schedule( - self, - name: str, - payload: Mapping[str, Any], - *, - retry_policy: RetryPolicy | None = None, - ) -> TaskHandle: - if name not in self._handlers: - raise KeyError( - f"InProcessTaskRunner.schedule({name!r}): no handler " - "registered under this name. Call register(name, handler) " - "at host startup before scheduling." - ) - - self._started = True - policy = retry_policy or self._default_retry_policy - task_id = uuid.uuid4().hex - handle = TaskHandle(task_id=task_id, name=name) - - # Persist the record (when state_dir is set) BEFORE we spawn the - # asyncio task — if the persistence write fails we surface it as - # a synchronous error from ``schedule`` rather than silently - # downgrading to in-memory. - if self._cache is not None: - record = self._build_record(name, dict(payload), policy) - self._validate_picklable(record) - self._cache.set(task_id, record) - - # When persisted, wrap the payload so handler-side mutations - # (e.g. ``payload["echo_done"] = True``) flow back to disk. 
- runtime_payload: Mapping[str, Any] - if self._cache is not None: - captured_task_id = task_id - - def _persist_cb(new_payload: Mapping[str, Any]) -> None: - self._update_record_payload(captured_task_id, new_payload) - - runtime_payload = _PersistedPayloadDict(payload, _persist_cb) - else: - runtime_payload = payload - - handler = self._handlers[name] - task = asyncio.create_task( - self._run_with_retry(handle, handler, runtime_payload, policy), - name=f"hosting.task[{name}]:{task_id}", - ) - self._pending[task_id] = task - - def _on_done(_t: asyncio.Task[None], tid: str = task_id) -> None: - self._pending.pop(tid, None) - - task.add_done_callback(_on_done) - return handle - - async def get(self, handle: TaskHandle) -> TaskStatus | None: - if handle.task_id in self._pending: - task = self._pending[handle.task_id] - if task.cancelled(): - return "cancelled" - return "running" - # In-memory terminal mirror covers both pure-memory and - # disk-persistent runs (we re-hydrate on cache open). - if handle.task_id in self._terminal: - return self._terminal[handle.task_id] - # Disk fallback for very-aged task ids that left the in-memory - # mirror but still have a record on disk (extremely unlikely - # given that we re-hydrate all terminals at open, but defensive). - if self._cache is not None: - rec_obj: Any = self._cache.get(handle.task_id) - if isinstance(rec_obj, dict): - rec = cast(dict[str, Any], rec_obj) - status = rec.get(_REC_STATUS) - # Records on disk only live in one of four states: - # ``pending`` (queued or in-flight — resume picks these - # up) or one of the terminals. There is no transient - # ``running`` status; the in-flight asyncio task is - # observable via ``_pending`` only inside its own - # process. 
- if status in {"succeeded", "failed", "cancelled", "pending"}: - return cast(TaskStatus, status) - return None - - # ------------------------------------------------------------------ # - # Resume — replay persisted pending records on startup - # ------------------------------------------------------------------ # - - async def resume(self) -> int: - """Re-schedule pending tasks persisted by a previous process. - - Walks the cache for records in ``"pending"`` status, looks up - their handler in :attr:`_handlers`, and re-creates an - :class:`asyncio.Task` for each — preserving the persisted - attempt count so retry budgets resume mid-way through their - backoff schedule. - - Records whose handler is no longer registered are marked - ``"failed"`` with a clear reason in the log; they will not be - retried again. Records that fail to deserialise (corrupted - sqlite row, schema drift, …) are quarantined: their entry is - removed from the cache and the task id is logged. Both classes - of error are non-fatal — the host should boot even when a - small number of legacy records can't be replayed. - - Returns the number of records successfully re-scheduled. - - Called automatically from :class:`AgentFrameworkHost`'s lifespan - startup hook when the runner is host-owned. Callers driving the - runner directly (tests, bespoke ASGI setups) MUST call this - once after registering handlers and before serving traffic. - """ - if self._cache is None: - return 0 - - # Mark started so subsequent register() calls fail loudly — we - # don't want handler swaps after replay begins. - self._started = True - - replayed = 0 - # iterkeys returns a live view; we copy to a list because we may - # delete entries inside the loop (quarantine / drop-on-missing-handler). 
- task_ids: list[str] = [str(k) for k in self._cache.iterkeys() if k != _TERMINAL_ORDER_KEY] - for task_id in task_ids: - rec_obj: Any - try: - rec_obj = self._cache.get(task_id) - except Exception: - logger.exception("InProcessTaskRunner.resume: failed to read record %s; quarantining", task_id) - with contextlib.suppress(KeyError): - del self._cache[task_id] - continue - if not isinstance(rec_obj, dict) or _REC_STATUS not in rec_obj: - logger.warning("InProcessTaskRunner.resume: record %s is not a task dict; quarantining", task_id) - with contextlib.suppress(KeyError): - del self._cache[task_id] - continue - rec = cast(dict[str, Any], rec_obj) - status = rec[_REC_STATUS] - if status != "pending": - continue - - handler_name = rec.get(_REC_HANDLER_NAME) - if not isinstance(handler_name, str) or handler_name not in self._handlers: - logger.warning( - "InProcessTaskRunner.resume: no handler registered for record %s (handler=%r); marking failed", - task_id, - handler_name, - ) - self._mark_terminal(task_id, "failed") - continue - handler = self._handlers[handler_name] - - policy_value = rec.get(_REC_RETRY_POLICY) or self._default_retry_policy - if not isinstance(policy_value, RetryPolicy): - # Legacy / corrupt entry — fall back to the default rather - # than failing the whole resume. 
- policy_value = self._default_retry_policy - policy: RetryPolicy = policy_value - payload_value: Any = rec.get(_REC_PAYLOAD) or {} - payload: dict[str, Any] - if isinstance(payload_value, dict): - payload = cast(dict[str, Any], payload_value) - elif hasattr(payload_value, "keys"): - payload = dict(cast(Mapping[str, Any], payload_value)) - else: - payload = {} - - name_value = rec.get(_REC_NAME, handler_name) - handle = TaskHandle(task_id=task_id, name=str(name_value)) - attempts_value = rec.get(_REC_ATTEMPTS, 0) - attempts_completed = int(attempts_value or 0) - - def _make_resume_persist_cb(tid: str) -> Callable[[Mapping[str, Any]], None]: - def _cb(new_payload: Mapping[str, Any]) -> None: - self._update_record_payload(tid, new_payload) - - return _cb - - runtime_payload = _PersistedPayloadDict(payload, _make_resume_persist_cb(task_id)) - - task = asyncio.create_task( - self._run_with_retry(handle, handler, runtime_payload, policy, _resume_from_attempt=attempts_completed), - name=f"hosting.task[{handle.name}]:{task_id}(resumed)", - ) - self._pending[task_id] = task - - def _on_done(_t: asyncio.Task[None], tid: str = task_id) -> None: - self._pending.pop(tid, None) - - task.add_done_callback(_on_done) - replayed += 1 - - if replayed: - logger.info( - "InProcessTaskRunner.resume: re-scheduled %d pending task(s) from %s", replayed, self._state_dir - ) - return replayed - - # ------------------------------------------------------------------ # - # Lifecycle helper (the host calls this from ``on_shutdown``) - # ------------------------------------------------------------------ # - - async def shutdown(self, *, timeout: float | None = None) -> None: - """Wait briefly for pending tasks to drain, then cancel anything still running. - - Called by the host on ``on_shutdown`` so a graceful shutdown does - not orphan in-flight push retries. Channels may legitimately - schedule a final push from their own shutdown callback (e.g. 
a - goodbye message); the runner therefore *waits* up to - ``timeout`` seconds (default: the runner's - ``shutdown_grace_seconds`` configured at construction) for the - in-flight set to finish on its own before cancelling stragglers. - Tasks that don't honour cancellation within the same window are - abandoned — the runner makes no synchronous durability claim, - so cleanup is best-effort. - - When ``state_dir`` is set, tasks that didn't drain are left in - ``"pending"`` status on disk so the next process replays them - via :meth:`resume`. The disk cache is closed and the - single-owner lock is released regardless of drain outcome. - """ - self._shutting_down = True - try: - if self._pending: - grace = timeout if timeout is not None else self._shutdown_grace_seconds - tasks = list(self._pending.values()) - # Phase 1 — wait for natural completion within the grace window. - if grace > 0: - await asyncio.wait(tasks, timeout=grace) - # Phase 2 — cancel anything still pending, then wait briefly for - # cancellation to propagate. 
-            still_pending = [t for t in tasks if not t.done()]
-            if still_pending:
-                logger.info(
-                    "InProcessTaskRunner.shutdown: %d task(s) still running after %.2fs grace; cancelling",
-                    len(still_pending),
-                    grace,
-                )
-                for task in still_pending:
-                    task.cancel()
-                cancellation_window = max(grace, 1.0)
-                try:
-                    await asyncio.wait_for(
-                        asyncio.gather(*still_pending, return_exceptions=True),
-                        timeout=cancellation_window,
-                    )
-                except (TimeoutError, asyncio.TimeoutError):
-                    logger.warning(
-                        "InProcessTaskRunner.shutdown: %d task(s) did not exit within %.2fs "
-                        "of cancellation; abandoning",
-                        sum(not t.done() for t in still_pending),
-                        cancellation_window,
-                    )
-        finally:
-            # Release disk resources after the in-flight set has been
-            # given a chance to drain — tasks that mutate the payload
-            # mid-shutdown will fail to persist after this point, which
-            # is the correct behaviour (the next process will replay
-            # from whatever the last fully-committed state was).
-            if self._cache is not None:
-                try:
-                    self._cache.close()
-                except Exception:  # pragma: no cover - close errors aren't actionable
-                    logger.exception("InProcessTaskRunner.shutdown: failed to close cache cleanly")
-                self._cache = None
-            if self._lock_handle is not None:
-                release_state_dir_lock(self._lock_handle)
-                self._lock_handle = None
-
-    # ------------------------------------------------------------------ #
-    # Internals — retry loop
-    # ------------------------------------------------------------------ #
-
-    async def _run_with_retry(
-        self,
-        handle: TaskHandle,
-        handler: Callable[[Mapping[str, Any]], Awaitable[None]],
-        payload: Mapping[str, Any],
-        policy: RetryPolicy,
-        *,
-        _resume_from_attempt: int = 0,
-    ) -> None:
-        delay = policy.initial_backoff_seconds
-        attempt = _resume_from_attempt
-        try:
-            while True:
-                attempt += 1
-                # Persist the attempt counter BEFORE we invoke the
-                # handler so a crash mid-handler doesn't lose the fact
-                # that we tried — replay sees the bumped counter and
-                # respects the original retry budget. Trade-off: a
-                # crash before the external call is made still consumes
-                # one attempt (at-most-once semantics around the bump);
-                # we document this as best-effort across crashes.
-                self._update_record_attempts(handle.task_id, attempt)
-
-                try:
-                    await handler(payload)
-                except asyncio.CancelledError:
-                    # On a graceful shutdown of a disk-persistent runner
-                    # we deliberately *don't* mark the record terminal —
-                    # ``resume()`` will pick it up on the next boot and
-                    # replay it with the persisted attempt counter. For
-                    # in-memory runners (no cache) there's nothing to
-                    # resume from, so we still mark ``cancelled`` so
-                    # callers holding the handle can observe the
-                    # outcome.
-                    if not (self._shutting_down and self._cache is not None):
-                        self._mark_terminal(handle.task_id, "cancelled")
-                    raise
-                except Exception as exc:
-                    if attempt >= policy.max_attempts:
-                        logger.exception(
-                            "InProcessTaskRunner: task %s (%s) failed after %d attempts",
-                            handle.name,
-                            handle.task_id,
-                            attempt,
-                        )
-                        self._mark_terminal(handle.task_id, "failed")
-                        return
-                    logger.warning(
-                        "InProcessTaskRunner: task %s (%s) attempt %d/%d failed (%s); retrying in %.2fs",
-                        handle.name,
-                        handle.task_id,
-                        attempt,
-                        policy.max_attempts,
-                        exc,
-                        delay,
-                    )
-                    try:
-                        await asyncio.sleep(delay)
-                    except asyncio.CancelledError:
-                        if not (self._shutting_down and self._cache is not None):
-                            self._mark_terminal(handle.task_id, "cancelled")
-                        raise
-                    delay = min(delay * policy.backoff_multiplier, policy.max_backoff_seconds)
-                else:
-                    self._mark_terminal(handle.task_id, "succeeded")
-                    return
-        except asyncio.CancelledError:
-            # Propagate so the outer ``asyncio.Task`` records cancellation
-            # in its own state for any observer that holds the raw task.
-            return
-
-    # ------------------------------------------------------------------ #
-    # Internals — record / disk helpers
-    # ------------------------------------------------------------------ #
-
-    def _build_record(
-        self,
-        name: str,
-        payload: Mapping[str, Any],
-        policy: RetryPolicy,
-    ) -> dict[str, Any]:
-        """Construct the on-disk record dict for a freshly-scheduled task."""
-        return {
-            _REC_HANDLER_NAME: name,
-            _REC_NAME: name,
-            _REC_PAYLOAD: dict(payload),
-            _REC_RETRY_POLICY: policy,
-            _REC_ATTEMPTS: 0,
-            _REC_STATUS: "pending",
-            _REC_CREATED_AT: time.time(),
-        }
-
-    def _validate_picklable(self, record: Mapping[str, Any]) -> None:
-        """Pickle-probe a record at schedule time so misconfig is loud.
-
-        We only do this when the cache is open (i.e. persistence is on).
-        The probe runs ``pickle.dumps`` on the record and raises a
-        framework-typed :class:`PushPayloadNotPicklable` if it fails.
-        Loud failure here is better than silent data loss after the
-        next restart.
-        """
-        try:
-            pickle.dumps(record)  # nosec B301 - dumps only, no untrusted load
-        except Exception as exc:
-            raise PushPayloadNotPicklable(
-                "InProcessTaskRunner: scheduled task payload is not picklable; "
-                "disk persistence (state_dir) requires payloads to round-trip "
-                "through pickle. Common causes: a user-supplied response that "
-                "embeds a live network client, asyncio.Lock, or generator. "
-                f"Underlying pickle error: {exc!r}"
-            ) from exc
-
-    def _update_record_attempts(self, task_id: str, attempt: int) -> None:
-        """Bump the attempt counter on the persisted record (if any).
-
-        Status stays ``"pending"`` while the task is in-flight — there
-        is no transient ``"running"`` status. This keeps the resume
-        contract simple: anything ``"pending"`` on disk is a candidate
-        for replay, whether it was never picked up or crashed mid-attempt.
-        """
-        if self._cache is None:
-            return
-        rec = self._cache.get(task_id)
-        if not isinstance(rec, dict):
-            # Record was evicted / quarantined since schedule; nothing
-            # to persist. The asyncio task continues — it just won't
-            # be resumable on next boot.
-            return
-        rec[_REC_ATTEMPTS] = attempt
-        try:
-            self._cache.set(task_id, rec)
-        except Exception:  # pragma: no cover - cache write failures aren't actionable
-            logger.exception("InProcessTaskRunner: failed to persist attempt counter for %s", task_id)
-
-    def _update_record_payload(self, task_id: str, new_payload: Mapping[str, Any]) -> None:
-        """Persist a handler-side payload mutation back to disk.
-
-        Called from :class:`_PersistedPayloadDict.__setitem__`. The whole
-        payload mapping is re-written (the cache stores opaque pickled
-        values, so partial-field updates aren't possible). Handler-side
-        mutations on the runner's hot path are rare (today: only the
-        ``echo_done`` cursor) so the extra write is acceptable.
-        """
-        if self._cache is None:
-            return
-        rec = self._cache.get(task_id)
-        if not isinstance(rec, dict):
-            return
-        rec[_REC_PAYLOAD] = dict(new_payload)
-        try:
-            self._cache.set(task_id, rec)
-        except Exception:  # pragma: no cover - cache write failures aren't actionable
-            logger.exception("InProcessTaskRunner: failed to persist payload mutation for %s", task_id)
-
-    def _mark_terminal(self, task_id: str, status: TaskStatus) -> None:
-        """Move a task to a terminal status, updating both memory and disk.
-
-        Records are first updated on disk (so a crash between the disk
-        write and the in-memory write doesn't lose the terminal status),
-        then mirrored to the in-memory cache, then FIFO-bounded.
-        """
-        # Disk side first.
-        if self._cache is not None:
-            rec = self._cache.get(task_id)
-            if isinstance(rec, dict):
-                rec[_REC_STATUS] = status
-                rec[_REC_TERMINAL_AT] = time.time()
-                # Truncate heavy fields (payload, retry_policy) — once
-                # the task is terminal we never need them again, and
-                # keeping them around bloats disk on long-lived hosts.
-                rec[_REC_PAYLOAD] = None
-                rec[_REC_RETRY_POLICY] = None
-                try:
-                    self._cache.set(task_id, rec)
-                except Exception:  # pragma: no cover
-                    logger.exception("InProcessTaskRunner: failed to persist terminal status for %s", task_id)
-
-        # In-memory side.
-        if task_id not in self._terminal:
-            self._terminal_order.append(task_id)
-        self._terminal[task_id] = status
-
-        # FIFO-evict from BOTH layers once we exceed the cap.
-        while len(self._terminal_order) > self._terminal_cache_size:
-            evicted = self._terminal_order.pop(0)
-            self._terminal.pop(evicted, None)
-            if self._cache is not None:
-                try:
-                    del self._cache[evicted]
-                except KeyError:
-                    pass
-                except Exception:  # pragma: no cover
-                    logger.exception("InProcessTaskRunner: failed to evict %s from disk cache", evicted)
-
-        # Persist the new ordering list so a restart sees the same FIFO
-        # ordering for further eviction decisions.
-        if self._cache is not None:
-            try:
-                self._cache.set(_TERMINAL_ORDER_KEY, list(self._terminal_order))
-            except Exception:  # pragma: no cover
-                logger.exception("InProcessTaskRunner: failed to persist terminal-order list")
-
-
-__all__ = ["InProcessTaskRunner"]
diff --git a/python/packages/hosting/agent_framework_hosting/_state_store.py b/python/packages/hosting/agent_framework_hosting/_state_store.py
index 1ca62a0a7d5..817f9bd52e1 100644
--- a/python/packages/hosting/agent_framework_hosting/_state_store.py
+++ b/python/packages/hosting/agent_framework_hosting/_state_store.py
@@ -1,49 +1,11 @@
 # Copyright (c) Microsoft. All rights reserved.
 
-"""Disk-backed wrappers for the host's in-memory state dicts.
+"""Disk-backed wrapper for the host's session-alias map.
 
-The host keeps three in-process dictionaries that need to survive a
-process restart when the operator opts in to disk persistence:
-
-- ``_session_aliases`` (``isolation_key -> active session_id``): rotated
-  by :meth:`AgentFrameworkHost.reset_session`; without persistence a
-  restart silently re-uses the pre-rotation session_id and the user sees
-  history they were supposed to have walked away from.
-- ``_active`` (``isolation_key -> last-seen channel name``): drives
-  :class:`ResponseTarget` ``.active`` fan-out; losing it on restart makes
-  :class:`ResponseTarget.active` raise ``"no active channel"`` for every
-  user the host has previously talked to.
-- ``_identities``
-  (``isolation_key -> {channel_name -> ChannelIdentity}``): the per-user
-  channel registry that powers :class:`ResponseTarget` ``.channel(name)``,
-  ``.channels([...])`` and ``.all_linked``; losing it on restart turns
-  every linked-identity push target into a not-found.
-
-Both wrappers are :class:`dict` subclasses so the rest of the host code
-doesn't need to know whether persistence is on or off; the only
-difference is that mutations are mirrored back to a
-:mod:`diskcache`-backed sqlite store. Reads stay fast because the
-in-memory copy is the source of truth — disk is purely a backing
-store for write-through and re-hydration.
-
-Layout under ``/sessions/`` (the ``sessions`` component
-chosen because all three dicts share the same per-user-life cycle):
-
-    /sessions/
-        .lock  # single-owner lock (advisory)
-        cache.db, …  # diskcache sqlite files
-            keyed by:
-                "aliases:" -> str (session_id)
-                "active:" -> str (channel name)
-                "identities:" -> dict[channel_name, ChannelIdentity]
-
-Pickle is what diskcache uses by default; the wrappers do not impose
-their own serialisation. :class:`ChannelIdentity` is a frozen dataclass
-of plain scalars and so round-trips cleanly.
-
-Everything in this module is internal. Public consumers should use
-:class:`AgentFrameworkHost(state_dir=...)` and let the host wire the
-wrappers up.
+``AgentFrameworkHost.reset_session(isolation_key)`` rotates future requests for
+that isolation key onto a new session id. Persisting the alias map lets that
+rotation survive a host restart without introducing cross-channel identity or
+delivery state into the core host.
 """
 
 from __future__ import annotations
@@ -52,7 +14,7 @@
 import os
 from collections.abc import Mapping
 from pathlib import Path
-from typing import Any, TypeVar, cast
+from typing import Any, TypeVar
 
 from ._persistence import (
     acquire_state_dir_lock,
@@ -62,27 +24,12 @@
 
 logger = logging.getLogger(__name__)
 
-
 _V = TypeVar("_V")
-
-
-# Key prefixes inside the shared sessions cache. Three logical maps live
-# in one diskcache so they share a single sqlite handle and a single
-# directory lock — opening multiple diskcaches against the same
-# directory is supported but doubles file-handle pressure and the
-# per-open lock acquisition cost.
 _ALIASES_PREFIX = "aliases:"
-_ACTIVE_PREFIX = "active:"
-_IDENTITIES_PREFIX = "identities:"
 
 
 class SessionsStateStore:
-    """One disk cache + lock shared by every host-side persisted dict.
-
-    The host constructs one of these per ``state_dir["sessions"]`` value
-    and threads it into each :class:`_PersistedDict` it creates. Closing
-    the store releases the lock and the cache handle.
-    """
+    """One disk cache + lock for host-side session aliases."""
 
     def __init__(self, sessions_dir: str | os.PathLike[str]) -> None:
         self._sessions_dir: Path = Path(os.fspath(sessions_dir))
@@ -97,22 +44,11 @@ def __init__(self, sessions_dir: str | os.PathLike[str]) -> None:
 
     @property
     def cache(self) -> Any:
-        """Return the underlying :mod:`diskcache` Cache.
-
-        Intended for the wrapper classes in this module only. Callers
-        outside the module should go through the typed wrappers — direct
-        cache access bypasses the key-prefix discipline that keeps the
-        three maps from colliding.
-        """
+        """Return the underlying :mod:`diskcache` Cache."""
         return self._cache
 
     def close(self) -> None:
-        """Close the cache and release the directory lock.
-
-        Safe to call multiple times. The host invokes this from its
-        lifespan shutdown hook so a second host can re-open the same
-        ``state_dir`` cleanly after the first exits.
-        """
+        """Close the cache and release the directory lock."""
         if self._cache is not None:
             try:
                 self._cache.close()
@@ -125,15 +61,7 @@ def close(self) -> None:
 
 
 class _PersistedDict(dict[str, _V]):
-    """Drop-in :class:`dict` whose mutations mirror to a diskcache prefix.
-
-    Used for the host's flat ``str -> V`` dicts (``_session_aliases``
-    and ``_active``). The in-memory copy is the source of truth for
-    reads; writes update memory first and then mirror to disk so a
-    crash between the two leaves the in-memory state correct (which is
-    what subsequent reads will see anyway) and only loses the last
-    not-yet-flushed value on next restart.
-    """
+    """Drop-in :class:`dict` whose mutations mirror to a diskcache prefix."""
 
     def __init__(
         self,
@@ -144,26 +72,20 @@ def __init__(
         super().__init__()
         self._store = store
         self._prefix = key_prefix
-        # Rehydrate from disk into memory exactly once at construction.
-        # Doing this here (rather than lazily) keeps the in-memory dict
-        # behaviour consistent with the non-persisted code path —
-        # ``len(host._session_aliases)`` reflects all known users from
-        # the moment the host is constructed.
         cache: Any = store.cache
         for raw_key in cache.iterkeys():
             if not isinstance(raw_key, str) or not raw_key.startswith(key_prefix):
                 continue
-            value: Any
             try:
-                value = cache.get(raw_key)
+                value: Any = cache.get(raw_key)
             except Exception:
                 logger.exception("SessionsStateStore: failed to rehydrate %s; skipping", raw_key)
                 continue
             logical_key = raw_key[len(key_prefix) :]
             super().__setitem__(logical_key, value)
         if initial:
-            for k, v in initial.items():
-                self[k] = v
+            for key, value in initial.items():
+                self[key] = value
 
     def __setitem__(self, key: str, value: _V) -> None:
         super().__setitem__(key, value)
@@ -182,9 +104,7 @@ def __delitem__(self, key: str) -> None:
             logger.exception("SessionsStateStore: failed to evict %s%s", self._prefix, key)
 
     def pop(self, key: str, *args: Any) -> _V:
-        # ``dict.pop`` doesn't go through ``__delitem__``, so we mirror
-        # the disk side here explicitly. Forward the default sentinel
-        # only when present so we match ``dict.pop`` semantics exactly.
+        """Mirror ``dict.pop`` to disk."""
         value: _V = super().pop(key, *args)
         try:
             del self._store.cache[self._prefix + key]
@@ -195,16 +115,17 @@ def pop(self, key: str, *args: Any) -> _V:
         return value
 
     def clear(self) -> None:
+        """Mirror ``dict.clear`` to disk."""
         keys = list(self.keys())
         super().clear()
         cache = self._store.cache
-        for k in keys:
+        for key in keys:
             try:
-                del cache[self._prefix + k]
+                del cache[self._prefix + key]
             except KeyError:
                 pass
             except Exception:  # pragma: no cover
-                logger.exception("SessionsStateStore: failed to evict %s%s during clear", self._prefix, k)
+                logger.exception("SessionsStateStore: failed to evict %s%s during clear", self._prefix, key)
 
     def update(  # type: ignore[override]
         self,
@@ -212,191 +133,14 @@ def update(  # type: ignore[override]
         /,
         **kwargs: _V,
     ) -> None:
-        # Defer to __setitem__ so every entry is mirrored to disk; the
-        # default ``dict.update`` writes into the underlying storage
-        # directly and would skip our persistence hook.
+        """Mirror ``dict.update`` to disk one item at a time."""
         if other is not None:
-            for k in other:
-                self[k] = other[k]
-        for k, v in kwargs.items():
-            self[k] = v
-
-
-class _PersistedNestedDict(dict[str, dict[str, _V]]):
-    """Disk-backed wrapper for the per-isolation-key identity map.
-
-    The host's ``_identities`` is a nested dict
-    ``isolation_key -> {channel_name -> ChannelIdentity}``. The whole
-    inner dict for a given isolation_key is small (one entry per channel
-    the user has appeared on), so we persist the inner dict as a single
-    cache value rather than per-channel — fewer cache hits, simpler
-    schema, no need for a separate sub-prefix.
-
-    To make mutations of the inner dict mirror to disk, ``__getitem__``
-    returns a ``_NestedInnerProxy`` that mutates the parent's cache slot
-    on each ``__setitem__`` / ``__delitem__``. The wrapper is purely
-    additive — callers that pass a plain dict in via ``__setitem__`` get
-    the same write-through behaviour for free.
-    """
-
-    def __init__(
-        self,
-        store: SessionsStateStore,
-        key_prefix: str = _IDENTITIES_PREFIX,
-    ) -> None:
-        super().__init__()
-        self._store = store
-        self._prefix = key_prefix
-        cache: Any = store.cache
-        for raw_key in cache.iterkeys():
-            if not isinstance(raw_key, str) or not raw_key.startswith(key_prefix):
-                continue
-            value: Any
-            try:
-                value = cache.get(raw_key)
-            except Exception:
-                logger.exception("SessionsStateStore: failed to rehydrate %s; skipping", raw_key)
-                continue
-            if not isinstance(value, dict):
-                continue
-            inner_value = cast(dict[str, _V], value)
-            logical_key = raw_key[len(key_prefix) :]
-            # Wrap so caller-side mutations on the inner dict mirror back.
-            inner: _NestedInnerProxy[_V] = _NestedInnerProxy(self, logical_key, inner_value)
-            super().__setitem__(logical_key, inner)
-
-    def __setitem__(self, key: str, value: dict[str, _V]) -> None:
-        # Wrap whatever the caller passes in so subsequent ``inner[ch] = ...``
-        # mutations are mirrored to disk. We always wrap (even
-        # ``_NestedInnerProxy`` inputs) so the proxy's ``_outer`` link
-        # points at us rather than at any previous outer dict.
-        wrapped = _NestedInnerProxy(self, key, dict(value))
-        super().__setitem__(key, wrapped)
-        self.persist_inner(key, dict(value))
-
-    def __delitem__(self, key: str) -> None:
-        super().__delitem__(key)
-        try:
-            del self._store.cache[self._prefix + key]
-        except KeyError:
-            pass
-        except Exception:  # pragma: no cover
-            logger.exception("SessionsStateStore: failed to evict %s%s", self._prefix, key)
-
-    def setdefault(self, key: str, default: dict[str, _V] | None = None) -> dict[str, _V]:  # type: ignore[override]
-        if key in self:
-            return self[key]
-        if default is None:
-            default = {}
-        self[key] = default
-        return self[key]
-
-    def persist_inner(self, isolation_key: str, snapshot: Mapping[str, _V]) -> None:
-        """Write the full inner dict for ``isolation_key`` back to disk.
-
-        Called from :class:`_NestedInnerProxy` on every mutation and by
-        :meth:`__setitem__` when a new outer key is added. A single
-        write per change keeps the schema simple — there is no
-        partial-row update — and is fine for the access pattern
-        (mutations on the host's hot path are rare: identity registry
-        writes are once-per-channel-per-user).
-        """
-        try:
-            self._store.cache.set(self._prefix + isolation_key, snapshot)
-        except Exception:  # pragma: no cover - cache write failures aren't actionable
-            logger.exception(
-                "SessionsStateStore: failed to persist identities for %s%s",
-                self._prefix,
-                isolation_key,
-            )
-
-
-class _NestedInnerProxy(dict[str, _V]):
-    """Inner-dict proxy that mirrors mutations back to its outer.
-
-    Returned by :class:`_PersistedNestedDict.__getitem__` (via the
-    rehydration / ``__setitem__`` wrap). When the channel-registry code
-    does ``self._identities[ik][channel_name] = identity``, the
-    ``__setitem__`` on this proxy fires and re-writes the whole inner
-    dict to disk via the parent's ``persist_inner``. Behavioural
-    identity with ``dict`` is preserved otherwise (``len``, iteration,
-    ``__contains__``, …).
-    """
-
-    _outer: _PersistedNestedDict[_V]
-    _key: str
-
-    __slots__ = ("_key", "_outer")
-
-    def __init__(
-        self,
-        outer: _PersistedNestedDict[_V],
-        key: str,
-        data: Mapping[str, _V],
-    ) -> None:
-        super().__init__(data)
-        # ``__slots__`` on a ``dict`` subclass requires the back-door —
-        # CPython is lenient, PyPy is strict.
-        object.__setattr__(self, "_outer", outer)
-        object.__setattr__(self, "_key", key)
-
-    def __setitem__(self, key: str, value: _V) -> None:
-        super().__setitem__(key, value)
-        self._outer.persist_inner(self._key, dict(self))
-
-    def __delitem__(self, key: str) -> None:
-        super().__delitem__(key)
-        self._outer.persist_inner(self._key, dict(self))
-
-    def pop(self, key: str, *args: Any) -> _V:
-        value: _V = super().pop(key, *args)
-        self._outer.persist_inner(self._key, dict(self))
-        return value
-
-    def clear(self) -> None:
-        super().clear()
-        self._outer.persist_inner(self._key, dict(self))
-
-    def update(  # type: ignore[override]
-        self,
-        other: Mapping[str, _V] | None = None,
-        /,
-        **kwargs: _V,
-    ) -> None:
-        if other is not None:
-            for k in other:
-                super().__setitem__(k, other[k])
-        for k, v in kwargs.items():
-            super().__setitem__(k, v)
-        self._outer.persist_inner(self._key, dict(self))
-
-
-def build_session_dicts(
-    store: SessionsStateStore,
-) -> tuple[
-    _PersistedDict[str],
-    _PersistedDict[str],
-    _PersistedNestedDict[Any],
-]:
-    """Construct the three host-side persisted dicts against a single store.
-
-    Returns ``(session_aliases, active, identities)`` in the order the
-    host assigns them, so the call site reads
-    ``self._session_aliases, self._active, self._identities = build_session_dicts(store)``.
-    """
-    aliases: _PersistedDict[str] = _PersistedDict(store, _ALIASES_PREFIX)
-    active: _PersistedDict[str] = _PersistedDict(store, _ACTIVE_PREFIX)
-    identities: _PersistedNestedDict[Any] = _PersistedNestedDict(store)
-    return aliases, active, identities
+            for key in other:
+                self[key] = other[key]
+        for key, value in kwargs.items():
+            self[key] = value
 
 
-# Re-export keys for tests / power users that want to inspect the cache.
-__all__ = [
-    "_ACTIVE_PREFIX",
-    "_ALIASES_PREFIX",
-    "_IDENTITIES_PREFIX",
-    "SessionsStateStore",
-    "_PersistedDict",
-    "_PersistedNestedDict",
-    "build_session_dicts",
-]
+def build_session_aliases(store: SessionsStateStore) -> dict[str, str]:
+    """Return the disk-backed session-alias map for ``store``."""
+    return _PersistedDict[str](store, _ALIASES_PREFIX)
diff --git a/python/packages/hosting/agent_framework_hosting/_types.py b/python/packages/hosting/agent_framework_hosting/_types.py
index 93dcc437a0c..23c172d9a70 100644
--- a/python/packages/hosting/agent_framework_hosting/_types.py
+++ b/python/packages/hosting/agent_framework_hosting/_types.py
@@ -11,12 +11,7 @@
 These types form the boundary between the host and individual channels. A
 channel parses its native payload, builds a :class:`ChannelRequest`, and
 hands it to :class:`ChannelContext.run` (or ``run_stream``) on the host.
-The host normalizes the request into a single agent invocation and either
-returns the result to the originating channel or fans out via
-:class:`ResponseTarget` to other channels that implement
-:class:`ChannelPush`.
-
-See ``docs/specs/002-python-hosting-channels.md`` for the full design.
+The channel owns rendering the result back onto its originating protocol.
 """
 
 from __future__ import annotations
@@ -24,16 +19,11 @@
 import os
 from collections.abc import Awaitable, Callable, Mapping, Sequence
 from dataclasses import dataclass, field
-from enum import Enum
-from typing import TYPE_CHECKING, Any, Generic, Literal, Protocol, TypedDict, TypeVar, cast, runtime_checkable
+from typing import TYPE_CHECKING, Any, Generic, Protocol, TypedDict, TypeVar, runtime_checkable
 
 from agent_framework import (
-    AgentResponse,
     AgentResponseUpdate,
     AgentRunInputs,
-    ResponseStream,
-    SupportsAgentRun,
-    Workflow,
 )
 from starlette.routing import BaseRoute
 
@@ -41,11 +31,6 @@
     from ._host import ChannelContext
 
 
-# --------------------------------------------------------------------------- #
-# Channel-neutral request envelope
-# --------------------------------------------------------------------------- #
-
-
 class ChannelSession:
     """Channel-supplied session hint.
 
@@ -59,15 +44,12 @@ def __init__(self, isolation_key: str | None = None) -> None:
 
 
 class ChannelIdentity:
-    """Channel-native identity the host sees on each request.
+    """Channel-native identity metadata observed on a request.
 
-    Consumed by the host's identity registry. The host uses it for two things:
-
-    1. Recording the active channel for an ``isolation_key`` so
-       ``ResponseTarget.active`` resolves correctly.
-    2. Telling :class:`ChannelPush` ``push`` recipients **where** in their
-       native namespace to deliver — Telegram uses ``native_id`` as the
-       chat id, Teams as the conversation/AAD id, etc.
+    The simplified hosting core records this only on the persisted input
+    message's ``additional_properties["hosting"]`` block and forwards it
+    through run/response hooks. Cross-channel linking and recipient lookup are
+    follow-up concerns, not part of the v1 host contract.
     """
 
     def __init__(
@@ -81,176 +63,6 @@ def __init__(
         self.attributes: Mapping[str, Any] = attributes if attributes is not None else dict()
 
 
-class ResponseTargetKind(str, Enum):
-    """Discriminator for :class:`ResponseTarget` variants."""
-
-    ORIGINATING = "originating"
-    ACTIVE = "active"
-    CHANNELS = "channels"
-    ALL_LINKED = "all_linked"
-    IDENTITIES = "identities"
-    NONE = "none"
-
-
-class ResponseTarget:
-    """Per-request directive controlling **where** the host delivers the agent reply.
-
-    Independent of ``session_mode``. Construct via the classmethod helpers or
-    use the module-level singletons rather than touching ``kind`` directly.
-    Variants:
-
-    - ``ResponseTarget.originating`` (default) — synchronous response on the
-      originating channel only.
-    - ``ResponseTarget.active`` — push to the channel most recently observed
-      for the resolved ``isolation_key``.
-    - ``ResponseTarget.channel("teams")`` / ``.channels([...])`` — push to
-      one or more named destinations. Each entry is either a bare channel
-      name (host resolves the native id from its identity registry) or a
-      ``"channel:native_id"`` token (used verbatim). The pseudo-name
-      ``"originating"`` includes the originating channel in the fan-out.
-    - ``ResponseTarget.identity(ChannelIdentity)`` /
-      ``.identities([ChannelIdentity, ...])`` — push to one or more
-      **fully-specified identities**. Preferred over the ``"channel:native_id"``
-      string variant when the destination needs ``identity.attributes``
-      preserved (Teams conversation/thread metadata, Slack channel+thread,
-      Bot Framework service-url, etc.).
-    - ``ResponseTarget.all_linked`` — push to every channel where the
-      resolved ``isolation_key`` has been observed.
-    - ``ResponseTarget.none`` — background-only; in the prototype this just
-      suppresses the originating reply (no ``ContinuationToken`` yet).
-
-    Instances are intended to be treated as immutable; the singletons are
-    shared across the process.
-    """
-
-    def __init__(
-        self,
-        kind: ResponseTargetKind = ResponseTargetKind.ORIGINATING,
-        targets: tuple[str, ...] = (),
-        identities: tuple[ChannelIdentity, ...] = (),
-        *,
-        echo_input: bool = False,
-    ) -> None:
-        self.kind = kind
-        self.targets = targets
-        # Stored under a non-clashing name so the ``identities``
-        # *classmethod* (the public builder) can coexist with the
-        # value accessor (the ``identities`` property below). At
-        # runtime instance attributes shadow class attributes anyway,
-        # but type checkers see the classmethod and reject reassignment.
-        self._target_identities: tuple[ChannelIdentity, ...] = tuple(identities)
-        # When True, the host first pushes the originating user message
-        # to every non-originating destination (so end-user apps observing
-        # those channels can keep their UI in sync) before pushing the
-        # agent response. Defaults to False — opt-in only, because not
-        # every channel knows how to render ``role="user"`` content
-        # gracefully on its own surface.
-        self.echo_input = echo_input
-
-    @property
-    def target_identities(self) -> tuple[ChannelIdentity, ...]:
-        """Destination identities for ``kind == IDENTITIES`` targets.
-
-        Public name distinct from the :meth:`identities` classmethod
-        builder. Empty for non-``IDENTITIES`` kinds.
-        """
-        return self._target_identities
-
-    # -- builders ---------------------------------------------------------- #
-
-    @classmethod
-    def channel(cls, name: str, *, echo_input: bool = False) -> ResponseTarget:
-        """Target a single named destination channel."""
-        return cls(kind=ResponseTargetKind.CHANNELS, targets=(name,), echo_input=echo_input)
-
-    @classmethod
-    def channels(cls, names: Sequence[str], *, echo_input: bool = False) -> ResponseTarget:
-        """Target an explicit list of destination channels."""
-        return cls(kind=ResponseTargetKind.CHANNELS, targets=tuple(names), echo_input=echo_input)
-
-    @classmethod
-    def identity(cls, identity: ChannelIdentity, *, echo_input: bool = False) -> ResponseTarget:
-        """Target a single fully-specified :class:`ChannelIdentity`.
-
-        Preferred over the ``"channel:native_id"`` string token in
-        :meth:`channels` when ``identity.attributes`` carries metadata the
-        destination channel needs (Teams conversation/thread ids and
-        service-url, Slack channel + thread, Bot Framework activity-locator
-        fields, etc.). The host pushes to the named identity verbatim
-        without consulting its own identity registry.
-        """
-        return cls(kind=ResponseTargetKind.IDENTITIES, identities=(identity,), echo_input=echo_input)
-
-    @classmethod
-    def identities(cls, identities: Sequence[ChannelIdentity], *, echo_input: bool = False) -> ResponseTarget:
-        """Target an explicit list of fully-specified :class:`ChannelIdentity` objects.
-
-        See :meth:`identity` for the single-destination variant.
-        """
-        return cls(kind=ResponseTargetKind.IDENTITIES, identities=tuple(identities), echo_input=echo_input)
-
-    # -- value semantics --------------------------------------------------- #
-    # ``ResponseTarget`` is treated as immutable, so two instances with the
-    # same ``kind`` + ``targets`` + ``identities`` + ``echo_input`` are
-    # interchangeable. Tests and channel parsers compare instances with
-    # ``==`` and use them as dict keys.
-
-    def __eq__(self, other: object) -> bool:
-        if not isinstance(other, ResponseTarget):
-            return NotImplemented
-        return (
-            self.kind is other.kind
-            and self.targets == other.targets
-            and _identities_equal(self._target_identities, other._target_identities)
-            and self.echo_input == other.echo_input
-        )
-
-    def __hash__(self) -> int:
-        # ``ChannelIdentity`` is not itself hashable (mutable attributes
-        # mapping); fold the identifying triple so two ``identities``
-        # tuples with the same channel/native_id/attributes content hash
-        # the same.
-        identities_key = tuple(
-            (i.channel, i.native_id, tuple(sorted(i.attributes.items()))) for i in self._target_identities
-        )
-        return hash((self.kind, self.targets, identities_key, self.echo_input))
-
-    def __repr__(self) -> str:
-        suffix = ", echo_input=True" if self.echo_input else ""
-        if self.kind is ResponseTargetKind.CHANNELS:
-            return f"ResponseTarget.channels({list(self.targets)!r}{suffix})"
-        if self.kind is ResponseTargetKind.IDENTITIES:
-            return f"ResponseTarget.identities({list(self._target_identities)!r}{suffix})"
-        return f"ResponseTarget.{self.kind.value}{suffix}"
-
-
-def _identities_equal(left: tuple[ChannelIdentity, ...], right: tuple[ChannelIdentity, ...]) -> bool:
-    """Structural-equality helper for ``ResponseTarget.identities`` comparisons.
-
-    ``ChannelIdentity`` is a plain class without ``__eq__``, so ``tuple`` /
-    ``list`` comparisons fall back to identity equality which is too strict
-    for value-typed ``ResponseTarget`` callers (two equivalent identity
-    tuples produced independently would otherwise compare unequal).
-    """
-    if len(left) != len(right):
-        return False
-    for a, b in zip(left, right, strict=True):
-        if a.channel != b.channel or a.native_id != b.native_id:
-            return False
-        if dict(a.attributes) != dict(b.attributes):
-            return False
-    return True
-
-
-# Module-level singletons so callers can write ``ResponseTarget.originating``
-# (matching the spec's classmethod-style notation) without juggling Python's
-# no-zero-arg-classmethod-property limitation.
-ResponseTarget.originating = ResponseTarget(kind=ResponseTargetKind.ORIGINATING)  # type: ignore[attr-defined]
-ResponseTarget.active = ResponseTarget(kind=ResponseTargetKind.ACTIVE)  # type: ignore[attr-defined]
-ResponseTarget.all_linked = ResponseTarget(kind=ResponseTargetKind.ALL_LINKED)  # type: ignore[attr-defined]
-ResponseTarget.none = ResponseTarget(kind=ResponseTargetKind.NONE)  # type: ignore[attr-defined]
-
-
 @dataclass
 class ChannelRequest:
     """Uniform invocation envelope every channel produces from its native payload.
@@ -260,16 +72,15 @@ class ChannelRequest:
     """
 
     channel: str
-    operation: str  # e.g. "message.create", "command.invoke"
+    operation: str
     input: AgentRunInputs
     session: ChannelSession | None = None
     options: Mapping[str, Any] | None = None
-    session_mode: str = "auto"  # "auto" | "required" | "disabled"
+    session_mode: str = "auto"
     metadata: Mapping[str, Any] = field(default_factory=lambda: {})
     attributes: Mapping[str, Any] = field(default_factory=lambda: {})
     stream: bool = False
     identity: ChannelIdentity | None = None
-    response_target: ResponseTarget = field(default_factory=lambda: ResponseTarget.originating)  # type: ignore[attr-defined]
 
 
 class ChannelCommand:
@@ -335,42 +146,11 @@ class _Unset:
 
 
 class HostedRunResult(Generic[TResult]):
-    r"""Channel-neutral envelope around the target's full-fidelity result.
- - Carries the underlying execution payload **unchanged** so channels - (and developer-supplied ``response_hook``\\s) can read everything the - target produced — full multi-modal contents, structured ``value``, - ``usage_details``, ``response_id``, workflow per-executor outputs, - final ``WorkflowRunState``, etc. - - ``result`` is generic in ``TResult`` so callers retain static typing: - - * Agent targets always produce - ``HostedRunResult[AgentResponse]`` — channels read - ``result.messages``, ``result.value``, ``result.usage_details``, … - directly. - * Workflow targets produce ``HostedRunResult[WorkflowRunResult]`` - today (``Workflow`` is not itself generic, so the static narrowing - is only as tight as ``Workflow.run``'s return). Channels iterate - ``result.get_outputs()`` and inspect ``result.get_final_state()`` - to render workflow-specific UX. When a host author drives the - workflow themselves and knows the final-output type, they may - narrow to ``HostedRunResult[MyOutput]`` in their own - ``response_hook`` signatures. - * The echo-input phase synthesises an ``HostedRunResult[AgentResponse]`` - wrapping the originating user turn so the same per-destination - delivery machinery applies. - - The optional ``session`` slot carries the resolved - :class:`~agent_framework.AgentSession` the host bound to this - invocation (``None`` for workflow targets, which do not own session - state in the agent sense). Channels that want to surface session - metadata (e.g. echo the resolved isolation key into a response - header) read it here. - - Treat instances as immutable: the host clones per-destination before - invoking a per-channel ``response_hook`` so one channel's transform - cannot perturb the payload another destination observes. + """Channel-neutral envelope around the target's full-fidelity result. + + The host does not flatten or pre-shape the target output. 
Channels and + response hooks read the underlying result type directly and serialize the + subset their wire format can carry. """ def __init__( @@ -388,575 +168,45 @@ def replace( result: TResult | _Unset = _UNSET, session: Any | _Unset | None = _UNSET, ) -> HostedRunResult[TResult]: - """Return a shallow copy with the supplied fields overridden. - - Used by the host's delivery layer to clone the envelope before - applying a per-destination ``response_hook``, so one channel's - transform cannot mutate the payload another destination sees. - The clone is shallow — channels that need to mutate - ``result.messages`` (or any other nested mutable container) are - responsible for deep-cloning that container themselves. - """ + """Return a shallow copy with the supplied fields overridden.""" new: HostedRunResult[TResult] = HostedRunResult.__new__(HostedRunResult) # pyright: ignore[reportUnknownVariableType] new.result = self.result if isinstance(result, _Unset) else result new.session = self.session if isinstance(session, _Unset) else session return new -class DurableTaskPayloadMode(str, Enum): - """How a :class:`DurableTaskRunner` consumes scheduled-task payloads. - - Used by the host's startup validator to pair a runner's persistence - expectations with the channels' push-codec capabilities. Adapter packages - pick the right value for their backing store. - - * ``OBJECT`` — the runner accepts live Python objects in the payload. - No serialization is required; the host's - :class:`InProcessTaskRunner` is the canonical example. Suitable for - ``runtime_mode="long_running"`` deployments where the runner shares - address space with the producer. - * ``JSON`` — the runner persists the payload (database, durable queue, - Foundry scheduled-task store, …) and replays it after a process - restart. Payloads MUST be JSON-serializable, which constrains what - the host can put on the wire. 
The host validates at construction - that every push-capable channel exposes a - :class:`ChannelPushCodec` (so :class:`HostedRunResult` payloads can - be reduced to a JSON envelope before scheduling). - """ - - OBJECT = "object" - JSON = "json" - - -# A push-codec implementation reduces the ``(result, request, identity)`` -# triple a destination channel will receive into a JSON-safe envelope that -# a durable :class:`DurableTaskRunner` can persist, and reconstructs the -# rendering inputs on the consumer side. The host *invokes* the codec -# during scheduling; the destination channel implements it (the channel -# knows what shape of payload it can render). -# -# Channels with no push codec are usable only with object-mode runners -# (the default :class:`InProcessTaskRunner`) — the host validates this at -# construction so the mismatch surfaces eagerly rather than on first push. -class ChannelPushCodec(Protocol): - """Optional capability: serialise the push envelope for a durable task runner. - - Implementations live on the destination channel (alongside ``push``) - as a duck-typed ``push_codec`` attribute. The host's - :meth:`_deliver_response` invokes :meth:`encode` once per scheduled - push (in JSON-mode runner deployments) to produce a JSON-safe - envelope for the runner; the handler calls :meth:`decode` - immediately before invoking :meth:`ChannelPush.push`. Object-mode - runners (the default in-process runner) bypass the codec entirely - and pass live references through verbatim. - - Encoded envelopes MUST be JSON-serialisable - (``dict``/``list``/``str``/``int``/``float``/``bool``/``None``). - Channels that cannot satisfy this for some inputs (e.g. 
arbitrary - workflow result objects without a stable schema) SHOULD raise a - typed :class:`PushPayloadNotSerializable` from :meth:`encode` - rather than return a best-effort representation; the host surfaces - that as a schedule-time error and the destination is treated as - skipped (other destinations still get their chance). - """ - - async def encode( - self, - *, - result: HostedRunResult[Any], - request: ChannelRequest, - identity: ChannelIdentity, - echo_result: HostedRunResult[Any] | None, - ) -> Mapping[str, Any]: - """Project the in-memory push triple into a JSON-safe envelope.""" - ... - - async def decode( - self, - envelope: Mapping[str, Any], - ) -> tuple[HostedRunResult[Any], ChannelRequest, ChannelIdentity, HostedRunResult[Any] | None]: - """Reconstruct ``(result, request, identity, echo_result)`` from an envelope.""" - ... - - -class PushPayloadNotSerializable(RuntimeError): - """Raised by a :class:`ChannelPushCodec` when the payload cannot be serialised. - - Channels raise this from :meth:`ChannelPushCodec.encode` when the - inbound :class:`HostedRunResult` carries content the codec has no - JSON projection for (e.g. an arbitrary workflow result with no - declared schema). The host surfaces the error eagerly at schedule - time rather than letting the runner discover it after persisting - a half-formed envelope. - """ - - -class PushPayloadNotPicklable(RuntimeError): - """Raised when a disk-persistent runner cannot pickle a scheduled task payload. - - The in-process runner falls back to pickle when ``state_dir`` is set - so a long-running host can resume in-flight pushes across restarts. - Most :class:`HostedRunResult` payloads (frozen dataclasses wrapping - :class:`AgentResponse` or workflow output) pickle without issue, but - a user-supplied workflow result or response hook may embed an - unpickleable object (live network client, ``asyncio.Lock``, generator). 
- The runner raises this at schedule time so the misconfig is loud - rather than silently downgrading to no-persistence. - """ - - class HostStatePaths(TypedDict, total=False): """Per-component disk paths for host-managed state. - Pass an instance of this typed dict to - :class:`~agent_framework_hosting._host.AgentFrameworkHost`'s - ``state_dir`` parameter when you want to place individual components - on different volumes — for example, a fast local SSD for the runner - task queue and a network-attached durable volume for session state - that needs to survive container moves. - - All keys are optional (``total=False``): unset components fall back - to in-memory storage (or, for ``checkpoints``, to no checkpoint - persistence). Pass a single ``str``/``PathLike`` to ``state_dir`` - instead to get the default subfolder layout - (``state_dir/runner/``, ``state_dir/sessions/``, - ``state_dir/checkpoints/``, ``state_dir/links/``). - - Future components (continuations, ledger) will be added as additional - keys in subsequent releases. + Only session aliases and workflow checkpoints remain in the simplified + host. Linking stores, active-channel maps, identity registries, and runner + queues are follow-up concerns. """ - runner: str | os.PathLike[str] - """Where :class:`~agent_framework_hosting._runner.InProcessTaskRunner` - persists its pending-task queue and bounded terminal-status cache. - Required for in-flight push retries to survive process restarts.""" - sessions: str | os.PathLike[str] - """Where the host persists session aliases (from - :meth:`AgentFrameworkHost.reset_session`), the per-isolation-key - identity registry, and the last-active-channel map. 
Required for - ``ResponseTarget.active``/``.channel``/``.all_linked`` to find - destinations after a restart, and for ``reset_session`` rotations - to survive a restart.""" + """Where the host persists session aliases created by ``reset_session``.""" checkpoints: str | os.PathLike[str] - """Where the host persists workflow checkpoints for ``Workflow`` - targets. Equivalent to passing ``checkpoint_location=`` - directly: the host wraps it in a per-isolation-key - :class:`~agent_framework.FileCheckpointStorage`. Ignored when the - target is a ``SupportsAgentRun`` agent (a warning is emitted if you - set it explicitly via the mapping form). Pass the legacy - ``checkpoint_location`` parameter instead when you need to supply a - :class:`~agent_framework.CheckpointStorage` instance — it takes - precedence over this key.""" - - links: str | os.PathLike[str] - """Where identity-linker implementations persist their link store: - pending link challenges/grants, channel-native identity to linked - isolation-key mappings, and verified-claim metadata. The core host - does not impose a storage format; concrete :class:`IdentityLinker` - implementations that support host-provided persistence receive this - path via ``configure_link_store_path``. If a linker manages its own - persistence, omit this key or configure that linker directly.""" - - -# A transform hook runs over each AgentResponseUpdate as the channel consumes -# the stream. It can return a replacement update, ``None`` to drop the update, -# or be async. Channels apply it during iteration so that channel-specific -# concerns (e.g. masking, redaction, formatting for the wire) live close to -# the channel rather than on the agent. 
-ChannelStreamTransformHook = Callable[ + """Where the host persists workflow checkpoints for ``Workflow`` targets.""" + + +ChannelStreamUpdateHook = Callable[ [AgentResponseUpdate], "AgentResponseUpdate | Awaitable[AgentResponseUpdate | None] | None", ] -# --------------------------------------------------------------------------- # -# Channel run hook -# --------------------------------------------------------------------------- # - - -# Run hooks accept the channel-built ``ChannelRequest`` and return a -# (possibly modified) replacement. Channels invoke the hook with both the -# request and the channel-side context as keyword arguments — the call -# convention is ``await hook(request, target=..., protocol_request=...)``. -# -# The ergonomic minimum for a hook implementation is therefore a function -# accepting ``request`` positionally plus ``**kwargs`` and returning a -# (possibly mutated) :class:`ChannelRequest`. Hooks that need the agent -# target or the raw channel-native payload pull them off the keyword -# arguments by name (``target`` / ``protocol_request``). -# -# ``protocol_request`` is the raw, channel-native payload the channel -# parsed (the JSON body for Responses, the Telegram ``Update`` dict, the -# Bot Framework ``Activity`` for Teams). Use it when the hook needs a -# field the channel did not lift onto ``ChannelRequest`` (e.g. OpenAI's -# ``safety_identifier``, Teams' ``from.aadObjectId``, …). ChannelRunHook = Callable[..., "Awaitable[ChannelRequest] | ChannelRequest"] -async def apply_run_hook( - hook: ChannelRunHook, - request: ChannelRequest, - *, - target: SupportsAgentRun | Workflow, - protocol_request: Any | None, -) -> ChannelRequest: - """Channel-side helper to invoke a :data:`ChannelRunHook` with the standard kwargs. - - Channels call this rather than calling the hook directly so the - invocation convention (``request`` positional, ``target`` / - ``protocol_request`` keyword) is enforced in one place. 
- """ - result = hook(request, target=target, protocol_request=protocol_request) - if isinstance(result, Awaitable): - return await result - return result - - -# --------------------------------------------------------------------------- # -# Channel response hook -# --------------------------------------------------------------------------- # - - -class ChannelResponseContext: - """Per-destination context handed to a :data:`ChannelResponseHook`. - - Response hooks run on the *output* side of the host pipeline, after - the agent / workflow has produced a :class:`HostedRunResult` but - before the destination channel serialises it to its wire format. - Hooks may need to make decisions based on *where* the payload is - headed — e.g. flatten multi-modal output to text for a text-only - destination, or pick which content variant to deliver to a card- - capable channel. The context captures that information without - forcing hooks to parse stringly destination tokens. - """ - - def __init__( - self, - request: ChannelRequest, - channel_name: str, - destination_identity: ChannelIdentity | None, - originating: bool, - is_echo: bool = False, - ) -> None: - self.request = request - self.channel_name = channel_name - # ``None`` when the originating channel is rendering its own reply - # (no push identity needed for "respond on the wire you came in - # on") or when the destination is named without a known native id. - self.destination_identity = destination_identity - # True when this hook invocation is for the originating channel's - # synchronous reply. False for non-originating push targets. - self.originating = originating - # True when the payload being shaped is the user-message echo - # rather than the agent response (only happens when - # ``ResponseTarget.echo_input`` is set). - self.is_echo = is_echo - - -# Response hooks accept the :class:`HostedRunResult` the host has assembled -# and return a (possibly modified) replacement. 
Channels invoke the hook -# with both the payload and the per-destination -# :class:`ChannelResponseContext` as keyword arguments — the call -# convention is ``await hook(result, context=...)``. -# -# The ergonomic minimum for a hook implementation is a function accepting -# ``result`` positionally plus ``**kwargs`` and returning a (possibly -# rewritten) :class:`HostedRunResult`. Hooks that need to branch on the -# destination read it off the ``context`` keyword argument. -# -# ``HostedRunResult`` is generic in the underlying ``result`` type; the -# hook callable signature stays ``Any``-typed so a single -# ``response_hook`` attribute on a channel can serve both agent -# (``HostedRunResult[AgentResponse]``) and workflow -# (``HostedRunResult[WorkflowRunResult]``) payloads — channels narrow -# at hook entry if they need static checking. ChannelResponseHook = Callable[..., "Awaitable[HostedRunResult[Any]] | HostedRunResult[Any]"] -async def apply_response_hook( - hook: ChannelResponseHook, - result: HostedRunResult[Any], - *, - context: ChannelResponseContext, -) -> HostedRunResult[Any]: - """Channel-side helper to invoke a :data:`ChannelResponseHook` with the standard kwargs. - - Channels (and the host's delivery layer) call this rather than calling - the hook directly so the invocation convention (``result`` positional, - ``context`` keyword) is enforced in one place. - """ - out = hook(result, context=context) - if isinstance(out, Awaitable): - return await out - return out - - -# --------------------------------------------------------------------------- # -# Channel protocols -# --------------------------------------------------------------------------- # - - @runtime_checkable class Channel(Protocol): - """A pluggable adapter that exposes one transport on the host. - - Channels publish their routes, commands, and lifecycle callbacks via - :meth:`contribute`. 
The host mounts them under the channel's ``path`` - (or at the app root when ``path == ""``) and gives the channel a - :class:`ChannelContext` so it can call back into the host to invoke - the agent target and deliver responses. - """ + """A pluggable adapter that exposes one transport on the host.""" name: str - path: str # default endpoint path (e.g. "/responses"); use "" to mount contributed routes at the app root + path: str def contribute(self, context: ChannelContext) -> ChannelContribution: ... - - -@runtime_checkable -class ChannelPush(Protocol): - r"""Optional capability: a channel that can deliver outbound messages without a prior request. - - Per SPEC-002 (req #13), channels that can do proactive delivery - (Telegram bot proactive message, Teams proactive bot message, - webhook callbacks, SSE broadcasts) implement ``push`` on top of the - base :class:`Channel` protocol. Channels without push can only be - addressed as the ``originating`` :class:`ResponseTarget`. - - Distinguishing user echoes from agent replies - --------------------------------------------- - When the originating :class:`ResponseTarget` opts in to - ``echo_input=True``, the host pushes the user's input message to - each non-originating destination **before** the agent reply. Both - pushes go through the same ``push(identity, payload)`` entry point; - the channel distinguishes them by inspecting the role on the - payload's underlying :class:`~agent_framework.Message`\\(s): - - * ``payload.result.messages[i].role == "user"`` → the echo phase - (originating user's turn mirrored onto this destination so the - channel's UX can stay coherent with the user's actual prompt). - Channels that cannot impersonate the user (most chat bots can - only send AS the bot) typically render echoes as a quoted / - prefixed block, drop them, or skip them via a - ``response_hook`` — see below. - * ``payload.result.messages[i].role == "assistant"`` → the agent's - reply. 
- - Channels that want to branch on phase WITHOUT inspecting roles can - instead expose a ``response_hook`` attribute on the channel - instance: the host calls the hook with a - :class:`ChannelResponseContext` whose ``is_echo`` flag carries the - same phase information explicitly, and the hook returns a - (possibly rewritten) :class:`HostedRunResult` that the host then - hands to ``push``. The hook seam is duck-typed and intentionally - NOT part of this Protocol so adding hook support to an existing - channel never breaks its public contract. - """ - - name: str - - async def push(self, identity: ChannelIdentity, payload: HostedRunResult[Any]) -> None: ... - - -async def apply_channel_response_hook( - channel: Channel | ChannelPush, - result: HostedRunResult[Any], - *, - request: ChannelRequest, - originating: bool, - destination_identity: ChannelIdentity | None = None, - is_echo: bool = False, - clone: bool = False, -) -> HostedRunResult[Any]: - """Apply a channel's optional response hook with the standard context. - - Channels and the host call this helper when they need to shape a - :class:`HostedRunResult` for one destination. The helper centralizes the - response-hook convention: hooks are discovered from a duck-typed - ``response_hook`` attribute, called through :func:`apply_response_hook`, - and receive a :class:`ChannelResponseContext` that identifies the channel, - destination identity, originating-vs-push phase, and echo phase. - - Args: - channel: Channel whose ``response_hook`` attribute may shape the payload. - result: Hosted run result to pass to the hook. - request: Originating channel request. - originating: Whether this is the originating channel's synchronous reply. - destination_identity: Destination identity for non-originating pushes, or - ``None`` for originating replies. - is_echo: Whether the payload is an echo of the user input. - clone: Whether to shallow-clone ``result`` before applying the hook. 
- - Returns: - The original, cloned, or hook-shaped hosted run result. - """ - shaped = result.replace() if clone else result - hook = cast(ChannelResponseHook | None, getattr(channel, "response_hook", None)) - if not callable(hook): - return shaped - context = ChannelResponseContext( - request=request, - channel_name=channel.name, - destination_identity=destination_identity, - originating=originating, - is_echo=is_echo, - ) - return await apply_response_hook(hook, shaped, context=context) - - -# --------------------------------------------------------------------------- # -# Durable task runner — pluggable seam for non-originating push fan-out and -# (in v1 fast-follow) background runs. See spec §"Durable task runner". -# --------------------------------------------------------------------------- # - - -@dataclass(frozen=True) -class RetryPolicy: - """Retry contract a :class:`DurableTaskRunner` honours per scheduled task. - - Defaults are deliberately conservative — five attempts on a 1s/2x/60s - exponential backoff — so a transient channel outage (Telegram returning - 502, Activity Protocol token refresh) is rerouted to retry without the - operator wiring anything. Adapter backends (TaskHub, Foundry durable - tasks) MAY translate this into their native retry primitive; the - in-process runner implements it directly via ``asyncio.sleep``. - """ - - max_attempts: int = 5 - initial_backoff_seconds: float = 1.0 - backoff_multiplier: float = 2.0 - max_backoff_seconds: float = 60.0 - - -@dataclass(frozen=True) -class TaskHandle: - """Opaque, runner-issued handle for a scheduled task. - - Callers receive one of these from :meth:`DurableTaskRunner.schedule` and - pass it back to :meth:`DurableTaskRunner.get` to poll status. ``task_id`` - is opaque — its shape is implementation-defined (UUID for the in-process - runner, instance id for TaskHub, scheduled-task arn for Foundry). 
The - ``name`` mirrors the handler name supplied to :meth:`schedule` so the - caller does not have to track it separately. - """ - - task_id: str - name: str - - -TaskStatus = Literal["scheduled", "running", "succeeded", "failed", "cancelled"] - - -@runtime_checkable -class DurableTaskRunner(Protocol): - """Pluggable seam the host uses to schedule out-of-band work. - - The host registers a single internal handler — ``"hosting.push"`` — at - startup; each non-originating push destination becomes a - ``runner.schedule("hosting.push", payload)`` call. The handler resolves - the destination channel, runs its ``response_hook`` (if any), and calls - :meth:`ChannelPush.push`. Failures inside the handler are caught by the - runner, retried per the supplied :class:`RetryPolicy`, and ultimately - marked terminal-failed when ``max_attempts`` is exhausted. - - Two implementations ship in the framework: an in-process default - (``InProcessTaskRunner``, asyncio + bounded retry, no cross-restart - persistence) suitable for ``runtime_mode="long_running"`` deployments, - plus adapter packages (``agent-framework-hosting-durabletask``, a future - Foundry adapter) for ``runtime_mode="ephemeral"`` deployments that need - cross-restart durability. - - Adapters MUST publish their ``payload_mode`` so the host's startup - validator can pair runner persistence expectations with channel - push-codec capabilities. Object-mode runners accept live Python - references in the payload (the in-process default does this for - speed); JSON-mode runners persist payloads across process restarts - and therefore require every push-capable channel to expose a - :class:`ChannelPushCodec`. - """ - - # Adapter classes set this explicitly; the host inspects it at - # construction time. Default is conservative ("object") so a runner - # that omits the attribute is treated as in-process-only and does - # not silently impose a JSON requirement on channels. 
- payload_mode: DurableTaskPayloadMode - - def register( - self, - name: str, - handler: Callable[[Mapping[str, Any]], Awaitable[None]], - ) -> None: - """Register a named handler the runner will invoke when a task fires. - - Re-registering under the same name replaces the previous handler. - Implementations SHOULD raise :class:`RuntimeError` if called after - the runner has been started, to avoid silent reorderings of in-flight - work; the in-process runner enforces this. - """ - ... - - async def schedule( - self, - name: str, - payload: Mapping[str, Any], - *, - retry_policy: RetryPolicy | None = None, - ) -> TaskHandle: - """Schedule a previously-registered handler invocation. - - ``name`` MUST match a name previously passed to :meth:`register`. The - ``payload`` is forwarded verbatim to the handler; implementations - MUST treat it as opaque (no introspection, no normalization). - ``retry_policy`` overrides the runner's default for this task only; - ``None`` means "use the runner-wide default". - - Returns a :class:`TaskHandle` the caller may use with :meth:`get` to - poll status. Returning the handle MUST NOT wait for the task to run - — scheduling is fire-and-forget from the caller's perspective. - """ - ... - - async def get(self, handle: TaskHandle) -> TaskStatus | None: - """Return the current status of a scheduled task. - - Returns ``None`` if the runner no longer has any record of the task - (e.g. it was scheduled in a prior process and the runner has no - persistent backing). Otherwise one of the :data:`TaskStatus` values. - """ - ... 
- - -__all__ = [ - "AgentResponse", - "AgentResponseUpdate", - "Channel", - "ChannelCommand", - "ChannelCommandContext", - "ChannelContribution", - "ChannelIdentity", - "ChannelPush", - "ChannelPushCodec", - "ChannelRequest", - "ChannelResponseContext", - "ChannelResponseHook", - "ChannelRunHook", - "ChannelSession", - "ChannelStreamTransformHook", - "DurableTaskPayloadMode", - "DurableTaskRunner", - "HostStatePaths", - "HostedRunResult", - "PushPayloadNotPicklable", - "PushPayloadNotSerializable", - "ResponseStream", - "ResponseTarget", - "ResponseTargetKind", - "RetryPolicy", - "TaskHandle", - "TaskStatus", - "apply_channel_response_hook", - "apply_response_hook", - "apply_run_hook", -] diff --git a/python/packages/hosting/tests/test_authorization.py b/python/packages/hosting/tests/test_authorization.py deleted file mode 100644 index 9de3f6b5cad..00000000000 --- a/python/packages/hosting/tests/test_authorization.py +++ /dev/null @@ -1,580 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. 
- -"""Tests for the authorization and identity-linking seam.""" - -from __future__ import annotations - -from collections.abc import Collection -from typing import Any - -import pytest - -from agent_framework_hosting import ( - AgentFrameworkHost, - AllOfAllowlists, - AllowAll, - Allowed, - AllowlistDecision, - AnyOfAllowlists, - AuthorizationContext, - AuthPolicy, - CallableAllowlist, - ChannelConfigurationError, - ChannelContext, - ChannelContribution, - ChannelIdentity, - Denied, - LinkChallenge, - LinkedClaimAllowlist, - LinkedIdentity, - LinkRequired, - NativeIdAllowlist, -) - -# --------------------------------------------------------------------------- # -# Fakes # -# --------------------------------------------------------------------------- # - - -class _ChannelStub: - name: str = "stub" - path: str = "/stub" - require_link: bool = False - allowlist: Any = "inherit" - emits_verified_claims: bool = False - - def __init__( - self, - *, - name: str = "stub", - require_link: bool = False, - allowlist: Any = "inherit", - emits_verified_claims: bool = False, - ) -> None: - self.name = name - self.path = f"/{name}" - self.require_link = require_link - self.allowlist = allowlist - self.emits_verified_claims = emits_verified_claims - - def contribute(self, context: ChannelContext) -> ChannelContribution: - return ChannelContribution(routes=[]) - - -class _AgentStub: - """Bare minimum target — the validators run during ``__init__``, - not on first request, so the target is never actually invoked.""" - - async def run(self, *args: Any, **kwargs: Any) -> Any: # pragma: no cover - raise NotImplementedError - - -class _StaticLinker: - """Test linker returning either a linked identity or a challenge.""" - - def __init__(self, result: LinkedIdentity | LinkChallenge) -> None: - self.result = result - self.calls: list[ChannelIdentity] = [] - - async def resolve(self, identity: ChannelIdentity) -> LinkedIdentity | LinkChallenge: - self.calls.append(identity) - return 
self.result - - -def _ctx_pre_link(channel: str = "telegram", native_id: str = "42") -> AuthorizationContext: - return AuthorizationContext( - identity=ChannelIdentity(channel=channel, native_id=native_id), - phase="pre_link", - ) - - -def _ctx_post_link(claims: dict[str, str] | None = None) -> AuthorizationContext: - return AuthorizationContext( - identity=ChannelIdentity(channel="telegram", native_id="42"), - phase="post_link", - isolation_key="alice", - verified_claims=claims or {}, - claim_source="linker", - ) - - -# --------------------------------------------------------------------------- # -# Built-in allowlists # -# --------------------------------------------------------------------------- # - - -class TestAllowAll: - async def test_allows_both_phases(self) -> None: - a = AllowAll() - assert await a.evaluate(_ctx_pre_link()) is AllowlistDecision.ALLOW - assert await a.evaluate(_ctx_post_link()) is AllowlistDecision.ALLOW - - def test_does_not_require_linked_claims(self) -> None: - assert AllowAll().requires_linked_claims is False - - -class TestNativeIdAllowlist: - async def test_allows_listed_id(self) -> None: - a = NativeIdAllowlist({"42", "99"}) - assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW - - async def test_denies_unlisted_id(self) -> None: - a = NativeIdAllowlist({"42"}) - assert await a.evaluate(_ctx_pre_link(native_id="99")) is AllowlistDecision.DENY - - async def test_channel_filter_abstains_for_other_channels(self) -> None: - # The native-id list is scoped to "telegram" — a request from - # another channel should ABSTAIN so a combinator can give a - # parallel allowlist a chance to ALLOW. 
- a = NativeIdAllowlist({"42"}, channel="telegram") - assert await a.evaluate(_ctx_pre_link(channel="slack", native_id="42")) is AllowlistDecision.ABSTAIN - - async def test_channel_filter_evaluates_matching_channel(self) -> None: - a = NativeIdAllowlist({"42"}, channel="telegram") - assert await a.evaluate(_ctx_pre_link(channel="telegram", native_id="42")) is AllowlistDecision.ALLOW - assert await a.evaluate(_ctx_pre_link(channel="telegram", native_id="99")) is AllowlistDecision.DENY - - async def test_async_loader_caches_after_first_call(self) -> None: - # The loader should run once; subsequent ``evaluate`` calls hit - # the cache so a slow / costly source isn't re-queried per - # message. - calls = {"n": 0} - - async def loader() -> Collection[str]: - calls["n"] += 1 - return {"42"} - - a = NativeIdAllowlist(loader) - assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW - assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW - assert calls["n"] == 1 - - -class TestLinkedClaimAllowlist: - """Claim allowlists abstain pre-link and decide once claims are available.""" - - def test_declares_requires_linked_claims(self) -> None: - a = LinkedClaimAllowlist("oid", ["abc"]) - assert a.requires_linked_claims is True - - async def test_pre_link_abstains(self) -> None: - a = LinkedClaimAllowlist("oid", ["abc"]) - assert await a.evaluate(_ctx_pre_link()) is AllowlistDecision.ABSTAIN - - async def test_post_link_allows_matching_claim(self) -> None: - a = LinkedClaimAllowlist("oid", ["abc"]) - assert await a.evaluate(_ctx_post_link({"oid": "abc"})) is AllowlistDecision.ALLOW - - async def test_post_link_allows_matching_multi_value_claim(self) -> None: - a = LinkedClaimAllowlist("groups", ["admins"]) - ctx = AuthorizationContext( - identity=ChannelIdentity(channel="telegram", native_id="42"), - phase="post_link", - isolation_key="alice", - verified_claims={"groups": ("users", "admins")}, - claim_source="linker", - ) - 
assert await a.evaluate(ctx) is AllowlistDecision.ALLOW - - async def test_post_link_denies_missing_or_nonmatching_claim(self) -> None: - a = LinkedClaimAllowlist("oid", ["abc"]) - assert await a.evaluate(_ctx_post_link({"oid": "def"})) is AllowlistDecision.DENY - assert await a.evaluate(_ctx_post_link({"tid": "abc"})) is AllowlistDecision.DENY - - -class TestAnyOfAllowlists: - async def test_any_allow_wins(self) -> None: - a = AnyOfAllowlists(NativeIdAllowlist({"42"}), NativeIdAllowlist({"99"})) - # native_id=42 → first ALLOWs, short-circuit. - assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW - - async def test_all_deny_yields_deny(self) -> None: - # Both lists deny native_id=7. - a = AnyOfAllowlists(NativeIdAllowlist({"42"}), NativeIdAllowlist({"99"})) - assert await a.evaluate(_ctx_pre_link(native_id="7")) is AllowlistDecision.DENY - - async def test_abstain_when_no_decision(self) -> None: - # Channel-scoped lists both ABSTAIN on a "slack" request. - a = AnyOfAllowlists( - NativeIdAllowlist({"42"}, channel="telegram"), - NativeIdAllowlist({"99"}, channel="teams"), - ) - assert await a.evaluate(_ctx_pre_link(channel="slack", native_id="42")) is AllowlistDecision.ABSTAIN - - async def test_empty_is_abstain(self) -> None: - # No children → ABSTAIN (not DENY) to avoid silent deny-all. 
- a = AnyOfAllowlists() - assert await a.evaluate(_ctx_pre_link()) is AllowlistDecision.ABSTAIN - - def test_propagates_requires_linked_claims(self) -> None: - a = AnyOfAllowlists(NativeIdAllowlist({"42"}), LinkedClaimAllowlist("oid", [])) - assert a.requires_linked_claims is True - - -class TestAllOfAllowlists: - async def test_any_deny_short_circuits(self) -> None: - a = AllOfAllowlists(NativeIdAllowlist({"42"}), NativeIdAllowlist({"99"})) - assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.DENY - - async def test_all_allow_yields_allow(self) -> None: - a = AllOfAllowlists(NativeIdAllowlist({"42"}), NativeIdAllowlist({"42", "99"})) - assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW - - async def test_abstain_when_no_deny_but_no_unanimous_allow(self) -> None: - a = AllOfAllowlists( - NativeIdAllowlist({"42"}, channel="telegram"), - NativeIdAllowlist({"42"}, channel="teams"), - ) - # ABSTAIN from teams (different channel), ALLOW from telegram → ABSTAIN. 
- assert await a.evaluate(_ctx_pre_link(channel="telegram", native_id="42")) is AllowlistDecision.ABSTAIN - - async def test_empty_is_abstain(self) -> None: - a = AllOfAllowlists() - assert await a.evaluate(_ctx_pre_link()) is AllowlistDecision.ABSTAIN - - -class TestCallableAllowlist: - async def test_wraps_async_fn(self) -> None: - async def fn(ctx: AuthorizationContext) -> AllowlistDecision: - if ctx.identity.native_id == "42": - return AllowlistDecision.ALLOW - return AllowlistDecision.DENY - - a = CallableAllowlist(fn) - assert await a.evaluate(_ctx_pre_link(native_id="42")) is AllowlistDecision.ALLOW - assert await a.evaluate(_ctx_pre_link(native_id="99")) is AllowlistDecision.DENY - - def test_requires_linked_claims_passthrough(self) -> None: - async def fn(_: AuthorizationContext) -> AllowlistDecision: # pragma: no cover - return AllowlistDecision.ALLOW - - a = CallableAllowlist(fn, requires_linked_claims=True) - assert a.requires_linked_claims is True - - -class TestAuthPolicy: - async def test_factory_helpers_return_working_allowlists(self) -> None: - assert await AuthPolicy.open().evaluate(_ctx_pre_link()) is AllowlistDecision.ALLOW - assert await AuthPolicy.native_ids({"42"}).evaluate(_ctx_pre_link()) is AllowlistDecision.ALLOW - assert await AuthPolicy.linked_claim("oid", {"abc"}).evaluate(_ctx_post_link({"oid": "abc"})) is ( - AllowlistDecision.ALLOW - ) - - async def test_custom_factory(self) -> None: - async def fn(_: AuthorizationContext) -> AllowlistDecision: - return AllowlistDecision.ALLOW - - policy = AuthPolicy.custom(fn, requires_linked_claims=True) - assert policy.requires_linked_claims is True - assert await policy.evaluate(_ctx_pre_link()) is AllowlistDecision.ALLOW - - -# --------------------------------------------------------------------------- # -# Host configuration validator # -# --------------------------------------------------------------------------- # - - -class TestChannelAuthorizationValidator: - """The host's startup 
validator catches three classes of misconfig - so they fail at construction rather than silently denying every - user at runtime.""" - - def test_require_link_without_linker_raises(self) -> None: - # ``require_link=True`` with no linker would silently reject - # every request — caught at construction. - with pytest.raises(ChannelConfigurationError, match="identity_linker"): - AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(require_link=True)], - ) - - def test_require_link_with_linker_passes(self) -> None: - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(require_link=True)], - identity_linker=_StaticLinker(LinkedIdentity("alice", {"oid": "abc"})), - ) - assert host.runtime_mode == "long_running" - - def test_linked_claim_allowlist_without_claim_source_raises(self) -> None: - # The channel has no ``require_link=True`` AND doesn't emit - # claims natively → the allowlist would always DENY / ABSTAIN. - with pytest.raises(ChannelConfigurationError, match="verified IdP claims"): - AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(allowlist=LinkedClaimAllowlist("oid", []))], - ) - - def test_linked_claim_allowlist_with_native_claim_source_passes(self) -> None: - # When the channel declares ``emits_verified_claims=True`` - # (e.g. Activity Protocol with AAD bearer) the validator - # accepts the LinkedClaimAllowlist without needing a linker. 
- host = AgentFrameworkHost( - target=_AgentStub(), - channels=[ - _ChannelStub( - allowlist=LinkedClaimAllowlist("oid", ["abc"]), - emits_verified_claims=True, - ) - ], - ) - assert host.default_allowlist is None - - def test_linked_claim_allowlist_with_require_link_and_linker_passes(self) -> None: - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(require_link=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], - identity_linker=_StaticLinker(LinkedIdentity("alice", {"oid": "abc"})), - ) - assert host.runtime_mode == "long_running" - - def test_native_id_allowlist_unknown_channel_raises(self) -> None: - with pytest.raises(ChannelConfigurationError, match="unknown channel 'mystery'"): - AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(allowlist=NativeIdAllowlist({"42"}, channel="mystery"))], - ) - - def test_native_id_allowlist_known_channel_passes(self) -> None: - # A channel-scoped native list pointing at a peer channel is - # the supported way to compose per-channel allowlists. - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[ - _ChannelStub(name="telegram", allowlist=NativeIdAllowlist({"42"}, channel="telegram")), - _ChannelStub(name="slack"), - ], - ) - assert host.runtime_mode == "long_running" - - def test_default_allowlist_applies_to_inheriting_channel(self) -> None: - # ``allowlist="inherit"`` (the default) picks up the host-level - # ``default_allowlist``. This is the "lock down a whole bot in - # one place" ergonomic. - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(name="telegram")], - default_allowlist=NativeIdAllowlist({"42"}), - ) - # The default flowed through; channel sees the host's allowlist. - assert host.default_allowlist is not None - - def test_explicit_none_carve_out_overrides_default(self) -> None: - # ``allowlist=None`` on a channel explicitly opts out of the - # host default — useful for a public endpoint inside an - # otherwise locked-down host. 
- host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(name="public", allowlist=None)], - default_allowlist=NativeIdAllowlist({"42"}), - ) - # Construction succeeded; the validator did not raise. - assert host.default_allowlist is not None - - def test_combinator_with_unknown_nested_channel_raises(self) -> None: - # The validator walks ``AnyOfAllowlists`` / ``AllOfAllowlists`` - # so a typo'd channel name nested under a combinator is still - # caught at construction. - with pytest.raises(ChannelConfigurationError, match="unknown channel 'typo'"): - AgentFrameworkHost( - target=_AgentStub(), - channels=[ - _ChannelStub( - allowlist=AnyOfAllowlists( - NativeIdAllowlist({"42"}, channel="stub"), - NativeIdAllowlist({"99"}, channel="typo"), - ) - ) - ], - ) - - -# --------------------------------------------------------------------------- # -# host.authorize pipeline # -# --------------------------------------------------------------------------- # - - -class TestHostAuthorize: - """Host authorization pipeline across open, native-id, and linked-claim profiles.""" - - def _host(self) -> AgentFrameworkHost: - return AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()]) - - async def test_open_profile_returns_allowed_with_auto_isolation_key(self) -> None: - host = self._host() - outcome = await host.authorize(ChannelIdentity(channel="telegram", native_id="42")) - assert isinstance(outcome, Allowed) - assert outcome.isolation_key == "telegram:42" - - async def test_native_allowlist_allows_listed_id(self) -> None: - host = self._host() - outcome = await host.authorize( - ChannelIdentity(channel="telegram", native_id="42"), - allowlist=NativeIdAllowlist({"42"}), - ) - assert isinstance(outcome, Allowed) - assert outcome.isolation_key == "telegram:42" - - async def test_native_allowlist_denies_unlisted_id(self) -> None: - host = self._host() - outcome = await host.authorize( - ChannelIdentity(channel="telegram", native_id="99"), - 
allowlist=NativeIdAllowlist({"42"}), - ) - assert isinstance(outcome, Denied) - assert outcome.reason_code == "allowlist_denied_pre_link" - assert outcome.user_message is not None - # The bland default leaks neither tenant nor list size. - assert "telegram" not in (outcome.user_message or "") - - async def test_abstain_with_claim_requirement_yields_link_required_message(self) -> None: - # Without a linker and without channel-emitted claims, a claim-required - # allowlist cannot make progress and the host returns a safe denial. - async def abstain(_: AuthorizationContext) -> AllowlistDecision: - return AllowlistDecision.ABSTAIN - - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(emits_verified_claims=True)], - ) - outcome = await host.authorize( - ChannelIdentity(channel="telegram", native_id="42"), - allowlist=CallableAllowlist(abstain, requires_linked_claims=True), - ) - assert isinstance(outcome, Denied) - assert outcome.reason_code == "allowlist_requires_link" - - async def test_abstain_without_claim_requirement_falls_through_to_allowed(self) -> None: - async def abstain(_: AuthorizationContext) -> AllowlistDecision: - return AllowlistDecision.ABSTAIN - - host = self._host() - outcome = await host.authorize( - ChannelIdentity(channel="telegram", native_id="42"), - allowlist=CallableAllowlist(abstain), - ) - assert isinstance(outcome, Allowed) - - async def test_auto_issue_returns_existing_key_when_known(self) -> None: - # When an identity has already been observed, the auto-issued - # key matches the existing one rather than coining a fresh - # token. This is the linker-free equivalent of identity resolution. 
- host = self._host() - host._identities["alice"] = {"telegram": ChannelIdentity(channel="telegram", native_id="42")} - outcome = await host.authorize(ChannelIdentity(channel="telegram", native_id="42")) - assert isinstance(outcome, Allowed) - assert outcome.isolation_key == "alice" - - async def test_verified_claims_propagate_to_context(self) -> None: - # Channels that natively carry verified claims (e.g. Activity - # Protocol bearer with AAD oid) pass them through to - # ``authorize`` — the allowlist sees them on the - # ``AuthorizationContext``. - seen: list[AuthorizationContext] = [] - - async def capture(ctx: AuthorizationContext) -> AllowlistDecision: - seen.append(ctx) - return AllowlistDecision.ALLOW - - host = self._host() - await host.authorize( - ChannelIdentity(channel="telegram", native_id="42"), - allowlist=CallableAllowlist(capture), - verified_claims={"oid": "abc"}, - ) - assert len(seen) == 1 - assert seen[0].claim_source == "channel" - assert dict(seen[0].verified_claims) == {"oid": "abc"} - - async def test_require_link_returns_challenge_when_unlinked(self) -> None: - challenge = LinkChallenge("c1", url="https://login.example/c1") - linker = _StaticLinker(challenge) - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(require_link=True)], - identity_linker=linker, - ) - outcome = await host.authorize( - ChannelIdentity(channel="telegram", native_id="42"), - require_link=True, - ) - assert isinstance(outcome, LinkRequired) - assert outcome.challenge is challenge - assert [call.native_id for call in linker.calls] == ["42"] - - async def test_require_link_returns_linked_identity_when_resolved(self) -> None: - linked = LinkedIdentity("entra:abc", {"oid": "abc"}) - linker = _StaticLinker(linked) - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(require_link=True)], - identity_linker=linker, - ) - outcome = await host.authorize( - ChannelIdentity(channel="telegram", native_id="42"), - 
require_link=True, - ) - assert isinstance(outcome, Allowed) - assert outcome.isolation_key == "entra:abc" - assert dict(outcome.verified_claims) == {"oid": "abc"} - assert outcome.claim_source == "linker" - # authorize() is decision-only; identity registry writes remain on - # the request execution path. - assert host._identities == {} - - async def test_linked_claim_allowlist_with_linker_allows_matching_claim(self) -> None: - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(require_link=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], - identity_linker=_StaticLinker(LinkedIdentity("entra:abc", {"oid": "abc"})), - ) - outcome = await host.authorize( - ChannelIdentity(channel="telegram", native_id="42"), - require_link=True, - allowlist=LinkedClaimAllowlist("oid", ["abc"]), - ) - assert isinstance(outcome, Allowed) - assert outcome.isolation_key == "entra:abc" - - async def test_linked_claim_allowlist_with_linker_denies_nonmatching_claim(self) -> None: - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(require_link=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], - identity_linker=_StaticLinker(LinkedIdentity("entra:def", {"oid": "def"})), - ) - outcome = await host.authorize( - ChannelIdentity(channel="telegram", native_id="42"), - require_link=True, - allowlist=LinkedClaimAllowlist("oid", ["abc"]), - ) - assert isinstance(outcome, Denied) - assert outcome.reason_code == "allowlist_denied_post_link" - - async def test_linked_claim_allowlist_with_linker_returns_challenge_when_unlinked(self) -> None: - challenge = LinkChallenge("c1") - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(require_link=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], - identity_linker=_StaticLinker(challenge), - ) - outcome = await host.authorize( - ChannelIdentity(channel="telegram", native_id="42"), - require_link=True, - allowlist=LinkedClaimAllowlist("oid", ["abc"]), - ) - assert 
isinstance(outcome, LinkRequired) - assert outcome.challenge is challenge - - async def test_linked_claim_allowlist_uses_channel_verified_claims_without_linker(self) -> None: - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub(emits_verified_claims=True, allowlist=LinkedClaimAllowlist("oid", ["abc"]))], - ) - outcome = await host.authorize( - ChannelIdentity(channel="activity", native_id="aad-user"), - allowlist=LinkedClaimAllowlist("oid", ["abc"]), - verified_claims={"oid": "abc"}, - ) - assert isinstance(outcome, Allowed) - assert outcome.isolation_key == "activity:aad-user" - assert outcome.claim_source == "channel" diff --git a/python/packages/hosting/tests/test_host.py b/python/packages/hosting/tests/test_host.py index cefdf5c8481..f6b2d9ebfa7 100644 --- a/python/packages/hosting/tests/test_host.py +++ b/python/packages/hosting/tests/test_host.py @@ -4,7 +4,7 @@ from __future__ import annotations -from collections.abc import AsyncIterator, Awaitable, Callable, Mapping, Sequence +from collections.abc import AsyncIterator, Sequence from dataclasses import dataclass, field from typing import Any @@ -21,16 +21,9 @@ ChannelContext, ChannelContribution, ChannelIdentity, - ChannelPush, ChannelRequest, ChannelSession, - DurableTaskPayloadMode, - DurableTaskRunner, HostedRunResult, - ResponseTarget, - RetryPolicy, - TaskHandle, - TaskStatus, ) @@ -75,27 +68,33 @@ def create_session(self, *, session_id: str | None = None) -> _FakeAgentSession: self.created_sessions.append(s) return s - async def run(self, messages: Any = None, *, stream: bool = False, session: Any = None, **kwargs: Any) -> Any: + def run(self, messages: Any = None, *, stream: bool = False, session: Any = None, **kwargs: Any) -> Any: self.calls.append({"messages": messages, "stream": stream, "session": session, "kwargs": kwargs}) - if stream: # pragma: no cover - not used by these tests + if stream: + updates = [AgentResponseUpdate(contents=[Content.from_text(text=self._reply)], 
role="assistant")] - async def _gen() -> AsyncIterator[Any]: - yield self._reply + async def _gen() -> AsyncIterator[AgentResponseUpdate]: + for update in updates: + yield update - return _gen() - return _FakeAgentResponse(text=self._reply) + async def _finalize(items: Sequence[AgentResponseUpdate]) -> AgentResponse: # noqa: RUF029 + return AgentResponse.from_updates(items) + + return ResponseStream[AgentResponseUpdate, AgentResponse](_gen(), finalizer=_finalize) + + async def _coro() -> _FakeAgentResponse: + return _FakeAgentResponse(text=self._reply) + + return _coro() class _RecordingChannel: - """Minimal :class:`Channel` + :class:`ChannelPush` for routing tests.""" + """Minimal :class:`Channel` for host tests.""" - def __init__(self, name: str = "fake", path: str = "/fake", supports_push: bool = True) -> None: + def __init__(self, name: str = "fake", path: str = "/fake") -> None: self.name = name self.path = path self.context: ChannelContext | None = None - self.pushes: list[tuple[ChannelIdentity, HostedRunResult[Any]]] = [] - self._push_raises: Exception | None = None - self._supports_push = supports_push # Provide a single trivial route so contribute() exercises the endpoint path. 
self._routes: Sequence[BaseRoute] = (Route("/ping", _ping),) @@ -103,88 +102,6 @@ def contribute(self, context: ChannelContext) -> ChannelContribution: self.context = context return ChannelContribution(routes=self._routes) - async def push(self, identity: ChannelIdentity, payload: HostedRunResult[Any]) -> None: - if self._push_raises is not None: - raise self._push_raises - self.pushes.append((identity, payload)) - - -class _NoPushChannel: - """A channel that does NOT implement :class:`ChannelPush`.""" - - def __init__(self, name: str = "nopush", path: str = "/nopush") -> None: - self.name = name - self.path = path - - def contribute(self, context: ChannelContext) -> ChannelContribution: - return ChannelContribution() - - -class _SyncTaskRunner(DurableTaskRunner): - """A :class:`DurableTaskRunner` that runs handlers inline. - - Tests of the delivery routing want deterministic, synchronous - behaviour. The real :class:`InProcessTaskRunner` schedules via - ``asyncio.create_task`` so push side effects only land *after* - the test has yielded control — awkward for assertions that read - a channel's recorded pushes immediately after - :meth:`ChannelContext.deliver_response` returns. - - Two knobs control failure handling: - - - ``schedule_raises``: when set, every call to :meth:`schedule` - raises this exception. Mimics a host-side outage (the durable - backend is unreachable). - - ``swallow_handler_errors`` (default ``True``): when the - handler raises, the error is recorded in - :attr:`handler_errors` but :meth:`schedule` still returns - successfully — matching the real durable contract that - "scheduled" is a separate signal from "delivered". Set to - ``False`` to surface handler exceptions through - :meth:`schedule` for the few tests that want to assert on - handler-raised failures inline. 
- """ - - def __init__(self, *, swallow_handler_errors: bool = True) -> None: - self._handlers: dict[str, Callable[[Mapping[str, Any]], Awaitable[None]]] = {} - self.scheduled: list[tuple[str, Mapping[str, Any]]] = [] - self.handler_errors: list[BaseException] = [] - self.schedule_raises: BaseException | None = None - self.swallow_handler_errors = swallow_handler_errors - - # Default object-mode matches the real ``InProcessTaskRunner`` — - # tests that want to exercise the JSON-mode path override this on - # the instance. - payload_mode = DurableTaskPayloadMode.OBJECT - - def register( - self, - name: str, - handler: Callable[[Mapping[str, Any]], Awaitable[None]], - ) -> None: - self._handlers[name] = handler - - async def schedule( - self, - name: str, - payload: Mapping[str, Any], - *, - retry_policy: RetryPolicy | None = None, - ) -> TaskHandle: - if self.schedule_raises is not None: - raise self.schedule_raises - self.scheduled.append((name, payload)) - try: - await self._handlers[name](payload) - except Exception as exc: - self.handler_errors.append(exc) - if not self.swallow_handler_errors: - raise - return TaskHandle(task_id=f"sync-{len(self.scheduled)}", name=name) - - async def get(self, handle: TaskHandle) -> TaskStatus | None: # pragma: no cover - unused - return "succeeded" - def _assistant_response(text: str) -> AgentResponse: """Build a one-message ``AgentResponse`` to use as a ``HostedRunResult.result``.""" @@ -227,7 +144,6 @@ class TestHostWiring: def test_channel_is_recognized(self) -> None: ch = _RecordingChannel() assert isinstance(ch, Channel) - assert isinstance(ch, ChannelPush) def test_app_mounts_channel_routes_under_path(self) -> None: agent = _FakeAgent() @@ -313,10 +229,6 @@ async def test_invoke_wraps_input_with_hosting_metadata(self) -> None: "native_id": "user:1", "attributes": {}, } - assert msg.additional_properties["hosting"]["response_target"] == { - "kind": "originating", - "targets": [], - } async def 
test_invoke_caches_session_per_isolation_key(self) -> None: agent = _FakeAgent() @@ -398,6 +310,56 @@ async def test_options_propagates_to_target_run(self) -> None: assert agent.calls[0]["kwargs"]["options"] == {"temperature": 0.4} +class TestHostOwnedHooks: + async def test_context_run_applies_run_hook_before_invocation(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + captured: dict[str, Any] = {} + + async def hook(request: ChannelRequest, **kwargs: Any) -> ChannelRequest: + captured["target"] = kwargs["target"] + captured["protocol_request"] = kwargs["protocol_request"] + return ChannelRequest( + channel=request.channel, + operation=request.operation, + input="rewritten", + session=request.session, + ) + + req = ChannelRequest(channel=ch.name, operation="op", input="original", session=ChannelSession("alice")) + await ch.context.run(req, run_hook=hook, protocol_request={"raw": True}) + + assert captured["target"] is agent + assert captured["protocol_request"] == {"raw": True} + assert agent.calls[0]["messages"].text == "rewritten" + + async def test_context_run_stream_applies_run_hook_before_opening_stream(self) -> None: + agent = _FakeAgent() + ch = _RecordingChannel() + host = AgentFrameworkHost(target=agent, channels=[ch]) + _ = host.app + assert ch.context is not None + + def hook(request: ChannelRequest, **_: Any) -> ChannelRequest: + return ChannelRequest(channel=request.channel, operation=request.operation, input="streamed") + + stream = await ch.context.run_stream( + ChannelRequest(channel=ch.name, operation="op", input="original"), + run_hook=hook, + stream_update_hook=lambda update: AgentResponseUpdate( + contents=[Content.from_text(text=update.text.upper())], + role="assistant", + ), + ) + + chunks = [update.text async for update in stream] + assert chunks == ["OK"] + assert agent.calls[0]["messages"].text == "streamed" + + # 
--------------------------------------------------------------------------- # # Workflow target # # --------------------------------------------------------------------------- # @@ -436,7 +398,7 @@ async def test_stream_workflow_yields_updates_and_finalizes(self) -> None: assert ch.context is not None req = ChannelRequest(channel="fake", operation="message.create", input="hi") - stream = ch.context.run_stream(req) + stream = await ch.context.run_stream(req) updates: list[AgentResponseUpdate] = [] async for update in stream: @@ -464,7 +426,7 @@ async def test_stream_workflow_yields_one_update_per_output_event(self) -> None: assert ch.context is not None req = ChannelRequest(channel="fake", operation="message.create", input="x") - stream = ch.context.run_stream(req) + stream = await ch.context.run_stream(req) chunks: list[str] = [] async for update in stream: @@ -561,7 +523,7 @@ async def test_stream_writes_checkpoint_under_isolation_key(self, tmp_path: Any) input="hi", session=ChannelSession(isolation_key="bob"), ) - stream = ch.context.run_stream(req) + stream = await ch.context.run_stream(req) async for _ in stream: pass await stream.get_final_response() @@ -702,520 +664,6 @@ async def test_separator_in_key_skips_checkpointing(self, tmp_path: Any) -> None assert list(tmp_path.iterdir()) == [] -# --------------------------------------------------------------------------- # -# Delivery routing # -# --------------------------------------------------------------------------- # - - -def _make_host_with_two_channels( - *, - runner: DurableTaskRunner | None = None, -) -> tuple[AgentFrameworkHost, _RecordingChannel, _RecordingChannel, ChannelContext, _SyncTaskRunner]: - agent = _FakeAgent() - a = _RecordingChannel(name="responses", path="/r") - b = _RecordingChannel(name="telegram", path="/t") - sync_runner = runner if isinstance(runner, _SyncTaskRunner) else _SyncTaskRunner() - host = AgentFrameworkHost( - target=agent, - channels=[a, b], - durable_task_runner=runner or 
sync_runner, - ) - _ = host.app - assert a.context is not None - return host, a, b, a.context, sync_runner - - -def _record_identity_on(host: AgentFrameworkHost, isolation_key: str, channel: str, native_id: str) -> None: - """Pre-seed the host's identity registry by running a request.""" - host._identities.setdefault(isolation_key, {})[channel] = ChannelIdentity(channel=channel, native_id=native_id) - host._active[isolation_key] = channel - - -class TestDeliverResponse: - """Delivery routing — the originating channel learns whether to render - on its own wire from the ``bool`` return; everything else - (scheduled tasks, schedule-time failures, skip reasons) lives in - the runner's own log. Tests assert the bool plus observable - state on the sync runner fake (``scheduled``, ``handler_errors``) - and on the destination channels (``pushes``).""" - - async def test_originating_returns_true(self) -> None: - _, _, _, ctx, runner = _make_host_with_two_channels() - req = ChannelRequest(channel="responses", operation="op", input="x") - include_originating = await ctx.deliver_response(req, _make_reply("reply")) - assert include_originating is True - assert runner.scheduled == [] - - async def test_none_suppresses_everything(self) -> None: - _, _, _, ctx, runner = _make_host_with_two_channels() - req = ChannelRequest( - channel="responses", - operation="op", - input="x", - response_target=ResponseTarget.none, # type: ignore[attr-defined] - ) - include_originating = await ctx.deliver_response(req, _make_reply("reply")) - assert include_originating is False - assert runner.scheduled == [] - - async def test_active_pushes_to_other_channel(self) -> None: - host, _a, b, ctx, runner = _make_host_with_two_channels() - # Alice was last seen on telegram. - _record_identity_on(host, "alice", "telegram", "42") - # Now she sends a message via responses; ResponseTarget.active should - # push to telegram, not back to responses. 
- req = ChannelRequest( - channel="responses", - operation="op", - input="x", - session=ChannelSession(isolation_key="alice"), - response_target=ResponseTarget.active, # type: ignore[attr-defined] - ) - include_originating = await ctx.deliver_response(req, _make_reply("reply")) - assert include_originating is False - assert len(runner.scheduled) == 1 - assert b.pushes and b.pushes[0][0].native_id == "42" - - async def test_active_falls_back_to_originating_when_self(self) -> None: - host, _a, _b, ctx, runner = _make_host_with_two_channels() - _record_identity_on(host, "alice", "responses", "user:1") - req = ChannelRequest( - channel="responses", - operation="op", - input="x", - session=ChannelSession(isolation_key="alice"), - response_target=ResponseTarget.active, # type: ignore[attr-defined] - ) - include_originating = await ctx.deliver_response(req, _make_reply("reply")) - assert include_originating is True - assert runner.scheduled == [] - - async def test_channels_with_unknown_identity_falls_back_to_originating(self) -> None: - _, _, _, ctx, runner = _make_host_with_two_channels() - # No prior identity seeded for telegram on alice. - req = ChannelRequest( - channel="responses", - operation="op", - input="x", - session=ChannelSession(isolation_key="alice"), - response_target=ResponseTarget.channel("telegram"), - ) - include_originating = await ctx.deliver_response(req, _make_reply("reply")) - # Skipped at resolution → fallback to originating so the user - # still gets a reply. 
- assert include_originating is True - assert runner.scheduled == [] - - async def test_channels_with_explicit_native_id_token(self) -> None: - _, _, b, ctx, runner = _make_host_with_two_channels() - req = ChannelRequest( - channel="responses", - operation="op", - input="x", - response_target=ResponseTarget.channel("telegram:99"), - ) - include_originating = await ctx.deliver_response(req, _make_reply("reply")) - assert include_originating is False - assert len(runner.scheduled) == 1 - assert b.pushes[0][0].native_id == "99" - - async def test_channels_originating_pseudo_includes_origin(self) -> None: - host, _a, _b, ctx, runner = _make_host_with_two_channels() - _record_identity_on(host, "alice", "telegram", "42") - req = ChannelRequest( - channel="responses", - operation="op", - input="x", - session=ChannelSession(isolation_key="alice"), - response_target=ResponseTarget.channels(["originating", "telegram"]), - ) - include_originating = await ctx.deliver_response(req, _make_reply("reply")) - assert include_originating is True - assert len(runner.scheduled) == 1 - - async def test_channels_unknown_channel_name_falls_back(self) -> None: - _, _, _, ctx, runner = _make_host_with_two_channels() - req = ChannelRequest( - channel="responses", - operation="op", - input="x", - response_target=ResponseTarget.channel("nope"), - ) - include_originating = await ctx.deliver_response(req, _make_reply("reply")) - assert include_originating is True # fallback - assert runner.scheduled == [] - - async def test_no_push_capability_falls_back(self) -> None: - agent = _FakeAgent() - a = _RecordingChannel(name="responses", path="/r") - b = _NoPushChannel(name="nopush", path="/n") - host = AgentFrameworkHost(target=agent, channels=[a, b]) - _ = host.app - assert a.context is not None - # Pre-seed identity on the no-push channel so we get past the - # identity check and hit the ChannelPush check. 
-        host._identities.setdefault("alice", {})["nopush"] = ChannelIdentity(channel="nopush", native_id="42")
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="x",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.channel("nopush"),
-        )
-        include_originating = await a.context.deliver_response(req, _make_reply("reply"))
-        assert include_originating is True  # fallback
-
-    async def test_all_linked_pushes_to_every_other_channel(self) -> None:
-        host, _a, b, ctx, runner = _make_host_with_two_channels()
-        # Alice on responses (originating) and telegram.
-        host._identities.setdefault("alice", {})
-        host._identities["alice"]["responses"] = ChannelIdentity(channel="responses", native_id="user:1")
-        host._identities["alice"]["telegram"] = ChannelIdentity(channel="telegram", native_id="42")
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="x",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.all_linked,  # type: ignore[attr-defined]
-        )
-        include_originating = await ctx.deliver_response(req, _make_reply("reply"))
-        assert include_originating is True
-        assert len(runner.scheduled) == 1
-        assert b.pushes and b.pushes[0][1].result.text == "reply"
-
-    async def test_all_linked_no_other_channels_falls_back(self) -> None:
-        _host, _a, _b, ctx, runner = _make_host_with_two_channels()
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="x",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.all_linked,  # type: ignore[attr-defined]
-        )
-        include_originating = await ctx.deliver_response(req, _make_reply("reply"))
-        assert include_originating is True
-        assert runner.scheduled == []
-
-    async def test_identities_variant_preserves_attributes(self) -> None:
-        """``ResponseTarget.identities([...])`` plumbs full
-        :class:`ChannelIdentity` objects through resolution, preserving
-        ``attributes`` for destination channels that need conversation/
-        thread metadata (Teams, Slack, Bot Framework)."""
-        _, _, b, ctx, runner = _make_host_with_two_channels()
-        ident = ChannelIdentity(
-            channel="telegram",
-            native_id="42",
-            attributes={"thread_id": "t1", "service_url": "https://x"},
-        )
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="x",
-            response_target=ResponseTarget.identity(ident),
-        )
-        include_originating = await ctx.deliver_response(req, _make_reply("reply"))
-        assert include_originating is False
-        assert len(runner.scheduled) == 1
-        # The destination identity arrived at push with attributes intact.
-        pushed_identity = b.pushes[0][0]
-        assert pushed_identity.native_id == "42"
-        assert dict(pushed_identity.attributes) == {"thread_id": "t1", "service_url": "https://x"}
-
-    async def test_identities_pointing_to_originating_includes_origin(self) -> None:
-        """An identity whose channel matches the originating channel
-        folds into ``include_originating`` rather than double-delivering
-        via push."""
-        _, _, _, ctx, runner = _make_host_with_two_channels()
-        ident = ChannelIdentity(channel="responses", native_id="user:1")
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="x",
-            response_target=ResponseTarget.identities([ident]),
-        )
-        include_originating = await ctx.deliver_response(req, _make_reply("reply"))
-        assert include_originating is True
-        assert runner.scheduled == []
-
-    async def test_handler_exception_does_not_change_return_value(self) -> None:
-        """When ``ChannelPush.push`` raises *inside the runner handler*
-        the originating channel still sees the same return value —
-        ``DurableTaskRunner.schedule`` accepted the work, and downstream
-        delivery outcome is owned by the runner (it logs and retries
-        per the configured ``RetryPolicy``)."""
-        host, _a, b, ctx, runner = _make_host_with_two_channels()
-        b._push_raises = RuntimeError("boom")  # type: ignore[attr-defined]
-        host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42")
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="x",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.channel("telegram"),
-        )
-        include_originating = await ctx.deliver_response(req, _make_reply("reply"))
-        # Schedule succeeded → the return value is unaffected by a
-        # downstream handler failure.
-        assert include_originating is False
-        assert len(runner.scheduled) == 1
-        # Handler raised — runner captured the error (the real runner
-        # would retry it; the sync fake records it).
-        assert runner.handler_errors and isinstance(runner.handler_errors[0], RuntimeError)
-        assert str(runner.handler_errors[0]) == "boom"
-
-    async def test_schedule_exception_falls_back_to_originating(self) -> None:
-        """When :meth:`DurableTaskRunner.schedule` itself raises (the
-        runner backend is unreachable) the destination is treated as
-        skipped — same outcome as any other resolution-time drop. The
-        host's fall-back-to-originating rule then ensures the user
-        still gets a reply rather than being left without one."""
-        host, _a, _b, ctx, runner = _make_host_with_two_channels()
-        runner.schedule_raises = RuntimeError("runner backend unreachable")
-        host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42")
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="x",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.channel("telegram"),
-        )
-        include_originating = await ctx.deliver_response(req, _make_reply("reply"))
-        # Schedule raised → no scheduled tasks, fall back to originating.
-        assert runner.scheduled == []
-        assert include_originating is True
-
-    async def test_echo_input_pushes_user_message_then_response(self) -> None:
-        """``echo_input=True`` triggers two pushes per destination,
-        bundled into the same scheduled task: the originating user
-        message first, then the agent reply. Channels downstream of a
-        workflow that emits to multiple channels need this to keep
-        their UI state coherent with the user's actual prompt."""
-        host, _a, b, ctx, runner = _make_host_with_two_channels()
-        host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42")
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="hello there",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.channel("telegram", echo_input=True),
-        )
-        include_originating = await ctx.deliver_response(req, _make_reply("reply"))
-        assert include_originating is False
-        # One scheduled task per destination; the handler does echo then response inline.
-        assert len(runner.scheduled) == 1
-        _, payload = runner.scheduled[0]
-        assert payload["echo_result"] is not None
-        # Two pushes landed on the channel: echo first, then response.
-        assert len(b.pushes) == 2
-        echo_identity, echo_payload = b.pushes[0]
-        assert echo_identity.native_id == "42"
-        assert echo_payload.result.text == "hello there"
-        assert str(echo_payload.result.messages[0].role) == "user"
-        resp_identity, resp_payload = b.pushes[1]
-        assert resp_identity.native_id == "42"
-        assert resp_payload.result.text == "reply"
-        assert str(resp_payload.result.messages[0].role) == "assistant"
-
-    async def test_echo_input_failure_does_not_block_response(self) -> None:
-        """An echo push that raises inside the handler is logged and
-        swallowed; the response push must still be attempted on the
-        same destination so the user-visible failure mode is
-        "response delivered without echo" rather than "no response at
-        all"."""
-        agent = _FakeAgent()
-        a = _RecordingChannel(name="responses", path="/r")
-        b = _RecordingChannel(name="telegram", path="/t")
-        runner = _SyncTaskRunner()
-        host = AgentFrameworkHost(target=agent, channels=[a, b], durable_task_runner=runner)
-        _ = host.app
-        assert a.context is not None
-
-        host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42")
-
-        # Make the FIRST push (echo) raise, but the SECOND (response) succeed.
-        calls = {"n": 0}
-        real_push = b.push
-
-        async def flaky_push(identity: ChannelIdentity, payload: HostedRunResult[Any]) -> None:
-            calls["n"] += 1
-            if calls["n"] == 1:
-                raise RuntimeError("echo down")
-            await real_push(identity, payload)
-
-        b.push = flaky_push  # type: ignore[method-assign]
-
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="hi",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.channel("telegram", echo_input=True),
-        )
-        include_originating = await a.context.deliver_response(req, _make_reply("reply"))
-        # Schedule succeeded; handler swallowed the echo failure and
-        # the response push landed on the channel.
-        assert include_originating is False
-        assert b.pushes and b.pushes[0][1].result.text == "reply"
-        # Handler did not raise (echo failure was swallowed inside
-        # the handler), so the runner saw no error.
-        assert runner.handler_errors == []
-
-    async def test_echo_idempotent_on_retry(self) -> None:
-        """When the response push fails on a retried task, the handler
-        must NOT re-deliver the echo if a prior attempt already
-        succeeded. The ``echo_done`` cursor on the payload mapping is
-        the host's idempotency primitive; this test invokes the
-        handler directly twice with the same payload to exercise the
-        retry semantics."""
-        host, _a, b, ctx, runner = _make_host_with_two_channels()
-        host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42")
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="hi",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.channel("telegram", echo_input=True),
-        )
-        # First scheduled invocation — echo + response both succeed.
-        await ctx.deliver_response(req, _make_reply("reply"))
-        assert len(b.pushes) == 2  # echo + response
-        # Simulate a retry: invoke the handler again with the same
-        # payload mapping (the in-process runner reuses the mapping
-        # across retries). After the first run ``echo_done`` was
-        # mutated to ``True``; the second run must skip the echo.
-        _, payload = runner.scheduled[0]
-        assert payload["echo_done"] is True
-        await host._handle_push_task(payload)
-        # Only one more push (the response) — the echo was skipped.
-        assert len(b.pushes) == 3
-        assert str(b.pushes[2][1].result.messages[0].role) == "assistant"
-
-
-# --------------------------------------------------------------------------- #
-# Response hook + multi-modal payload + clone-on-fan-out #
-# --------------------------------------------------------------------------- #
-
-
-class TestResponseHookFanOut:
-    async def test_response_hook_applied_per_destination(self) -> None:
-        """Channels with a ``response_hook`` attribute see their hook
-        applied before push, with a ``ChannelResponseContext`` carrying
-        the destination identity, the originating request, and an
-        ``is_echo`` flag."""
-        agent = _FakeAgent()
-        a = _RecordingChannel(name="responses", path="/r")
-        b = _RecordingChannel(name="telegram", path="/t")
-
-        seen: list[tuple[str, str, bool]] = []
-
-        async def telegram_hook(
-            result: HostedRunResult[AgentResponse],
-            *,
-            context: Any,
-            **_: Any,
-        ) -> HostedRunResult[AgentResponse]:
-            seen.append((context.channel_name, context.destination_identity.native_id, context.is_echo))
-            return result.replace(
-                result=AgentResponse(
-                    messages=[Message(role="assistant", contents=[Content.from_text("[hooked] " + result.result.text)])]
-                ),
-            )
-
-        b.response_hook = telegram_hook  # type: ignore[attr-defined]
-        host = AgentFrameworkHost(target=agent, channels=[a, b], durable_task_runner=_SyncTaskRunner())
-        _ = host.app
-        assert a.context is not None
-
-        host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42")
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="hi",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.channel("telegram"),
-        )
-        report = await a.context.deliver_response(req, _make_reply("reply"))
-        assert report is False
-        # The pushed payload reflects the hook's transform.
-        assert b.pushes[0][1].result.text == "[hooked] reply"
-        assert seen == [("telegram", "42", False)]
-
-    async def test_response_hook_mutation_isolated_per_destination(self) -> None:
-        """A hook that rebinds ``result`` on its payload must NOT affect
-        the payload another destination sees. The host clones the
-        envelope before each hook invocation so a per-destination
-        :meth:`HostedRunResult.replace` cannot leak across destinations."""
-        agent = _FakeAgent()
-        a = _RecordingChannel(name="responses", path="/r")
-        b = _RecordingChannel(name="telegram", path="/t")
-        c = _RecordingChannel(name="extra", path="/x")
-
-        async def hook_that_rebinds(result: HostedRunResult[AgentResponse], **_: Any) -> HostedRunResult[AgentResponse]:
-            # Naughty hook: rebind ``result`` to a fresh AgentResponse.
-            # Host's per-destination clone via ``replace()`` makes this safe
-            # for sibling destinations.
-            return result.replace(result=AgentResponse(messages=[]))
-
-        b.response_hook = hook_that_rebinds  # type: ignore[attr-defined]
-        host = AgentFrameworkHost(target=agent, channels=[a, b, c], durable_task_runner=_SyncTaskRunner())
-        _ = host.app
-        assert a.context is not None
-
-        host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42")
-        host._identities["alice"]["extra"] = ChannelIdentity(channel="extra", native_id="9")
-
-        original = _make_reply("reply")
-        original_result_snapshot = original.result
-
-        req = ChannelRequest(
-            channel="responses",
-            operation="op",
-            input="hi",
-            session=ChannelSession(isolation_key="alice"),
-            response_target=ResponseTarget.channels(["telegram", "extra"]),
-        )
-        report = await a.context.deliver_response(req, original)
-        assert report is False
-        # The rebind on the telegram clone must not have touched the
-        # original envelope, nor the extra channel's view.
-        assert original.result is original_result_snapshot
-        # ``extra`` channel saw the original-shaped payload.
- extra_push = next(p for p in c.pushes) - assert extra_push[1].result.text == "reply" - - async def test_response_hook_fires_on_echo_with_is_echo_true(self) -> None: - """When ``echo_input`` is set, the channel's response_hook fires - TWICE per destination — once for the echo (is_echo=True), once - for the response (is_echo=False).""" - agent = _FakeAgent() - a = _RecordingChannel(name="responses", path="/r") - b = _RecordingChannel(name="telegram", path="/t") - - phases: list[bool] = [] - - async def telegram_hook( - result: HostedRunResult[AgentResponse], *, context: Any, **_: Any - ) -> HostedRunResult[AgentResponse]: - phases.append(context.is_echo) - return result - - b.response_hook = telegram_hook # type: ignore[attr-defined] - host = AgentFrameworkHost(target=agent, channels=[a, b], durable_task_runner=_SyncTaskRunner()) - _ = host.app - assert a.context is not None - - host._identities.setdefault("alice", {})["telegram"] = ChannelIdentity(channel="telegram", native_id="42") - req = ChannelRequest( - channel="responses", - operation="op", - input="hi", - session=ChannelSession(isolation_key="alice"), - response_target=ResponseTarget.channel("telegram", echo_input=True), - ) - await a.context.deliver_response(req, _make_reply("reply")) - assert phases == [True, False] - - # --------------------------------------------------------------------------- # # HostedRunResult — generic typed envelope # # --------------------------------------------------------------------------- # @@ -1522,7 +970,7 @@ async def test_bind_held_open_until_stream_exhaustion(self) -> None: stream=True, attributes={"response_id": "resp_stream"}, ) - stream = ch.context.run_stream(req) + stream = await ch.context.run_stream(req) # As soon as run_stream returns, the binding must already be open # so any provider work that happens during iteration sees it. 
@@ -1574,7 +1022,7 @@ async def test_get_final_response_closes_binding(self) -> None: stream=True, attributes={"response_id": "resp_get_final"}, ) - stream = ch.context.run_stream(req) + stream = await ch.context.run_stream(req) # Skip iteration and go straight to ``get_final_response``; # the adapter must drain the inner stream itself and close # the binding in ``finally``. @@ -1599,7 +1047,7 @@ async def test_double_close_is_idempotent(self) -> None: stream=True, attributes={"response_id": "resp_idem"}, ) - stream = ch.context.run_stream(req) + stream = await ch.context.run_stream(req) async for _u in stream: pass # Iteration's finally already closed; an explicit ``aclose`` @@ -1627,7 +1075,7 @@ async def test_aclose_releases_binding_when_stream_abandoned(self) -> None: stream=True, attributes={"response_id": "resp_abandon"}, ) - stream = ch.context.run_stream(req) + stream = await ch.context.run_stream(req) await stream.aclose() # type: ignore[attr-defined] # Binding released without iterating. @@ -1655,7 +1103,7 @@ async def test_getattr_forwards_to_inner_stream(self) -> None: stream=True, attributes={"response_id": "resp_getattr"}, ) - stream = ch.context.run_stream(req) + stream = await ch.context.run_stream(req) # ``with_result_hook`` is a real method on ``ResponseStream``; # if forwarding broke this would AttributeError. 
try: @@ -1682,7 +1130,7 @@ async def test_await_path_routes_through_get_final_response(self) -> None: stream=True, attributes={"response_id": "resp_await"}, ) - stream = ch.context.run_stream(req) + stream = await ch.context.run_stream(req) final = await stream # exercises __await__ assert final.text == "chunk-1chunk-2" names = [n for n, _ in prov.events] diff --git a/python/packages/hosting/tests/test_host_disk.py b/python/packages/hosting/tests/test_host_disk.py index 5ed27b67aab..abcbf7397ba 100644 --- a/python/packages/hosting/tests/test_host_disk.py +++ b/python/packages/hosting/tests/test_host_disk.py @@ -1,32 +1,19 @@ # Copyright (c) Microsoft. All rights reserved. -"""Tests for ``state_dir`` wired through :class:`AgentFrameworkHost`.""" +"""Tests for narrowed ``state_dir`` support in :class:`AgentFrameworkHost`.""" from __future__ import annotations -import asyncio from pathlib import Path from typing import Any import pytest -from agent_framework_hosting import ( - AgentFrameworkHost, - ChannelContext, - ChannelContribution, - ChannelIdentity, - LinkChallenge, -) +from agent_framework_hosting import AgentFrameworkHost, ChannelContext, ChannelContribution -# Skip the whole module when the optional disk extra isn't installed. 
pytest.importorskip("diskcache") -# --------------------------------------------------------------------------- # -# Test helpers # -# --------------------------------------------------------------------------- # - - class _AgentStub: """Bare-minimum SupportsAgentRun stub for host construction.""" @@ -42,65 +29,22 @@ def contribute(self, _context: ChannelContext) -> ChannelContribution: return ChannelContribution() -class _NonConfigurableLinker: - async def resolve(self, _identity: ChannelIdentity) -> LinkChallenge: - return LinkChallenge("link") - - -class _ConfigurableLinker: - def __init__(self) -> None: - self.configured_path: Path | None = None - - def configure_link_store_path(self, path: str | Path) -> None: - self.configured_path = Path(path) - - async def resolve(self, _identity: ChannelIdentity) -> LinkChallenge: - return LinkChallenge("link") - - def _close_host_disk(host: AgentFrameworkHost) -> None: - """Mirror the lifespan shutdown ordering for tests that simulate restart. - - The real shutdown order is ``runner.shutdown()`` → ``sessions_store.close()``; - both release their advisory file locks so a second host can take ownership. - """ - runner = host._durable_task_runner - try: - asyncio.get_event_loop().run_until_complete(runner.shutdown(timeout=1.0)) - except RuntimeError: - # No running loop; spin up a throw-away one. 
- asyncio.run(runner.shutdown(timeout=1.0)) + """Release any session-alias store held by ``host``.""" if host._sessions_store is not None: host._sessions_store.close() -# --------------------------------------------------------------------------- # -# state_dir=None preserves the in-memory contract # -# --------------------------------------------------------------------------- # - - -def test_state_dir_none_keeps_plain_dicts(tmp_path: Path) -> None: - """No store, no sessions persistence, no files written.""" +def test_state_dir_none_keeps_plain_alias_dict(tmp_path: Path) -> None: + """No store, no alias persistence, no files written.""" host = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()]) - try: - assert host._sessions_store is None - assert isinstance(host._session_aliases, dict) - assert isinstance(host._active, dict) - assert isinstance(host._identities, dict) - # No accidental disk writes anywhere under tmp_path. - assert list(tmp_path.iterdir()) == [] - finally: - # Nothing to close. 
- pass + assert host._sessions_store is None + assert isinstance(host._session_aliases, dict) + assert list(tmp_path.iterdir()) == [] -# --------------------------------------------------------------------------- # -# Single string state_dir creates default subfolders # -# --------------------------------------------------------------------------- # - - -def test_string_state_dir_creates_subfolders(tmp_path: Path) -> None: - """Passing a single path expands to ``runner/`` and ``sessions/``.""" +def test_string_state_dir_creates_sessions_subfolder_only(tmp_path: Path) -> None: + """Passing a single path expands to ``sessions/`` plus lazy checkpoint path.""" host = AgentFrameworkHost( target=_AgentStub(), channels=[_ChannelStub()], @@ -108,98 +52,40 @@ def test_string_state_dir_creates_subfolders(tmp_path: Path) -> None: ) try: assert host._sessions_store is not None - assert (tmp_path / "runner").is_dir() assert (tmp_path / "sessions").is_dir() + assert not (tmp_path / "runner").exists() + assert not (tmp_path / "links").exists() + # Checkpoint path is derived but not created for agent targets. 
+ assert not (tmp_path / "checkpoints").exists() finally: _close_host_disk(host) -# --------------------------------------------------------------------------- # -# Per-component override via HostStatePaths-shaped dict # -# --------------------------------------------------------------------------- # - - -def test_per_component_paths(tmp_path: Path) -> None: - """Dict form lets the caller route components to different roots.""" - runner_dir = tmp_path / "tasks" +def test_per_component_session_path(tmp_path: Path) -> None: + """Dict form lets callers route session aliases to a specific root.""" sessions_dir = tmp_path / "state" host = AgentFrameworkHost( target=_AgentStub(), channels=[_ChannelStub()], - state_dir={"runner": runner_dir, "sessions": sessions_dir}, + state_dir={"sessions": sessions_dir}, ) try: - assert runner_dir.is_dir() assert sessions_dir.is_dir() - # Default subfolders should NOT exist when the caller provides - # explicit overrides. - assert not (tmp_path / "runner").is_dir() or runner_dir == (tmp_path / "runner") - assert not (tmp_path / "sessions").is_dir() or sessions_dir == (tmp_path / "sessions") + assert host._sessions_store is not None + assert host._checkpoint_location is None finally: _close_host_disk(host) -def test_unknown_component_key_raises(tmp_path: Path) -> None: - """Misspelled keys should fail loudly so the user catches typos.""" +@pytest.mark.parametrize("key", ["runner", "links", "active", "identities"]) +def test_removed_state_dir_component_keys_raise(tmp_path: Path, key: str) -> None: + """Obsolete follow-up components should fail loudly instead of becoming no-ops.""" with pytest.raises(ValueError, match="unknown"): AgentFrameworkHost( target=_AgentStub(), channels=[_ChannelStub()], - state_dir={"runnerr": tmp_path / "x"}, # type: ignore[dict-item] - ) - - -def test_links_state_path_configures_compatible_identity_linker(tmp_path: Path) -> None: - """``state_dir['links']`` is offered to linkers that accept host-owned 
persistence.""" - linker = _ConfigurableLinker() - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub()], - identity_linker=linker, - state_dir=tmp_path, - ) - try: - assert linker.configured_path == tmp_path / "links" - finally: - _close_host_disk(host) - - -def test_explicit_links_state_path_without_linker_warns(tmp_path: Path, caplog: pytest.LogCaptureFixture) -> None: - """Explicit ``links`` path with no linker is almost certainly dead config.""" - with caplog.at_level("WARNING", logger="agent_framework.hosting"): - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub()], - state_dir={"links": tmp_path / "links"}, + state_dir={key: tmp_path / key}, # type: ignore[dict-item] ) - try: - assert any( - "state_dir['links']" in rec.message and "no identity_linker" in rec.message for rec in caplog.records - ) - finally: - _close_host_disk(host) - - -def test_links_state_path_with_nonconfigurable_linker_warns(tmp_path: Path, caplog: pytest.LogCaptureFixture) -> None: - """A linker that owns its persistence directly gets a clear warning.""" - with caplog.at_level("WARNING", logger="agent_framework.hosting"): - host = AgentFrameworkHost( - target=_AgentStub(), - channels=[_ChannelStub()], - identity_linker=_NonConfigurableLinker(), - state_dir={"links": tmp_path / "links"}, - ) - try: - assert any( - "state_dir['links']" in rec.message and "SupportsLinkStorePath" in rec.message for rec in caplog.records - ) - finally: - _close_host_disk(host) - - -# --------------------------------------------------------------------------- # -# Session bookkeeping survives a host restart # -# --------------------------------------------------------------------------- # def test_session_aliases_survive_restart(tmp_path: Path) -> None: @@ -219,84 +105,6 @@ def test_session_aliases_survive_restart(tmp_path: Path) -> None: _close_host_disk(host2) -def test_active_channel_survives_restart(tmp_path: Path) -> None: - """``_active`` must round-trip 
through the store."""
-    state_dir = tmp_path / "state"
-
-    host1 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir)
-    host1._active["user-1"] = "telegram"
-    host1._active["user-2"] = "responses"
-    _close_host_disk(host1)
-
-    host2 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir)
-    try:
-        assert host2._active["user-1"] == "telegram"
-        assert host2._active["user-2"] == "responses"
-    finally:
-        _close_host_disk(host2)
-
-
-def test_identities_nested_mutation_survives_restart(tmp_path: Path) -> None:
-    """Setting ``self._identities[ik][channel] = identity`` must persist.
-
-    This exercises the proxy-inner-dict ``__setitem__`` write-through path,
-    not just the outer-key replacement path.
-    """
-    state_dir = tmp_path / "state"
-
-    host1 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir)
-    ident_tg = ChannelIdentity("telegram", "tg-123", {"username": "alice"})
-    ident_rsp = ChannelIdentity("responses", "rsp-456")
-    # Mirrors the host-internal path in ``_register_identity``.
-    host1._identities.setdefault("user-1", {})["telegram"] = ident_tg
-    host1._identities.setdefault("user-1", {})["responses"] = ident_rsp
-    host1._identities.setdefault("user-2", {})["telegram"] = ChannelIdentity("telegram", "tg-789")
-    _close_host_disk(host1)
-
-    host2 = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()], state_dir=state_dir)
-    try:
-        u1 = host2._identities["user-1"]
-        assert set(u1.keys()) == {"telegram", "responses"}
-        assert u1["telegram"].native_id == "tg-123"
-        assert u1["telegram"].attributes["username"] == "alice"
-        assert u1["responses"].native_id == "rsp-456"
-        assert host2._identities["user-2"]["telegram"].native_id == "tg-789"
-    finally:
-        _close_host_disk(host2)
-
-
-# --------------------------------------------------------------------------- #
-# Explicit durable_task_runner + state_dir['runner'] warns #
-# --------------------------------------------------------------------------- #
-
-
-def test_explicit_runner_with_runner_state_warns(tmp_path: Path, caplog: pytest.LogCaptureFixture) -> None:
-    """Caller-owned runner + state_dir['runner'] → ignore + warn."""
-    from agent_framework_hosting import InProcessTaskRunner
-
-    user_runner = InProcessTaskRunner()
-    try:
-        with caplog.at_level("WARNING"):
-            host = AgentFrameworkHost(
-                target=_AgentStub(),
-                channels=[_ChannelStub()],
-                durable_task_runner=user_runner,
-                allow_in_process_runner=True,
-                state_dir={"runner": tmp_path / "runner"},
-            )
-        assert any("state_dir['runner']" in rec.message for rec in caplog.records)
-        # Sessions store wasn't requested, so still None.
-        assert host._sessions_store is None
-    finally:
-        # user_runner has no disk state, so nothing else to clean up.
- pass - - -# --------------------------------------------------------------------------- # -# Workflow checkpoint integration # -# --------------------------------------------------------------------------- # - - def _build_simple_workflow() -> Any: """Build a no-op workflow for checkpoint-wiring tests.""" from tests._workflow_fixtures import build_upper_workflow @@ -313,7 +121,6 @@ def test_single_path_state_dir_wires_workflow_checkpoints(tmp_path: Path) -> Non state_dir=tmp_path, ) try: - # Checkpoint location is derived from the single state_dir. assert host._checkpoint_location == tmp_path / "checkpoints" finally: _close_host_disk(host) @@ -330,7 +137,6 @@ def test_mapping_state_dir_checkpoints_key_wires_workflow_checkpoints(tmp_path: ) try: assert host._checkpoint_location == ckpt_dir - # No diskcache components were requested. assert host._sessions_store is None finally: _close_host_disk(host) @@ -342,9 +148,7 @@ def test_mapping_state_dir_omits_checkpoints_for_workflow(tmp_path: Path) -> Non host = AgentFrameworkHost( target=workflow, channels=[_ChannelStub()], - # No 'checkpoints' key → no checkpoint persistence even though - # other components are persisted. - state_dir={"runner": tmp_path / "r", "sessions": tmp_path / "s"}, + state_dir={"sessions": tmp_path / "s"}, ) try: assert host._checkpoint_location is None @@ -381,7 +185,6 @@ def test_state_dir_checkpoints_for_agent_target_silent_for_single_path(tmp_path: ) try: assert host._checkpoint_location is None - # ``checkpoints/`` subfolder is not eagerly created (no consumer). 
assert not (tmp_path / "checkpoints").exists() finally: _close_host_disk(host) @@ -390,7 +193,7 @@ def test_state_dir_checkpoints_for_agent_target_silent_for_single_path(tmp_path: def test_state_dir_checkpoints_for_agent_target_warns_when_explicit( tmp_path: Path, caplog: pytest.LogCaptureFixture ) -> None: - """Mapping form with ``checkpoints`` + agent target → warn (dead config).""" + """Mapping form with ``checkpoints`` + agent target → warn.""" with caplog.at_level("WARNING", logger="agent_framework.hosting"): host = AgentFrameworkHost( target=_AgentStub(), diff --git a/python/packages/hosting/tests/test_isolation.py b/python/packages/hosting/tests/test_isolation.py index 4dc029f07b9..84fcd35e299 100644 --- a/python/packages/hosting/tests/test_isolation.py +++ b/python/packages/hosting/tests/test_isolation.py @@ -16,6 +16,7 @@ import asyncio +import pytest from starlette.requests import Request from starlette.responses import JSONResponse from starlette.routing import BaseRoute, Route @@ -168,7 +169,22 @@ async def run(self, *_args: object, **_kwargs: object) -> object: # pragma: no class TestIsolationMiddlewareEndToEnd: - def test_both_headers_lifted_into_contextvar(self) -> None: + def test_headers_ignored_outside_foundry_environment(self) -> None: + host, probe = _make_host_with_probe() + with TestClient(host.app) as client: # type: ignore[attr-defined] + r = client.get( + "/probe", + headers={ + ISOLATION_HEADER_USER: "alice-uid", + ISOLATION_HEADER_CHAT: "general-cid", + }, + ) + assert r.status_code == 200 + assert r.json() == {"user": None, "chat": None, "_present": False} + assert probe.captured == [None] + + def test_both_headers_lifted_into_contextvar(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") host, probe = _make_host_with_probe() with TestClient(host.app) as client: # type: ignore[attr-defined] r = client.get( @@ -186,15 +202,17 @@ def test_both_headers_lifted_into_contextvar(self) -> 
None: assert captured.user_key == "alice-uid" assert captured.chat_key == "general-cid" - def test_only_user_header_lifted(self) -> None: + def test_only_user_header_lifted(self, monkeypatch: pytest.MonkeyPatch) -> None: """One-header-only branch: the middleware still binds (chat=None).""" + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") host, probe = _make_host_with_probe() with TestClient(host.app) as client: # type: ignore[attr-defined] r = client.get("/probe", headers={ISOLATION_HEADER_USER: "alice-uid"}) assert r.status_code == 200 assert r.json() == {"user": "alice-uid", "chat": None} - def test_only_chat_header_lifted(self) -> None: + def test_only_chat_header_lifted(self, monkeypatch: pytest.MonkeyPatch) -> None: + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") host, probe = _make_host_with_probe() with TestClient(host.app) as client: # type: ignore[attr-defined] r = client.get("/probe", headers={ISOLATION_HEADER_CHAT: "general-cid"}) @@ -213,9 +231,10 @@ def test_no_headers_keeps_contextvar_none(self) -> None: assert r.json() == {"user": None, "chat": None, "_present": False} assert probe.captured == [None] - def test_empty_header_value_treated_as_absent(self) -> None: + def test_empty_header_value_treated_as_absent(self, monkeypatch: pytest.MonkeyPatch) -> None: """A header that's present but empty must not bind an empty key — ``IsolationContext`` rejects empty strings on the read side.""" + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") host, probe = _make_host_with_probe() with TestClient(host.app) as client: # type: ignore[attr-defined] r = client.get( @@ -229,10 +248,11 @@ def test_empty_header_value_treated_as_absent(self) -> None: # Empty user header decodes to None; chat key stays bound. 
assert r.json() == {"user": None, "chat": "general-cid"} - def test_contextvar_resets_after_request(self) -> None: + def test_contextvar_resets_after_request(self, monkeypatch: pytest.MonkeyPatch) -> None: """The middleware must call ``reset_current_isolation_keys`` in a ``finally`` so per-request state never leaks across requests or back into the calling thread's context.""" + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") host, probe = _make_host_with_probe() with TestClient(host.app) as client: # type: ignore[attr-defined] r1 = client.get("/probe", headers={ISOLATION_HEADER_USER: "alice-uid"}) @@ -245,9 +265,10 @@ def test_contextvar_resets_after_request(self) -> None: r2 = client.get("/probe") assert r2.json() == {"user": None, "chat": None, "_present": False} - def test_concurrent_requests_get_isolated_contextvars(self) -> None: + def test_concurrent_requests_get_isolated_contextvars(self, monkeypatch: pytest.MonkeyPatch) -> None: """Different requests run in different async contexts; binding from request A must NOT leak into a concurrent request B.""" + monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "1") host, probe = _make_host_with_probe() async def _drive() -> None: diff --git a/python/packages/hosting/tests/test_runner.py b/python/packages/hosting/tests/test_runner.py deleted file mode 100644 index bee7e097b0a..00000000000 --- a/python/packages/hosting/tests/test_runner.py +++ /dev/null @@ -1,333 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. 
-
-"""Tests for :class:`InProcessTaskRunner` and runtime-mode auto-detection."""
-
-from __future__ import annotations
-
-import asyncio
-from collections.abc import Mapping
-from typing import Any
-
-import pytest
-
-from agent_framework_hosting import (
-    AgentFrameworkHost,
-    ChannelContext,
-    ChannelContribution,
-    DurableTaskPayloadMode,
-    InProcessTaskRunner,
-    RetryPolicy,
-    TaskHandle,
-)
-from agent_framework_hosting._host import _detect_runtime_mode
-
-# --------------------------------------------------------------------------- #
-# Test helpers #
-# --------------------------------------------------------------------------- #
-
-
-class _AgentStub:
-    """Bare-minimum SupportsAgentRun stub for host construction."""
-
-    async def run(self, *_args: Any, **_kwargs: Any) -> None: # pragma: no cover - unused
-        return None
-
-
-class _ChannelStub:
-    name = "stub"
-    path = "/stub"
-
-    def contribute(self, _context: ChannelContext) -> ChannelContribution:
-        return ChannelContribution()
-
-
-# --------------------------------------------------------------------------- #
-# Runtime-mode auto-detection #
-# --------------------------------------------------------------------------- #
-
-
-class TestRuntimeModeDetection:
-    """``_detect_runtime_mode`` is pure: tests pass a synthetic env so
-    they never depend on the test runner's environment. Auto-detected
-    mode + matched marker drive the per-host startup banner so operators
-    can confirm the host is running in the expected shape."""
-
-    def test_no_markers_defaults_to_long_running(self) -> None:
-        mode, marker = _detect_runtime_mode(env={})
-        assert mode == "long_running"
-        assert marker is None
-
-    def test_foundry_marker_selects_ephemeral(self) -> None:
-        mode, marker = _detect_runtime_mode(env={"FOUNDRY_HOSTING_ENVIRONMENT": "production"})
-        assert mode == "ephemeral"
-        assert marker == "FOUNDRY_HOSTING_ENVIRONMENT"
-
-    def test_azure_functions_marker_selects_ephemeral(self) -> None:
-        mode, marker = _detect_runtime_mode(env={"AZURE_FUNCTIONS_ENVIRONMENT": "Development"})
-        assert mode == "ephemeral"
-        assert marker == "AZURE_FUNCTIONS_ENVIRONMENT"
-
-    def test_lambda_marker_selects_ephemeral(self) -> None:
-        mode, marker = _detect_runtime_mode(env={"AWS_LAMBDA_FUNCTION_NAME": "my-fn"})
-        assert mode == "ephemeral"
-        assert marker == "AWS_LAMBDA_FUNCTION_NAME"
-
-    def test_empty_marker_value_ignored(self) -> None:
-        # Empty-string env var should not count as "set" — Foundry's
-        # template uses unset-or-empty as "not deployed".
-        mode, marker = _detect_runtime_mode(env={"FOUNDRY_HOSTING_ENVIRONMENT": ""})
-        assert mode == "long_running"
-        assert marker is None
-
-
-class TestHostRuntimeMode:
-    """``runtime_mode`` ctor argument overrides auto-detect; ``None``
-    triggers auto-detect. The detected mode is exposed via the
-    ``runtime_mode`` property for operator inspection (and is logged at
-    startup via ``_log_startup``)."""
-
-    def test_explicit_long_running(self) -> None:
-        host = AgentFrameworkHost(
-            target=_AgentStub(),
-            channels=[_ChannelStub()],
-            runtime_mode="long_running",
-        )
-        assert host.runtime_mode == "long_running"
-
-    def test_explicit_ephemeral_with_default_runner_raises(self) -> None:
-        # Default runner is in-process and not durable. Ephemeral
-        # deployments would silently lose pushes on scale-to-zero, so
-        # the host refuses the combination at construction unless the
-        # operator opts in explicitly via ``allow_in_process_runner``.
-        with pytest.raises(RuntimeError, match="ephemeral"):
-            AgentFrameworkHost(
-                target=_AgentStub(),
-                channels=[_ChannelStub()],
-                runtime_mode="ephemeral",
-            )
-
-    def test_explicit_ephemeral_with_in_process_opt_in_warns(self, caplog: pytest.LogCaptureFixture) -> None:
-        # The opt-in escape hatch keeps the old warn-and-proceed
-        # behaviour for local-dev / smoke-test scenarios that genuinely
-        # want ephemeral runtime semantics without a real durable
-        # backend.
-        with caplog.at_level("WARNING", logger="agent_framework.hosting"):
-            host = AgentFrameworkHost(
-                target=_AgentStub(),
-                channels=[_ChannelStub()],
-                runtime_mode="ephemeral",
-                allow_in_process_runner=True,
-            )
-        assert host.runtime_mode == "ephemeral"
-        assert any("ephemeral" in r.getMessage() and "InProcessTaskRunner" in r.getMessage() for r in caplog.records)
-
-    def test_explicit_ephemeral_with_supplied_runner_does_not_warn(self, caplog: pytest.LogCaptureFixture) -> None:
-        runner = InProcessTaskRunner()
-        with caplog.at_level("WARNING", logger="agent_framework.hosting"):
-            host = AgentFrameworkHost(
-                target=_AgentStub(),
-                channels=[_ChannelStub()],
-                runtime_mode="ephemeral",
-                durable_task_runner=runner,
-            )
-        # No warning — operator opted into a specific runner.
-        assert host.runtime_mode == "ephemeral"
-        assert host.durable_task_runner is runner
-        assert not any("ephemeral" in r.getMessage() for r in caplog.records)
-
-    def test_auto_detect_ephemeral_raises_without_opt_in(self, monkeypatch: pytest.MonkeyPatch) -> None:
-        # Auto-detected ephemeral flows through the same strict gate.
-        monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "production")
-        with pytest.raises(RuntimeError, match="ephemeral"):
-            AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()])
-
-    def test_auto_detect_ephemeral_with_opt_in_proceeds(self, monkeypatch: pytest.MonkeyPatch) -> None:
-        monkeypatch.setenv("FOUNDRY_HOSTING_ENVIRONMENT", "production")
-        host = AgentFrameworkHost(
-            target=_AgentStub(),
-            channels=[_ChannelStub()],
-            allow_in_process_runner=True,
-        )
-        assert host.runtime_mode == "ephemeral"
-
-    def test_default_runner_is_in_process_task_runner(self) -> None:
-        host = AgentFrameworkHost(target=_AgentStub(), channels=[_ChannelStub()])
-        assert isinstance(host.durable_task_runner, InProcessTaskRunner)
-
-
-# --------------------------------------------------------------------------- #
-# InProcessTaskRunner #
-# --------------------------------------------------------------------------- #
-
-
-class TestInProcessTaskRunner:
-    async def test_schedule_runs_handler_and_records_succeeded(self) -> None:
-        runner = InProcessTaskRunner()
-        seen: list[Mapping[str, Any]] = []
-
-        async def handler(payload: Mapping[str, Any]) -> None:
-            seen.append(payload)
-
-        runner.register("ping", handler)
-        handle = await runner.schedule("ping", {"x": 1})
-        # ``schedule`` returns immediately; the task runs on the loop.
-        # Drain explicitly via ``shutdown`` to flush in-flight work,
-        # then assert.
-        await _drain(runner, handle)
-        assert seen == [{"x": 1}]
-        assert await runner.get(handle) == "succeeded"
-
-    async def test_unknown_handler_raises_keyerror(self) -> None:
-        runner = InProcessTaskRunner()
-        with pytest.raises(KeyError):
-            await runner.schedule("missing", {})
-
-    async def test_register_after_start_raises(self) -> None:
-        runner = InProcessTaskRunner()
-
-        async def noop(_p: Mapping[str, Any]) -> None:
-            return None
-
-        runner.register("x", noop)
-        handle = await runner.schedule("x", {})
-        await _drain(runner, handle)
-        # Re-registering after the runner has started scheduling is
-        # rejected so in-flight tasks can't have their handler swapped
-        # out from under them.
-        with pytest.raises(RuntimeError, match="register"):
-            runner.register("y", noop)
-
-    async def test_handler_retried_then_succeeds(self) -> None:
-        runner = InProcessTaskRunner()
-        attempts = {"n": 0}
-
-        async def flaky(_p: Mapping[str, Any]) -> None:
-            attempts["n"] += 1
-            if attempts["n"] < 3:
-                raise RuntimeError(f"attempt {attempts['n']}")
-
-        runner.register("flaky", flaky)
-        # Tight retry policy so the test doesn't sleep visibly.
-        policy = RetryPolicy(max_attempts=5, initial_backoff_seconds=0.001, max_backoff_seconds=0.005)
-        handle = await runner.schedule("flaky", {}, retry_policy=policy)
-        await _drain(runner, handle)
-        assert attempts["n"] == 3
-        assert await runner.get(handle) == "succeeded"
-
-    async def test_handler_failure_records_failed_after_max_attempts(self) -> None:
-        runner = InProcessTaskRunner()
-
-        async def always_fails(_p: Mapping[str, Any]) -> None:
-            raise RuntimeError("nope")
-
-        runner.register("doomed", always_fails)
-        policy = RetryPolicy(max_attempts=2, initial_backoff_seconds=0.001)
-        handle = await runner.schedule("doomed", {}, retry_policy=policy)
-        await _drain(runner, handle)
-        assert await runner.get(handle) == "failed"
-
-    async def test_shutdown_cancels_pending_tasks(self) -> None:
-        runner = InProcessTaskRunner()
-        started = asyncio.Event()
-        cancelled = asyncio.Event()
-
-        async def long_running(_p: Mapping[str, Any]) -> None:
-            started.set()
-            try:
-                # Sleep longer than the test wait so shutdown can cancel.
-                await asyncio.sleep(5)
-            except asyncio.CancelledError:
-                cancelled.set()
-                raise
-
-        runner.register("long", long_running)
-        handle = await runner.schedule("long", {})
-        await asyncio.wait_for(started.wait(), timeout=1.0)
-        await runner.shutdown(timeout=1.0)
-        assert cancelled.is_set()
-        assert await runner.get(handle) == "cancelled"
-
-    async def test_shutdown_grace_drain_does_not_cancel_finishing_tasks(self) -> None:
-        """A short-lived task that completes within the grace window
-        must NOT receive a cancellation. The grace-period drain is the
-        graceful-shutdown contract — channels with goodbye-message
-        flushes rely on it."""
-        runner = InProcessTaskRunner()
-        cancelled = asyncio.Event()
-        completed = asyncio.Event()
-
-        async def quick(_p: Mapping[str, Any]) -> None:
-            try:
-                await asyncio.sleep(0.05)
-            except asyncio.CancelledError:
-                cancelled.set()
-                raise
-            completed.set()
-
-        runner.register("quick", quick)
-        handle = await runner.schedule("quick", {})
-        # Shutdown with a generous grace window relative to the task duration.
-        await runner.shutdown(timeout=1.0)
-        assert completed.is_set()
-        assert not cancelled.is_set()
-        assert await runner.get(handle) == "succeeded"
-
-    async def test_get_returns_none_for_unknown_handle(self) -> None:
-        runner = InProcessTaskRunner()
-        handle = TaskHandle(task_id="never-scheduled", name="x")
-        assert await runner.get(handle) is None
-
-    async def test_terminal_cache_evicts_oldest(self) -> None:
-        # Cache size of 2: drain three tasks in sequence, the first
-        # should age out by the time the third's terminal lands.
-        runner = InProcessTaskRunner(terminal_cache_size=2)
-
-        async def noop(_p: Mapping[str, Any]) -> None:
-            return None
-
-        runner.register("noop", noop)
-        h1 = await runner.schedule("noop", {})
-        await _drain(runner, h1)
-        h2 = await runner.schedule("noop", {})
-        await _drain(runner, h2)
-        h3 = await runner.schedule("noop", {})
-        await _drain(runner, h3)
-        # Oldest handle's terminal status should be evicted by now.
-        assert await runner.get(h1) is None
-        assert await runner.get(h2) == "succeeded"
-        assert await runner.get(h3) == "succeeded"
-
-    async def test_shutdown_is_safe_when_no_tasks_pending(self) -> None:
-        runner = InProcessTaskRunner()
-        # No-op shouldn't raise.
-        await runner.shutdown()
-
-    def test_payload_mode_defaults_to_object(self) -> None:
-        # The in-process runner passes live Python references through
-        # the payload — the host wires this attribute into its codec
-        # validator at startup. Durable adapters that persist payloads
-        # must override this to ``JSON`` so the host refuses to ship
-        # un-serialisable references.
-        runner = InProcessTaskRunner()
-        assert runner.payload_mode == DurableTaskPayloadMode.OBJECT
-
-
-# --------------------------------------------------------------------------- #
-# Helpers #
-# --------------------------------------------------------------------------- #
-
-
-async def _drain(runner: InProcessTaskRunner, handle: TaskHandle, *, timeout: float = 1.0) -> None:
-    """Wait for ``handle`` to reach a terminal state.
-
-    Polls ``get`` rather than reaching into runner internals so we exercise the
-    public surface from the test side too.
-    """
-    deadline = asyncio.get_event_loop().time() + timeout
-    while True:
-        status = await runner.get(handle)
-        if status in ("succeeded", "failed", "cancelled"):
-            return
-        if asyncio.get_event_loop().time() > deadline:
-            raise AssertionError(f"task {handle.task_id} did not reach terminal in {timeout}s; status={status}")
-        await asyncio.sleep(0.01)
diff --git a/python/packages/hosting/tests/test_runner_disk.py b/python/packages/hosting/tests/test_runner_disk.py
deleted file mode 100644
index db566e7e3c9..00000000000
--- a/python/packages/hosting/tests/test_runner_disk.py
+++ /dev/null
@@ -1,278 +0,0 @@
-# Copyright (c) Microsoft. All rights reserved.
-
-"""Tests for :class:`InProcessTaskRunner` disk persistence (``state_dir``)."""
-
-from __future__ import annotations
-
-import asyncio
-from collections.abc import Mapping
-from pathlib import Path
-from typing import Any
-
-import pytest
-
-from agent_framework_hosting import (
-    InProcessTaskRunner,
-    PushPayloadNotPicklable,
-    RetryPolicy,
-)
-
-# Skip the whole module if the optional diskcache dependency isn't installed.
-pytest.importorskip("diskcache")
-
-
-# --------------------------------------------------------------------------- #
-# state_dir=None preserves today's purely in-memory contract #
-# --------------------------------------------------------------------------- #
-
-
-async def test_state_dir_none_is_pure_memory(tmp_path: Path) -> None:
-    """No directory creation / no lock file when state_dir is omitted."""
-    runner = InProcessTaskRunner()
-    calls: list[Mapping[str, Any]] = []
-
-    async def handler(payload: Mapping[str, Any]) -> None:
-        calls.append(payload)
-
-    runner.register("echo", handler)
-    handle = await runner.schedule("echo", {"k": "v"})
-
-    # Wait for completion.
-    for _ in range(50):
-        if (await runner.get(handle)) == "succeeded":
-            break
-        await asyncio.sleep(0.01)
-    assert calls == [{"k": "v"}]
-    assert await runner.get(handle) == "succeeded"
-    # Confirm we didn't accidentally write to disk.
-    assert not (tmp_path / ".lock").exists()
-
-    await runner.shutdown()
-
-
-# --------------------------------------------------------------------------- #
-# Lock contention — two runners on the same dir refuse to coexist #
-# --------------------------------------------------------------------------- #
-
-
-async def test_two_runners_one_state_dir_raise(tmp_path: Path) -> None:
-    """Second runner construction must fail loudly, not silently corrupt."""
-    state_dir = tmp_path / "runner"
-    first = InProcessTaskRunner(state_dir=state_dir)
-    try:
-        with pytest.raises(RuntimeError, match="state lock"):
-            InProcessTaskRunner(state_dir=state_dir)
-    finally:
-        await first.shutdown()
-
-
-# --------------------------------------------------------------------------- #
-# Pickle failure raises eagerly, never silently downgrades #
-# --------------------------------------------------------------------------- #
-
-
-async def test_unpickleable_payload_raises(tmp_path: Path) -> None:
-    """Schedule must refuse payloads that can't survive a restart."""
-    runner = InProcessTaskRunner(state_dir=tmp_path / "runner")
-
-    async def handler(_: Mapping[str, Any]) -> None: ...
-
-    runner.register("echo", handler)
-    # Local lambdas / closures are the canonical unpicklable values.
-    with pytest.raises(PushPayloadNotPicklable):
-        await runner.schedule("echo", {"callback": lambda: None})
-    await runner.shutdown()
-
-
-# --------------------------------------------------------------------------- #
-# Resume — pending records replay on next process #
-# --------------------------------------------------------------------------- #
-
-
-async def test_pending_record_replays_on_resume(tmp_path: Path) -> None:
-    """Simulate a crash: first runner schedules but never starts running."""
-    state_dir = tmp_path / "runner"
-
-    # Process 1 — schedule a task, then "die" before the asyncio loop runs it.
-    runner1 = InProcessTaskRunner(state_dir=state_dir)
-    blocked = asyncio.Event()
-
-    async def slow(_: Mapping[str, Any]) -> None:
-        # Sleep so the task is observably still in flight when we shutdown.
-        await blocked.wait()
-
-    runner1.register("slow", slow)
-    handle = await runner1.schedule("slow", {"work": 1})
-    # Force a hard shutdown — leaves the in-flight task in 'pending' on disk.
-    await runner1.shutdown(timeout=0.1)
-
-    # Process 2 — fresh runner against same state_dir, register the handler,
-    # call resume. We expect the persisted record to be re-scheduled.
-    runner2 = InProcessTaskRunner(state_dir=state_dir)
-    seen: list[Mapping[str, Any]] = []
-
-    async def slow_resumed(payload: Mapping[str, Any]) -> None:
-        seen.append(dict(payload))
-
-    runner2.register("slow", slow_resumed)
-    replayed = await runner2.resume()
-    assert replayed == 1
-
-    # Give the resumed task time to run.
-    for _ in range(50):
-        if seen:
-            break
-        await asyncio.sleep(0.01)
-    assert seen == [{"work": 1}]
-    # Status is observable via the original handle.
-    assert await runner2.get(handle) == "succeeded"
-
-    await runner2.shutdown()
-
-
-# --------------------------------------------------------------------------- #
-# echo_done cursor survives restart #
-# --------------------------------------------------------------------------- #
-
-
-async def test_payload_mutation_survives_restart(tmp_path: Path) -> None:
-    """Handler-side payload mutations (echo_done) round-trip through disk."""
-    state_dir = tmp_path / "runner"
-    runner1 = InProcessTaskRunner(state_dir=state_dir)
-
-    # Handler sets echo_done and then blocks forever (simulating mid-flight crash).
-    handler_progress = asyncio.Event()
-
-    async def half_done(payload: Mapping[str, Any]) -> None:
-        # Mutate the payload to mark first phase complete.
-        payload["echo_done"] = True # type: ignore[index]
-        handler_progress.set()
-        # Sleep indefinitely so the asyncio task is still running at shutdown.
-        await asyncio.Event().wait()
-
-    runner1.register("two_phase", half_done)
-    handle = await runner1.schedule("two_phase", {"echo_done": False, "k": "v"})
-    await handler_progress.wait()
-    await runner1.shutdown(timeout=0.1)
-
-    # Process 2 — replay; the handler now sees echo_done=True from disk.
-    runner2 = InProcessTaskRunner(state_dir=state_dir)
-    observed: list[bool] = []
-
-    async def two_phase_resumed(payload: Mapping[str, Any]) -> None:
-        observed.append(bool(payload.get("echo_done")))
-
-    runner2.register("two_phase", two_phase_resumed)
-    await runner2.resume()
-
-    for _ in range(50):
-        if observed:
-            break
-        await asyncio.sleep(0.01)
-    assert observed == [True]
-    # And the resumed task ran to completion.
-    assert await runner2.get(handle) == "succeeded"
-
-    await runner2.shutdown()
-
-
-# --------------------------------------------------------------------------- #
-# Resume gracefully handles missing handler / corrupt entries #
-# --------------------------------------------------------------------------- #
-
-
-async def test_resume_with_missing_handler_marks_failed(tmp_path: Path) -> None:
-    """A persisted record whose handler is no longer registered is marked failed."""
-    state_dir = tmp_path / "runner"
-
-    runner1 = InProcessTaskRunner(state_dir=state_dir)
-
-    async def will_be_removed(_: Mapping[str, Any]) -> None:
-        await asyncio.Event().wait()
-
-    runner1.register("ghost", will_be_removed)
-    handle = await runner1.schedule("ghost", {})
-    await runner1.shutdown(timeout=0.1)
-
-    # Process 2 — never registers "ghost".
-    runner2 = InProcessTaskRunner(state_dir=state_dir)
-    replayed = await runner2.resume()
-    assert replayed == 0
-    # The record is moved to terminal 'failed'.
-    assert await runner2.get(handle) == "failed"
-    await runner2.shutdown()
-
-
-async def test_resume_quarantines_corrupt_entries(tmp_path: Path) -> None:
-    """A non-dict on-disk entry must be quarantined, not crash resume."""
-    import diskcache # noqa: PLC0415 - lazy import to keep module-import cheap
-
-    state_dir = tmp_path / "runner"
-    state_dir.mkdir(parents=True, exist_ok=True)
-    # Pre-populate the cache with a junk entry.
-    cache = diskcache.Cache(str(state_dir))
-    cache.set("bad-task-id", "this is not a dict")
-    cache.close()
-
-    runner = InProcessTaskRunner(state_dir=state_dir)
-    # resume() must not raise even with a corrupt entry on disk.
-    replayed = await runner.resume()
-    assert replayed == 0
-    await runner.shutdown()
-
-    # The corrupt entry should have been removed.
-    cache2 = diskcache.Cache(str(state_dir))
-    assert "bad-task-id" not in cache2
-    cache2.close()
-
-
-# --------------------------------------------------------------------------- #
-# Retry attempt counter persists across resume #
-# --------------------------------------------------------------------------- #
-
-
-async def test_attempt_counter_persists_across_resume(tmp_path: Path) -> None:
-    """A handler that crashes mid-attempt resumes with the consumed budget."""
-    state_dir = tmp_path / "runner"
-    policy = RetryPolicy(max_attempts=3, initial_backoff_seconds=0.01, backoff_multiplier=1.0)
-
-    # Process 1 — schedule, fail once, shutdown before retry settles.
-    runner1 = InProcessTaskRunner(state_dir=state_dir, default_retry_policy=policy)
-    attempts_seen_p1 = 0
-
-    async def flaky(_: Mapping[str, Any]) -> None:
-        nonlocal attempts_seen_p1
-        attempts_seen_p1 += 1
-        raise RuntimeError("boom-1")
-
-    runner1.register("flaky", flaky)
-    handle = await runner1.schedule("flaky", {})
-    # Let it attempt twice (waste 2 of 3 budgeted retries), then crash-shutdown.
-    for _ in range(50):
-        if attempts_seen_p1 >= 2:
-            break
-        await asyncio.sleep(0.01)
-    await runner1.shutdown(timeout=0.05)
-
-    # Process 2 — resume; only 1 attempt left in the budget. Confirm we don't
-    # re-grant the full retry budget.
-    runner2 = InProcessTaskRunner(state_dir=state_dir, default_retry_policy=policy)
-    attempts_seen_p2 = 0
-
-    async def flaky_resumed(_: Mapping[str, Any]) -> None:
-        nonlocal attempts_seen_p2
-        attempts_seen_p2 += 1
-        raise RuntimeError("boom-2")
-
-    runner2.register("flaky", flaky_resumed)
-    await runner2.resume()
-    # Wait for the resumed task to consume its remaining attempts and fail terminally.
-    for _ in range(100):
-        if (await runner2.get(handle)) == "failed":
-            break
-        await asyncio.sleep(0.01)
-    assert await runner2.get(handle) == "failed"
-    # Original consumed 2 attempts; we should have allowed at most max_attempts-2=1
-    # more in process 2.
-    assert attempts_seen_p2 <= 1
-    await runner2.shutdown()
diff --git a/python/packages/hosting/tests/test_types.py b/python/packages/hosting/tests/test_types.py
index 3253c509059..2c77cbee6b5 100644
--- a/python/packages/hosting/tests/test_types.py
+++ b/python/packages/hosting/tests/test_types.py
@@ -4,63 +4,13 @@
 
 from __future__ import annotations
 
-from typing import Any
-
 from agent_framework_hosting import (
-    ChannelContribution,
     ChannelIdentity,
     ChannelRequest,
-    ChannelResponseContext,
     ChannelSession,
-    DurableTaskPayloadMode,
-    HostedRunResult,
-    ResponseTarget,
-    ResponseTargetKind,
-    apply_channel_response_hook,
-    apply_run_hook,
 )
 
 
-class TestResponseTarget:
-    def test_originating_default_singleton(self) -> None:
-        target = ResponseTarget.originating # type: ignore[attr-defined]
-        assert target.kind is ResponseTargetKind.ORIGINATING
-        assert target.targets == ()
-
-    def test_active_singleton(self) -> None:
-        target = ResponseTarget.active # type: ignore[attr-defined]
-        assert target.kind is ResponseTargetKind.ACTIVE
-        assert target.targets == ()
-
-    def test_all_linked_singleton(self) -> None:
-        target = ResponseTarget.all_linked # type: ignore[attr-defined]
-        assert target.kind is ResponseTargetKind.ALL_LINKED
-
-    def test_none_singleton(self) -> None:
-        target = ResponseTarget.none # type: ignore[attr-defined]
-        assert target.kind is ResponseTargetKind.NONE
-
-    def test_channel_builder_single(self) -> None:
-        target = ResponseTarget.channel("teams")
-        assert target.kind is ResponseTargetKind.CHANNELS
-        assert target.targets == ("teams",)
-
-    def test_channels_builder_list(self) -> None:
-        target = ResponseTarget.channels(["teams", "telegram", "originating"])
-        assert target.kind is ResponseTargetKind.CHANNELS
-        assert target.targets == ("teams", "telegram", "originating")
-
-    def test_channels_builder_accepts_tuple(self) -> None:
-        target = ResponseTarget.channels(("a", "b"))
-        assert target.targets == ("a", "b")
-
-    def test_target_is_hashable(self) -> None:
-        # Plain class — hashing falls back to identity, which is fine here:
-        # the two keys below are different instances (singleton vs builder).
-        d = {ResponseTarget.originating: 1, ResponseTarget.channel("t"): 2} # type: ignore[attr-defined]
-        assert len(d) == 2
-
-
 class TestChannelRequest:
     def test_required_fields_only(self) -> None:
         req = ChannelRequest(channel="responses", operation="message.create", input="hi")
@@ -74,17 +24,6 @@ def test_required_fields_only(self) -> None:
         assert req.attributes == {}
         assert req.stream is False
         assert req.identity is None
-        # Default response target is the originating singleton.
-        assert req.response_target.kind is ResponseTargetKind.ORIGINATING
-
-    def test_default_response_target_is_originating_singleton(self) -> None:
-        # Every new request shares the module-level ``originating`` singleton
-        # by default — instances are intended to be treated as immutable, so
-        # sharing is safe and avoids per-request allocation.
-        a = ChannelRequest(channel="a", operation="op", input="x")
-        b = ChannelRequest(channel="b", operation="op", input="y")
-        assert a.response_target is ResponseTarget.originating # type: ignore[attr-defined]
-        assert a.response_target is b.response_target
 
     def test_with_session_and_identity(self) -> None:
         req = ChannelRequest(
@@ -93,14 +32,12 @@ def test_with_session_and_identity(self) -> None:
             input="hi",
             session=ChannelSession(isolation_key="user:42"),
             identity=ChannelIdentity(channel="telegram", native_id="42"),
-            response_target=ResponseTarget.active, # type: ignore[attr-defined]
         )
         assert req.session is not None
         assert req.session.isolation_key == "user:42"
         assert req.identity is not None
         assert req.identity.channel == "telegram"
         assert req.identity.native_id == "42"
-        assert req.response_target.kind is ResponseTargetKind.ACTIVE
 
 
 class TestChannelIdentity:
@@ -111,228 +48,3 @@ def test_attributes_default_empty_mapping(self) -> None:
     def test_attributes_passthrough(self) -> None:
         ident = ChannelIdentity(channel="teams", native_id="abc", attributes={"role": "user"})
         assert dict(ident.attributes) == {"role": "user"}
-
-
-class _DummyTarget:
-    """Stand-in for the ``SupportsAgentRun | Workflow`` arg `apply_run_hook` forwards.
-
-    `apply_run_hook` doesn't introspect the target — it just forwards
-    it as a kwarg to the user's hook — so a bare class is enough.
-    """
-
-
-class _DummyChannel:
-    name = "dummy"
-    path = "/dummy"
-
-    def contribute(self, _context: Any) -> ChannelContribution:
-        return ChannelContribution()
-
-
-class TestApplyChannelResponseHook:
-    async def test_originating_hook_receives_standard_context(self) -> None:
-        request = ChannelRequest(channel="discord", operation="message.create", input="hi")
-        payload = HostedRunResult("original")
-        captured: list[ChannelResponseContext] = []
-
-        async def hook(
-            result: HostedRunResult[Any],
-            *,
-            context: ChannelResponseContext,
-        ) -> HostedRunResult[Any]:
-            captured.append(context)
-            return result.replace(result="hooked")
-
-        channel = _DummyChannel()
-        channel.response_hook = hook # type: ignore[attr-defined]
-
-        shaped = await apply_channel_response_hook(channel, payload, request=request, originating=True)
-
-        assert shaped.result == "hooked"
-        assert captured[0].request is request
-        assert captured[0].channel_name == "dummy"
-        assert captured[0].destination_identity is None
-        assert captured[0].originating is True
-        assert captured[0].is_echo is False
-
-    async def test_non_originating_hook_can_clone_before_shaping(self) -> None:
-        request = ChannelRequest(channel="responses", operation="message.create", input="hi")
-        identity = ChannelIdentity(channel="dummy", native_id="user-1")
-        payload = HostedRunResult("original")
-        seen_payloads: list[HostedRunResult[Any]] = []
-        seen_contexts: list[ChannelResponseContext] = []
-
-        def hook(
-            result: HostedRunResult[Any],
-            *,
-            context: ChannelResponseContext,
-        ) -> HostedRunResult[Any]:
-            seen_payloads.append(result)
-            seen_contexts.append(context)
-            return result.replace(result="hooked")
-
-        channel = _DummyChannel()
-        channel.response_hook = hook # type: ignore[attr-defined]
-
-        shaped = await apply_channel_response_hook(
-            channel,
-            payload,
-            request=request,
-            destination_identity=identity,
-            originating=False,
-            is_echo=True,
-            clone=True,
-        )
-
-        assert seen_payloads[0] is not payload
-        assert shaped.result == "hooked"
-        assert seen_contexts[0].destination_identity is identity
-        assert seen_contexts[0].originating is False
-        assert seen_contexts[0].is_echo is True
-
-    async def test_missing_hook_returns_payload_or_clone(self) -> None:
-        request = ChannelRequest(channel="responses", operation="message.create", input="hi")
-        payload = HostedRunResult("original")
-        channel = _DummyChannel()
-
-        same = await apply_channel_response_hook(channel, payload, request=request, originating=True)
-        cloned = await apply_channel_response_hook(channel, payload, request=request, originating=True, clone=True)
-
-        assert same is payload
-        assert cloned is not payload
-        assert cloned.result == payload.result
-
-
-class TestApplyRunHook:
-    """`apply_run_hook` is the channel-side helper that invokes a
-    `ChannelRunHook` with the standard kwargs (`request` positional,
-    `target` / `protocol_request` keyword). Channels call this rather
-    than calling the hook directly so the convention is enforced in
-    one place. Cover both branching paths (sync vs async hook return)
-    and assert kwargs forwarding so a regression that drops `target`
-    or `protocol_request` is caught."""
-
-    async def test_sync_hook_returning_modified_request(self) -> None:
-        captured: dict[str, Any] = {}
-
-        def hook(request: ChannelRequest, **kwargs: Any) -> ChannelRequest:
-            # Snapshot the kwargs for the assertion below, then return a
-            # NEW request so we also verify the helper passes the
-            # replacement straight through (no merging / mutation).
-            captured["target"] = kwargs.get("target")
-            captured["protocol_request"] = kwargs.get("protocol_request")
-            return ChannelRequest(channel=request.channel, operation="HOOK_TOUCHED", input=request.input)
-
-        original = ChannelRequest(channel="responses", operation="op", input="hi")
-        target = _DummyTarget()
-        proto = {"raw": "payload"}
-
-        result = await apply_run_hook(hook, original, target=target, protocol_request=proto)
-
-        assert result is not original
-        assert result.operation == "HOOK_TOUCHED"
-        assert captured["target"] is target
-        assert captured["protocol_request"] is proto
-
-    async def test_async_hook_returning_modified_request(self) -> None:
-        captured: dict[str, Any] = {}
-
-        async def hook(request: ChannelRequest, **kwargs: Any) -> ChannelRequest:
-            captured["target"] = kwargs.get("target")
-            captured["protocol_request"] = kwargs.get("protocol_request")
-            # Return an awaitable result to exercise the async branch
-            # (`isinstance(result, Awaitable) → await it`).
-            return ChannelRequest(channel=request.channel, operation="ASYNC_HOOK", input=request.input)
-
-        original = ChannelRequest(channel="telegram", operation="op", input="hi")
-        target = _DummyTarget()
-        proto = {"update_id": 42}
-
-        result = await apply_run_hook(hook, original, target=target, protocol_request=proto)
-
-        assert result.operation == "ASYNC_HOOK"
-        assert captured["target"] is target
-        assert captured["protocol_request"] is proto
-
-    async def test_protocol_request_can_be_none(self) -> None:
-        """Channels that don't have a raw protocol payload (e.g. CLI / test
-        harness invocations) pass ``protocol_request=None``; the helper
-        forwards it as-is so hooks can ``if protocol_request is None`` to
-        gate channel-specific logic."""
-        captured: dict[str, Any] = {}
-
-        async def hook(request: ChannelRequest, **kwargs: Any) -> ChannelRequest:
-            captured["protocol_request"] = kwargs.get("protocol_request")
-            captured["protocol_request_in_kwargs"] = "protocol_request" in kwargs
-            return request
-
-        await apply_run_hook(
-            hook,
-            ChannelRequest(channel="x", operation="op", input="hi"),
-            target=_DummyTarget(),
-            protocol_request=None,
-        )
-
-        assert captured["protocol_request"] is None
-        assert captured["protocol_request_in_kwargs"] is True
-
-
-class TestDurableTaskPayloadMode:
-    """``DurableTaskPayloadMode`` distinguishes object-mode (in-process,
-    live references) from JSON-mode (durable persistence, channel codec
-    required) runners. The host's startup validator uses the value to
-    refuse misconfigured deployments."""
-
-    def test_enum_values(self) -> None:
-        assert DurableTaskPayloadMode.OBJECT.value == "object"
-        assert DurableTaskPayloadMode.JSON.value == "json"
-        # Both members; no surprise additions until we ship a third
-        # adapter style.
-        assert set(DurableTaskPayloadMode) == {DurableTaskPayloadMode.OBJECT, DurableTaskPayloadMode.JSON}
-
-
-class TestResponseTargetIdentities:
-    """``ResponseTarget.identity``/``.identities`` carry full
-    :class:`ChannelIdentity` objects (incl. attributes) so destination
-    channels that need conversation/thread metadata (Teams, Slack, Bot
-    Framework) don't have to encode it through string tokens."""
-
-    def test_identity_single(self) -> None:
-        ident = ChannelIdentity(channel="teams", native_id="user@contoso", attributes={"tenant_id": "abc"})
-        target = ResponseTarget.identity(ident)
-        assert target.kind is ResponseTargetKind.IDENTITIES
-        assert len(target.target_identities) == 1
-        assert target.target_identities[0].channel == "teams"
-        assert target.target_identities[0].native_id == "user@contoso"
-        assert dict(target.target_identities[0].attributes) == {"tenant_id": "abc"}
-
-    def test_identities_list_preserves_attributes(self) -> None:
-        ident_a = ChannelIdentity(channel="teams", native_id="u1", attributes={"thread": "t1"})
-        ident_b = ChannelIdentity(channel="slack", native_id="u2", attributes={"channel_id": "c2"})
-        target = ResponseTarget.identities([ident_a, ident_b])
-        assert target.kind is ResponseTargetKind.IDENTITIES
-        assert len(target.target_identities) == 2
-        assert dict(target.target_identities[0].attributes) == {"thread": "t1"}
-        assert dict(target.target_identities[1].attributes) == {"channel_id": "c2"}
-
-    def test_identity_value_equality_matches_on_attributes(self) -> None:
-        # Two ``ResponseTarget.identity`` values built independently
-        # compare equal when the underlying ``ChannelIdentity`` content
-        # matches — important because tests and channel parsers use
-        # ``==`` on targets.
-        ident_a = ChannelIdentity(channel="teams", native_id="u1", attributes={"thread": "t1"})
-        ident_b = ChannelIdentity(channel="teams", native_id="u1", attributes={"thread": "t1"})
-        assert ResponseTarget.identity(ident_a) == ResponseTarget.identity(ident_b)
-        # Different attributes → not equal.
- ident_c = ChannelIdentity(channel="teams", native_id="u1", attributes={"thread": "t2"}) - assert ResponseTarget.identity(ident_a) != ResponseTarget.identity(ident_c) - - def test_identity_repr_includes_targets(self) -> None: - ident = ChannelIdentity(channel="teams", native_id="u1") - rep = repr(ResponseTarget.identity(ident)) - assert "ResponseTarget.identities" in rep - - def test_identity_echo_input_flag(self) -> None: - ident = ChannelIdentity(channel="teams", native_id="u1") - target = ResponseTarget.identity(ident, echo_input=True) - assert target.echo_input is True diff --git a/python/pyproject.toml b/python/pyproject.toml index e450bb0dec1..19d33355289 100644 --- a/python/pyproject.toml +++ b/python/pyproject.toml @@ -91,7 +91,6 @@ agent-framework-hosting = { workspace = true } agent-framework-hosting-invocations = { workspace = true } agent-framework-hosting-telegram = { workspace = true } agent-framework-hosting-activity-protocol = { workspace = true } -agent-framework-hosting-entra = { workspace = true } agent-framework-hosting-discord = { workspace = true } agent-framework-hyperlight = { workspace = true } agent-framework-lab = { workspace = true } @@ -220,7 +219,6 @@ executionEnvironments = [ { root = "packages/hosting-invocations/tests", reportPrivateUsage = "none" }, { root = "packages/hosting-telegram/tests", reportPrivateUsage = "none" }, { root = "packages/hosting-activity-protocol/tests", reportPrivateUsage = "none" }, - { root = "packages/hosting-entra/tests", reportPrivateUsage = "none" }, { root = "packages/lab/gaia/tests", reportPrivateUsage = "none" }, { root = "packages/lab/lightning/tests", reportPrivateUsage = "none" }, { root = "packages/lab/tau2/tests", reportPrivateUsage = "none" }, diff --git a/python/samples/04-hosting/af-hosting/README.md b/python/samples/04-hosting/af-hosting/README.md index ad368b9d445..6c812a997ca 100644 --- a/python/samples/04-hosting/af-hosting/README.md +++ b/python/samples/04-hosting/af-hosting/README.md @@ 
-8,7 +8,7 @@ The general hosting plumbing lives in its own package (`agent-framework-hosting-responses`, `agent-framework-hosting-invocations`, `agent-framework-hosting-telegram`, `agent-framework-hosting-activity-protocol`, -`agent-framework-hosting-entra`). +`agent-framework-hosting-discord`). | Sample | What it shows | Packaging | |---|---|---| @@ -16,8 +16,7 @@ its own package (`agent-framework-hosting-responses`, | [`local_responses_workflow/`](./local_responses_workflow) | A 4-step `Workflow` (typed `SloganBrief` intake → writer → legal → formatter) hosted behind **both** the Responses and Invocations channels via a shared `run_hook` that parses inbound text/JSON into the workflow's typed input. The host writes per-conversation checkpoints via `checkpoint_location=…`. Demonstrates workflow targets + structured input adaptation + multi-channel + resume-across-turns. Includes a `call_server.rest` file with REST examples for both endpoints. | **Local only.** | | [`foundry_hosted_agent/`](./foundry_hosted_agent) | One Foundry agent, **Responses + Invocations only** — the minimal shape that is **runtime-compatible with the Foundry Hosted Agents platform**. | Ships with `Dockerfile` + `agent.yaml` + `agent.manifest.yaml` + `azure.yaml` so the same image runs locally **or** as a Foundry Hosted Agent (`azd up`). | | [`foundry_telegram_invocations_weather/`](./foundry_telegram_invocations_weather) | Experimental Telegram weather bot that mounts `TelegramChannel` at `POST /invocations`, registers the Foundry Hosted Agents Invocations URL as the Telegram webhook, and uses `FoundryHostedAgentHistoryProvider` for storage. | Ships with `Dockerfile` + `agent.yaml` + `agent.manifest.yaml` + `azure.yaml`; used to validate whether a non-Responses channel can run under Foundry Invocations. 
| -| [`local_telegram/`](./local_telegram) | Adds Telegram, a `@tool`, `FileHistoryProvider`, run hooks (per-user / per-chat session keying), extra Telegram commands, and `ResponseTarget` multicast. Runs under Hypercorn with multiple workers. | **Local only.** No Dockerfile / Foundry packaging. | -| [`local_identity_link/`](./local_identity_link) | Everything in `local_telegram/` plus Teams and the Entra identity-link sidecar (`/auth/start` + `/auth/callback`). Demonstrates linking a Telegram chat to an Entra user so multiple non-Entra channels can share one isolation key. | **Local only.** No Dockerfile / Foundry packaging. | +| [`local_telegram/`](./local_telegram) | Adds Telegram, a `@tool`, `FileHistoryProvider`, run hooks (per-user / per-chat session keying), and extra Telegram commands. Runs under Hypercorn with multiple workers. | **Local only.** No Dockerfile / Foundry packaging. | Each sample is fully self-contained — its own `pyproject.toml`, `uv.lock`, server `app.py`, calling script(s), and `storage/` directory. Every @@ -40,9 +39,9 @@ involved**. 
| Aspect | `af-hosting/` (this directory) | `foundry-hosted-agents/` | |---|---|---| -| Server stack | `agent-framework-hosting` + per-channel packages (`-responses`, `-invocations`, `-telegram`, `-activity-protocol`, `-entra`) | `agent-framework-hosted` only — the Foundry Hosted Agents runtime owns the HTTP surface | -| Channels other than Responses / Invocations | Yes — Telegram, Activity Protocol (Teams), Entra identity-linking | No — the platform exposes Responses + Invocations only | -| Run target | Local Hypercorn (`local_responses/`, `local_telegram/`, `local_identity_link/`); Hosted Agents *or* local (`foundry_hosted_agent/`) | Hosted Agents *or* local container; targets the Hosted Agents platform contract | +| Server stack | `agent-framework-hosting` + per-channel packages (`-responses`, `-invocations`, `-telegram`, `-activity-protocol`, `-discord`) | `agent-framework-hosted` only — the Foundry Hosted Agents runtime owns the HTTP surface | +| Channels other than Responses / Invocations | Yes — Telegram, Activity Protocol (Teams), Discord | No — the platform exposes Responses + Invocations only | +| Run target | Local Hypercorn (`local_responses/`, `local_telegram/`); Hosted Agents *or* local (`foundry_hosted_agent/`) | Hosted Agents *or* local container; targets the Hosted Agents platform contract | | When to pick this | You need extra channels (Telegram/Teams via Activity Protocol/…), custom hosting middleware, or want to run outside the Foundry runtime | You only need Responses/Invocations and want zero hosting boilerplate, leveraging the Foundry-managed surface | `foundry_hosted_agent/` is the bridge sample: it uses the diff --git a/python/samples/04-hosting/af-hosting/foundry_hosted_agent/README.md b/python/samples/04-hosting/af-hosting/foundry_hosted_agent/README.md index 2fb766ae3bd..97e174455dc 100644 --- a/python/samples/04-hosting/af-hosting/foundry_hosted_agent/README.md +++ b/python/samples/04-hosting/af-hosting/foundry_hosted_agent/README.md @@ 
-25,10 +25,8 @@ is unset) it transparently falls back to an in-memory store, so the same code runs in dev. Writes are a no-op — Foundry persists Responses turns authoritatively as the runtime executes them. -For richer scenarios (custom tools, history providers, run hooks, -multicast, Telegram, Teams, identity linking) see -[`../local_telegram`](../local_telegram) and -[`../local_identity_link`](../local_identity_link). +For richer local scenarios (custom tools, history providers, run hooks, +Telegram, and Activity Protocol) see [`../local_telegram`](../local_telegram). ## Layout diff --git a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/app.py b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/app.py index a7c4ef4007f..8c5ebf44cf5 100644 --- a/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/app.py +++ b/python/samples/04-hosting/af-hosting/foundry_telegram_invocations_weather/app.py @@ -162,7 +162,6 @@ def build_host() -> AgentFrameworkHost: # 3. Register Telegram at /invocations and keep Responses available for sanity checks. return AgentFrameworkHost( target=agent, - allow_in_process_runner=True, channels=[ ResponsesChannel(response_id_factory=foundry_response_id), TelegramChannel( diff --git a/python/samples/04-hosting/af-hosting/local_identity_link/README.md b/python/samples/04-hosting/af-hosting/local_identity_link/README.md deleted file mode 100644 index 33a706386d8..00000000000 --- a/python/samples/04-hosting/af-hosting/local_identity_link/README.md +++ /dev/null @@ -1,67 +0,0 @@ -# local_identity_link — every channel, plus identity linking - -The full surface: Responses + Invocations + Telegram + Activity Protocol (Teams) + the Entra -identity-link sidecar. The Entra channel exposes -`/auth/start` + `/auth/callback` so users on Telegram (or any non-Entra -channel) can bind their per-channel id to a stable `entra:` isolation -key. 
Channel run-hooks then rewrite incoming requests to use the linked -key, so a chat started on Telegram and a chat started on Teams that both -resolve to the same Entra user share one history. - -## Run - -```bash -export FOUNDRY_PROJECT_ENDPOINT=https://.services.ai.azure.com -export FOUNDRY_MODEL=gpt-4o -export TELEGRAM_BOT_TOKEN=... -# Entra app registration (confidential client): -export ENTRA_TENANT_ID=... -export ENTRA_CLIENT_ID=... -export ENTRA_CLIENT_SECRET=... # or: -# export ENTRA_CERTIFICATE_PATH=./teams-bot.pem -export PUBLIC_BASE_URL=https:// # used to mint redirect_uri -# Teams (optional — same tenant): -export TEAMS_APP_ID=... -export TEAMS_APP_PASSWORD=... - -az login - -uv sync -uv run hypercorn app:app \ - --bind 0.0.0.0:8000 \ - --workers 4 -``` - -## Identity link - -Register `https:///auth/callback` as the redirect URI on your -Entra app, then visit (replace ```` with the Telegram numeric -chat id): - -``` -https:///auth/start?channel=telegram&id= -``` - -After sign-in, subsequent Telegram messages from that chat resolve to the -linked Entra user. - -## Call locally - -```bash -uv sync --group dev - -# Default: post a Responses request as `local-dev`. -uv run python call_server.py "What is the weather in Tokyo?" - -# Resume any session by id, including a Telegram one (works because -# the Telegram run-hook writes sessions under telegram:): -uv run python call_server.py --previous-response-id telegram:8741188429 "What did we discuss?" - -# Multicast to a Telegram chat in parallel with the local response: -uv run python call_server.py --telegram-chat-id 8741188429 "Heads up." -``` - -> This sample is **local-only** — it shows the `agent-framework-hosting` -> server stack as a standalone process. For a Foundry-Hosted-Agents-compatible -> packaging (Dockerfile + `agent.yaml` + `azure.yaml`), see -> [`foundry_hosted_agent/`](../foundry_hosted_agent). 
diff --git a/python/samples/04-hosting/af-hosting/local_identity_link/app.py b/python/samples/04-hosting/af-hosting/local_identity_link/app.py deleted file mode 100644 index a8bba348eea..00000000000 --- a/python/samples/04-hosting/af-hosting/local_identity_link/app.py +++ /dev/null @@ -1,395 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. - -"""Complete multi-channel hosting sample with unified Entra ID identity. - -Wires every built-in channel onto a single ``AgentFrameworkHost`` and -demonstrates a pattern for collapsing per-channel identifiers into a single -**Microsoft Entra ID** (object id) key so a user's history follows them -across surfaces. - -Identity resolution -------------------- -Each request is bucketed under one ``isolation_key`` for ``FileHistoryProvider``: - -- **Teams** is the source of truth. Inbound activities carry the user's - ``aadObjectId``; we promote it to ``entra:`` in the Teams ``run_hook``. -- **Telegram** has no built-in OAuth identity. Users link their chat to - their Entra ID by sending ``/link``; the bot replies with a one-shot - authorize URL served by the host's ``EntraIdentityLinkChannel``. After the - OAuth callback the mapping ``telegram: → entra:`` is - persisted to ``identity_links.json`` and every later Telegram turn is - bucketed under the user's Entra key. -- **Responses API** callers can pass ``entra_oid`` directly (top-level or - in ``metadata``), or pass ``safety_identifier`` and rely on the same - store (``responses: → entra:``). Otherwise we fall back - to ``responses:``. - -Required environment --------------------- -- ``FOUNDRY_PROJECT_ENDPOINT`` / ``FOUNDRY_MODEL`` — agent backing. -- ``TELEGRAM_BOT_TOKEN`` — required to enable the Telegram channel. -- ``TEAMS_APP_ID`` / ``TEAMS_APP_PASSWORD`` — optional; without them the - Teams channel runs in dev mode (Bot Framework Emulator only). 
-- ``ENTRA_TENANT_ID`` / ``ENTRA_CLIENT_ID`` plus **either** - ``ENTRA_CLIENT_SECRET`` **or** ``ENTRA_CERT_PATH`` - (+ optional ``ENTRA_CERT_PASSWORD``) — required to enable the ``/link`` - flow. The app's redirect URI must be registered as - ``{PUBLIC_BASE_URL}/auth/callback`` in your Entra app. -- ``PUBLIC_BASE_URL`` — externally reachable base of this host (e.g. - ``https://my-host.example.com``). Defaults to ``http://localhost:8000``. - -Run ---- -This module exposes ``app`` as the canonical ASGI surface. Recommended -production launch is **Hypercorn**:: - - hypercorn app:app --bind 0.0.0.0:8000 --workers 4 - -The ``__main__`` block below uses ``host.serve(...)`` (single-process -Hypercorn) as a local-dev fallback. -""" - -from __future__ import annotations - -import logging -import os -from collections.abc import Mapping -from dataclasses import replace -from pathlib import Path -from typing import Annotated, Any - -from agent_framework import Agent, FileHistoryProvider, tool -from agent_framework_foundry import FoundryChatClient -from agent_framework_hosting import ( - AgentFrameworkHost, - Channel, - ChannelCommand, - ChannelCommandContext, - ChannelRequest, - ChannelSession, -) -from agent_framework_hosting_activity_protocol import ActivityProtocolChannel -from agent_framework_hosting_entra import ( - EntraIdentityLinkChannel, - EntraIdentityStore, - entra_isolation_key, -) -from agent_framework_hosting_invocations import InvocationsChannel -from agent_framework_hosting_responses import ResponsesChannel -from agent_framework_hosting_telegram import TelegramChannel -from azure.identity.aio import DefaultAzureCredential - -logger = logging.getLogger("agent_framework.hosting.complete_app") - -SESSIONS_DIR = Path(__file__).resolve().parent / "storage" / "sessions" -SESSIONS_DIR.mkdir(parents=True, exist_ok=True) -IDENTITY_STORE_PATH = Path(__file__).resolve().parent / "storage" / "identity_links.json" - - -# 
--------------------------------------------------------------------------- # -# Tools -# --------------------------------------------------------------------------- # - - -@tool(approval_mode="never_require") -def lookup_weather( - location: Annotated[str, "The city to look up weather for."], -) -> str: - """Return a deterministic weather report for a city.""" - reports = { - "Seattle": "Seattle is rainy with a high of 13°C.", - "Amsterdam": "Amsterdam is cloudy with a high of 16°C.", - "Tokyo": "Tokyo is clear with a high of 22°C.", - } - return reports.get(location, f"{location} is sunny with a high of 20°C.") - - -# --------------------------------------------------------------------------- # -# Run hooks: collapse per-channel identifiers down to a single Entra ID key -# --------------------------------------------------------------------------- # - - -def _replace_session(request: ChannelRequest, isolation_key: str) -> ChannelRequest: - return replace(request, session=ChannelSession(isolation_key=isolation_key)) - - -def make_activity_hook() -> Any: - """Promote ``aadObjectId`` from the inbound Activity to ``entra:``. - - The Activity Protocol channel is treated as the **primary** identity - source for Teams traffic: every authenticated Teams user has an Entra - object id, and we trust it directly without consulting the link store. - """ - - def _hook( - request: ChannelRequest, - *, - protocol_request: Mapping[str, Any] | None = None, - **_: object, - ) -> ChannelRequest: - activity = protocol_request or {} - from_ = activity.get("from") if isinstance(activity, Mapping) else None - oid = from_.get("aadObjectId") if isinstance(from_, Mapping) else None - if oid: - return _replace_session(request, entra_isolation_key(oid)) - # Unauthenticated channels (web chat, emulator) — fall back to the - # per-conversation key the channel already set. 
- return request - - return _hook - - -def make_telegram_hook(store: EntraIdentityStore) -> Any: - """Resolve identity then bump reasoning effort. - - The reasoning bump applies to **every** Telegram request — linked or - not — so the high-effort preset isn't silently lost the moment a - user runs ``/link`` (which is the headline feature of this sample). - Identity resolution and option mutation are separate concerns: we - swap the session if a link exists, then upgrade the options on the - way out either way. - """ - - def _hook(request: ChannelRequest, **_: object) -> ChannelRequest: - chat_id = request.attributes.get("chat_id") - if chat_id is not None: - linked = store.lookup(f"telegram:{chat_id}") - if linked is not None: - request = _replace_session(request, linked) - # Bump reasoning effort regardless of identity (linked or not). - options = dict(request.options or {}) - options["reasoning"] = {"effort": "high", "summary": "detailed"} - return replace(request, options=options) - - return _hook - - -def make_responses_hook(store: EntraIdentityStore) -> Any: - """Same identity resolution as Telegram/Teams, plus the usual option scrub. - - Resolution order: - 1. Body ``entra_oid`` (top-level or in ``metadata``) — a caller already - knows the user's Entra id. - 2. ``safety_identifier`` (or legacy ``user``) looked up in the link - store as ``responses:``. - 3. Fallback ``responses:``. - - .. WARNING:: - DEV ONLY. The ``entra_oid`` shortcut treats a client-supplied - identity claim as authoritative with **no token verification**: - any Responses caller can claim to be any user and read that - user's history bucket. Production deployments must either: - - - Drop this shortcut entirely and rely on ``safety_identifier`` - + the link store (i.e. 
force every caller through the OAuth - identity-link flow), or - - Add a JWT validator that verifies an inbound Authorization - header, extracts the verified ``oid`` claim, and feeds *that* - into ``entra_isolation_key`` — never trust a body field for - identity in a multi-tenant deployment. - - This shortcut exists only so the sample's smoke tests can pin - an isolation key without spinning up an Entra app registration. - """ - - def _hook( - request: ChannelRequest, - *, - protocol_request: Mapping[str, Any] | None = None, - **_: object, - ) -> ChannelRequest: - options = dict(request.options or {}) - options.pop("temperature", None) - options.pop("store", None) - - body = protocol_request or {} - metadata = body.get("metadata") if isinstance(body.get("metadata"), dict) else {} - - # WARNING (DEV ONLY): client-supplied entra_oid is trusted with - # NO verification. Production code must verify a JWT instead. - explicit_oid = body.get("entra_oid") or metadata.get("entra_oid") - safety_id = body.get("safety_identifier") or body.get("user") or "anonymous" - - if explicit_oid: - isolation_key = entra_isolation_key(explicit_oid) - else: - isolation_key = store.lookup(f"responses:{safety_id}") or f"responses:{safety_id}" - - return replace( - request, - session=ChannelSession(isolation_key=isolation_key), - options=options or None, - ) - - return _hook - - -# --------------------------------------------------------------------------- # -# Telegram commands -# --------------------------------------------------------------------------- # - - -def make_commands( - host_ref: dict[str, AgentFrameworkHost], - store: EntraIdentityStore, - linker_ref: dict[str, EntraIdentityLinkChannel | None], -) -> list[ChannelCommand]: - def _telegram_key(ctx: ChannelCommandContext) -> str: - chat_id = ctx.request.attributes.get("chat_id") - return f"telegram:{chat_id}" - - def _isolation_for(ctx: ChannelCommandContext) -> str: - # Honour any existing link so /new resets the right bucket. 
- return store.lookup(_telegram_key(ctx)) or _telegram_key(ctx) - - async def handle_start(ctx: ChannelCommandContext) -> None: - await ctx.reply( - "Hi! I'm a multi-channel agent.\nCommands: /link, /unlink, /new, /whoami, /weather , /help." - ) - - async def handle_help(ctx: ChannelCommandContext) -> None: - await ctx.reply( - "/link — bind this chat to your Entra ID for shared history\n" - "/unlink — unbind this chat\n" - "/new — start a fresh conversation\n" - "/whoami — show your isolation key\n" - "/weather — call the weather tool directly\n" - "/help — this message" - ) - - async def handle_link(ctx: ChannelCommandContext) -> None: - linker = linker_ref.get("linker") - if linker is None: - await ctx.reply( - "Identity linking is not configured on this host. " - "Set ENTRA_TENANT_ID, ENTRA_CLIENT_ID, and either " - "ENTRA_CLIENT_SECRET or ENTRA_CERTIFICATE_PATH." - ) - return - chat_id = ctx.request.attributes.get("chat_id") - if chat_id is None: - # Without a chat_id we'd format "telegram:None" into the - # authorize URL, OAuth would complete, and the store would - # gain a poisoned `telegram:None` entry that any later - # chat_id-less message would collapse onto. Refuse instead. - await ctx.reply("Couldn't determine your Telegram chat id; please retry from a 1:1 chat with the bot.") - return - url = linker.authorize_url_for("telegram", str(chat_id)) - await ctx.reply("Open this link to bind this chat to your Microsoft account:\n" + url) - - async def handle_unlink(ctx: ChannelCommandContext) -> None: - await store.unlink(_telegram_key(ctx)) - await ctx.reply("This chat is no longer linked. New messages will use the chat-only key.") - - async def handle_new(ctx: ChannelCommandContext) -> None: - host_ref["host"].reset_session(_isolation_for(ctx)) - await ctx.reply("New session started. 
Previous history is cleared.") - - async def handle_whoami(ctx: ChannelCommandContext) -> None: - key = _isolation_for(ctx) - if key.startswith("entra:"): - await ctx.reply(f"This chat is linked. Isolation key: {key}") - else: - await ctx.reply(f"This chat is not linked to an Entra ID. Isolation key: {key}\nSend /link to bind it.") - - async def handle_weather(ctx: ChannelCommandContext) -> None: - command_text = ctx.request.input if isinstance(ctx.request.input, str) else "" - _, _, location = command_text.partition(" ") - location = location.strip() or "Seattle" - await ctx.reply(lookup_weather(location=location)) - - return [ - ChannelCommand("start", "Introduce the bot", handle_start), - ChannelCommand("help", "List available commands", handle_help), - ChannelCommand("link", "Bind this chat to your Microsoft account", handle_link), - ChannelCommand("unlink", "Unbind this chat from any Microsoft account", handle_unlink), - ChannelCommand("new", "Start a new session for this chat", handle_new), - ChannelCommand("whoami", "Show the isolation key for this chat", handle_whoami), - ChannelCommand("weather", "Call the weather tool: /weather ", handle_weather), - ] - - -# --------------------------------------------------------------------------- # -# Host wiring -# --------------------------------------------------------------------------- # - - -def build_host() -> AgentFrameworkHost: - agent = Agent( - client=FoundryChatClient(credential=DefaultAzureCredential()), - name="WeatherAgent", - instructions=( - "You are a friendly weather assistant. Use the lookup_weather tool " - "for any weather question and answer in one short sentence." - ), - tools=[lookup_weather], - context_providers=[FileHistoryProvider(SESSIONS_DIR)], - default_options={"store": False}, - ) - - store = EntraIdentityStore(IDENTITY_STORE_PATH) - - # Optional Entra-OAuth identity linker. Pick exactly one credential mode: - # ENTRA_CLIENT_SECRET *or* ENTRA_CERT_PATH (+ optional ENTRA_CERT_PASSWORD). 
- # When unconfigured, /link tells the user the feature is disabled and the - # host runs without a linker. - tenant_id = os.environ.get("ENTRA_TENANT_ID") - client_id = os.environ.get("ENTRA_CLIENT_ID") - client_secret = os.environ.get("ENTRA_CLIENT_SECRET") - cert_path = os.environ.get("ENTRA_CERTIFICATE_PATH") - cert_password_env = os.environ.get("ENTRA_CERTIFICATE_PASSWORD") - public_base_url = os.environ.get("PUBLIC_BASE_URL", "http://localhost:8000") - - linker: EntraIdentityLinkChannel | None = None - if tenant_id and client_id and (client_secret or cert_path): - linker = EntraIdentityLinkChannel( - store=store, - tenant_id=tenant_id, - client_id=client_id, - client_secret=client_secret, - certificate_path=cert_path, - certificate_password=cert_password_env.encode() if cert_password_env else None, - public_base_url=public_base_url, - ) - - host_ref: dict[str, AgentFrameworkHost] = {} - linker_ref: dict[str, EntraIdentityLinkChannel | None] = {"linker": linker} - - channels: list[Channel] = [ - ResponsesChannel(run_hook=make_responses_hook(store)), - InvocationsChannel(), - ActivityProtocolChannel( - app_id=os.environ.get("TEAMS_APP_ID"), - tenant_id=os.environ.get("TEAMS_TENANT_ID", "botframework.com"), - # Use either a client secret OR a certificate. Cert is required - # for tenants that disallow secrets — see the package README for - # an `openssl` recipe to generate one. 
- app_password=os.environ.get("TEAMS_APP_PASSWORD"), - certificate_path=os.environ.get("TEAMS_CERT_PATH"), - certificate_password=( - os.environ["TEAMS_CERT_PASSWORD"].encode() if os.environ.get("TEAMS_CERT_PASSWORD") else None - ), - run_hook=make_activity_hook(), - ), - TelegramChannel( - bot_token=os.environ["TELEGRAM_BOT_TOKEN"], - webhook_url=os.environ.get("TELEGRAM_WEBHOOK_URL"), - secret_token=os.environ.get("TELEGRAM_WEBHOOK_SECRET"), - parse_mode="Markdown", - commands=make_commands(host_ref, store, linker_ref), - run_hook=make_telegram_hook(store), - ), - ] - if linker is not None: - channels.append(linker) - - host = AgentFrameworkHost(target=agent, channels=channels, debug=True) - host_ref["host"] = host - return host - - -app = build_host().app - - -if __name__ == "__main__": - build_host().serve(host="0.0.0.0", port=int(os.environ.get("PORT", "8000"))) diff --git a/python/samples/04-hosting/af-hosting/local_identity_link/call_server.py b/python/samples/04-hosting/af-hosting/local_identity_link/call_server.py deleted file mode 100644 index 766714a6507..00000000000 --- a/python/samples/04-hosting/af-hosting/local_identity_link/call_server.py +++ /dev/null @@ -1,72 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. - -"""Local client for the **complete** server (``app.py`` in this folder). - -Demonstrates the two most distinctive flows the complete sample adds on top -of the advanced sample: - -1. **Identity-linked Telegram resume.** Pass ``--previous-response-id - telegram:`` to resume a Telegram chat's history through the - Responses endpoint — this only works once the user has linked their - Telegram chat to their Entra account via the - ``EntraIdentityLinkChannel`` (visit ``/auth/start?channel=telegram&id=...`` - in the browser first). -2. **Multicast via ``response_target``.** Pass ``--telegram-chat-id`` to - have the host fan out the agent reply to a Telegram chat in addition - to returning it on the local wire. 
Drop ``--include-originating`` to - send only to Telegram and have the local response reduced to a small - acknowledgement. - -Start the server first (in another shell):: - - cd local_identity_link && uv run python app.py - -Then:: - - python call_server.py "What is the weather in Tokyo?" - python call_server.py --previous-response-id telegram:8741188429 "What did we discuss?" - python call_server.py --telegram-chat-id 8741188429 "Heads up, sending to your phone too." -""" - -from __future__ import annotations - -import argparse - -from openai import OpenAI - -BASE_URL = "http://127.0.0.1:8000" - - -def main() -> None: - parser = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter) - parser.add_argument("--safety-identifier", default="local-dev") - parser.add_argument("--previous-response-id", default=None) - parser.add_argument("--telegram-chat-id", default=None) - parser.add_argument("--include-originating", action="store_true", default=True) - parser.add_argument("prompt", nargs="*") - args = parser.parse_args() - - prompt = " ".join(args.prompt) or "What is the weather in Seattle?" 
- - extra_body: dict[str, object] = {} - if args.telegram_chat_id is not None: - targets: list[str] = [] - if args.include_originating: - targets.append("originating") - targets.append(f"telegram:{args.telegram_chat_id}") - extra_body["response_target"] = targets - - client = OpenAI(base_url=BASE_URL, api_key="not-needed") - response = client.responses.create( - model="agent", - input=prompt, - safety_identifier=args.safety_identifier, - previous_response_id=args.previous_response_id, - extra_body=extra_body or None, - ) - print(f"User: {prompt}") - print(f"Agent: {response.output_text}") - - -if __name__ == "__main__": - main() diff --git a/python/samples/04-hosting/af-hosting/local_identity_link/pyproject.toml b/python/samples/04-hosting/af-hosting/local_identity_link/pyproject.toml deleted file mode 100644 index 6e8a4d55052..00000000000 --- a/python/samples/04-hosting/af-hosting/local_identity_link/pyproject.toml +++ /dev/null @@ -1,32 +0,0 @@ -[project] -name = "agent-framework-hosting-sample-complete" -version = "0.0.1" -description = "Complete multi-channel hosting sample (Responses + Invocations + Telegram + Activity Protocol + Entra identity-link)." 
-requires-python = ">=3.10" -dependencies = [ - "agent-framework-foundry", - "agent-framework-hosting", - "agent-framework-hosting-activity-protocol", - "agent-framework-hosting-entra", - "agent-framework-hosting-invocations", - "agent-framework-hosting-responses", - "agent-framework-hosting-telegram", - "azure-identity", - "hypercorn>=0.17", - "httpx>=0.27", - "aiohttp>=3.13.5", -] - -[dependency-groups] -dev = ["openai>=1.99"] - -[tool.uv] -package = false - -[tool.uv.sources] -agent-framework-hosting = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/hosting" } -agent-framework-hosting-activity-protocol = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/hosting-activity-protocol" } -agent-framework-hosting-entra = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/hosting-entra" } -agent-framework-hosting-invocations = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/hosting-invocations" } -agent-framework-hosting-responses = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/hosting-responses" } -agent-framework-hosting-telegram = { git = "https://github.com/microsoft/agent-framework.git", branch = "feature/python-hosting", subdirectory = "python/packages/hosting-telegram" } diff --git a/python/samples/04-hosting/af-hosting/local_telegram/README.md b/python/samples/04-hosting/af-hosting/local_telegram/README.md index 939c9055616..727d332342e 100644 --- a/python/samples/04-hosting/af-hosting/local_telegram/README.md +++ b/python/samples/04-hosting/af-hosting/local_telegram/README.md @@ -1,4 +1,4 @@ -# local_telegram — `@tool`, file-backed history, hooks, multicast +# local_telegram — 
`@tool`, file-backed history, hooks, Telegram Builds on `foundry_hosted_agent/` with the hooks and config most real apps need: @@ -11,9 +11,6 @@ Builds on `foundry_hosted_agent/` with the hooks and config most real apps need: do not share history. - A `telegram_hook` that keys per-chat sessions via `telegram_isolation_key`. - Two extra Telegram commands (`/new`, `/whoami`). -- `ResponseTarget` multicast: a Responses request can fan out the agent - reply to a Telegram chat by passing - `extra_body={"response_target": ["originating", "telegram:"]}`. `app:app` is a module-level Starlette ASGI app, so this sample runs under Hypercorn (multi-process). @@ -49,8 +46,6 @@ uv run python call_server.py "What is the weather in Tokyo?" # Resume an existing session by AgentSession id (works across channels): uv run python call_server.py --previous-response-id telegram:8741188429 "What did we discuss?" -# Multicast: keep the reply on the local wire AND push it to Telegram. -uv run python call_server_multicast.py --telegram-chat-id 8741188429 "Heads up." ``` > This sample is **local-only** — it shows the `agent-framework-hosting` diff --git a/python/samples/04-hosting/af-hosting/local_telegram/call_server_multicast.py b/python/samples/04-hosting/af-hosting/local_telegram/call_server_multicast.py deleted file mode 100644 index 123b16502e5..00000000000 --- a/python/samples/04-hosting/af-hosting/local_telegram/call_server_multicast.py +++ /dev/null @@ -1,92 +0,0 @@ -# Copyright (c) Microsoft. All rights reserved. - -"""Local client demonstrating server-side ``ResponseTarget`` fan-out. - -Posts one request to ``/responses`` with -``extra_body={"response_target": ["originating", "telegram:"]}``. -The server invokes the agent once and the host's -``ChannelContext.deliver_response`` resolves the target list against the -configured channels, calling :class:`host.ChannelPush` ``push`` on each -non-originating destination — here, the operator's Telegram chat. 
The -``"originating"`` pseudo-name keeps the agent reply on this script's wire -too, so the local terminal sees the reply alongside Telegram. - -Drop ``--include-originating`` to deliver only to Telegram (the local -response becomes a small acknowledgement string referencing the push -targets). - -The ``--previous-response-id`` flag (the AgentSession id) is independent -of ``--telegram-chat-id`` (the push destination). They were conflated in -an earlier iteration; in general one Entra user may have several Telegram -chat ids, and the session id is usually their Entra/responses isolation -key, not the chat id. Pass them both to resume a specific session and -fan-out to a specific chat:: - - python call_server_multicast.py \\ - --previous-response-id telegram:8741188429 \\ - --telegram-chat-id 8741188429 \\ - "What did we discuss?" - -Start the server first (in another shell):: - - cd server && uv run python advanced_app.py -""" - -from __future__ import annotations - -import argparse - -from openai import OpenAI - -BASE_URL = "http://127.0.0.1:8000" - - -def main() -> None: - parser = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter) - parser.add_argument( - "--telegram-chat-id", - required=True, - help="Native Telegram chat id to push the agent reply to.", - ) - parser.add_argument( - "--previous-response-id", - default=None, - help=( - "Existing AgentSession id (e.g. 'telegram:8741188429' or " - "'responses:local-dev'). Defaults to no resume — the server " - "creates a fresh session keyed by safety_identifier." - ), - ) - parser.add_argument( - "--no-originating", - action="store_true", - help="Skip 'originating' in response_target; only Telegram receives the reply.", - ) - parser.add_argument("prompt", nargs="*", help="Prompt to send to the agent.") - args = parser.parse_args() - - prompt = " ".join(args.prompt) or "What is the weather in Seattle?" 
- - response_target: list[str] = [] - if not args.no_originating: - response_target.append("originating") - response_target.append(f"telegram:{args.telegram_chat_id}") - - if args.previous_response_id: - print(f"Resuming AgentSession: {args.previous_response_id}") - print(f"response_target: {response_target}") - - client = OpenAI(base_url=BASE_URL, api_key="not-needed") - response = client.responses.create( - model="agent", - input=prompt, - safety_identifier="local-dev", - previous_response_id=args.previous_response_id, - extra_body={"response_target": response_target}, - ) - print(f"User: {prompt}") - print(f"Agent: {response.output_text}") - - -if __name__ == "__main__": - main() diff --git a/python/samples/04-hosting/af-hosting/local_telegram/pyproject.toml b/python/samples/04-hosting/af-hosting/local_telegram/pyproject.toml index 52a5f966ade..39e07048165 100644 --- a/python/samples/04-hosting/af-hosting/local_telegram/pyproject.toml +++ b/python/samples/04-hosting/af-hosting/local_telegram/pyproject.toml @@ -1,7 +1,7 @@ [project] name = "agent-framework-hosting-sample-advanced" version = "0.0.1" -description = "Advanced multi-channel hosting sample (Responses + Telegram with @tool, FileHistoryProvider, hooks, ResponseTarget multicast)." +description = "Advanced multi-channel hosting sample (Responses + Telegram with @tool, FileHistoryProvider, hooks)." 
requires-python = ">=3.10" dependencies = [ "agent-framework-foundry", diff --git a/python/uv.lock b/python/uv.lock index d7f4776988e..a6784b08a67 100644 --- a/python/uv.lock +++ b/python/uv.lock @@ -53,7 +53,6 @@ members = [ "agent-framework-hosting", "agent-framework-hosting-activity-protocol", "agent-framework-hosting-discord", - "agent-framework-hosting-entra", "agent-framework-hosting-invocations", "agent-framework-hosting-responses", "agent-framework-hosting-telegram", @@ -691,27 +690,6 @@ requires-dist = [ { name = "pynacl", specifier = ">=1.2.0,<2" }, ] -[[package]] -name = "agent-framework-hosting-entra" -version = "1.0.0a260424" -source = { editable = "packages/hosting-entra" } -dependencies = [ - { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, - { name = "agent-framework-hosting", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, - { name = "cryptography", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, - { name = "httpx", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, - { name = "msal", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, -] - -[package.metadata] -requires-dist = [ - { name = "agent-framework-core", editable = "packages/core" }, - { name = "agent-framework-hosting", editable = "packages/hosting" }, - { name = "cryptography", specifier = ">=42" }, - { name = "httpx", specifier = ">=0.27,<1" }, - { name = "msal", specifier = ">=1.28,<2" }, -] - [[package]] name = "agent-framework-hosting-invocations" version = "1.0.0a260424" From 81db307dad7662b05b6e00f3591a50e5c4d61116 Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Fri, 12 Jun 2026 12:20:54 +0200 Subject: [PATCH 14/20] Python: add agent-framework-hosting-a2a channel (#6306) * feat(python): add agent-framework-hosting-a2a channel 
Add a hosting channel that exposes the host target (agent or workflow) as a peer agent over the Agent-to-Agent (A2A) protocol (JSON-RPC plus a served agent card). Requests are handled by a host-routed HostAgentExecutor that drives the host pipeline (ChannelContext.run/ run_stream) instead of wrapping the target directly, so sessions, linking, and run/response hooks apply. Maps the A2A conversation/context id to a ChannelSession isolation key and the caller to a ChannelIdentity; streaming emits incremental task artifacts. Includes tests, README, and workspace registration. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address A2A hosting channel review feedback Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- python/packages/hosting-a2a/LICENSE | 21 ++ python/packages/hosting-a2a/README.md | 36 ++ .../agent_framework_hosting_a2a/__init__.py | 24 ++ .../agent_framework_hosting_a2a/_channel.py | 141 ++++++++ .../agent_framework_hosting_a2a/_executor.py | 195 +++++++++++ python/packages/hosting-a2a/pyproject.toml | 102 ++++++ .../tests/hosting_a2a/test_channel.py | 309 ++++++++++++++++++ python/pyproject.toml | 2 + python/uv.lock | 23 ++ 9 files changed, 853 insertions(+) create mode 100644 python/packages/hosting-a2a/LICENSE create mode 100644 python/packages/hosting-a2a/README.md create mode 100644 python/packages/hosting-a2a/agent_framework_hosting_a2a/__init__.py create mode 100644 python/packages/hosting-a2a/agent_framework_hosting_a2a/_channel.py create mode 100644 python/packages/hosting-a2a/agent_framework_hosting_a2a/_executor.py create mode 100644 python/packages/hosting-a2a/pyproject.toml create mode 100644 python/packages/hosting-a2a/tests/hosting_a2a/test_channel.py diff --git a/python/packages/hosting-a2a/LICENSE b/python/packages/hosting-a2a/LICENSE new file mode 100644 index 00000000000..9e841e7a26e --- /dev/null +++ 
b/python/packages/hosting-a2a/LICENSE @@ -0,0 +1,21 @@ + MIT License + + Copyright (c) Microsoft Corporation. + + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE diff --git a/python/packages/hosting-a2a/README.md b/python/packages/hosting-a2a/README.md new file mode 100644 index 00000000000..4beab43952e --- /dev/null +++ b/python/packages/hosting-a2a/README.md @@ -0,0 +1,36 @@ +# agent-framework-hosting-a2a + +Agent-to-Agent (A2A) protocol channel for `agent-framework-hosting`. + +Exposes the hosted target (an `Agent` or a `Workflow`) as an A2A peer agent: it +publishes an agent card and JSON-RPC routes and drives every request through the +host pipeline, so host sessions, request metadata, and run/response hooks all +apply. 
+ +```python +from agent_framework.openai import OpenAIChatClient +from agent_framework_hosting import AgentFrameworkHost +from agent_framework_hosting_a2a import A2AChannel + +agent = OpenAIChatClient().as_agent(name="Assistant") + +host = AgentFrameworkHost( + target=agent, + channels=[A2AChannel(url="https://my-host.example.com/")], +) +host.serve(port=8000) +``` + +By default the channel mounts at the app root so the well-known agent card is +reachable at `/.well-known/agent-card.json`, with the JSON-RPC endpoint at `/`. +The A2A `context_id` maps onto the host session (caller-supplied session family). +A default agent card is derived from the target's name and description; pass a +fully-specified `agent_card` to override it. To advertise additional protocol +bindings in the generated card, pass `supported_interfaces`. + +> **Note:** Task state is held in an in-memory A2A task store for this version; it +> is independent of the host's session storage and is not persisted across +> restarts. + +The base host plumbing lives in +[`agent-framework-hosting`](https://pypi.org/project/agent-framework-hosting/). diff --git a/python/packages/hosting-a2a/agent_framework_hosting_a2a/__init__.py b/python/packages/hosting-a2a/agent_framework_hosting_a2a/__init__.py new file mode 100644 index 00000000000..c2cfab8cad5 --- /dev/null +++ b/python/packages/hosting-a2a/agent_framework_hosting_a2a/__init__.py @@ -0,0 +1,24 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""A2A (Agent-to-Agent) channel for :mod:`agent_framework_hosting`. + +Exposes the hosted target (an ``Agent`` or a ``Workflow``) as an A2A peer agent +— publishing an agent card and JSON-RPC routes — while routing every request +through the host pipeline so sessions, request metadata, and hooks apply. 
+""" + +import importlib.metadata + +from ._channel import A2AChannel +from ._executor import HostAgentExecutor + +try: + __version__ = importlib.metadata.version(__name__) +except importlib.metadata.PackageNotFoundError: + __version__ = "0.0.0" + +__all__ = [ + "A2AChannel", + "HostAgentExecutor", + "__version__", +] diff --git a/python/packages/hosting-a2a/agent_framework_hosting_a2a/_channel.py b/python/packages/hosting-a2a/agent_framework_hosting_a2a/_channel.py new file mode 100644 index 00000000000..585725ac636 --- /dev/null +++ b/python/packages/hosting-a2a/agent_framework_hosting_a2a/_channel.py @@ -0,0 +1,141 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""A2A (Agent-to-Agent) channel for :mod:`agent_framework_hosting`. + +Exposes the hosted target as an A2A peer agent: it publishes an agent card and +JSON-RPC routes, and drives every request through the host pipeline via +:class:`HostAgentExecutor`. +""" + +from __future__ import annotations + +from collections.abc import Sequence +from typing import Any + +from a2a.server.request_handlers import DefaultRequestHandler +from a2a.server.routes import create_agent_card_routes, create_jsonrpc_routes +from a2a.server.tasks import InMemoryTaskStore +from a2a.types import AgentCapabilities, AgentCard, AgentInterface, AgentSkill +from agent_framework_hosting import ( + ChannelContext, + ChannelContribution, + ChannelResponseHook, + ChannelRunHook, +) + +from ._executor import HostAgentExecutor + + +class A2AChannel: + """Channel that exposes the hosted target over the A2A protocol. + + The A2A ``context_id`` maps onto the host session (caller-supplied session + family) and each request is routed through :class:`ChannelContext`, so host + session resolution and hooks apply. + + Note: + Task state is held in an in-memory A2A task store for this version; it + is independent of the host's session storage and is not persisted. 
+ """ + + name: str = "a2a" + + def __init__( + self, + *, + name: str | None = None, + path: str = "", + url: str = "/", + agent_name: str | None = None, + agent_description: str | None = None, + agent_version: str = "1.0.0", + agent_card: AgentCard | None = None, + skills: Sequence[AgentSkill] | None = None, + supported_interfaces: Sequence[AgentInterface] | None = None, + streaming: bool = True, + rpc_url: str = "/", + card_url: str = "/.well-known/agent-card.json", + run_hook: ChannelRunHook | None = None, + response_hook: ChannelResponseHook | None = None, + ) -> None: + """Configure the A2A channel. + + Keyword Args: + name: Override the channel name (defaults to ``"a2a"``). + path: Sub-path to mount the channel under; empty string (default) + mounts the agent-card and JSON-RPC routes at the app root so + the well-known card path is reachable. + url: Public URL advertised in the agent card's interface (the base + URL clients use to reach the JSON-RPC endpoint). + agent_name: Name advertised in the default agent card. Defaults to + the hosted target's name. + agent_description: Description advertised in the default agent card. + Defaults to the hosted target's description. + agent_version: Version advertised in the default agent card. + agent_card: A fully-specified agent card; when provided it takes + precedence over the ``agent_*``/``url``/``skills`` fields. + skills: Skills advertised in the default agent card. + supported_interfaces: Interfaces advertised in the default agent card. + Defaults to one JSON-RPC interface using ``url``. + streaming: Consume the target via streaming and publish incremental + A2A task artifacts (default ``True``). + rpc_url: Path for the JSON-RPC endpoint (relative to ``path``). + card_url: Path for the agent-card endpoint (relative to ``path``). + run_hook: Optional run hook applied to each request. + response_hook: Optional response hook applied to originating replies. 
+ """ + if name is not None: + self.name = name + self.path = path + self._url = url + self._agent_name = agent_name + self._agent_description = agent_description + self._agent_version = agent_version + self._agent_card = agent_card + self._skills = list(skills) if skills is not None else [] + self._supported_interfaces = list(supported_interfaces) if supported_interfaces is not None else None + self._streaming = streaming + self._rpc_url = rpc_url + self._card_url = card_url + self._run_hook = run_hook + self._response_hook = response_hook + + def _build_agent_card(self, context: ChannelContext) -> AgentCard: + """Derive a default agent card from the hosted target, if not supplied.""" + if self._agent_card is not None: + return self._agent_card + target: Any = context.target + name = self._agent_name or getattr(target, "name", None) or self.name + description = self._agent_description or getattr(target, "description", None) or f"{name} (A2A)" + return AgentCard( + name=name, + description=description, + version=self._agent_version, + default_input_modes=["text"], + default_output_modes=["text"], + capabilities=AgentCapabilities(streaming=self._streaming), + supported_interfaces=self._supported_interfaces + or [AgentInterface(url=self._url, protocol_binding="JSONRPC")], + skills=self._skills, + ) + + def contribute(self, context: ChannelContext) -> ChannelContribution: + """Build the A2A request handler and contribute its routes.""" + agent_card = self._build_agent_card(context) + executor = HostAgentExecutor( + context, + channel_name=self.name, + streaming=self._streaming, + run_hook=self._run_hook, + response_hook=self._response_hook, + ) + handler = DefaultRequestHandler( + agent_executor=executor, + task_store=InMemoryTaskStore(), + agent_card=agent_card, + ) + routes = [ + *create_agent_card_routes(agent_card, card_url=self._card_url), + *create_jsonrpc_routes(handler, self._rpc_url), + ] + return ChannelContribution(routes=routes) diff --git 
a/python/packages/hosting-a2a/agent_framework_hosting_a2a/_executor.py b/python/packages/hosting-a2a/agent_framework_hosting_a2a/_executor.py new file mode 100644 index 00000000000..495209442e1 --- /dev/null +++ b/python/packages/hosting-a2a/agent_framework_hosting_a2a/_executor.py @@ -0,0 +1,195 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Host-routed A2A :class:`AgentExecutor`. + +Unlike ``agent_framework_a2a.A2AExecutor`` (which calls ``agent.run`` directly +and manages its own session), :class:`HostAgentExecutor` routes every incoming +A2A request through the host pipeline via :class:`ChannelContext` — so host +session resolution, request metadata, and run/response hooks all apply. The A2A +``context_id`` maps onto :class:`ChannelSession` (caller-supplied session +family). +""" + +from __future__ import annotations + +import base64 +import re +from asyncio import CancelledError +from typing import Any, cast + +from a2a.server.agent_execution import AgentExecutor, RequestContext +from a2a.server.events import EventQueue +from a2a.server.tasks import TaskUpdater +from a2a.types import Part, Task, TaskState +from agent_framework import Content +from agent_framework_hosting import ( + ChannelContext, + ChannelIdentity, + ChannelRequest, + ChannelResponseHook, + ChannelRunHook, + ChannelSession, + logger, +) + +try: + from a2a.helpers import new_task_from_user_message +except ImportError: # pragma: no cover - older a2a-sdk layout + from a2a.utils import new_task_from_user_message # type: ignore[no-redef, attr-defined, import-not-found] + +_DATA_URI_PATTERN = re.compile(r"^data:(?P<media_type>[^;]+);base64,(?P<data>[A-Za-z0-9+/=]+)$") + + +def _contents_to_parts(contents: list[Content]) -> list[Part]: + """Convert Agent Framework contents into A2A parts (text, uri, inline data).""" + parts: list[Part] = [] + for content in contents: + if content.type == "text" and content.text: + parts.append(Part(text=content.text)) + elif content.type == "uri" and content.uri: +
parts.append(Part(url=content.uri, media_type=content.media_type or "")) + elif content.type == "data" and content.uri: + match = _DATA_URI_PATTERN.match(content.uri) + if match is None: + logger.warning("A2AChannel could not parse data URI; omitted.") + continue + parts.append(Part(raw=base64.b64decode(match.group("data")), media_type=content.media_type or "")) + else: + logger.warning("A2AChannel does not support content type: %s. Omitted.", content.type) + return parts + + +class HostAgentExecutor(AgentExecutor): + """A2A executor that drives the hosted target through :class:`ChannelContext`.""" + + def __init__( + self, + context: ChannelContext, + *, + channel_name: str, + streaming: bool = True, + run_hook: ChannelRunHook | None = None, + response_hook: ChannelResponseHook | None = None, + ) -> None: + """Bind the executor to the host context. + + Args: + context: The host-supplied :class:`ChannelContext`. + + Keyword Args: + channel_name: The owning channel's name (stamped on requests). + streaming: When ``True`` (default) the target is consumed via + :meth:`ChannelContext.run_stream` and incremental updates are + published as A2A task artifacts; otherwise the full reply is + published as a single working-state message. + run_hook: Optional :data:`ChannelRunHook` applied to the request. + response_hook: Optional :data:`ChannelResponseHook` applied to the + originating final response. 
+ """ + super().__init__() + self._ctx = context + self._channel_name = channel_name + self._streaming = streaming + self._run_hook = run_hook + self._response_hook = response_hook + + async def cancel(self, context: RequestContext, event_queue: EventQueue) -> None: + """Publish a cancellation event for the in-flight task.""" + if context.context_id is None: + raise ValueError("Context ID must be provided in the RequestContext") + updater = TaskUpdater(event_queue, context.task_id or "", context.context_id) + await updater.cancel() + + async def execute(self, context: RequestContext, event_queue: EventQueue) -> None: + """Route an A2A request through the host and publish task events.""" + if context.context_id is None: + raise ValueError("Context ID must be provided in the RequestContext") + if context.message is None: + raise ValueError("Message must be provided in the RequestContext") + + query = context.get_user_input() + task: Task | None = context.current_task + if not task: + task = cast(Task, new_task_from_user_message(context.message)) # type: ignore[redundant-cast] + await event_queue.enqueue_event(task) + + task_id: str = task.id + updater = TaskUpdater(event_queue, task_id, context.context_id) + await updater.submit() + + try: + await updater.start_work() + request = self._build_request(query, context, task_id) + if request.stream: + await self._run_stream(request, updater, protocol_request=context.message) + else: + await self._run(request, updater, protocol_request=context.message) + await updater.complete() + except CancelledError: + await updater.update_status(state=TaskState.TASK_STATE_CANCELED) + except Exception as exc: + logger.exception("A2AChannel encountered an error during execution.") + await updater.update_status( + state=TaskState.TASK_STATE_FAILED, + message=updater.new_agent_message([Part(text=str(exc))]), + ) + + def _build_request(self, query: Any, context: RequestContext, task_id: str) -> ChannelRequest: + """Build the channel-neutral 
request from the A2A request context.""" + context_id = cast(str, context.context_id) + return ChannelRequest( + channel=self._channel_name, + operation="message.create", + input=query if isinstance(query, str) else str(query), + session=ChannelSession(isolation_key=context_id), + stream=self._streaming, + identity=ChannelIdentity(channel=self._channel_name, native_id=context_id), + attributes={"task_id": task_id}, + ) + + async def _run(self, request: ChannelRequest, updater: TaskUpdater, *, protocol_request: Any) -> None: + """Non-streaming: run the target and publish the reply as task messages.""" + result = await self._ctx.run( + request, + run_hook=self._run_hook, + protocol_request=protocol_request, + response_hook=self._response_hook, + channel_name=self._channel_name, + ) + response: Any = result.result + messages: list[Any] = list(getattr(response, "messages", None) or []) + for message in messages: + if getattr(message, "role", None) == "user": + continue + contents: list[Content] = list(getattr(message, "contents", None) or []) + parts = _contents_to_parts(contents) + if parts: + await updater.update_status( + state=TaskState.TASK_STATE_WORKING, + message=updater.new_agent_message(parts=parts), + ) + + async def _run_stream(self, request: ChannelRequest, updater: TaskUpdater, *, protocol_request: Any) -> None: + """Streaming: publish incremental updates as task artifacts.""" + streamed_ids: set[str] = set() + stream = await self._ctx.run_stream( + request, + run_hook=self._run_hook, + protocol_request=protocol_request, + response_hook=self._response_hook, + channel_name=self._channel_name, + ) + async for update in stream: + contents: list[Content] = list(getattr(update, "contents", None) or []) + parts = _contents_to_parts(contents) + if not parts: + continue + message_id: str | None = getattr(update, "message_id", None) + await updater.add_artifact( + parts=parts, + artifact_id=message_id, + append=True if message_id is not None and message_id in 
streamed_ids else None, + ) + if message_id is not None: + streamed_ids.add(message_id) + await stream.get_final_response() diff --git a/python/packages/hosting-a2a/pyproject.toml b/python/packages/hosting-a2a/pyproject.toml new file mode 100644 index 00000000000..ee72982537a --- /dev/null +++ b/python/packages/hosting-a2a/pyproject.toml @@ -0,0 +1,102 @@ +[project] +name = "agent-framework-hosting-a2a" +description = "Agent-to-Agent (A2A) protocol channel for agent-framework-hosting." +authors = [{ name = "Microsoft", email = "af-support@microsoft.com"}] +readme = "README.md" +requires-python = ">=3.10" +version = "1.0.0a260424" +license-files = ["LICENSE"] +urls.homepage = "https://aka.ms/agent-framework" +urls.source = "https://github.com/microsoft/agent-framework/tree/main/python" +urls.release_notes = "https://github.com/microsoft/agent-framework/releases?q=tag%3Apython-1&expanded=true" +urls.issues = "https://github.com/microsoft/agent-framework/issues" +classifiers = [ + "License :: OSI Approved :: MIT License", + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", + "Programming Language :: Python :: 3.14", + "Typing :: Typed", +] +dependencies = [ + "agent-framework-core>=1.2.0,<2", + "agent-framework-hosting>=1.0.0a260424,<2", + "a2a-sdk>=1.0.0,<2", + "starlette>=0.37", +] + +[tool.uv] +prerelease = "if-necessary-or-explicit" +environments = [ + "sys_platform == 'darwin'", + "sys_platform == 'linux'", + "sys_platform == 'win32'" +] + +[tool.uv-dynamic-versioning] +fallback-version = "0.0.0" + +[tool.pytest.ini_options] +testpaths = 'tests' +addopts = "-ra -q -r fEX" +asyncio_mode = "auto" +asyncio_default_fixture_loop_scope = "function" +filterwarnings = [] +timeout = 120 +markers = [ + "integration: marks tests as 
integration tests that require external services", +] + +[tool.ruff] +extend = "../../pyproject.toml" + +[tool.coverage.run] +omit = [ + "**/__init__.py" +] + +[tool.pyright] +extends = "../../pyproject.toml" +include = ["agent_framework_hosting_a2a"] +exclude = ['tests'] + +[tool.mypy] +plugins = ['pydantic.mypy'] +strict = true +python_version = "3.10" +ignore_missing_imports = true +disallow_untyped_defs = true +no_implicit_optional = true +check_untyped_defs = true +warn_return_any = true +show_error_codes = true +warn_unused_ignores = false +disallow_incomplete_defs = true +disallow_untyped_decorators = true + +[tool.bandit] +targets = ["agent_framework_hosting_a2a"] +exclude_dirs = ["tests"] + +[tool.poe] +executor.type = "uv" +include = "../../shared_tasks.toml" + +[tool.poe.tasks.mypy] +help = "Run MyPy for this package." +cmd = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_hosting_a2a" + +[tool.poe.tasks.test] +help = "Run the default unit test suite for this package." +cmd = 'pytest -m "not integration" --cov=agent_framework_hosting_a2a --cov-report=term-missing:skip-covered tests' + +[build-system] +requires = ["flit-core >= 3.11,<4.0"] +build-backend = "flit_core.buildapi" + +[dependency-groups] +dev = [] diff --git a/python/packages/hosting-a2a/tests/hosting_a2a/test_channel.py b/python/packages/hosting-a2a/tests/hosting_a2a/test_channel.py new file mode 100644 index 00000000000..467c67d431d --- /dev/null +++ b/python/packages/hosting-a2a/tests/hosting_a2a/test_channel.py @@ -0,0 +1,309 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""Unit tests for :class:`A2AChannel` and :class:`HostAgentExecutor`.""" + +from __future__ import annotations + +import asyncio +from collections.abc import AsyncIterator, Awaitable +from contextlib import asynccontextmanager +from dataclasses import dataclass, field +from typing import Any + +import pytest +import uvicorn +from a2a.server.events import EventQueue +from a2a.types import AgentCard, AgentInterface, Message, Part, Role, Task, TaskState +from agent_framework import AgentResponse, Content +from agent_framework import Message as AFMessage +from agent_framework_a2a import A2AAgent +from agent_framework_hosting import AgentFrameworkHost, ChannelContribution, ChannelRequest, HostedRunResult +from starlette.types import ASGIApp + +from agent_framework_hosting_a2a import A2AChannel, HostAgentExecutor + +# --------------------------------------------------------------------------- # +# Fakes # +# --------------------------------------------------------------------------- # + + +@dataclass +class _FakeResp: + text: str + messages: list[Message] = field(default_factory=list) + + +@dataclass +class _FakeUpdate: + text: str + contents: list[Content] = field(default_factory=list) + message_id: str | None = None + + +class _FakeStream: + def __init__(self, chunks: list[str]) -> None: + self._chunks = chunks + self._final = _FakeResp(text="".join(chunks)) + + def __aiter__(self) -> AsyncIterator[_FakeUpdate]: + async def _gen() -> AsyncIterator[_FakeUpdate]: + for i, c in enumerate(self._chunks): + yield _FakeUpdate(text=c, contents=[Content.from_text(text=c)], message_id=f"m{i}") + + return _gen() + + async def get_final_response(self) -> _FakeResp: + return self._final + + +@dataclass +class _FakeTarget: + name: str = "Assistant" + description: str = "A helpful assistant." 
+ + +class _FakeContext: + def __init__( + self, + *, + reply: str = "hello", + chunks: list[str] | None = None, + ) -> None: + self.target = _FakeTarget() + self._reply = reply + self._chunks = chunks or [reply] + self.requests: list[ChannelRequest] = [] + + async def run( + self, + request: ChannelRequest, + *, + run_hook: Any | None = None, + protocol_request: Any | None = None, + response_hook: Any | None = None, + channel_name: str | None = None, + ) -> HostedRunResult[Any]: + if run_hook is not None: + maybe_request = run_hook(request, target=self.target, protocol_request=protocol_request) + if isinstance(maybe_request, Awaitable): + request = await maybe_request + else: + request = maybe_request + self.requests.append(request) + msg = Message(role=Role.ROLE_AGENT, parts=[Part(text=self._reply)]) + result = HostedRunResult(_FakeResp(text=self._reply, messages=[msg])) + if response_hook is not None: + maybe_result = response_hook(result, request=request, channel_name=channel_name or request.channel) + if isinstance(maybe_result, Awaitable): + return await maybe_result + return maybe_result + return result + + async def run_stream( + self, + request: ChannelRequest, + *, + run_hook: Any | None = None, + protocol_request: Any | None = None, + stream_update_hook: Any | None = None, + response_hook: Any | None = None, + channel_name: str | None = None, + ) -> _FakeStream: + if run_hook is not None: + maybe_request = run_hook(request, target=self.target, protocol_request=protocol_request) + if isinstance(maybe_request, Awaitable): + request = await maybe_request + else: + request = maybe_request + self.requests.append(request) + return _FakeStream(self._chunks) + + +class _RecordingEventQueue(EventQueue): + def __init__(self) -> None: + super().__init__() + self.events: list[Any] = [] + + async def enqueue_event(self, event: Any) -> None: + self.events.append(event) + await super().enqueue_event(event) + + +class _FakeRequestContext: + def __init__(self, *, 
context_id: str, text: str, current_task: Task | None = None) -> None: + self.context_id = context_id + self.task_id: str | None = None + self.message = Message( + message_id="msg-1", + context_id=context_id, + role=Role.ROLE_USER, + parts=[Part(text=text)], + ) + self.current_task = current_task + self._text = text + + def get_user_input(self) -> str: + return self._text + + +class _HostedAgent: + name = "HostedAssistant" + description = "A hosted test assistant." + + async def run(self, messages: Any = None, *, stream: bool = False, **_kwargs: Any) -> AgentResponse[Any]: + text = messages.text if isinstance(messages, AFMessage) else str(messages) + return AgentResponse(messages=[AFMessage(role="assistant", contents=[Content.from_text(text=f"host: {text}")])]) + + +@asynccontextmanager +async def _serve_app(app: ASGIApp, *, port: int) -> AsyncIterator[str]: + config = uvicorn.Config(app, host="127.0.0.1", port=port, log_level="warning", lifespan="on") + server = uvicorn.Server(config) + task = asyncio.create_task(server.serve()) + try: + for _ in range(100): + if server.started: + break + await asyncio.sleep(0.01) + else: + raise RuntimeError("Test A2A server did not start") + yield f"http://127.0.0.1:{port}" + finally: + server.should_exit = True + await task + + +def _status_states(events: list[Any]) -> list[int]: + states: list[int] = [] + for event in events: + status = getattr(event, "status", None) + if status is not None and getattr(status, "state", None): + states.append(status.state) + return states + + +# --------------------------------------------------------------------------- # +# A2AChannel tests # +# --------------------------------------------------------------------------- # + + +def test_default_name_and_root_path() -> None: + channel = A2AChannel() + assert channel.name == "a2a" + assert channel.path == "" + + +def test_build_agent_card_defaults_from_target() -> None: + channel = A2AChannel(url="https://example.com/") + card = 
channel._build_agent_card(_FakeContext()) # type: ignore[arg-type] + assert card.name == "Assistant" + assert card.description == "A helpful assistant." + assert card.capabilities.streaming is True + assert card.supported_interfaces[0].url == "https://example.com/" + + +def test_build_agent_card_accepts_supported_interfaces() -> None: + interfaces = [ + AgentInterface(url="https://example.com/jsonrpc", protocol_binding="JSONRPC"), + AgentInterface(url="https://example.com/grpc", protocol_binding="GRPC"), + ] + channel = A2AChannel(supported_interfaces=interfaces) + card = channel._build_agent_card(_FakeContext()) # type: ignore[arg-type] + assert card.supported_interfaces == interfaces + + +def test_build_agent_card_override_wins() -> None: + custom = AgentCard(name="Custom", description="custom card", version="9.9.9") + channel = A2AChannel(agent_card=custom) + card = channel._build_agent_card(_FakeContext()) # type: ignore[arg-type] + assert card.name == "Custom" + assert card.version == "9.9.9" + + +def test_contribute_returns_card_and_jsonrpc_routes() -> None: + channel = A2AChannel(url="https://example.com/") + contribution = channel.contribute(_FakeContext()) # type: ignore[arg-type] + assert isinstance(contribution, ChannelContribution) + paths = {getattr(r, "path", None) for r in contribution.routes} + assert "/.well-known/agent-card.json" in paths + assert any(p == "/" for p in paths) + + +# --------------------------------------------------------------------------- # +# HostAgentExecutor tests # +# --------------------------------------------------------------------------- # + + +async def test_execute_routes_through_host_and_completes() -> None: + ctx = _FakeContext(reply="hi back") + executor = HostAgentExecutor(ctx, channel_name="a2a", streaming=False) # type: ignore[arg-type] + queue = _RecordingEventQueue() + request_context = _FakeRequestContext(context_id="conv-1", text="hello") + + await executor.execute(request_context, queue) # type: 
ignore[arg-type] + + # Routed through the host with the context id mapped onto the session. + assert len(ctx.requests) == 1 + request = ctx.requests[0] + assert request.channel == "a2a" + assert request.input == "hello" + assert request.session is not None + assert request.session.isolation_key == "conv-1" + assert request.identity is not None + assert request.identity.native_id == "conv-1" + # Task progressed to a completed state. + assert TaskState.TASK_STATE_COMPLETED in _status_states(queue.events) + + +async def test_execute_streaming_emits_artifacts() -> None: + ctx = _FakeContext(chunks=["foo", "bar"]) + executor = HostAgentExecutor(ctx, channel_name="a2a", streaming=True) # type: ignore[arg-type] + queue = _RecordingEventQueue() + request_context = _FakeRequestContext(context_id="conv-2", text="hello") + + await executor.execute(request_context, queue) # type: ignore[arg-type] + + artifact_events = [e for e in queue.events if getattr(e, "artifact", None)] + assert artifact_events, "expected at least one artifact update event" + assert ctx.requests[0].stream is True + assert TaskState.TASK_STATE_COMPLETED in _status_states(queue.events) + + +async def test_execute_requires_context_id() -> None: + ctx = _FakeContext() + executor = HostAgentExecutor(ctx, channel_name="a2a") # type: ignore[arg-type] + queue = _RecordingEventQueue() + request_context = _FakeRequestContext(context_id="x", text="hello") + request_context.context_id = None # type: ignore[assignment] + + with pytest.raises(ValueError, match="Context ID"): + await executor.execute(request_context, queue) # type: ignore[arg-type] + + +async def test_a2a_agent_can_call_hosted_channel(unused_tcp_port: int) -> None: + host = AgentFrameworkHost(target=_HostedAgent(), channels=[A2AChannel(streaming=False)]) + + async with ( + _serve_app(host.app, port=unused_tcp_port) as base_url, + A2AAgent( + url=base_url, + timeout=5.0, + ) as agent, + ): + response = await agent.run("hello") + + assert 
response.messages[0].text == "host: hello" + + +def test_contents_to_parts_conversion() -> None: + from agent_framework_hosting_a2a._executor import _contents_to_parts + + contents = [ + Content.from_text(text="hello"), + Content.from_uri(uri="https://x/y.png", media_type="image/png"), + Content.from_data(data=b"AAAA", media_type="image/png"), + ] + parts = _contents_to_parts(contents) + assert parts[0].text == "hello" + assert parts[1].url == "https://x/y.png" + assert parts[2].raw == b"AAAA" diff --git a/python/pyproject.toml b/python/pyproject.toml index 19d33355289..f992269af14 100644 --- a/python/pyproject.toml +++ b/python/pyproject.toml @@ -92,6 +92,7 @@ agent-framework-hosting-invocations = { workspace = true } agent-framework-hosting-telegram = { workspace = true } agent-framework-hosting-activity-protocol = { workspace = true } agent-framework-hosting-discord = { workspace = true } +agent-framework-hosting-a2a = { workspace = true } agent-framework-hyperlight = { workspace = true } agent-framework-lab = { workspace = true } agent-framework-mem0 = { workspace = true } @@ -219,6 +220,7 @@ executionEnvironments = [ { root = "packages/hosting-invocations/tests", reportPrivateUsage = "none" }, { root = "packages/hosting-telegram/tests", reportPrivateUsage = "none" }, { root = "packages/hosting-activity-protocol/tests", reportPrivateUsage = "none" }, + { root = "packages/hosting-a2a/tests", reportPrivateUsage = "none" }, { root = "packages/lab/gaia/tests", reportPrivateUsage = "none" }, { root = "packages/lab/lightning/tests", reportPrivateUsage = "none" }, { root = "packages/lab/tau2/tests", reportPrivateUsage = "none" }, diff --git a/python/uv.lock b/python/uv.lock index a6784b08a67..6f611cc2aa8 100644 --- a/python/uv.lock +++ b/python/uv.lock @@ -51,6 +51,7 @@ members = [ "agent-framework-gemini", "agent-framework-github-copilot", "agent-framework-hosting", + "agent-framework-hosting-a2a", "agent-framework-hosting-activity-protocol", 
"agent-framework-hosting-discord", "agent-framework-hosting-invocations", @@ -652,6 +653,28 @@ provides-extras = ["serve", "disk"] [package.metadata.requires-dev] dev = [{ name = "httpx", specifier = ">=0.28.1" }] +[[package]] +name = "agent-framework-hosting-a2a" +version = "1.0.0a260424" +source = { editable = "packages/hosting-a2a" } +dependencies = [ + { name = "a2a-sdk", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "agent-framework-hosting", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "starlette", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] + +[package.metadata] +requires-dist = [ + { name = "a2a-sdk", specifier = ">=1.0.0,<2" }, + { name = "agent-framework-core", editable = "packages/core" }, + { name = "agent-framework-hosting", editable = "packages/hosting" }, + { name = "starlette", specifier = ">=0.37" }, +] + +[package.metadata.requires-dev] +dev = [] + [[package]] name = "agent-framework-hosting-activity-protocol" version = "1.0.0a260424" From 54b390dd17e58fe6ac37d3823c06bb39071b476f Mon Sep 17 00:00:00 2001 From: Eduard van Valkenburg Date: Fri, 12 Jun 2026 12:25:43 +0200 Subject: [PATCH 15/20] Python: add agent-framework-hosting-mcp channel (#6305) * feat(python): add agent-framework-hosting-mcp channel Add a hosting channel that exposes the host target (agent or workflow) as a single Model Context Protocol tool over Streamable HTTP. The tool invocation routes through the host pipeline (ChannelContext.run/ run_stream) so sessions, linking, and run/response hooks apply. Maps the MCP request context to a ChannelSession isolation key and ChannelIdentity, and forwards streaming output as MCP progress notifications. 
Includes tests, README, and workspace registration. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> * Address MCP hosting channel review feedback Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --------- Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- python/packages/hosting-mcp/LICENSE | 21 + python/packages/hosting-mcp/README.md | 29 ++ .../agent_framework_hosting_mcp/__init__.py | 23 + .../agent_framework_hosting_mcp/_channel.py | 437 ++++++++++++++++++ python/packages/hosting-mcp/pyproject.toml | 102 ++++ .../tests/hosting_mcp/test_channel.py | 390 ++++++++++++++++ python/pyproject.toml | 2 + python/uv.lock | 23 + 8 files changed, 1027 insertions(+) create mode 100644 python/packages/hosting-mcp/LICENSE create mode 100644 python/packages/hosting-mcp/README.md create mode 100644 python/packages/hosting-mcp/agent_framework_hosting_mcp/__init__.py create mode 100644 python/packages/hosting-mcp/agent_framework_hosting_mcp/_channel.py create mode 100644 python/packages/hosting-mcp/pyproject.toml create mode 100644 python/packages/hosting-mcp/tests/hosting_mcp/test_channel.py diff --git a/python/packages/hosting-mcp/LICENSE b/python/packages/hosting-mcp/LICENSE new file mode 100644 index 00000000000..9e841e7a26e --- /dev/null +++ b/python/packages/hosting-mcp/LICENSE @@ -0,0 +1,21 @@ + MIT License + + Copyright (c) Microsoft Corporation. 
+ + Permission is hereby granted, free of charge, to any person obtaining a copy + of this software and associated documentation files (the "Software"), to deal + in the Software without restriction, including without limitation the rights + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + copies of the Software, and to permit persons to whom the Software is + furnished to do so, subject to the following conditions: + + The above copyright notice and this permission notice shall be included in all + copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE + SOFTWARE diff --git a/python/packages/hosting-mcp/README.md b/python/packages/hosting-mcp/README.md new file mode 100644 index 00000000000..c6f01c4f45a --- /dev/null +++ b/python/packages/hosting-mcp/README.md @@ -0,0 +1,29 @@ +# agent-framework-hosting-mcp + +Model Context Protocol (MCP) tool channel for `agent-framework-hosting`. + +Exposes the hosted target (an `Agent` or a `Workflow`) as a single MCP tool over +the Streamable-HTTP transport, so MCP clients — other agents, IDE tooling — can +invoke it. Every call is routed through the host pipeline, so host sessions, +request metadata, and run/response hooks all apply. 
+ +```python +from agent_framework.openai import OpenAIChatClient +from agent_framework_hosting import AgentFrameworkHost +from agent_framework_hosting_mcp import MCPChannel + +agent = OpenAIChatClient().as_agent(name="Assistant") + +host = AgentFrameworkHost(target=agent, channels=[MCPChannel()]) +host.serve(port=8000) +``` + +The Streamable-HTTP endpoint is mounted at `path` (default `/mcp`). The advertised +tool accepts `{"input": str, "session_id": str?}` and returns the target's reply +as MCP content blocks, including structured output when the agent returns one. +Pass `session_id` to continue a prior conversation (it maps onto the host +session). When `streaming=True` (default) incremental text is forwarded as MCP +progress notifications while the full reply is returned as the tool result. + +The base host plumbing lives in +[`agent-framework-hosting`](https://pypi.org/project/agent-framework-hosting/). diff --git a/python/packages/hosting-mcp/agent_framework_hosting_mcp/__init__.py b/python/packages/hosting-mcp/agent_framework_hosting_mcp/__init__.py new file mode 100644 index 00000000000..90abbd8776c --- /dev/null +++ b/python/packages/hosting-mcp/agent_framework_hosting_mcp/__init__.py @@ -0,0 +1,23 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""Model Context Protocol (MCP) tool channel for :mod:`agent_framework_hosting`. + +Exposes the hosted target (an ``Agent`` or a ``Workflow``) as a single MCP +tool over the Streamable-HTTP transport so MCP clients — other agents, IDE +tooling — can invoke it. Routes through the host pipeline, so sessions, +request metadata, and hooks apply. 
+""" + +import importlib.metadata + +from ._channel import MCPChannel + +try: + __version__ = importlib.metadata.version(__name__) +except importlib.metadata.PackageNotFoundError: + __version__ = "0.0.0" + +__all__ = [ + "MCPChannel", + "__version__", +] diff --git a/python/packages/hosting-mcp/agent_framework_hosting_mcp/_channel.py b/python/packages/hosting-mcp/agent_framework_hosting_mcp/_channel.py new file mode 100644 index 00000000000..427268c4e13 --- /dev/null +++ b/python/packages/hosting-mcp/agent_framework_hosting_mcp/_channel.py @@ -0,0 +1,437 @@ +# Copyright (c) Microsoft. All rights reserved. + +"""``MCPChannel`` — exposes the hosted target as a Model Context Protocol tool. + +Mounts a Streamable-HTTP MCP endpoint that advertises a single tool. An MCP +client (another agent, an IDE, tooling) calls the tool with +``{"input": "...", "session_id": "..."}`` and receives the target's reply as +the tool result. + +Like the other ``agent-framework-hosting`` channels this routes through the +host pipeline (``ChannelContext.run`` / ``run_stream``) so session resolution, +request metadata, and run/response hooks all apply. The MCP ``tool/call`` +conversation key maps onto :class:`ChannelSession` (caller-supplied-session +family); the same single-tool shape works for an ``Agent`` or a ``Workflow`` +target (use a ``run_hook`` to reshape the free-form input into a workflow's +typed inputs). 
+""" + +from __future__ import annotations + +import base64 +import json +import re +from collections.abc import Mapping, Sequence +from contextlib import AbstractAsyncContextManager +from dataclasses import asdict, is_dataclass +from typing import Any, cast + +import mcp.types as types +from agent_framework import Content, Message +from agent_framework_hosting import ( + ChannelContext, + ChannelContribution, + ChannelIdentity, + ChannelRequest, + ChannelResponseHook, + ChannelRunHook, + ChannelSession, + HostedRunResult, + logger, +) +from mcp.server.lowlevel import Server +from mcp.server.streamable_http_manager import StreamableHTTPSessionManager +from pydantic import AnyUrl +from starlette.routing import Mount +from starlette.types import Receive, Scope, Send + +_DEFAULT_TOOL_NAME = "run_agent" +_DEFAULT_TOOL_DESCRIPTION = ( + "Invoke the hosted agent (or workflow) with a free-form text request and " + "return its reply. Pass an optional ``session_id`` to continue a prior " + "conversation." 
+) +_DATA_URI_PATTERN = re.compile(r"^data:(?P<media_type>[^;]+);base64,(?P<data>[A-Za-z0-9+/=]+)$") + + +def _mcp_uri(uri: str) -> AnyUrl: + """Build an MCP URI model from a string URI.""" + return AnyUrl(uri) + + +def _json_safe(value: Any) -> Any: + """Return a JSON-serializable representation for MCP structured content.""" + try: + return json.loads(json.dumps(value, default=str)) + except (TypeError, ValueError): + return str(value) + + +def _structured_content(value: Any) -> dict[str, Any] | None: + """Normalize an Agent Framework structured output value for MCP.""" + if value is None: + return None + + model_dump = getattr(value, "model_dump", None) + if callable(model_dump): + value = model_dump(mode="json") + elif is_dataclass(value) and not isinstance(value, type): + value = asdict(value) + + if isinstance(value, Mapping): + mapping_value = cast("Mapping[Any, Any]", value) # type: ignore[redundant-cast] + safe_value = _json_safe(dict(mapping_value)) + if isinstance(safe_value, dict): + safe_mapping = cast("Mapping[Any, Any]", safe_value) + return {str(key): item for key, item in safe_mapping.items()} + return {"value": safe_value} + safe_value = _json_safe(value) + return {"value": safe_value} + + +def _data_content_to_mcp(content: Content) -> list[types.ContentBlock]: + """Convert Agent Framework data content into the closest MCP content block.""" + if not content.uri: + return [] + match = _DATA_URI_PATTERN.match(content.uri) + if match is None: + logger.warning("MCPChannel could not parse data URI; omitted.") + return [] + + media_type = content.media_type or match.group("media_type") + data = match.group("data") + if media_type.startswith("image/"): + return [types.ImageContent(type="image", data=data, mimeType=media_type)] + if media_type.startswith("audio/"): + return [types.AudioContent(type="audio", data=data, mimeType=media_type)] + return [ + types.EmbeddedResource( + type="resource", + resource=types.BlobResourceContents(uri=_mcp_uri(content.uri), 
mimeType=media_type, blob=data), + ) + ] + + +def _content_to_mcp(content: Content) -> list[types.ContentBlock]: + """Convert one Agent Framework content item into MCP content blocks.""" + match content.type: + case "text": + return [types.TextContent(type="text", text=content.text or "")] + case "text_reasoning": + return [types.TextContent(type="text", text=content.text)] if content.text else [] + case "data": + return _data_content_to_mcp(content) + case "uri": + if not content.uri: + return [] + block: types.ContentBlock = types.ResourceLink( + type="resource_link", + name=content.uri, + uri=_mcp_uri(content.uri), + mimeType=content.media_type, + ) + return [block] + case "function_result": + if content.items: + blocks: list[types.ContentBlock] = [] + for item in content.items: + blocks.extend(_content_to_mcp(item)) + return blocks + return [types.TextContent(type="text", text=str(content.result or ""))] + case "error": + return [types.TextContent(type="text", text=content.message or content.error_details or "")] + case _: + logger.warning("MCPChannel does not support content type: %s. 
Omitted.", content.type) + return [] + + +def _value_to_mcp(value: Any) -> list[types.ContentBlock]: + """Convert a workflow output or fallback value into MCP content blocks.""" + if isinstance(value, Content): + return _content_to_mcp(value) + if isinstance(value, Message): + blocks: list[types.ContentBlock] = [] + for content in value.contents: + blocks.extend(_content_to_mcp(content)) + return blocks + if isinstance(value, str): + return [types.TextContent(type="text", text=value)] + if isinstance(value, bytes): + data = base64.b64encode(value).decode("utf-8") + return [ + types.EmbeddedResource( + type="resource", + resource=types.BlobResourceContents( + uri=_mcp_uri("data:application/octet-stream;base64," + data), + mimeType="application/octet-stream", + blob=data, + ), + ) + ] + return [types.TextContent(type="text", text=json.dumps(_json_safe(value), default=str))] + + +class MCPChannel: + """Exposes the hosted target as a single MCP tool over Streamable HTTP. + + Mounts the MCP Streamable-HTTP transport at ``path`` (default ``/mcp``). + The advertised tool accepts ``{"input": str, "session_id": str?}`` and + returns the target's reply as MCP content blocks. Agent structured outputs + are returned as MCP ``structuredContent``. + """ + + name = "mcp" + + def __init__( + self, + *, + path: str = "/mcp", + tool_name: str = _DEFAULT_TOOL_NAME, + tool_description: str = _DEFAULT_TOOL_DESCRIPTION, + server_name: str | None = None, + server_version: str | None = None, + streaming: bool = True, + json_response: bool = False, + stateless: bool = False, + run_hook: ChannelRunHook | None = None, + response_hook: ChannelResponseHook | None = None, + ) -> None: + """Create an MCP tool channel. + + Keyword Args: + path: Mount path for the Streamable-HTTP transport. Default ``/mcp``. + tool_name: Name of the advertised tool. Default ``run_agent``. + tool_description: Human-readable description advertised to clients. 
+ server_name: MCP server name reported in the initialize handshake. + Defaults to the hosted target's ``name`` attribute when available. + server_version: Optional MCP server version string. + streaming: When ``True`` (default) the channel consumes the target + via :meth:`ChannelContext.run_stream` and forwards incremental + text to the client as MCP progress notifications (when the + client supplied a ``progressToken``). The full reply is always + returned as the tool result regardless of this flag. + json_response: Forwarded to :class:`StreamableHTTPSessionManager`. + When ``True`` the transport returns a single JSON response + instead of an SSE stream for each request. + stateless: Forwarded to :class:`StreamableHTTPSessionManager`. When + ``True`` the transport does not retain per-session state between + requests. + run_hook: Optional :data:`ChannelRunHook` invoked with the parsed + :class:`ChannelRequest` before the target runs. + response_hook: Optional :data:`ChannelResponseHook` invoked before + the channel serializes an originating reply into tool content. 
+ """ + self.path = path + self.response_hook = response_hook + self._tool_name = tool_name + self._tool_description = tool_description + self._server_name = server_name + self._server_version = server_version + self._streaming = streaming + self._json_response = json_response + self._stateless = stateless + self._hook = run_hook + self._ctx: ChannelContext | None = None + self._server: Server[Any, Any] | None = None + self._session_manager: StreamableHTTPSessionManager | None = None + self._run_cm: AbstractAsyncContextManager[None] | None = None + + def contribute(self, context: ChannelContext) -> ChannelContribution: + """Capture the host context and mount the Streamable-HTTP transport.""" + self._ctx = context + self._server = self._build_server() + self._session_manager = StreamableHTTPSessionManager( + app=self._server, + json_response=self._json_response, + stateless=self._stateless, + ) + # StreamableHTTPSessionManager owns MCP initialize/session/progress semantics; + # mounting it keeps the channel on the real MCP HTTP transport. 
+ return ChannelContribution( + routes=[Mount("/", app=self._handle_asgi)], + on_startup=[self._on_startup], + on_shutdown=[self._on_shutdown], + ) + + async def _handle_asgi(self, scope: Scope, receive: Receive, send: Send) -> None: + """ASGI entrypoint delegating to the MCP Streamable-HTTP session manager.""" + if self._session_manager is None: # pragma: no cover - guarded by lifecycle + raise RuntimeError("MCPChannel transport not initialized") + await self._session_manager.handle_request(scope, receive, send) + + async def _on_startup(self) -> None: + """Enter the session-manager task-group lifecycle on host startup.""" + if self._session_manager is None: # pragma: no cover - guarded by lifecycle + return + self._run_cm = self._session_manager.run() + await self._run_cm.__aenter__() + + async def _on_shutdown(self) -> None: + """Exit the session-manager task-group lifecycle on host shutdown.""" + if self._run_cm is not None: + await self._run_cm.__aexit__(None, None, None) + self._run_cm = None + + def _build_server(self) -> Server[Any, Any]: + """Build the low-level MCP server with the single host-routed tool.""" + target_name = getattr(self._ctx.target, "name", None) if self._ctx is not None else None + server_name = self._server_name or (target_name if isinstance(target_name, str) and target_name else None) + server: Server[Any, Any] = Server(name=server_name or "agent-framework-hosting", version=self._server_version) + tool = types.Tool( + name=self._tool_name, + description=self._tool_description, + inputSchema={ + "type": "object", + "properties": { + "input": { + "type": "string", + "description": "The request to send to the hosted agent or workflow.", + }, + "session_id": { + "type": "string", + "description": "Optional conversation id to continue a prior session.", + }, + }, + "required": ["input"], + }, + ) + + @server.list_tools() # type: ignore[no-untyped-call, untyped-decorator, misc] + async def _list_tools() -> list[types.Tool]: # noqa: RUF029 # 
pyright: ignore[reportUnusedFunction] + return [tool] + + @server.call_tool() # type: ignore[no-untyped-call, untyped-decorator, misc] + async def _call_tool(name: str, arguments: Mapping[str, Any]) -> types.CallToolResult: # pyright: ignore[reportUnusedFunction] + return await self._invoke_tool(arguments) + + return server + + async def _invoke_tool(self, arguments: Mapping[str, Any]) -> types.CallToolResult: + """Route a single ``tool/call`` through the host pipeline.""" + if self._ctx is None: # pragma: no cover - guarded by Channel lifecycle + raise RuntimeError("MCPChannel not initialized") + + text_input = arguments.get("input") + if not isinstance(text_input, str) or not text_input: + return types.CallToolResult( + content=[types.TextContent(type="text", text="Error: 'input' must be a non-empty string.")], + isError=True, + ) + session_id = arguments.get("session_id") + session = ChannelSession(isolation_key=session_id) if isinstance(session_id, str) and session_id else None + identity = ( + ChannelIdentity(channel=self.name, native_id=session_id) + if isinstance(session_id, str) and session_id + else None + ) + + channel_request = ChannelRequest( + channel=self.name, + operation="message.create", + input=text_input, + session=session, + stream=self._streaming, + identity=identity, + attributes={"tool_name": self._tool_name}, + ) + + if channel_request.stream: + result = await self._run_streaming(channel_request, protocol_request=dict(arguments)) + else: + result = await self._ctx.run( + channel_request, + run_hook=self._hook, + protocol_request=dict(arguments), + response_hook=self.response_hook, + channel_name=self.name, + ) + + return self._result_to_content(result) + + async def _run_streaming( + self, request: ChannelRequest, *, protocol_request: Mapping[str, Any] + ) -> HostedRunResult[Any]: + """Consume the target as a stream, forwarding progress, returning the full reply.""" + if self._ctx is None: # pragma: no cover - guarded by Channel lifecycle + 
raise RuntimeError("MCPChannel not initialized") + + progress_token, request_id = self._progress_context() + progress = 0.0 + stream = await self._ctx.run_stream( + request, + run_hook=self._hook, + protocol_request=protocol_request, + response_hook=self.response_hook, + channel_name=self.name, + ) + async for update in stream: + chunk = getattr(update, "text", None) + if not chunk: + continue + if progress_token is not None: + progress += 1.0 + try: + await self._send_progress(progress_token, progress, chunk, request_id) + except Exception: # pragma: no cover - progress is best-effort + logger.exception("MCPChannel progress notification failed") + return HostedRunResult(await stream.get_final_response()) + + def _progress_context(self) -> tuple[str | int | None, str | None]: + """Best-effort lookup of the active request's progress token + id.""" + if self._server is None: # pragma: no cover - guarded by lifecycle + return None, None + try: + ctx = self._server.request_context + except Exception: # pragma: no cover - no active request context + return None, None + token = ctx.meta.progressToken if ctx.meta is not None else None + request_id = str(ctx.request_id) + return token, request_id + + async def _send_progress( + self, + progress_token: str | int, + progress: float, + message: str, + request_id: str | None, + ) -> None: + """Send a single MCP progress notification for streamed text.""" + if self._server is None: # pragma: no cover - guarded by lifecycle + return + await self._server.request_context.session.send_progress_notification( + progress_token=progress_token, + progress=progress, + message=message, + related_request_id=request_id, + ) + + def _result_to_content(self, result: HostedRunResult[Any]) -> types.CallToolResult: + """Convert a host result into an MCP tool result.""" + response = result.result + content: list[types.ContentBlock] = [] + + messages = cast("Sequence[Any] | None", getattr(response, "messages", None)) + if messages: + for message 
in messages: + for item in cast("Sequence[Any]", getattr(message, "contents", None) or ()): + if isinstance(item, Content): + content.extend(_content_to_mcp(item)) + else: + content.append(types.TextContent(type="text", text=str(item))) + + get_outputs = getattr(response, "get_outputs", None) + if callable(get_outputs): + for output in cast("Sequence[Any]", get_outputs()): + content.extend(_value_to_mcp(output)) + + structured = _structured_content(getattr(response, "value", None)) + if not content: + text = getattr(response, "text", None) + if isinstance(text, str) and text: + content.append(types.TextContent(type="text", text=text)) + elif structured is not None: + content.append(types.TextContent(type="text", text=json.dumps(structured, indent=2))) + else: + content.append(types.TextContent(type="text", text="")) + + return types.CallToolResult(content=content, structuredContent=structured, isError=False) diff --git a/python/packages/hosting-mcp/pyproject.toml b/python/packages/hosting-mcp/pyproject.toml new file mode 100644 index 00000000000..39ac2ff27ab --- /dev/null +++ b/python/packages/hosting-mcp/pyproject.toml @@ -0,0 +1,102 @@ +[project] +name = "agent-framework-hosting-mcp" +description = "Model Context Protocol (MCP) tool channel for agent-framework-hosting." 
+authors = [{ name = "Microsoft", email = "af-support@microsoft.com"}] +readme = "README.md" +requires-python = ">=3.10" +version = "1.0.0a260424" +license-files = ["LICENSE"] +urls.homepage = "https://aka.ms/agent-framework" +urls.source = "https://github.com/microsoft/agent-framework/tree/main/python" +urls.release_notes = "https://github.com/microsoft/agent-framework/releases?q=tag%3Apython-1&expanded=true" +urls.issues = "https://github.com/microsoft/agent-framework/issues" +classifiers = [ + "License :: OSI Approved :: MIT License", + "Development Status :: 3 - Alpha", + "Intended Audience :: Developers", + "Programming Language :: Python :: 3", + "Programming Language :: Python :: 3.10", + "Programming Language :: Python :: 3.11", + "Programming Language :: Python :: 3.12", + "Programming Language :: Python :: 3.13", + "Programming Language :: Python :: 3.14", + "Typing :: Typed", +] +dependencies = [ + "agent-framework-core>=1.2.0,<2", + "agent-framework-hosting>=1.0.0a260424,<2", + "mcp>=1.12,<2", + "starlette>=0.37", +] + +[tool.uv] +prerelease = "if-necessary-or-explicit" +environments = [ + "sys_platform == 'darwin'", + "sys_platform == 'linux'", + "sys_platform == 'win32'" +] + +[tool.uv-dynamic-versioning] +fallback-version = "0.0.0" + +[tool.pytest.ini_options] +testpaths = 'tests' +addopts = "-ra -q -r fEX" +asyncio_mode = "auto" +asyncio_default_fixture_loop_scope = "function" +filterwarnings = [] +timeout = 120 +markers = [ + "integration: marks tests as integration tests that require external services", +] + +[tool.ruff] +extend = "../../pyproject.toml" + +[tool.coverage.run] +omit = [ + "**/__init__.py" +] + +[tool.pyright] +extends = "../../pyproject.toml" +include = ["agent_framework_hosting_mcp"] +exclude = ['tests'] + +[tool.mypy] +plugins = ['pydantic.mypy'] +strict = true +python_version = "3.10" +ignore_missing_imports = true +disallow_untyped_defs = true +no_implicit_optional = true +check_untyped_defs = true +warn_return_any = true 
+show_error_codes = true +warn_unused_ignores = false +disallow_incomplete_defs = true +disallow_untyped_decorators = true + +[tool.bandit] +targets = ["agent_framework_hosting_mcp"] +exclude_dirs = ["tests"] + +[tool.poe] +executor.type = "uv" +include = "../../shared_tasks.toml" + +[tool.poe.tasks.mypy] +help = "Run MyPy for this package." +cmd = "mypy --config-file $POE_ROOT/pyproject.toml agent_framework_hosting_mcp" + +[tool.poe.tasks.test] +help = "Run the default unit test suite for this package." +cmd = 'pytest -m "not integration" --cov=agent_framework_hosting_mcp --cov-report=term-missing:skip-covered tests' + +[build-system] +requires = ["flit-core >= 3.11,<4.0"] +build-backend = "flit_core.buildapi" + +[dependency-groups] +dev = [] diff --git a/python/packages/hosting-mcp/tests/hosting_mcp/test_channel.py b/python/packages/hosting-mcp/tests/hosting_mcp/test_channel.py new file mode 100644 index 00000000000..f7874c0315b --- /dev/null +++ b/python/packages/hosting-mcp/tests/hosting_mcp/test_channel.py @@ -0,0 +1,390 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""Unit tests for :class:`MCPChannel`.""" + +from __future__ import annotations + +import asyncio +from collections.abc import AsyncIterator, Awaitable, Sequence +from contextlib import asynccontextmanager +from dataclasses import dataclass, field +from typing import Any + +import mcp.types as types +import uvicorn +from agent_framework import AgentResponse, AgentResponseUpdate, Content, Message, ResponseStream +from agent_framework_hosting import AgentFrameworkHost, ChannelRequest, HostedRunResult +from mcp import ClientSession +from mcp.client.streamable_http import streamable_http_client +from mcp.shared.memory import create_connected_server_and_client_session +from starlette.types import ASGIApp + +from agent_framework_hosting_mcp import MCPChannel + +# --------------------------------------------------------------------------- # +# Fakes # +# --------------------------------------------------------------------------- # + + +@dataclass +class _FakeResp: + text: str + messages: list[Message] = field(default_factory=list) + value: Any | None = None + + +@dataclass +class _FakeUpdate: + text: str + contents: list[Content] = field(default_factory=list) + message_id: str | None = None + + +class _FakeStream: + def __init__(self, chunks: list[str], final: _FakeResp | None = None) -> None: + self._chunks = chunks + self._final = final or _FakeResp(text="".join(chunks)) + + def __aiter__(self) -> AsyncIterator[_FakeUpdate]: + async def _gen() -> AsyncIterator[_FakeUpdate]: + for c in self._chunks: + yield _FakeUpdate(text=c) + + return _gen() + + async def get_final_response(self) -> _FakeResp: + return self._final + + +@dataclass +class _FakeTarget: + name: str = "Assistant" + description: str = "A helpful assistant." 
+ + +class _FakeContext: + """Minimal stand-in for :class:`ChannelContext`.""" + + def __init__( + self, + *, + reply: str = "hello", + chunks: list[str] | None = None, + contents: list[Content] | None = None, + structured: Any | None = None, + ) -> None: + self.target = _FakeTarget() + self._reply = reply + self._chunks = chunks or [reply] + self._contents = contents or [Content.from_text(text=reply)] + self._structured = structured + self.requests: list[ChannelRequest] = [] + + async def run( + self, + request: ChannelRequest, + *, + run_hook: Any | None = None, + protocol_request: Any | None = None, + response_hook: Any | None = None, + channel_name: str | None = None, + ) -> HostedRunResult[Any]: + if run_hook is not None: + maybe_request = run_hook(request, target=self.target, protocol_request=protocol_request) + if isinstance(maybe_request, Awaitable): + request = await maybe_request + else: + request = maybe_request + self.requests.append(request) + message = Message(role="assistant", contents=self._contents) + result = HostedRunResult(_FakeResp(text=self._reply, messages=[message], value=self._structured)) + if response_hook is not None: + maybe_result = response_hook(result, request=request, channel_name=channel_name or request.channel) + if isinstance(maybe_result, Awaitable): + return await maybe_result + return maybe_result + return result + + async def run_stream( + self, + request: ChannelRequest, + *, + run_hook: Any | None = None, + protocol_request: Any | None = None, + stream_update_hook: Any | None = None, + response_hook: Any | None = None, + channel_name: str | None = None, + ) -> _FakeStream: + if run_hook is not None: + maybe_request = run_hook(request, target=self.target, protocol_request=protocol_request) + if isinstance(maybe_request, Awaitable): + request = await maybe_request + else: + request = maybe_request + self.requests.append(request) + result = HostedRunResult(_FakeResp(text="".join(self._chunks), value=self._structured)) + if 
response_hook is not None: + maybe_result = response_hook(result, request=request, channel_name=channel_name or request.channel) + if isinstance(maybe_result, Awaitable): + result = await maybe_result + else: + result = maybe_result + return _FakeStream(self._chunks, final=result.result) + + +def _make_channel(ctx: _FakeContext, **kwargs: Any) -> MCPChannel: + channel = MCPChannel(**kwargs) + channel.contribute(ctx) # type: ignore[arg-type] + return channel + + +class _HostedAgent: + name = "HostedAssistant" + description = "A hosted test assistant." + + async def run(self, messages: Any = None, *, stream: bool = False, **_kwargs: Any) -> Any: + text = messages.text if isinstance(messages, Message) else str(messages) + if stream: + updates = [AgentResponseUpdate(contents=[Content.from_text(text=f"host: {text}")], role="assistant")] + + async def _gen() -> AsyncIterator[AgentResponseUpdate]: + for update in updates: + yield update + + async def _finalize(items: Sequence[AgentResponseUpdate]) -> AgentResponse: # noqa: RUF029 + return AgentResponse.from_updates(items) + + return ResponseStream[AgentResponseUpdate, AgentResponse](_gen(), finalizer=_finalize) + return AgentResponse(messages=[Message(role="assistant", contents=[Content.from_text(text=f"host: {text}")])]) + + +@asynccontextmanager +async def _serve_app(app: ASGIApp, *, port: int) -> AsyncIterator[str]: + config = uvicorn.Config(app, host="127.0.0.1", port=port, log_level="warning", lifespan="on") + server = uvicorn.Server(config) + task = asyncio.create_task(server.serve()) + try: + for _ in range(100): + if server.started: + break + await asyncio.sleep(0.01) + else: + raise RuntimeError("Test MCP server did not start") + yield f"http://127.0.0.1:{port}" + finally: + server.should_exit = True + await task + + +# --------------------------------------------------------------------------- # +# Tests # +# --------------------------------------------------------------------------- # + + +async def 
test_list_tools_advertises_single_configured_tool() -> None: + ctx = _FakeContext() + channel = _make_channel(ctx, tool_name="ask", tool_description="Ask the assistant.") + async with create_connected_server_and_client_session(channel._server) as client: # type: ignore[arg-type] + result = await client.list_tools() + assert len(result.tools) == 1 + tool = result.tools[0] + assert tool.name == "ask" + assert tool.description == "Ask the assistant." + assert tool.inputSchema["required"] == ["input"] + assert set(tool.inputSchema["properties"]) == {"input", "session_id"} + + +async def test_initialize_uses_target_name_by_default() -> None: + ctx = _FakeContext() + channel = _make_channel(ctx) + async with create_connected_server_and_client_session(channel._server) as client: # type: ignore[arg-type] + result = await client.initialize() + assert result.serverInfo.name == "Assistant" + + +async def test_call_tool_routes_through_host_and_returns_text() -> None: + ctx = _FakeContext(reply="hi back", chunks=["hi", " back"]) + channel = _make_channel(ctx, streaming=False) + async with create_connected_server_and_client_session(channel._server) as client: # type: ignore[arg-type] + result = await client.call_tool("run_agent", {"input": "hello", "session_id": "conv-1"}) + assert not result.isError + assert isinstance(result.content[0], types.TextContent) + assert result.content[0].text == "hi back" + # The channel built a channel-neutral request routed through the host. 
+ assert len(ctx.requests) == 1 + request = ctx.requests[0] + assert request.channel == "mcp" + assert request.operation == "message.create" + assert request.input == "hello" + assert request.session is not None + assert request.session.isolation_key == "conv-1" + assert request.identity is not None + assert request.identity.native_id == "conv-1" + + +async def test_call_tool_returns_rich_content_and_structured_output() -> None: + ctx = _FakeContext( + contents=[ + Content.from_text(text="text"), + Content.from_data(data=b"image-bytes", media_type="image/png"), + Content.from_data(data=b"audio-bytes", media_type="audio/wav"), + Content.from_data(data=b"raw-bytes", media_type="application/octet-stream"), + Content.from_uri(uri="https://example.com/file.json", media_type="application/json"), + ], + structured={"answer": 42}, + ) + channel = _make_channel(ctx, streaming=False) + async with create_connected_server_and_client_session(channel._server) as client: # type: ignore[arg-type] + result = await client.call_tool("run_agent", {"input": "hello"}) + + assert result.structuredContent == {"answer": 42} + assert [item.type for item in result.content] == ["text", "image", "audio", "resource", "resource_link"] + assert result.content[0].text == "text" # type: ignore[union-attr] + + +async def test_call_tool_streaming_aggregates_chunks() -> None: + ctx = _FakeContext(chunks=["foo", "bar", "baz"]) + channel = _make_channel(ctx, streaming=True) + async with create_connected_server_and_client_session(channel._server) as client: # type: ignore[arg-type] + result = await client.call_tool("run_agent", {"input": "hello"}) + assert result.content[0].text == "foobarbaz" # type: ignore[union-attr] + # No session_id supplied -> no session / identity. 
+ assert ctx.requests[0].session is None + assert ctx.requests[0].identity is None + + +async def test_call_tool_rejects_empty_input() -> None: + ctx = _FakeContext() + channel = _make_channel(ctx) + async with create_connected_server_and_client_session(channel._server) as client: # type: ignore[arg-type] + result = await client.call_tool("run_agent", {"input": ""}) + assert result.isError + assert "non-empty string" in result.content[0].text # type: ignore[union-attr] + assert ctx.requests == [] + + +async def test_run_hook_can_reshape_request() -> None: + ctx = _FakeContext(reply="ok") + + async def _hook(request: ChannelRequest, *, target: Any, protocol_request: Any) -> ChannelRequest: + import dataclasses + + return dataclasses.replace(request, attributes={**dict(request.attributes), "hooked": True}) + + channel = _make_channel(ctx, streaming=False, run_hook=_hook) + async with create_connected_server_and_client_session(channel._server) as client: # type: ignore[arg-type] + await client.call_tool("run_agent", {"input": "hello"}) + assert ctx.requests[0].attributes.get("hooked") is True + + +async def test_response_hook_can_shape_originating_reply() -> None: + ctx = _FakeContext(reply="original") + + async def _hook( + result: HostedRunResult[Any], + *, + request: ChannelRequest, + channel_name: str, + ) -> HostedRunResult[Any]: + assert channel_name == "mcp" + assert request.channel == "mcp" + return HostedRunResult(_FakeResp(text="hooked")) + + channel = _make_channel(ctx, streaming=False, response_hook=_hook) + async with create_connected_server_and_client_session(channel._server) as client: # type: ignore[arg-type] + result = await client.call_tool("run_agent", {"input": "hello"}) + assert result.content[0].text == "hooked" # type: ignore[union-attr] + + +async def test_streaming_response_hook_shapes_final_reply() -> None: + ctx = _FakeContext(chunks=["raw"]) + + async def _hook( + result: HostedRunResult[Any], + *, + request: ChannelRequest, + channel_name: 
str, + ) -> HostedRunResult[Any]: + return HostedRunResult(_FakeResp(text=f"{channel_name}:{request.channel}:{result.result.text}")) + + channel = _make_channel(ctx, streaming=True, response_hook=_hook) + async with create_connected_server_and_client_session(channel._server) as client: # type: ignore[arg-type] + result = await client.call_tool("run_agent", {"input": "hello"}) + assert result.content[0].text == "mcp:mcp:raw" # type: ignore[union-attr] + + +def test_default_path_and_name() -> None: + channel = MCPChannel() + assert channel.name == "mcp" + assert channel.path == "/mcp" + + +def test_content_conversion_handles_non_text_shapes() -> None: + from agent_framework_hosting_mcp._channel import _content_to_mcp, _structured_content, _value_to_mcp + + @dataclass + class StructuredValue: + answer: int + + circular: list[Any] = [] + circular.append(circular) + + assert _structured_content(None) is None + assert _structured_content(StructuredValue(answer=42)) == {"answer": 42} + assert _structured_content(circular) == {"value": "[[...]]"} + assert _content_to_mcp(Content("data", uri="not-a-data-uri", media_type="application/octet-stream")) == [] + assert _content_to_mcp(Content("text_reasoning", text="because"))[0].text == "because" # type: ignore[union-attr] + assert ( + _content_to_mcp(Content.from_function_result("call-1", result=[Content.from_text("nested")]))[0].text + == "nested" + ) # type: ignore[union-attr] + assert _content_to_mcp(Content.from_function_result("call-1", result={"x": 1}))[0].text == '{"x": 1}' # type: ignore[union-attr] + assert _content_to_mcp(Content.from_error(message="bad"))[0].text == "bad" # type: ignore[union-attr] + assert _content_to_mcp(Content.from_function_call("call-1", "tool")) == [] + assert _value_to_mcp(Message(role="assistant", contents=[Content.from_text("message")]))[0].text == "message" # type: ignore[union-attr] + assert _value_to_mcp(b"bytes")[0].type == "resource" + assert _value_to_mcp({"x": 1})[0].text == '{"x": 
1}' # type: ignore[union-attr] + + +def test_result_conversion_handles_workflow_and_fallback_shapes() -> None: + class WorkflowResult: + value = None + + def get_outputs(self) -> list[Message]: + return [Message(role="assistant", contents=[Content.from_text("workflow")])] + + @dataclass + class TextOnlyResult: + text: str + value: Any | None = None + + channel = MCPChannel() + + workflow_result = channel._result_to_content(HostedRunResult(WorkflowResult())) + assert workflow_result.content[0].text == "workflow" # type: ignore[union-attr] + + text_result = channel._result_to_content(HostedRunResult(TextOnlyResult(text="fallback"))) + assert text_result.content[0].text == "fallback" # type: ignore[union-attr] + + structured_result = channel._result_to_content(HostedRunResult(TextOnlyResult(text="", value={"x": 1}))) + assert structured_result.structuredContent == {"x": 1} + assert structured_result.content[0].text == '{\n "x": 1\n}' # type: ignore[union-attr] + + empty_result = channel._result_to_content(HostedRunResult(TextOnlyResult(text=""))) + assert empty_result.content[0].text == "" # type: ignore[union-attr] + + +async def test_http_mcp_client_can_call_hosted_channel(unused_tcp_port: int) -> None: + host = AgentFrameworkHost(target=_HostedAgent(), channels=[MCPChannel(streaming=False)]) + + async with ( + _serve_app(host.app, port=unused_tcp_port) as base_url, + streamable_http_client(f"{base_url}/mcp/") as (read_stream, write_stream, _), + ClientSession(read_stream, write_stream) as session, + ): + await session.initialize() + tools = await session.list_tools() + result = await session.call_tool("run_agent", {"input": "hello", "session_id": "conv-1"}) + + assert [tool.name for tool in tools.tools] == ["run_agent"] + assert result.content[0].text == "host: hello" # type: ignore[union-attr] diff --git a/python/pyproject.toml b/python/pyproject.toml index f992269af14..c6191f58f04 100644 --- a/python/pyproject.toml +++ b/python/pyproject.toml @@ -93,6 +93,7 @@ 
agent-framework-hosting-telegram = { workspace = true } agent-framework-hosting-activity-protocol = { workspace = true } agent-framework-hosting-discord = { workspace = true } agent-framework-hosting-a2a = { workspace = true } +agent-framework-hosting-mcp = { workspace = true } agent-framework-hyperlight = { workspace = true } agent-framework-lab = { workspace = true } agent-framework-mem0 = { workspace = true } @@ -221,6 +222,7 @@ executionEnvironments = [ { root = "packages/hosting-telegram/tests", reportPrivateUsage = "none" }, { root = "packages/hosting-activity-protocol/tests", reportPrivateUsage = "none" }, { root = "packages/hosting-a2a/tests", reportPrivateUsage = "none" }, + { root = "packages/hosting-mcp/tests", reportPrivateUsage = "none" }, { root = "packages/lab/gaia/tests", reportPrivateUsage = "none" }, { root = "packages/lab/lightning/tests", reportPrivateUsage = "none" }, { root = "packages/lab/tau2/tests", reportPrivateUsage = "none" }, diff --git a/python/uv.lock b/python/uv.lock index 6f611cc2aa8..64166061c5f 100644 --- a/python/uv.lock +++ b/python/uv.lock @@ -55,6 +55,7 @@ members = [ "agent-framework-hosting-activity-protocol", "agent-framework-hosting-discord", "agent-framework-hosting-invocations", + "agent-framework-hosting-mcp", "agent-framework-hosting-responses", "agent-framework-hosting-telegram", "agent-framework-hyperlight", @@ -728,6 +729,28 @@ requires-dist = [ { name = "agent-framework-hosting", editable = "packages/hosting" }, ] +[[package]] +name = "agent-framework-hosting-mcp" +version = "1.0.0a260424" +source = { editable = "packages/hosting-mcp" } +dependencies = [ + { name = "agent-framework-core", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "agent-framework-hosting", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, + { name = "mcp", extra = ["ws"], marker = "sys_platform == 'darwin' or sys_platform == 'linux' or 
sys_platform == 'win32'" }, + { name = "starlette", marker = "sys_platform == 'darwin' or sys_platform == 'linux' or sys_platform == 'win32'" }, +] + +[package.metadata] +requires-dist = [ + { name = "agent-framework-core", editable = "packages/core" }, + { name = "agent-framework-hosting", editable = "packages/hosting" }, + { name = "mcp", specifier = ">=1.12,<2" }, + { name = "starlette", specifier = ">=0.37" }, +] + +[package.metadata.requires-dev] +dev = [] + [[package]] name = "agent-framework-hosting-responses" version = "1.0.0a260424" From 2bc01b353f40c8faf0f7ab744b72060a79400c86 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Wed, 17 Jun 2026 17:05:36 +0200 Subject: [PATCH 16/20] Cover workflow event stream mapping Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- python/packages/hosting/tests/test_host.py | 31 ++++++++++++++++++++++ 1 file changed, 31 insertions(+) diff --git a/python/packages/hosting/tests/test_host.py b/python/packages/hosting/tests/test_host.py index f6b2d9ebfa7..bb2b761d80c 100644 --- a/python/packages/hosting/tests/test_host.py +++ b/python/packages/hosting/tests/test_host.py @@ -10,6 +10,7 @@ import pytest from agent_framework import AgentResponse, AgentResponseUpdate, Content, Message, ResponseStream +from agent_framework._workflows._events import WorkflowEvent from starlette.requests import Request from starlette.responses import JSONResponse from starlette.routing import BaseRoute, Route @@ -25,6 +26,7 @@ ChannelSession, HostedRunResult, ) +from agent_framework_hosting._host import _workflow_event_to_update async def _ping(_request: Request) -> JSONResponse: @@ -439,6 +441,35 @@ async def test_stream_workflow_yields_one_update_per_output_event(self) -> None: final = await stream.get_final_response() assert final.text == "x-1x-2x-3" + def test_workflow_event_to_update_drops_non_output_events(self) -> None: + event = WorkflowEvent("intermediate", executor_id="worker", data="hidden") + + assert 
_workflow_event_to_update(event) is None + + def test_workflow_event_to_update_preserves_agent_response_update_payload(self) -> None: + event = WorkflowEvent( + "output", + executor_id="worker", + data=AgentResponseUpdate(contents=[Content.from_text("chunk")], role="assistant"), + ) + + update = _workflow_event_to_update(event) + + assert update is event.data + assert update.raw_representation is event + + def test_workflow_event_to_update_preserves_content_payload(self) -> None: + content = Content.from_data(data=b"\x89PNG", media_type="image/png", raw_representation={"source": "test"}) + event = WorkflowEvent("output", executor_id="worker", data=content) + + update = _workflow_event_to_update(event) + + assert update is not None + assert update.contents == [content] + assert update.contents[0].raw_representation == {"source": "test"} + assert update.author_name == "worker" + assert update.raw_representation is event + class TestHostWorkflowCheckpointing: """The host scopes per-conversation checkpoints when ``checkpoint_location`` is set.""" From 47ec0335ffbc0f95036968349934c951ed422971 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Wed, 17 Jun 2026 17:23:12 +0200 Subject: [PATCH 17/20] Resolve Python hosting rebase conflicts Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../_responses.py | 1401 ++++++++++++++--- .../tests/test_history_provider.py | 10 +- .../hosting/agent_framework_hosting/_host.py | 2 +- 3 files changed, 1163 insertions(+), 250 deletions(-) diff --git a/python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py index 567335d2fac..0e404e86984 100644 --- a/python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py +++ b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_responses.py @@ -3,15 +3,17 @@ from __future__ import annotations import asyncio +import base64 import json import 
logging import os import tempfile import threading -from collections.abc import AsyncIterable, AsyncIterator, Generator +from collections.abc import AsyncIterable, AsyncIterator, Generator, Mapping, Sequence from contextlib import AbstractAsyncContextManager, AsyncExitStack, suppress +from dataclasses import asdict, dataclass, is_dataclass from pathlib import Path -from typing import cast +from typing import Protocol, cast from agent_framework import ( ChatOptions, @@ -19,6 +21,7 @@ ContextProvider, FileCheckpointStorage, HistoryProvider, + Message, RawAgent, SupportsAgentRun, WorkflowAgent, @@ -29,10 +32,78 @@ ResponseEventStream, ResponseProviderProtocol, ResponsesServerOptions, - models, ) from azure.ai.agentserver.responses._id_generator import IdGenerator from azure.ai.agentserver.responses.hosting import ResponsesAgentServerHost +from azure.ai.agentserver.responses.models import ( + ApplyPatchToolCallItemParam, + ApplyPatchToolCallOutputItemParam, + ComputerCallOutputItemParam, + ComputerScreenshotContent, + CreateResponse, + FunctionCallOutputItemParam, + FunctionShellAction, + FunctionShellCallItemParam, + FunctionShellCallOutputContent, + FunctionShellCallOutputExitOutcome, + FunctionShellCallOutputItemParam, + Item, + ItemCodeInterpreterToolCall, + ItemComputerToolCall, + ItemCustomToolCall, + ItemCustomToolCallOutput, + ItemFileSearchToolCall, + ItemFunctionToolCall, + ItemImageGenToolCall, + ItemLocalShellToolCall, + ItemLocalShellToolCallOutput, + ItemMcpApprovalRequest, + ItemMcpToolCall, + ItemMessage, + ItemOutputMessage, + ItemReasoningItem, + ItemWebSearchToolCall, + LocalEnvironmentResource, + MCPApprovalResponse, + MessageContent, + MessageContentInputFileContent, + MessageContentInputImageContent, + MessageContentInputTextContent, + MessageContentOutputTextContent, + MessageContentReasoningTextContent, + MessageContentRefusalContent, + MessageRole, + OAuthConsentRequestOutputItem, + OutputItem, + OutputItemApplyPatchToolCall, + 
OutputItemApplyPatchToolCallOutput, + OutputItemCodeInterpreterToolCall, + OutputItemComputerToolCall, + OutputItemComputerToolCallOutputResource, + OutputItemCustomToolCall, + OutputItemCustomToolCallOutput, + OutputItemFileSearchToolCall, + OutputItemFunctionShellCall, + OutputItemFunctionShellCallOutput, + OutputItemFunctionToolCall, + OutputItemImageGenToolCall, + OutputItemLocalShellToolCall, + OutputItemLocalShellToolCallOutput, + OutputItemMcpApprovalRequest, + OutputItemMcpApprovalResponseResource, + OutputItemMcpToolCall, + OutputItemMessage, + OutputItemOutputMessage, + OutputItemReasoningItem, + OutputItemWebSearchToolCall, + OutputMessageContent, + OutputMessageContentOutputTextContent, + OutputMessageContentRefusalContent, + ResponseStreamEvent, + StructuredOutputsOutputItem, + SummaryTextContent, + TextContent, +) from azure.ai.agentserver.responses.streaming._builders import ( OutputItemFunctionCallBuilder, OutputItemMcpCallBuilder, @@ -44,43 +115,22 @@ from mcp import McpError from typing_extensions import Any -from ._shared import ( - ApprovalStorage, - _arguments_to_str, # pyright: ignore[reportPrivateUsage] - _convert_message_content, # pyright: ignore[reportPrivateUsage] - _convert_output_message_content, # pyright: ignore[reportPrivateUsage] - _item_to_message, # pyright: ignore[reportPrivateUsage] - _items_to_messages, # pyright: ignore[reportPrivateUsage] - _output_item_to_message, # pyright: ignore[reportPrivateUsage] - _output_items_to_messages, # pyright: ignore[reportPrivateUsage] -) +logger = logging.getLogger(__name__) -# Re-export the conversion helpers under their historical names so existing -# tests (which import them from this module) keep working — the canonical -# definitions now live in :mod:`._shared`. 
-__all__ = ( - "ApprovalStorage", - "_arguments_to_str", - "_convert_message_content", - "_convert_output_message_content", - "_item_to_message", - "_items_to_messages", - "_output_item_to_message", - "_output_items_to_messages", -) +_AZURE_RESPONSES_MESSAGE_ROLE_TYPE = f"{MessageRole.__module__}:{MessageRole.__qualname__}" -# Local aliases for the agent-server SDK types this module touches at the -# Python type-annotation layer. Using ``models.X`` everywhere would work but -# would noisily clutter type-only positions where the alias adds no value. -CreateResponse = models.CreateResponse -ResponseStreamEvent = models.ResponseStreamEvent -FunctionShellAction = models.FunctionShellAction -FunctionShellCallOutputContent = models.FunctionShellCallOutputContent -FunctionShellCallOutputExitOutcome = models.FunctionShellCallOutputExitOutcome -LocalEnvironmentResource = models.LocalEnvironmentResource -OAuthConsentRequestOutputItem = models.OAuthConsentRequestOutputItem -logger = logging.getLogger(__name__) +# region Approval Storage +class ApprovalStorage(Protocol): + """Storage for saving function approval requests.""" + + async def save_approval_request(self, approval_request_id: str, request: Content) -> None: + """Save a function approval request under the given ID.""" + ... + + async def load_approval_request(self, approval_request_id: str) -> Content: + """Load a function approval request by its ID.""" + ... class InMemoryFunctionApprovalStorage: @@ -202,35 +252,85 @@ def _checkpoint_storage_for_context(root: str, context_id: str) -> FileCheckpoin storage_path = (root_path / context_id).resolve() if not storage_path.is_relative_to(root_path): raise RuntimeError(f"Invalid checkpoint context id: {context_id!r}") - return FileCheckpointStorage(storage_path) + return FileCheckpointStorage( + storage_path, + # Keep this provider-specific allowlist narrow. Hosted workflow + # checkpoints can persist Azure's role enum inside Message objects. 
+ allowed_checkpoint_types=[_AZURE_RESPONSES_MESSAGE_ROLE_TYPE], + ) # endregion Approval Storage # Foundry Toolbox Auth integration # Consent-URL error code returned by the Foundry MCP gateway when calling `/list` -CONSENT_ERROR_CODE = -32007 +CONSENT_ERROR_CODE = -32006 + +@dataclass +class ConsentError: + name: str + consent_url: str -def consent_url_from_error(exc: BaseException) -> str | None: - """Return the consent URL when ``exc`` wraps a Foundry MCP gateway consent error. - The Agent Framework MCP layer surfaces gateway consent failures by wrapping the underlying - ``McpError`` inside an :class:`AgentFrameworkException` (typically a ``ToolExecutionException`` - raised from ``MCPStreamableHTTPTool.__aenter__``). This helper inspects ``exc.args`` for a - wrapped ``McpError`` whose ``error.code`` is :data:`CONSENT_ERROR_CODE`; when found, the - consent link the gateway returned in ``error.message`` is returned. Returns ``None`` for - anything else, so callers can do ``if (url := consent_url_from_error(ex)) is None: raise``. +def consent_url_from_error(exc: BaseException) -> list[ConsentError] | None: + """Return the consent URLs when ``exc`` wraps Foundry MCP gateway consent errors. Args: exc: The exception to inspect. Returns: - The consent URL if ``exc`` wraps a consent ``McpError``, otherwise ``None``. + The consent URL(s) extracted from the error, or ``None`` if no consent error was found. """ inner_exception = next((arg for arg in exc.args if isinstance(arg, McpError)), None) if inner_exception is not None and inner_exception.error.code == CONSENT_ERROR_CODE: - return inner_exception.error.message + # Parse the error message + # The error message is structured with the following format: + # "tools/list failed for 1 tool source(s), succeeded for 0 tool source(s) {"errors":[{"name": ..." 
+ # where the second part is a JSON string that can be deserialized into an object with the following shape: + # ruff: disable[ERA001] + # { + # "errors" : [ + # { + # "name": "Name of the MCP tool that requires consent", + # "type" : "mcp", + # "error": { + # "code": "CONSENT_REQUIRED", + # "message": consent_url, + # } + # } + # ] + # } + # ruff: enable[ERA001] + try: + consent_errors: list[ConsentError] = [] + error_message_start = inner_exception.error.message.find("{") + if error_message_start == -1: + logger.warning("Consent error message does not contain JSON: %s", inner_exception.error.message) + return None + consent_details_json = inner_exception.error.message[error_message_start:] + consent_details = json.loads(consent_details_json) + if "errors" not in consent_details or not isinstance(consent_details["errors"], list): + logger.warning("Consent error message JSON does not contain 'errors' list: %s", consent_details_json) + return None + for error in consent_details["errors"]: + if ( + isinstance(error, dict) + and error.get("type") == "mcp" # type: ignore + and "error" in error + and isinstance(error["error"], dict) + and error["error"].get("code") == "CONSENT_REQUIRED" # type: ignore + and "message" in error["error"] + ): + consent_url = error["error"]["message"] # type: ignore + if isinstance(consent_url, str): + consent_errors.append(ConsentError(name=error.get("name", "Unknown"), consent_url=consent_url)) # type: ignore + else: + logger.warning("Consent URL in error message is not a valid URL: %s", consent_url) # type: ignore + if consent_errors: + return consent_errors + except json.JSONDecodeError: + logger.warning("Failed to parse consent details JSON: %s", inner_exception.error.message) return None @@ -361,71 +461,70 @@ async def _handle_inner_agent( context: ResponseContext, ) -> AsyncIterable[ResponseStreamEvent | dict[str, Any]]: """Handle the creation of a response for a regular (non-workflow) agent.""" - input_items = await 
context.get_input_items() - input_messages = await _items_to_messages(input_items, approval_storage=self._approval_storage) - - history = await context.get_history() - run_kwargs: dict[str, Any] = { - "messages": [ - *(await _output_items_to_messages(history, approval_storage=self._approval_storage)), - *input_messages, - ] - } - is_streaming_request = request.stream is not None and request.stream is True - - chat_options, are_options_set = _to_chat_options(request) - response_event_stream = ResponseEventStream(response_id=context.response_id, model=request.model) - yield response_event_stream.emit_created() yield response_event_stream.emit_in_progress() - if are_options_set and not isinstance(self._agent, RawAgent): - logger.warning("Agent doesn't support runtime options. They will be ignored.") - else: - run_kwargs["options"] = chat_options - - # Lazy-enter the agent (and any MCP tools it owns). The MCP client wraps gateway - # consent failures (and other connection-time errors) in AgentFrameworkException; if - # one of those is a consent error we surface the consent link to the client through - # the already-opened response stream instead of crashing the request. Other exception - # types propagate normally so the host can handle / log them. - try: - await self._ensure_agent_ready() - except AgentFrameworkException as ex: - consent_url = consent_url_from_error(ex) - if consent_url is None: - raise - logger.warning("OAuth consent required for Foundry MCP gateway.") - oauth_item = OAuthConsentRequestOutputItem( - id=IdGenerator.new_id("oacr"), - consent_link=consent_url, - server_label="Foundry Toolbox", - ) - builder = response_event_stream.add_output_item(oauth_item.id) - yield builder.emit_added(oauth_item) - yield builder.emit_done(oauth_item) - yield response_event_stream.emit_completed() - return - # Track the current active output item builder for streaming; # lazily created on matching content, closed when a different type arrives. 
- tracker: _OutputItemTracker | None = _OutputItemTracker(response_event_stream) if is_streaming_request else None + tracker: _OutputItemTracker | None = None try: + input_items = await context.get_input_items() + input_messages = await _items_to_messages(input_items, approval_storage=self._approval_storage) + + history = await context.get_history() + run_kwargs: dict[str, Any] = { + "messages": [ + *(await _output_items_to_messages(history, approval_storage=self._approval_storage)), + *input_messages, + ] + } + is_streaming_request = request.stream is not None and request.stream is True + + chat_options, are_options_set = _to_chat_options(request) + + if are_options_set and not isinstance(self._agent, RawAgent): + logger.warning("Agent doesn't support runtime options. They will be ignored.") + else: + run_kwargs["options"] = chat_options + + # Lazy-enter the agent (and any MCP tools it owns). The MCP client wraps gateway + # consent failures (and other connection-time errors) in AgentFrameworkException; if + # one of those is a consent error we surface the consent link to the client through + # the already-opened response stream instead of failing the request. Other exception + # types fall through to the outer handler below and become ``response.failed``. 
+ try: + await self._ensure_agent_ready() + except AgentFrameworkException as ex: + consent_errors = consent_url_from_error(ex) + if consent_errors is None: + raise + for consent_error in consent_errors: + logger.warning("Consent URL for tool '%s': %s", consent_error.name, consent_error.consent_url) + oauth_item = OAuthConsentRequestOutputItem( + id=IdGenerator.new_id("oacr"), + consent_link=consent_error.consent_url, + server_label=consent_error.name, + ) + builder = response_event_stream.add_output_item(oauth_item.id) + yield builder.emit_added(oauth_item) + yield builder.emit_done(oauth_item) + yield response_event_stream.emit_completed() + return + + tracker = _OutputItemTracker(response_event_stream) if is_streaming_request else None + if not is_streaming_request: # Run the agent in non-streaming mode response = await self._agent.run(stream=False, **run_kwargs) # type: ignore[reportUnknownMemberType] - for message in response.messages: - for content in message.contents: - async for item in _to_outputs( - response_event_stream, - content, - approval_storage=self._approval_storage, - ): - yield item - yield response_event_stream.emit_completed() + async for item in _to_outputs_for_messages( + response_event_stream, + response.messages, + approval_storage=self._approval_storage, + ): + yield item else: if tracker is None: # pragma: no cover - defensive, set above raise RuntimeError("Streaming tracker was not initialized.") @@ -446,160 +545,158 @@ async def _handle_inner_agent( # Close any remaining active builder for event in tracker.close(): yield event - yield response_event_stream.emit_completed() - except Exception: - # Drain any in-progress streaming builder before emitting consent - # so the resulting stream stays well-formed. 
- if tracker is not None: - for event in tracker.close(): - yield event - yield response_event_stream.emit_completed() - raise + yield response_event_stream.emit_completed() + except Exception as ex: + logger.exception("Failed to produce response for agent") + for event in self._emit_failure(response_event_stream, tracker, ex): + yield event async def _handle_inner_workflow( self, request: CreateResponse, context: ResponseContext, ) -> AsyncIterable[ResponseStreamEvent | dict[str, Any]]: - """Handle the creation of a response for a workflow agent. - - Why this is required: - The sandbox may be deactivated after some period of inactivity, and only data managed - by the hosting infrastructure or files will be preserved upon deactivation. - """ - input_items = await context.get_input_items() - input_messages = await _items_to_messages(input_items, approval_storage=self._approval_storage) - is_streaming_request = request.stream is not None and request.stream is True - - _, are_options_set = _to_chat_options(request) - if are_options_set: - logger.warning("Workflow agent doesn't support runtime options. They will be ignored.") - - if request.previous_response_id is not None and context.conversation_id is not None: - raise RuntimeError("Previous response ID cannot be used in conjunction with conversation ID.") - context_id = request.previous_response_id or context.conversation_id - - # The following should never happen due to the checks above. - # This is for type safety and defensive programming. - if self._checkpoint_storage_path is None: - raise RuntimeError("Checkpoint storage path is not configured for workflow agent.") - if not isinstance(self._agent, WorkflowAgent): - raise RuntimeError("Agent is not a workflow agent.") - - # Workflow agents are not async context managers in any built-in path, - # but call _ensure_agent_ready for symmetry with the regular path so - # any future async resources owned by the workflow are entered here. 
- await self._ensure_agent_ready() - - # Determine the latest checkpoint (if any) so we can resume the - # workflow's prior state for this turn. The directory is keyed by - # the inbound context id (conversation_id when set, otherwise - # previous_response_id). Multi-turn declarative workflows need the - # workflow's internal state (e.g. Conversation.messages, - # intermediate Local.* variables) to survive across user turns; - # the only place that state lives is the workflow checkpoint, so - # on every turn we restore the latest checkpoint and feed the new - # input back into the start executor as a continuation rather than - # a fresh run. - latest_checkpoint_id: str | None = None - restore_storage: FileCheckpointStorage | None = None - if context_id is not None: - restore_storage = _checkpoint_storage_for_context(self._checkpoint_storage_path, context_id) - latest_checkpoint = await restore_storage.get_latest(workflow_name=self._agent.workflow.name) - if latest_checkpoint is not None: - latest_checkpoint_id = latest_checkpoint.checkpoint_id - - # Storage that will receive checkpoints written during this turn. - # When the caller chains with previous_response_id, the next turn - # will reference the current response_id as its previous_response_id, - # so new checkpoints must land under the current response_id (or the - # conversation_id when set). When conversation_id is set, this - # matches restore_storage; when only previous_response_id was - # supplied, restore_storage points at the *prior* response's - # directory and checkpoint_storage points at the *current* response's. - write_context_id = context.conversation_id or context.response_id - checkpoint_storage = _checkpoint_storage_for_context(self._checkpoint_storage_path, write_context_id) - - # Multi-turn pattern: when we have a prior checkpoint, restore it - # first (drive the workflow back to idle with prior state intact), - # then make a separate call that delivers the new user input. 
This - # depends on Workflow.run preserving shared state across calls. The - # restore-only call may yield events from any pending in-flight - # work in the checkpoint; we consume those internally here so they - # don't surface to the response stream as duplicates. - # - # If the restored checkpoint had pending request_info events, the - # restore-only call replays them through - # ``WorkflowAgent._convert_workflow_event_to_agent_response_updates`` - # and populates ``self._agent.pending_requests``. That is the correct - # state: those requests are genuinely outstanding, and the next - # ``run(input_messages, ...)`` call may contain ``function_call_output`` - # items (carried as FunctionResult/FunctionApprovalResponse content) - # that fulfill them via :meth:`WorkflowAgent._process_pending_requests`. - if latest_checkpoint_id is not None: - if restore_storage is None: # pragma: no cover - defensive - raise RuntimeError("Checkpoint restore storage is not configured.") - if is_streaming_request: - async for _ in self._agent.run( - stream=True, - checkpoint_id=latest_checkpoint_id, - checkpoint_storage=restore_storage, - ): - pass - else: - await self._agent.run( - stream=False, - checkpoint_id=latest_checkpoint_id, - checkpoint_storage=restore_storage, - ) - - # Now run the agent with the latest input + """Handle the creation of a response for a workflow agent.""" response_event_stream = ResponseEventStream(response_id=context.response_id, model=request.model) - yield response_event_stream.emit_created() yield response_event_stream.emit_in_progress() - if not is_streaming_request: - # Run the agent in non-streaming mode - response = await self._agent.run(input_messages, stream=False, checkpoint_storage=checkpoint_storage) + # Track the current active output item builder for streaming; + # lazily created on matching content, closed when a different type arrives. 
+ tracker: _OutputItemTracker | None = None + + try: + input_items = await context.get_input_items() + input_messages = await _items_to_messages(input_items, approval_storage=self._approval_storage) + is_streaming_request = request.stream is not None and request.stream is True + + _, are_options_set = _to_chat_options(request) + if are_options_set: + logger.warning("Workflow agent doesn't support runtime options. They will be ignored.") + + if request.previous_response_id is not None and context.conversation_id is not None: + raise RuntimeError("Previous response ID cannot be used in conjunction with conversation ID.") + context_id = request.previous_response_id or context.conversation_id + + # The following should never happen due to the checks above. + # This is for type safety and defensive programming. + if self._checkpoint_storage_path is None: + raise RuntimeError("Checkpoint storage path is not configured for workflow agent.") + if not isinstance(self._agent, WorkflowAgent): + raise RuntimeError("Agent is not a workflow agent.") + + # Workflow agents are not async context managers in any built-in path, + # but call _ensure_agent_ready for symmetry with the regular path so + # any future async resources owned by the workflow are entered here. + await self._ensure_agent_ready() - for message in response.messages: - for content in message.contents: - async for item in _to_outputs( - response_event_stream, - content, - approval_storage=self._approval_storage, + # Determine the latest checkpoint (if any) so we can resume the + # workflow's prior state for this turn. The directory is keyed by + # the inbound context id (conversation_id when set, otherwise + # previous_response_id). Multi-turn declarative workflows need the + # workflow's internal state (e.g. 
Conversation.messages, + # intermediate Local.* variables) to survive across user turns; + # the only place that state lives is the workflow checkpoint, so + # on every turn we restore the latest checkpoint and feed the new + # input back into the start executor as a continuation rather than + # a fresh run. + latest_checkpoint_id: str | None = None + restore_storage: FileCheckpointStorage | None = None + if context_id is not None: + restore_storage = _checkpoint_storage_for_context(self._checkpoint_storage_path, context_id) + latest_checkpoint = await restore_storage.get_latest(workflow_name=self._agent.workflow.name) + if latest_checkpoint is not None: + latest_checkpoint_id = latest_checkpoint.checkpoint_id + + # Storage that will receive checkpoints written during this turn. + # When the caller chains with previous_response_id, the next turn + # will reference the current response_id as its previous_response_id, + # so new checkpoints must land under the current response_id (or the + # conversation_id when set). When conversation_id is set, this + # matches restore_storage; when only previous_response_id was + # supplied, restore_storage points at the *prior* response's + # directory and write_storage points at the *current* response's. + write_context_id = context.conversation_id or context.response_id + write_storage = _checkpoint_storage_for_context(self._checkpoint_storage_path, write_context_id) + + # Multi-turn pattern: when we have a prior checkpoint, restore it + # first (drive the workflow back to idle with prior state intact), + # then make a separate call that delivers the new user input. This + # depends on Workflow.run preserving shared state across calls. The + # restore-only call may yield events from any pending in-flight + # work in the checkpoint; we consume those internally here so they + # don't surface to the response stream as duplicates. 
+ # + # If the restored checkpoint had pending request_info events, the + # restore-only call replays them through + # ``WorkflowAgent._convert_workflow_event_to_agent_response_updates`` + # and populates ``self._agent.pending_requests``. That is the correct + # state: those requests are genuinely outstanding, and the next + # ``run(input_messages, ...)`` call may contain ``function_call_output`` + # items (carried as FunctionResult/FunctionApprovalResponse content) + # that fulfill them via :meth:`WorkflowAgent._process_pending_requests`. + if latest_checkpoint_id is not None: + if is_streaming_request: + async for _ in self._agent.run( + stream=True, + checkpoint_id=latest_checkpoint_id, + checkpoint_storage=restore_storage, ): - yield item + pass + else: + await self._agent.run( + stream=False, + checkpoint_id=latest_checkpoint_id, + checkpoint_storage=restore_storage, + ) - await self._delete_not_latest_checkpoints(checkpoint_storage, self._agent.workflow.name) - yield response_event_stream.emit_completed() - return + if not is_streaming_request: + # Run the agent in non-streaming mode with the new user input. + response = await self._agent.run( + input_messages, + stream=False, + checkpoint_storage=write_storage, + ) - # Track the current active output item builder for streaming; - # lazily created on matching content, closed when a different type arrives. 
- tracker = _OutputItemTracker(response_event_stream) + async for item in _to_outputs_for_messages( + response_event_stream, + response.messages, + approval_storage=self._approval_storage, + ): + yield item - # Run the workflow agent in streaming mode - async for update in self._agent.run(input_messages, stream=True, checkpoint_storage=checkpoint_storage): - for content in update.contents: - for event in tracker.handle(content): - yield event - if tracker.needs_async: - async for item in _to_outputs( - response_event_stream, - content, - approval_storage=self._approval_storage, - ): - yield item - tracker.needs_async = False + await self._delete_not_latest_checkpoints(write_storage, self._agent.workflow.name) + yield response_event_stream.emit_completed() + return - # Close any remaining active builder - for event in tracker.close(): - yield event + tracker = _OutputItemTracker(response_event_stream) + + # Run the workflow agent in streaming mode with the new user input. + async for update in self._agent.run( + input_messages, + stream=True, + checkpoint_storage=write_storage, + ): + for content in update.contents: + for event in tracker.handle(content): + yield event + if tracker.needs_async: + async for item in _to_outputs( + response_event_stream, content, approval_storage=self._approval_storage + ): + yield item + tracker.needs_async = False + + # Close any remaining active builder + for event in tracker.close(): + yield event - await self._delete_not_latest_checkpoints(checkpoint_storage, self._agent.workflow.name) - yield response_event_stream.emit_completed() + await self._delete_not_latest_checkpoints(write_storage, self._agent.workflow.name) + yield response_event_stream.emit_completed() + except Exception as ex: + logger.exception("Failed to produce response for workflow agent") + for event in self._emit_failure(response_event_stream, tracker, ex): + yield event @staticmethod async def _delete_not_latest_checkpoints(checkpoint_storage: 
FileCheckpointStorage, workflow_name: str) -> None: @@ -614,6 +711,29 @@ async def _delete_not_latest_checkpoints(checkpoint_storage: FileCheckpointStora if checkpoint.checkpoint_id != latest_checkpoint.checkpoint_id: await checkpoint_storage.delete(checkpoint.checkpoint_id) + @staticmethod + def _emit_failure( + response_event_stream: ResponseEventStream, + tracker: _OutputItemTracker | None, + ex: BaseException, + ) -> Generator[ResponseStreamEvent]: + """Yield a terminal ``response.failed`` event for ``ex``. + + Drains any in-progress streaming output item first so the resulting + SSE stream stays well-formed, then emits ``response.failed`` carrying + the exception's message (falling back to the exception type name when + ``str(ex)`` is empty). Any error raised while draining the tracker is + logged and otherwise ignored so that the original failure is always + what the client sees. + """ + if tracker is not None: + try: + yield from tracker.close() + except Exception: + logger.exception("Error while closing streaming tracker after failure") + message = str(ex) or type(ex).__name__ + yield response_event_stream.emit_failed(message=message) + # endregion ResponsesHostServer @@ -676,7 +796,7 @@ def handle(self, content: Content) -> Generator[ResponseStreamEvent]: yield self._fc_builder.emit_arguments_delta(args_str) elif content.type == "mcp_server_tool_call" and content.tool_name: - key = f"{content.server_name or 'default'}::{content.tool_name}" + key = content.call_id or f"{content.server_name or 'default'}::{content.tool_name}" if self._active_type != "mcp_server_tool_call" or self._active_id != key: yield from self._close() yield from self._open_mcp_call(content) @@ -685,6 +805,24 @@ def handle(self, content: Content) -> Generator[ResponseStreamEvent]: if self._mcp_builder is not None: yield self._mcp_builder.emit_arguments_delta(args_str) + elif ( + content.type == "mcp_server_tool_result" + and self._active_type == "mcp_server_tool_call" + and 
self._mcp_builder is not None + and content.call_id is not None + and content.call_id == self._mcp_builder.item_id + ): + accumulated = "".join(self._accumulated) + yield self._mcp_builder.emit_arguments_done(accumulated) + yield self._mcp_builder.emit_completed() + yield self._mcp_builder.emit_done(output=_stringify_mcp_output(content.output)) + self._mcp_builder = None + self._active_type = None + self._active_id = None + self._accumulated.clear() + self.needs_async = False + return + else: yield from self._close() self.needs_async = True @@ -724,9 +862,10 @@ def _open_mcp_call(self, content: Content) -> Generator[ResponseStreamEvent]: self._mcp_builder = self._stream.add_output_item_mcp_call( server_label=content.server_name or "default", name=content.tool_name or "", + item_id=content.call_id, ) self._active_type = "mcp_server_tool_call" - self._active_id = f"{content.server_name or 'default'}::{content.tool_name}" + self._active_id = content.call_id or f"{content.server_name or 'default'}::{content.tool_name}" yield self._mcp_builder.emit_added() def _close(self) -> Generator[ResponseStreamEvent]: @@ -762,9 +901,6 @@ def _close(self) -> Generator[ResponseStreamEvent]: self._accumulated.clear() -# endregion - - # region Option Conversion @@ -800,6 +936,695 @@ def _to_chat_options(request: CreateResponse) -> tuple[ChatOptions, bool]: # endregion +# region Input Message Conversion + + +async def _items_to_messages( + input_items: Sequence[Item], *, approval_storage: ApprovalStorage | None = None +) -> list[Message]: + """Converts a sequence of input items to a list of Messages, one per item. + + Args: + input_items: The input items to convert. + approval_storage: An optional ApprovalStorage instance used to look up + approval requests when converting MCP approval response items. + + Returns: + A list of Messages, one per supported input item. 
+ """ + messages: list[Message] = [] + for item in input_items: + messages.append(await _item_to_message(item, approval_storage=approval_storage)) + return messages + + +async def _item_to_message(item: Item, *, approval_storage: ApprovalStorage | None = None) -> Message: + """Converts an Item to a Message. + + Args: + item: The Item to convert. + approval_storage: An optional ApprovalStorage instance used to look up + approval requests when converting MCP approval response items. + + Returns: + The converted Message. + + Raises: + ValueError: If the Item type is not supported. + """ + if item.type == "message": + msg = cast(ItemMessage, item) + if isinstance(msg.content, str): + return Message(role=msg.role, contents=[Content.from_text(msg.content)]) + return Message(role=msg.role, contents=[_convert_message_content(part) for part in msg.content]) + + if item.type == "output_message": + output_msg = cast(ItemOutputMessage, item) + return Message( + role=output_msg.role, contents=[_convert_output_message_content(part) for part in output_msg.content] + ) + + if item.type == "function_call": + fc = cast(ItemFunctionToolCall, item) + return Message( + role="assistant", + contents=[Content.from_function_call(fc.call_id, fc.name, arguments=fc.arguments)], + ) + + if item.type == "function_call_output": + fco = cast(FunctionCallOutputItemParam, item) + output = fco.output if isinstance(fco.output, str) else str(fco.output) + return Message( + role="tool", + contents=[Content.from_function_result(fco.call_id, result=output)], + ) + + if item.type == "reasoning": + reasoning = cast(ItemReasoningItem, item) + reason_contents: list[Content] = [] + if reasoning.summary: + for summary in reasoning.summary: + reason_contents.append(Content.from_text(summary.text)) + return Message(role="assistant", contents=reason_contents) + + if item.type == "mcp_call": + mcp = cast(ItemMcpToolCall, item) + contents = [ + Content.from_mcp_server_tool_call( + mcp.id, + mcp.name, + 
server_name=mcp.server_label, + arguments=mcp.arguments, + ) + ] + if getattr(mcp, "output", None) is not None: + contents.append(Content.from_mcp_server_tool_result(call_id=mcp.id, output=mcp.output)) + return Message( + role="assistant", + contents=contents, + ) + + if item.type == "mcp_approval_request": + mcp_req = cast(ItemMcpApprovalRequest, item) + if approval_storage is not None: + function_approval_request_content = await approval_storage.load_approval_request(mcp_req.id) + else: + raise ValueError("ApprovalStorage is required to load approval request.") + return Message( + role="assistant", + contents=[function_approval_request_content], + ) + + if item.type == "mcp_approval_response": + mcp_resp = cast(MCPApprovalResponse, item) + if approval_storage is not None: + function_approval_request_content = await approval_storage.load_approval_request( + mcp_resp.approval_request_id + ) + else: + raise ValueError("ApprovalStorage is required to load approval request.") + return Message( + role="user", + contents=[function_approval_request_content.to_function_approval_response(mcp_resp.approve)], + ) + + if item.type == "code_interpreter_call": + ci = cast(ItemCodeInterpreterToolCall, item) + return Message( + role="assistant", + contents=[Content.from_code_interpreter_tool_call(call_id=ci.id)], + ) + + if item.type == "image_generation_call": + ig = cast(ItemImageGenToolCall, item) + return Message( + role="assistant", + contents=[Content.from_image_generation_tool_call(image_id=ig.id)], + ) + + if item.type == "shell_call": + sc = cast(FunctionShellCallItemParam, item) + return Message( + role="assistant", + contents=[ + Content.from_shell_tool_call( + call_id=sc.call_id, + commands=sc.action.commands, + status=str(sc.status), + ) + ], + ) + + if item.type == "shell_call_output": + sco = cast(FunctionShellCallOutputItemParam, item) + outputs = [ + Content.from_shell_command_output( + stdout=out.stdout or "", + stderr=out.stderr or "", + 
exit_code=getattr(out.outcome, "exit_code", None) if hasattr(out, "outcome") else None, + ) + for out in (sco.output or []) + ] + return Message( + role="tool", + contents=[ + Content.from_shell_tool_result( + call_id=sco.call_id, + outputs=outputs, + max_output_length=sco.max_output_length, + ) + ], + ) + + if item.type == "local_shell_call": + lsc = cast(ItemLocalShellToolCall, item) + commands = lsc.action.command if hasattr(lsc.action, "command") and lsc.action.command else [] + return Message( + role="assistant", + contents=[ + Content.from_shell_tool_call( + call_id=lsc.call_id, + commands=commands, + status=str(lsc.status), + ) + ], + ) + + if item.type == "local_shell_call_output": + lsco = cast(ItemLocalShellToolCallOutput, item) + return Message( + role="tool", + contents=[ + Content.from_shell_tool_result( + call_id=lsco.id, + outputs=[Content.from_shell_command_output(stdout=lsco.output)], + ) + ], + ) + + if item.type == "file_search_call": + fs = cast(ItemFileSearchToolCall, item) + return Message( + role="assistant", + contents=[ + Content.from_function_call( + fs.id, + "file_search", + arguments=json.dumps({"queries": fs.queries}), + ) + ], + ) + + if item.type == "web_search_call": + ws = cast(ItemWebSearchToolCall, item) + return Message( + role="assistant", + contents=[Content.from_function_call(ws.id, "web_search")], + ) + + if item.type == "computer_call": + cc = cast(ItemComputerToolCall, item) + return Message( + role="assistant", + contents=[ + Content.from_function_call( + cc.call_id, + "computer_use", + arguments=str(cc.action), + ) + ], + ) + + if item.type == "computer_call_output": + cco = cast(ComputerCallOutputItemParam, item) + return Message( + role="tool", + contents=[Content.from_function_result(cco.call_id, result=str(cco.output))], + ) + + if item.type == "custom_tool_call": + ct = cast(ItemCustomToolCall, item) + return Message( + role="assistant", + contents=[Content.from_function_call(ct.call_id, ct.name, 
arguments=ct.input)], + ) + + if item.type == "custom_tool_call_output": + cto = cast(ItemCustomToolCallOutput, item) + output = cto.output if isinstance(cto.output, str) else str(cto.output) + # Hosted-MCP results land here because the host writes them via + # `aoutput_item_custom_tool_call_output` (see `_to_outputs` for + # `mcp_server_tool_result`). The persisted `call_id` keeps its + # `mcp_*` prefix; on read, route those back to a hosted-MCP result + # Content so the chat-client serialize layer can coalesce them + # onto a single `mcp_call` input item with `output` populated. + # Issue #5546. + if cto.call_id and cto.call_id.startswith("mcp_"): + return Message( + role="tool", + contents=[Content.from_mcp_server_tool_result(call_id=cto.call_id, output=output)], + ) + return Message( + role="tool", + contents=[Content.from_function_result(cto.call_id, result=output)], + ) + + if item.type == "apply_patch_call": + ap = cast(ApplyPatchToolCallItemParam, item) + return Message( + role="assistant", + contents=[ + Content.from_function_call( + ap.call_id, + "apply_patch", + arguments=str(ap.operation), + ) + ], + ) + + if item.type == "apply_patch_call_output": + apo = cast(ApplyPatchToolCallOutputItemParam, item) + return Message( + role="tool", + contents=[Content.from_function_result(apo.call_id, result=apo.output or "")], + ) + + raise ValueError(f"Unsupported Item type: {item.type}") + + +async def _output_items_to_messages( + history: Sequence[OutputItem], + *, + approval_storage: ApprovalStorage | None = None, +) -> list[Message]: + """Converts a sequence of OutputItem objects to a list of Message objects. + + Args: + history (Sequence[OutputItem]): The sequence of OutputItem objects to convert. + approval_storage (ApprovalStorage | None, optional): The approval storage to use for + resolving MCP approval requests. Defaults to None. + + Returns: + list[Message]: The list of Message objects. 
+ """ + messages: list[Message] = [] + for item in history: + messages.append(await _output_item_to_message(item, approval_storage=approval_storage)) + return messages + + +async def _output_item_to_message(item: OutputItem, *, approval_storage: ApprovalStorage | None = None) -> Message: + """Converts an OutputItem to a Message. + + Args: + item (OutputItem): The OutputItem to convert. + approval_storage (ApprovalStorage | None, optional): The approval storage to use for + resolving MCP approval requests. Defaults to None. + + Returns: + Message: The converted Message. + + Raises: + ValueError: If the OutputItem type is not supported. + """ + if item.type == "output_message": + output_msg = cast(OutputItemOutputMessage, item) + return Message( + role=output_msg.role, contents=[_convert_output_message_content(part) for part in output_msg.content] + ) + + if item.type == "message": + msg = cast(OutputItemMessage, item) + return Message(role=msg.role, contents=[_convert_message_content(part) for part in msg.content]) + + if item.type == "function_call": + fc = cast(OutputItemFunctionToolCall, item) + return Message( + role="assistant", + contents=[Content.from_function_call(fc.call_id, fc.name, arguments=fc.arguments)], + ) + + if item.type == "function_call_output": + fco = cast(FunctionCallOutputItemParam, item) + output = fco.output if isinstance(fco.output, str) else str(fco.output) + return Message( + role="tool", + contents=[Content.from_function_result(fco.call_id, result=output)], + ) + + if item.type == "reasoning": + reasoning = cast(OutputItemReasoningItem, item) + contents: list[Content] = [] + if reasoning.summary: + for summary in reasoning.summary: + contents.append(Content.from_text(summary.text)) + return Message(role="assistant", contents=contents) + + if item.type == "mcp_call": + mcp = cast(OutputItemMcpToolCall, item) + contents = [ + Content.from_mcp_server_tool_call( + mcp.id, + mcp.name, + server_name=mcp.server_label, + 
arguments=mcp.arguments, + ) + ] + if getattr(mcp, "output", None) is not None: + contents.append(Content.from_mcp_server_tool_result(call_id=mcp.id, output=mcp.output)) + return Message( + role="assistant", + contents=contents, + ) + + if item.type == "mcp_approval_request": + mcp_req = cast(OutputItemMcpApprovalRequest, item) + if approval_storage is not None: + function_approval_request_content = await approval_storage.load_approval_request(mcp_req.id) + else: + raise ValueError("ApprovalStorage is required to load approval request.") + return Message( + role="assistant", + contents=[function_approval_request_content], + ) + + if item.type == "mcp_approval_response": + mcp_resp = cast(OutputItemMcpApprovalResponseResource, item) + if approval_storage is not None: + function_approval_request_content = await approval_storage.load_approval_request( + mcp_resp.approval_request_id + ) + else: + raise ValueError("ApprovalStorage is required to load approval request.") + + return Message( + role="user", + contents=[function_approval_request_content.to_function_approval_response(mcp_resp.approve)], + ) + + if item.type == "code_interpreter_call": + ci = cast(OutputItemCodeInterpreterToolCall, item) + return Message( + role="assistant", + contents=[Content.from_code_interpreter_tool_call(call_id=ci.id)], + ) + + if item.type == "image_generation_call": + ig = cast(OutputItemImageGenToolCall, item) + return Message( + role="assistant", + contents=[Content.from_image_generation_tool_call(image_id=ig.id)], + ) + + if item.type == "shell_call": + sc = cast(OutputItemFunctionShellCall, item) + return Message( + role="assistant", + contents=[ + Content.from_shell_tool_call( + call_id=sc.call_id, + commands=sc.action.commands, + status=str(sc.status), + ) + ], + ) + + if item.type == "shell_call_output": + sco = cast(OutputItemFunctionShellCallOutput, item) + outputs = [ + Content.from_shell_command_output( + stdout=out.stdout or "", + stderr=out.stderr or "", + 
exit_code=getattr(out.outcome, "exit_code", None) if hasattr(out, "outcome") else None, + ) + for out in (sco.output or []) + ] + return Message( + role="tool", + contents=[ + Content.from_shell_tool_result( + call_id=sco.call_id, + outputs=outputs, + max_output_length=sco.max_output_length, + ) + ], + ) + + if item.type == "local_shell_call": + lsc = cast(OutputItemLocalShellToolCall, item) + commands = lsc.action.command if hasattr(lsc.action, "command") and lsc.action.command else [] + return Message( + role="assistant", + contents=[ + Content.from_shell_tool_call( + call_id=lsc.call_id, + commands=commands, + status=str(lsc.status), + ) + ], + ) + + if item.type == "local_shell_call_output": + lsco = cast(OutputItemLocalShellToolCallOutput, item) + return Message( + role="tool", + contents=[ + Content.from_shell_tool_result( + call_id=lsco.id, + outputs=[Content.from_shell_command_output(stdout=lsco.output)], + ) + ], + ) + + if item.type == "file_search_call": + fs = cast(OutputItemFileSearchToolCall, item) + return Message( + role="assistant", + contents=[ + Content.from_function_call( + fs.id, + "file_search", + arguments=json.dumps({"queries": fs.queries}), + ) + ], + ) + + if item.type == "web_search_call": + ws = cast(OutputItemWebSearchToolCall, item) + return Message( + role="assistant", + contents=[Content.from_function_call(ws.id, "web_search")], + ) + + if item.type == "computer_call": + cc = cast(OutputItemComputerToolCall, item) + return Message( + role="assistant", + contents=[ + Content.from_function_call( + cc.call_id, + "computer_use", + arguments=str(cc.action), + ) + ], + ) + + if item.type == "computer_call_output": + cco = cast(OutputItemComputerToolCallOutputResource, item) + return Message( + role="tool", + contents=[Content.from_function_result(cco.call_id, result=str(cco.output))], + ) + + if item.type == "custom_tool_call": + ct = cast(OutputItemCustomToolCall, item) + return Message( + role="assistant", + 
contents=[Content.from_function_call(ct.call_id, ct.name, arguments=ct.input)], + ) + + if item.type == "custom_tool_call_output": + cto = cast(OutputItemCustomToolCallOutput, item) + output = cto.output if isinstance(cto.output, str) else str(cto.output) + # Hosted-MCP results land here because the host writes them via + # `aoutput_item_custom_tool_call_output`. Route `mcp_*` call_ids + # back to a hosted-MCP result Content so the chat-client serialize + # layer can coalesce onto the matching `mcp_call` input item. + # Issue #5546. + if cto.call_id and cto.call_id.startswith("mcp_"): + return Message( + role="tool", + contents=[Content.from_mcp_server_tool_result(call_id=cto.call_id, output=output)], + ) + return Message( + role="tool", + contents=[Content.from_function_result(cto.call_id, result=output)], + ) + + if item.type == "apply_patch_call": + ap = cast(OutputItemApplyPatchToolCall, item) + return Message( + role="assistant", + contents=[ + Content.from_function_call( + ap.call_id, + "apply_patch", + arguments=str(ap.operation), + ) + ], + ) + + if item.type == "apply_patch_call_output": + apo = cast(OutputItemApplyPatchToolCallOutput, item) + return Message( + role="tool", + contents=[Content.from_function_result(apo.call_id, result=apo.output or "")], + ) + + if item.type == "oauth_consent_request": + oauth = cast(OAuthConsentRequestOutputItem, item) + return Message( + role="assistant", + contents=[Content.from_oauth_consent_request(oauth.consent_link)], + ) + + if item.type == "structured_outputs": + so = cast(StructuredOutputsOutputItem, item) + text = json.dumps(so.output) if not isinstance(so.output, str) else so.output + return Message(role="assistant", contents=[Content.from_text(text)]) + + raise ValueError(f"Unsupported OutputItem type: {item.type}") + + +def _convert_output_message_content(content: OutputMessageContent) -> Content: + """Converts an OutputMessageContent to a Content object. 
+ + Args: + content (OutputMessageContent): The OutputMessageContent to convert. + + Returns: + Content: The converted Content object. + + Raises: + ValueError: If the OutputMessageContent type is not supported. + """ + if content.type == "output_text": + text_content = cast(OutputMessageContentOutputTextContent, content) + return Content.from_text(text_content.text) + if content.type == "refusal": + refusal_content = cast(OutputMessageContentRefusalContent, content) + return Content.from_text(refusal_content.refusal) + + raise ValueError(f"Unsupported OutputMessageContent type: {content.type}") + + +def _convert_file_data(data_uri: str, filename: str | None = None) -> Content: + """Convert a file_data data URI to a Content object. + + For text/* MIME types, decodes the base64 content and returns it as text. + For other types, returns a URI-based Content with the filename preserved. + """ + # Parse data URI: data:;base64, + if data_uri.startswith("data:") and ";base64," in data_uri: + header, encoded = data_uri.split(";base64,", 1) + media_type = header[len("data:") :] + if media_type.startswith("text/"): + try: + decoded_text = base64.b64decode(encoded).decode("utf-8") + except (ValueError, UnicodeDecodeError): + logger.warning( + "Failed to decode text/* file_data as UTF-8, falling through to URI passthrough.", + exc_info=True, + ) + else: + prefix = f"[File: {filename}]\n" if filename else "" + return Content.from_text(f"{prefix}{decoded_text}") + additional_properties = {"filename": filename} if filename else None + return Content.from_uri(data_uri, additional_properties=additional_properties) + + +def _convert_message_content(content: MessageContent) -> Content: + """Converts a MessageContent to a Content object. + + Args: + content (MessageContent): The MessageContent to convert. + + Returns: + Content: The converted Content object. + + Raises: + ValueError: If the MessageContent type is not supported. 
+ """ + if content.type == "input_text": + input_text = cast(MessageContentInputTextContent, content) + return Content.from_text(input_text.text) + if content.type == "output_text": + output_text = cast(MessageContentOutputTextContent, content) + return Content.from_text(output_text.text) + if content.type == "text": + text = cast(TextContent, content) + return Content.from_text(text.text) + if content.type == "summary_text": + summary = cast(SummaryTextContent, content) + return Content.from_text(summary.text) + if content.type == "refusal": + refusal = cast(MessageContentRefusalContent, content) + return Content.from_text(refusal.refusal) + if content.type == "reasoning_text": + reasoning = cast(MessageContentReasoningTextContent, content) + return Content.from_text_reasoning(text=reasoning.text) + if content.type == "input_image": + image = cast(MessageContentInputImageContent, content) + if image.image_url: + if image.image_url.startswith("data:"): + return Content.from_uri(image.image_url) + return Content.from_uri(image.image_url, media_type="image/*") + if image.file_id: + return Content.from_hosted_file(image.file_id) + if content.type == "input_file": + file = cast(MessageContentInputFileContent, content) + if file.file_url: + return Content.from_uri(file.file_url) + if file.file_id: + return Content.from_hosted_file(file.file_id, name=file.filename) + if file.file_data: + return _convert_file_data(file.file_data, file.filename) + if content.type == "computer_screenshot": + screenshot = cast(ComputerScreenshotContent, content) + return Content.from_uri(screenshot.image_url) + + raise ValueError(f"Unsupported MessageContent type: {content.type}") + + +# endregion + +# region Output Item Conversion + + +def _argument_json_default(value: Any) -> Any: + if is_dataclass(value) and not isinstance(value, type): + return asdict(value) + to_dict = getattr(value, "to_dict", None) + if callable(to_dict): + return to_dict() + raise TypeError(f"Object of type 
{type(value).__name__} is not JSON serializable") + + +def _arguments_to_str(arguments: Any | None) -> str: + """Convert arguments to a JSON string. + + Args: + arguments: The arguments to convert, can be a string, JSON-like object, or None. + + Returns: + The arguments as a JSON string. + """ + if arguments is None: + return "" + if isinstance(arguments, str): + return arguments + return json.dumps(arguments, default=_argument_json_default) + async def _to_outputs( stream: ResponseEventStream, @@ -846,6 +1671,7 @@ async def _to_outputs( mcp_call = stream.add_output_item_mcp_call( server_label=content.server_name or "default", name=content.tool_name or "", + item_id=content.call_id, ) yield mcp_call.emit_added() async for event in mcp_call.aarguments(_arguments_to_str(content.arguments)): @@ -920,4 +1746,91 @@ async def _to_outputs( logger.warning(f"Content type '{content.type}' is not supported yet. This is usually safe to ignore.") +def _stringify_mcp_output(output: Any) -> str: + """Convert hosted MCP output payloads into the string shape expected by mcp_call.output.""" + if output is None: + return "" + if isinstance(output, str): + return output + if isinstance(output, Mapping): + text = cast(Any, output).get("text") + if isinstance(text, str): + return text + return json.dumps(output, default=str) + if isinstance(output, Sequence) and not isinstance(output, (str, bytes, bytearray)): + parts: list[str] = [] + entries = cast(Sequence[object], output) + for entry in entries: + if isinstance(entry, Content) and entry.type == "text": + parts.append(entry.text or "") + continue + parts.append(_stringify_mcp_output(entry)) + return "".join(parts) + return str(output) + + +def _emit_completed_mcp_call( + stream: ResponseEventStream, + call_content: Content, + *, + arguments: str, + output: str, +) -> Generator[ResponseStreamEvent]: + """Emit a single completed MCP call item carrying both arguments and output.""" + mcp_call = stream.add_output_item_mcp_call( + 
server_label=call_content.server_name or "default", + name=call_content.tool_name or "", + item_id=call_content.call_id, + ) + yield mcp_call.emit_added() + yield mcp_call.emit_arguments_done(arguments) + yield mcp_call.emit_completed() + yield mcp_call.emit_done(output=output) + + +async def _to_outputs_for_messages( + stream: ResponseEventStream, + messages: Sequence[Message], + *, + approval_storage: ApprovalStorage | None = None, +) -> AsyncIterator[ResponseStreamEvent]: + """Convert messages to output events with hosted-MCP call/result coalescing. + + Parse once in message/content order and emit either: + - a single canonical completed ``mcp_call`` when adjacent hosted MCP + call/result content are encountered, or + - standard output items for all other content types. + """ + pending_mcp_call: Content | None = None + + for message in messages: + for content in message.contents: + if pending_mcp_call is not None: + if content.type == "mcp_server_tool_result" and content.call_id == pending_mcp_call.call_id: + for event in _emit_completed_mcp_call( + stream, + pending_mcp_call, + arguments=_arguments_to_str(pending_mcp_call.arguments), + output=_stringify_mcp_output(content.output), + ): + yield event + pending_mcp_call = None + continue + + async for event in _to_outputs(stream, pending_mcp_call, approval_storage=approval_storage): + yield event + pending_mcp_call = None + + if content.type == "mcp_server_tool_call" and content.call_id: + pending_mcp_call = content + continue + + async for event in _to_outputs(stream, content, approval_storage=approval_storage): + yield event + + if pending_mcp_call is not None: + async for event in _to_outputs(stream, pending_mcp_call, approval_storage=approval_storage): + yield event + + # endregion diff --git a/python/packages/foundry_hosting/tests/test_history_provider.py b/python/packages/foundry_hosting/tests/test_history_provider.py index cfdbeccacb6..6ee8a63cb3f 100644 --- 
a/python/packages/foundry_hosting/tests/test_history_provider.py +++ b/python/packages/foundry_hosting/tests/test_history_provider.py @@ -913,8 +913,9 @@ class TestSharedReExports: downstream code that historically imported them keep working.""" def test_responses_re_exports_helpers(self) -> None: - # All of these used to live in ``_responses``; after the - # refactor they live in ``_shared`` but are re-exported. + # These helpers historically lived in ``_responses``. They must + # remain importable there for compatibility even when ``_shared`` + # also provides canonical implementations for the history provider. from agent_framework_foundry_hosting import ( _responses, # pyright: ignore[reportPrivateUsage] _shared, # pyright: ignore[reportPrivateUsage] @@ -929,9 +930,8 @@ def test_responses_re_exports_helpers(self) -> None: "_output_item_to_message", "_output_items_to_messages", ): - assert getattr(_responses, name) is getattr(_shared, name), ( - f"{name} should be re-exported from _responses for backwards compat" - ) + assert callable(getattr(_responses, name)) + assert callable(getattr(_shared, name)) # region Full AF ↔ Foundry round-trip via InMemoryResponseProvider diff --git a/python/packages/hosting/agent_framework_hosting/_host.py b/python/packages/hosting/agent_framework_hosting/_host.py index c0a6a8e461b..3d5287f38e0 100644 --- a/python/packages/hosting/agent_framework_hosting/_host.py +++ b/python/packages/hosting/agent_framework_hosting/_host.py @@ -224,7 +224,7 @@ def _workflow_event_to_update(event: WorkflowEvent[Any]) -> AgentResponseUpdate @asynccontextmanager -async def _suppress_already_consumed() -> AsyncIterator[None]: # noqa: RUF029 +async def _suppress_already_consumed() -> AsyncIterator[None]: """Yield, swallowing finalizer failures so consumer cleanup never crashes the host. 
The bridge stream calls ``get_final_response()`` after iterating the From eb2e0adc6060c569aa1425fd96ecdfb2940305dd Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Wed, 17 Jun 2026 17:40:41 +0200 Subject: [PATCH 18/20] Fix Python hosting CI issues Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../hosting-activity-protocol/README.md | 7 +++---- python/packages/hosting/tests/test_host.py | 20 +++++++++---------- .../packages/hosting/tests/test_host_disk.py | 4 ++-- .../samples/04-hosting/af-hosting/README.md | 2 +- .../local_responses_workflow/README.md | 2 +- .../local_responses_workflow/app.py | 2 +- 6 files changed, 18 insertions(+), 19 deletions(-) diff --git a/python/packages/hosting-activity-protocol/README.md b/python/packages/hosting-activity-protocol/README.md index 367dc666dab..661194c7a55 100644 --- a/python/packages/hosting-activity-protocol/README.md +++ b/python/packages/hosting-activity-protocol/README.md @@ -7,10 +7,9 @@ Telegram-via-bot-channel, and any other channel Azure Bot Service supports — without having to learn each channel's native protocol. > Looking for a deeper Microsoft Teams integration with adaptive cards, -> message extensions, dialogs, SSO, etc? See the companion -> [`agent-framework-hosting-teams`](../hosting-teams) package, which is -> built on `microsoft-teams-apps` and exposes Teams-specific affordances -> on top of (still) Azure Bot Service. +> message extensions, dialogs, SSO, etc? That is intentionally separate from +> this Activity Protocol channel, which focuses on Azure Bot Service +> compatibility rather than Teams-specific affordances. 
Handles inbound `message` activities, outbound replies, mid-stream `updateActivity` edits, typing indicators, and both client-secret and diff --git a/python/packages/hosting/tests/test_host.py b/python/packages/hosting/tests/test_host.py index bb2b761d80c..1bbf53a7cde 100644 --- a/python/packages/hosting/tests/test_host.py +++ b/python/packages/hosting/tests/test_host.py @@ -371,7 +371,7 @@ class TestHostWorkflowTarget: """The host accepts a ``Workflow`` and dispatches to ``workflow.run(...)``.""" async def test_invoke_workflow_collapses_outputs_to_hosted_run_result(self) -> None: - from tests._workflow_fixtures import build_upper_workflow + from ._workflow_fixtures import build_upper_workflow workflow = build_upper_workflow() ch = _RecordingChannel() @@ -391,7 +391,7 @@ async def test_invoke_workflow_collapses_outputs_to_hosted_run_result(self) -> N assert host._sessions == {} async def test_stream_workflow_yields_updates_and_finalizes(self) -> None: - from tests._workflow_fixtures import build_echo_workflow + from ._workflow_fixtures import build_echo_workflow workflow = build_echo_workflow() ch = _RecordingChannel() @@ -419,7 +419,7 @@ async def test_stream_workflow_yields_updates_and_finalizes(self) -> None: assert final.text == "hi" async def test_stream_workflow_yields_one_update_per_output_event(self) -> None: - from tests._workflow_fixtures import build_multi_chunk_workflow + from ._workflow_fixtures import build_multi_chunk_workflow workflow = build_multi_chunk_workflow() ch = _RecordingChannel() @@ -477,7 +477,7 @@ class TestHostWorkflowCheckpointing: def test_rejects_workflow_with_existing_checkpoint_storage(self, tmp_path: Any) -> None: from agent_framework import InMemoryCheckpointStorage, WorkflowBuilder - from tests._workflow_fixtures import _UpperExecutor + from ._workflow_fixtures import _UpperExecutor workflow = WorkflowBuilder( start_executor=_UpperExecutor(id="upper"), @@ -500,7 +500,7 @@ def test_warns_when_target_is_agent(self, tmp_path: Any, 
caplog: Any) -> None: assert any("checkpoint_location" in rec.message for rec in caplog.records) async def test_invoke_skips_checkpointing_when_no_isolation_key(self, tmp_path: Any) -> None: - from tests._workflow_fixtures import build_upper_workflow + from ._workflow_fixtures import build_upper_workflow workflow = build_upper_workflow() ch = _RecordingChannel() @@ -516,7 +516,7 @@ async def test_invoke_skips_checkpointing_when_no_isolation_key(self, tmp_path: assert list(tmp_path.iterdir()) == [] async def test_invoke_writes_checkpoint_under_isolation_key(self, tmp_path: Any) -> None: - from tests._workflow_fixtures import build_upper_workflow + from ._workflow_fixtures import build_upper_workflow workflow = build_upper_workflow() ch = _RecordingChannel() @@ -540,7 +540,7 @@ async def test_invoke_writes_checkpoint_under_isolation_key(self, tmp_path: Any) assert any(scoped.iterdir()), "expected at least one checkpoint to be written under the per-user dir" async def test_stream_writes_checkpoint_under_isolation_key(self, tmp_path: Any) -> None: - from tests._workflow_fixtures import build_echo_workflow + from ._workflow_fixtures import build_echo_workflow workflow = build_echo_workflow() ch = _RecordingChannel() @@ -566,7 +566,7 @@ async def test_stream_writes_checkpoint_under_isolation_key(self, tmp_path: Any) async def test_caller_supplied_checkpoint_storage_used_as_is(self, tmp_path: Any) -> None: from agent_framework import InMemoryCheckpointStorage - from tests._workflow_fixtures import build_upper_workflow + from ._workflow_fixtures import build_upper_workflow storage = InMemoryCheckpointStorage() workflow = build_upper_workflow() @@ -648,7 +648,7 @@ class TestHostWorkflowCheckpointingPathTraversal: async def test_traversal_key_skips_checkpointing_with_warning(self, tmp_path: Any, caplog: Any) -> None: import logging as _logging - from tests._workflow_fixtures import build_upper_workflow + from ._workflow_fixtures import build_upper_workflow workflow = 
build_upper_workflow() ch = _RecordingChannel() @@ -673,7 +673,7 @@ async def test_traversal_key_skips_checkpointing_with_warning(self, tmp_path: An ) async def test_separator_in_key_skips_checkpointing(self, tmp_path: Any) -> None: - from tests._workflow_fixtures import build_upper_workflow + from ._workflow_fixtures import build_upper_workflow workflow = build_upper_workflow() ch = _RecordingChannel() diff --git a/python/packages/hosting/tests/test_host_disk.py b/python/packages/hosting/tests/test_host_disk.py index abcbf7397ba..47c78d2edc2 100644 --- a/python/packages/hosting/tests/test_host_disk.py +++ b/python/packages/hosting/tests/test_host_disk.py @@ -107,7 +107,7 @@ def test_session_aliases_survive_restart(tmp_path: Path) -> None: def _build_simple_workflow() -> Any: """Build a no-op workflow for checkpoint-wiring tests.""" - from tests._workflow_fixtures import build_upper_workflow + from ._workflow_fixtures import build_upper_workflow return build_upper_workflow() @@ -213,7 +213,7 @@ def test_state_dir_checkpoints_conflicts_with_workflow_own_storage(tmp_path: Pat """Derived checkpoint path triggers the same conflict guard as explicit.""" from agent_framework import InMemoryCheckpointStorage, WorkflowBuilder - from tests._workflow_fixtures import _UpperExecutor + from ._workflow_fixtures import _UpperExecutor workflow = WorkflowBuilder( start_executor=_UpperExecutor(id="upper"), diff --git a/python/samples/04-hosting/af-hosting/README.md b/python/samples/04-hosting/af-hosting/README.md index 6c812a997ca..777b15db1ff 100644 --- a/python/samples/04-hosting/af-hosting/README.md +++ b/python/samples/04-hosting/af-hosting/README.md @@ -48,4 +48,4 @@ involved**. `agent-framework-hosting` stack but is packaged so the Foundry Hosted Agents platform can run it as one of its own. -See [`ARCHITECTURE.md`](./ARCHITECTURE.md) for the cross-sample story. +The table above summarizes the cross-sample story. 
diff --git a/python/samples/04-hosting/af-hosting/local_responses_workflow/README.md b/python/samples/04-hosting/af-hosting/local_responses_workflow/README.md index 7e1a04f56d6..fdeebe973c8 100644 --- a/python/samples/04-hosting/af-hosting/local_responses_workflow/README.md +++ b/python/samples/04-hosting/af-hosting/local_responses_workflow/README.md @@ -3,7 +3,7 @@ A `Workflow` (intake → writer → legal reviewer → formatter) hosted behind **both the Responses API and the Invocations API**, with the host configured to **persist per-conversation checkpoints**. Mirrors -[`../../foundry-hosted-agents/responses/04_workflows/`](../../foundry-hosted-agents/responses/04_workflows/) +[`../../foundry-hosted-agents/responses/05_workflows/`](../../foundry-hosted-agents/responses/05_workflows/) but uses the `agent-framework-hosting` stack instead of the Foundry-Hosted-Agents runtime, and adds a structured intake step (`SloganBrief` with `topic` / `style` / `audience` fields) at the front diff --git a/python/samples/04-hosting/af-hosting/local_responses_workflow/app.py b/python/samples/04-hosting/af-hosting/local_responses_workflow/app.py index 6851a51f1c3..c35e8c06ea3 100644 --- a/python/samples/04-hosting/af-hosting/local_responses_workflow/app.py +++ b/python/samples/04-hosting/af-hosting/local_responses_workflow/app.py @@ -3,7 +3,7 @@ """Hosted workflow sample with a structured intake step + checkpoint location. Same three-agent slogan workflow as -``../../foundry-hosted-agents/responses/04_workflows/main.py`` (writer → +``../../foundry-hosted-agents/responses/05_workflows/main.py`` (writer → legal reviewer → formatter), but with an extra **structured intake** step at the front and driven through the ``agent-framework-hosting`` stack instead of the Foundry-Hosted-Agents runtime. 
From 3cdc751a706f84c95ec037128600aed0ac6e6091 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Wed, 17 Jun 2026 18:41:10 +0200 Subject: [PATCH 19/20] Fix remaining Python hosting checks Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../agent_framework_foundry_hosting/_ids.py | 4 +-- python/packages/hosting/tests/conftest.py | 25 +++++++++++++++++++ 2 files changed, 27 insertions(+), 2 deletions(-) create mode 100644 python/packages/hosting/tests/conftest.py diff --git a/python/packages/foundry_hosting/agent_framework_foundry_hosting/_ids.py b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_ids.py index 588231d073c..f4e6b3d4fe0 100644 --- a/python/packages/foundry_hosting/agent_framework_foundry_hosting/_ids.py +++ b/python/packages/foundry_hosting/agent_framework_foundry_hosting/_ids.py @@ -45,7 +45,7 @@ def foundry_response_id(previous_response_id: str | None = None) -> str: return IdGenerator.new_response_id(previous_response_id or "") -def foundry_response_id_factory() -> "Any": +def foundry_response_id_factory() -> Any: """Return a callable suitable for ``ResponsesChannel(response_id_factory=...)``. The returned callable accepts an optional ``previous_response_id`` @@ -55,7 +55,7 @@ def foundry_response_id_factory() -> "Any": return foundry_response_id -def foundry_item_id(item: "Any", response_id: str | None = None) -> str | None: +def foundry_item_id(item: Any, response_id: str | None = None) -> str | None: """Mint a Foundry-storage-compatible item id for *item*. Dispatches via :meth:`IdGenerator.new_item_id` so the id picks up diff --git a/python/packages/hosting/tests/conftest.py b/python/packages/hosting/tests/conftest.py new file mode 100644 index 00000000000..aa677567126 --- /dev/null +++ b/python/packages/hosting/tests/conftest.py @@ -0,0 +1,25 @@ +# Copyright (c) Microsoft. All rights reserved. 
+ +"""Pytest configuration for hosting tests.""" + +from __future__ import annotations + +import importlib.util +import sys +from pathlib import Path + + +def pytest_configure() -> None: + """Make local workflow fixtures importable in package and aggregate test modes.""" + module_name = "tests._workflow_fixtures" + if module_name in sys.modules: + return + + fixture_path = Path(__file__).with_name("_workflow_fixtures.py") + spec = importlib.util.spec_from_file_location(module_name, fixture_path) + if spec is None or spec.loader is None: + raise ImportError(f"Unable to load workflow fixtures from {fixture_path}") + + module = importlib.util.module_from_spec(spec) + sys.modules[module_name] = module + spec.loader.exec_module(module) From 6ce98347cecff68e2f157757e6284ebc17591567 Mon Sep 17 00:00:00 2001 From: eavanvalkenburg Date: Fri, 26 Jun 2026 10:31:47 +0200 Subject: [PATCH 20/20] Python: Harden Activity Protocol channel streaming for multimodal support - Fix _stream_to_conversation and _buffer_and_send to iterate over update.contents and extract text from text-type Content items instead of using text-only getattr pattern; non-text content (images, files, etc.) is correctly handled (forwarded via final response), and text accumulation is protected from corruption by multimodal chunks. - Update test mocks to use Content.from_text() matching real AgentResponseUpdate API; add contents property to test update objects. - Add Google-style docstring to ActivityProtocolChannel.__init__ documenting multimodal streaming support. 
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .../_channel.py | 97 +++++++++++++++++-- .../tests/test_channel.py | 17 ++++ 2 files changed, 106 insertions(+), 8 deletions(-) diff --git a/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py b/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py index 0c7e6237907..212311235f1 100644 --- a/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py +++ b/python/packages/hosting-activity-protocol/agent_framework_hosting_activity_protocol/_channel.py @@ -249,7 +249,82 @@ def __init__( inbound_auth_validator: InboundAuthValidator | None = None, service_url_allowed_hosts: tuple[str, ...] = _DEFAULT_SERVICE_URL_HOSTS, ) -> None: - """Configure the Teams channel. + """Configure the Activity Protocol channel. + + Streaming updates may carry multimodal content. While streaming, only + text-type content items are accumulated into the Activity text; non-text + content (images, files, etc.) is skipped there and forwarded via the final + response. ``stream_update_hook`` can further rewrite each update. + + Keyword Args: + path: Messages endpoint path on the host. Use ``""`` to expose the + webhook at the app root. + app_id: Bot Framework / Entra application (client) id. Required + whenever any credential is supplied. + app_password: Application secret for OAuth2 client credentials. + Mutually exclusive with ``certificate_path``. + certificate_path: Path to a PEM file containing **both** the + private key and the X.509 certificate. Use this for tenants + that disallow client secrets. See the module docstring for an + ``openssl`` recipe. + certificate_password: Password for the PEM private key, if any. + tenant_id: Entra tenant. Defaults to ``"botframework.com"`` for + public Bot Framework channels; pass your tenant id for + single-tenant bots. + token_scope: OAuth2 scope to request.
Defaults to the Bot + Framework resource. + credential: Bring your own ``AsyncTokenCredential`` (e.g. a + ``DefaultAzureCredential`` configured elsewhere). Overrides + ``app_password`` / ``certificate_path``. + commands: Discoverable ``/command`` handlers. An inbound message + whose text (after stripping the bot's own @mention) begins with + ``/`` and matches a command ``name`` (case-insensitive) is + dispatched to that handler instead of the agent, mirroring the + Telegram channel. The matching ``run_hook`` is applied to the + command request first, so command handlers observe the same + resolved ``session.isolation_key`` as ordinary messages. + Unknown ``/foo`` text falls through to the agent. Handlers reply + via ``ChannelCommandContext.reply``; surface them to users with + a Teams manifest ``commandLists`` entry. + run_hook: Optional rewrite of ``ChannelRequest`` before invocation; + the host owns invocation of this hook. Defaults to stripping + reserved request options so the host can manage agent invocation + context safely. + response_hook: Optional rewrite of the + :class:`HostedRunResult` before the originating Activity + reply is serialized; the host owns invocation of this hook. + send_typing_action: Whether to send ``typing`` activities while + the agent runs. + stream: Whether to stream by default. + stream_update_hook: Optional rewrite of each + ``AgentResponseUpdate`` before it hits the wire. + stream_edit_min_interval: Seconds between successive in-place + edits. Teams is more rate-sensitive than Telegram, so default + is higher. + inbound_auth_validator: Optional async callable invoked for each + inbound webhook request **before** the activity is parsed. + Return ``True`` to allow, ``False`` to reject with HTTP 401. + The webhook endpoint accepts unauthenticated requests by + default — Bot Framework normally validates inbound calls via + the JWT in the ``Authorization`` header (see Microsoft's + bot framework auth docs). 
The prototype intentionally does + NOT ship a built-in JWT validator (key rotation, OpenID + config caching, etc. are out of scope); plug your own + validator here, or terminate auth in front of the channel + (e.g. APIM, Application Gateway). When no credentials AND + no validator are configured the channel logs a loud + warning at startup so the dev-mode bypass cannot + accidentally ship. + service_url_allowed_hosts: Host (or host suffix) allow-list the + channel will POST a bearer token to. Defaults to the public + Bot Framework host suffixes (``botframework.com`` and + ``smba.trafficmanager.net``). An inbound activity claiming a + ``serviceUrl`` outside this set is rejected — without this + gate a malicious caller could redirect outbound replies (and + the attached bearer token) to an attacker-controlled host. + Pass an extended tuple for sovereign clouds or private + deployments; pass ``()`` to disable the check entirely + (only safe with strong inbound auth). Keyword Args: path: Messages endpoint path on the host. Use ``""`` to expose the @@ -765,10 +840,13 @@ async def edit_worker() -> None: try: async for update in stream: - chunk = getattr(update, "text", None) - if chunk: - accumulated += chunk - wake.set() + # Use multimodal stream contents: iterate and extract text from all text-type items. + # Non-text content (images, files, etc.) is ignored here and forwarded via the + # final response; this ensures text accumulation isn't corrupted by multimodal chunks. + for content in update.contents: + if content.type == "text" and content.text: + accumulated += content.text + wake.set() except Exception: logger.exception("Activity streaming consumption failed") finally: @@ -829,9 +907,12 @@ async def _buffer_and_send( accumulated = "" try: async for update in stream: - chunk = getattr(update, "text", None) - if chunk: - accumulated += chunk + # Use multimodal stream contents: iterate and extract text from all text-type items. 
+ # Non-text content is ignored here (forwarded via final response); text accumulation + # is protected from corruption by multimodal chunks. + for content in update.contents: + if content.type == "text" and content.text: + accumulated += content.text except Exception: logger.exception("Activity streaming consumption failed") diff --git a/python/packages/hosting-activity-protocol/tests/test_channel.py b/python/packages/hosting-activity-protocol/tests/test_channel.py index 002003faf36..6108f34de04 100644 --- a/python/packages/hosting-activity-protocol/tests/test_channel.py +++ b/python/packages/hosting-activity-protocol/tests/test_channel.py @@ -14,6 +14,7 @@ from unittest.mock import AsyncMock, MagicMock import pytest +from agent_framework import Content from agent_framework_hosting import ( AgentFrameworkHost, ChannelCommand, @@ -587,6 +588,10 @@ async def test_stream_sends_placeholder_and_edits(self) -> None: class _Up: text: str + @property + def contents(self) -> list[Any]: + return [Content.from_text(self.text)] + class _Stream: def __init__(self) -> None: self._chunks = ["hel", "lo"] @@ -635,6 +640,10 @@ async def test_stream_placeholder_failure_falls_back_to_single_post(self) -> Non class _Up: text: str + @property + def contents(self) -> list[Any]: + return [Content.from_text(self.text)] + class _Stream: def __aiter__(self) -> Any: async def gen() -> Any: @@ -694,6 +703,10 @@ async def test_non_edit_channel_buffers_and_posts_single_message(self) -> None: class _Up: text: str + @property + def contents(self) -> list[Any]: + return [Content.from_text(self.text)] + class _Stream: def __aiter__(self) -> Any: async def gen() -> Any: @@ -755,6 +768,10 @@ async def test_edit_405_falls_back_to_single_post(self) -> None: class _Up: text: str + @property + def contents(self) -> list[Any]: + return [Content.from_text(self.text)] + class _Stream: def __aiter__(self) -> Any: async def gen() -> Any:
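
The text-accumulation loop this patch adds to both streaming paths can be tried outside the channel with stand-in types. What follows is a minimal sketch, not the channel's code: `FakeContent`, `FakeUpdate`, `accumulate_text`, and `demo_stream` are hypothetical names, and the `"uri"` type label is illustrative only. The one assumption taken from the patch is that each update exposes a `contents` list whose items carry `type` and `text` attributes.

```python
import asyncio
from collections.abc import AsyncIterator
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeContent:
    """Hypothetical stand-in for a framework content item."""

    type: str
    text: Optional[str] = None


@dataclass
class FakeUpdate:
    """Hypothetical stand-in for a streamed update with mixed contents."""

    contents: list[FakeContent] = field(default_factory=list)


async def accumulate_text(stream: AsyncIterator[FakeUpdate]) -> str:
    """Concatenate the text of every text-type item and skip everything else."""
    accumulated = ""
    async for update in stream:
        for content in update.contents:
            # Same filter as the patched loops: text-type items with non-empty text.
            if content.type == "text" and content.text:
                accumulated += content.text
    return accumulated


async def demo_stream() -> AsyncIterator[FakeUpdate]:
    """Yield a few updates, including a non-text item and an empty text item."""
    yield FakeUpdate([FakeContent("text", "hel")])
    yield FakeUpdate([FakeContent("uri"), FakeContent("text", "lo")])
    yield FakeUpdate([FakeContent("text", "")])


result = asyncio.run(accumulate_text(demo_stream()))
print(result)  # prints "hello"
```

Iterating `contents` rather than reading a single `text` attribute means an update that mixes text with other items still contributes its text, while a non-text item contributes nothing to the accumulated reply.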