Skip to content

[v2] ClientSession runs on JSONRPCDispatcher; BaseSession removed#2838

Open
maxisbey wants to merge 24 commits into
mainfrom
maxisbey/v2-client-dispatcher-swap
Open

[v2] ClientSession runs on JSONRPCDispatcher; BaseSession removed#2838
maxisbey wants to merge 24 commits into
mainfrom
maxisbey/v2-client-dispatcher-swap

Conversation

@maxisbey

@maxisbey maxisbey commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

V2 client receive path: Transport -> JSONRPCDispatcher -> ClientSession callbacks, replacing the BaseSession receive loop. Companion to #2710, which did the same for the server seat; BaseSession is now deleted.

Motivation and Context

ClientSession was the last consumer of the v1 BaseSession engine. Moving it onto the dispatcher unifies both seats on one receive path and fixes the v1 client's structural problems: server-initiated requests were handled inline in the receive loop (a slow sampling callback blocked everything, a callback that sent a request deadlocked, and peer cancellation was unactionable), and a raising notification handler could take down the connection.

ClientSession's public surface is unchanged: same constructor, same typed methods, manual initialize(), same context-manager lifecycle. A new keyword-only dispatcher= constructor argument accepts a pre-built dispatcher instead of the stream pair (e.g. DirectDispatcher for in-process embedding).

What changed

  • CallOptions.cancel_on_abandon + suppression: abandoning a request (timeout or caller cancellation) sends a courtesy notifications/cancelled, unless the caller opted out (initialize, which the spec forbids cancelling) or the request carries resumption hints (the peer's work must survive for the resume)
  • The two shielded cancellation-path writes are bounded (5s), so a wedged transport write can no longer hang shutdown or a cancelled caller
  • Spawned notification handlers and the new on_stream_exception observer are contained at the dispatcher: a raising handler costs itself, never the connection (matching the TypeScript/C#/Go engines)
  • ClientSession internals rewritten over the dispatcher; mcp.shared.session shrinks to a compatibility module (ProgressFnT re-export, typing-only RequestResponder stub for MessageHandlerFnT annotations)
  • The dispatcher constructor overloads collapsed via a defaulted TypeVar (PEP 696)
  • Drive-by fix: the standalone SSE writer no longer logs an ERROR traceback when transport teardown closes its stream mid-listen

Parity bar

The transport-parametrized interaction suite passes. Three recorded divergences are now resolved and deleted from the requirements manifest (protocol:timeout:sends-cancellation, protocol:cancel:late-response-ignored, protocol:cancel:server-to-client), and the server-to-client cancellation requirement is pinned by a new test passing over all three transports.

Breaking Changes

Documented in docs/migration.md (one grouped entry):

  • Request ids count from 1 (previously 0); progress tokens follow
  • Timeout error text is now Request 'tools/call' timed out, and a timed-out or abandoned request sends notifications/cancelled, interrupting the server handler
  • Server-initiated requests run concurrently; a server's cancellation now actually interrupts the client callback
  • Notification callbacks are concurrent (no completion-before-response guarantee)
  • Responses with unknown ids are ignored per spec instead of surfacing a RuntimeError to message_handler
  • Unknown inbound methods are answered with METHOD_NOT_FOUND (carrying the method in data), completing Fix unknown-method error code and add a protocol version registry #2836's contract on the client seat
  • A raising request callback is answered with code=0 and the exception text (previously flattened to INVALID_PARAMS)
  • send_request before entering the context manager raises RuntimeError immediately

Fixes #2489. Fixes #2507. Supersedes #2490.

Related, not closed here: #2673 (a raising request callback now answers the server with code=0 and the exception text instead of INVALID_PARAMS; the opt-in propagation of the exception to the local call_tool awaiter that the issue requests stays open as a separate decision) and #2610 (server-seat bug on a code path that no longer exists on main after #2710; still open for v1.x triage — not addressed here).

How Has This Been Tested?

Full suite (1741 tests) including the interaction suite over in-memory, SSE, and streamable HTTP; 100% branch coverage held on all changed files. New tests: courtesy-cancel wire pins and suppression, wedged-transport shutdown bound (trio virtual clock), notification-handler containment, stream-exception observer, the dispatcher= constructor over direct dispatch, server-seat timeout cancellation, abandoned-server-request cancellation, standalone-stream teardown ordering.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

AI Disclaimer

@maxisbey maxisbey force-pushed the maxisbey/v2-client-dispatcher-swap branch from aec8155 to a51357c Compare June 11, 2026 17:13
@maxisbey maxisbey marked this pull request as ready for review June 12, 2026 14:29
maxisbey added 20 commits June 12, 2026 15:04
- New CallOptions key cancel_on_abandon (default true): abandoning a request
  (timeout or caller cancellation) sends notifications/cancelled unless the
  caller opted out or the request carries resumption hints
- Bound the two shielded cancellation-path writes with a 5s deadline so a
  wedged transport write cannot hang shutdown or a cancelled caller
- Capitalize the connection-closed fan-out message ("Connection closed")
- Pin the server-seat timeout contract in the interaction suite: a timed-out
  server-initiated request is followed by notifications/cancelled
A raising notification handler ran as a bare task in the dispatcher's task
group, so its exception cancelled the read loop and every in-flight request.
Wrap spawned handlers in the same containment boundary progress callbacks
already have: log the failure and keep the connection serving.
Transports yield Exception items on the read stream for connection faults
and parse errors; the dispatcher debug-logged and dropped them. An optional
observer now receives them (awaited in the read loop, contained so a raising
observer costs the item, not the connection). Unset keeps the old behavior.
TransportT now defaults to TransportContext (PEP 696, same pattern as
shared/context.py), so omitting transport_builder no longer needs a
dedicated overload to pin the type parameter.
ClientSession keeps its public surface (constructor, typed methods, manual
initialize, context-manager lifecycle) but now owns a JSONRPCDispatcher
instead of inheriting the v1 BaseSession receive loop. Server-initiated
requests are answered through the existing callbacks via the closed-union
parse; notifications validate-or-drop and tee to message_handler; transport
exceptions reach message_handler through the dispatcher's stream-exception
observer. A from_dispatcher constructor accepts a pre-built dispatcher for
in-process embedding.

mcp.shared.session shrinks to the surviving names: the ProgressFnT re-export
and a typing-only RequestResponder stub for MessageHandlerFnT annotations.

Behavior changes (deliberate, to be covered in the migration guide):
- request ids count from 1; the progress token follows
- timeouts use the dispatcher error text and send notifications/cancelled,
  so a timed-out server handler is interrupted instead of running on
- responses with unknown ids are ignored per spec instead of surfacing a
  RuntimeError to message_handler
- a raising request callback is answered with code 0 and the exception text
- notification callbacks run concurrently (no completion-before-response)

Three interaction-suite divergence entries are resolved and deleted, and the
server-to-client cancellation requirement is now pinned by a passing test.
Adds tests for ServerMessageMetadata routing, related-request-id
notifications, and params-absent inbound requests over direct dispatch,
plus the migration-guide entry for the ClientSession dispatcher swap.
Transport teardown closes the standalone stream's send side first, so a
writer parked in receive() ends on a clean end-of-stream; but when teardown
lands while the writer is between dequeues, the next receive() raises
ClosedResourceError, which fell into the catch-all and logged a traceback
at ERROR level for a routine disconnect. Catch it and end quietly. A new
test pins the close ordering that keeps the parked path clean.
Replaces the from_dispatcher classmethod: read_stream/write_stream become
optional and dispatcher is a keyword-only alternative, with mutual
exclusion validated at construction. Drops the __new__-based alternate
constructor and its shared state-init helper.
Covers MCPError.from_jsonrpc_error and the context-stream sync close()
methods, whose only exercisers died with BaseSession and its tests, and
restructures three test handler arms that could never take their false
branch.
- Bound the timeout-path courtesy cancel like the other abandon writes and
  extract a single _final_write policy: 5s for courtesy cancels, 1s for
  shutdown responses so closing a session stays fast on a wedged transport
- Answer shutdown-interrupted requests with CONNECTION_CLOSED and retire the
  REQUEST_CANCELLED constant (-32002 collides with resource-not-found)
- Key the peer-cancel error response on cancelled_caught so a cancel landing
  after the handler finished cannot produce a second answer for the same id
- Decide outbound metadata and cancel-on-abandon suppression in one place:
  only resumption hints that actually reach the transport suppress the
  courtesy cancel
- Never send notifications/cancelled for a request whose write never completed
- Identity-guard the in-flight pop so a finished handler cannot evict a newer
  entry that reused its request id
- Map request-write stream failures to MCPError(CONNECTION_CLOSED); warn when
  a bounded final write gives up; mark the Dispatcher lifecycle provisional
- Unwind the entered task group when __aenter__ is cancelled while the
  dispatcher is starting, instead of abandoning its cancel scope
- Deliver transport-level exceptions to message_handler concurrently and
  contained, like notifications, so a handler that awaits session I/O no
  longer deadlocks the read loop
- Route related_request_id=0 correctly in send_notification (ids are opaque)
- Document the dispatcher= constructor path in send_request's contract
- Add an interaction test proving a timed-out initialize sends no
  notifications/cancelled, and drop the stale deferred rationale for the
  initialize-not-cancellable requirement
- Record null-id error-response drops in the requirements manifest and
  describe the cancelled-request error response as applying to both seats
- Record wire messages only after the inner send succeeds, so a failed or
  cancelled write is not logged as sent
- Bound the remaining indefinite waits, refresh comments stale since the
  courtesy-cancel change, and assert the issue-88 timeout by error code
- Rewrite the suite README's concurrency section for the dispatcher model
Drive the between-dequeues teardown path directly through the transport's
ASGI entry point with a gated send, so the ClosedResourceError arm is
covered by a real test and no longer needs its coverage pragma. The e2e
teardown test's docstring now claims only what its assertion proves.
Cut the comment and docstring volume roughly in half: single-sentence
docstring summaries, Raises sections kept but shortened, inline narration
replaced by one-line statements of the non-inferable constraint, and
development-artifact comments removed. No code changes.
ServerMessageMetadata.related_request_id exists for the server's
streamable-HTTP transport to route outbound messages onto the originating
request's SSE stream. No client transport has ever serialized it, so
ClientSession's related_request_id parameter and ServerMessageMetadata
acceptance were dead inheritance from the shared v1 BaseSession.

- send_notification loses its related_request_id parameter
- send_request's metadata narrows to ClientMessageMetadata | None
  (resumption hints, the live part)
- the isinstance(dispatcher) downcasts those parameters forced are gone

Progress and response correlation (progressToken in params, JSON-RPC id)
are payload-level mechanisms and are unaffected.
RequestContext[ClientSession] was the only instantiation of the generic
left in the tree (the server seat has ServerRequestContext), so the
public ClientRequestContext alias becomes the real dataclass and the
private mcp.shared._context module is deleted. request_id is now always
populated: the client only builds a context for inbound requests, and
ping is answered before any context exists.
Server-initiated sampling/elicitation/roots requests over a ClientSession
built with dispatcher=DirectDispatcher failed before the callback ran:
the session requires a populated request id and direct dispatch carried
none. DirectDispatcher now assigns per-instance monotonic ids to inbound
requests (notifications keep None, which is how middleware distinguishes
them). Adds a non-ping direct-dispatch test and bounds the indefinite
awaits in the existing dispatcher= tests.
send_raw_request raised RuntimeError both before run() and after the
transport closed, contradicting the documented contract that connection
loss surfaces as MCPError(CONNECTION_CLOSED). A closed flag now separates
the two states: RuntimeError remains for use before run(), and a request
after EOF gets the same CONNECTION_CLOSED error the in-flight waiters
receive.
The previous canary (a raising message_handler) could never fire: correct
code drops unknown-id responses before any handler, and regressed code's
delivery paths contain handler exceptions. Collect surfaced messages
instead and assert only a control notification arrives; verified against
a simulation of the v1 surface-as-RuntimeError behavior.
@maxisbey maxisbey force-pushed the maxisbey/v2-client-dispatcher-swap branch from ac209ab to 1703d42 Compare June 12, 2026 15:16
maxisbey added 4 commits June 12, 2026 15:52
Cut the ClientSession section by about a third: merged single-concern
bullets (stray responses; the timeout exemptions), dropped rationale
padding, reframed the shutdown bullet as new-versus-v1 (v1 never answered
at shutdown), and removed the REQUEST_CANCELLED bullet entirely - the
constant never existed in v1, so it has no place in a v1-to-v2 guide.
run() entered both streams inside its own task group, so at teardown the
write stream closed before in-flight handlers sent their final answers:
the shutdown CONNECTION_CLOSED response was deterministically dropped on
the EOF path and raced the close on the cancel path. The write stream's
scope now wraps the task group, so scope exits order the join strictly
before the close and teardown writes always land. The shutdown-delivery
test becomes a real memory-stream pin, and the wedged-shutdown test's
synthetic stream is replaced by a plain unread one.
Two rendezvous tests: three concurrent outbound tool calls proven
simultaneously in flight and resolved out of order to their own callers,
and two overlapping server-initiated sampling requests serviced
concurrently by the client's callbacks (the v1 receive loop serialized
these). Also note in the migration guide that delivery concurrency is
unbounded.
- Rename _SHIELDED_WRITE_TIMEOUT to _ABANDON_WRITE_TIMEOUT; the
  timed-out arm passes it unshielded, so the old name lied there
- Document that on_stream_exception is awaited inline in the read loop
- Note in run()'s docstring that the dispatcher is single-shot
- Import ProgressFnT from mcp.shared.dispatcher (its home) instead of
  the mcp.shared.session compat shim
- Migration guide: scope the optional-fields section to
  ServerRequestContext, and correct the null-id bullet (v1 surfaced
  the transport's ValidationError, not an MCPError)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

1 participant