ci(conformance): run server --suite draft and baseline the 2026-07-28 scenarios #2878
Merged
… scenarios

Adds the draft-spec server step on top of #2877, mirroring typescript-sdk's two-step server job (active then draft).

- Second run-server.sh invocation with --suite draft.
- expected-failures.yml server: populated with the 17 draft scenarios that fail or warn (14 FAILURE + 3 WARNING-only), grouped by SEP. Same 17-entry set as typescript-sdk's server baseline.
- Two of the 19 draft scenarios are intentionally not baselined: their negative-case assertions are vacuously satisfied today by the stateful server's -32600 response. Noted in a trailing comment.

client: section unchanged; --suite all already covers draft client scenarios via the PR #2877 baseline.
felixweinberger
approved these changes
Jun 15, 2026
Adds the 2026-07-28 draft-spec server suite on top of #2877, mirroring typescript-sdk's two-step server job (active, then draft). This is the second half of bringing the conformance harness up to date for the v2/2026 line: #2877 was the modernization, and this adds the draft coverage so 2026 SEP work has a burn-down list.

Motivation and Context
AGENTS.md requires new 2026-07-28 features to have a passing conformance test. With #2877 the server job only runs --suite active (2025-11-25 scenarios); draft-spec server scenarios were never exercised. This adds the --suite draft step and baselines the 17 scenarios that don't pass yet, so each SEP implementation PR can remove its entries from the baseline as it lands.

Changes
- conformance.yml: second run-server.sh --suite draft --expected-failures ... step in the server-conformance job.
- expected-failures.yml (server: section): 17 entries grouped by SEP, the same set as typescript-sdk's server baseline. Two of the 19 draft scenarios (input-required-result-unsupported-methods, input-required-result-validate-input) are intentionally not baselined: their negative-case assertions are vacuously satisfied today by the stateful server's -32600 response, and the runner would flag them stale. A trailing comment records this.
- client: section unchanged. --suite all already covers draft client scenarios via the #2877 baseline (ci(conformance): pin harness to 0.2.0-alpha.3 with expected-failures baseline), re-verified after #2838 ([v2] ClientSession runs on JSONRPCDispatcher; BaseSession removed): 24 pass + 16 expected-fail, exit 0.

How Has This Been Tested?
Run locally against @modelcontextprotocol/conformance@0.2.0-alpha.3:

- server --suite draft --expected-failures ... → 19 scenarios, 17 expected-fail, 0 unexpected, 0 stale, exit 0. Run twice on a fresh server each time; output byte-identical.
- server --suite active --expected-failures ... with the 17 server entries present → 30/30 pass, exit 0 (the evaluator ignores baseline entries not in the current run, so the draft entries don't trip the active step).
- client --suite all --expected-failures ... → 24 pass + 16 expected-fail, exit 0.

Breaking Changes
None.
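
For reviewers unfamiliar with the baseline semantics relied on under "How Has This Been Tested?", here is a rough Python model of how an expected-failures baseline is typically evaluated. It is a sketch, not the harness's actual implementation: the names `Evaluation` and `evaluate` are hypothetical, a WARNING-only scenario is treated as "not passed", and a stale entry is assumed to produce a non-zero exit.

```python
from dataclasses import dataclass, field


@dataclass
class Evaluation:
    """Outcome of comparing one suite run against a baseline (hypothetical model)."""

    expected_failures: list[str] = field(default_factory=list)
    unexpected_failures: list[str] = field(default_factory=list)
    stale: list[str] = field(default_factory=list)

    @property
    def exit_code(self) -> int:
        # Assumption: both unexpected failures and stale entries fail the step.
        return 1 if self.unexpected_failures or self.stale else 0


def evaluate(results: dict[str, bool], baseline: set[str]) -> Evaluation:
    """results maps scenario name -> passed; baseline holds expected failures."""
    ev = Evaluation()
    for name, passed in sorted(results.items()):
        if name in baseline:
            # A baselined scenario that now passes makes its entry stale.
            (ev.stale if passed else ev.expected_failures).append(name)
        elif not passed:
            ev.unexpected_failures.append(name)
    # Baseline entries for scenarios absent from this run are ignored,
    # which is why draft entries do not trip the active step.
    return ev


if __name__ == "__main__":
    draft_baseline = {"draft-a", "draft-b"}  # placeholder scenario names
    active_run = {"active-1": True, "active-2": True}
    print(evaluate(active_run, draft_baseline).exit_code)  # 0

    # Baselining a scenario that passes vacuously would be flagged stale.
    print(evaluate({"draft-c": True}, {"draft-c"}).stale)  # ['draft-c']
```

Under this model, the two vacuously passing draft scenarios have to stay out of the baseline, and the 17 draft entries are harmless during the active run.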
Types of changes
Checklist
Additional context
- -32600 Missing session ID: mcp-everything-server has no stateless mode yet, so most scenarios hit the transport wall before SEP-specific checks run. Same as typescript-sdk's baseline; the entries reshuffle once SEP-2575 lands.
- client: block and in typescript-sdk; would need an upstream change to validate baseline names against the scenario registry.
- run-server.sh is invoked twice now; cleanup relies on uv run forwarding SIGTERM to uvicorn. Verified to work on the sequential-restart path locally.
- client.py reading MCP_CONFORMANCE_PROTOCOL_VERSION (typescript-sdk doesn't either), mcp-everything-server --stateless.
- python-sdk (main) entry to known-sdks.ts.

AI Disclaimer