ci(conformance): pin harness to 0.2.0-alpha.3 with expected-failures baseline #2877

Merged: maxisbey merged 1 commit into main from conformance-harness-0.2.0-alpha.3 on Jun 15, 2026

Conversation

@maxisbey

Contributor

Modernizes the conformance CI to match the typescript-sdk pattern so the job actually gates and the failure set burns down per SEP. Supersedes #1921 (the composite-action approach) — pinning the npm package directly is what typescript-sdk settled on.

Motivation and Context

The conformance jobs were pinned to 0.1.10/0.1.13 (drifted between server and client), ran with continue-on-error: true so they never gated, and had no baseline file — failures were silently ignored. The conformance harness has since gained --expected-failures (#113) and the 0.2.0-alpha line with version-aware scenario selection.

Changes

  • Pin the harness via a single workflow-level CONFORMANCE_VERSION: "0.2.0-alpha.3" env var (one place to bump).
  • Add .github/actions/conformance/expected-failures.yml — the path the conformance repo's known-sdks.ts already expects. Client baseline is 16 scenarios grouped by SEP (same set as typescript-sdk's baseline); server active suite is fully green so server: [].
  • Drop continue-on-error: true from both jobs. The runner now exits 0 only when failures match the baseline, 1 on regressions or stale entries.
  • Harden run-server.sh with a port-conflict guard, dead-process check in the readiness loop, and curl --max-time, mirroring typescript-sdk's wrapper.
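
The three guards added to run-server.sh can be illustrated outside the script. The sketch below is a Python rendering of the same ideas, not the bash wrapper itself: the function names, retry counts, and timeouts are hypothetical, and treating any HTTP response as "ready" is an assumption rather than something this PR states.

```python
import socket
import subprocess
import sys
import time
import urllib.error
import urllib.request


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Port-conflict guard: True if something already accepts connections on the port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        return sock.connect_ex((host, port)) == 0


def wait_until_ready(
    proc: subprocess.Popen,
    url: str,
    attempts: int = 30,          # hypothetical default
    delay: float = 0.5,          # hypothetical default
    request_timeout: float = 2.0,  # plays the role of curl --max-time
) -> None:
    """Poll url until the server answers, failing fast if the process has died."""
    for _ in range(attempts):
        # Dead-process check: stop polling once the server has exited.
        if proc.poll() is not None:
            raise RuntimeError(f"server exited early with code {proc.returncode}")
        try:
            # Bounded request so a hung server cannot stall the loop.
            with urllib.request.urlopen(url, timeout=request_timeout):
                return
        except urllib.error.HTTPError:
            # Assumption: any HTTP response, even an error status, means "listening".
            return
        except (urllib.error.URLError, OSError):
            time.sleep(delay)
    raise TimeoutError(f"server not ready after {attempts} attempts")
```

A caller would check port_in_use before starting the server, then pass the started process to wait_until_ready, so that a crash during startup surfaces immediately instead of after the full retry budget.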

The server --suite draft step (2026-07-28 scenarios) is a follow-up PR; the baseline file already has a server: placeholder for it.

How Has This Been Tested?

Run locally against the conformance harness at 0.2.0-alpha.3:

  • server --suite active --expected-failures ... → 30/30 scenarios pass, 42 assertions, exit 0.
  • client --suite all --expected-failures ... → 40 scenarios: 24 pass, 16 expected-fail, 0 unexpected, 0 stale, exit 0.
  • Stale-entry guard: appending a passing scenario (initialize) to the baseline → exit 1 with Stale baseline entries (now passing - remove from baseline): initialize.
  • bash -n, YAML parse, and pre-commit all clean.
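
The exit-code rule exercised above (0 only when failures match the baseline, 1 on regressions or stale entries) can be sketched as a small set comparison. This is an illustration of the rule as described in this PR, not the harness's actual implementation: the classify function and the sample results are hypothetical, and baseline entries for scenarios that were not run are simply ignored here, which is an assumption.

```python
def classify(results: dict[str, bool], baseline: set[str]) -> tuple[list[str], list[str], int]:
    """Return (unexpected failures, stale baseline entries, exit code)."""
    failed = {name for name, passed in results.items() if not passed}
    # Regressions: scenarios that fail without being listed in the baseline.
    unexpected = sorted(failed - baseline)
    # Stale entries: scenarios listed in the baseline that now pass.
    stale = sorted(name for name in baseline if results.get(name) is True)
    exit_code = 0 if not unexpected and not stale else 1
    return unexpected, stale, exit_code


# Made-up sample run using scenario names that appear in this PR.
results = {
    "initialize": True,
    "auth/scope-step-up": False,
    "auth/enterprise-managed-authorization": False,
}
baseline = {"auth/scope-step-up", "auth/enterprise-managed-authorization"}

print(classify(results, baseline))                   # ([], [], 0)
print(classify(results, baseline | {"initialize"}))  # ([], ['initialize'], 1)
```

Failing on stale entries as well as on regressions is what keeps the baseline shrinking: once a scenario starts passing, the job stays red until its entry is removed.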

Breaking Changes

None for SDK users. For contributors: the conformance jobs now fail on regressions instead of silently passing. run-server.sh requires CONFORMANCE_VERSION to be set when run locally (the error message points at the workflow pin).

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed

Additional context

  • The conformance jobs are not branch-protection-required checks, so an npm-registry blip turns them red but does not block merges. Same posture as typescript-sdk; can revisit once it has run cleanly for a while.
  • Backporting run-server.sh to v1.x will need the env: CONFORMANCE_VERSION block backported in the same change (the script now reads it via ${CONFORMANCE_VERSION:?}).
  • Follow-up on the conformance repo: add a python-sdk (main) entry to known-sdks.ts alongside the existing python-sdk-v1.

AI Disclaimer

…baseline

Modernizes the conformance CI to match the typescript-sdk pattern:

- Pin @modelcontextprotocol/conformance via a single workflow-level
  CONFORMANCE_VERSION env var (was 0.1.10/0.1.13, drifted between jobs).
- Add .github/actions/conformance/expected-failures.yml so the runner
  exits 0 on known-failing scenarios and 1 on regressions or stale
  entries. Client baseline is 16 scenarios grouped by SEP; server
  active suite is fully green.
- Drop continue-on-error from both jobs so conformance actually gates.
- Harden run-server.sh with a port-conflict guard, dead-process check,
  and curl --max-time, mirroring typescript-sdk's wrapper.

The server --suite draft step (2026-07-28 scenarios) is a follow-up;
the baseline file already has a placeholder for it.

Supersedes #1921.
@maxisbey maxisbey marked this pull request as ready for review June 15, 2026 14:24
run: >-
  npx --yes @modelcontextprotocol/conformance@"$CONFORMANCE_VERSION" client
  --command 'uv run --frozen python .github/actions/conformance/client.py'
  --suite all

Contributor

I'm actually not sure --suite all is the correct one, counterintuitively. I believe it includes 2 scenarios that aren't actually load-bearing (I think something related to old client auth cases from a pre-2025-11-25 spec version).

Contributor

Land it for now, though; we can fix it if we end up with remaining failures that aren't covered by any SEPs.

- auth/scope-step-up
# SEP-990 (enterprise-managed authorization extension): no fixture handler /
# client support for the token-exchange + JWT bearer flow.
- auth/enterprise-managed-authorization

Contributor

Do we not have EMA (enterprise-managed authorization) in Python at all? Do we potentially need to build this into Python, @pcarleton?

Contributor Author

Yeah, apparently it does not.

@maxisbey maxisbey merged commit a3689ab into main Jun 15, 2026
59 of 61 checks passed
@maxisbey maxisbey deleted the conformance-harness-0.2.0-alpha.3 branch June 15, 2026 14:40

2 participants