oauth: H-2 — refresh-token reuse detection (gating mode) #106
Closed · BorisTyshkevich wants to merge 4 commits
Conversation
Force-pushed 804528d → 5302328
Refs #103. Embed `jti` + `family_id` into MCP-issued refresh JWEs in gating mode; persist consumed jtis and revoked families in two ClickHouse tables in the new `altinity` database; revoke the entire family when a previously-redeemed jti is replayed. This closes the silent 30-day replay window for captured gating-mode refresh tokens (OAuth 2.1 §4.13.2, MCP auth 2025-11-25).

Opt-in via `oauth.refresh_revokes_tracking` (default false). Operators must run `docs/sql/oauth-state.sql` and `GRANT INSERT, SELECT` on `altinity.*` to `mcp_service` before flipping the flag — startup refuses to boot if the flag is on with `mode: forward`, `clickhouse.read_only: true`, or an empty `clickhouse.database`.

Hard fail with ERR-level zerolog on every CH error path; never silently mint a new pair when state lookup fails.

Legacy refresh tokens (lacking `family_id`/`jti`) are rejected with `invalid_grant` on first redemption after deploy; clients re-auth once. Auto-promotion was rejected because it would let a captured pre-deploy token be replayed exactly once before the family starts tracking.

Forward mode is intentionally out of scope: Auth0 itself rotates and detects reuse upstream, and our wrapper stays stateless to keep horizontal scaling cheap.

Tests cover HappyPath, ReplayRevokesFamily, LegacyTokenRejected, StateUnreachable, plus three startup-validation rejections (forward mode, read_only=true, empty database). All run against an in-memory fake `oauth_state.Store`; the live-CH SQL round trip is validated by the otel-mcp deployment's negative replay test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
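For orientation, the family mechanics this commit describes might be sketched as follows. This is a minimal illustration, not the PR's code: `RefreshClaims`, `Store`, and `Redeem` are hypothetical names, and the actual claim shape lives inside the JWE.

```go
// Hypothetical sketch of gating-mode refresh-token families.
package oauthstate

import (
	"context"
	"errors"
	"time"

	"github.com/google/uuid"
)

// ErrRefreshReused signals a replayed jti; the whole family is revoked.
var ErrRefreshReused = errors.New("refresh token reuse detected")

// Store persists consumed jtis and revoked families (the two CH tables).
type Store interface {
	// ClaimJTI marks jti consumed exactly once; a second claim revokes
	// the family and returns ErrRefreshReused.
	ClaimJTI(ctx context.Context, familyID, jti string) error
}

// RefreshClaims is the reuse-detection subset of the refresh-JWE claims.
type RefreshClaims struct {
	Subject  string    // impersonated user identity
	JTI      string    // unique per minted refresh token
	FamilyID string    // constant across the rotation chain
	Expiry   time.Time // hard cap (default 30 days), rotation or not
}

// Redeem consumes one refresh token and mints its successor in the family.
func Redeem(ctx context.Context, st Store, c RefreshClaims) (RefreshClaims, error) {
	if c.JTI == "" || c.FamilyID == "" {
		// Legacy pre-deploy token: reject; the client re-authenticates once.
		return RefreshClaims{}, errors.New("invalid_grant")
	}
	if err := st.ClaimJTI(ctx, c.FamilyID, c.JTI); err != nil {
		return RefreshClaims{}, err // replay or state failure: never mint
	}
	// The successor carries a fresh jti but the same family_id.
	return RefreshClaims{
		Subject:  c.Subject,
		JTI:      uuid.NewString(),
		FamilyID: c.FamilyID,
		Expiry:   c.Expiry,
	}, nil
}
```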
Force-pushed 6e6598c → 0f3318d
Pre-merge external review of #106 (H-2) flagged that the previous SELECT-then-INSERT in pkg/oauth_state/store.go is not race-safe: two parallel redemptions of the same captured refresh JWE both observe count()=0 on the SELECT, both succeed at the INSERT, and the family forks before any reuse is detected. From that point an attacker who stole one token has their own valid branch of the family chain and never needs the original token again.

RFC 9700's refresh-token-rotation guidance requires atomic *consume-before-mint*: a model where the second redeemer cannot observe "not consumed" while the first redeemer's claim is still uncommitted is the only one that prevents the parallel-replay fork.

The fix uses ClickHouse KeeperMap with `keeper_map_strict_mode=1` — a Raft-linearizable KV that throws a duplicate-key exception on collision. Concurrent INSERTs are serialised through the Keeper leader; exactly one wins, the rest receive `KEEPER_EXCEPTION: Transaction failed (Node exists)` (Code 999). The losing pods record the family revocation and reject the request.

The earlier H-2 plan rejected KeeperMap with the rationale "couples OAuth correctness to Keeper". That rationale was wrong: ReplicatedMergeTree is already on Keeper for replication coordination. KeeperMap doesn't add coupling — it exposes the linearizable primitive Keeper already provides.

Implementation:

* pkg/oauth_state/store.go reshape: cheap revoked-family check first (point lookup on ReplicatedMergeTree), then atomic INSERT into the KeeperMap consumed-jtis table with `SETTINGS keeper_map_strict_mode=1` on the statement (table-level SETTINGS is silently ignored — verified empirically against CH 26.1.6). On duplicate-key exception, record the family revocation best-effort and return ErrRefreshReused. The revoke-table INSERT is non-fatal: KeeperMap atomicity alone is what enforces single-claim; revoked_families is the audit trail.
* isKeeperMapDuplicateKeyError matches the exact Keeper error string ("Transaction failed (Node exists)") observed against 26.1.6 stock and 26.1.11.20001.altinityantalya. The phrase is specific to Keeper Raft create-if-absent rejections; other Keeper errors (timeout, leader loss) produce different messages.
* New pkg/oauth_state/cleanup.go runs an in-process goroutine on an hourly ticker issuing `ALTER TABLE altinity.oauth_refresh_consumed_jtis DELETE WHERE consumed_at < now() - INTERVAL 35 DAY`. KeeperMap doesn't support CH-native TTL, so cleanup is application-side. Multi-pod deployments all run their own loops — duplicate ALTER DELETEs are harmless. Failures log at WARN, never panic; cleanup failure does not affect the security control (KeeperMap atomicity does).
* pkg/server/server.go starts the cleanup loop when RefreshRevokesTracking is enabled. The serverCleanupRunner adapter reads s.refreshStateStore on each tick, so SetRefreshStateStore (test-only) takes effect on the cleanup loop too.
* docs/sql/oauth-state.sql switches consumed_jtis to ENGINE = KeeperMap('/altinity_mcp/oauth_refresh_consumed_jtis') PRIMARY KEY jti. revoked_families stays ReplicatedMergeTree (INSERTs are idempotent for revocation; CH-native TTL handles cleanup). Adds ALTER DELETE to the mcp_service grants for the cleanup goroutine.
* docs/oauth-refresh-reuse-detection.md and the operator wiki (acm/mcp/.wiki/mcp-oauth-debugging.md) document the new keeper_map_path_prefix CH config drop-in prerequisite, the query-level SETTINGS load-bearing constraint, and the engine-mixed storage shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
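A minimal sketch of that claim path, assuming database/sql against ClickHouse. Column names and the helper shape are assumptions (the real DDL is in docs/sql/oauth-state.sql), and the revoke write is best-effort here exactly as this commit describes; a later commit hardens it:

```go
// Hypothetical sketch of the atomic consume-before-mint claim.
package oauthstate

import (
	"context"
	"database/sql"
	"errors"
	"strings"
)

// ErrRefreshReused maps to HTTP 400 invalid_grant with reuse_detected.
var ErrRefreshReused = errors.New("refresh token reuse detected")

// isKeeperMapDuplicateKeyError matches the Keeper create-if-absent rejection
// observed on CH 26.1.x; other Keeper failures produce different messages.
func isKeeperMapDuplicateKeyError(err error) bool {
	return err != nil && strings.Contains(err.Error(), "Transaction failed (Node exists)")
}

// claimJTI is the atomic consume-before-mint step.
func claimJTI(ctx context.Context, db *sql.DB, familyID, jti string) error {
	// Cheap pre-check: a revoked family rejects every member outright.
	var revoked uint64
	if err := db.QueryRowContext(ctx,
		`SELECT count() FROM altinity.oauth_refresh_revoked_families WHERE family_id = ?`,
		familyID).Scan(&revoked); err != nil {
		return err // hard fail: never mint when state is unreadable
	}
	if revoked > 0 {
		return ErrRefreshReused
	}
	// Atomic claim: strict mode turns the INSERT into a Keeper
	// create-if-absent, so exactly one concurrent redeemer wins. SETTINGS
	// must be on the statement; table-level SETTINGS is silently ignored.
	_, err := db.ExecContext(ctx,
		`INSERT INTO altinity.oauth_refresh_consumed_jtis (jti, family_id, consumed_at)
		 SETTINGS keeper_map_strict_mode = 1
		 VALUES (?, ?, now())`,
		jti, familyID)
	if isKeeperMapDuplicateKeyError(err) {
		// Losing redeemer: record the revocation (best-effort at this stage
		// of the PR; a later commit makes it fail closed), then reject.
		_, _ = db.ExecContext(ctx,
			`INSERT INTO altinity.oauth_refresh_revoked_families (family_id, reason, revoked_at)
			 VALUES (?, 'reuse_detected', now())`, familyID)
		return ErrRefreshReused
	}
	return err
}
```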
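And a sketch of the application-side cleanup loop from pkg/oauth_state/cleanup.go as described above (hourly ticker, 35-day horizon, WARN on failure); the function name and wiring are illustrative:

```go
// Hypothetical sketch of the cleanup goroutine; KeeperMap has no native TTL.
package oauthstate

import (
	"context"
	"database/sql"
	"time"

	"github.com/rs/zerolog/log"
)

// runCleanup deletes consumed jtis older than the refresh-JWE lifetime.
// Every pod runs its own loop; duplicate ALTER DELETEs are harmless.
func runCleanup(ctx context.Context, db *sql.DB) {
	ticker := time.NewTicker(time.Hour)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			_, err := db.ExecContext(ctx,
				`ALTER TABLE altinity.oauth_refresh_consumed_jtis
				 DELETE WHERE consumed_at < now() - INTERVAL 35 DAY`)
			if err != nil {
				// Non-fatal: the security control is KeeperMap atomicity,
				// not this loop. WARN and retry on the next tick.
				log.Warn().Err(err).Msg("oauth state cleanup failed")
			}
		}
	}
}
```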
Tests:

* TestOAuthRefreshReuseDetection_AtomicConcurrentClaim hammers the refresh handler with 50 goroutines redeeming the SAME refresh JWE in parallel via a barrier release. Asserts exactly 1 success and 49 invalid_grant with reuse_detected. The fake store synchronises via sync.Mutex (faithful to KeeperMap's exactly-one-winner semantics). This is the regression test for the parallel-replay race.
* fakeRefreshStateStore extended with a no-op Cleanup to satisfy the Store interface change.

Verified against otel CH cluster 9572 during development:

* keeper_map_path_prefix added via acmctl setting create + push
* DROP+CREATE migrated consumed_jtis to KeeperMap
* GRANT INSERT, SELECT, ALTER DELETE applied
* Smoke INSERT-twice produced KEEPER_EXCEPTION as expected

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
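The barrier-release pattern that test uses might look like the following. This is a self-contained approximation driving a fake store directly, not the PR's actual test, which drives the refresh handler:

```go
// Approximation of the barrier-release concurrency regression test.
package oauthstate_test

import (
	"context"
	"errors"
	"sync"
	"testing"
)

var errReused = errors.New("refresh token reuse detected")

// fakeStore mimics KeeperMap strict mode: the first claim of a jti wins,
// every later claim reports reuse. A mutex is faithful to the linearizable
// create-if-absent the real table provides.
type fakeStore struct {
	mu       sync.Mutex
	consumed map[string]bool
}

func (f *fakeStore) claimJTI(_ context.Context, jti string) error {
	f.mu.Lock()
	defer f.mu.Unlock()
	if f.consumed[jti] {
		return errReused
	}
	f.consumed[jti] = true
	return nil
}

func TestAtomicConcurrentClaim(t *testing.T) {
	st := &fakeStore{consumed: map[string]bool{}}
	const n = 50
	start := make(chan struct{}) // barrier: release all redeemers at once
	errs := make(chan error, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-start // every goroutine blocks here until the barrier opens
			errs <- st.claimJTI(context.Background(), "jti-R0")
		}()
	}
	close(start) // release
	wg.Wait()
	close(errs)
	wins := 0
	for err := range errs {
		switch err {
		case nil:
			wins++
		case errReused: // expected for the 49 losers
		default:
			t.Fatalf("unexpected error: %v", err)
		}
	}
	if wins != 1 {
		t.Fatalf("want exactly 1 successful claim, got %d", wins)
	}
}
```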
…eplay)

Two scripts + README covering refresh-token reuse detection end-to-end against a live deployment, complementing the unit test TestOAuthRefreshReuseDetection_AtomicConcurrentClaim:

- scripts/qa/h2-replay-test.sh — drives the OAuth dance interactively (DCR + Auth0 browser login captured by an inline python http.server on localhost:8910), then exercises the sequential reuse-detection path: refresh R0 (expect 200 + R1), replay R0 (expect 400 invalid_grant reuse_detected), bonus replay R1 (expect 400 too — family-wide revocation per RFC 9700).
- scripts/qa/h2-parallel-test.sh — same OAuth dance, then fires N concurrent redemptions of the same R0 via backgrounded curl + wait. Asserts exactly 1 × 200 OK + N-1 × 400 invalid_grant; non-zero exit on anomaly. This is the test that proves KeeperMap strict mode's atomicity: the SELECT-then-INSERT design would fail it; the new code shouldn't. Verified passing against otel-mcp during development.
- scripts/qa/README.md — when to run, prerequisites (jq + openssl + python3 + browser), state-table verification SQL, failure-mode triage.

Verified against otel-mcp (cd65134, oauth-h2-atomic image):

- replay-test → consumed_jtis row (R0's jti) + revoked_families row (family_id, reason=reuse_detected); R1 also rejected
- parallel-test (N=50) → exactly 1 × 200, 49 × 400 reuse_detected; consumed_jtis +1, revoked_families +1

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…im re-check

Second-pass review of #106 flagged that the consumed-jti claim was made atomic via KeeperMap, but the revoked-families table stayed ReplicatedMergeTree and was checked on every refresh. ReplicatedMergeTree replication is asynchronous: a recent revoke INSERT on pod A can be invisible to pod B's read on a different CH replica for the duration of the replication lag. That window lets the winner of a forked family keep refreshing on different pods/replicas while the revoke catches up — the same parallel-replay window the KeeperMap consumed-jti claim was supposed to close.

Three fixes:

1. revoked_families becomes KeeperMap, no strict mode (idempotent overwrite — parallel losers writing the same family_id is fine). KeeperMap reads are linearizable through Keeper Raft, so a revoke write on pod A is visible to pod B's next pre-check regardless of CH replica routing.
2. Revoke INSERT failures hard-fail instead of WARN-and-return. The previous code logged WARN and returned ErrRefreshReused even if the revoke row didn't land — leaving the winning fork alive on every subsequent refresh. RFC 9700 family-wide revocation must be authoritative, not best-effort. On revoke INSERT failure we now return a generic error which the handler turns into HTTP 500 server_error; operators see an ERR-level "SECURITY: reuse detected but revoke INSERT failed" log and page on the underlying Keeper or grant issue.
3. Post-claim revoked re-check. Step 1 (pre-check) and step 2 (atomic claim) are not a single Keeper transaction — there's a microsecond TOCTOU window where another pod can revoke between them. After a successful claim we re-check revocation; if the family was revoked during the window we return ErrRefreshReused (the consumed-jti slot is spent but no token is minted). This was the residual window that, even with KeeperMap-backed revoked_families, the pre-check alone did not close.

Schema migration applied to otel CHI 9572: DROP + ReplicatedMergeTree → CREATE + KeeperMap. 2 existing audit rows discarded; the affected families' refresh tokens are still rejected by the consumed-jti pre-check on the new path.

The cleanup goroutine now runs ALTER TABLE … DELETE on both KeeperMap tables on the same hourly ticker. Failures on either are wrapped and returned together so observability isn't masked.

Tests:

* TestOAuthRefreshReuseDetection_RevokeInsertFailureHardFails pins the new hard-fail behavior. The fake store grew a `failRevoke` knob: when armed, the duplicate-detected branch returns a generic error instead of ErrRefreshReused, simulating a Keeper write failure on the revoke. The handler must respond with 500 server_error, NOT 400 invalid_grant — otherwise the regression silently restores the "winning fork stays alive" hole.
* Existing TestOAuthRefreshReuseDetection_AtomicConcurrentClaim (50 concurrent redeemers, exactly 1 winner) still passes 10× runs + race detector. The fake's reuse-detected branch now also defaults to recording the revoke before returning ErrRefreshReused, matching the new production semantics.
* Documentation updated in docs/oauth-refresh-reuse-detection.md ("Why KeeperMap (for both tables)", "Revoke must persist", "Post-claim revocation re-check") and the operator wiki to call out the KeeperMap-on-both-tables shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
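Putting the three fixes together, the hardened redemption order could be sketched as follows; the `familyStore` interface, function names, and the sentinel error are illustrative, not the PR's identifiers:

```go
// Hypothetical sketch of the hardened redemption order: revoked pre-check,
// atomic claim, post-claim re-check, with a revoke write that fails closed.
package oauthstate

import (
	"context"
	"errors"
	"fmt"
)

var (
	ErrRefreshReused = errors.New("refresh token reuse detected")
	errDuplicateJTI  = errors.New("jti already consumed") // KeeperMap Node-exists
)

type familyStore interface {
	familyRevoked(ctx context.Context, familyID string) (bool, error)
	claimJTI(ctx context.Context, familyID, jti string) error // errDuplicateJTI on replay
	revokeFamily(ctx context.Context, familyID, reason string) error
}

func redeem(ctx context.Context, s familyStore, familyID, jti string) error {
	// 1. Pre-check: linearizable read of KeeperMap-backed revoked_families.
	if revoked, err := s.familyRevoked(ctx, familyID); err != nil {
		return err
	} else if revoked {
		return ErrRefreshReused
	}
	// 2. Atomic claim: strict-mode INSERT, exactly one concurrent winner.
	if err := s.claimJTI(ctx, familyID, jti); err != nil {
		if !errors.Is(err, errDuplicateJTI) {
			return err
		}
		// Replay detected. The revoke MUST land; a failure here surfaces as
		// HTTP 500 server_error, never 400 invalid_grant, or the winning
		// fork of the family stays alive.
		if rerr := s.revokeFamily(ctx, familyID, "reuse_detected"); rerr != nil {
			return fmt.Errorf("SECURITY: reuse detected but revoke INSERT failed: %w", rerr)
		}
		return ErrRefreshReused
	}
	// 3. Post-claim re-check: steps 1 and 2 are not one Keeper transaction,
	// so another pod may have revoked the family in between. The jti slot
	// is spent either way; no token is minted for a revoked family.
	if revoked, err := s.familyRevoked(ctx, familyID); err != nil {
		return err
	} else if revoked {
		return ErrRefreshReused
	}
	return nil // caller mints the successor pair
}
```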
Summary
Closes #103.
H-2 from the OAuth security review: refresh-token reuse detection in gating mode. Embeds `jti` + `family_id` into MCP-issued refresh JWEs and persists consumed jtis + revoked families in two ClickHouse tables. When a previously-redeemed jti is replayed, the entire token family is invalidated and the user re-authenticates.

This is the third of a 3-PR stack.
Threat closed
Without H-2, a captured refresh JWE can be redeemed many times in parallel — each redemption mints a fresh access+refresh pair until the JWE's `exp` (default 30 days). An attacker who briefly captures a refresh token (leaked log, intermediate proxy, browser-extension compromise) gets a silent 30-day window of access against the legitimate user's identity. The legitimate user has no signal.

With H-2, the moment the legitimate client refreshes after the attacker (or vice versa), the loser's jti is in the consumed set, the family is revoked, both parties are rejected on subsequent attempts, and the user re-auths once. The auth event is a clear signal.

OAuth 2.1 §4.13.2 and MCP authorization 2025-11-25 require this. RFC 6749 §10.4 frames the same requirement at SHOULD level for OAuth 2.0.
Approach
Every MCP-issued refresh JWE carries a `jti` and a `family_id`. Two ClickHouse tables in a new `altinity` database:

- `altinity.oauth_refresh_consumed_jtis` (keyed by `jti`)
- `altinity.oauth_refresh_revoked_families` (keyed by `family_id`)

Both `[Replicated]MergeTree` with `TTL ... + INTERVAL 35 DAY` as a storage bound. Engine-agnostic Go code: SELECT and INSERT work the same on both.

Per refresh: 1 combined `SELECT count()` (two subqueries) + 1 `INSERT` (lookup sketched below). At realistic load (~hundreds of clients, refresh ~1×/h) ≈ 0.5 qps cluster-wide. Lookups happen only on `grant_type=refresh_token` — never on regular MCP requests (those validate access tokens locally via HMAC).
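The combined lookup might look like this in Go, assuming database/sql and the table shapes above; the function name and binding style are illustrative, not copied from docs/sql/oauth-state.sql:

```go
// Hypothetical sketch of the single-round-trip reuse check.
package oauthstate

import (
	"context"
	"database/sql"
)

// refreshState answers both reuse-detection questions in one query.
func refreshState(ctx context.Context, db *sql.DB, jti, familyID string) (consumed, revoked bool, err error) {
	var c, r uint64
	err = db.QueryRowContext(ctx, `
		SELECT
		    (SELECT count() FROM altinity.oauth_refresh_consumed_jtis WHERE jti = ?) AS consumed,
		    (SELECT count() FROM altinity.oauth_refresh_revoked_families WHERE family_id = ?) AS revoked`,
		jti, familyID).Scan(&c, &r)
	return c > 0, r > 0, err
}
```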
Out of scope (intentionally):

- `mode: forward` + the flag.
- Other engines: `[Replicated]MergeTree` is the right tool at this query volume.

Operator prerequisite
Run `docs/sql/oauth-state.sql` (clustered or single-node flavor) as a CH admin user before flipping `oauth.refresh_revokes_tracking: true` in helm values. It creates the `altinity` database, both tables, and `GRANT INSERT, SELECT ON altinity.* TO mcp_service`.
When the flag is on (see the startup-validation sketch below):

- startup refuses `mode: forward` or `clickhouse.read_only: true`
- the `mcp_service` user cannot have a `READONLY=1` profile (operator's responsibility — verify with `system.users` / `system.settings_profiles`)
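A minimal sketch of that startup gate, assuming a flattened hypothetical `Config`; the field names are illustrative, not the PR's actual config structs:

```go
// Hypothetical sketch of the boot-time configuration validation.
package server

import "errors"

type Config struct {
	OAuthMode              string // "gating" or "forward"
	RefreshRevokesTracking bool
	ClickHouseReadOnly     bool
	ClickHouseDatabase     string
}

// validateRefreshTracking refuses to boot in any configuration where the
// reuse-detection state tables cannot be written.
func validateRefreshTracking(c Config) error {
	if !c.RefreshRevokesTracking {
		return nil
	}
	switch {
	case c.OAuthMode == "forward":
		return errors.New("oauth.refresh_revokes_tracking requires gating mode; forward mode delegates rotation to Auth0")
	case c.ClickHouseReadOnly:
		return errors.New("oauth.refresh_revokes_tracking needs INSERT access; clickhouse.read_only: true cannot persist consumed jtis")
	case c.ClickHouseDatabase == "":
		return errors.New("oauth.refresh_revokes_tracking requires clickhouse.database to locate the state tables")
	}
	return nil
}
```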
Cluster name `all-replicated` is convention for a single logical shard spanning all replicas (sharding-compatible). Documented in `docs/oauth-refresh-reuse-detection.md`.

Legacy-token policy
Refresh tokens issued before deploy lack `family_id`/`jti` → rejected with `invalid_grant` on first redemption. Clients re-authenticate once. Auto-promotion was rejected because it would let a captured pre-deploy token be replayed exactly once before tracking starts.
Failure modes — hard fail with ERR

Every CH-state failure path returns HTTP 500 `server_error` and emits an ERR-level zerolog line. No silent fallthrough. Detailed table in `docs/oauth-refresh-reuse-detection.md` § Failure modes.
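The error contract might be sketched as a handler helper like this; the function and the sentinel error are hypothetical, only the status codes and the zerolog ERR line follow the description above:

```go
// Hypothetical sketch of the failure-mode contract: reuse → 400
// invalid_grant, any state failure → 500 server_error plus an ERR log.
package server

import (
	"errors"
	"net/http"

	"github.com/rs/zerolog/log"
)

var errRefreshReused = errors.New("refresh token reuse detected")

func writeRefreshError(w http.ResponseWriter, err error) {
	w.Header().Set("Content-Type", "application/json")
	if errors.Is(err, errRefreshReused) {
		// Family revoked; the client must run the full auth flow again.
		w.WriteHeader(http.StatusBadRequest)
		w.Write([]byte(`{"error":"invalid_grant","error_description":"reuse_detected"}`))
		return
	}
	// No silent fallthrough: a state failure must never mint tokens.
	log.Error().Err(err).Msg("oauth refresh state failure; refusing to mint")
	w.WriteHeader(http.StatusInternalServerError)
	w.Write([]byte(`{"error":"server_error"}`))
}
```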
Running on `otel-mcp.demo.altinity.cloud` since this commit landed (image `wip-oauth-h2-81b69b3`):

- `mcp_service` (NOT a readonly profile)
- `currentUser()` returns the impersonated email, confirming the H-1 + H-2 + cluster_secret chain intact
- a `consumed_jti` row will land naturally when the otel connector's current 1h-TTL access token expires (token issued before H-2 deploy, exp ~17:28 UTC)

Test plan
- `pkg/oauth_state` unit tests cover happy-path / replay-revokes-family / legacy-token-rejected / state-unreachable / config-rejection (forward+flag, read_only+flag, empty database+flag)
- an in-memory fake `oauth_state.Store` exercises the refresh-handler control flow without standing up a CH harness
- live replay against `/oauth/token` with a captured refresh JWE (deferred — easy to drive once anyone wants to extract a refresh JWE from claude.ai's local cache)

See also
- `docs/oauth-refresh-reuse-detection.md` — full design + threat model + DDL examples + grants + failure modes + rollback
- `docs/sql/oauth-state.sql` — operator DDL (clustered + single-node flavors)
- `/Users/Workspaces/acm/mcp/.wiki/mcp-oauth-debugging.md` § H-2 — operator wiki rollout checklist + per-deployment ops

🤖 Generated with Claude Code