Skip to content

Add E2E test for upstreamInject identity propagation after cross-pod Redis restore#5660

Draft
tgrunnagle wants to merge 8 commits into
vmcp-restoresession-identity-issue_5336from
vmcp-restoresession-e2e-issue_5658
Draft

Add E2E test for upstreamInject identity propagation after cross-pod Redis restore#5660
tgrunnagle wants to merge 8 commits into
vmcp-restoresession-identity-issue_5336from
vmcp-restoresession-e2e-issue_5658

Conversation

@tgrunnagle

@tgrunnagle tgrunnagle commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Summary

This PR is stacked on #5650 and adds only the E2E test infrastructure that was blocked by the lack of an in-cluster OIDC provider (see the `TODO(#5336)` removed here). All production code fixes are in #5650.

PR #5650 fixed two bugs in the cross-pod Redis session restore path for multi-replica vMCP deployments:

  1. `RestoreSession` was fabricating a partial `*auth.Identity` with empty `Token`/`UpstreamTokens`; it now passes `nil`.
  2. `loadSession` was using `context.Background()` for the backend `Initialize` handshake during restore; it now threads the incoming request's context (carrying a live `*auth.Identity` with populated `UpstreamTokens`) through `context.WithoutCancel`.

The existing cross-pod E2E tests exercised only anonymous sessions and could not catch regressions in the authenticated restore path. This PR adds the missing coverage.

Closes #5658

Type of change

  • Tests only

Test plan

  • E2E tests (`task test-e2e`)

The new test (Context 5 of `virtualmcp_redis_session_test.go`) exercises the full authenticated restore path:

  1. Deploys Dex as an in-cluster OIDC provider (Option B from E2E test: vMCP cross-pod restore with upstreamInject outgoing auth #5658) with a `mockCallback` connector — no browser consent required.
  2. Deploys an instrumented MCP backend that counts `initialize` JSON-RPC calls and logs `Authorization: Bearer` tokens to a `/stats` endpoint.
  3. Wires the embedded auth server → Dex → `upstreamInject` outgoing auth chain.
  4. Establishes an authenticated session on pod A; asserts `InitializeCalls >= 1` on the backend.
  5. Restores the same session on pod B (triggering `RestoreSession` from Redis); asserts `InitializeCalls >= 2`.

The `InitializeCalls` metric is incremented only on the `initialize` JSON-RPC method, making the threshold strictly load-bearing: it can reach 2 only after both pod A's Initialize and pod B's RestoreSession-triggered Initialize succeed with a valid upstream token. The test fails if `context.WithoutCancel` in `loadSession` is reverted to `context.Background()`.

Changes

File Change
`test/e2e/images/images.go` Added `DexImage` constant (`ghcr.io/dexidp/dex:v2.42.0`)
`test/e2e/thv-operator/virtualmcp/helpers.go` Added `deployDex`/`cleanupDex` helpers, `DexInfo` struct, `InstrumentedMCPBackendScript`, `GetInstrumentedMCPBackendStats`, and `getEmbeddedASToken` (OAuth2 PKCE flow helper)
`test/e2e/thv-operator/virtualmcp/virtualmcp_redis_session_test.go` New Context 5, removed `TODO(#5336)`
`.github/workflows/test-e2e-lifecycle.yml` Pre-pull and load `ghcr.io/dexidp/dex:v2.42.0` into Kind alongside the other test images

Special notes for reviewers

Why Dex: The existing `DeployParameterizedOIDCServer` only serves JWKS/discovery for token validation. The embedded auth server also needs OAuth2 authorization and token endpoints (`/auth`, `/token`) to complete the upstream token exchange that drives `upstreamInject`. Dex with the `mockCallback` connector provides these without browser interaction.

OAuth2 flow from the test process: The embedded AS runs inside vMCP pods (in-cluster) and Dex is in-cluster. `getEmbeddedASToken` bridges this by combining port-forwards (for the embedded AS endpoints) and Dex's NodePort (for external reachability), rewriting in-cluster URLs to local addresses at each redirect hop of the PKCE flow.

Shared Redis: The test reuses the session-storage Redis instance for the embedded AS's upstream token store (different key prefix), avoiding a second Redis deployment.

CI: The workflow triggers automatically on PRs touching `test/e2e/thv-operator/**`. Ginkgo runs all tests in that tree with `--procs=8` and no label filter, so Context 5 runs as part of every CI check. The Dex image is now pre-loaded into Kind in the same parallel pull step as the other test images.

Generated with Claude Code

tgrunnagle and others added 2 commits June 26, 2026 13:37
Implements issue #5658 (Option B: Dex as in-cluster OIDC provider).

- Add DexImage constant to centralized E2E image registry
- Add deployDex/cleanupDex helpers to helpers.go: deploy Dex with
  mockCallback connector (auto-approves, no browser interaction) +
  NodePort service for external test-process access
- Add InstrumentedMCPBackendScript: a Python Flask MCP server that
  correctly handles MCP JSON-RPC and logs every inbound Bearer token
  to a /stats endpoint
- Add getEmbeddedASToken: performs the full OAuth2 PKCE authorization
  code flow from the test process against the embedded AS, rewriting
  in-cluster URLs to NodePort/port-forward addresses at each hop
- Add "When cross-pod session restore preserves upstreamInject identity"
  Context in virtualmcp_redis_session_test.go that verifies the fix
  from #5650: context propagation through loadSession ensures the
  incoming identity (with UpstreamTokens from Dex, keyed by tsid in
  shared Redis) reaches RestoreSession, so upstreamInject can inject
  the Dex token into backend Initialize requests after cross-pod restore
- Remove the TODO(#5336) comment; the test now covers this gap

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Mark InstrumentedBackendScript deprecated in favor of InstrumentedMCPBackendScript
- Wrap Dex NodePort read in Eventually to handle async assignment
- Fix rewriteURLBase to use exact hostname equality instead of substring Contains

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added the size/XL Extra large PR: 1000+ lines changed label Jun 26, 2026

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@codecov

codecov Bot commented Jun 26, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.34%. Comparing base (533301f) to head (53a0f9d).

Additional details and impacted files
@@                             Coverage Diff                             @@
##           vmcp-restoresession-identity-issue_5336    #5660      +/-   ##
===========================================================================
- Coverage                                    70.34%   70.34%   -0.01%     
===========================================================================
  Files                                          649      649              
  Lines                                        66175    66091      -84     
===========================================================================
- Hits                                         46553    46490      -63     
+ Misses                                       16271    16255      -16     
+ Partials                                      3351     3346       -5     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@tgrunnagle tgrunnagle changed the base branch from main to vmcp-restoresession-identity-issue_5336 June 26, 2026 21:11
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Jun 26, 2026
@github-actions github-actions Bot dismissed their stale review June 26, 2026 21:11

PR size has been reduced below the XL threshold. Thank you for splitting this up!

@github-actions

Copy link
Copy Markdown
Contributor

✅ PR size has been reduced below the XL threshold. The size review has been dismissed and this PR can now proceed with normal review. Thank you for splitting this up!

1. Audience mismatch: the operator derives AllowedAudiences from
   oidcRef.ResourceURL; without an explicit ResourceURL it falls back to
   http://<vmcpName>.svc:4483 (the resource name, not the service name).
   Set ResourceURL = vmcpServiceURL explicitly so AllowedAudiences matches
   the Audience set in MCPOIDCConfigReference, satisfying the validator.

2. Backend container crashes: the proxy runner's configureContainer() calls
   WithArgs(MCPServer.Spec.Args), which overwrites the Args set in the
   PodTemplateSpec patch. Pass InstrumentedMCPBackendScript via
   v1beta1test.WithArgs() instead, leaving only the Command override
   (sh -c) in the patch where it is preserved.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Jun 26, 2026
Adding "net/url" to helpers.go caused the revive linter to flag existing
"url" variable/parameter names as import-shadowing. Rename them to avoid
the conflict: url -> healthURL in checkHTTPHealthReady, url -> rawURL in
getAndDecodeJSON.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Jun 26, 2026

@tgrunnagle tgrunnagle left a comment

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

Finding Severity Score Action
F01: Pod B BearerTokenRequests >= 2 not load-bearing HIGH 8/10 Fix
F02: rewriteURLBase allows port-mismatch bypass MEDIUM 9/10 Fix
F03: Intermediate bearer assertion satisfied by ListTools MEDIUM 7/10 Fix
F04: rand.Read return value discarded without comment MEDIUM 7/10 Fix
F05: HTTP response bodies not drained before close LOW 8/10 Fix
F06: registerOAuthClient errors not wrapped with %w LOW 9/10 Fix

Holistic Assessment

The infrastructure design is solid — Dex with mockCallback, the 9-step PKCE flow, and InstrumentedMCPBackendScript are all well-executed and correctly model the out-of-cluster test process accessing in-cluster endpoints. The Dex helper mirrors deployRedis/cleanupRedis cleanly, and the Deprecated annotation on InstrumentedBackendScript correctly directs future tests to InstrumentedMCPBackendScript.

The critical issue (F01) is that the final assertion uses BearerTokenRequests >= 2, which can be satisfied by pod A's Initialize + ListTools before pod B's RestoreSession fires — meaning the test would pass even if context.WithoutCancel in loadSession were reverted to context.Background(). The fix is to assert on InitializeCalls >= 2 instead, which is incremented only on the initialize JSON-RPC method and is strictly load-bearing.

Generated with Claude Code

Comment thread test/e2e/thv-operator/virtualmcp/virtualmcp_redis_session_test.go Outdated
Comment thread test/e2e/thv-operator/virtualmcp/virtualmcp_redis_session_test.go Outdated
Comment thread test/e2e/thv-operator/virtualmcp/helpers.go
Comment thread test/e2e/thv-operator/virtualmcp/helpers.go
Comment thread test/e2e/thv-operator/virtualmcp/helpers.go
Comment thread test/e2e/thv-operator/virtualmcp/helpers.go
tgrunnagle and others added 2 commits June 26, 2026 14:42
BearerTokenRequests is incremented by every backend request that carries a
Bearer token — including ListTools, CallTool, and Initialize. On pod A,
both the Initialize and the subsequent ListTools call inject the upstream
token via upstreamInject, so BearerTokenRequests can reach 2 before pod B's
RestoreSession fires. This means the final assertion was not load-bearing:
the test passed even if context.WithoutCancel were reverted.

InitializeCalls is incremented only on the "initialize" JSON-RPC method, so
it reaches 2 only after both pod A's Initialize and pod B's RestoreSession-
triggered Initialize succeed. Replace both assertions with InitializeCalls.

Addresses #5660 review comments:
- HIGH test/e2e/thv-operator/virtualmcp/virtualmcp_redis_session_test.go (3484255613): final assertion not load-bearing
- MEDIUM test/e2e/thv-operator/virtualmcp/virtualmcp_redis_session_test.go (3484255624): intermediate assertion inflated by ListTools

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
F02 rewriteURLBase: the fallback path stripped ports and compared only
hostnames, so same-hostname/different-port URLs would silently bypass the
guard. Compare both hostname and port after splitting.

F04 generatePKCEPair: document why rand.Read return values are discarded —
crypto/rand.Read cannot return an error since Go 1.20.

F05 getEmbeddedASToken: drain the three intermediate 302 response bodies
(steps 3, 5, 7) before closing, per the go-style.md connection-reuse rule.

F06 registerOAuthClient: wrap bare errors with fmt.Errorf so callers get
call-site context (which DCR step failed).

Addresses #5660 review comments:
- MEDIUM helpers.go (3484255633): rewriteURLBase port-mismatch bypass
- MEDIUM helpers.go (3484255648): rand.Read discard undocumented
- LOW helpers.go (3484255652): response bodies not drained before close
- LOW helpers.go (3484255643): bare errors in registerOAuthClient

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Jun 26, 2026
ghcr.io/dexidp/dex:v2.42.0 is used by Context 5 (upstreamInject cross-pod
restore test) but was not pre-pulled like the other test images. Add it to
the parallel pull and kind load steps so the node doesn't go to the internet
mid-test, consistent with every other image in the suite.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Jun 26, 2026
The proxy runner's configureContainer applies ReadOnlyRootFilesystem to
the inner mcp container via the platform-aware security context. This
causes pip install flask to fail — Python's tempfile module probes
/tmp, /var/tmp, /usr/tmp, and / in order, and finds none writable.

Mount a writable emptyDir volume at /tmp via the PodTemplateSpec patch
so pip can create build-tracker temp directories during installation.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot removed the size/L Large PR: 600-999 lines changed label Jun 26, 2026
@github-actions github-actions Bot added the size/L Large PR: 600-999 lines changed label Jun 26, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/L Large PR: 600-999 lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

E2E test: vMCP cross-pod restore with upstreamInject outgoing auth

1 participant