Skip to content

Configure rate limits on VirtualMCPServer PR A#5079

Open
Sanskarzz wants to merge 7 commits into
stacklok:mainfrom
Sanskarzz:ratelimitingVMCP
Open

Configure rate limits on VirtualMCPServer PR A#5079
Sanskarzz wants to merge 7 commits into
stacklok:mainfrom
Sanskarzz:ratelimitingVMCP

Conversation

@Sanskarzz
Copy link
Copy Markdown
Contributor

@Sanskarzz Sanskarzz commented Apr 27, 2026

PR A — shared→global rename + CRD schema for vMCP

Summary

Add the VirtualMCPServer rate-limiting API surface in line with THV-0057.

This change introduces spec.config.rateLimiting support on VirtualMCPServer. The field lives on the vMCP runtime config type so the operator can serialize it into the generated config.yaml and the vMCP process can read it at startup.

The CRD validation rejects invalid combinations early, including per-user limits without OIDC incoming auth and any rate limiting without Redis session storage. Generated CRDs and operator docs were updated accordingly.

Runtime vMCP enforcement is intentionally split out of this PR and will be handled in a follow-up PR.

Fixes #4552

Type of change

  • Bug fix
  • New feature
  • Refactoring (no behavior change)
  • Dependency update
  • Documentation
  • Other (describe):

Test plan

  • Unit tests (task test)
  • E2E tests (task test-e2e)
  • Linting (task lint-fix)
  • Manual testing (describe below) I will describe full manual testing in follow-up PR

API Compatibility

This PR adds optional VirtualMCPServer.spec.rateLimiting configuration to the v1beta1 API. This is an additive, backward-compatible API change.

Existing VirtualMCPServer resources do not need migration. Cluster admins can opt in by adding spec.rateLimiting and must configure Redis-backed spec.sessionStorage when rate
limiting is enabled. Per-user limits additionally require OIDC incoming auth.

  • This PR does not break the v1beta1 API, OR the api-break-allowed label is applied and the migration guidance is described above.

Changes

  • Add VirtualMCPServer.spec.config.rateLimiting on the vMCP config type
  • Add CEL validation for Redis session storage and OIDC-required per-user limits
  • Preserve rate-limit config through CRD-to-vMCP config conversion
  • Keep pkg/ratelimit using the CRD RateLimitConfig type directly
  • Add unit coverage for vMCP config serialization and converter passthrough

Large PR Justification

This is a new feature package with a large test suite, and it needs to land as one coherent phase.

Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@github-actions github-actions Bot added the size/XL Extra large PR: 1000+ lines changed label Apr 27, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 27, 2026
Copy link
Copy Markdown
Contributor

@jerm-dro jerm-dro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey Sanskar, thanks for putting this together — the end-to-end shape is right: top-level spec.rateLimiting with CEL validation, converter wiring, and the vMCP middleware integration with optimizer-aware tool name resolution. The manual testing scenario in the description is thorough. Requesting changes on two axes before we go further.

1. Drop pkg/ratelimit/config — use the CRD type directly

The proxyrunner path already uses *v1beta1.RateLimitConfig directly throughout pkg/ratelimit on main today. vMCP can do the same — just pass the CRD type through. The new package duplicates the CRD types 1:1, still imports metav1.Duration, and adds conversion boilerplate (ToInternal(), hand-written DeepCopy, unused EffectiveGlobal() methods) without solving a real problem.

2. Split the PR

Even after removing pkg/ratelimit/config, this PR bundles several separable changes. At minimum I'd split into:

PR A — sharedglobal rename + CRD schema for vMCP (pure schema change, no runtime behavior):

  • Add Global field to RateLimitConfig / ToolRateLimitConfig with shared as deprecated alias
  • Add spec.rateLimiting to VirtualMCPServerSpec with CEL validation
  • Converter wiring
  • Generated CRDs, docs

PR B — vMCP rate-limit middleware wiring (depends on A):

  • buildRateLimitMiddleware + Redis client lifecycle
  • Middleware chain refactoring in server.go (this changes execution order — MCP parsing moves before audit — and deserves focused review)
  • Optimizer call_tool tool-name unwrapping
  • Unit tests + E2E test

@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 28, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.15%. Comparing base (110b9a7) to head (c829eb5).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5079      +/-   ##
==========================================
- Coverage   68.17%   68.15%   -0.02%     
==========================================
  Files         618      618              
  Lines       63122    63122              
==========================================
- Hits        43033    43023      -10     
- Misses      16878    16891      +13     
+ Partials     3211     3208       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@github-actions github-actions Bot added size/XL Extra large PR: 1000+ lines changed and removed size/XL Extra large PR: 1000+ lines changed labels Apr 28, 2026
@github-actions github-actions Bot added the size/L Large PR: 600-999 lines changed label Apr 30, 2026
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels Apr 30, 2026
vmcpconfig.Config `yaml:",inline"`

RateLimiting *mcpv1beta1.RateLimitConfig `yaml:"rateLimiting,omitempty"`
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why introduce this new type? Why not put ratelimiting on vmcpconfig.Config?

As implemented, I don't think the RateLimiting config ever makes it into vmcp's server.go. The CRD field doesn't survive the conversion step, since there's nowhere to write the RateLimiting config to.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You’re right. I had added an operator-side ConfigMap wrapper, but that was incomplete because vMCP loads the file into vmcpconfig.Config, so rateLimiting would be ignored before reaching server.go.
I removed the wrapper and reverted this path to marshal vmcpconfig.Config directly. Since this PR is now scoped to the CRD/schema validation side, I’ll leave the actual runtime propagation/enforcement for PR B, where we can wire the config into vMCP and the middleware properly.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost there. You'll want to put RateLimiting *RateLimitConfig on the config type, here, rather than on the CRD type, VirtualMCPServerSpec. The config type is already within the VirtualMCPServerSpec. Putting the RateLimiting field on the config struct means that configuration information will be available to vMCP when it's running. vMCP and most other toolhive services don't read CRDs at runtime, they read config types. If the field is not on the config, then it cannot enable rate limiting.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that makes sense. I was treating rateLimiting as a VMCP CRD-level field, but I missed that vMCP only sees the serialized runtime config, not the CRD object directly.

I moved it onto pkg/vmcp/config.Config as spec.config.rateLimiting, removed the top-level VirtualMCPServerSpec.rateLimiting field, updated the VMCP CEL validation to reference
config.rateLimiting, and regenerated CRDs/docs/deepcopy.

I also added converter coverage to verify the field survives the VirtualMCPServer -> vMCP config path.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should've been more exact. I didn't exactly mean RateLimiting *RateLimitConfig on the pkg/vmcp/config.Config. I meant you should reuse the existing v1beta1.RateLimitConfig. You shouldn't have to introduce the types: RateLimitConfig, RateLimitBucket, and ToolRateLimitConfig in pkg/vmcp/config/config.go. Those can be deleted.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked at this and agree we should not duplicate RateLimitConfig, RateLimitBucket, and ToolRateLimitConfig in pkg/vmcp/config.

One implementation detail: pkg/vmcp/config cannot directly use v1beta1.RateLimitConfig, because v1beta1.VirtualMCPServerSpec already embeds pkg/vmcp/config.Config, so importing v1beta1 from pkg/vmcp/config creates an import cycle:

pkg/vmcp/config -> cmd/thv-operator/api/v1beta1 -> pkg/vmcp/config

To avoid that while still keeping a single concrete type, I’m thinking of moving the rate-limit structs into a lower-level shared package, e.g. pkg/ratelimit/types, and then aliasing them from v1beta1:

type RateLimitConfig = ratelimittypes.RateLimitConfig

That lets existing code keep using v1beta1.RateLimitConfig, while pkg/vmcp/config.Config.RateLimiting uses the same underlying type without duplicating structs.

Does that package split sound acceptable, or would you prefer a different home for the shared rate-limit config types?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable. Thank you for explaining.

@Sanskarzz Sanskarzz force-pushed the ratelimitingVMCP branch from 41b9789 to dc566e1 Compare May 2, 2026 15:59
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels May 2, 2026
@Sanskarzz Sanskarzz force-pushed the ratelimitingVMCP branch from 9885fd4 to dcc2331 Compare May 6, 2026 08:43
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels May 6, 2026
@Sanskarzz Sanskarzz force-pushed the ratelimitingVMCP branch from dcc2331 to bccacff Compare May 6, 2026 13:23
@github-actions github-actions Bot added size/L Large PR: 600-999 lines changed and removed size/L Large PR: 600-999 lines changed labels May 6, 2026
@github-actions github-actions Bot removed the size/L Large PR: 600-999 lines changed label May 11, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Large PR Detected

This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.

How to unblock this PR:

Add a section to your PR description with the following format:

## Large PR Justification

[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:

Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.

See our Contributing Guidelines for more details.


This review will be automatically dismissed once you add the justification section.

@github-actions
Copy link
Copy Markdown
Contributor

✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds the API surface and schema validation needed to configure rate limiting on VirtualMCPServer (and aligns rate-limit type definitions across operator/runtime), while deferring actual runtime enforcement to a follow-up.

Changes:

  • Introduces shared pkg/ratelimit/types API types and wires them into v1beta1 CRD types and vMCP runtime config serialization.
  • Adds CRD/CEL validations for VirtualMCPServer to reject unsupported rate-limit configurations (requires Redis session storage; per-user limits require OIDC incoming auth).
  • Updates generated artifacts (CRDs + operator/API docs + OpenAPI swagger) and adds/updates unit & integration coverage.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
test/e2e/thv-operator/virtualmcp/virtualmcp_circuit_breaker_test.go E2E recovery flow tweak (now waits for workload template update before deleting pods).
pkg/vmcp/config/zz_generated.deepcopy.go Deepcopy regen to include RateLimiting + import alias adjustments.
pkg/vmcp/config/config.go Adds RateLimiting to vMCP runtime config model.
pkg/ratelimit/types/zz_generated.deepcopy.go Generated deepcopies for new shared rate-limit API types.
pkg/ratelimit/types/types.go New shared RateLimitConfig/RateLimitBucket/ToolRateLimitConfig structs + validations.
pkg/ratelimit/types/doc.go Controller-gen markers for shared rate-limit types package.
pkg/ratelimit/limiter_test.go Test rename (global→shared) + new test asserting Redis key creation.
pkg/ratelimit/internal/bucket/bucket.go Comment update (global→shared) to match terminology.
docs/server/swagger.yaml OpenAPI schema updates to reference shared rate-limit types.
docs/server/swagger.json OpenAPI schema updates to reference shared rate-limit types.
docs/server/docs.go Embedded OpenAPI template updated for shared rate-limit types.
docs/operator/crd-api.md Operator CRD API docs updated to include ratelimit types and new vMCP config field.
deploy/charts/operator-crds/templates/toolhive.stacklok.dev_virtualmcpservers.yaml Generated CRD template updated with config.rateLimiting schema + CEL validations.
deploy/charts/operator-crds/files/crds/toolhive.stacklok.dev_virtualmcpservers.yaml Generated CRD (rendered) updated with config.rateLimiting schema + CEL validations.
cmd/thv-operator/test-integration/virtualmcp/virtualmcpserver_sessionstorage_cel_test.go Integration tests added for new CEL validation rules around rate limiting.
cmd/thv-operator/test-integration/embedding-server/embeddingserver_update_test.go Strengthens update test to also assert Service existence.
cmd/thv-operator/Taskfile.yml Ensures controller-gen + docs generation include new ratelimit types package.
cmd/thv-operator/pkg/vmcpconfig/converter_test.go Adds converter passthrough coverage for RateLimiting.
cmd/thv-operator/controllers/virtualmcpserver_vmcpconfig_test.go Improves vmcp config ConfigMap test by unmarshalling and asserting core fields.
cmd/thv-operator/api/v1beta1/zz_generated.deepcopy.go Deepcopy regen to use shared rate-limit types.
cmd/thv-operator/api/v1beta1/virtualmcpserver_types.go Adds CEL validations enforcing Redis + OIDC requirements for vMCP rate limiting.
cmd/thv-operator/api/v1beta1/mcpserver_types.go Rehomes rate-limit types to shared package via type aliases and updates spec field type.
cmd/thv-operator/api/v1beta1/mcpserver_types_test.go Adds JSON roundtrip coverage for vMCP spec.config.rateLimiting.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +464 to +481
By("Waiting for backend StatefulSet template to use the fixed image")
Eventually(func() error {
sts := &appsv1.StatefulSet{}
if err := k8sClient.Get(ctx, types.NamespacedName{
Name: backend2Name,
Namespace: testNamespace,
}, sts); err != nil {
return err
}
for _, container := range sts.Spec.Template.Spec.Containers {
if container.Name == "mcp" {
if container.Image != images.YardstickServerImage {
return fmt.Errorf("statefulset still has image %q", container.Image)
}
return nil
}
}
return fmt.Errorf("mcp container not found in statefulset template")
Comment thread docs/server/swagger.yaml
type: object
github_com_stacklok_toolhive_pkg_ratelimit_types.RateLimitBucket:
description: |-
PerUser token bucket configuration for this tool.
Comment thread docs/server/swagger.json
"type": "object"
},
"github_com_stacklok_toolhive_pkg_ratelimit_types.RateLimitBucket": {
"description": "PerUser token bucket configuration for this tool.\n+optional",
Comment thread docs/server/docs.go
"type": "object"
},
"github_com_stacklok_toolhive_pkg_ratelimit_types.RateLimitBucket": {
"description": "PerUser token bucket configuration for this tool.\n+optional",
Sanskarzz and others added 7 commits May 12, 2026 18:50
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size/XL Extra large PR: 1000+ lines changed

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Configure rate limits on VirtualMCPServer

3 participants