Configure rate limits on VirtualMCPServer (PR A) #5079
Large PR Detected
This PR exceeds 1000 lines of changes and requires justification before it can be reviewed.
How to unblock this PR:
Add a section to your PR description with the following format:
## Large PR Justification
[Explain why this PR must be large, such as:]
- Generated code that cannot be split
- Large refactoring that must be atomic
- Multiple related changes that would break if separated
- Migration or data transformation

Alternative:
Consider splitting this PR into smaller, focused changes (< 1000 lines each) for easier review and reduced risk.
See our Contributing Guidelines for more details.
This review will be automatically dismissed once you add the justification section.
Force-pushed dfe941c to d01f3cd
jerm-dro left a comment
Hey Sanskar, thanks for putting this together — the end-to-end shape is right: top-level spec.rateLimiting with CEL validation, converter wiring, and the vMCP middleware integration with optimizer-aware tool name resolution. The manual testing scenario in the description is thorough. Requesting changes on two axes before we go further.
1. Drop pkg/ratelimit/config — use the CRD type directly
The proxyrunner path already uses *v1beta1.RateLimitConfig directly throughout pkg/ratelimit on main today. vMCP can do the same — just pass the CRD type through. The new package duplicates the CRD types 1:1, still imports metav1.Duration, and adds conversion boilerplate (ToInternal(), hand-written DeepCopy, unused EffectiveGlobal() methods) without solving a real problem.
2. Split the PR
Even after removing pkg/ratelimit/config, this PR bundles several separable changes. At minimum I'd split into:
PR A — shared→global rename + CRD schema for vMCP (pure schema change, no runtime behavior):
- Add `Global` field to `RateLimitConfig`/`ToolRateLimitConfig` with `shared` as deprecated alias
- Add `spec.rateLimiting` to `VirtualMCPServerSpec` with CEL validation
- Converter wiring
- Generated CRDs, docs
PR B — vMCP rate-limit middleware wiring (depends on A):
- `buildRateLimitMiddleware` + Redis client lifecycle
- Middleware chain refactoring in `server.go` (this changes execution order — MCP parsing moves before audit — and deserves focused review)
- Optimizer `call_tool` tool-name unwrapping
- Unit tests + E2E test
Force-pushed a52365e to 93ca143
Codecov Report
✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff            @@
##             main    #5079    +/-   ##
==========================================
- Coverage   68.17%   68.15%   -0.02%
==========================================
  Files         618      618
  Lines       63122    63122
==========================================
- Hits        43033    43023      -10
- Misses      16878    16891      +13
+ Partials     3211     3208       -3
Force-pushed 93ca143 to fdb40fe
Force-pushed fdb40fe to 7e8bbe4
Force-pushed 7e8bbe4 to 349bb94
Force-pushed 87f7350 to 41b9789
```go
	vmcpconfig.Config `yaml:",inline"`

	RateLimiting *mcpv1beta1.RateLimitConfig `yaml:"rateLimiting,omitempty"`
}
```
Why introduce this new type? Why not put ratelimiting on vmcpconfig.Config?
As implemented, I don't think the RateLimiting config ever makes it into vmcp's server.go. The CRD field doesn't survive the conversion step, since there's nowhere to write the RateLimiting config to.
You’re right. I had added an operator-side ConfigMap wrapper, but that was incomplete because vMCP loads the file into vmcpconfig.Config, so rateLimiting would be ignored before reaching server.go.
I removed the wrapper and reverted this path to marshal vmcpconfig.Config directly. Since this PR is now scoped to the CRD/schema validation side, I’ll leave the actual runtime propagation/enforcement for PR B, where we can wire the config into vMCP and the middleware properly.
Almost there. You'll want to put RateLimiting *RateLimitConfig on the config type, here, rather than on the CRD type, VirtualMCPServerSpec. The config type is already within the VirtualMCPServerSpec. Putting the RateLimiting field on the config struct means that configuration information will be available to vMCP when it's running. vMCP and most other toolhive services don't read CRDs at runtime, they read config types. If the field is not on the config, then it cannot enable rate limiting.
Thanks, that makes sense. I was treating rateLimiting as a VMCP CRD-level field, but I missed that vMCP only sees the serialized runtime config, not the CRD object directly.
I moved it onto pkg/vmcp/config.Config as spec.config.rateLimiting, removed the top-level VirtualMCPServerSpec.rateLimiting field, updated the VMCP CEL validation to reference config.rateLimiting, and regenerated CRDs/docs/deepcopy.
I also added converter coverage to verify the field survives the VirtualMCPServer -> vMCP config path.
Sorry, I should've been more exact. I didn't exactly mean RateLimiting *RateLimitConfig on the pkg/vmcp/config.Config. I meant you should reuse the existing v1beta1.RateLimitConfig. You shouldn't have to introduce the types: RateLimitConfig, RateLimitBucket, and ToolRateLimitConfig in pkg/vmcp/config/config.go. Those can be deleted.
I looked at this and agree we should not duplicate RateLimitConfig, RateLimitBucket, and ToolRateLimitConfig in pkg/vmcp/config.
One implementation detail: pkg/vmcp/config cannot directly use v1beta1.RateLimitConfig, because v1beta1.VirtualMCPServerSpec already embeds pkg/vmcp/config.Config, so importing v1beta1 from pkg/vmcp/config creates an import cycle:
pkg/vmcp/config -> cmd/thv-operator/api/v1beta1 -> pkg/vmcp/config
To avoid that while still keeping a single concrete type, I’m thinking of moving the rate-limit structs into a lower-level shared package, e.g. pkg/ratelimit/types, and then aliasing them from v1beta1:
type RateLimitConfig = ratelimittypes.RateLimitConfig
That lets existing code keep using v1beta1.RateLimitConfig, while pkg/vmcp/config.Config.RateLimiting uses the same underlying type without duplicating structs.
Does that package split sound acceptable, or would you prefer a different home for the shared rate-limit config types?
That sounds reasonable. Thank you for explaining.
Force-pushed 41b9789 to dc566e1
Force-pushed 9885fd4 to dcc2331
Force-pushed dcc2331 to bccacff
Force-pushed bccacff to 49153aa
✅ Large PR justification has been provided. The size review has been dismissed and this PR can now proceed with normal review.
Pull request overview
Adds the API surface and schema validation needed to configure rate limiting on VirtualMCPServer (and aligns rate-limit type definitions across operator/runtime), while deferring actual runtime enforcement to a follow-up.
Changes:
- Introduces shared `pkg/ratelimit/types` API types and wires them into `v1beta1` CRD types and vMCP runtime config serialization.
- Adds CRD/CEL validations for `VirtualMCPServer` to reject unsupported rate-limit configurations (requires Redis session storage; per-user limits require OIDC incoming auth).
- Updates generated artifacts (CRDs + operator/API docs + OpenAPI swagger) and adds/updates unit & integration coverage.
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| test/e2e/thv-operator/virtualmcp/virtualmcp_circuit_breaker_test.go | E2E recovery flow tweak (now waits for workload template update before deleting pods). |
| pkg/vmcp/config/zz_generated.deepcopy.go | Deepcopy regen to include RateLimiting + import alias adjustments. |
| pkg/vmcp/config/config.go | Adds RateLimiting to vMCP runtime config model. |
| pkg/ratelimit/types/zz_generated.deepcopy.go | Generated deepcopies for new shared rate-limit API types. |
| pkg/ratelimit/types/types.go | New shared RateLimitConfig/RateLimitBucket/ToolRateLimitConfig structs + validations. |
| pkg/ratelimit/types/doc.go | Controller-gen markers for shared rate-limit types package. |
| pkg/ratelimit/limiter_test.go | Test rename (global→shared) + new test asserting Redis key creation. |
| pkg/ratelimit/internal/bucket/bucket.go | Comment update (global→shared) to match terminology. |
| docs/server/swagger.yaml | OpenAPI schema updates to reference shared rate-limit types. |
| docs/server/swagger.json | OpenAPI schema updates to reference shared rate-limit types. |
| docs/server/docs.go | Embedded OpenAPI template updated for shared rate-limit types. |
| docs/operator/crd-api.md | Operator CRD API docs updated to include ratelimit types and new vMCP config field. |
| deploy/charts/operator-crds/templates/toolhive.stacklok.dev_virtualmcpservers.yaml | Generated CRD template updated with config.rateLimiting schema + CEL validations. |
| deploy/charts/operator-crds/files/crds/toolhive.stacklok.dev_virtualmcpservers.yaml | Generated CRD (rendered) updated with config.rateLimiting schema + CEL validations. |
| cmd/thv-operator/test-integration/virtualmcp/virtualmcpserver_sessionstorage_cel_test.go | Integration tests added for new CEL validation rules around rate limiting. |
| cmd/thv-operator/test-integration/embedding-server/embeddingserver_update_test.go | Strengthens update test to also assert Service existence. |
| cmd/thv-operator/Taskfile.yml | Ensures controller-gen + docs generation include new ratelimit types package. |
| cmd/thv-operator/pkg/vmcpconfig/converter_test.go | Adds converter passthrough coverage for RateLimiting. |
| cmd/thv-operator/controllers/virtualmcpserver_vmcpconfig_test.go | Improves vmcp config ConfigMap test by unmarshalling and asserting core fields. |
| cmd/thv-operator/api/v1beta1/zz_generated.deepcopy.go | Deepcopy regen to use shared rate-limit types. |
| cmd/thv-operator/api/v1beta1/virtualmcpserver_types.go | Adds CEL validations enforcing Redis + OIDC requirements for vMCP rate limiting. |
| cmd/thv-operator/api/v1beta1/mcpserver_types.go | Rehomes rate-limit types to shared package via type aliases and updates spec field type. |
| cmd/thv-operator/api/v1beta1/mcpserver_types_test.go | Adds JSON roundtrip coverage for vMCP spec.config.rateLimiting. |
```go
By("Waiting for backend StatefulSet template to use the fixed image")
Eventually(func() error {
	sts := &appsv1.StatefulSet{}
	if err := k8sClient.Get(ctx, types.NamespacedName{
		Name:      backend2Name,
		Namespace: testNamespace,
	}, sts); err != nil {
		return err
	}
	for _, container := range sts.Spec.Template.Spec.Containers {
		if container.Name == "mcp" {
			if container.Image != images.YardstickServerImage {
				return fmt.Errorf("statefulset still has image %q", container.Image)
			}
			return nil
		}
	}
	return fmt.Errorf("mcp container not found in statefulset template")
```
```yaml
      type: object
    github_com_stacklok_toolhive_pkg_ratelimit_types.RateLimitBucket:
      description: |-
        PerUser token bucket configuration for this tool.
```

```json
"type": "object"
},
"github_com_stacklok_toolhive_pkg_ratelimit_types.RateLimitBucket": {
  "description": "PerUser token bucket configuration for this tool.\n+optional",
```
Signed-off-by: Sanskarzz <sanskar.gur@gmail.com>
PR A — shared→global rename + CRD schema for vMCP
Summary
Add the `VirtualMCPServer` rate-limiting API surface in line with THV-0057.

This change introduces `spec.config.rateLimiting` support on `VirtualMCPServer`. The field lives on the vMCP runtime config type so the operator can serialize it into the generated `config.yaml` and the vMCP process can read it at startup.

The CRD validation rejects invalid combinations early, including per-user limits without OIDC incoming auth and any rate limiting without Redis session storage. Generated CRDs and operator docs were updated accordingly.
Runtime vMCP enforcement is intentionally split out of this PR and will be handled in a follow-up PR.
Fixes #4552
Type of change
Test plan
- `task test`
- `task test-e2e`
- `task lint-fix`

API Compatibility
This PR adds optional `VirtualMCPServer.spec.rateLimiting` configuration to the `v1beta1` API. This is an additive, backward-compatible API change.

Existing `VirtualMCPServer` resources do not need migration. Cluster admins can opt in by adding `spec.rateLimiting`, and must configure Redis-backed `spec.sessionStorage` when rate limiting is enabled. Per-user limits additionally require OIDC incoming auth.
`v1beta1` API, OR the `api-break-allowed` label is applied and the migration guidance is described above.

Changes
- `VirtualMCPServer.spec.config.rateLimiting` on the vMCP config type
- `pkg/ratelimit` using the CRD `RateLimitConfig` type directly
This is a new feature package with a large test suite, and it needs to land as one coherent phase.