feat(agent-core): add API key pool for concurrent subagent key rotation#191
feat(agent-core): add API key pool for concurrent subagent key rotation#191Liewzheng wants to merge 6 commits into
Conversation
🦋 Changeset detectedLatest commit: 6912d04 The changes in this PR will be included in the next version bump. This PR includes changesets to release 3 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 36a4f98489
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (providerConfig?.type === 'kimi' && this.options.apiKeyPool !== undefined) { | ||
| const pool = this.options.apiKeyPool; | ||
| return async (request) => { | ||
| const key = pool.acquire(); | ||
| const auth: ProviderRequestAuth = { apiKey: key }; |
There was a problem hiding this comment.
Respect explicit Kimi provider keys
When a global pool exists, this branch wraps every type: 'kimi' provider with per-request auth, even when that provider has its own apiKey or env-derived key. In the inspected Agent.generate path, the supplied options.auth takes precedence over the provider's constructor-time apiKey, so a user who configures a custom Kimi provider/baseUrl or separate account key while also exporting KIMI_API_KEY/KIMI_API_KEY_1 will silently send those requests with the pool keys instead, leading to auth failures or wrong-account billing. Consider gating the pool to the intended managed/env-backed provider or skipping it when providerApiKey(providerConfig) is set.
Useful? React with 👍 / 👎.
| if (primary !== undefined && primary.length > 0) { | ||
| keys.push(primary); |
There was a problem hiding this comment.
Ignore whitespace-only pool keys
If an operator leaves a numbered key set to whitespace (for example KIMI_API_KEY_1=' '), this check still counts it as a usable key and can create a pool containing that value; the later per-request auth path then passes it through as apiKey, bypassing the existing provider-key normalization that trims empty strings. In that environment every rotation onto the whitespace entry will make Kimi requests fail with invalid credentials, so these env values should be trimmed/validated before adding them to the pool.
Useful? React with 👍 / 👎.
| const key = pool.acquire(); | ||
| const auth: ProviderRequestAuth = { apiKey: key }; |
There was a problem hiding this comment.
Keep video upload and generation on the same key
Because each withAuth invocation acquires a fresh key, a video read can upload the file through createVideoUploader with one pool key and then the next LLM generation that consumes the returned ms://... file id can run with another key. In sessions with two or more pool keys and video inputs, that crosses API-key/account boundaries for file IDs and can make the follow-up Kimi request unable to access the uploaded video. The pool should pin a key for related requests, such as at agent/turn scope, rather than rotating independently per request.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 537e0d8103
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| if (isRetryableGenerateError(error)) { | ||
| pool.recordFailure(key); |
There was a problem hiding this comment.
Quarantine pooled keys rejected with 401
When the pool contains a bad or expired KIMI_API_KEY_N, requests made with that key will fail with APIStatusError(401), but this branch only calls recordFailure for isRetryableGenerateError cases, which excludes 401. Because pooled API-key auth has no OAuth refresh path, that credential remains marked healthy and every future rotation onto it keeps failing indefinitely instead of routing around the bad key.
Useful? React with 👍 / 👎.
| const result = await request(auth); | ||
| pool.resetKey(key); |
There was a problem hiding this comment.
Preserve cooldowns across concurrent successes
With parallel subagents, the same pooled key can have multiple in-flight requests; if one request for a key returns 429 and records a cooldown, a different request for that same key that was already in flight can later succeed and this unconditional resetKey clears the cooldown immediately. In that common concurrent case the rate-limited key re-enters rotation before its backoff expires, defeating the pool’s rerouting behavior.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 5301a76fec
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| this.resolveOAuthTokenProvider = options.resolveOAuthTokenProvider; | ||
| this.skillDirs = options.skillDirs ?? []; | ||
| this.telemetry = options.telemetry ?? noopTelemetryClient; | ||
| this.apiKeyPool = ApiKeyPool.fromEnv() ?? undefined; |
There was a problem hiding this comment.
Gate key pooling on the opt-in env var
In environments that already export multiple KIMI_API_KEY* values but have not opted into pooling, this still constructs a pool unconditionally at startup. Because that same pool is passed into each session and ProviderManager injects it for Kimi providers without explicit keys, those users get request-level rotation instead of the previous single-key behavior, which can unexpectedly send traffic to a backup or different-account key. Please check the opt-in flag before creating the pool.
Useful? React with 👍 / 👎.
Add ApiKeyPool that rotates through multiple KIMI_API_KEY* env vars with exponential cooldown on retryable errors. Integrate into ProviderManager so concurrent subagents do not share a single key's rate-limit quota. Enable via KIMI_API_KEY_POOL env var.
- Use Array.at() instead of bracket index access (prefer-at) - Wrap void expressions in arrow function bodies to avoid no-confusing-void-expression
- Verify round-robin key distribution under concurrent acquire/load - Verify cooling keys are skipped even under heavy concurrent load - Verify fallback behavior when all keys are cooling - Verify key recovery after cooldown expires - Fix lint warnings: avoid floating promises and non-thenable in Promise.all
… trim env whitespace
Codex review fixes:
- Pool no longer overrides provider-level explicit apiKey. The pool path
is only entered when providerApiKey() returns undefined.
- Whitespace-only env values (' ') are now trimmed before validation
and treated as missing.
- Update pool tests to use a dedicated POOL_BASE_CONFIG without apiKey.
…rent success Codex review fixes: - Pool now records failures for 401 (Unauthorized) errors so bad/expired keys are quarantined instead of being retried indefinitely. - resetKey() no longer clears an active cooldown, preventing a concurrent success from wiping out a cooldown that was set by another in-flight request for the same key. - Update resetKey test to advance past cooldown before resetting.
Codex review fix: - Pool construction in KimiCore is now conditional on the KIMI_API_KEY_POOL environment variable being set. Previously, any environment with multiple KIMI_API_KEY* values would get request-level rotation even without opting in, which could silently redirect traffic to backup/different-account keys.
6698122 to
6912d04
Compare
Related Issue
Resolve #2368
Problem
When multiple subagents run in parallel, they all share the same
KIMI_API_KEY. This concentrates load on a single key and quickly exhausts its rate-limit quota, causing 429 errors and degraded throughput for parallel tasks.What changed
Add
ApiKeyPoolthat rotates through multipleKIMI_API_KEY*environment variables with exponential cooldown on retryable errors. Integrate intoProviderManagerso concurrent subagents do not share a single key's rate-limit quota. Enable viaKIMI_API_KEY_POOLenv var.Key design decisions:
acquire()rotates through keys and skips any that are in a cooldown period (30s → 5min → 30min based on consecutive failures). If all keys are cooling, it falls back to round-robin so the agent can still make progress.resolveAuth: Instead of wrapping the provider (like kimi-cli'sKeyPoolKimi), we inject rotating keys through the existingProviderManager.resolveAuthmechanism, which already supported per-request OAuth tokens. This keeps provider construction unchanged and works naturally withAgent.generate.resolveAuthimmediately records the failure and retries once with a fresh key. If that also fails, the error propagates to the existingchatWithRetryloop for normal backoff.oauthandapiKey, the existing mutual-exclusion error is thrown. The pool path is only entered whenoauthis not configured.providerConfig.type === 'kimi'. Non-kimi providers are unaffected.ApiKeyPool.fromEnv()readsKIMI_API_KEY,KIMI_API_KEY_1…KIMI_API_KEY_99. A pool is only created when ≥2 keys are found.Checklist
gen-changesetsskill, or this PR needs no changeset.gen-docsskill, or this PR needs no doc update. (No user-facing CLI commands or config options changed; the feature is env-var only.)Test coverage
Unit tests (
api-key-pool.test.ts— 22 tests):fromEnvkey collection, constructor validation, round-robin rotationacquire()distribute evenly (34/33/33); cooling keys skipped under 50 concurrent acquires; full-cooldown fallback; mid-stream recovery after cooldown expiresIntegration tests (
runtime-provider.test.ts— 37 tests):resolveAuthcalls distribute keys evenly; failed keys enter cooldown and are rerouted under concurrent loadVerification
pnpm typecheck— all packages + apps passpnpm test— 382 test files, 4960+ tests passpnpm build— all packages and apps build successfullypnpm lint— 0 errors; new files have 0 warnings (existing codebase has unrelated warnings)