feat(agent-core): add API key pool for concurrent subagent key rotation by Liewzheng · Pull Request #191 · MoonshotAI/kimi-code

Liewzheng · 2026-05-29T05:41:51Z

Related Issue

Resolve #2368

Problem

When multiple subagents run in parallel, they all share the same KIMI_API_KEY. This concentrates load on a single key and quickly exhausts its rate-limit quota, causing 429 errors and degraded throughput for parallel tasks.

What changed

Add ApiKeyPool that rotates through multiple KIMI_API_KEY* environment variables with exponential cooldown on retryable errors. Integrate into ProviderManager so concurrent subagents do not share a single key's rate-limit quota. Enable via KIMI_API_KEY_POOL env var.

Key design decisions:

Round-robin allocation with cooldown skipping: acquire() rotates through keys and skips any that are in a cooldown period (30s → 5min → 30min based on consecutive failures). If all keys are cooling, it falls back to round-robin so the agent can still make progress.
Per-request key injection via resolveAuth: Instead of wrapping the provider (like kimi-cli's KeyPoolKimi), we inject rotating keys through the existing ProviderManager.resolveAuth mechanism, which already supported per-request OAuth tokens. This keeps provider construction unchanged and works naturally with Agent.generate.
One rotation retry before tenacity backoff: On a retryable status error (429/500/502/503/504), resolveAuth immediately records the failure and retries once with a fresh key. If that also fails, the error propagates to the existing chatWithRetry loop for normal backoff.
OAuth precedence preserved: If a provider has both oauth and apiKey, the existing mutual-exclusion error is thrown. The pool path is only entered when oauth is not configured.
Provider-type gating: Pool keys are only injected for providerConfig.type === 'kimi'. Non-kimi providers are unaffected.
Env var collection: ApiKeyPool.fromEnv() reads KIMI_API_KEY, KIMI_API_KEY_1 … KIMI_API_KEY_99. A pool is only created when ≥2 keys are found.

Checklist

I have read the CONTRIBUTING document.
I have linked a related issue, or explained the problem above.
I have added tests that prove my feature works.
Ran gen-changesets skill, or this PR needs no changeset.
Ran gen-docs skill, or this PR needs no doc update. (No user-facing CLI commands or config options changed; the feature is env-var only.)

Test coverage

Unit tests (api-key-pool.test.ts — 22 tests):

fromEnv key collection, constructor validation, round-robin rotation
Cooldown skipping and fallback when all keys are cooling
Failure progression (30s → 5min → 30min) and key recovery after cooldown
Concurrent stress: 100 concurrent acquire() distribute evenly (34/33/33); cooling keys skipped under 50 concurrent acquires; full-cooldown fallback; mid-stream recovery after cooldown expires

Integration tests (runtime-provider.test.ts — 37 tests):

Pool key injection for kimi provider; skipped for non-kimi providers
Key rotation on retryable errors (429/500/502/503/504)
OAuth precedence when both are configured
Concurrent stress: 5/60 concurrent resolveAuth calls distribute keys evenly; failed keys enter cooldown and are rerouted under concurrent load

Verification

pnpm typecheck — all packages + apps pass
pnpm test — 382 test files, 4960+ tests pass
pnpm build — all packages and apps build successfully
pnpm lint — 0 errors; new files have 0 warnings (existing codebase has unrelated warnings)

changeset-bot · 2026-05-29T05:41:57Z

🦋 Changeset detected

Latest commit: 6912d04

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 3 packages

Name	Type
@moonshot-ai/agent-core	Minor
@moonshot-ai/kimi-code	Minor
@moonshot-ai/migration-legacy	Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 36a4f98489

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-29T05:46:09Z

+    if (providerConfig?.type === 'kimi' && this.options.apiKeyPool !== undefined) {
+      const pool = this.options.apiKeyPool;
+      return async (request) => {
+        const key = pool.acquire();
+        const auth: ProviderRequestAuth = { apiKey: key };


Respect explicit Kimi provider keys

When a global pool exists, this branch wraps every type: 'kimi' provider with per-request auth, even when that provider has its own apiKey or env-derived key. In the inspected Agent.generate path, the supplied options.auth takes precedence over the provider's constructor-time apiKey, so a user who configures a custom Kimi provider/baseUrl or separate account key while also exporting KIMI_API_KEY/KIMI_API_KEY_1 will silently send those requests with the pool keys instead, leading to auth failures or wrong-account billing. Consider gating the pool to the intended managed/env-backed provider or skipping it when providerApiKey(providerConfig) is set.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-29T05:46:09Z

+    if (primary !== undefined && primary.length > 0) {
+      keys.push(primary);


Ignore whitespace-only pool keys

If an operator leaves a numbered key set to whitespace (for example KIMI_API_KEY_1=' '), this check still counts it as a usable key and can create a pool containing that value; the later per-request auth path then passes it through as apiKey, bypassing the existing provider-key normalization that trims empty strings. In that environment every rotation onto the whitespace entry will make Kimi requests fail with invalid credentials, so these env values should be trimmed/validated before adding them to the pool.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-29T05:46:09Z

+        const key = pool.acquire();
+        const auth: ProviderRequestAuth = { apiKey: key };


Keep video upload and generation on the same key

Because each withAuth invocation acquires a fresh key, a video read can upload the file through createVideoUploader with one pool key and then the next LLM generation that consumes the returned ms://... file id can run with another key. In sessions with two or more pool keys and video inputs, that crosses API-key/account boundaries for file IDs and can make the follow-up Kimi request unable to access the uploaded video. The pool should pin a key for related requests, such as at agent/turn scope, rather than rotating independently per request.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 537e0d8103

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-29T06:03:11Z

+          if (isRetryableGenerateError(error)) {
+            pool.recordFailure(key);


Quarantine pooled keys rejected with 401

When the pool contains a bad or expired KIMI_API_KEY_N, requests made with that key will fail with APIStatusError(401), but this branch only calls recordFailure for isRetryableGenerateError cases, which excludes 401. Because pooled API-key auth has no OAuth refresh path, that credential remains marked healthy and every future rotation onto it keeps failing indefinitely instead of routing around the bad key.

Useful? React with 👍 / 👎.

chatgpt-codex-connector · 2026-05-29T06:03:11Z

+          const result = await request(auth);
+          pool.resetKey(key);


Preserve cooldowns across concurrent successes

With parallel subagents, the same pooled key can have multiple in-flight requests; if one request for a key returns 429 and records a cooldown, a different request for that same key that was already in flight can later succeed and this unconditional resetKey clears the cooldown immediately. In that common concurrent case the rate-limited key re-enters rotation before its backoff expires, defeating the pool’s rerouting behavior.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5301a76fec

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-05-29T06:14:47Z

    this.resolveOAuthTokenProvider = options.resolveOAuthTokenProvider;
    this.skillDirs = options.skillDirs ?? [];
    this.telemetry = options.telemetry ?? noopTelemetryClient;
+    this.apiKeyPool = ApiKeyPool.fromEnv() ?? undefined;


Gate key pooling on the opt-in env var

In environments that already export multiple KIMI_API_KEY* values but have not opted into pooling, this still constructs a pool unconditionally at startup. Because that same pool is passed into each session and ProviderManager injects it for Kimi providers without explicit keys, those users get request-level rotation instead of the previous single-key behavior, which can unexpectedly send traffic to a backup or different-account key. Please check the opt-in flag before creating the pool.

Useful? React with 👍 / 👎.

Add ApiKeyPool that rotates through multiple KIMI_API_KEY* env vars with exponential cooldown on retryable errors. Integrate into ProviderManager so concurrent subagents do not share a single key's rate-limit quota. Enable via KIMI_API_KEY_POOL env var.

- Use Array.at() instead of bracket index access (prefer-at) - Wrap void expressions in arrow function bodies to avoid no-confusing-void-expression

- Verify round-robin key distribution under concurrent acquire/load - Verify cooling keys are skipped even under heavy concurrent load - Verify fallback behavior when all keys are cooling - Verify key recovery after cooldown expires - Fix lint warnings: avoid floating promises and non-thenable in Promise.all

… trim env whitespace Codex review fixes: - Pool no longer overrides provider-level explicit apiKey. The pool path is only entered when providerApiKey() returns undefined. - Whitespace-only env values (' ') are now trimmed before validation and treated as missing. - Update pool tests to use a dedicated POOL_BASE_CONFIG without apiKey.

…rent success Codex review fixes: - Pool now records failures for 401 (Unauthorized) errors so bad/expired keys are quarantined instead of being retried indefinitely. - resetKey() no longer clears an active cooldown, preventing a concurrent success from wiping out a cooldown that was set by another in-flight request for the same key. - Update resetKey test to advance past cooldown before resetting.

Codex review fix: - Pool construction in KimiCore is now conditional on the KIMI_API_KEY_POOL environment variable being set. Previously, any environment with multiple KIMI_API_KEY* values would get request-level rotation even without opting in, which could silently redirect traffic to backup/different-account keys.

chatgpt-codex-connector Bot reviewed May 29, 2026

View reviewed changes

Liewzheng added 6 commits May 29, 2026 14:38

fix(agent-core): resolve oxlint warnings in ApiKeyPool

41b988f

- Use Array.at() instead of bracket index access (prefer-at) - Wrap void expressions in arrow function bodies to avoid no-confusing-void-expression

Liewzheng force-pushed the dev-feat-subagent-optimized branch from 6698122 to 6912d04 Compare May 29, 2026 06:39

Liewzheng changed the title ~~Dev feat subagent optimized~~ feat(agent-core): add API key pool for subagent key rotation May 29, 2026

Liewzheng changed the title ~~feat(agent-core): add API key pool for subagent key rotation~~ feat(agent-core): add API key pool for concurrent subagent key rotation May 29, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(agent-core): add API key pool for concurrent subagent key rotation#191

feat(agent-core): add API key pool for concurrent subagent key rotation#191
Liewzheng wants to merge 6 commits into
MoonshotAI:mainfrom
Liewzheng:dev-feat-subagent-optimized

Liewzheng commented May 29, 2026

Uh oh!

changeset-bot Bot commented May 29, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

		if (primary !== undefined && primary.length > 0) {
		keys.push(primary);

		const key = pool.acquire();
		const auth: ProviderRequestAuth = { apiKey: key };

		if (isRetryableGenerateError(error)) {
		pool.recordFailure(key);

Conversation

Liewzheng commented May 29, 2026

Related Issue

Problem

What changed

Checklist

Test coverage

Verification

Uh oh!

changeset-bot Bot commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🦋 Changeset detected

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector Bot May 29, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

changeset-bot Bot commented May 29, 2026 •

edited

Loading