Conversation
- Align check-types-split with packages/db, packages/lib, and UI components; exclude Storybook stories from the UI chunk. - Add check-types:tests to CI; add root scripts for UI, agent-server, and gateway typechecks. - packages/ui: path aliases for @/lib, @/db, @/app, @/packages/ui; styled-jsx attrs for <style jsx>. - packages/lib: JWKS uses jose KeyLike; runtime-factory imports plugin-sql default with typed adapter factory. - gateway-discord: explicit discord.js handler types, connectBot characterId, sendTyping guard, listener map cast. - services/agent-server: requireInternalAuth accepts Elysia string/number status. - Tests: mocks, RequestInit/fetch preconnect, UUID casts, toMatchObject, financial addCredits, Playwright APIRequestContext, and related fixes. - New anthropic-thinking provider helper and unit test. Note: services/operator (Pepr) is not part of the main workspace typecheck. Made-with: Cursor
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
WalkthroughAdds Anthropic “extended thinking” (CoT) budget support with per-agent settings and env fallbacks/caps, provider-options merge helpers, integrations into AI request calls, test and type hardening, Milady test helper refactor, rate-limit and config updates, and docs/changelog entries. Changes
Sequence Diagram(s)sequenceDiagram
participant Client as Client
participant Route as MCP/A2A Route
participant Resolver as CharacterResolver
participant Anthropic as AnthropicThinking
participant LLM as LLMProvider
Client->>Route: POST /api/agents/{id}/mcp (chat)
Route->>Resolver: resolve character record
Resolver-->>Route: character (includes settings)
Route->>Anthropic: parseThinkingBudgetFromCharacterSettings(settings)
Anthropic-->>Route: agentThinkingBudget?
Route->>Anthropic: resolveAnthropicThinkingBudgetTokens(model, env, agentBudget)
Anthropic-->>Route: effectiveBudget / providerOptions fragment
Route->>LLM: streamText({ model, prompt, maxOutputTokens, ...providerOptions })
LLM-->>Route: streamed response
Route-->>Client: stream completion
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
|
|
Code Review This PR adds Anthropic extended thinking (ANTHROPIC_COT_BUDGET) support across all AI routes, fixes TypeScript type errors in tests/packages, adds a check-types:tests CI step, and hardens the Discord gateway with explicit handler types. The implementation is generally solid with good test coverage for the new helper. Issues:
app-promotion-assets.ts, app-promotion.ts, twitter-automation/app-automation.ts, and seo.ts pass explicit temperature alongside mergeProviderOptions. When ANTHROPIC_COT_BUDGET is set, @ai-sdk/anthropic silently strips temperature/topP/topK (noted in the v1/chat completions comment). Callers relying on temperature 0.7/0.8 will silently get default sampling - an invisible behavior change. Consider logging a warning at startup when both are configured.
The function deep-merges only gateway, anthropic, and google. Any other provider key present in both a and b (e.g. openai, azure) is overwritten by the shallow spread without merging. Worth a comment flagging this limitation.
Removing the isDevelopment override drops STANDARD from 10,000 to 60 req/min in local dev. Developers running hot-reload tests or UI workflows will now hit production limits backed only by in-memory storage. Consider keeping separate dev limits or documenting the change prominently.
The default import is immediately cast to a manually-written signature. If the upstream @elizaos/plugin-sql signature changes, this cast will silently pass wrong types. Worth a comment explaining the workaround and a TODO to remove it when upstream fixes the type export.
In x402-topup.test.ts, afterAll accesses referralsServiceForTest etc. with definite-assignment assertions and no null guards. If beforeAll throws before all three are assigned, afterAll will also throw and may swallow the original failure. The previous null-guard pattern was safer. Positives:
|
Rename milady billing route test to z-milady-billing-route for script ordering, use queue-backed db mocks re-registered in beforeEach, and add mockMiladyPricingMinimumDepositForRouteTests so route tests do not replace MILADY_PRICING with only MINIMUM_DEPOSIT (which broke billing thresholds in-process). Update package.json bulk/special unit paths accordingly. Made-with: Cursor
Extend anthropic-thinking handling, add cloud-provider-options, update provider types and index exports, and thread safe model params through v1 chat, completions, messages, responses, generate-image, MCP/A2A, and internal services. Expand anthropic-thinking unit tests. Made-with: Cursor
Add engineering docs (unit-testing-milady-mocks, anthropic-cot-budget), root CHANGELOG.md, README links and accurate test script table, ROADMAP and site changelog entries. Expand JSDoc and .env.example with rationale for ANTHROPIC_COT_BUDGET and partial MILADY_PRICING mock pitfalls. Made-with: Cursor
|
Code Review - PR 417 This PR is a broad type-fix/quality pass with one new functional feature (Anthropic extended thinking via ANTHROPIC_COT_BUDGET). The changes are generally clean. A few things worth discussing: ISSUES
File: packages/lib/middleware/rate-limit.ts Removing the isDevelopment branch means local dev now hits the same strict limits as production (60/min STANDARD, 10/min STRICT, 5 per 5-min CRITICAL) backed only by in-memory storage. Developers iterating locally will now hit 429s immediately. The old 10000 req/window dev ceiling existed precisely to avoid this. The log message change says "same numeric limits as production" explicitly. Was this intentional? If so, it should be called out in the PR description. If not, the isDevelopment guard should be restored.
File: packages/tests/integration/x402-topup.test.ts Using ! on the three service vars silences TS but means if beforeAll throws partway through, afterAll will crash accessing properties on uninitialized vars. The original null-guards were load-bearing. Consider keeping them or initializing with no-op stubs.
File: packages/lib/providers/anthropic-thinking.ts The merge is one level deep for anthropic/google/gateway. For the thinking key specifically: if base has { thinking: { type: "disabled" } } and COT helper adds { thinking: { type: "enabled", budgetTokens: N } }, the second replaces the first entirely. This is the desired behavior, but a short comment would prevent future contributors from trying to fix it.
File: app/api/v1/generate-image/route.ts The else branch in isOpenAIModel ? mergeAnthropicCot : mergeGoogleImage covers everything that is not OpenAI. Future Anthropic image models would fall into the Google path and pick up responseModalities unnecessarily. Low risk now, worth noting when new providers are added.
File: packages/lib/eliza/runtime-factory.ts Casting the default import manually bypasses TS module resolution. A comment explaining the upstream typing gap would help future audits. POSITIVES
Overall this is a solid type cleanup pass. The rate-limit change is the most impactful behavioral difference and should be confirmed as intentional before merge. |
|
PR Review: fix(types): test project, split check-types, UI package, gateways Overall this is well-structured work with clear documentation of design decisions. A few items worth addressing before merge. Bugs / Correctness x402-topup.test.ts: afterAll null safety regression - The refactor replaced null-guarded service patches with definite assignment (!) but the afterAll cleanup now runs unconditionally. If beforeAll throws before the service variables are assigned, afterAll will throw on property access, masking the root failure and potentially breaking test teardown for other suites. Restore optional chaining or null guards in afterAll even if the variable declarations no longer need the union type. Behaviour Change Worth Calling Out Rate limit presets: dev-mode unlimited -> production limits for local development - packages/lib/middleware/rate-limit.ts removes the isDevelopment pattern and always uses production-level limits (STANDARD: 60/min, STRICT: 10/min, etc.). The only difference now is in-memory vs Redis backing. This will hit local developers on standard routes once they exceed 60 req/min. If intentional, worth a note in dev setup docs. Minor Issues
What is Good
Summary: The null-safety regression in x402-topup.test.ts afterAll is the only real bug. The rate-limit behaviour change is worth explicit acknowledgement. Everything else is minor or informational. |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Spread order lets
optionsoverride deploy-wide thinking config- Swapped the spread order so mergeAnthropicCotProviderOptions comes after options, ensuring deploy-wide ANTHROPIC_COT_BUDGET cannot be overridden by caller-supplied options.
Or push these changes by commenting:
@cursor push 99924fb2eb
Preview (99924fb2eb)
diff --git a/packages/lib/api/a2a/skills.ts b/packages/lib/api/a2a/skills.ts
--- a/packages/lib/api/a2a/skills.ts
+++ b/packages/lib/api/a2a/skills.ts
@@ -91,8 +91,8 @@
role: m.role as "user" | "assistant" | "system",
content: m.content,
})),
+ ...options,
...mergeAnthropicCotProviderOptions(model),
- ...options,
});
let fullText = "";This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
Read user_characters.settings.anthropicThinkingBudgetTokens for agent chat; add ANTHROPIC_COT_BUDGET_MAX cap and resolution helpers. Wire MCP and A2A routes; extend tests, env-validator, and .env.example. Update CHANGELOG, site changelog, README, ROADMAP, anthropic-cot-budget doc, and route/module comments with WHYs (owner-controlled settings vs untrusted request params). Made-with: Cursor
…ridden Move mergeAnthropicCotProviderOptions spread after options spread in executeSkillChatCompletion to ensure the deploy-wide ANTHROPIC_COT_BUDGET configuration takes precedence over caller-supplied options. This aligns with all other call sites in the codebase which consistently place mergeAnthropicCotProviderOptions last to enforce operator-controlled thinking budgets.
…ridden Move mergeAnthropicCotProviderOptions spread after options spread in executeSkillChatCompletion to ensure the deploy-wide ANTHROPIC_COT_BUDGET configuration takes precedence over caller-supplied options. This aligns with all other call sites in the codebase which consistently place mergeAnthropicCotProviderOptions last to enforce operator-controlled thinking budgets. Applied via @cursor push command
There was a problem hiding this comment.
Pull request overview
Aligns repo typechecking and test typings across the workspace, introduces deploy-scoped Anthropic extended-thinking configuration helpers, and tightens/clarifies several gateway + test mocks/casts to reduce order-dependent failures.
Changes:
- Split/expand typecheck coverage (tests project typecheck in CI, new per-package typecheck scripts, updated tsconfigs/paths).
- Add Anthropic COT budget env + providerOptions merge helpers and thread them through gateway-backed routes/services.
- Stabilize/retarget numerous unit/integration tests (mocks, env handling, casting, and fixture shapes), including Milady pricing mock helper + renamed cron test.
Reviewed changes
Copilot reviewed 100 out of 101 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| tsconfig.test.json | Adds UI package path aliases and refines include/exclude for test typecheck. |
| services/agent-server/src/routes.ts | Broadens set.status typing for Elysia compatibility. |
| services/agent-server/package.json | Adds typecheck script. |
| packages/ui/tsconfig.json | Adds cross-workspace path aliases and excludes Storybook stories from UI typecheck. |
| packages/ui/src/styled-jsx.d.ts | Augments React style attributes for styled-jsx. |
| packages/tests/unit/z-milady-billing-route.test.ts | Reworks Milady billing cron test mocking to be order-stable across the suite. |
| packages/tests/unit/x402/facilitator-service.test.ts | Adjusts casts for stricter typing. |
| packages/tests/unit/wallet-auth.test.ts | Fixes mock typing/resolved values. |
| packages/tests/unit/waifu-bridge.test.ts | Uses a typed mutable env view for NODE_ENV writes. |
| packages/tests/unit/v1-milaidy-provision-route.test.ts | Uses shared Milady pricing mock helper. |
| packages/tests/unit/token-agent-linkage.test.ts | Refactors validation test into scenario loop with strict checks. |
| packages/tests/unit/security-validations.test.ts | Avoids {} prototype typing issues in pollution assertions. |
| packages/tests/unit/referrals-service.test.ts | Uses toMatchObject to reduce brittle equality. |
| packages/tests/unit/provisioning-jobs.test.ts | Fixes Vitest mock restore typing and fetch assignment typing. |
| packages/tests/unit/provisioning-jobs-followups.test.ts | Uses toMatchObject for partial comparisons. |
| packages/tests/unit/privy-sync.test.ts | Uses toMatchObject for hydrated user comparisons. |
| packages/tests/unit/pr385-round5-fixes.test.ts | Aligns fake service signature with optional id param. |
| packages/tests/unit/performance-optimizations.test.ts | Adds typed helpers to avoid repeated unsafe access to provider snapshots. |
| packages/tests/unit/oauth/token-cache.test.ts | Rewrites date normalization branch for clearer typing. |
| packages/tests/unit/oauth/provider-registry.test.ts | Adjusts optional chaining/non-null assertions in provider routes. |
| packages/tests/unit/oauth/oauth-service.test.ts | Tightens typings for provider routes/platform loops/status handling. |
| packages/tests/unit/oauth/errors.test.ts | Stabilizes enum value typing for expectations. |
| packages/tests/unit/milaidy-sandbox-bridge-security.test.ts | Fixes fetch typing and backup fixture shape updates. |
| packages/tests/unit/milaidy-pairing-token-route.test.ts | Stabilizes env setup/teardown for base domain; removes module mock. |
| packages/tests/unit/milaidy-agent-routes-followups.test.ts | Uses shared Milady pricing mock helper. |
| packages/tests/unit/milady-sandbox-service.test.ts | Introduces typed sandbox provider helper and updates sandbox fixture fields. |
| packages/tests/unit/milady-create-routes.test.ts | Uses shared Milady pricing mock helper. |
| packages/tests/unit/mcp-twitter-tools.test.ts | Adds typed OAuth fixtures + handler result typing; normalizes token result shape. |
| packages/tests/unit/mcp-proxy-affiliate-pricing.test.ts | Fixes fetch typing cast. |
| packages/tests/unit/mcp-lib.test.ts | Tightens auth context typing and assertions. |
| packages/tests/unit/mcp-hubspot-tools.test.ts | Adds typed OAuth fixtures/token helper and fixes fetch typing casts. |
| packages/tests/unit/mcp-google-tools.test.ts | Adds typed OAuth fixtures/handler result typing; fixes fetch typing casts. |
| packages/tests/unit/internal-jwt-auth.test.ts | Uses typed env writes for NODE_ENV. |
| packages/tests/unit/field-encryption.test.ts | Adds typed rows and mock signatures to match repository interfaces. |
| packages/tests/unit/evm-rpc-proxy-route.test.ts | Fixes fetch typing casts. |
| packages/tests/unit/engagement-metrics/compute-metrics-cron.test.ts | Aligns mock signatures with implementation params. |
| packages/tests/unit/engagement-metrics/admin-metrics-api.test.ts | Aligns mock signature with implementation params. |
| packages/tests/unit/eliza-app/whatsapp-auth.test.ts | Casts string literals for stricter typing compatibility. |
| packages/tests/unit/eliza-app/telegram-ux-helpers.test.ts | Fixes fetch typing cast. |
| packages/tests/unit/docker-ssh-cloud-deploy.test.ts | Adjusts logger redact import strategy. |
| packages/tests/unit/database-url.test.ts | Uses typed env record for safe env mutation + asserts. |
| packages/tests/unit/compat-envelope.test.ts | Updates Milady sandbox fixture with new billing fields. |
| packages/tests/unit/api/route-test-helpers.ts | Improves routeParams typing and fixes File BlobPart typing. |
| packages/tests/unit/api/openapi-catalog.test.ts | Refactors test structure and simplifies OPTIONS invocation. |
| packages/tests/unit/anthropic-thinking.test.ts | Adds unit tests for new Anthropic thinking env + merge helpers. |
| packages/tests/unit/admin-service-pricing-route.test.ts | Tightens auth mock typing and request construction. |
| packages/tests/runtime/performance.test.ts | Fixes UUID typing casts for agentId. |
| packages/tests/runtime/mcp-assistant-trending.test.ts | Updates debug rendering mode and failure handling structure. |
| packages/tests/runtime/integration/runtime-factory/oauth-cache-invalidation.test.ts | Removes any for MCP settings lookup with typed cast. |
| packages/tests/runtime/integration/performance/runtime-creation.test.ts | Fixes UUID typing casts for agentId. |
| packages/tests/runtime/integration/message-handler/mcp-tools.test.ts | Updates debug rendering mode and failure handling structure. |
| packages/tests/playwright/fixtures/auth.fixture.ts | Uses APIRequestContext instead of inferred request type and removes any call. |
| packages/tests/load-env.ts | Uses typed env write for NODE_ENV. |
| packages/tests/integration/x402-topup.test.ts | Refactors service patch typing and restores methods after suite. |
| packages/tests/integration/webhooks-e2e.test.ts | Wraps fetch to auto-sign webhooks and preserves fetch.preconnect. |
| packages/tests/integration/unified-oauth-api.test.ts | Tightens it wrapper signature. |
| packages/tests/integration/services/users.service.test.ts | Adds fixture guard and tightens nullable handling. |
| packages/tests/integration/services/users-join-regression.test.ts | Adds runtime guard for relation shape. |
| packages/tests/integration/services/organizations.service.test.ts | Simplifies membership assertion. |
| packages/tests/integration/server-wallets.test.ts | Updates mocked returning rows and loosens assertion to toMatchObject. |
| packages/tests/integration/mcp-registry.test.ts | Tightens test wrapper signature. |
| packages/tests/integration/financial/credits-budget-flow.test.ts | Removes source field from addCredits call in test. |
| packages/tests/integration/financial/concurrent-operations.test.ts | Removes source field from addCredits call in test. |
| packages/tests/helpers/mock-milady-pricing-for-route-tests.ts | Adds helper to safely override Milady MINIMUM_DEPOSIT without clobbering constants. |
| packages/tests/helpers/index.ts | Re-exports the new Milady pricing mock helper. |
| packages/tests/fixtures/mcp-test-character.ts | Updates simpleTestCharacter id to UUID-like value. |
| packages/tests/e2e/setup-server.ts | Wraps fetch to ensure server availability and preserves fetch.preconnect. |
| packages/services/gateway-discord/src/gateway-manager.ts | Tightens discord.js handler typing, adds characterId routing, guards sendTyping, and listener map casts. |
| packages/scripts/check-types-split.ts | Updates typecheck chunk selection and excludes stories from temp configs. |
| packages/lib/services/twitter-automation/app-automation.ts | Threads new Anthropic COT merge helper into gateway generateText calls. |
| packages/lib/services/seo.ts | Threads new Anthropic COT merge helper into SEO Claude calls. |
| packages/lib/services/app-promotion.ts | Threads new Anthropic COT merge helper into promo copy generation. |
| packages/lib/services/app-promotion-assets.ts | Uses merge helpers for google image modalities + Anthropic COT for copy generation. |
| packages/lib/services/app-builder-ai-sdk.ts | Threads new Anthropic COT merge helper into streamText calls. |
| packages/lib/providers/types.ts | Broadens providerOptions typing to shared JSON-serializable merged options. |
| packages/lib/providers/index.ts | Exports new anthropic-thinking helpers. |
| packages/lib/providers/cloud-provider-options.ts | Adds shared merged providerOptions type alias. |
| packages/lib/providers/anthropic-thinking.ts | Adds env parsing + merge helpers for Anthropic extended thinking. |
| packages/lib/middleware/rate-limit.ts | Makes preset numeric limits consistent across dev/prod; clarifies logging/docs. |
| packages/lib/eliza/runtime-factory.ts | Changes plugin-sql adapter import to default + casted factory signature. |
| packages/lib/config/env-validator.ts | Adds validation rule for ANTHROPIC_COT_BUDGET. |
| packages/lib/auth/jwks.ts | Updates jose KeyLike typings for cached keys. |
| packages/lib/api/a2a/skills.ts | Threads merge helpers into skill chat + image generation paths. |
| packages/content/changelog.mdx | Adds Mar 27, 2026 entry for Anthropic COT + Milady testing fixes. |
| package.json | Adds per-package typecheck scripts and updates unit test split scripts for renamed file. |
| docs/unit-testing-milady-mocks.md | Documents Bun mock.module pitfalls and Milady pricing mock strategy. |
| docs/anthropic-cot-budget.md | Documents ANTHROPIC_COT_BUDGET behavior and providerOptions merge rationale. |
| docs/ROADMAP.md | Records completed items for COT budget + Milady test stabilization. |
| app/api/v1/responses/route.ts | Uses gateway+Anthropic merge helper for providerOptions injection. |
| app/api/v1/messages/route.ts | Threads merge helper into non-stream and stream AI SDK calls. |
| app/api/v1/generate-image/route.ts | Uses merge helpers for google image modalities and optional Anthropic COT. |
| app/api/v1/chat/route.ts | Threads merge helper into chat streaming call. |
| app/api/v1/chat/completions/route.ts | Threads merge helper into streaming and non-streaming calls. |
| app/api/v1/admin/service-pricing/tests/route.integration.test.ts | Standardizes NextRequest construction via URL base. |
| app/api/mcp/tools/generation.ts | Threads merge helpers into MCP generation (text + image) tools. |
| app/api/agents/[id]/mcp/route.ts | Threads merge helper into tool-call streaming. |
| app/api/agents/[id]/a2a/route.ts | Threads merge helper into A2A streaming. |
| README.md | Adds links to new “WHY” engineering docs and clarifies bun run test staging. |
| CHANGELOG.md | Adds Keep-a-Changelog style engineering changelog entries for these changes. |
| .github/workflows/tests.yml | Adds check-types:tests to CI. |
| .env.example | Documents optional ANTHROPIC_COT_BUDGET env var. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| const createDatabaseAdapter = createDatabaseAdapterDefault as ( | ||
| config: { postgresUrl: string }, | ||
| agentId: UUID, | ||
| ) => IDatabaseAdapter; |
There was a problem hiding this comment.
@elizaos/plugin-sql/node is being imported as a default export and then cast to a callable adapter factory. This bypasses type safety and can hide a runtime mismatch if the default export isn’t actually callable (or if its signature changes). Consider updating the local declare module "@elizaos/plugin-sql/node" typings to correctly model the default export (e.g., a callable function with any attached properties like schema) and then import it with the accurate type, rather than casting unknown here.
| const createDatabaseAdapter = createDatabaseAdapterDefault as ( | |
| config: { postgresUrl: string }, | |
| agentId: UUID, | |
| ) => IDatabaseAdapter; | |
| type CreateDatabaseAdapterFn = ( | |
| config: { postgresUrl: string }, | |
| agentId: UUID, | |
| ) => IDatabaseAdapter; | |
| function ensureCreateDatabaseAdapter(fn: unknown): CreateDatabaseAdapterFn { | |
| if (typeof fn !== "function") { | |
| throw new TypeError( | |
| 'Default export from "@elizaos/plugin-sql/node" is not a callable database adapter factory', | |
| ); | |
| } | |
| return fn as CreateDatabaseAdapterFn; | |
| } | |
| const createDatabaseAdapter = ensureCreateDatabaseAdapter( | |
| createDatabaseAdapterDefault, | |
| ); |
There was a problem hiding this comment.
Dismissed: Comment targets line 21 but identifier schema not found in file — code may have been removed or renamed
There was a problem hiding this comment.
Fixed in 6455e14.
There was a problem hiding this comment.
Dismissed: Line 36-45 now have type checking and casting with validation before using the default export
| ANTHROPIC_COT_BUDGET: { | ||
| required: false, | ||
| description: | ||
| "Default Anthropic extended-thinking token budget when a character omits settings.anthropicThinkingBudgetTokens. Unset or 0 disables unless the character sets a positive budget", | ||
| validate: (value: string) => { | ||
| const trimmed = value.trim(); | ||
| if (trimmed === "") { | ||
| return false; | ||
| } | ||
| if (!/^\d+$/.test(trimmed)) { | ||
| return false; | ||
| } | ||
| const n = Number.parseInt(trimmed, 10); | ||
| return n >= 0 && n <= Number.MAX_SAFE_INTEGER; | ||
| }, | ||
| errorMessage: | ||
| "Must be a non-negative integer string (0 = off; when enabling, Anthropic often expects at least ~1024)", | ||
| }, |
There was a problem hiding this comment.
ANTHROPIC_COT_BUDGET parsing in anthropic-thinking.ts throws on invalid non-empty values, but the env validator treats invalid optional values as warnings (startup continues). That means a mis-set budget can turn into request-time exceptions across any route that spreads mergeAnthropicCotProviderOptions(...). Either make invalid values for this variable a hard validation error (fail fast at startup), or change the parsing/merge path to degrade safely (e.g., log once and treat as "off").
There was a problem hiding this comment.
Fixed in fe10e6b.
There was a problem hiding this comment.
Fixed in 18bfb11.
There was a problem hiding this comment.
No change in this file this run; manual review if still needed.
| // Importing via the file-system path with a cache-buster query param | ||
| // guarantees we always get the real implementation regardless of mocks. | ||
| import { redact } from "../../lib/utils/logger.ts?_real"; | ||
| import { redact } from "../../lib/utils/logger"; |
There was a problem hiding this comment.
The comment above says this import uses a cache-buster query param to avoid mock.module("@/lib/utils/logger") poisoning, but the import no longer includes any cache-buster. If other unit files mock the logger module earlier in the same Bun process, this test may again receive the mocked module and lose the real redact export. Either restore the cache-buster (or another unmocked import strategy) or update the comment and ensure redact is always available in the mocked logger module(s).
There was a problem hiding this comment.
Fixed in fe10e6b.
There was a problem hiding this comment.
Dismissed: Line 16 still includes the cache-buster query param ?v=docker-ssh-test as required
There was a problem hiding this comment.
Dismissed: Line 16 still includes the cache-buster query param ?v=docker-ssh-test as required
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Iteration 1 prr-fix:prrc_kwdop6btdm6yyjtn
Iteration 1 prr-fix:prrc_kwdop6btdm6yyjt7
Changes: - rate-limit.ts: PR Review: fix(types): test project, split check-types, UI package, gateways ...
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
Code Review — PR #417: Anthropic CoT ControlsOverall the approach is sound: centralizing thinking-budget resolution in a single helper, keeping budgets off request bodies, and wiring env defaults through a type-safe merge layer. A few issues worth addressing before merge. 🐛 Potential Bugs1. Temperature conflict in Anthropic's API rejects requests with // seo.ts
const { text } = await generateText({
model: gateway.languageModel(modelId),
temperature: 0.3, // ← will cause API error when ANTHROPIC_COT_BUDGET is set
...mergeAnthropicCotProviderOptions(modelId),
...// app-promotion.ts
await generateText({
model: gateway.languageModel(promoModel),
temperature: 0.7, // ← same risk
...mergeAnthropicCotProviderOptions(promoModel),If the AI SDK Gateway path does not strip 2. In const cotOpts = isOpenAIModel
? mergeAnthropicCotProviderOptions(imageModel) // returns {} for OpenAI — no-op
: mergeGoogleImageModalitiesWithAnthropicCot(imageModel); // applies google.responseModalities to ANY non-OpenAI modelBefore this PR,
|
Iteration 4 prr-fix:ic-4180968930-2
Code Review: feat: Anthropic CoT controlsOverall this is well-structured work with clear security reasoning (budgets must not come from untrusted callers). The core library in Bug:
|
PR Review: feat: Anthropic CoT controlsGood overall approach — putting the thinking budget on the character record rather than in API request bodies is the right call for trust boundary reasons, and the resolver hierarchy (per-agent → env default → env max cap) is clean. Unit test coverage in Bug (High) — A2A route missing File: The MCP route correctly sets The A2A route also uses only 500 tokens as the base for credit reservation ( Bug (Medium) — File: When Stale test file at repo root File: This file imports from Dead exported constant File:
Minor — File:
Positive notes
|
Iteration 1 prr-fix:ic-4180483329-5
Iteration 1 prr-fix:prrc_kwdop6btdm60pqio
|
PR Review findings - see below |
|
PR Review: feat: Anthropic CoT controls Overall this is well-structured work. Good separation of concerns in anthropic-thinking.ts, correct security rationale (budgets live on character records, not request bodies), and thorough documentation. A few concrete issues to address before merging. BUGS
In both handleNonStream and handleStream (app/api/v1/messages/route.ts): This only ensures max_tokens >= budgetTokens. Anthropic requires room for the actual response ON TOP of the thinking budget (thinking tokens count against max_tokens). Every other route in this PR adds a buffer:
/messages will produce API rejections when ANTHROPIC_COT_BUDGET is set and request.max_tokens is close to cotBudget.
anthropic-thinking.test.ts (repo root) imports from ./src/lib/anthropic-thinking and tests validateBudgetTokens, getThinkingConfig, buildThinkingParam - none of which exist in packages/lib/providers/anthropic-thinking.ts. The actual exports are resolveAnthropicThinkingBudgetTokens, parseAnthropicCotBudgetFromEnv, etc. This test either targets a file not in this PR, or was written against an earlier API. Either way it does not cover the shipped code. The docs reference packages/tests/unit/anthropic-thinking.test.ts but that path is not in the diff.
MCP uses DEFAULT_MIN_OUTPUT_TOKENS (4096) as its buffer; A2A uses 500. DEFAULT_BASE_OUTPUT_TOKENS = 4096 is exported from app/api/a2a/route.ts with a comment saying handlers should import it, but handleChat in the same file uses hardcoded 500. A2A thinking requests will be systematically underreserved vs MCP. POTENTIAL ISSUES
Previous: dev mode got 10,000 req/window (effectively unlimited). Reasonable change, but silent breakage for existing .env.local. A note in CHANGELOG.md pointing devs to add RATE_LIMIT_MULTIPLIER=1000 to .env.local would prevent confusion.
Tests only cover claude-3-5-sonnet, claude-3-opus, claude-3.5-sonnet. The model patterns also include claude-3-7-sonnet, claude-sonnet-4, claude-opus-4 - none tested. These are the newest and most likely production models. MINOR NITS
SUMMARY
|
Iteration 3 prr-fix:ic-4181043091-0
app/api/v1/messages/route.ts
Outdated
| const cotBudget = resolveAnthropicThinkingBudgetTokens(model, process.env); | ||
| // Passing resolved budget to mergeAnthropicCotProviderOptions short-circuits its internal resolution | ||
| const cotOptions = cotBudget != null ? mergeAnthropicCotProviderOptions(model, process.env, cotBudget) : {}; | ||
| const effectiveMaxTokens = cotBudget ? Math.max(request.max_tokens ?? 0, cotBudget) : request.max_tokens; |
There was a problem hiding this comment.
Missing response token buffer in effectiveMaxTokens calculation
High Severity
effectiveMaxTokens is computed as Math.max(request.max_tokens ?? 0, cotBudget) without adding any buffer for actual response generation. The chat/completions route correctly uses Math.max(requestMaxTokens ?? MIN_RESPONSE_TOKENS, cotBudget + MIN_RESPONSE_TOKENS) (adding 4096). When request.max_tokens is unset, all tokens go to thinking with zero left for the response, producing empty output.
Additional Locations (2)
| /claude-3-opus/, // Claude 3 Opus | ||
| /claude-sonnet-4/, // Claude Sonnet 4 | ||
| /claude-opus-4/, // Claude Opus 4 | ||
| ]; |
There was a problem hiding this comment.
Claude 3.5 Sonnet and 3 Opus incorrectly flagged as thinking-capable
High Severity
EXTENDED_THINKING_MODEL_PATTERNS includes claude-3[.-]5-sonnet and claude-3-opus, but per Anthropic docs, extended thinking is only supported starting from Claude 3.7 Sonnet, Claude Sonnet 4, Claude Opus 4, and Claude Haiku 4.5. Enabling thinking for Claude 3.5 Sonnet or Claude 3 Opus will cause Anthropic API errors, breaking requests when ANTHROPIC_COT_BUDGET is set.
Code Review: PR #417 — Anthropic CoT ControlsOverall this is a well-designed feature with solid documentation and clear reasoning for design choices (e.g., keeping thinking budgets on character records, not in request bodies). Below are the issues I found, ordered by severity. Critical1. A2A route:
const result = await streamText({
model: gateway.languageModel(model),
messages: fullMessages,
...mergeAnthropicCotProviderOptions(model, process.env, agentThinkingBudget),
// maxOutputTokens is missing
});Anthropic's API requires 2. A2A credit estimation uses 500-token base, contradicting its own comment In A2A // Add thinking budget to base output estimate (500 tokens) to match MCP route behavior
const maxOutputTokens = effectiveThinkingBudget != null ? 500 + effectiveThinkingBudget : undefined;The comment says "to match MCP route behavior" but MCP uses
Fix: use High Priority3. const MIN_RESPONSE_BUFFER = 1000; // messages/route.ts
4. Token buffer constants are scattered and inconsistent
Medium5. The file 6. The function uses regex patterns to detect supported models. Claude 3.7 Sonnet and Claude 4.x variants support extended thinking as of 2026-04-02. If those patterns are not in the list, thinking will silently be disabled for users on newer models. Worth auditing against Anthropic's current model lineup. 7. export const DEFAULT_BASE_OUTPUT_TOKENS = 4096;
// JSDoc: "This constant should be imported by @/lib/api/a2a handlers"
Positive Notes
SummaryThe two critical issues (missing |
Iteration 1 prr-fix:prrc_kwdop6btdm60pwi0
Iteration 2 prr-fix:ic-4180539773-2
|
PR Review: Anthropic CoT Controls - see full review below |
Code ReviewOverall this is solid work — the security model (thinking budgets from character settings, not request bodies) is sound and well-documented, the merge helpers are clean, and the test infrastructure fixes address real Bun flakiness. I found a few bugs and inconsistencies worth addressing before merge. Bug: 4-arg call to
|
|
PR Review: Anthropic CoT Controls (4 bugs, 1 test gap, 2 quality issues) BUG 1 - chat/completions/route.ts: Extra arg discards cotBudget. mergeAnthropicCotProviderOptions(model, process.env, {}, cotBudget) passes {} as agentThinkingBudgetTokens (type error) and ignores cotBudget (4th arg). Fix: mergeAnthropicCotProviderOptions(model, process.env, cotBudget). BUG 2 - agents/[id]/a2a/route.ts: maxOutputTokens computed for credit reservation but never passed to streamText. Anthropic API requires max_tokens >= budget_tokens when thinking active; will reject all CoT-enabled A2A requests. Also base is 500 here vs 4096 everywhere else despite comment claiming MCP parity. BUG 3 - packages/lib/api/a2a/skills.ts: Math.max(options.maxTokens ?? 4096, cotBudget) leaves zero tokens for actual response when cotBudget >= maxTokens. All other routes use cotBudget + MIN_RESPONSE_TOKENS. Fix: Math.max(options.maxTokens ?? 4096, cotBudget + 4096). BUG 4 - chat/completions/route.ts and chat/route.ts: ...(effectiveMaxTokens && {...}) silently drops 0. Use != null. TEST GAP - anthropic-thinking.test.ts (root) imports ./src/lib/anthropic-thinking and tests validateBudgetTokens, getThinkingConfig, buildThinkingParam -- none of which exist in packages/lib/providers/anthropic-thinking.ts. Leftover prototype tests, not covering actual production code. The packages/tests/unit/anthropic-thinking.test.ts mentioned in docs is absent from this diff. QUALITY - 4096 appears as 5 different constants (MIN_RESPONSE_TOKENS, MIN_RESPONSE_BUFFER, DEFAULT_MIN_OUTPUT_TOKENS, DEFAULT_BASE_OUTPUT_TOKENS) across routes. Extract a single shared constant. QUALITY - messages/route.ts comment 'short-circuits internal resolution' is misleading; passing resolved cotBudget as agentThinkingBudgetTokens just runs resolveAnthropicThinkingBudgetTokens twice (idempotent but not a short-circuit). |
Iteration 1 prr-fix:ic-4180719820-4 prr-fix:ic-4181094158-0
Explains reasoning for dismissed issues inline in code
| const result = await streamText({ | ||
| model: gateway.languageModel(model), | ||
| messages: fullMessages, | ||
| ...mergeAnthropicCotProviderOptions(model, process.env, agentThinkingBudget), |
There was a problem hiding this comment.
A2A route missing maxOutputTokens for thinking-enabled streamText
High Severity
The per-agent A2A route enables Anthropic extended thinking via mergeAnthropicCotProviderOptions but never passes maxOutputTokens to streamText. The Anthropic API requires max_tokens >= budget_tokens when thinking is enabled. The MCP route correctly computes and passes maxOutputTokens, but the A2A route omits it entirely, which will cause API rejections when the thinking budget exceeds the SDK's default max tokens. Additionally, the cost estimation base is hardcoded to 500 instead of using 4096 like the MCP route, leading to systematic underbilling.
Additional Locations (1)
| // Resolve CoT budget once to compute both provider options and maxOutputTokens | ||
| const cotBudget = resolveAnthropicThinkingBudgetTokens(model, process.env); | ||
| // When thinking is enabled, maxOutputTokens must be >= budgetTokens or Anthropic API rejects | ||
| const effectiveMaxTokens = cotBudget != null ? Math.max(options.maxTokens ?? 4096, cotBudget) : options.maxTokens; |
There was a problem hiding this comment.
Skills route missing response buffer in maxOutputTokens calculation
Medium Severity
The effectiveMaxTokens calculation uses Math.max(options.maxTokens ?? 4096, cotBudget) without adding a buffer for actual response generation. When cotBudget exceeds the base (e.g. 10000), thinking consumes all available tokens leaving zero for the response. Every other route adds + 4096 (or MIN_RESPONSE_TOKENS) on top of the thinking budget to ensure room for the actual output.
| abortSignal: AbortSignal | undefined, | ||
| timeoutMs: number, | ||
| settleReservation: (actualCost: number) => Promise<void>, | ||
| cotBudget: number | null, |
There was a problem hiding this comment.
Unused cotBudget parameter in streaming request handler
Low Severity
The cotBudget parameter is passed to handleStreamingRequest but never referenced inside the function body. Only cotOptions and effectiveMaxTokens are used. The non-streaming handler correctly omits this parameter. This is dead code that adds confusion about whether streaming has additional CoT logic.
Code ReviewOverall this is a well-structured PR. The core Issues1. Stale root-level test file
2. Null-check inconsistency in Lines 347–349 use const estimatedOutputTokens = effectiveThinkingBudget != null
? baseOutputTokens + effectiveThinkingBudget
: baseOutputTokens;But line 379 uses a bare truthy check: const maxOutputTokens = effectiveThinkingBudget
? Math.max(...) + DEFAULT_MIN_OUTPUT_TOKENS
: undefined;Since 3. The doc comment acknowledges this: keys outside 4. Rate limit multiplier is frozen at module load const rateLimitMultiplier = getRateLimitMultiplier(); // evaluated once
export const RateLimitPresets = { STANDARD: { maxRequests: 60 * rateLimitMultiplier }, ... };On Vercel serverless this is fine (cold-start per deployment). On long-lived Node processes (e.g. gateway-discord, agent-server), a change to 5. The comment says "actual rate limiting is still enforced" when Minor / Nits
Security ✅
Test Coverage ✅
|
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 8 total unresolved issues (including 7 from previous reviews).
Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
| const cotBudget = resolveAnthropicThinkingBudgetTokens(selectedModel, process.env); | ||
| const effectiveMaxOutputTokens = cotBudget | ||
| ? Math.max(DEFAULT_MIN_OUTPUT_TOKENS, cotBudget + DEFAULT_MIN_OUTPUT_TOKENS) | ||
| : undefined; |
There was a problem hiding this comment.
Inconsistent nullish checks for CoT budget across routes
Medium Severity
The cotBudget variable is checked with truthiness (cotBudget ? and effectiveMaxOutputTokens ?) instead of != null. While resolveAnthropicThinkingBudgetTokens returns null (not 0) for disabled, this pattern is inconsistent with other routes (e.g. chat/completions uses != null checks). More critically, if cotBudget is null, the call to mergeAnthropicCotProviderOptions at line 308 passes cotBudget ?? undefined — but if cotBudget were somehow 0, the truthiness check would skip setting maxOutputTokens while still passing 0 to the merge function.
Code Review: PR #417 — Anthropic CoT ControlsOverall this is a well-structured feature. The core architecture is solid: budget lives on the character record (not in request bodies), the env-based fallback/cap chain is clearly reasoned, and the explicit disable ( 🔴 CriticalTest file wrong import path — Also, 🟠 High — A2A route under-bills extended thinking
// A2A: only 500 base tokens added
const maxOutputTokens = effectiveThinkingBudget != null ? 500 + effectiveThinkingBudget : undefined;The comment says "to match MCP route behavior" but MCP uses Fix: use the same formula as MCP — Also: MCP passes 🟡 Medium — MCP token calculation inconsistency
// Credit reservation:
estimatedOutputTokens = baseOutputTokens + effectiveThinkingBudget // 4096 + budget
// maxOutputTokens for API call:
maxOutputTokens = Math.max(DEFAULT_MIN_OUTPUT_TOKENS, effectiveThinkingBudget) + DEFAULT_MIN_OUTPUT_TOKENS
// = Math.max(4096, budget) + 4096For 🟡 Medium — Rate limit dev mode behavior change is a silent breaking change
🟡 Medium — Duplicated token calculation across v1 routesThe pattern: const cotBudget = resolveAnthropicThinkingBudgetTokens(model, process.env);
const effectiveMaxTokens = cotBudget != null
? Math.max(MIN_RESPONSE_BUFFER, cotBudget + MIN_RESPONSE_BUFFER)
: undefined;...is copy-pasted into 🟡 Medium —
|



Note: services/operator (Pepr) is not part of the main workspace typecheck.
Made-with: Cursor
Note
Medium Risk
Changes how multiple API routes reserve credits and set
maxOutputTokens/providerOptionswhen calling Anthropic models, which can affect billing accuracy and request acceptance. Also alters rate-limit behavior in development via a new multiplier env var, so misconfiguration could unexpectedly throttle or relax limits outside prod.Overview
Enables per-agent Anthropic extended thinking by reading
user_characters.settings.anthropicThinkingBudgetTokens, falling back toANTHROPIC_COT_BUDGET, and optionally clamping withANTHROPIC_COT_BUDGET_MAX; introduces shared helpers inpackages/lib/providers/anthropic-thinking.tsto resolve budgets and deep-mergeproviderOptionswithout clobbering existinggateway/googlesettings.Updates MCP/A2A agent chat (and several v1 AI endpoints) to apply the resolved thinking budget, adjust
maxOutputTokensto satisfy Anthropic constraints, and include the budget in credit reservation/estimation; background services (SEO/promo/app builder) explicitly disable CoT (budget=0) to preserve temperature behavior. Separately, rate limiting now uses production-equivalent numeric presets in dev with an optionalRATE_LIMIT_MULTIPLIER, CI adds test-project typechecking, and the PR includes targeted test/mocking hardening (notably Milady pricing mocks) plus new docs (docs/anthropic-cot-budget.md,docs/unit-testing-milady-mocks.md) and an engineeringCHANGELOG.md.Written by Cursor Bugbot for commit 9272304. This will update automatically on new commits. Configure here.
Summary by CodeRabbit
New Features
Documentation
Bug Fixes
Chores