fix(ai): report finalization-call usage on synthesized RUN_FINISHED #640
tombeckenham wants to merge 1 commit into
Adapters without a native `structuredOutputStream` (Anthropic, Gemini, Ollama, OpenRouter) ran agentic structured output through the legacy finalization round-trip, and that extra call's token usage never reached the chat middleware `onUsage` hook, so any cost-tracking middleware silently under-counted by exactly one call per request.

Adapters now return `usage` from `structuredOutput()`, and `fallbackStructuredOutputStream` forwards it onto the synthesized RUN_FINISHED event. `StructuredOutputResult` gains an optional `usage` field; adapters that can't report it leave it undefined, which the engine treats as "no usage to report", same as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
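The forwarding step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `TokenUsage`, `RunFinishedEvent`, and `synthesizeRunFinished` names are hypothetical, and the real types in `@tanstack/ai` may be shaped differently.

```typescript
// Hypothetical shapes for illustration only; the real types may differ.
interface TokenUsage {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

interface StructuredOutputResult<T> {
  data: T
  rawText: string
  // undefined means "not reported", which is different from zero tokens
  usage?: TokenUsage
}

interface RunFinishedEvent {
  type: 'RUN_FINISHED'
  usage?: TokenUsage
}

// Build the synthesized event, attaching usage only when the adapter reported it.
function synthesizeRunFinished<T>(
  result: StructuredOutputResult<T>,
): RunFinishedEvent {
  return result.usage === undefined
    ? { type: 'RUN_FINISHED' }
    : { type: 'RUN_FINISHED', usage: result.usage }
}
```

Omitting the key entirely, rather than setting it to a zero-filled object, keeps "unknown" distinguishable from "known zero" for any downstream hook.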
📝 Walkthrough
Token usage from provider APIs is now exposed in the non-streaming
Changes: Structured Output Token Usage
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
🚀 Changeset Version Preview: 11 package(s) bumped directly, 16 bumped as dependents.
@tanstack/ai
@tanstack/ai-anthropic
@tanstack/ai-client
@tanstack/ai-code-mode
@tanstack/ai-code-mode-skills
@tanstack/ai-devtools-core
@tanstack/ai-elevenlabs
@tanstack/ai-event-client
@tanstack/ai-fal
@tanstack/ai-gemini
@tanstack/ai-grok
@tanstack/ai-groq
@tanstack/ai-isolate-cloudflare
@tanstack/ai-isolate-node
@tanstack/ai-isolate-quickjs
@tanstack/ai-ollama
@tanstack/ai-openai
@tanstack/ai-openrouter
@tanstack/ai-preact
@tanstack/ai-react
@tanstack/ai-react-ui
@tanstack/ai-solid
@tanstack/ai-solid-ui
@tanstack/ai-svelte
@tanstack/ai-utils
@tanstack/ai-vue
@tanstack/ai-vue-ui
@tanstack/openai-base
@tanstack/preact-ai-devtools
@tanstack/react-ai-devtools
@tanstack/solid-ai-devtools
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/typescript/ai-anthropic/src/adapters/text.ts`:
- Around line 279-288: The code currently forces a usage object with zeros by
defaulting response.usage fields to 0; change it to only include the usage
property when response.usage is present so unknown usage remains unknown. In the
function returning { data: parsed, rawText, usage: ... }, check if
response.usage != null (or typeof response.usage !== 'undefined') and only then
compute inputTokens = response.usage.input_tokens, outputTokens =
response.usage.output_tokens and return usage:{ promptTokens: inputTokens,
completionTokens: outputTokens, totalTokens: inputTokens + outputTokens };
otherwise omit the usage field (or set it to undefined).
In `@packages/typescript/ai-ollama/src/adapters/text.ts`:
- Around line 204-214: The current return always sets usage with promptTokens
and completionTokens defaulted to 0, collapsing "not reported" into zero; change
the construction in the adapter (around where parsed/rawText are returned) to
preserve optional semantics by using the raw response fields
(response.prompt_eval_count and response.eval_count) without defaulting to 0 and
only include the usage object when at least one of those fields is defined—i.e.,
set promptTokens = response.prompt_eval_count (no ?? 0), completionTokens =
response.eval_count (no ?? 0) and omit or set usage to undefined when both are
undefined so middleware sees "not reported" instead of 0.
📒 Files selected for processing (7)
- .changeset/structured-output-finalization-usage.md
- packages/typescript/ai-anthropic/src/adapters/text.ts
- packages/typescript/ai-gemini/src/adapters/text.ts
- packages/typescript/ai-ollama/src/adapters/text.ts
- packages/typescript/ai-openrouter/src/adapters/text.ts
- packages/typescript/ai/src/activities/chat/adapter.ts
- packages/typescript/ai/src/activities/chat/index.ts
const inputTokens = response.usage?.input_tokens ?? 0
const outputTokens = response.usage?.output_tokens ?? 0
return {
  data: parsed,
  rawText,
  usage: {
    promptTokens: inputTokens,
    completionTokens: outputTokens,
    totalTokens: inputTokens + outputTokens,
  },
Avoid forcing zero-usage when provider usage is missing.
Lines 279-288 always emit a `usage` object by defaulting to 0, which changes “unknown usage” into “known zero usage.” That can incorrectly fire `onUsage` with zero totals.
Proposed fix
- const inputTokens = response.usage?.input_tokens ?? 0
- const outputTokens = response.usage?.output_tokens ?? 0
+ const inputTokens = response.usage?.input_tokens
+ const outputTokens = response.usage?.output_tokens
return {
data: parsed,
rawText,
- usage: {
- promptTokens: inputTokens,
- completionTokens: outputTokens,
- totalTokens: inputTokens + outputTokens,
- },
+ ...(inputTokens !== undefined && outputTokens !== undefined
+ ? {
+ usage: {
+ promptTokens: inputTokens,
+ completionTokens: outputTokens,
+ totalTokens: inputTokens + outputTokens,
+ },
+ }
+ : {}),
}
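One way to express the suggestion above is a small mapping helper that returns `undefined` whenever the provider did not report both counters. This is a sketch under assumptions: `AnthropicUsageLike` and `toTokenUsage` are hypothetical names, and only the `input_tokens` / `output_tokens` field names come from the diff under review.

```typescript
interface TokenUsage {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

// Hypothetical stand-in for the provider's usage payload; only the two
// field names are taken from the diff under review.
interface AnthropicUsageLike {
  input_tokens?: number
  output_tokens?: number
}

// Map provider usage to the normalized shape, or undefined when unknown.
function toTokenUsage(
  raw: AnthropicUsageLike | null | undefined,
): TokenUsage | undefined {
  if (raw == null) return undefined
  const { input_tokens: input, output_tokens: output } = raw
  // Treat a partially reported payload as unknown instead of guessing zeros.
  if (input === undefined || output === undefined) return undefined
  return {
    promptTokens: input,
    completionTokens: output,
    totalTokens: input + output,
  }
}
```

A caller could then spread the result conditionally, for example `...(usage ? { usage } : {})`, so the `usage` key is absent when nothing was reported.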
const promptTokens = response.prompt_eval_count ?? 0
const completionTokens = response.eval_count ?? 0
return {
  data: parsed,
  rawText,
  usage: {
    promptTokens,
    completionTokens,
    totalTokens: promptTokens + completionTokens,
  },
}
Preserve optional usage semantics instead of defaulting to zeros.
Lines 204-214 always return `usage`, even when token counters are absent. That collapses “not reported” into 0 and can skew usage middleware behavior.
Proposed fix
- const promptTokens = response.prompt_eval_count ?? 0
- const completionTokens = response.eval_count ?? 0
+ const promptTokens = response.prompt_eval_count
+ const completionTokens = response.eval_count
return {
data: parsed,
rawText,
- usage: {
- promptTokens,
- completionTokens,
- totalTokens: promptTokens + completionTokens,
- },
+ ...(promptTokens !== undefined && completionTokens !== undefined
+ ? {
+ usage: {
+ promptTokens,
+ completionTokens,
+ totalTokens: promptTokens + completionTokens,
+ },
+ }
+ : {}),
}
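The same idea for the Ollama counters, sketched as a result builder. Note that the review text says "at least one of those fields is defined" while the proposed diff requires both; this sketch follows the diff. `OllamaCountsLike` and `buildStructuredResult` are hypothetical names used only for illustration.

```typescript
interface TokenUsage {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

interface StructuredOutputResult<T> {
  data: T
  rawText: string
  usage?: TokenUsage
}

// Hypothetical stand-in for the provider response; only the two counter
// names are taken from the diff under review.
interface OllamaCountsLike {
  prompt_eval_count?: number
  eval_count?: number
}

// Attach usage only when both counters were reported (matches the diff).
function buildStructuredResult<T>(
  data: T,
  rawText: string,
  response: OllamaCountsLike,
): StructuredOutputResult<T> {
  const promptTokens = response.prompt_eval_count
  const completionTokens = response.eval_count
  if (promptTokens === undefined || completionTokens === undefined) {
    return { data, rawText }
  }
  return {
    data,
    rawText,
    usage: {
      promptTokens,
      completionTokens,
      totalTokens: promptTokens + completionTokens,
    },
  }
}
```

Whether a single reported counter should still produce a `usage` object is a design choice the review leaves open; requiring both avoids a `totalTokens` that silently omits one side.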
Summary
Follow-up to #609. Adapters without a native `structuredOutputStream` (Anthropic, Gemini, Ollama, OpenRouter) run agentic structured output through the legacy finalization round-trip, but that extra call's token usage never reached the chat middleware `onUsage` hook, so any cost-tracking middleware silently under-counted by exactly one call per structured-output request.
- Adapters now return `usage` from `structuredOutput()`, sourced from the provider's response (Anthropic `usage.input_tokens` / `output_tokens`, Gemini `usageMetadata`, Ollama `prompt_eval_count` / `eval_count`, OpenRouter normalized `usage`).
- `fallbackStructuredOutputStream` forwards `usage` onto the synthesized `RUN_FINISHED` event so the chat middleware accounts for the finalization call.
- `StructuredOutputResult` gains an optional `usage: { promptTokens, completionTokens, totalTokens }` field. When the provider doesn't return tokens (or the call fails before usage is known) the field stays `undefined`, which the engine already treats as "no usage to report". There is no consumer-visible change beyond accurate `onUsage` totals.
Test plan
- `pnpm --filter @tanstack/ai --filter @tanstack/ai-anthropic --filter @tanstack/ai-gemini --filter @tanstack/ai-ollama --filter @tanstack/ai-openrouter test:types` (verified locally)
- `pnpm test:lib` for the affected packages
- … `onUsage` middleware and confirm the finalization call's tokens are included
🤖 Generated with Claude Code
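The last check in the test plan could be approximated with a small usage tracker like the one below. It is only a sketch: `createUsageTracker` and `dispatchUsage` are hypothetical helpers, and the real `onUsage` middleware signature in `@tanstack/ai` is assumed rather than confirmed.

```typescript
interface TokenUsage {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

// Hypothetical cost tracker; the real middleware hook signature may differ.
function createUsageTracker() {
  let calls = 0
  const total: TokenUsage = { promptTokens: 0, completionTokens: 0, totalTokens: 0 }
  return {
    onUsage(usage: TokenUsage): void {
      calls += 1
      total.promptTokens += usage.promptTokens
      total.completionTokens += usage.completionTokens
      total.totalTokens += usage.totalTokens
    },
    snapshot(): { calls: number; total: TokenUsage } {
      return { calls, total: { ...total } }
    },
  }
}

// Stand-in for the engine: invoke the hook only for events that carry usage.
function dispatchUsage(
  events: Array<{ usage?: TokenUsage }>,
  onUsage: (usage: TokenUsage) => void,
): void {
  for (const event of events) {
    if (event.usage !== undefined) onUsage(event.usage)
  }
}
```

Feeding it one event for the agentic call, one for the finalization call, and one with no usage should yield two `onUsage` invocations whose totals include the finalization tokens.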
Summary by CodeRabbit
`onUsage` hooks now receive accurate token count totals, eliminating previous under-reporting in certain adapter paths.