11 changes: 11 additions & 0 deletions .changeset/structured-output-finalization-usage.md
@@ -0,0 +1,11 @@
---
'@tanstack/ai': patch
'@tanstack/ai-anthropic': patch
'@tanstack/ai-gemini': patch
'@tanstack/ai-ollama': patch
'@tanstack/ai-openrouter': patch
---

Adapters now report token `usage` from the non-streaming `structuredOutput()` call, and `fallbackStructuredOutputStream` forwards it onto the synthesized `RUN_FINISHED` event. Previously the legacy finalization round-trip was invisible to the chat middleware `onUsage` hook — any cost-tracking middleware silently under-counted by exactly one call whenever an adapter without a native `structuredOutputStream` (Anthropic, Gemini, Ollama, OpenRouter) ran agentic structured output through the legacy path.

`StructuredOutputResult` gains an optional `usage: { promptTokens, completionTokens, totalTokens }` field. Adapters without a token count on the wire (or that fail before usage is known) leave it `undefined`, which the engine treats as "no usage to report" — same as before. No consumer-visible behavior change beyond accurate `onUsage` totals.
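
A minimal sketch of the under-count described above, under stated assumptions: `UsageTracker` and `reportCall` are hypothetical stand-ins for a cost-tracking middleware and the engine's reporting step, not the actual `@tanstack/ai` API. Only the `usage` field shape comes from this changeset, and the token counts are made up for illustration.

```typescript
// Usage shape as described in the changeset.
type Usage = {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

// Hypothetical stand-in for a cost-tracking middleware with an `onUsage` hook.
class UsageTracker {
  calls = 0
  totalTokens = 0

  onUsage(usage: Usage): void {
    this.calls += 1
    this.totalTokens += usage.totalTokens
  }
}

// Hypothetical engine step: the hook fires only when a call reported usage.
function reportCall(tracker: UsageTracker, usage?: Usage): void {
  if (usage) tracker.onUsage(usage)
}

// Two agent-loop calls followed by the structured-output finalization call.
const loopCalls: Usage[] = [
  { promptTokens: 100, completionTokens: 20, totalTokens: 120 },
  { promptTokens: 150, completionTokens: 30, totalTokens: 180 },
]
const finalization: Usage = {
  promptTokens: 200,
  completionTokens: 50,
  totalTokens: 250,
}

// Before: the finalization result carried no usage, so it was never counted.
const before = new UsageTracker()
for (const u of loopCalls) reportCall(before, u)
reportCall(before, undefined)

// After: the adapter returns usage and it reaches the hook.
const after = new UsageTracker()
for (const u of loopCalls) reportCall(after, u)
reportCall(after, finalization)

console.log(before.calls, before.totalTokens) // 2 300
console.log(after.calls, after.totalTokens) // 3 550
```

In this sketch the tracker misses exactly one call, the finalization round-trip, which is the gap the patch closes.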
7 changes: 7 additions & 0 deletions packages/typescript/ai-anthropic/src/adapters/text.ts
@@ -276,9 +276,16 @@ export class AnthropicTextAdapter<
         }
       }

+      const inputTokens = response.usage?.input_tokens ?? 0
+      const outputTokens = response.usage?.output_tokens ?? 0
       return {
         data: parsed,
         rawText,
+        usage: {
+          promptTokens: inputTokens,
+          completionTokens: outputTokens,
+          totalTokens: inputTokens + outputTokens,
+        },
Comment on lines +279 to +288
Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid forcing zero-usage when provider usage is missing.

Lines 279–288 always emit a usage object by defaulting missing counts to 0, which turns “unknown usage” into “known zero usage.” That can incorrectly fire `onUsage` with zero totals.

Proposed fix
-      const inputTokens = response.usage?.input_tokens ?? 0
-      const outputTokens = response.usage?.output_tokens ?? 0
+      const inputTokens = response.usage?.input_tokens
+      const outputTokens = response.usage?.output_tokens
       return {
         data: parsed,
         rawText,
-        usage: {
-          promptTokens: inputTokens,
-          completionTokens: outputTokens,
-          totalTokens: inputTokens + outputTokens,
-        },
+        ...(inputTokens !== undefined && outputTokens !== undefined
+          ? {
+              usage: {
+                promptTokens: inputTokens,
+                completionTokens: outputTokens,
+                totalTokens: inputTokens + outputTokens,
+              },
+            }
+          : {}),
       }
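
The same idea can be sketched in isolation. This is a minimal illustration, assuming hypothetical helpers `toUsage` and `buildResult` that are not part of the adapter; only the `data` / `rawText` / `usage` result shape is taken from the PR.

```typescript
type Usage = {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

// Result shape from the PR: `usage` is optional.
type StructuredResult<T> = {
  data: T
  rawText: string
  usage?: Usage
}

// Hypothetical helper: build a usage object only when both counts
// were actually reported by the provider.
function toUsage(inputTokens?: number, outputTokens?: number): Usage | undefined {
  if (inputTokens === undefined || outputTokens === undefined) return undefined
  return {
    promptTokens: inputTokens,
    completionTokens: outputTokens,
    totalTokens: inputTokens + outputTokens,
  }
}

// Hypothetical builder: leave the `usage` key off entirely when unknown.
function buildResult<T>(data: T, rawText: string, usage?: Usage): StructuredResult<T> {
  const result: StructuredResult<T> = { data, rawText }
  if (usage !== undefined) result.usage = usage
  return result
}

const raw = '{"ok":true}'
const missing = buildResult({ ok: true }, raw, toUsage(undefined, undefined))
const zero = buildResult({ ok: true }, raw, toUsage(0, 0))
const reported = buildResult({ ok: true }, raw, toUsage(12, 3))

console.log('usage' in missing) // false: unknown stays unknown
console.log(zero.usage?.totalTokens) // 0: a genuinely reported zero is kept
console.log(reported.usage?.totalTokens) // 15
```

Checking for `undefined` rather than truthiness keeps a provider-reported `0` distinct from a missing count.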
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/typescript/ai-anthropic/src/adapters/text.ts` around lines 279 -
288, The code currently forces a usage object with zeros by defaulting
response.usage fields to 0; change it to only include the usage property when
response.usage is present so unknown usage remains unknown. In the function
returning { data: parsed, rawText, usage: ... }, check if response.usage != null
(or typeof response.usage !== 'undefined') and only then compute inputTokens =
response.usage.input_tokens, outputTokens = response.usage.output_tokens and
return usage:{ promptTokens: inputTokens, completionTokens: outputTokens,
totalTokens: inputTokens + outputTokens }; otherwise omit the usage field (or
set it to undefined).

       }
     } catch (error: unknown) {
       const err = error as Error
8 changes: 8 additions & 0 deletions packages/typescript/ai-gemini/src/adapters/text.ts
@@ -196,9 +196,17 @@ export class GeminiTextAdapter<
           )
         }

+      const usageMetadata = result.usageMetadata
       return {
         data: parsed,
         rawText,
+        ...(usageMetadata && {
+          usage: {
+            promptTokens: usageMetadata.promptTokenCount ?? 0,
+            completionTokens: usageMetadata.candidatesTokenCount ?? 0,
+            totalTokens: usageMetadata.totalTokenCount ?? 0,
+          },
+        }),
       }
     } catch (error) {
       logger.errors('gemini.structuredOutput fatal', {
7 changes: 7 additions & 0 deletions packages/typescript/ai-ollama/src/adapters/text.ts
@@ -201,9 +201,16 @@ export class OllamaTextAdapter<TModel extends string> extends BaseTextAdapter<
           )
         }

+      const promptTokens = response.prompt_eval_count ?? 0
+      const completionTokens = response.eval_count ?? 0
       return {
         data: parsed,
         rawText,
+        usage: {
+          promptTokens,
+          completionTokens,
+          totalTokens: promptTokens + completionTokens,
+        },
       }
Comment on lines +204 to 214
Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve optional usage semantics instead of defaulting to zeros.

Lines 204–214 always return `usage`, even when the token counters are absent. That collapses “not reported” into 0 and can skew usage middleware behavior.

Proposed fix
-      const promptTokens = response.prompt_eval_count ?? 0
-      const completionTokens = response.eval_count ?? 0
+      const promptTokens = response.prompt_eval_count
+      const completionTokens = response.eval_count
       return {
         data: parsed,
         rawText,
-        usage: {
-          promptTokens,
-          completionTokens,
-          totalTokens: promptTokens + completionTokens,
-        },
+        ...(promptTokens !== undefined && completionTokens !== undefined
+          ? {
+              usage: {
+                promptTokens,
+                completionTokens,
+                totalTokens: promptTokens + completionTokens,
+              },
+            }
+          : {}),
       }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@packages/typescript/ai-ollama/src/adapters/text.ts` around lines 204 - 214,
The current return always sets usage with promptTokens and completionTokens
defaulted to 0, collapsing "not reported" into zero; change the construction in
the adapter (around where parsed/rawText are returned) to preserve optional
semantics by using the raw response fields (response.prompt_eval_count and
response.eval_count) without defaulting to 0 and only include the usage object
when both of those fields are defined, matching the proposed fix: set
promptTokens = response.prompt_eval_count (no ?? 0), completionTokens =
response.eval_count (no ?? 0) and omit usage (or set it to undefined) when
either is undefined, so middleware sees "not reported" instead of 0 and
totalTokens is never computed from an undefined operand.

     } catch (error: unknown) {
       const err = error as Error
8 changes: 8 additions & 0 deletions packages/typescript/ai-openrouter/src/adapters/text.ts
@@ -260,9 +260,17 @@ export class OpenRouterTextAdapter<
       // this).
       const transformed = this.transformStructuredOutput(parsed)

+      const usage = response.usage
       return {
         data: transformed,
         rawText,
+        ...(usage && {
+          usage: {
+            promptTokens: usage.promptTokens,
+            completionTokens: usage.completionTokens,
+            totalTokens: usage.totalTokens,
+          },
+        }),
       }
     } catch (error: unknown) {
       // Narrow before logging: raw SDK errors can carry request metadata
11 changes: 11 additions & 0 deletions packages/typescript/ai/src/activities/chat/adapter.ts
@@ -40,6 +40,17 @@ export interface StructuredOutputResult<T = unknown> {
   data: T
   /** The raw text response from the model before parsing */
   rawText: string
+  /**
+   * Token usage reported by the provider for this call, when available.
+   * Forwarded by `fallbackStructuredOutputStream` onto the synthesized
+   * RUN_FINISHED so middleware `onUsage` hooks can account for the
+   * finalization round-trip.
+   */
+  usage?: {
+    promptTokens: number
+    completionTokens: number
+    totalTokens: number
+  }
 }

 /**
11 changes: 10 additions & 1 deletion packages/typescript/ai/src/activities/chat/index.ts
@@ -2630,7 +2630,15 @@ async function* fallbackStructuredOutputStream(
     timestamp,
   }

-  let result: { data: unknown; rawText: string }
+  let result: {
+    data: unknown
+    rawText: string
+    usage?: {
+      promptTokens: number
+      completionTokens: number
+      totalTokens: number
+    }
+  }
   try {
     result = await adapter.structuredOutput(options)
   } catch (error) {
@@ -2686,6 +2694,7 @@ async function* fallbackStructuredOutputStream(
     model,
     timestamp,
     finishReason: 'stop',
+    ...(result.usage ? { usage: result.usage } : {}),
   }
}
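
The conditional spread in the second hunk can be sketched on its own as follows. This is an illustration under assumptions, not the engine's actual code: `synthesizeRunFinished` and the `RunFinished` type are hypothetical, and the `type` discriminator field is assumed. Only `model`, `timestamp`, `finishReason: 'stop'`, and `usage` appear in the diff.

```typescript
type Usage = {
  promptTokens: number
  completionTokens: number
  totalTokens: number
}

// Hypothetical event shape; the `type` field is an assumption.
type RunFinished = {
  type: 'RUN_FINISHED'
  model: string
  timestamp: number
  finishReason: 'stop'
  usage?: Usage
}

// Hypothetical helper mirroring the spread from the diff: the `usage` key
// is present only when the adapter result carried one.
function synthesizeRunFinished(
  result: { usage?: Usage },
  model: string,
  timestamp: number,
): RunFinished {
  return {
    type: 'RUN_FINISHED',
    model,
    timestamp,
    finishReason: 'stop',
    ...(result.usage ? { usage: result.usage } : {}),
  }
}

const withUsage = synthesizeRunFinished(
  { usage: { promptTokens: 10, completionTokens: 5, totalTokens: 15 } },
  'demo-model',
  0,
)
const withoutUsage = synthesizeRunFinished({}, 'demo-model', 0)

console.log(withUsage.usage?.totalTokens) // 15
console.log('usage' in withoutUsage) // false
```

Spreading an empty object leaves the key off entirely rather than writing `usage: undefined`, so consumers that test for the key's presence see "no usage to report".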
