fix(openai): fix stop_reason null and zero usage fields in stream adapter #226
guunergooner wants to merge 1 commit into claude-code-best:main from
Conversation
📝 Walkthrough
Deferred OpenAI finish handling and consolidated AssistantMessage assembly: the stream adapter now emits finalized usage (including cache fields) at stream end, and query assembly yields a single assistant message at stop or stream end (with a max_tokens truncation error), plus expanded streaming tests.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant QueryService as Query (queryModelOpenAI)
    participant Adapter as StreamAdapter (adaptOpenAIStreamToAnthropic)
    participant OpenAI
    participant Assembler as MessageAssembler
    Client->>QueryService: start streaming request
    QueryService->>OpenAI: open streaming connection
    OpenAI-->>Adapter: raw streaming chunks
    Adapter-->>QueryService: adapted events (message_start, content_block_*, message_delta, message_stop) [finish_reason deferred until post-loop]
    QueryService->>Assembler: accumulate content blocks and usage from events
    alt message_stop emitted
        Assembler-->>Client: emit consolidated AssistantMessage (usage, stop_reason, stop_sequence=null)
        alt stop_reason == max_tokens
            Assembler-->>Client: emit AssistantMessage apiError: "max_output_tokens" (contains computed maxTokens)
        end
    else stream ends without message_stop
        Assembler-->>Client: safety-assembled AssistantMessage (same finalization and possible apiError)
    end
```
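The deferred finish handling described above can be sketched as follows. This is a simplified, synchronous stand-in (the real adapter is an async generator over SSE chunks, and the chunk/event shapes here are reduced to the fields relevant to the PR); only the control flow — record `finish_reason` during the loop, emit it with finalized usage afterwards — mirrors the change.

```typescript
// Simplified chunk shape; the real OpenAI stream chunk carries more fields.
interface Chunk {
  finish_reason?: string
  usage?: { output_tokens: number }
}

type AdaptedEvent =
  | { type: 'message_delta'; stop_reason: string | null; usage: { output_tokens: number } }
  | { type: 'message_stop' }

function adaptChunks(chunks: Chunk[]): AdaptedEvent[] {
  let finishReason: string | null = null
  let usage = { output_tokens: 0 }
  for (const chunk of chunks) {
    // Deferred: finish_reason is recorded, not emitted mid-stream.
    if (chunk.finish_reason) finishReason = chunk.finish_reason
    // Final usage typically arrives on the last chunk.
    if (chunk.usage) usage = chunk.usage
  }
  // Post-loop: finalized stop_reason and usage go out together.
  return [
    { type: 'message_delta', stop_reason: finishReason, usage },
    { type: 'message_stop' },
  ]
}

const events = adaptChunks([
  { finish_reason: 'stop' },
  { usage: { output_tokens: 7 } },
])
```

Emitting `message_delta` only after the loop is what lets `stop_reason` and the usage totals land in the same event even when the backend sends them in separate chunks.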
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/api/openai/index.ts (1)
385-392: ⚠️ Potential issue | 🟠 Major
Duplicate cost tracking — this block executes twice for `message_stop`.
The cost tracking at lines 385-392 duplicates the logic already executed inside the `message_stop` case handler at lines 364-367. When `event.type === 'message_stop'`, `addToTotalSessionCost` is called twice with the same values, inflating the reported session cost.
🐛 Proposed fix: Remove the duplicate block
```diff
-      // Track cost and token usage (matching the Anthropic path in claude.ts)
-      if (
-        event.type === 'message_stop' &&
-        usage.input_tokens + usage.output_tokens > 0
-      ) {
-        const costUSD = calculateUSDCost(openaiModel, usage as any)
-        addToTotalSessionCost(costUSD, usage as any, options.model)
-      }
-
       // Also yield as StreamEvent for real-time display (matching Anthropic path)
       yield {
         type: 'stream_event',
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/api/openai/index.ts` around lines 385 - 392, There is a duplicated cost-tracking block that runs when event.type === 'message_stop' causing addToTotalSessionCost to be called twice; remove the redundant block (the one that checks event.type === 'message_stop' and calls calculateUSDCost + addToTotalSessionCost) so that cost is only recorded by the existing logic inside the 'message_stop' case handler; ensure calculateUSDCost, addToTotalSessionCost, usage and options.model are still used exactly once for that event type.
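The single-record control flow the review asks for can be sketched as below. Function bodies are placeholders — only the names (`calculateUSDCost`, `addToTotalSessionCost`) and the `message_stop` guard come from the review; pricing and the event loop are hypothetical stand-ins.

```typescript
type Usage = { input_tokens: number; output_tokens: number }

let recordedCalls = 0

// Placeholder pricing; the real function consults per-model rates.
const calculateUSDCost = (_model: string, u: Usage): number =>
  (u.input_tokens + u.output_tokens) * 1e-6

const addToTotalSessionCost = (_costUSD: number): void => {
  recordedCalls++ // in the real code this accumulates the session cost
}

function onEvent(eventType: string, model: string, usage: Usage): void {
  switch (eventType) {
    case 'message_stop':
      // Single point of cost tracking — the duplicate post-switch block
      // flagged by the review is gone, so cost is recorded exactly once.
      if (usage.input_tokens + usage.output_tokens > 0) {
        addToTotalSessionCost(calculateUSDCost(model, usage))
      }
      break
  }
  // (no second cost-tracking check after the switch)
}

onEvent('message_stop', 'gpt-4o', { input_tokens: 100, output_tokens: 50 })
```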
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/utils/formatBriefTimestamp.ts`:
- Around line 79-81: The day-boundary calculation in startOfDay() uses UTC
getters (getUTCFullYear/getUTCMonth/getUTCDate) while formatting uses local
timezone (toLocaleTimeString/toLocaleString), causing inconsistent day
classification vs display; update startOfDay() to use local date components
(getFullYear/getMonth/getDate) so boundaries match local formatting, or
alternatively make the display calls use UTC consistently (e.g., pass timeZone:
'UTC' to toLocaleString/toLocaleTimeString) if UTC-based boundaries are
intentional; if keeping UTC behavior, add a clear comment in startOfDay()
explaining the trade-off and test determinism.
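A minimal sketch of the suggested fix, assuming the local-components option is chosen (the actual `startOfDay()` in `formatBriefTimestamp.ts` is not shown in this review, so this shape is hypothetical):

```typescript
// Use local date components so the day boundary agrees with what
// toLocaleTimeString/toLocaleString display, per the review comment.
function startOfDay(date: Date): Date {
  // getFullYear/getMonth/getDate read local-time components, unlike the
  // getUTC* getters the review flags as inconsistent with the formatting.
  return new Date(date.getFullYear(), date.getMonth(), date.getDate())
}

// 23:30 local time still belongs to the same local calendar day.
const late = new Date(2024, 5, 15, 23, 30)
const boundary = startOfDay(late)
```

The alternative the review mentions — passing `timeZone: 'UTC'` to the `toLocale*` calls — is the right choice only if UTC boundaries are intentional; mixing the two is the bug.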
---
Outside diff comments:
In `@src/services/api/openai/index.ts`:
- Around line 385-392: There is a duplicated cost-tracking block that runs when
event.type === 'message_stop' causing addToTotalSessionCost to be called twice;
remove the redundant block (the one that checks event.type === 'message_stop'
and calls calculateUSDCost + addToTotalSessionCost) so that cost is only
recorded by the existing logic inside the 'message_stop' case handler; ensure
calculateUSDCost, addToTotalSessionCost, usage and options.model are still used
exactly once for that event type.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8febc2f7-9823-4844-9baa-a8e3065b9ad5
📒 Files selected for processing (6)
- src/services/api/openai/__tests__/queryModelOpenAI.test.ts
- src/services/api/openai/__tests__/streamAdapter.test.ts
- src/services/api/openai/index.ts
- src/services/api/openai/streamAdapter.ts
- src/utils/__tests__/formatBriefTimestamp.test.ts
- src/utils/formatBriefTimestamp.ts
🧹 Nitpick comments (1)
src/services/api/openai/__tests__/queryModelOpenAI.test.ts (1)
91-94: Use the `events` parameter directly to remove hidden global coupling
`runQueryModel()` accepts `events` but currently ignores it and depends on `_nextEvents`. This makes the helper easy to misuse and harder to reason about.
♻️ Suggested minimal fix
```diff
 async function runQueryModel(
   events: BetaRawMessageStreamEvent[],
   envOverrides: Record<string, string | undefined> = {},
 ) {
+  _nextEvents = events
   // Save + apply env overrides
   const saved: Record<string, string | undefined> = {}
```

Also applies to: 157-157
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/api/openai/__tests__/queryModelOpenAI.test.ts` around lines 91 - 94, The helper runQueryModel currently ignores its events parameter and reads from the hidden global _nextEvents; change runQueryModel to use the passed-in events argument directly (replace any references to _nextEvents inside runQueryModel with the events parameter) and remove or stop relying on the global _nextEvents; update any test calls that expected the global behavior to pass the desired events into runQueryModel (also make the same change for the other occurrence noted around the second instance) so the helper is pure and no longer coupled to the global.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/services/api/openai/__tests__/queryModelOpenAI.test.ts`:
- Around line 91-94: The helper runQueryModel currently ignores its events
parameter and reads from the hidden global _nextEvents; change runQueryModel to
use the passed-in events argument directly (replace any references to
_nextEvents inside runQueryModel with the events parameter) and remove or stop
relying on the global _nextEvents; update any test calls that expected the
global behavior to pass the desired events into runQueryModel (also make the
same change for the other occurrence noted around the second instance) so the
helper is pure and no longer coupled to the global.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6f02b070-cf4d-477a-9920-f1ccf93a85d5
📒 Files selected for processing (4)
- src/services/api/openai/__tests__/queryModelOpenAI.test.ts
- src/services/api/openai/__tests__/streamAdapter.test.ts
- src/services/api/openai/index.ts
- src/utils/formatBriefTimestamp.ts
✅ Files skipped from review due to trivial changes (2)
- src/utils/formatBriefTimestamp.ts
- src/services/api/openai/__tests__/streamAdapter.test.ts
🚧 Files skipped from review as they are similar to previous changes (1)
- src/services/api/openai/index.ts
…pter Signed-off-by: guunergooner <tongchao0923@gmail.com>
e6865ab to 855ab08
@coderabbitai full review

Addressed all review comments:
🧠 Learnings used
✅ Actions performed: Full review triggered.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/api/openai/index.ts (1)
216-252: ⚠️ Potential issue | 🔴 Critical
Forward `maxTokens` into the OpenAI request body.
The computed `maxTokens` (line 227) is never included in the request sent to OpenAI. The `buildOpenAIRequestBody()` function does not accept this parameter and does not serialize `max_tokens`, so OpenAI-compatible backends fall back to their server defaults. This makes the truncation-prevention fix non-functional, and the subsequent error messages (lines 375, 416) reference a limit that was never enforced.
🔧 Proposed fix
```diff
 export function buildOpenAIRequestBody(params: {
   model: string
   messages: any[]
   tools: any[]
   toolChoice: any
   enableThinking: boolean
+  maxTokens: number
   temperatureOverride?: number
 }): Record<string, any> {
-  const { model, messages, tools, toolChoice, enableThinking, temperatureOverride } = params
+  const {
+    model,
+    messages,
+    tools,
+    toolChoice,
+    enableThinking,
+    maxTokens,
+    temperatureOverride,
+  } = params
   return {
     model,
     messages,
+    max_tokens: maxTokens,
     ...(tools.length > 0 && {
       tools,
       ...(toolChoice && { tool_choice: toolChoice }),
@@
   const requestBody = buildOpenAIRequestBody({
     model: openaiModel,
     messages: openaiMessages,
     tools: openaiTools,
     toolChoice: openaiToolChoice,
     enableThinking,
+    maxTokens,
     temperatureOverride: options.temperatureOverride,
   })
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/api/openai/index.ts` around lines 216 - 252, The computed maxTokens from getModelMaxOutputTokens (and options.maxOutputTokensOverride) is never passed into the OpenAI request; update buildOpenAIRequestBody to accept a maxTokens (or max_output_tokens) parameter and serialize it as max_tokens in the returned request body, then pass maxTokens when calling buildOpenAIRequestBody before client.chat.completions.create; also update any related types/signatures (and tests) that reference buildOpenAIRequestBody to accept the new parameter so the request enforces the intended token limit.
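The essence of the proposed change can be shown in isolation. This is a trimmed sketch — the real `buildOpenAIRequestBody()` takes more fields (tools, tool choice, thinking flags); only the `maxTokens` → `max_tokens` threading is illustrated.

```typescript
// Trimmed parameter set for illustration only.
function buildOpenAIRequestBody(params: {
  model: string
  messages: unknown[]
  maxTokens: number
}): Record<string, unknown> {
  const { model, messages, maxTokens } = params
  return {
    model,
    messages,
    // Serialized into the wire request so the backend enforces the cap,
    // rather than falling back to its server default.
    max_tokens: maxTokens,
  }
}

const body = buildOpenAIRequestBody({
  model: 'gpt-4o',
  messages: [],
  maxTokens: 4096,
})
```

With this in place, the `max_output_tokens` error messages downstream report a limit that was actually sent.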
🧹 Nitpick comments (1)
src/services/api/openai/index.ts (1)
328-420: Extract the final assistant emission into one helper.
The `message_stop` branch and the post-loop fallback now duplicate the same message assembly and `max_output_tokens` error path. This makes it easy for the two to drift the next time usage or stop-reason handling changes.
Verify each finding against the current code and only fix it if needed. In `@src/services/api/openai/index.ts` around lines 328 - 420, Duplicate logic for assembling and emitting the final assistant message and the max_output_tokens error should be extracted into a single helper (e.g., emitFinalAssistantMessage or assembleFinalAssistantEmission) to avoid drift; implement a helper that accepts partialMessage, contentBlocks, tools, options.agentId, usage, stopReason, maxTokens, openaiModel and returns/yields the assembled AssistantMessage (using normalizeContentFromAPI, randomUUID(), new Date().toISOString()) and, when stopReason === 'max_tokens', the createAssistantAPIErrorMessage info. Replace the duplicated blocks in the 'message_stop' case and the post-loop safety fallback with calls to this helper, preserving the partialMessage = null reset and the cost tracking calls (calculateUSDCost and addToTotalSessionCost) in the same places or moved into the helper as appropriate. Ensure the helper maintains stop_sequence:null and includes usage in the emitted message so tokenCountWithEstimation continues to work.
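A hypothetical shape for the suggested helper is sketched below. The name and the fields follow the review text (`stop_sequence: null` invariant, usage, the `max_tokens` error path); the message payload is reduced from the real `AssistantMessage`, which also carries content blocks, IDs, and timestamps.

```typescript
type StopReason = 'end_turn' | 'tool_use' | 'max_tokens' | null
interface Usage { input_tokens: number; output_tokens: number }

interface FinalEmission {
  stop_reason: StopReason
  stop_sequence: null
  usage: Usage
  apiError?: string // set only on max_tokens truncation
}

function assembleFinalAssistantEmission(
  usage: Usage,
  stopReason: StopReason,
  maxTokens: number,
): FinalEmission {
  const emission: FinalEmission = {
    stop_reason: stopReason,
    stop_sequence: null, // always null, matching the review's invariant
    usage,
  }
  if (stopReason === 'max_tokens') {
    // Hypothetical error text; the real message comes from
    // createAssistantAPIErrorMessage per the review.
    emission.apiError = `max_output_tokens (${maxTokens}) reached`
  }
  return emission
}

// Both the message_stop branch and the post-loop fallback would call this
// one helper, so the two paths can no longer drift apart.
const normal = assembleFinalAssistantEmission(
  { input_tokens: 10, output_tokens: 5 }, 'end_turn', 4096)
const truncated = assembleFinalAssistantEmission(
  { input_tokens: 10, output_tokens: 4096 }, 'max_tokens', 4096)
```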
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/services/api/openai/index.ts`:
- Around line 216-252: The computed maxTokens from getModelMaxOutputTokens (and
options.maxOutputTokensOverride) is never passed into the OpenAI request; update
buildOpenAIRequestBody to accept a maxTokens (or max_output_tokens) parameter
and serialize it as max_tokens in the returned request body, then pass maxTokens
when calling buildOpenAIRequestBody before client.chat.completions.create; also
update any related types/signatures (and tests) that reference
buildOpenAIRequestBody to accept the new parameter so the request enforces the
intended token limit.
---
Nitpick comments:
In `@src/services/api/openai/index.ts`:
- Around line 328-420: Duplicate logic for assembling and emitting the final
assistant message and the max_output_tokens error should be extracted into a
single helper (e.g., emitFinalAssistantMessage or
assembleFinalAssistantEmission) to avoid drift; implement a helper that accepts
partialMessage, contentBlocks, tools, options.agentId, usage, stopReason,
maxTokens, openaiModel and returns/yields the assembled AssistantMessage (using
normalizeContentFromAPI, randomUUID(), new Date().toISOString()) and, when
stopReason === 'max_tokens', the createAssistantAPIErrorMessage info. Replace
the duplicated blocks in the 'message_stop' case and the post-loop safety
fallback with calls to this helper, preserving the partialMessage = null reset
and the cost tracking calls (calculateUSDCost and addToTotalSessionCost) in the
same places or moved into the helper as appropriate. Ensure the helper maintains
stop_sequence:null and includes usage in the emitted message so
tokenCountWithEstimation continues to work.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9a6a9b45-204c-425c-928c-bd37a898574e
📒 Files selected for processing (6)
- src/services/api/openai/__tests__/queryModelOpenAI.test.ts
- src/services/api/openai/__tests__/streamAdapter.test.ts
- src/services/api/openai/index.ts
- src/services/api/openai/streamAdapter.ts
- src/utils/__tests__/formatBriefTimestamp.test.ts
- src/utils/formatBriefTimestamp.ts
✅ Files skipped from review due to trivial changes (2)
- src/utils/formatBriefTimestamp.ts
- src/services/api/openai/__tests__/queryModelOpenAI.test.ts
🚧 Files skipped from review as they are similar to previous changes (2)
- src/utils/__tests__/formatBriefTimestamp.test.ts
- src/services/api/openai/streamAdapter.ts
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/services/api/openai/index.ts (1)
216-248: ⚠️ Potential issue | 🟠 Major
Thread `maxTokens` into the actual OpenAI request.
Line 226 computes the cap, but lines 241-248 never pass it into `buildOpenAIRequestBody()`. The request still relies on the backend default, so compatible endpoints can keep truncating early while the later `max_output_tokens` message reports a limit that was never sent.
Verify each finding against the current code and only fix it if needed. In `@src/services/api/openai/index.ts` around lines 216 - 248, The computed maxTokens (from getModelMaxOutputTokens/openaiModel and options.maxOutputTokensOverride) is not being passed into the OpenAI request builder; update the call to buildOpenAIRequestBody to include the resolved maxTokens so the actual request uses that cap (e.g., add a max_tokens or maxOutputTokens field to the buildOpenAIRequestBody invocation so requestBody contains the intended limit).
🧹 Nitpick comments (1)
src/services/api/openai/__tests__/queryModelOpenAI.test.ts (1)
159-167: Capturechat.completions.create()arguments in the mock.Right now this suite can't assert that the newly computed output-token cap is actually forwarded into the OpenAI request, so that omission still passes all tests. Recording the call payload here would give you a direct regression for that path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/services/api/openai/__tests__/queryModelOpenAI.test.ts` around lines 159 - 167, The mock for getOpenAIClient currently returns chat.completions.create as a no-op async iterator; update the mock in queryModelOpenAI.test.ts to capture the arguments passed into chat.completions.create (e.g., replace the anonymous function with a jest.fn or a closure that records its call payload to a scoped variable) and still return an async iterator-compatible object; ensure you reference getOpenAIClient and chat.completions.create so tests can assert the forwarded token cap and other request fields.
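One framework-free way to do this is a plain closure capture, sketched below. The client shape (`chat.completions.create` returning an empty async-iterable stream) follows the no-op mock the review describes; in the actual suite a `jest.fn`/`mock()` spy would serve the same purpose.

```typescript
let lastCreateArgs: Record<string, unknown> | undefined

const mockOpenAIClient = {
  chat: {
    completions: {
      create: async (args: Record<string, unknown>) => {
        lastCreateArgs = args // captured for later assertions
        return {
          // Empty stream: yields no chunks, like the current no-op mock.
          async *[Symbol.asyncIterator]() {},
        }
      },
    },
  },
}

// A test can now assert the forwarded token cap and other request fields.
void mockOpenAIClient.chat.completions.create({
  model: 'gpt-4o',
  max_tokens: 4096,
})
```

Because the capture happens before the first `await` point inside `create`, assertions on `lastCreateArgs` are valid even without awaiting the returned promise.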
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@src/services/api/openai/index.ts`:
- Around line 216-248: The computed maxTokens (from
getModelMaxOutputTokens/openaiModel and options.maxOutputTokensOverride) is not
being passed into the OpenAI request builder; update the call to
buildOpenAIRequestBody to include the resolved maxTokens so the actual request
uses that cap (e.g., add a max_tokens or maxOutputTokens field to the
buildOpenAIRequestBody invocation so requestBody contains the intended limit).
---
Nitpick comments:
In `@src/services/api/openai/__tests__/queryModelOpenAI.test.ts`:
- Around line 159-167: The mock for getOpenAIClient currently returns
chat.completions.create as a no-op async iterator; update the mock in
queryModelOpenAI.test.ts to capture the arguments passed into
chat.completions.create (e.g., replace the anonymous function with a jest.fn or
a closure that records its call payload to a scoped variable) and still return
an async iterator-compatible object; ensure you reference getOpenAIClient and
chat.completions.create so tests can assert the forwarded token cap and other
request fields.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 05632a66-a747-47e9-b20c-bea7ae3d3423
📒 Files selected for processing (6)
- src/services/api/openai/__tests__/queryModelOpenAI.test.ts
- src/services/api/openai/__tests__/streamAdapter.test.ts
- src/services/api/openai/index.ts
- src/services/api/openai/streamAdapter.ts
- src/utils/__tests__/formatBriefTimestamp.test.ts
- src/utils/formatBriefTimestamp.ts
Summary
- Fixed `stop_reason` always being null in the assembled `AssistantMessage` — the value captured from `message_delta` was never applied to the partial message
- Fixed `partialMessage` not being reset to `null` after `message_stop`, causing a safety fallback that would yield a second identical `AssistantMessage` (doubled content in the next API request)
- Added streaming tests for `queryModelOpenAI` and `streamAdapter`

Test plan
- Run `bun test src/services/api/openai/__tests__/` to verify the new tests pass

Summary by CodeRabbit
Bug Fixes
Tests