Conversation

@daniel-lxs (Member) commented Jan 16, 2026

Summary

Migrate the OpenRouter provider from the OpenAI SDK to the Vercel AI SDK for a more standardized and maintainable implementation.

Closes COM-502 (part 2/2)

⚠️ This PR depends on #11047 (AI SDK dependencies and utilities)

Problem

The current OpenRouter provider is built directly on the OpenAI SDK, which requires complex message transformation and reasoning token handling code that is difficult to maintain.

Solution

Refactor the OpenRouter provider to use the @openrouter/ai-sdk-provider package, which integrates with Vercel's AI SDK. This provides (see the sketch after this list):

  • Standardized streaming via streamText and generateText
  • Cleaner message conversion through the AI SDK's CoreMessage format
  • Simplified tool handling
  • Consistent provider options interface
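
For illustration, a minimal sketch of the call pattern this migration adopts; the model id and message are placeholders, not the PR's exact code:

import { createOpenRouter } from "@openrouter/ai-sdk-provider"
import { streamText } from "ai"

const openrouter = createOpenRouter({ apiKey: process.env.OPENROUTER_API_KEY })

const result = streamText({
  model: openrouter("anthropic/claude-3.5-sonnet"), // placeholder model id
  messages: [{ role: "user", content: "Hello" }], // CoreMessage-format input
})

for await (const part of result.fullStream) {
  // Typed stream parts ("text-delta", "reasoning", "finish", ...) replace
  // hand-rolled delta parsing against the raw OpenAI SDK.
}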

Changes

  • Rewrote src/api/providers/openrouter.ts to use AI SDK patterns
  • Updated src/api/providers/__tests__/openrouter.spec.ts with new test patterns

Testing

  • tsc --noEmit passes in both packages/types and src
  • All 5254 tests pass (368 test files)

Related


Important

Refactor OpenRouter provider to use Vercel AI SDK, simplifying streaming, message conversion, and tool handling, with extensive testing and error handling improvements.

  • Behavior:
    • Refactor OpenRouterHandler in openrouter.ts to use Vercel AI SDK, replacing OpenAI SDK.
    • Implements standardized streaming with streamText and generateText.
    • Simplifies message conversion using CoreMessage format.
    • Handles tools consistently with AI SDK.
    • Updates openrouter.spec.ts tests to reflect new patterns.
  • Error Handling:
    • Captures and logs API errors using TelemetryService.
    • Handles stream errors and API errors gracefully.
  • Testing:
    • Extensive tests in openrouter.spec.ts for new streaming and message handling.
    • Tests for reasoning details, tool handling, and error scenarios.
  • Misc:
    • Removes old OpenAI-specific logic and error handling.
    • Adjusts model parameter handling in model-params.ts for reasoning and temperature settings.

This description was created by Ellipsis for 3f935af.

@daniel-lxs daniel-lxs requested review from cte, jr and mrubens as code owners January 16, 2026 16:32
@dosubot dosubot bot added size:XXL This PR changes 1000+ lines, ignoring generated files. Enhancement New feature or request labels Jan 16, 2026
roomote bot (Contributor) commented Jan 16, 2026

Review of commit 3f935af. No new issues found. 1 previously flagged issue remains unresolved.

  • Missing getReasoningDetails() method in src/api/providers/openrouter.ts - Task.ts relies on this method for preserving reasoning context across multi-turn conversations with models like Gemini 3
  • Telemetry Loss: Error handling no longer reports exceptions to TelemetryService.instance.captureException() for production monitoring
  • Usage Data Loss: Detailed usage information (totalCost, cacheReadTokens, reasoningTokens) is no longer yielded - only basic inputTokens and outputTokens are returned
  • Model-Specific Handling Removed: Special handling for DeepSeek R1 (message format), Gemini (sanitization, encrypted blocks), Anthropic (beta headers), and prompt caching has been removed
  • Missing topP Parameter: The topP value (0.95 for DeepSeek R1 models) is returned by getModel() but not passed to streamText() or generateText()
  • Error Handling Bypasses Retry Logic: createMessage catches errors and yields them as ApiStreamError chunks instead of throwing. Task.ts relies on thrown exceptions from the first chunk to trigger retry logic (exponential backoff, context-window-exceeded auto-truncation, user retry dialog). The migrated Gemini handler throws in the equivalent path, preserving this contract.

private client: OpenAI
protected models: ModelRecord = {}
protected endpoints: ModelRecord = {}
private readonly providerName = "OpenRouter"

The old implementation had a getReasoningDetails() method and currentReasoningDetails accumulator that Task.ts relies on (see line 959 in Task.ts). This method accumulates reasoning_details from the stream (handling reasoning.text, reasoning.summary, and reasoning.encrypted types) and provides them for persistence in API conversation history. This is required for models like Gemini 3 via OpenRouter that use the reasoning.encrypted format - without it, reasoning context won't be preserved across multi-turn conversations, potentially breaking those model integrations.
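
For context, a hedged sketch of the kind of accumulator this comment describes; the ReasoningDetail shape is an assumption inferred from the comment, not the repo's actual type:

type ReasoningDetail =
  | { type: "reasoning.text"; text: string }
  | { type: "reasoning.summary"; summary: string }
  | { type: "reasoning.encrypted"; data: string }

class ReasoningDetailsAccumulator {
  private current: ReasoningDetail[] = []

  // Called for each reasoning_details entry seen in the stream.
  push(detail: ReasoningDetail) {
    this.current.push(detail)
  }

  // What Task.ts would call after the stream ends, to persist the details
  // into API conversation history for the next turn.
  getReasoningDetails(): ReasoningDetail[] {
    return this.current
  }
}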


@daniel-lxs daniel-lxs marked this pull request as draft January 16, 2026 16:40
@daniel-lxs daniel-lxs force-pushed the feat/openrouter-ai-sdk branch from 5260473 to d062770 Compare January 26, 2026 22:17
roomote bot (Contributor) requested changes Jan 26, 2026

Additional Issues Found in Re-review

The previously flagged issue with missing getReasoningDetails() method remains unresolved. Additionally, I've identified a few more regressions in this refactor:

1. Telemetry Loss

Error handling no longer reports exceptions to telemetry.

2. Usage Data Loss

Detailed usage information (cost, cache tokens, reasoning tokens) is no longer yielded.

3. Model-Specific Handling Removed

Special handling for DeepSeek R1, Gemini, and Anthropic models has been removed.

See inline comments for details.

tools,
toolChoice: metadata?.tool_choice as any,
providerOptions,
})

Usage Data Regression: The previous implementation yielded detailed usage information including:

  • totalCost from cost_details.upstream_inference_cost + cost
  • cacheReadTokens from prompt_tokens_details.cached_tokens
  • reasoningTokens from completion_tokens_details.reasoning_tokens

The new implementation only yields basic inputTokens and outputTokens. This loses cost tracking and token detail information that may be important for usage analytics and cost monitoring.

The AI SDK's result.usage may have different properties - please verify what data is available and restore as much usage detail as possible.
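
To make the regression concrete, a sketch of the detailed usage mapping the old implementation performed, using the OpenRouter response fields named above (the AI SDK's usage surface may differ, as noted):

interface OpenRouterUsage {
  prompt_tokens: number
  completion_tokens: number
  prompt_tokens_details?: { cached_tokens?: number }
  completion_tokens_details?: { reasoning_tokens?: number }
  cost?: number
  cost_details?: { upstream_inference_cost?: number }
}

function toUsageChunk(usage: OpenRouterUsage) {
  return {
    type: "usage" as const,
    inputTokens: usage.prompt_tokens,
    outputTokens: usage.completion_tokens,
    cacheReadTokens: usage.prompt_tokens_details?.cached_tokens,
    reasoningTokens: usage.completion_tokens_details?.reasoning_tokens,
    // Upstream inference cost plus OpenRouter's own cost, as in the old code.
    totalCost: (usage.cost_details?.upstream_inference_cost ?? 0) + (usage.cost ?? 0),
  }
}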

@@ -212,323 +83,59 @@ export class OpenRouterHandler extends BaseProvider implements SingleCompletionH
metadata?: ApiHandlerCreateMessageMetadata,

Model-Specific Handling Removed: The previous implementation had important model-specific handling that has been removed:

  1. DeepSeek R1: Converted messages using convertToR1Format() which uses user role instead of system role
  2. Gemini Models: Had sanitizeGeminiMessages() and fake encrypted block injection for preserving reasoning context when switching models
  3. Anthropic Models: Added x-anthropic-beta: fine-grained-tool-streaming-2025-05-14 header (see the sketch after this list)
  4. Prompt Caching: Called addGeminiCacheBreakpoints() or addAnthropicCacheBreakpoints() for supported models
  5. Reasoning Parameter: The reasoning parameter (including exclude logic for Gemini 2.5 Pro) is no longer passed to the API

These may cause regressions for specific model providers. Please verify if the AI SDK handles these cases automatically, or if they need to be re-implemented.
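
As one concrete example for item 3, custom headers can plausibly be supplied at provider creation; whether @openrouter/ai-sdk-provider accepts a headers option like this is an assumption to verify:

import { createOpenRouter } from "@openrouter/ai-sdk-provider"

// Assumed provider setting (most AI SDK providers accept custom headers);
// verify against the package's settings type.
const openrouter = createOpenRouter({
  apiKey: process.env.OPENROUTER_API_KEY,
  headers: { "x-anthropic-beta": "fine-grained-tool-streaming-2025-05-14" },
})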

daniel-lxs added a commit that referenced this pull request Jan 26, 2026
- Add getReasoningDetails() method to preserve reasoning context for multi-turn conversations
- Restore telemetry reporting with TelemetryService.captureException() in error handlers
- Restore detailed usage metrics (totalCost, cacheReadTokens, reasoningTokens, cacheWriteTokens)

Addresses review comments on PR #10778
@daniel-lxs daniel-lxs force-pushed the feat/openrouter-ai-sdk branch from d8f0c64 to 11fd793 Compare January 28, 2026 16:47
@daniel-lxs daniel-lxs changed the base branch from main to feat/ai-sdk-utils January 28, 2026 16:48
@daniel-lxs daniel-lxs marked this pull request as ready for review January 28, 2026 16:48
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Review] in Roo Code Roadmap Jan 28, 2026
roomote bot (Contributor) commented Jan 28, 2026

Review of commit 11fd793. No new issues found. 2 of 5 original issues remain unresolved.

  • Missing getReasoningDetails() method in src/api/providers/openrouter.ts - Task.ts relies on this method for preserving reasoning context across multi-turn conversations with models like Gemini 3
  • Telemetry Loss: Error handling no longer reports exceptions to TelemetryService.instance.captureException() for production monitoring
  • Usage Data Loss: Detailed usage information (totalCost, cacheReadTokens, reasoningTokens) is no longer yielded - only basic inputTokens and outputTokens are returned
  • Model-Specific Handling Removed: Special handling for DeepSeek R1 (message format), Gemini (sanitization, encrypted blocks), Anthropic (beta headers), and prompt caching has been removed
  • Missing topP Parameter: The topP value (0.95 for DeepSeek R1 models) is returned by getModel() but not passed to streamText() or generateText()


@daniel-lxs daniel-lxs force-pushed the feat/openrouter-ai-sdk branch from 11fd793 to ae9636e Compare January 28, 2026 17:01
Base automatically changed from feat/ai-sdk-utils to main January 28, 2026 19:05
- Replace direct OpenAI SDK usage with @openrouter/ai-sdk-provider
- Add 'ai' package for streamText() and generateText() functions
- Extract reusable AI SDK conversion utilities to src/api/transform/ai-sdk.ts:
  - convertToAiSdkMessages(): Anthropic messages to CoreMessage format
  - convertToolsForAiSdk(): OpenAI tools to AI SDK tool format
  - processAiSdkStreamPart(): AI SDK stream events to ApiStreamChunk (sketched below)
- Update tests to mock AI SDK functions instead of OpenAI client
- Add comprehensive tests for the new ai-sdk.ts utility module

This follows Vercel's AI SDK provider pattern for standardized LLM integration
and enables easier migration of other providers in the future.
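
A hedged sketch of what a stream-event converter in that spirit might look like, written against AI SDK v4 part shapes with a simplified chunk type; neither is the repo's exact code:

type ApiStreamChunk =
  | { type: "text"; text: string }
  | { type: "reasoning"; text: string }
  | { type: "usage"; inputTokens: number; outputTokens: number }

function processStreamPart(part: {
  type: string
  textDelta?: string
  usage?: { promptTokens: number; completionTokens: number }
}): ApiStreamChunk | undefined {
  switch (part.type) {
    case "text-delta":
      return { type: "text", text: part.textDelta ?? "" }
    case "reasoning":
      return { type: "reasoning", text: part.textDelta ?? "" }
    case "finish":
      return {
        type: "usage",
        inputTokens: part.usage?.promptTokens ?? 0,
        outputTokens: part.usage?.completionTokens ?? 0,
      }
    default:
      return undefined // tool-call and other parts elided in this sketch
  }
}
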
- Fixed tool result output format to use typed object { type: 'text', value: string }
  instead of plain string to satisfy AI SDK validation schema
- Added tool name resolution by building a map of tool call IDs to names
- Updated tests to reflect new output format
- Add getReasoningDetails() method to preserve reasoning context for multi-turn conversations
- Restore telemetry reporting with TelemetryService.captureException() in error handlers
- Restore detailed usage metrics (totalCost, cacheReadTokens, reasoningTokens, cacheWriteTokens)

Addresses review comments on PR #10778
- Add support for 'reasoning' and 'text' event types in AI SDK stream processing
- Pass reasoning parameters via createOpenRouter extraBody instead of providerOptions (sketched below)
- Support both effort-based (effort: 'high') and budget-based (max_tokens: N) reasoning
- Add comprehensive tests for reasoning parameter passing and stream event handling
- Fixes reasoning tokens not being displayed for models like DeepSeek R1 and Gemini Thinking

Changes:
- src/api/transform/ai-sdk.ts: Add 'text' and 'reasoning' event type handlers
- src/api/providers/openrouter.ts: Pass reasoning via extraBody in provider creation
- Add tests for new event types and reasoning parameter flow
- All 53 tests passing
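
A sketch of that wiring, assuming createOpenRouter's extraBody option merges into the request body (the reasoning field names follow OpenRouter's API):

import { createOpenRouter } from "@openrouter/ai-sdk-provider"

// Effort-based reasoning; budget-based would use { max_tokens: 8192 } instead.
const openrouter = createOpenRouter({
  apiKey: process.env.OPENROUTER_API_KEY,
  extraBody: { reasoning: { effort: "high" } },
})
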
- Re-add DeepSeek R1 format via convertToR1Format with extraBody override
- Re-add Gemini sanitization (sanitizeGeminiMessages) and encrypted block injection
- Re-add Gemini 2.5 Pro reasoning exclusion when not explicitly configured
- Re-add prompt caching (addAnthropicCacheBreakpoints, addGeminiCacheBreakpoints)
- Re-add Anthropic beta headers (x-anthropic-beta: fine-grained-tool-streaming)
- Re-add reasoning_details handling via consolidateReasoningDetails
- Fix topP parameter passthrough to streamText() and generateText() (sketched below)
- Fix duplicate toolCallIdToName in ai-sdk.ts from rebase
- Update tests for new model-specific behavior and add 9 new tests
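
For the topP fix, the value only needs to ride along on the AI SDK call; a minimal sketch with an illustrative model id:

import { createOpenRouter } from "@openrouter/ai-sdk-provider"
import { streamText } from "ai"

const openrouter = createOpenRouter({ apiKey: process.env.OPENROUTER_API_KEY })

const result = streamText({
  model: openrouter("deepseek/deepseek-r1"), // illustrative model id
  messages: [{ role: "user", content: "Hello" }],
  topP: 0.95, // the DeepSeek R1 value returned by getModel()
})
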
Comment on lines +300 to 309
  } catch (error: any) {
      const errorMessage = error instanceof Error ? error.message : String(error)
      const apiError = new ApiProviderError(errorMessage, this.providerName, modelId, "createMessage")
      TelemetryService.instance.captureException(apiError)
      yield {
-         type: "usage",
-         inputTokens: lastUsage.prompt_tokens || 0,
-         outputTokens: lastUsage.completion_tokens || 0,
-         cacheReadTokens: lastUsage.prompt_tokens_details?.cached_tokens,
-         reasoningTokens: lastUsage.completion_tokens_details?.reasoning_tokens,
-         totalCost: (lastUsage.cost_details?.upstream_inference_cost || 0) + (lastUsage.cost || 0),
+         type: "error",
+         error: "OpenRouterError",
+         message: `${this.providerName} API Error: ${errorMessage}`,
      }
  }

This catch yields an error chunk instead of throwing. Task.ts wraps the first chunk from createMessage in a try/catch (line ~4397) that drives retry logic: exponential backoff for auto-approval, context-window-exceeded detection with auto-truncation, and the user-facing retry dialog. By catching the error here and yielding it as a chunk, the first iterator.next() in Task.ts resolves successfully with an error-typed value instead of rejecting, so all of that retry machinery is bypassed. The migrated Gemini handler throws in the equivalent path (line ~203 of gemini.ts), which correctly triggers the retry flow. Consider re-throwing here (after telemetry capture) to preserve the existing error contract.
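
A minimal sketch of the suggested contract, with hypothetical stand-ins for ApiProviderError and the telemetry service (only the control flow matters here):

// Hypothetical stand-ins; the real classes live elsewhere in the repo.
class ApiProviderError extends Error {
  constructor(
    message: string,
    readonly provider: string,
    readonly modelId: string,
    readonly phase: string,
  ) {
    super(message)
  }
}
const telemetry = { captureException: (e: Error) => console.error(e) }

async function* createMessage(stream: AsyncIterable<string>) {
  try {
    for await (const text of stream) {
      yield { type: "text" as const, text }
    }
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error)
    const apiError = new ApiProviderError(message, "OpenRouter", "model-id", "createMessage")
    telemetry.captureException(apiError)
    // Re-throw instead of yielding an error chunk, so the caller's try/catch
    // around the first chunk drives backoff, truncation, and the retry dialog.
    throw apiError
  }
}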


… provider

- Remove buildOpenAiMessages(), extraBody.messages hack, and dummy messages
- Wire reasoning_details through AI SDK natively via providerOptions.openrouter
- Filter schema-invalid reasoning_details entries (malformed encrypted blocks)
- Filter [REDACTED] from thinking UI stream (upstream provider behavior)
- Remove unused imports: convertToOpenAiMessages, sanitizeGeminiMessages,
  consolidateReasoningDetails, convertToR1Format, addAnthropicCacheBreakpoints,
  addGeminiCacheBreakpoints
- All models now use convertToAiSdkMessages() → streamText() natively
- Prompt caching deferred (needs providerOptions.openrouter.cacheControl impl)
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Feb 10, 2026
@hannesrudolph hannesrudolph merged commit 5773af8 into main Feb 10, 2026
10 checks passed
@hannesrudolph hannesrudolph deleted the feat/openrouter-ai-sdk branch February 10, 2026 03:50
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Feb 10, 2026