-
Notifications
You must be signed in to change notification settings - Fork 5
gemini client integrated #55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Greptile Summary
Important Files Changed
Confidence score: 4/5
Sequence DiagramsequenceDiagram
participant User
participant WebSocketServer
participant SessionManager
participant GeminiAgent
participant GeminiClient
participant MCPServer
participant ControllerBridge
User->>WebSocketServer: "Connect WebSocket"
WebSocketServer->>SessionManager: "createSession(agentConfig)"
SessionManager->>GeminiAgent: "create agent"
WebSocketServer-->>User: "connection event"
User->>WebSocketServer: "send message"
WebSocketServer->>SessionManager: "markProcessing()"
WebSocketServer->>GeminiAgent: "execute(message)"
GeminiAgent->>GeminiAgent: "init() - fetch config"
GeminiAgent->>GeminiClient: "initialize with MCP server"
GeminiAgent->>GeminiClient: "sendMessageStream()"
loop "Multi-turn conversation"
GeminiClient-->>GeminiAgent: "stream events"
GeminiAgent->>GeminiEventFormatter: "format(event)"
GeminiEventFormatter-->>WebSocketServer: "FormattedEvent"
WebSocketServer-->>User: "formatted event"
alt "Tool call required"
GeminiAgent->>MCPServer: "executeToolCall()"
MCPServer->>ControllerBridge: "browser automation"
ControllerBridge-->>MCPServer: "tool result"
MCPServer-->>GeminiAgent: "tool response"
GeminiAgent->>GeminiClient: "continue with tool results"
end
end
GeminiAgent-->>WebSocketServer: "completion event"
WebSocketServer->>SessionManager: "markIdle()"
WebSocketServer-->>User: "completion"
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
7 files reviewed, 2 comments
Edit Code Review Agent Settings | Greptile
React with 👍 or 👎 to share your feedback on this new summary format
- Add VercelAIContentGenerator with multi-provider support (OpenAI, Anthropic, Google) - Implement tool/message/response conversion strategies with full type safety - Fix tool parameter extraction (parametersJsonSchema support) - Fix OpenAI schema validation (recursive normalization for nested objects) - Fix text streaming (textDelta property name correction) - Fix usage metadata (promptTokens/completionTokens property names) - Add comprehensive test suite (streaming, non-streaming, error handling) - Complete E2E verification with browser tools (successful) - Enable strict mode for OpenAI token usage tracking
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
felarof99
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's discuss in the review section itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This event formatter would only be required temporarily until we continue using our old UI, right?
For new UI, we don't need this event formatter, is that right? @shivammittal274
| this.useVercelAI = true; | ||
| this.vercelProvider = 'openai'; | ||
| this.config.apiKey = openaiKey; | ||
| this.config.modelName = 'gpt-4o'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's set the default model as gpt-4.1. @shivammittal274
| this.useVercelAI = true; | ||
| this.vercelProvider = 'google'; | ||
| this.config.apiKey = googleKey; | ||
| this.config.modelName = 'gemini-2.0-flash-exp'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bad default model. Let's actually use constants at the top of file or create environment variables for these defaults as well.
This by default should be gemini-2.5-flash
| excludeTools: ['run_shell_command', 'write_file', 'replace'], | ||
| compressionThreshold: 1000000, // Disable aggressive compression (1M tokens threshold) | ||
| mcpServers: { | ||
| 'browseros-mcp': new MCPServerConfig( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivammittal274 Let's use named args here! If there is any upstream change, will be passing args in wrong order and would be a pain to debug.
| logger.debug('✅ VercelAIContentGenerator instance created'); | ||
|
|
||
| // Type assertion needed as contentGenerator is private in Config | ||
| (this.geminiConfig as any).contentGenerator = contentGenerator; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@greptileai why are we typecasting this.geminiConfig as any here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The type assertion (this.geminiConfig as any) is being used here because the contentGenerator property is private in the GeminiConfig class from the @google/gemini-cli-core package.
Looking at the code, this is happening in the Vercel AI integration path where you're trying to inject a custom VercelAIContentGenerator to override the default content generation behavior. Since the contentGenerator property isn't part of the public API of GeminiConfig, TypeScript won't allow you to access it directly.
This is a workaround to bypass TypeScript's type checking and directly set the private property. However, this approach has some risks:
- Brittle: If the internal structure of
GeminiConfigchanges, this could break - Type safety: You lose compile-time type checking for this operation
- Maintenance: It's not clear from the public API that this is supported
A few potential alternatives to consider:
- Check if there's a public setter method in
GeminiConfigfor the content generator - Use a factory pattern where you create the
GeminiConfigwith the custom generator from the start - Extend
GeminiConfigto create your own subclass that properly exposes this functionality - Submit a PR to
@google/gemini-cli-coreto add a public method for setting custom content generators
The type assertion works for now, but it would be better to find a more type-safe approach if possible.
| this.promptId = `prompt-${Date.now()}`; | ||
|
|
||
| logger.info('✅ GeminiClient initialized', { | ||
| mode: this.useVercelAI ? 'Vercel AI' : 'Native Gemini', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivammittal274 in a follow-up CL, let's add integration test with useVercelAI knob set to true and false, making sure both modes are working as of now. And in future if something breaks, we can then use that test to debug both.
|
|
||
| this.updateEventTime(); | ||
|
|
||
| const formatted = formatter.format(event); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivammittal274 I think we should delete GeminiAgentFormatter and delete all these pieces of code. We are not launching Gemini Agent with old UI, we will only launch it with the new UI. So, let's not have unnecessary code which will be hard to clean up later.
| request: GenerateContentParameters, | ||
| _userPromptId: string, | ||
| ): Promise<GenerateContentResponse> { | ||
| console.log('[VercelAI] generateContent called'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@shivammittal274 let's not use any console.log. Let's consistently use logger we have, so that we can easily disable logs in production.
Logger we have -- /Users/felarof01/Workspaces/build/browseros-server/packages/common/src/logger.ts
| */ | ||
| geminiToVercel(contents: readonly Content[]): CoreMessage[] { | ||
| const messages: CoreMessage[] = []; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: use logger
| } from '../types.js'; | ||
| import type { ToolConversionStrategy } from './tool.js'; | ||
|
|
||
| export class ResponseConversionStrategy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
**
@shivammittal274 would it be possible to add a unit tests verifiying vercelToGemini(geminiToVercel(x)) == x for each of the strategies? That'll be the fullproof way of testing we are translating correctly I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Once you have such a test, you can even tell claude to keep fixing errors until the above test passes.
No description provided.