gemini client integrated #55

shivammittal274 · 2025-11-18T20:10:44Z

No description provided.

greptile-apps · 2025-11-18T20:12:11Z

Greptile Summary

- Adds Google Gemini AI model integration to the BrowserOS agent system by implementing the GeminiAgent class and its formatter, and registering the agent in the factory
- Updates SessionManager with a new CANCELLING state and a safety timeout mechanism to prevent race conditions during agent cancellation
- Sets Gemini as the default agent type and configures file-based storage for macOS compatibility to avoid keychain prompts

Important Files Changed

| Filename | Overview |
|---|---|
| `packages/agent/src/agent/GeminiAgent.ts` | New Gemini agent implementation with MCP server integration, dual config support, and streaming response handling |
| `packages/agent/src/session/SessionManager.ts` | Added CANCELLING state and safety timeout for robust cancellation; changed default agent to `gemini-sdk` |
| `packages/agent/src/agent/GeminiAgent.formatter.ts` | New formatter class that transforms Gemini events into the standardized FormattedEvent format |

Confidence score: 4/5

- This PR is largely safe to merge, with careful attention to the session state management changes
- Score reflects solid architecture following established patterns, but the complexity of the SessionManager cancellation logic requires review
- Pay close attention to SessionManager.ts for potential race conditions in the new CANCELLING state implementation
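
As a rough illustration of the pattern the summary describes, here is a minimal sketch of a session with a CANCELLING state and a safety timeout. The names `SessionState`, `SketchSession`, `requestCancel`, and the timeout value are assumptions made for this example; they are not taken from `SessionManager.ts`.

```typescript
// Hypothetical sketch: class name, method names, and timeout are assumptions.
type SessionState = 'IDLE' | 'PROCESSING' | 'CANCELLING';

class SketchSession {
  state: SessionState = 'IDLE';
  private safetyTimer: ReturnType<typeof setTimeout> | undefined = undefined;
  private safetyTimeoutMs: number;

  constructor(safetyTimeoutMs: number = 5000) {
    this.safetyTimeoutMs = safetyTimeoutMs;
  }

  // Only an idle session may start processing, so a message that arrives
  // while a cancellation is still in flight is rejected.
  markProcessing(): boolean {
    if (this.state !== 'IDLE') return false;
    this.state = 'PROCESSING';
    return true;
  }

  // Enter CANCELLING and arm a timer so the session cannot stay stuck
  // if the agent never reports that it stopped.
  requestCancel(): boolean {
    if (this.state !== 'PROCESSING') return false;
    this.state = 'CANCELLING';
    this.safetyTimer = setTimeout(() => this.markIdle(), this.safetyTimeoutMs);
    return true;
  }

  // Called when the agent finishes or acknowledges the cancellation.
  markIdle(): void {
    if (this.safetyTimer !== undefined) {
      clearTimeout(this.safetyTimer);
      this.safetyTimer = undefined;
    }
    this.state = 'IDLE';
  }
}
```

The point worth reviewing in the real implementation is the same one this sketch makes explicit: every transition checks the current state first, and the timer is always cleared on the way back to idle.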

Sequence Diagram

sequenceDiagram
    participant User
    participant WebSocketServer
    participant SessionManager
    participant GeminiAgent
    participant GeminiClient
    participant MCPServer
    participant ControllerBridge

    User->>WebSocketServer: "Connect WebSocket"
    WebSocketServer->>SessionManager: "createSession(agentConfig)"
    SessionManager->>GeminiAgent: "create agent"
    WebSocketServer-->>User: "connection event"

    User->>WebSocketServer: "send message"
    WebSocketServer->>SessionManager: "markProcessing()"
    WebSocketServer->>GeminiAgent: "execute(message)"
    
    GeminiAgent->>GeminiAgent: "init() - fetch config"
    GeminiAgent->>GeminiClient: "initialize with MCP server"
    GeminiAgent->>GeminiClient: "sendMessageStream()"
    
    loop "Multi-turn conversation"
        GeminiClient-->>GeminiAgent: "stream events"
        GeminiAgent->>GeminiEventFormatter: "format(event)"
        GeminiEventFormatter-->>WebSocketServer: "FormattedEvent"
        WebSocketServer-->>User: "formatted event"
        
        alt "Tool call required"
            GeminiAgent->>MCPServer: "executeToolCall()"
            MCPServer->>ControllerBridge: "browser automation"
            ControllerBridge-->>MCPServer: "tool result"
            MCPServer-->>GeminiAgent: "tool response"
            GeminiAgent->>GeminiClient: "continue with tool results"
        end
    end
    
    GeminiAgent-->>WebSocketServer: "completion event"
    WebSocketServer->>SessionManager: "markIdle()"
    WebSocketServer-->>User: "completion"

greptile-apps

7 files reviewed, 2 comments


packages/agent/src/agent/GeminiAgent.formatter.ts

packages/agent/src/agent/GeminiAgent.ts

- Add VercelAIContentGenerator with multi-provider support (OpenAI, Anthropic, Google)
- Implement tool/message/response conversion strategies with full type safety
- Fix tool parameter extraction (parametersJsonSchema support)
- Fix OpenAI schema validation (recursive normalization for nested objects)
- Fix text streaming (textDelta property name correction)
- Fix usage metadata (promptTokens/completionTokens property names)
- Add comprehensive test suite (streaming, non-streaming, error handling)
- Complete E2E verification with browser tools (successful)
- Enable strict mode for OpenAI token usage tracking

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

felarof99

Let's discuss in the review section itself.

felarof99 · 2025-11-25T01:33:44Z

packages/agent/src/agent/GeminiAgent.formatter.ts

This event formatter is only needed temporarily, while we continue using our old UI, right?

For the new UI, we don't need this event formatter, is that right? @shivammittal274

felarof99 · 2025-11-25T01:34:52Z

packages/agent/src/agent/GeminiAgent.ts

+      this.useVercelAI = true;
+      this.vercelProvider = 'openai';
+      this.config.apiKey = openaiKey;
+      this.config.modelName = 'gpt-4o';


Let's set the default model as gpt-4.1. @shivammittal274

felarof99 · 2025-11-25T01:35:37Z

packages/agent/src/agent/GeminiAgent.ts

+      this.useVercelAI = true;
+      this.vercelProvider = 'google';
+      this.config.apiKey = googleKey;
+      this.config.modelName = 'gemini-2.0-flash-exp';


This is a bad default model. Let's use constants at the top of the file, or create environment variables, for these defaults as well.

The default here should be gemini-2.5-flash.
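
One way to do what this comment asks, as a sketch only: keep the defaults as named constants and let an environment variable override them. The model names come from the review comments; the environment variable names and the `resolveModelName` helper are hypothetical and not part of the PR.

```typescript
// Defaults requested in review: gpt-4.1 for OpenAI, gemini-2.5-flash for Google.
const DEFAULT_OPENAI_MODEL = 'gpt-4.1';
const DEFAULT_GOOGLE_MODEL = 'gemini-2.5-flash';

type Provider = 'openai' | 'google';
type Env = Record<string, string | undefined>;

// Hypothetical environment variable names; the PR does not define them.
const MODEL_ENV_VAR: Record<Provider, string> = {
  openai: 'BROWSEROS_OPENAI_MODEL',
  google: 'BROWSEROS_GOOGLE_MODEL',
};

const MODEL_DEFAULT: Record<Provider, string> = {
  openai: DEFAULT_OPENAI_MODEL,
  google: DEFAULT_GOOGLE_MODEL,
};

// Returns the override from the environment if it is set and non-blank,
// otherwise the constant default for that provider.
function resolveModelName(provider: Provider, env: Env): string {
  const override = env[MODEL_ENV_VAR[provider]]?.trim();
  return override ? override : MODEL_DEFAULT[provider];
}
```

In the agent this would be used along the lines of `this.config.modelName = resolveModelName('google', process.env);`, which keeps the string literal out of the branch that selects the provider.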

felarof99 · 2025-11-25T01:39:41Z

packages/agent/src/agent/GeminiAgent.ts

+      excludeTools: ['run_shell_command', 'write_file', 'replace'],
+      compressionThreshold: 1000000, // Disable aggressive compression (1M tokens threshold)
+      mcpServers: {
+        'browseros-mcp': new MCPServerConfig(


@shivammittal274 Let's use named args here! If there is any upstream change, we would end up passing args in the wrong order, which would be a pain to debug.
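
A sketch of what "named args" could look like when the upstream constructor is positional: wrap it once in a helper that takes an options object. `PositionalServerConfig` below is a stand-in written for this example; its parameter list is illustrative and is not the real `MCPServerConfig` signature, and the URL is a placeholder.

```typescript
// Stand-in for a class with a long positional constructor.
// The parameter list is an assumption, NOT the real MCPServerConfig signature.
class PositionalServerConfig {
  readonly command: string | undefined;
  readonly args: string[] | undefined;
  readonly url: string | undefined;
  readonly timeoutMs: number | undefined;
  readonly trust: boolean | undefined;

  constructor(
    command?: string,
    args?: string[],
    url?: string,
    timeoutMs?: number,
    trust?: boolean,
  ) {
    this.command = command;
    this.args = args;
    this.url = url;
    this.timeoutMs = timeoutMs;
    this.trust = trust;
  }
}

interface ServerConfigOptions {
  command?: string;
  args?: string[];
  url?: string;
  timeoutMs?: number;
  trust?: boolean;
}

// The only place that knows the positional order. If upstream reorders
// its parameters, this is the single call that needs to change.
function createServerConfig(options: ServerConfigOptions): PositionalServerConfig {
  return new PositionalServerConfig(
    options.command,
    options.args,
    options.url,
    options.timeoutMs,
    options.trust,
  );
}

// Call sites name each value instead of counting `undefined` placeholders.
const browserosServer = createServerConfig({
  url: 'http://127.0.0.1:9100/mcp', // placeholder URL for the example
  trust: true,
});
```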

felarof99 · 2025-11-25T01:42:26Z

packages/agent/src/agent/GeminiAgent.ts

+      logger.debug('✅ VercelAIContentGenerator instance created');
+
+      // Type assertion needed as contentGenerator is private in Config
+      (this.geminiConfig as any).contentGenerator = contentGenerator;


@greptileai why are we typecasting this.geminiConfig as any here?

The type assertion (this.geminiConfig as any) is being used here because the contentGenerator property is private in the GeminiConfig class from the @google/gemini-cli-core package.

Looking at the code, this is happening in the Vercel AI integration path where you're trying to inject a custom VercelAIContentGenerator to override the default content generation behavior. Since the contentGenerator property isn't part of the public API of GeminiConfig, TypeScript won't allow you to access it directly.

This is a workaround to bypass TypeScript's type checking and directly set the private property. However, this approach has some risks:

- Brittle: If the internal structure of GeminiConfig changes, this could break
- Type safety: You lose compile-time type checking for this operation
- Maintenance: It's not clear from the public API that this is supported

A few potential alternatives to consider:

1. Check if there's a public setter method in GeminiConfig for the content generator
2. Use a factory pattern where you create the GeminiConfig with the custom generator from the start
3. Extend GeminiConfig to create your own subclass that properly exposes this functionality
4. Submit a PR to @google/gemini-cli-core to add a public method for setting custom content generators

The type assertion works for now, but it would be better to find a more type-safe approach if possible.
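
If none of those alternatives is available, one middle ground is to confine the workaround to a single, narrowly typed helper rather than an inline `as any`. The sketch below uses stand-in types (`ConfigLike`, `ContentGeneratorLike`) written for this example; the real classes live in `@google/gemini-cli-core` and are not reproduced here. It relies on TypeScript's `private` modifier being a compile-time check only, and it would not work against a JavaScript `#private` field.

```typescript
// Stand-in types for illustration; not the real gemini-cli-core classes.
interface ContentGeneratorLike {
  generateContent(prompt: string): Promise<string>;
}

class ConfigLike {
  private contentGenerator: ContentGeneratorLike | undefined = undefined;

  async run(prompt: string): Promise<string> {
    if (this.contentGenerator === undefined) {
      throw new Error('no content generator set');
    }
    return this.contentGenerator.generateContent(prompt);
  }
}

// The single place that knows about the private field name. The generator
// argument stays fully typed, and a failed write raises an error instead
// of passing silently.
function injectContentGenerator(config: object, generator: ContentGeneratorLike): void {
  const ok = Reflect.set(config, 'contentGenerator', generator);
  if (!ok) {
    throw new Error('could not set contentGenerator on config');
  }
}
```

This does not remove the brittleness noted above, since it still depends on an internal field name, but it keeps the dependency in one function that is easy to find and replace.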

felarof99 · 2025-11-25T01:45:16Z

packages/agent/src/agent/GeminiAgent.ts

+    this.promptId = `prompt-${Date.now()}`;
+
+    logger.info('✅ GeminiClient initialized', {
+      mode: this.useVercelAI ? 'Vercel AI' : 'Native Gemini',


@shivammittal274 in a follow-up CL, let's add an integration test with the useVercelAI knob set to both true and false, to make sure both modes work as of now. If something breaks in the future, we can use that test to debug both.

felarof99 · 2025-11-25T01:48:10Z

packages/agent/src/agent/GeminiAgent.ts

+
+          this.updateEventTime();
+
+          const formatted = formatter.format(event);


@shivammittal274 I think we should delete GeminiAgentFormatter and all these pieces of code. We are not launching Gemini Agent with the old UI; we will only launch it with the new UI. So let's not keep unnecessary code that will be hard to clean up later.

felarof99 · 2025-11-25T01:51:32Z

packages/agent/src/agent/adapters/vercel-ai/index.ts

+    request: GenerateContentParameters,
+    _userPromptId: string,
+  ): Promise<GenerateContentResponse> {
+    console.log('[VercelAI] generateContent called');


@shivammittal274 let's not use any console.log. Let's consistently use the logger we have, so that we can easily disable logs in production.

Logger we have -- /Users/felarof01/Workspaces/build/browseros-server/packages/common/src/logger.ts
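
To illustrate why a shared logger makes production logging easy to switch off, here is a minimal level-gated logger. It is a sketch written for this comment, not the contents of `packages/common/src/logger.ts`; `SketchLogger`, its level names, and its constructor are assumptions. Only the `debug(message)` and `info(message, meta)` call shapes mirror how `logger` is used elsewhere in this PR.

```typescript
// Hypothetical logger sketch; the project's real logger may differ.
type LogLevel = 'debug' | 'info' | 'warn' | 'error' | 'silent';

const LEVEL_ORDER: Record<LogLevel, number> = {
  debug: 0,
  info: 1,
  warn: 2,
  error: 3,
  silent: 4,
};

class SketchLogger {
  private minLevel: LogLevel;
  private sink: (line: string) => void;

  constructor(minLevel: LogLevel, sink: (line: string) => void) {
    this.minLevel = minLevel;
    this.sink = sink;
  }

  debug(message: string, meta?: Record<string, unknown>): void {
    this.write('debug', message, meta);
  }

  info(message: string, meta?: Record<string, unknown>): void {
    this.write('info', message, meta);
  }

  // Messages below the configured level are dropped before formatting.
  private write(level: LogLevel, message: string, meta?: Record<string, unknown>): void {
    if (LEVEL_ORDER[level] < LEVEL_ORDER[this.minLevel]) return;
    const suffix = meta ? ' ' + JSON.stringify(meta) : '';
    this.sink(`[${level}] ${message}${suffix}`);
  }
}

// With the level set to 'info', the debug call below produces no output,
// which is the behaviour a bare console.log cannot offer.
const sketchLogger = new SketchLogger('info', (line) => console.log(line));
sketchLogger.debug('[VercelAI] generateContent called');
```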

felarof99 · 2025-11-25T01:54:45Z

packages/agent/src/agent/adapters/vercel-ai/strategies/message.ts

+   */
+  geminiToVercel(contents: readonly Content[]): CoreMessage[] {
+    const messages: CoreMessage[] = [];
+


nit: use logger

felarof99 · 2025-11-25T02:03:55Z

packages/agent/src/agent/adapters/vercel-ai/strategies/response.ts

+} from '../types.js';
+import type { ToolConversionStrategy } from './tool.js';
+
+export class ResponseConversionStrategy {



@shivammittal274 would it be possible to add unit tests verifying vercelToGemini(geminiToVercel(x)) == x for each of the strategies? That would be the foolproof way of testing that we are translating correctly, I guess.

Once you have such a test, you can even tell claude to keep fixing errors until the above test passes.
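
A sketch of the round-trip property being requested, using deliberately simplified stand-ins for the Gemini `Content` and Vercel `CoreMessage` shapes. The two converters below are toy versions written for this example, not the PR's strategies; the real types also carry tool calls and other parts, and the property may only hold for the subset of messages that converts losslessly.

```typescript
// Simplified stand-in shapes; the real Content and CoreMessage types are richer.
interface GeminiContentLike {
  role: 'user' | 'model';
  parts: { text: string }[];
}

interface VercelMessageLike {
  role: 'user' | 'assistant';
  content: { type: 'text'; text: string }[];
}

function geminiToVercel(contents: readonly GeminiContentLike[]): VercelMessageLike[] {
  return contents.map((c): VercelMessageLike => ({
    role: c.role === 'model' ? 'assistant' : 'user',
    content: c.parts.map((p) => ({ type: 'text' as const, text: p.text })),
  }));
}

function vercelToGemini(messages: readonly VercelMessageLike[]): GeminiContentLike[] {
  return messages.map((m): GeminiContentLike => ({
    role: m.role === 'assistant' ? 'model' : 'user',
    parts: m.content.map((c) => ({ text: c.text })),
  }));
}

// Small structural equality check so the example has no dependencies.
function deepEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) return false;
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(
    (k) => k in b && deepEqual((a as Record<string, unknown>)[k], (b as Record<string, unknown>)[k]),
  );
}

// The property under test: converting there and back yields the input.
const sample: GeminiContentLike[] = [
  { role: 'user', parts: [{ text: 'open example.com' }] },
  { role: 'model', parts: [{ text: 'Opening the page.' }] },
];
const roundTripped = vercelToGemini(geminiToVercel(sample));
const roundTripHolds = deepEqual(roundTripped, sample);
```

In a real suite the same check would run over a table of fixtures per strategy (messages, tools, responses), with any intentionally lossy cases listed explicitly rather than silently excluded.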

…rOS-server into gemini-agent

gemini client integrated

54681e7

greptile-apps bot reviewed Nov 18, 2025


shivammittal274 and others added 4 commits November 19, 2025 22:30

tools removed

e6a96a8

Update packages/agent/src/agent/GeminiAgent.ts

17fa365

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

Update packages/agent/src/agent/GeminiAgent.formatter.ts

9ef90ee

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>

felarof99 reviewed Nov 25, 2025


shivammittal274 added 2 commits November 25, 2025 23:06

added vercel ai sdk suppport

53e8243

Merge branch 'gemini-agent' of https://github.com/browseros-ai/Browse…

6fe1554

…rOS-server into gemini-agent




3 participants