fix(core): add CUA addContextNote parity across providers #2038

Open
BABTUNA wants to merge 2 commits into browserbase:main from BABTUNA:fix-cua-add-context-note-parity

@BABTUNA BABTUNA commented Apr 24, 2026

why

V3CuaAgentHandler injects runtime captcha guidance via addContextNote(...), but only OpenAICUAClient consumed those notes.

That meant Google/Anthropic/Microsoft CUA clients silently dropped this guidance, causing behavior drift in captcha-related flows.

Closes #2037.

what changed

  • Added pending context-note queue support to:
    • GoogleCUAClient
    • AnthropicCUAClient
    • MicrosoftCUAClient
  • Implemented addContextNote(note) override in each client.
  • Added one-shot drain semantics (drainContextNotes) in each client.
  • Injected drained notes into the next turn only:
    • Google: append user text message to history
    • Anthropic: append user message(s) to nextInputItems
    • Microsoft: append user message(s) to conversationHistory
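
A minimal sketch of the queue-and-drain pattern listed above, for readers who want to see the shape of the change. Only the body of `addContextNote` matches what the diff in this PR shows; the class name `SketchCuaClient`, the `UserMessage` type, the `buildNextTurnMessages` helper, the body of `drainContextNotes`, and the note text are illustrative assumptions, not the real client code.

```typescript
// Hypothetical stand-in for a provider client; not the real Stagehand class.
type UserMessage = { role: "user"; content: string };

class SketchCuaClient {
  // Notes queued by the handler, waiting for the next turn.
  private pendingContextNotes: string[] = [];

  // Same body as the override shown in the diff: enqueue the note.
  addContextNote(note: string): void {
    this.pendingContextNotes.push(note);
  }

  // One-shot drain (assumed body): return everything queued and empty the queue.
  drainContextNotes(): string[] {
    const notes = this.pendingContextNotes;
    this.pendingContextNotes = [];
    return notes;
  }

  // Assumed helper: turn drained notes into user messages for the next turn.
  buildNextTurnMessages(): UserMessage[] {
    return this.drainContextNotes().map((note) => ({
      role: "user" as const,
      content: note,
    }));
  }
}

const client = new SketchCuaClient();
client.addContextNote("placeholder captcha guidance");
const firstTurn = client.buildNextTurnMessages(); // holds the queued note
const secondTurn = client.buildNextTurnMessages(); // empty, the note was one-shot
```

Each provider would append these messages to its own structure (history, nextInputItems, or conversationHistory); the message shape used here is a simplification.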

tests

Added packages/core/tests/unit/cua-context-note-parity.test.ts covering:

  • Anthropic: note appears in next turn input, not subsequent turn
  • Google: note appears once in history between turns
  • Microsoft: note appears once in conversation history between turns

Regression check run:

  • tests/unit/openai-cua-client.test.ts

validation run

  • npm.cmd exec prettier -- --write packages/core/lib/v3/agent/GoogleCUAClient.ts packages/core/lib/v3/agent/AnthropicCUAClient.ts packages/core/lib/v3/agent/MicrosoftCUAClient.ts packages/core/tests/unit/cua-context-note-parity.test.ts
  • node node_modules/vitest/vitest.mjs run --config .tmp-vitest-unit-config.mjs from packages/core (temporary local config targeting:
    • tests/unit/cua-context-note-parity.test.ts
    • tests/unit/openai-cua-client.test.ts)

Summary by cubic

Adds parity for addContextNote across CUA providers so runtime captcha guidance is applied consistently; notes are injected once on the next turn and drained on terminal steps to avoid leaks. Fixes #2037.

  • Bug Fixes
    • Implemented pending context-note queue and addContextNote(...) in GoogleCUAClient, AnthropicCUAClient, and MicrosoftCUAClient.
    • One-shot semantics: inject on next turn (Google → history, Anthropic → nextInputItems, Microsoft → conversationHistory) and drain on terminal steps to prevent carry-over.
    • Added unit tests for next-turn injection and no leakage across executions.

Written for commit 49e832d. Summary will update on new commits.

Implement pending context-note queue + next-turn injection for Google, Anthropic, and Microsoft CUA clients to match OpenAI behavior. Notes are drained after one turn to preserve one-shot semantics. Adds targeted unit coverage for all three clients. Refs browserbase#2037.

changeset-bot Bot commented Apr 24, 2026

⚠️ No Changeset found

Latest commit: 49e832d

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@github-actions

This PR is from an external contributor and must be approved by a stagehand team member with write access before CI can run.
Approving the latest commit mirrors it into an internal PR owned by the approver.
If new commits are pushed later, the internal PR stays open but is marked stale until someone approves the latest external commit and refreshes it.

github-actions Bot added the external-contributor and external-contributor:awaiting-approval labels Apr 24, 2026

@cubic-dev-ai cubic-dev-ai Bot left a comment

2 issues found across 4 files

Confidence score: 3/5

  • There is a concrete regression risk in context handling: both clients can retain terminal-step notes and apply them in later executions, which may cause one-turn delays and stale context behavior.
  • Given the medium severity (6/10) and high confidence (8-9/10), this is more than a cosmetic issue and can be user-facing in multi-step or repeated agent runs.
  • Pay close attention to packages/core/lib/v3/agent/AnthropicCUAClient.ts, packages/core/lib/v3/agent/GoogleCUAClient.ts - note-draining logic currently skips completed steps, allowing cross-execution note carryover.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/agent/AnthropicCUAClient.ts">

<violation number="1" location="packages/core/lib/v3/agent/AnthropicCUAClient.ts:183">
P2: Context notes are drained after model execution and only on non-completed steps, causing one-turn delay and possible stale-note carryover into later executions.</violation>
</file>

<file name="packages/core/lib/v3/agent/GoogleCUAClient.ts">

<violation number="1" location="packages/core/lib/v3/agent/GoogleCUAClient.ts:273">
P2: Context notes may leak across separate executions because queued notes are only drained when `!completed`, leaving terminal-step notes to persist into later runs.</violation>
</file>
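
One way to address the carry-over half of these findings is to drain the queue on every step and only use the drained notes when the run continues. The sketch below assumes a simplified step loop; `executeSketch`, `runStep`, `StepResult`, and `Message` are hypothetical names, and it does not model the one-turn delay mentioned in the first finding.

```typescript
type StepResult = { completed: boolean };
type Message = { role: "user"; content: string };

// Hypothetical loop; `runStep` stands in for a provider-specific model call.
function executeSketch(
  queue: string[],
  runStep: (input: Message[], stepIndex: number) => StepResult,
  maxSteps = 10,
): Message[][] {
  const inputsSeen: Message[][] = []; // input passed to each step, in order
  let input: Message[] = [];
  let completed = false;

  for (let step = 0; step < maxSteps && !completed; step++) {
    inputsSeen.push(input);
    completed = runStep(input, step).completed;

    // Drain unconditionally, so a note queued during the terminal step is discarded.
    const notes = queue.splice(0, queue.length);

    // Only build a next-turn input when the run is continuing.
    input = completed
      ? []
      : notes.map((content) => ({ role: "user" as const, content }));
  }
  return inputsSeen;
}

const queue: string[] = ["note queued before the run"];
const inputs = executeSketch(queue, (_input, step) => {
  // Simulate the handler queuing a note while the terminal step runs.
  if (step === 1) queue.push("note queued during the terminal step");
  return { completed: step === 1 };
});
```

After this run the queue is empty, so a later execution starts without stale notes.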

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review, or fix all with cubic.

Comment thread packages/core/lib/v3/agent/AnthropicCUAClient.ts Outdated
Comment thread packages/core/lib/v3/agent/GoogleCUAClient.ts Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 4 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/tests/unit/cua-context-note-parity.test.ts">

<violation number="1" location="packages/core/tests/unit/cua-context-note-parity.test.ts:117">
P2: The new cross-run parity tests assert on the second step of the second execution instead of the first, so they can miss stale context-note leakage into the new run.</violation>
</file>

@@ -0,0 +1,374 @@
import { describe, expect, it, vi } from "vitest";

@cubic-dev-ai cubic-dev-ai Bot Apr 24, 2026

P2: The new cross-run parity tests assert on the second step of the second execution instead of the first, so they can miss stale context-note leakage into the new run.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/tests/unit/cua-context-note-parity.test.ts, line 117:

<comment>The new cross-run parity tests assert on the second step of the second execution instead of the first, so they can miss stale context-note leakage into the new run.</comment>

<file context>
@@ -69,6 +69,62 @@ describe("CUA context note parity", () => {
+      logger: noopLogger,
+    });
+
+    const secondRunStep2Input = executeStepSpy.mock.calls[2]?.[0] as Array<{
+      role?: string;
+      content?: string;
</file context>
BABTUNA (Contributor, Author) replied:

Mmmm. These clients inject notes after a step, so a leak would appear on run-2 step-2, which is what the test checks.
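
The timing argument in this reply can be checked with a small simulation. The sketch below is a deliberately leaky, hypothetical client (`LeakyClient` and its fixed-step `execute` are assumptions, not the PR's code) that injects notes only after a non-terminal step, which is the behavior the reply describes.

```typescript
type Msg = { role: "user"; content: string };

// Deliberately leaky hypothetical client: it drains only while a run continues.
class LeakyClient {
  private pending: string[] = [];
  // One entry per model call, across all runs.
  readonly stepInputs: Msg[][] = [];

  addContextNote(note: string): void {
    this.pending.push(note);
  }

  // Runs a fixed number of steps; the last step is treated as terminal.
  execute(steps: number): void {
    let input: Msg[] = [];
    for (let i = 0; i < steps; i++) {
      this.stepInputs.push(input);
      const completed = i === steps - 1;
      if (!completed) {
        // Notes are injected after a step, into the next step's input.
        input = this.pending
          .splice(0)
          .map((content) => ({ role: "user" as const, content }));
      }
    }
  }
}

const leaky = new LeakyClient();
leaky.addContextNote("stale note");
leaky.execute(1); // run 1: a single terminal step, so the note is never drained
leaky.execute(2); // run 2: two steps

const run2Step1 = leaky.stepInputs[1]; // no note yet
const run2Step2 = leaky.stepInputs[2]; // the stale note surfaces here
```

Under these assumptions the first step of the second run sees no note, and the leak only becomes visible in the input to its second step.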


BABTUNA commented Apr 24, 2026

@cubic-dev-ai re-run review

cubic-dev-ai Bot commented Apr 24, 2026

@cubic-dev-ai re-run review

@BABTUNA I have started the AI code review. It will take a few minutes to complete.


@cubic-dev-ai cubic-dev-ai Bot left a comment

2 issues found across 4 files

Confidence score: 3/5

  • There is a concrete regression risk in packages/core/lib/v3/agent/AnthropicCUAClient.ts: pending context notes are not cleared on error paths, which can leak notes across executions and affect subsequent runs.
  • The new public addContextNote interface in packages/core/lib/v3/agent/AnthropicCUAClient.ts appears to be missing required flowLogger instrumentation, which can reduce traceability and make debugging/monitoring harder.
  • Given two medium-severity, high-confidence findings in core agent behavior, this carries some merge risk and is worth fixing before relying on it in production flows.
  • Pay close attention to packages/core/lib/v3/agent/AnthropicCUAClient.ts - error-path state cleanup and flowLogger instrumentation for the new public method.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/core/lib/v3/agent/AnthropicCUAClient.ts">

<violation number="1" location="packages/core/lib/v3/agent/AnthropicCUAClient.ts:117">
P2: Custom agent: **Ensure all public methods added to the stagehand class, agent, or understudy (page, locator, etc.) interfaces are properly instrumented with the flowLogger**

New public `addContextNote` interface mutates agent state and affects subsequent LLM input, but it is untracked by flowLogger.</violation>

<violation number="2" location="packages/core/lib/v3/agent/AnthropicCUAClient.ts:180">
P2: Context notes can leak across executions because pending notes are not cleared on error paths.</violation>
</file>
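
For the error-path finding, one common approach is to clear the queue in a `finally` block so it runs whether the step loop returns or throws. This is a sketch under assumptions: `ErrorSafeClient`, `pendingCount`, and the `runSteps` callback are hypothetical, and the method is synchronous here for brevity where a real client would likely await the step loop.

```typescript
// Hypothetical client showing error-path cleanup; not the real AnthropicCUAClient.
class ErrorSafeClient {
  private pendingContextNotes: string[] = [];

  addContextNote(note: string): void {
    this.pendingContextNotes.push(note);
  }

  // Exposed only so the sketch can be inspected.
  pendingCount(): number {
    return this.pendingContextNotes.length;
  }

  // `runSteps` stands in for the step loop and may throw.
  execute(runSteps: () => void): void {
    try {
      runSteps();
    } finally {
      // Runs on success and on error, so no note survives into the next execution.
      this.pendingContextNotes = [];
    }
  }
}

const safeClient = new ErrorSafeClient();
safeClient.addContextNote("note queued before a failing run");

let executionFailed = false;
try {
  safeClient.execute(() => {
    throw new Error("simulated provider failure");
  });
} catch {
  executionFailed = true;
}
```

The error still propagates to the caller; only the queued notes are discarded.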
Architecture diagram
sequenceDiagram
    participant H as V3CuaAgentHandler
    participant C as CUA Client (Anthropic/Google/MS)
    participant Q as pendingContextNotes (Queue)
    participant Hist as Provider History/Input
    participant API as Model API (Claude/Gemini/Fara)

    Note over H, API: Runtime Captcha/Context Injection Flow

    H->>C: NEW: addContextNote(note)
    C->>Q: Store note in queue

    H->>C: execute(task)
    
    loop Until Task Completed
        C->>API: getAction(currentInput)
        API-->>C: stepResult (completed: false)
        
        C->>C: internal: drainContextNotes()
        C->>Q: NEW: Get and clear all notes
        Q-->>C: notes[]
        
        alt NEW: notes exist AND not completed
            C->>Hist: CHANGED: Map notes to "user" messages
            Note right of Hist: Anthropic: nextInputItems<br/>Google: history<br/>Microsoft: conversationHistory
        end

        C->>C: Update currentInput for next turn
    end

    Note over C, Q: Terminal step or execution end
    C->>Q: NEW: drainContextNotes() (Prevent leaks to next run)
    C-->>H: finalResult

// Update completion status
completed = result.completed;

const contextNotes = this.drainContextNotes();

@cubic-dev-ai cubic-dev-ai Bot Apr 24, 2026

P2: Context notes can leak across executions because pending notes are not cleared on error paths.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/lib/v3/agent/AnthropicCUAClient.ts, line 180:

<comment>Context notes can leak across executions because pending notes are not cleared on error paths.</comment>

<file context>
@@ -172,9 +177,20 @@ export class AnthropicCUAClient extends AgentClient {
         // Update completion status
         completed = result.completed;
 
+        const contextNotes = this.drainContextNotes();
+
         // Update the input items for the next step if we're continuing
</file context>
this.tools = tools;
}

addContextNote(note: string): void {

@cubic-dev-ai cubic-dev-ai Bot Apr 24, 2026

P2: Custom agent: Ensure all public methods added to the stagehand class, agent, or understudy (page, locator, etc.) interfaces are properly instrumented with the flowLogger

New public addContextNote interface mutates agent state and affects subsequent LLM input, but it is untracked by flowLogger.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/core/lib/v3/agent/AnthropicCUAClient.ts, line 117:

<comment>New public `addContextNote` interface mutates agent state and affects subsequent LLM input, but it is untracked by flowLogger.</comment>

<file context>
@@ -113,6 +114,10 @@ export class AnthropicCUAClient extends AgentClient {
     this.tools = tools;
   }
 
+  addContextNote(note: string): void {
+    this.pendingContextNotes.push(note);
+  }
</file context>
Labels

external-contributor:awaiting-approval: Waiting for a stagehand team member to approve the latest external commit.
external-contributor: Tracks PRs mirrored from external contributor forks.

Development

Successfully merging this pull request may close these issues.

core(cua): implement addContextNote parity for Google/Anthropic/Microsoft clients
