Centralize custom code #332

Merged
pirate merged 4 commits into main from cleanup-custom-code
Apr 3, 2026

@pirate pirate commented Apr 3, 2026

Summary by cubic

Centralizes all handwritten SDK logic under stagehand/_custom and patches the generated client at import time to simplify local server mode and bound session handling. Also ensures Playwright CDP sessions are detached after frame ID extraction to prevent leaks.

  • Refactors

    • Moved SEA binary/server and session patches to stagehand/_custom; _client now uses configure_client_base_url, prepare_*, close_*, and copy_local_mode_kwargs and lazily starts/stops the SEA server.
    • Runtime patches install on import: sessions.start() returns a bound Session/AsyncSession; Session.extract supports Pydantic schema validation; generated resources/sessions.start is type-hinted to return bound sessions.
    • Exported Session and AsyncSession from stagehand; _client.sessions now exposes the generated resources/sessions (helper wrappers removed).
    • Local mode behavior: when server="local" and no browser is passed, default to a local browser if Browserbase creds are set; if creds are missing, require an explicit local browser.
    • Consolidated scripts: scripts/download_binary.py (renamed) and scripts/test_local_mode.py (new); docs updated.
  • Bug Fixes

    • Always detach Playwright CDP sessions after reading Page.getFrameTree when injecting frame_id from a page param (sync/async) to avoid leaked CDP sessions and flaky tests.
  • Migration

    • Update imports to stagehand._custom (e.g., from stagehand._custom import sea_server, sea_binary).
    • Use scripts/download_binary.py and scripts/test_local_mode.py; type-checker excludes updated.
    • Code using sessions.start() can use the returned Session/AsyncSession directly; raw/streaming responses are unchanged.
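
The import-time patch that makes `sessions.start()` return a bound session could look roughly like the sketch below. `GeneratedSessions`, `StartResponse`, and `install_patches` are hypothetical stand-ins, not the names used in `stagehand/_custom`; only the idea (wrap the generated method once, return an object that carries the session id) comes from the summary above.

```python
from __future__ import annotations

import functools
from dataclasses import dataclass
from typing import Any


@dataclass
class StartResponse:
    """Hypothetical stand-in for the generated start() response model."""
    session_id: str


class GeneratedSessions:
    """Hypothetical stand-in for the generated resources/sessions class."""

    def start(self, **params: Any) -> StartResponse:
        return StartResponse(session_id="sess_123")

    def extract(self, id: str, **params: Any) -> dict[str, Any]:
        return {"id": id, **params}


class Session:
    """Carries the session id so callers do not pass it on every call."""

    def __init__(self, sessions: GeneratedSessions, response: StartResponse) -> None:
        self._sessions = sessions
        self.id = response.session_id

    def extract(self, **params: Any) -> dict[str, Any]:
        return self._sessions.extract(self.id, **params)


def install_patches() -> None:
    """Wrap GeneratedSessions.start so it returns a bound Session (idempotent)."""
    original = GeneratedSessions.start
    if getattr(original, "_bound_session_patch", False):
        return  # already patched; do not wrap twice

    @functools.wraps(original)
    def start(self: GeneratedSessions, **params: Any) -> Session:
        return Session(self, original(self, **params))

    start._bound_session_patch = True
    GeneratedSessions.start = start


# Installing at module import time mirrors "runtime patches install on import".
install_patches()
```

With the patch installed, `GeneratedSessions().start().extract(instruction="...")` forwards the stored id automatically. The idempotence guard matters because a module-level patch can otherwise be applied more than once when the module is reloaded.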

Written for commit ee9b780. Summary will update on new commits. Review in cubic

@cubic-dev-ai cubic-dev-ai bot left a comment

3 issues found across 17 files

Confidence score: 3/5

  • There is a concrete runtime risk in src/stagehand/_custom/session.py: new_cdp_session(page) is not detached, so repeated calls can leak CDP sessions and steadily consume browser resources.
  • The local_host fallback in src/stagehand/_custom/sea_server.py uses or instead of an explicit is not None check, which can override an intentionally passed empty string and cause subtle configuration surprises.
  • Given the medium-high severity and high confidence on the session leak, this sits in a moderate-risk zone until that lifecycle handling is fixed.
  • Pay close attention to src/stagehand/_custom/session.py and src/stagehand/_custom/sea_server.py - session cleanup and argument fallback behavior are the key merge risks.
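
The session-leak fix described above amounts to a `try/finally` around the CDP call. A minimal sketch for the sync Playwright API follows; the function name `extract_frame_id` is hypothetical, and the response shape (`frameTree.frame.id`) is the one the Chrome DevTools Protocol documents for `Page.getFrameTree`.

```python
from typing import Any


def extract_frame_id(page: Any) -> str:
    """Return the main frame id of a Playwright sync Page.

    The CDP session is detached even when send() or the lookup raises,
    so repeated calls do not accumulate sessions in the browser.
    """
    cdp = page.context.new_cdp_session(page)
    try:
        tree = cdp.send("Page.getFrameTree")
        return tree["frameTree"]["frame"]["id"]
    finally:
        cdp.detach()
```

An async variant would have the same shape, with `await` on `new_cdp_session`, `send`, and `detach`.
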
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="src/stagehand/_custom/sea_server.py">

<violation number="1" location="src/stagehand/_custom/sea_server.py:348">
P2: `local_host` uses `or` for fallback while all other parameters use `is not None`. An explicitly passed empty string would be silently dropped. Use `if local_host is not None` for consistency and correctness.</violation>
</file>
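
To illustrate the difference flagged in this violation, here is a small sketch comparing the two fallback styles. `DEFAULT_HOST` and both function names are hypothetical; the real default in `sea_server.py` is not shown in this thread.

```python
from typing import Optional

DEFAULT_HOST = "127.0.0.1"  # hypothetical default, for illustration only


def host_with_or(local_host: Optional[str]) -> str:
    # Any falsy value (None *and* "") falls through to the default.
    return local_host or DEFAULT_HOST


def host_with_none_check(local_host: Optional[str]) -> str:
    # Only a missing value (None) falls through; "" is passed along as given.
    return local_host if local_host is not None else DEFAULT_HOST
```

The two agree for `None` and for any non-empty string; they differ only for the empty string, which is the case the reviewer calls out.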

<file name="src/stagehand/_custom/session.py">

<violation number="1" location="src/stagehand/_custom/session.py:62">
P1: CDP session created by `new_cdp_session(page)` is never detached after use. Each invocation of a session method with a `page` argument leaks a CDP session, consuming browser resources. Wrap the CDP interaction in a `try/finally` that calls `cdp.detach()`.</violation>

<violation number="2" location="src/stagehand/_custom/session.py:521">
P2: `_camel_to_snake` mishandles acronym boundaries: `"HTMLParser"` → `"htmlparser"` instead of `"html_parser"`. Add a forward-lookahead check for the transition from an uppercase run to a lowercase character.</violation>
</file>
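
For the acronym-boundary issue, one common approach is two regex passes: first split an uppercase run from a following capitalized word, then split lowercase-to-uppercase transitions. The sketch below is a standalone illustration of that technique, not the code in `session.py`.

```python
import re

# "HTMLParser" -> "HTML_Parser": an uppercase run followed by Upper+lower.
_ACRONYM_BOUNDARY = re.compile(r"([A-Z]+)([A-Z][a-z])")
# "frameId" -> "frame_Id": a lowercase letter or digit followed by an uppercase.
_LOWER_TO_UPPER = re.compile(r"([a-z0-9])([A-Z])")


def camel_to_snake(name: str) -> str:
    """Convert camelCase or PascalCase to snake_case, keeping acronyms together."""
    name = _ACRONYM_BOUNDARY.sub(r"\1_\2", name)
    name = _LOWER_TO_UPPER.sub(r"\1_\2", name)
    return name.lower()


print(camel_to_snake("HTMLParser"))  # html_parser
print(camel_to_snake("frameId"))     # frame_id
```
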
Architecture diagram
sequenceDiagram
    participant User
    participant Client as Stagehand Client
    participant Custom as _custom (Helpers)
    participant SEA as SeaServerManager
    participant Binary as SEA Binary (Local Process)
    participant API as Remote Stagehand API

    Note over User,API: Initialization & Configuration
    User->>Client: Stagehand(server="local", ...)
    Client->>Custom: CHANGED: configure_client_base_url()
    Custom->>SEA: Initialize with SeaServerConfig
    SEA-->>Client: Store manager instance
    Client-->>User: Instance ready

    Note over User,API: Session Lifecycle (Local Mode)
    User->>Client: sessions.start(...)
    Client->>Client: _prepare_options()
    Client->>Custom: CHANGED: prepare_sync_client_base_url()
    
    opt SEA Binary not running
        Custom->>SEA: ensure_running_sync()
        SEA->>Custom: resolve_binary_path()
        SEA->>Binary: Spawn child process (NODE_ENV=production)
        loop Poll Readiness
            SEA->>Binary: GET /health
            Binary-->>SEA: 200 OK
        end
    end
    
    SEA-->>Client: Local Base URL (e.g. 127.0.0.1:port)
    Client->>Binary: POST /v1/sessions/start
    Binary-->>Client: session_id + metadata
    
    Client->>Custom: NEW: Wrap in Session/AsyncSession
    Custom-->>User: Session Object (First-class export)

    Note over User,API: Action Execution (e.g., Extract)
    User->>Custom: session.extract(instruction, page=pw_page)
    
    opt Playwright Page Provided
        Custom->>Custom: NEW: _extract_frame_id_from_playwright_page()
    end
    
    Custom->>Client: sessions.extract(id, frame_id, ...)
    alt server == "local"
        Client->>Binary: POST /v1/sessions/extract
        Binary-->>Client: Data
    else server == "remote"
        Client->>API: POST /v1/sessions/extract
        API-->>Client: Data
    end
    Client-->>User: SessionExtractResponse

    Note over User,API: Cleanup
    User->>Client: close()
    Client->>Custom: CHANGED: close_sync_client_sea_server()
    Custom->>SEA: close()
    SEA->>Binary: Terminate Process
    Binary-->>SEA: Shutdown
    Client-->>User: Closed
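
The "Poll Readiness" loop in the diagram can be sketched as a deadline-bounded poll. This is an illustration under assumptions: `wait_until_ready` and `http_health_probe` are hypothetical names, and the timeout and interval values are arbitrary; only the `GET /health` returning 200 check comes from the diagram.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_health_probe(base_url: str, timeout: float = 1.0) -> bool:
    """Return True if GET {base_url}/health answers with status 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeouts, and non-2xx responses all mean "not ready yet".
        return False


def wait_until_ready(
    probe: Callable[[], bool], timeout: float = 10.0, interval: float = 0.1
) -> None:
    """Call probe() until it returns True; raise TimeoutError once the deadline passes."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError("server did not become ready in time")
        time.sleep(interval)
```

A caller would pass something like `lambda: http_health_probe("http://127.0.0.1:<port>")` once the child process has been spawned. Taking the probe as a callable keeps the loop testable without a running server.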

@pirate pirate force-pushed the cleanup-custom-code branch from 0455301 to ebe153f April 3, 2026 18:03
@pirate pirate force-pushed the cleanup-custom-code branch from ebe153f to ff29af4 April 3, 2026 18:21
@pirate pirate merged commit 3cbbcfc into main Apr 3, 2026
8 checks passed
2 participants