2 changes: 2 additions & 0 deletions CLAUDE.md
@@ -54,6 +54,8 @@ A lightweight Electron app that connects to a remote Chromium-based browser via
- **Local tabs**: Real local web pages rendered as in-DOM Electron `<webview>`s on a shared `persist:local` session (`src/components/local-webviews.tsx`) — full device access (OS notifications, speaker/mic, camera, screen-share) that CDP screencast tabs can't have. Because a `<webview>` is an in-page OOPIF, React overlays (dialogs, menus, tooltips, the settings sheet) stack **above the live page via CSS z-index** — no native z-order, no freeze. `activeKind: 'cdp' | 'local'` chooses the surface and routes the toolbar/nav hotkeys (`RemotePage` vs the active webview's methods). The renderer holds `LocalTab` metadata and maps webview DOM events to it; only the active webview is shown (others `display:none`, kept alive in the background). All open local tabs persist + restore on launch; pinned ones (a `pinned` flag, distinct from CDP PINNED pins) sort atop the LOCAL TABS section. Unpacked MV3 extensions load into the local session only (`localExtensionPaths`) and their content scripts inject into webview guests; the toolbar shows a Chrome-like action icon per extension (opens its popup in a popover), and popup/options also open as a local tab via the `chrome-extension://` URL. Permissions auto-granted behind the `autoGrantLocalMedia` setting (a `media` request triggers `askForMediaAccess`); packaging ships mic/cam/audio-capture Info.plist keys + entitlements (`build/entitlements.mac.plist`, hardened runtime). See `docs/adr/0005-local-tabs-base-window.md`.
- **Web build (no Electron)**: The same renderer runs as a plain web app via `web/server.mjs` — a Node HTTP proxy that serves the built `dist/` and exposes the whole `window.cdp` surface over **SSE** (`GET /api/events`, server→browser pushes incl. screencast frames) + **POST** (`/api/invoke`, `/api/send`, `/api/cdp-batch`, and REST for tabs/config/ui-state/pins/notifications). An optional **WebSocket** transport (`/api/ws`) supersedes SSE+POST when reachable — the user picks `Auto / Fastest (WS) / Streaming / Basic` in settings (2×2 toggle, web-only, `localStorage`). When WS is ready, frames + events + input all ride the one full-duplex socket. WS needs three lines in the nginx custom config (`proxy_http_version 1.1`, `proxy_set_header Upgrade $http_upgrade`, `proxy_set_header Connection $http_connection`); without them the client silently falls back to SSE+POST. See `docs/adr/0007-web-websocket-transport.md`. The proxy→CDP hop is still WS. The renderer installs a web `window.cdp` (`src/lib/cdp-web-transport.ts`, a thin assembler) when no preload exists, satisfying the same `CdpBridge` contract; the transport is split into named seams — a **Downlink** (`src/lib/downlink-dispatcher.ts`: one live WS-or-SSE source, decoder→filter→fan-out→toast-once dispatcher) and an **Uplink** (`src/lib/uplink-router.ts`: WS/stream/POST adapters + ready-transport router), with E2E sealed/opened once per direction through `src/lib/crypto-context.ts`. 
Input is coalesced via `src/lib/input-coalesce.ts`; the proxy acks frames itself, **except** for a WS client that announces ack-after-paint support (a plaintext `frame-ack-mode` control) — for that client the proxy **defers** its remote-ack and gates the next Screencast Frame on the client's post-paint `frame-ack`, so at most one frame is in flight on the link and a slow link can't accrue a stale-frame backlog (`core/frame-ack-gate.js`, the pure one-in-flight gate + a watchdog that frees the slot if a paint-ack never lands; the renderer fires the ack from `viewport.tsx` after it paints, via `window.cdp.ackPaintedFrame`; SSE/non-supporting clients keep the eager self-ack — see `docs/tasks/done/056-*`); theme follows `matchMedia`. **Always-on latency metrics** (`src/lib/latency-metrics.ts`, t057) ride the same seams: the WS uplink fires a plaintext `ping` (monotonic stamp) every 20s — a keepalive against proxy idle-reap plus an RTT/jitter EWMA probe — and the server echoes `{ t: "pong", seq, ts }` (RTT is measured only on the client clock); every Screencast Frame envelope carries a server `serverTs` so the client computes frame age (`now − serverTs + rtt/2`), recorded by the dispatcher before fan-out. Collection runs continuously (no `?perf=1`); the HUD is `src/components/latency-hud.tsx` (t059), always-on in the status bar. RTT/jitter report unavailable on the SSE+POST fallback. A `window.webCaps` flag (read through one accessor — `getCaps()` in `src/lib/caps.ts`, never inline) gates Electron-only surfaces. Local tabs are gated **structurally at the data source**: `useLocalTabs()` (`src/hooks/use-local-tabs.ts`) reads `caps.localTabs` once and returns an empty list + no-op handlers on web, so the renderer can't drive local-tab logic there (`LocalWebviews` never mounts, the new-tab kind toggle is hidden, Cmd+T/Cmd+Shift+T resolve to CDP only). Extensions are still gated at render only. `window.local` is a no-op stub (the safety net, not the mechanism). 
See `docs/conventions/feature-gates.md`. Pure shared logic lives in `core/` CJS modules — `cdp-endpoints.js` (`/json` URL builders), `settings-store.js` (settings/pins/ui-state), `notifications-sidechain.js` (Notification Side-Channel state machine + store, DI), `remote-page-connector.js` (Remote Page connect choreography, DI), `notifications.js` (dedup/cap/toast gating, Slack workspace key: `parseSlackContext`/`slackGroupKey`), `theme-emulation.js`, `crypto-envelope.js` (AES-256-GCM server side), `line-splitter.js` (NDJSON reassembly), `frame-throttle.js`, `frame-ack-gate.js`, and `quality-tier.js` — consumed by both `main.js` and `web/server.mjs`. Run `pnpm web`. See `docs/adr/0006-web-proxy-sse-transport.md`. The web build is an installable **PWA** (`public/manifest.webmanifest` with `APP_TITLE`-injected name + `public/sw.js`); the manifest is **iPad-targeted** (`"orientation": "landscape"`, `viewport-fit=cover`; `body` uses `100dvh` for full height including Safari URL-bar; safe-area insets are applied per-component — sidebar scroll content uses `pb-[max(0.5rem,env(safe-area-inset-bottom))]`, status bar uses `pb-[env(safe-area-inset-bottom)]`; sidebar defaults to 180px on viewports ≤1100px; an install nudge banner (`install-banner.tsx`) prompts Safari-tab visits to Add to Home Screen). Has a web-only **push-notification** toggle (`webPush` ui-state) that drives real **Web Push** on installed PWAs (iOS 16.4+) — VAPID-signed payloads from the server (`web-push` library) reach a service-worker `push` handler that fires `showNotification` even when the PWA is backgrounded or the screen is locked; clicks post-message back to the page and route through the same `notificationActivate` listeners as in-app clicks. Foreground tabs still get the in-page `Notification` API as before. Subscriptions persist in `web-push-subs.json` next to the settings file. 
The toggle is disabled in Safari-tab mode (Web Push needs standalone display). The web build also lowers input latency with a **streaming input channel**: one long-lived `POST /api/input-stream` (fetch `ReadableStream` body over HTTP/2, NDJSON frames reassembled by `core/line-splitter.js`) that a probe/`stream-ack` confirms before use and that falls back to `/api/cdp-batch` if a proxy buffers it. Streaming needs `proxy_request_buffering off` upstream to activate; when it can't activate (the default behind nginx/Authentik), mouse input is **event-driven** so it doesn't flood the fallback: a **hover gate** (`createHoverGate`) holds buttons-up moves and emits one resting position only when the cursor stops (drag moves bypass it and track live; clicks carry their own coords), and the `/api/cdp-batch` fallback is **single-flight with move-collapsing** (`createSingleFlight`: one POST in flight, consecutive `mouseMoved` events collapse to the latest) so the rate auto-adapts to link RTT instead of backing up fire-and-forget POSTs and starving clicks. See `docs/tasks/done/013-*`. An optional **E2E mode** (set `E2E_PASSPHRASE` on the server) seals every `/api` body + SSE frame in AES-256-GCM (`core/crypto-envelope.js` server / `src/lib/crypto-envelope.ts` browser; the single owner is `src/lib/crypto-context.ts`, where the uplink seals once before leaving and the downlink opens once on arrival) so content stays opaque to a TLS-intercepting proxy (Zscaler); a verifier handshake rejects a wrong passphrase, and with E2E off everything is plaintext as before. It defeats network content inspection, not endpoint screen capture. See `docs/tasks/done/012-*`.
- **Clipboard paste (t065)**: Two gesture-driven one-way bridges — no ambient background sync (focus/permission wall + privacy). **Local→remote text**: ⌘/Ctrl+V reads the local clipboard (`window.cdp.readClipboard()` via Electron IPC / `navigator.clipboard` on web) and calls `RemotePage.paste(text)` → `Input.insertText` (plain) or pre-seed + forwarded ⌘V (rich). **Local→remote image**: `window.cdp.readClipboardImage()` (Electron IPC, reads `clipboard.readImage()`) or the native browser `paste` event (web — Safari/iPad blocks `navigator.clipboard.readText`/images; instead ⌘V is not `preventDefault`ed so the browser fires a `paste` ClipboardEvent on the document); either path calls `RemotePage.pasteImage(dataUrl)` → `Runtime.evaluate` synthesizes a paste `ClipboardEvent` with a `DataTransfer` carrying the image as a `File`. **Typing surface guard**: bare `?` (and other bare-char shortcuts) forward to the remote page when `activeKind` is `cdp` or `local` (`isTypingSurface` in `src/lib/typing-surface.ts`); the shortcut overlay opens via `⌘/` instead. `core/clipboard.js` owns the pure `Browser.grantPermissions` enum-fallback helpers and `selectPasteRoute`.
+- **Notification tab keep-alive (t066)**: Chromium freezes idle background tabs (after ~5 min), pausing the page JS that the capture script hooks (`window.Notification`), so background tabs silently stopped delivering notifications and only the active tab notified. The side-channel now sends `Page.setWebLifecycleState({state:"active"})` as soon as its socket opens and re-applies it on every `reconcile` (the browser can re-freeze the tab). This un-freezes the tab **without** making it "visible" (verified against the CDP spec: `setWebLifecycleState` only takes `"frozen"|"active"` and governs freeze state, not `document.visibilityState`), so Slack still treats the tab as hidden and keeps firing desktop notifications for the side-channel to capture. The keep-alive lives in `core/notifications-sidechain.js` (each `sideChannels` map value is now `{ ws, keepAlive }`), so both Electron and the headless web server benefit. **Out of scope (→ t067):** notifications raised from a service-worker `push` handler (`registration.showNotification`) run in a separate realm that the page hook can't reach.
+- **Notification favicon (t066, Electron)**: The OS notification banner and the macOS dock icon carry the source app's favicon so you can tell *which* app pinged you. `dockOverlayIcon(list)` (pure, `core/notifications.js`) returns the icon of the newest unread entry, or `null` when everything is read or that entry has no icon (→ restore the plain app icon). `main.js` fetches the favicon bytes (no browser CORS wall), passes them as a data URL into the chrome renderer via `executeJavaScript` to composite the base icon with the favicon at bottom-right (the renderer's `<img>` decodes `.ico`; data-URL inputs never taint the canvas), and turns the returned PNG data URLs into `nativeImage`s for `app.dock.setIcon` and the `Notification` `icon`. The icons are re-synced on every new entry, on mark-read/unread/all, on clear, and at launch.
- **Notifications side-channel**: A per-target read-only CDP socket (no screencast, no input) stays attached to background tabs that match a Notification Adapter (Teams, Outlook, Slack). Lifecycle and state machine live in `core/notifications-sidechain.js` (`createNotificationCenter`, DI) — consumed by both `main.js` and `web/server.mjs`; the server runs it headless. A capture script (per adapter, in `inject/`) is injected at document-start and ships toasts through a `__cdpNotify` binding. Pure dedup/cap/read-model helpers remain in `core/notifications.js`. Each adapter carries a `name`, hostname `match` regex, capture `script`, `iconUrl`, optional `activate` tagged union (`spa-link` | `thread`) for deep-opens, and an optional `groupKey(url)` hook (URL-derived per-workspace bucketing) — adding an adapter is one config entry in `ADAPTERS`. Capture style varies by site: Teams/Outlook use a `MutationObserver` on the site's own in-app toast DOM; **Slack has no in-app toast**, so its script (`inject/slack-notify.js`, t064) hijacks the Web Notifications API at document-start — it patches `window.Notification` to intercept every fired notification, and forces `Notification.permission` → `"granted"` so Slack actually fires (a remote browser's permission is often `"default"`, which would otherwise suppress all notifications; service-worker `push`-handler notifications are out of scope — a separate JS realm unreachable from the page script). Multiple Slack workspaces (one tab per workspace — switched-away workspaces in a single tab aren't running JS) share the `app.slack.com` origin, so per-origin grouping would merge them; the Slack adapter's `groupKey(url)` derives `slack:{teamId}` from the tab URL (`slackGroupKey`/`parseSlackContext` in `core/notifications.js`; `T…` standard or `E…` Enterprise Grid, legacy subdomain fallback) to keep per-workspace unread counts distinct. 
Clicking a notification activates the tab, then the renderer's activation registry (`src/lib/notification-activation.ts`) maps the `activate` intent to a Remote Page intention (`navigateSpa` for Outlook + Slack channel deep-links, `openTeamsThread` for Teams chats). Teams has no conversation URL (the URL stays bare `/v2/`), so thread-id clicks drive `openTeamsThread`; Slack reuses `spa-link` to `/client/{team}/{channel}` (best-effort — degrades to tab-only when the notification carries no channel id). See `docs/adr/0003-notifications-side-channel.md`.
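
The per-workspace bucketing described in the side-channel bullet can be illustrated with a small sketch. This is not the repo's `slackGroupKey`/`parseSlackContext`: the function name, both regular expressions, and the `slack:{subdomain}` shape of the legacy fallback key are assumptions made only to show the idea of deriving a group key from the tab URL.

```javascript
// Hypothetical sketch; the real helpers live in core/notifications.js and may differ.
function sketchSlackGroupKey(tabUrl) {
  let url
  try {
    url = new URL(tabUrl)
  } catch {
    return null // not a parseable URL, so no grouping key
  }
  // Assumed path shape: /client/{team}/{channel}, with team ids starting in T or E.
  const team = url.pathname.match(/^\/client\/([TE][A-Z0-9]+)(?:\/|$)/)
  if (team) return `slack:${team[1]}`
  // Assumed legacy shape: {workspace}.slack.com, excluding the shared app host.
  const sub = url.hostname.match(/^([a-z0-9-]+)\.slack\.com$/)
  if (sub && sub[1] !== "app") return `slack:${sub[1]}`
  return null
}

const standardKey = sketchSlackGroupKey("https://app.slack.com/client/T012ABC/C999")
const gridKey = sketchSlackGroupKey("https://app.slack.com/client/E0GRID1/C1")
const legacyKey = sketchSlackGroupKey("https://acme.slack.com/messages/general")
const noKey = sketchSlackGroupKey("https://app.slack.com/")
```

Because two workspace tabs on `app.slack.com` yield different keys here, unread counts grouped by key stay separate even though the origin is shared.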

## File Structure
29 changes: 24 additions & 5 deletions core/notifications-sidechain.js
@@ -102,7 +102,7 @@ function createNotificationCenter(deps) {
         // back-compatible display.
         activate: n.activate || null,
         targetEntity: n.targetEntity || null,
-        icon: (adapter || {}).iconUrl || null,
+        icon: adapter?.iconUrl || null,
         ts: n.ts || (deps.now ? deps.now() : Date.now()),
       },
       cap,
@@ -117,17 +117,32 @@ function createNotificationCenter(deps) {
     const adapter = adapterFor(target.url)
     if (!adapter || !target.webSocketDebuggerUrl) return
     const ws = new WebSocketCtor(target.webSocketDebuggerUrl)
-    sideChannels.set(target.id, ws)
     let cmdId = 1
+    let opened = false
     const cdp = (method, params) =>
       ws.send(JSON.stringify({ id: cmdId++, method, params: params || {} }))
+    // Keep the remote Tab's page alive so its capture script keeps firing even when the
+    // Tab is backgrounded on the remote browser. Chromium freezes idle background tabs
+    // (~5 min), which pauses the page JS that calls `new Notification()` — so background
+    // Tabs silently stop delivering toasts (the asymmetry where only the active Tab
+    // notified). Forcing the web lifecycle to "active" prevents the freeze WITHOUT making
+    // the page "visible" (visibility is orthogonal in the CDP spec — verified against
+    // Page.setWebLifecycleState, which only takes "frozen"|"active"), so Slack still treats
+    // the Tab as hidden and keeps firing desktop notifications for the side-channel to
+    // capture. Re-applied every reconcile because the browser can re-freeze. See t066.
+    const keepAlive = () => {
+      if (opened) cdp("Page.setWebLifecycleState", { state: "active" })
+    }
+    sideChannels.set(target.id, { ws, keepAlive })
     ws.on("open", () => {
+      opened = true
       cdp("Runtime.enable")
       cdp("Page.enable")
       cdp("Runtime.addBinding", { name: NOTIFY_BINDING })
       // document-start for future loads + the already-loaded document.
       cdp("Page.addScriptToEvaluateOnNewDocument", { source: sourceFor(adapter) })
       cdp("Runtime.evaluate", { expression: sourceFor(adapter) })
+      keepAlive()
     })
     ws.on("message", (data) => {
       try {
@@ -138,7 +153,8 @@ function createNotificationCenter(deps) {
       } catch {}
     })
     const drop = () => {
-      if (sideChannels.get(target.id) === ws) sideChannels.delete(target.id)
+      const cur = sideChannels.get(target.id)
+      if (cur && cur.ws === ws) sideChannels.delete(target.id)
     }
     ws.on("close", drop)
     ws.on("error", drop)
@@ -158,7 +174,7 @@ function createNotificationCenter(deps) {
     if (!Array.isArray(list)) return
     const matched = list.filter((t) => t.type === "page" && adapterFor(t.url))
     const liveIds = new Set(matched.map((t) => t.id))
-    for (const [id, ws] of sideChannels) {
+    for (const [id, { ws }] of sideChannels) {
       if (!liveIds.has(id)) {
         try {
           ws.close()
@@ -167,6 +183,9 @@ function createNotificationCenter(deps) {
       }
     }
     for (const t of matched) if (!sideChannels.has(t.id)) attach(t)
+    // Re-apply keep-alive to every live side-channel each cycle — the browser may have
+    // re-frozen a backgrounded Tab since the last pass (t066).
+    for (const [, ch] of sideChannels) ch.keepAlive()
   }

   return {
@@ -195,7 +214,7 @@ function createNotificationCenter(deps) {
     },
     unreadCount: () => unreadCount(notifications),
     close: () => {
-      for (const [, ws] of sideChannels) {
+      for (const [, { ws }] of sideChannels) {
         try {
           ws.close()
         } catch {}
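
The guard-then-send shape of the keep-alive in this file can be exercised in isolation. The sketch below is a simplified stand-in, not the repo's `attach`: `RecordingSocket` and `attachWithKeepAlive` are hypothetical names, adapters and bindings are left out, and the socket only records what would be sent instead of talking to a browser.

```javascript
// Hypothetical stand-in for the CDP WebSocket: records outgoing commands.
class RecordingSocket {
  constructor() {
    this.sent = []
    this.handlers = {}
  }
  on(event, fn) {
    this.handlers[event] = fn
  }
  send(raw) {
    this.sent.push(JSON.parse(raw))
  }
  open() {
    if (this.handlers.open) this.handlers.open()
  }
}

// Simplified attach: same `opened` guard as the diff, minus adapter wiring.
function attachWithKeepAlive(ws) {
  let cmdId = 1
  let opened = false
  const cdp = (method, params) =>
    ws.send(JSON.stringify({ id: cmdId++, method, params: params || {} }))
  const keepAlive = () => {
    // Sending before the socket opens would throw on a real WebSocket, hence the guard.
    if (opened) cdp("Page.setWebLifecycleState", { state: "active" })
  }
  ws.on("open", () => {
    opened = true
    cdp("Page.enable")
    keepAlive()
  })
  return { ws, keepAlive }
}

const socket = new RecordingSocket()
const channel = attachWithKeepAlive(socket)
channel.keepAlive() // socket not open yet: nothing is sent
socket.open() // Page.enable, then the first keep-alive
channel.keepAlive() // a later reconcile pass re-applies it
const methods = socket.sent.map((m) => m.method)
```

Storing `{ ws, keepAlive }` per target, as the diff does, lets `reconcile` re-apply the lifecycle state without knowing anything about command ids or socket state.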
30 changes: 30 additions & 0 deletions core/notifications-sidechain.test.ts
@@ -172,6 +172,36 @@ describe("reconcile — idempotent / drop", () => {
   })
 })

+describe("keep-alive (t066) — prevent background-tab freeze", () => {
+  it("forces the page web lifecycle to active on open so a backgrounded Tab keeps firing notifications", async () => {
+    const { center } = makeCenter()
+    await center.reconcile([teamsTarget()])
+    const ws = FakeWs.instances[0]
+    ws.open()
+    const keepAlive = ws.sent.filter((m) => m.method === "Page.setWebLifecycleState")
+    expect(keepAlive).toHaveLength(1)
+    expect(keepAlive[0].params.state).toBe("active")
+  })
+
+  it("does not send keep-alive before the socket opens", async () => {
+    const { center } = makeCenter()
+    await center.reconcile([teamsTarget()])
+    const ws = FakeWs.instances[0]
+    const count = ws.sent.filter((m) => m.method === "Page.setWebLifecycleState").length
+    expect(count).toBe(0)
+  })
+
+  it("re-applies keep-alive on each reconcile (browser may re-freeze the Tab)", async () => {
+    const { center } = makeCenter()
+    await center.reconcile([teamsTarget()])
+    const ws = FakeWs.instances[0]
+    ws.open()
+    await center.reconcile([teamsTarget()]) // same target, socket already open
+    const count = ws.sent.filter((m) => m.method === "Page.setWebLifecycleState").length
+    expect(count).toBe(2)
+  })
+})
+
 describe("ingest dedup", () => {
   it("drops a duplicate toast within the dedup window — one stored entry, one onEntry", async () => {
     const { center, onEntry } = makeCenter()
10 changes: 10 additions & 0 deletions core/notifications.js
@@ -99,6 +99,15 @@ function unreadCount(list) {
   return list.reduce((acc, n) => acc + (n.read ? 0 : 1), 0)
 }

+// The favicon to overlay on the app's dock icon: the icon of the most-recent UNREAD
+// notification (the list is newest-first), or null when nothing is unread (clear the
+// overlay, restore the plain app icon). Pure — main.js owns the image composite +
+// app.dock.setIcon effect. See t066.
+function dockOverlayIcon(list) {
+  const newestUnread = list.find((n) => !n.read)
+  return newestUnread?.icon || null
+}
+
 // { [targetId]: unreadCount } — only targets with at least one unread appear.
 function unreadByTarget(list) {
   const out = {}
@@ -115,6 +124,7 @@ module.exports = {
   slackGroupKey,
   ingest,
   shouldNotifyOs,
+  dockOverlayIcon,
   markRead,
   markUnread,
   markAllRead,
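
A short usage sketch for `dockOverlayIcon` may help when reviewing its edge cases. The function body is copied from the diff so the example is self-contained; the entries and their URLs are hypothetical placeholders, and the list is assumed to be newest-first, as the code comment states.

```javascript
// Copied from the diff so the example runs on its own.
function dockOverlayIcon(list) {
  const newestUnread = list.find((n) => !n.read)
  return newestUnread?.icon || null
}

// Hypothetical entries, newest first; the URLs are placeholders.
const entries = [
  { id: 3, read: true, icon: "https://teams.example/favicon.ico" },
  { id: 2, read: false, icon: "https://slack.example/favicon.ico" },
  { id: 1, read: false, icon: "https://outlook.example/favicon.ico" },
]

// Newest unread entry wins, even though an older unread entry also has an icon.
const overlay = dockOverlayIcon(entries)

// Everything read: null, which signals "restore the plain app icon".
const cleared = dockOverlayIcon(entries.map((n) => ({ ...n, read: true })))

// Newest unread entry has no icon: also null, with no fallback to older entries.
const noIcon = dockOverlayIcon([{ id: 4, read: false, icon: null }, ...entries])
```

The last case is worth noting in review: an adapter without an `iconUrl` clears the overlay rather than falling back to the next unread entry's icon, which may or may not be the intended behavior.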