diff --git a/CLAUDE.md b/CLAUDE.md index b827eca..a7356c8 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -53,7 +53,10 @@ A lightweight Electron app that connects to a remote Chromium-based browser via - **Unread badges by group**: Sidebar unread counts are computed by `aggregateUnread` (`src/lib/unread-aggregator.ts`) and keyed by `groupKey` (from the notification entry) falling back to `groupKeyForUrl(url)` — Slack's per-workspace `slack:{teamId}`, else URL origin. Every tab/pin of the same app shares one count whether or not it captured the notification, and a dormant pin still badges by resolving its saved URL through the same key derivation. - **Local tabs**: Real local web pages rendered as in-DOM Electron `<webview>`s on a shared `persist:local` session (`src/components/local-webviews.tsx`) — full device access (OS notifications, speaker/mic, camera, screen-share) that CDP screencast tabs can't have. Because a `<webview>` is an in-page OOPIF, React overlays (dialogs, menus, tooltips, the settings sheet) stack **above the live page via CSS z-index** — no native z-order, no freeze. `activeKind: 'cdp' | 'local'` chooses the surface and routes the toolbar/nav hotkeys (`RemotePage` vs the active webview's methods). The renderer holds `LocalTab` metadata and maps webview DOM events to it; only the active webview is shown (others `display:none`, kept alive in the background). All open local tabs persist + restore on launch; pinned ones (a `pinned` flag, distinct from CDP PINNED pins) sort atop the LOCAL TABS section. Unpacked MV3 extensions load into the local session only (`localExtensionPaths`) and their content scripts inject into webview guests; the toolbar shows a Chrome-like action icon per extension (opens its popup in a popover), and popup/options also open as a local tab via the `chrome-extension://` URL.
Permissions auto-granted behind the `autoGrantLocalMedia` setting (a `media` request triggers `askForMediaAccess`); packaging ships mic/cam/audio-capture Info.plist keys + entitlements (`build/entitlements.mac.plist`, hardened runtime). See `docs/adr/0005-local-tabs-base-window.md`. - **Web build (no Electron)**: The same renderer runs as a plain web app via `web/server.mjs` — a Node HTTP proxy that serves the built `dist/` and exposes the whole `window.cdp` surface over **SSE** (`GET /api/events`, server→browser pushes incl. screencast frames) + **POST** (`/api/invoke`, `/api/send`, `/api/cdp-batch`, and REST for tabs/config/ui-state/pins/notifications). An optional **WebSocket** transport (`/api/ws`) supersedes SSE+POST when reachable — the user picks `Auto / Fastest (WS) / Streaming / Basic` in settings (2×2 toggle, web-only, `localStorage`). When WS is ready, frames + events + input all ride the one full-duplex socket. WS needs three lines in the nginx custom config (`proxy_http_version 1.1`, `proxy_set_header Upgrade $http_upgrade`, `proxy_set_header Connection $http_connection`); without them the client silently falls back to SSE+POST. See `docs/adr/0007-web-websocket-transport.md`. The proxy→CDP hop is still WS. The renderer installs a web `window.cdp` (`src/lib/cdp-web-transport.ts`, a thin assembler) when no preload exists, satisfying the same `CdpBridge` contract; the transport is split into named seams — a **Downlink** (`src/lib/downlink-dispatcher.ts`: one live WS-or-SSE source, decoder→filter→fan-out→toast-once dispatcher) and an **Uplink** (`src/lib/uplink-router.ts`: WS/stream/POST adapters + ready-transport router), with E2E sealed/opened once per direction through `src/lib/crypto-context.ts`. 
Input is coalesced via `src/lib/input-coalesce.ts`; the proxy acks frames itself, **except** for a WS client that announces ack-after-paint support (a plaintext `frame-ack-mode` control) — for that client the proxy **defers** its remote-ack and gates the next Screencast Frame on the client's post-paint `frame-ack`, so at most one frame is in flight on the link and a slow link can't accrue a stale-frame backlog (`core/frame-ack-gate.js`, the pure one-in-flight gate + a watchdog that frees the slot if a paint-ack never lands; the renderer fires the ack from `viewport.tsx` after it paints, via `window.cdp.ackPaintedFrame`; SSE/non-supporting clients keep the eager self-ack — see `docs/tasks/done/056-*`); theme follows `matchMedia`. **Always-on latency metrics** (`src/lib/latency-metrics.ts`, t057) ride the same seams: the WS uplink fires a plaintext `ping` (monotonic stamp) every 20s — a keepalive against proxy idle-reap plus an RTT/jitter EWMA probe — and the server echoes `{ t: "pong", seq, ts }` (RTT is measured only on the client clock); every Screencast Frame envelope carries a server `serverTs` so the client computes frame age (`now − serverTs + rtt/2`), recorded by the dispatcher before fan-out. Collection runs continuously (no `?perf=1`); the HUD is `src/components/latency-hud.tsx` (t059), always-on in the status bar. RTT/jitter report unavailable on the SSE+POST fallback. A `window.webCaps` flag (read through one accessor — `getCaps()` in `src/lib/caps.ts`, never inline) gates Electron-only surfaces. Local tabs are gated **structurally at the data source**: `useLocalTabs()` (`src/hooks/use-local-tabs.ts`) reads `caps.localTabs` once and returns an empty list + no-op handlers on web, so the renderer can't drive local-tab logic there (`LocalWebviews` never mounts, the new-tab kind toggle is hidden, Cmd+T/Cmd+Shift+T resolve to CDP only). Extensions are still gated at render only. `window.local` is a no-op stub (the safety net, not the mechanism). 
See `docs/conventions/feature-gates.md`. Pure shared logic lives in `core/` CJS modules — `cdp-endpoints.js` (`/json` URL builders), `settings-store.js` (settings/pins/ui-state), `notifications-sidechain.js` (Notification Side-Channel state machine + store, DI), `remote-page-connector.js` (Remote Page connect choreography, DI), `notifications.js` (dedup/cap/toast gating, Slack workspace key: `parseSlackContext`/`slackGroupKey`), `theme-emulation.js`, `crypto-envelope.js` (AES-256-GCM server side), `line-splitter.js` (NDJSON reassembly), `frame-throttle.js`, `frame-ack-gate.js`, and `quality-tier.js` — consumed by both `main.js` and `web/server.mjs`. Run `pnpm web`. See `docs/adr/0006-web-proxy-sse-transport.md`. The web build is an installable **PWA** (`public/manifest.webmanifest` with `APP_TITLE`-injected name + `public/sw.js`); the manifest is **iPad-targeted** (`"orientation": "landscape"`, `viewport-fit=cover`; `body` uses `100dvh` for full height including Safari URL-bar; safe-area insets are applied per-component — sidebar scroll content uses `pb-[max(0.5rem,env(safe-area-inset-bottom))]`, status bar uses `pb-[env(safe-area-inset-bottom)]`; sidebar defaults to 180px on viewports ≤1100px; an install nudge banner (`install-banner.tsx`) prompts Safari-tab visits to Add to Home Screen). Has a web-only **push-notification** toggle (`webPush` ui-state) that drives real **Web Push** on installed PWAs (iOS 16.4+) — VAPID-signed payloads from the server (`web-push` library) reach a service-worker `push` handler that fires `showNotification` even when the PWA is backgrounded or the screen is locked; clicks post-message back to the page and route through the same `notificationActivate` listeners as in-app clicks. Foreground tabs still get the in-page `Notification` API as before. Subscriptions persist in `web-push-subs.json` next to the settings file. 
The toggle is disabled in Safari-tab mode (Web Push needs standalone display), and lowers input latency with a **streaming input channel** — one long-lived `POST /api/input-stream` (fetch `ReadableStream` body over HTTP/2, NDJSON frames reassembled by `core/line-splitter.js`) that a probe/`stream-ack` confirms before use and that falls back to `/api/cdp-batch` if a proxy buffers it. Streaming needs `proxy_request_buffering off` upstream to activate; when it can't (the default behind nginx/Authentik), mouse input is **event-driven** so it doesn't flood the fallback: a **hover gate** (`createHoverGate`) holds buttons-up moves and emits one resting position only when the cursor stops (drag moves bypass it and track live; clicks carry their own coords), and the `/api/cdp-batch` fallback is **single-flight with move-collapsing** (`createSingleFlight` — one POST in flight, consecutive `mouseMoved` collapse to the latest) so the rate auto-adapts to link RTT instead of backing up fire-and-forget POSTs and starving clicks. See `docs/tasks/done/013-*`. An optional **E2E mode** (set `E2E_PASSPHRASE` on the server) seals every `/api` body + SSE frame in AES-256-GCM (`core/crypto-envelope.js` server / `src/lib/crypto-envelope.ts` browser; the single owner is `src/lib/crypto-context.ts` — the uplink seals once before leaving, the downlink opens once on arrival) so content stays opaque to a TLS-intercepting proxy (Zscaler); a verifier handshake rejects a wrong passphrase, and with E2E off everything is plaintext as before. It defeats network content inspection, not endpoint screen capture. See `docs/tasks/done/012-*`. -- **Clipboard paste (t065)**: Two gesture-driven one-way bridges — no ambient background sync (focus/permission wall + privacy). **Local→remote text**: ⌘/Ctrl+V reads the local clipboard (`window.cdp.readClipboard()` via Electron IPC / `navigator.clipboard` on web) and calls `RemotePage.paste(text)` → `Input.insertText` (plain) or pre-seed + forwarded ⌘V (rich). 
**Local→remote image**: `window.cdp.readClipboardImage()` (Electron IPC, reads `clipboard.readImage()`) or the native browser `paste` event (web — Safari/iPad blocks `navigator.clipboard.readText`/images; instead ⌘V is not `preventDefault`ed so the browser fires a `paste` ClipboardEvent on the document); either path calls `RemotePage.pasteImage(dataUrl)` → `Runtime.evaluate` synthesizes a paste `ClipboardEvent` with a `DataTransfer` carrying the image as a `File`. **Typing surface guard**: bare `?` (and other bare-char shortcuts) forward to the remote page when `activeKind` is `cdp` or `local` (`isTypingSurface` in `src/lib/typing-surface.ts`); the shortcut overlay opens via `⌘/` instead. `core/clipboard.js` owns the pure `Browser.grantPermissions` enum-fallback helpers and `selectPasteRoute`. +- **Clipboard paste (t065)**: Two gesture-driven one-way bridges — no ambient background sync (focus/permission wall + privacy). **Local→remote text**: ⌘/Ctrl+V reads the local clipboard (`window.cdp.readClipboard()` via Electron IPC / `navigator.clipboard` on web) and calls `RemotePage.paste(text)` → `Input.insertText` (plain) or pre-seed + forwarded ⌘V (rich). **Local→remote image**: `window.cdp.readClipboardImage()` (Electron IPC, reads `clipboard.readImage()`) or the native browser `paste` event (web — Safari/iPad blocks `navigator.clipboard.readText`/images; instead ⌘V is not `preventDefault`ed so the browser fires a `paste` ClipboardEvent on the document); either path calls `RemotePage.pasteImage(dataUrl)` → `Runtime.evaluate` synthesizes a paste `ClipboardEvent` with a `DataTransfer` carrying the image as a `File`. 
**Local→remote file (video/doc, t068)**: a copied *file* (not raw image bits) is read as the actual file — `clipboard.readImage()` only yields the file's icon thumbnail, so Electron `window.cdp.readClipboardFiles()` reads the path from the clipboard's `public.file-url` and returns `{ name, type, dataUrl }` (mime from `core/clipboard.js` `mimeForName`; main has fs access, no CORS wall; 150 MB cap). The Cmd+V handler tries files → image bits → text in that order; web reads any file kind off the native `paste` event (not just `image/`). Both call `RemotePage.pasteFile(dataUrl, name, type)`, which synthesizes the same paste event but preserves the real name + MIME so upload targets accept a video. `pasteImage` is now a thin wrapper over `pasteFile`. **Drag-and-drop (t068)**: clipboard-as-file is flaky (a copied video may expose only an icon bitmap, not a `public.file-url`), so the robust path is dropping the file onto the screencast canvas — `viewport.tsx` reads the dropped `File`s via `FileReader` (identical on Electron + web, no clipboard ambiguity) and calls `RemotePage.dropFiles(specs, clientX, clientY)`, which maps the drop point through the Input-Forwarding coord resolver and synthesizes `dragenter`/`dragover`/`drop` DragEvents (carrying the files in a `DataTransfer`) on the remote element under the cursor. A window-level guard `preventDefault`s file drags outside an editable field / local webview so an errant drop can't navigate the Electron window to `file://`. **Chunked transfer (load-bearing):** both `pasteFile` and `dropFiles` are async and stream the file's base64 to the remote in ~2 MB chunks (`streamFiles` accumulates into `window.__cdpFiles[key]`; `assembleFilesExpr` joins + decodes into a real `File` then dispatches) — embedding a whole 62 MB video as one `Runtime.evaluate` source literal froze the renderer parsing it (and the screencast shares the CDP socket). Per-file size is capped at 100 MB on drop (toasted when skipped). 
Clipboard *paste* of a file stays OS-dependent (a copied video may expose only an icon bitmap, no `public.file-url`), so drag-drop is the reliable path. **Typing surface guard**: bare `?` (and other bare-char shortcuts) forward to the remote page when `activeKind` is `cdp` or `local` (`isTypingSurface` in `src/lib/typing-surface.ts`); the shortcut overlay opens via `⌘/` instead. `core/clipboard.js` owns the pure `Browser.grantPermissions` enum-fallback helpers and `selectPasteRoute`. +- **Notification tab keep-alive (t066)**: Chromium freezes idle background tabs (~5 min), pausing the page JS that the capture script hooks (`window.Notification`) — so background tabs silently stop delivering notifications and only the active tab notified. The side-channel now sends `Page.setWebLifecycleState({state:"active"})` on attach and re-applies it every `reconcile` (the browser can re-freeze). This un-freezes the tab **without** making it "visible" (verified against the CDP spec: `setWebLifecycleState` only takes `"frozen"|"active"` and governs freeze state, not `document.visibilityState`), so Slack still treats the tab as hidden and keeps firing desktop notifications for the side-channel to capture. The keep-alive lives in `core/notifications-sidechain.js` (`sideChannels` map value is now `{ ws, keepAlive }`), so both Electron and the headless web server benefit. +- **Service-worker push capture (t067, Slack)**: Slack delivers many notifications from its service worker's `push` handler via `registration.showNotification` — a separate realm the page hook can't reach. The side-channel now also attaches to the matching `service_worker` target (a Slack adapter `swScript`) and injects `inject/slack-sw-notify.js` via `Runtime.evaluate` (a worker has no `Page` domain, so no document-start hook + no keep-alive), patching `ServiceWorkerRegistration.prototype.showNotification` to ship the same `__cdpNotify` toasts. 
The single Slack SW serves every workspace (origin-level), so the SW URL has no team id — the script derives the per-workspace `groupKey` from the notification payload (defensive probe, logged once for HITL tightening). **Known gap:** a worker that spins up fresh on a push and fires before the next 5s reconcile attaches is missed (no SW-start barrier; a hardened version would use a browser-level `Target.setAutoAttach({waitForDebuggerOnStart:true})`). The t066 keep-alive keeps the registration warm enough to stay listed across reconciles in the common case. +- **Notification favicon (t066, Electron)**: The OS notification banner and the macOS dock icon carry the source app's favicon so you can tell *which* app pinged you. `dockOverlayIcon(list)` (pure, `core/notifications.js`) picks the newest-unread entry's icon (null when all read → restore plain icon). main.js fetches the favicon bytes (no browser CORS wall), passes them as a data URL into the chrome renderer via `executeJavaScript` to composite base-icon + favicon-bottom-right (the renderer's `<img>` decodes `.ico`; data-URL inputs never taint the canvas), and turns the returned PNG data URLs into `nativeImage`s for `app.dock.setIcon` + the `Notification` `icon`. Synced on every new entry, mark-read/unread/all, clear, and launch. - **Notifications side-channel**: A per-target read-only CDP socket (no screencast, no input) stays attached to background tabs that match a Notification Adapter (Teams, Outlook, Slack). Lifecycle and state machine live in `core/notifications-sidechain.js` (`createNotificationCenter`, DI) — consumed by both `main.js` and `web/server.mjs`; the server runs it headless. A capture script (per adapter, in `inject/`) is injected at document-start and ships toasts through a `__cdpNotify` binding. Pure dedup/cap/read-model helpers remain in `core/notifications.js`.
Each adapter carries a `name`, hostname `match` regex, capture `script`, `iconUrl`, optional `activate` tagged union (`spa-link` | `thread`) for deep-opens, and an optional `groupKey(url)` hook (URL-derived per-workspace bucketing) — adding an adapter is one config entry in `ADAPTERS`. Capture style varies by site: Teams/Outlook use a `MutationObserver` on the site's own in-app toast DOM; **Slack has no in-app toast**, so its script (`inject/slack-notify.js`, t064) hijacks the Web Notifications API at document-start — it patches `window.Notification` to intercept every fired notification, and forces `Notification.permission` → `"granted"` so Slack actually fires (a remote browser's permission is often `"default"`, which would otherwise suppress all notifications; service-worker `push`-handler notifications are out of scope — a separate JS realm unreachable from the page script). Multiple Slack workspaces (one tab per workspace — switched-away workspaces in a single tab aren't running JS) share the `app.slack.com` origin, so per-origin grouping would merge them; the Slack adapter's `groupKey(url)` derives `slack:{teamId}` from the tab URL (`slackGroupKey`/`parseSlackContext` in `core/notifications.js`; `T…` standard or `E…` Enterprise Grid, legacy subdomain fallback) to keep per-workspace unread counts distinct. Clicking a notification activates the tab, then the renderer's activation registry (`src/lib/notification-activation.ts`) maps the `activate` intent to a Remote Page intention (`navigateSpa` for Outlook + Slack channel deep-links, `openTeamsThread` for Teams chats). Teams has no conversation URL (the URL stays bare `/v2/`), so thread-id clicks drive `openTeamsThread`; Slack reuses `spa-link` to `/client/{team}/{channel}` (best-effort — degrades to tab-only when the notification carries no channel id). See `docs/adr/0003-notifications-side-channel.md`. 
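The per-workspace bucketing described above can be sketched roughly as follows. This is a minimal illustration, not the project's `slackGroupKey`: the function name `sketchSlackGroupKey`, the exact path pattern, and the `slack:{subdomain}` form used for the legacy fallback are assumptions made for the example.

```javascript
// Hypothetical sketch of URL-derived Slack bucketing. The real logic lives in
// core/notifications.js (slackGroupKey / parseSlackContext) and may differ.
function sketchSlackGroupKey(url) {
  let parsed
  try {
    parsed = new URL(url)
  } catch {
    return null // not a parseable URL; the caller picks its own fallback
  }
  // app.slack.com/client/{team}/{channel}: team ids start with T (standard)
  // or E (Enterprise Grid).
  const team = parsed.pathname.match(/^\/client\/([TE][A-Z0-9]+)(?:\/|$)/)
  if (team) return `slack:${team[1]}`
  // Assumed legacy form: {workspace}.slack.com (the shared "app" host is excluded).
  const sub = parsed.hostname.match(/^([a-z0-9-]+)\.slack\.com$/)
  if (sub && sub[1] !== "app") return `slack:${sub[1]}`
  // Anything else buckets by URL origin.
  return parsed.origin
}
```

Two tabs on `app.slack.com` with different team ids then land in different buckets, while a non-Slack tab falls through to its origin.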
## File Structure @@ -73,7 +76,7 @@ cdp-browser/ │ ├── frame-throttle.js # Pure screencast rate throttle (createFrameThrottle, DI clock; fresh-frame-wins) │ ├── frame-ack-gate.js # Pure one-in-flight gate + watchdog for WS paint-ack backpressure (t056) │ ├── quality-tier.js # Pure Sharp/Balanced/Snappy screencast preset owner (tierToParams/DEFAULT_TIER/parseTier) -│ ├── clipboard.js # Pure clipboard helpers: grantPermissions enum-fallback builders + selectPasteRoute (t065) +│ ├── clipboard.js # Pure clipboard helpers: grantPermissions enum-fallback builders + selectPasteRoute (t065) + mimeForName (extension→MIME for file paste) │ └── crypto-envelope.js # Server-side AES-256-GCM seal/open (E2E mode); mirrors src/lib/crypto-envelope.ts ├── web/ │ └── server.mjs # Web build backend: serves dist/ + SSE/POST/WS proxy + streaming input (no Electron). See ADR-0006, ADR-0007 @@ -84,7 +87,8 @@ cdp-browser/ ├── inject/ │ ├── teams-notify.js # MutationObserver capture script injected into Teams pages │ ├── outlook-notify.js # MutationObserver capture for OWA NotificationPane; ships activate intent -│ └── slack-notify.js # Web Notifications API hijack (no in-app toast); forces permission granted, ships per-channel spa-link (t064) +│ ├── slack-notify.js # Web Notifications API hijack (no in-app toast); forces permission granted, ships per-channel spa-link (t064) +│ └── slack-sw-notify.js # Service-worker realm: patches ServiceWorkerRegistration.prototype.showNotification to capture SW push notifications (t067) ├── scripts/ │ └── install-local.sh # Build + install to /Applications (strips quarantine) ├── docs/ @@ -178,7 +182,7 @@ on pre-existing errors in untouched files). - Screencast frames are **CSS-resolution** (`Page.startScreencast` ignores `deviceScaleFactor`), so on a high-DPI display they're upscaled and look soft. 
Sharp device-resolution frames are only available via `Page.captureScreenshot`, which is too heavy to stream and color-shifts vs the screencast — see `docs/adr/0002-adaptive-viewport.md`. Not currently fixed. - Text input goes through `Input.dispatchKeyEvent`. macOS-reserved combos (Cmd+H hide, Cmd+M minimize, Cmd+Q quit, Ctrl+Cmd+F fullscreen, Cmd+` cycle windows) are detected by `isOsReservedKey` in `src/lib/key-routing.ts` and fall through to native macOS handlers rather than being forwarded. Common editing shortcuts (Cmd/Alt + arrows, line/word deletion) are translated to Blink editor commands, but full IME (CJK composition) is not supported. - **Finger touch is a lightweight mapping, not native touch.** A single finger on the screencast canvas drives the page through the existing mouse/wheel pipeline (ADR-0009): drag → `mouseWheel` scroll, tap → click, long-press → right-click. Gesture classification lives in the pure `src/lib/touch-gesture.ts`; `viewport.tsx` maps it onto `forwardInput` (so `toRemoteCoords` applies unchanged). Pinch-zoom, momentum/inertial scrolling, multi-touch, and full `Input.dispatchTouchEvent` are out (deferred to v0.2). Touch pointers are handled separately from the mouse path and `preventDefault`ed so iPad Safari's synthesized mouse events don't double-fire. -- No file download/upload support. +- No file download support, and no file-picker (`<input type="file">`) upload. Files can be **dropped** onto the canvas or **pasted** (t068) — both inject the bytes as a base64 data URL over CDP, so large media (multi-hundred-MB video) is capped (100 MB on drop, toasted when skipped). A true file-picker upload would need `DOM.setFileInputFiles` plumbing. - Tab favicons may not load if the remote browser blocks cross-origin favicon requests.
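The drop/paste limitation above rests on shipping file bytes as base64 over CDP in bounded pieces rather than as one giant expression. A rough sketch of that idea follows; `splitBase64` and `appendChunkExpr` are hypothetical names, and storing chunks as an array under `window.__cdpFiles[key]` is an assumption for illustration, not necessarily what `streamFiles` / `assembleFilesExpr` do.

```javascript
// Hypothetical sketch of chunked base64 transfer; helper names are illustrative.
// Chunks may be cut at any offset because they are joined again before decoding.
function splitBase64(b64, chunkChars = 2 * 1024 * 1024) {
  if (!Number.isInteger(chunkChars) || chunkChars <= 0) {
    throw new RangeError("chunkChars must be a positive integer")
  }
  const chunks = []
  for (let i = 0; i < b64.length; i += chunkChars) {
    chunks.push(b64.slice(i, i + chunkChars))
  }
  return chunks
}

// Builds one small expression per chunk (e.g. for a Runtime.evaluate call) that
// appends the chunk to a per-file accumulator on the remote page.
function appendChunkExpr(key, chunk) {
  const k = JSON.stringify(key)
  const c = JSON.stringify(chunk)
  return (
    `(window.__cdpFiles = window.__cdpFiles || {}, ` +
    `(window.__cdpFiles[${k}] = window.__cdpFiles[${k}] || []).push(${c}))`
  )
}
```

Each expression stays near the chunk size, so the remote parser never sees the whole file as a single source literal; a final step (not shown) would join the accumulated chunks, decode them, and build the `File`.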
## Troubleshooting diff --git a/build/notif-icons/outlook.png b/build/notif-icons/outlook.png new file mode 100644 index 0000000..53ebc5b Binary files /dev/null and b/build/notif-icons/outlook.png differ diff --git a/build/notif-icons/slack.png b/build/notif-icons/slack.png new file mode 100644 index 0000000..3f4292e Binary files /dev/null and b/build/notif-icons/slack.png differ diff --git a/build/notif-icons/teams.png b/build/notif-icons/teams.png new file mode 100644 index 0000000..de70187 Binary files /dev/null and b/build/notif-icons/teams.png differ diff --git a/core/clipboard.js b/core/clipboard.js index 5f82fe8..49e2ab4 100644 --- a/core/clipboard.js +++ b/core/clipboard.js @@ -15,7 +15,7 @@ * @param {string} [origin] - Optional origin to scope permissions. If omitted, applies to all. * @returns {{origin?: string, permissions: string[]}} Payload for Browser.grantPermissions. */ -export function buildClipboardPermissionsModern(origin) { +function buildClipboardPermissionsModern(origin) { const payload = { permissions: ["clipboardRead", "clipboardWrite"], } @@ -31,7 +31,7 @@ export function buildClipboardPermissionsModern(origin) { * @param {string} [origin] * @returns {{origin?: string, permissions: string[]}} */ -export function buildClipboardPermissionsLegacy(origin) { +function buildClipboardPermissionsLegacy(origin) { const payload = { permissions: ["clipboardReadWrite", "clipboardSanitizedWrite"], } @@ -51,7 +51,56 @@ export function buildClipboardPermissionsLegacy(origin) { * @returns {string} .route - Either 'insertText' (plain) or 'preseed' (rich). * @returns {string} .reason - Human-readable explanation. */ -export function selectPasteRoute(focusDescriptor) { +/** + * Minimal extension → MIME map for clipboard file paste. 
Covers the file kinds a + * user is likely to copy-paste into a remote page (images + video + a few docs); + * anything unknown falls back to application/octet-stream so the remote `File` + * still carries bytes + a name (the target site sniffs content / extension). + * + * The map is the source of truth for paste mime — `clipboard.readImage()` only + * yields a thumbnail icon for a copied *file*, so the file path is read directly + * and its type derived from the name here. + * + * @param {string} name - File name or path. + * @returns {string} MIME type. + */ +function mimeForName(name) { + const ext = String(name || "") + .toLowerCase() + .split(".") + .pop() + const map = { + // images + png: "image/png", + jpg: "image/jpeg", + jpeg: "image/jpeg", + gif: "image/gif", + webp: "image/webp", + bmp: "image/bmp", + svg: "image/svg+xml", + heic: "image/heic", + avif: "image/avif", + // video + mp4: "video/mp4", + m4v: "video/mp4", + mov: "video/quicktime", + webm: "video/webm", + mkv: "video/x-matroska", + avi: "video/x-msvideo", + // audio + mp3: "audio/mpeg", + wav: "audio/wav", + m4a: "audio/mp4", + ogg: "audio/ogg", + // docs + pdf: "application/pdf", + txt: "text/plain", + zip: "application/zip", + } + return (ext && map[ext]) || "application/octet-stream" +} + +function selectPasteRoute(focusDescriptor) { const { isContentEditable = false, isRichEditor = false } = focusDescriptor || {} if (isContentEditable || isRichEditor) { @@ -67,3 +116,10 @@ export function selectPasteRoute(focusDescriptor) { reason: "Plain input; use Input.insertText for direct text insertion", } } + +module.exports = { + buildClipboardPermissionsModern, + buildClipboardPermissionsLegacy, + mimeForName, + selectPasteRoute, +} diff --git a/core/clipboard.test.mjs b/core/clipboard.test.mjs index 401afbc..9904a54 100644 --- a/core/clipboard.test.mjs +++ b/core/clipboard.test.mjs @@ -2,9 +2,30 @@ import { describe, expect, it } from "vitest" import { buildClipboardPermissionsLegacy, 
buildClipboardPermissionsModern, + mimeForName, selectPasteRoute, } from "./clipboard.js" +describe("mimeForName", () => { + it("maps common video extensions", () => { + expect(mimeForName("clip.mp4")).toBe("video/mp4") + expect(mimeForName("movie.MOV")).toBe("video/quicktime") + expect(mimeForName("a.webm")).toBe("video/webm") + }) + + it("maps common image extensions", () => { + expect(mimeForName("pic.png")).toBe("image/png") + expect(mimeForName("photo.JPG")).toBe("image/jpeg") + expect(mimeForName("anim.gif")).toBe("image/gif") + }) + + it("falls back to octet-stream for unknown or missing extensions", () => { + expect(mimeForName("noext")).toBe("application/octet-stream") + expect(mimeForName("archive.zzz")).toBe("application/octet-stream") + expect(mimeForName("")).toBe("application/octet-stream") + }) +}) + describe("clipboard permissions", () => { describe("buildClipboardPermissionsModern", () => { it("returns modern permission names without origin", () => { diff --git a/core/notifications-sidechain.js b/core/notifications-sidechain.js index e6a3adc..6b67207 100644 --- a/core/notifications-sidechain.js +++ b/core/notifications-sidechain.js @@ -45,12 +45,20 @@ const ADAPTERS = [ name: "slack", script: "slack-notify.js", match: (h) => /(^|\.)slack\.com$/.test(h), - iconUrl: "https://a.slack-edge.com/80588/marketing/img/icons/favicon-32-electron.png", + // Renderer-bell icon only (the OS banner + dock use bundled build/notif-icons/slack.png). + // The old favicon-32-electron.png path 404/403'd; app-256.png is a stable Slack logo. + iconUrl: "https://a.slack-edge.com/80588/img/icons/app-256.png", // Slack runs every workspace under one origin (app.slack.com), so the default // per-origin grouping would merge all workspaces into one badge. Derive the group // key from the Tab's URL team id instead — one Tab per workspace, so the URL is the // authoritative workspace identity (more durable than the in-page capture script). 
groupKey: slackGroupKey, + // Slack delivers many notifications from its service worker's `push` handler + // (`registration.showNotification`), a realm the page hook can't reach. `swScript` + // is injected into the matching service_worker target to patch showNotification there + // and ship the same `__cdpNotify` toasts (t067). The SW URL carries no team id, so the + // script derives the per-workspace groupKey from the notification payload instead. + swScript: "slack-sw-notify.js", }, ] @@ -67,6 +75,15 @@ function createNotificationCenter(deps) { if (!sourceCache.has(adapter.name)) sourceCache.set(adapter.name, readInject(adapter.script)) return sourceCache.get(adapter.name) } + // Service-worker capture script, memoized per adapter (separate cache key so it never + // collides with the page `script`). Only adapters that declare `swScript` have one. + const swSourceCache = new Map() + const swSourceFor = (adapter) => { + if (!swSourceCache.has(adapter.name)) { + swSourceCache.set(adapter.name, readInject(adapter.swScript)) + } + return swSourceCache.get(adapter.name) + } const seeded = load() let notifications = Array.isArray(seeded) ? seeded : [] @@ -102,7 +119,7 @@ function createNotificationCenter(deps) { // back-compatible display. activate: n.activate || null, targetEntity: n.targetEntity || null, - icon: (adapter || {}).iconUrl || null, + icon: adapter?.iconUrl || null, ts: n.ts || (deps.now ? deps.now() : Date.now()), }, cap, @@ -117,18 +134,62 @@ function createNotificationCenter(deps) { const adapter = adapterFor(target.url) if (!adapter || !target.webSocketDebuggerUrl) return const ws = new WebSocketCtor(target.webSocketDebuggerUrl) - sideChannels.set(target.id, ws) let cmdId = 1 + let opened = false const cdp = (method, params) => ws.send(JSON.stringify({ id: cmdId++, method, params: params || {} })) + // Keep the remote Tab's page alive so its capture script keeps firing even when the + // Tab is backgrounded on the remote browser. 
Chromium freezes idle background tabs + // (~5 min), which pauses the page JS that calls `new Notification()` — so background + // Tabs silently stop delivering toasts (the asymmetry where only the active Tab + // notified). Forcing the web lifecycle to "active" prevents the freeze WITHOUT making + // the page "visible" (visibility is orthogonal in the CDP spec — verified against + // Page.setWebLifecycleState, which only takes "frozen"|"active"), so Slack still treats + // the Tab as hidden and keeps firing desktop notifications for the side-channel to + // capture. Re-applied every reconcile because the browser can re-freeze. See t066. + const keepAlive = () => { + if (opened) cdp("Page.setWebLifecycleState", { state: "active" }) + } + sideChannels.set(target.id, { ws, keepAlive }) ws.on("open", () => { + opened = true cdp("Runtime.enable") cdp("Page.enable") cdp("Runtime.addBinding", { name: NOTIFY_BINDING }) // document-start for future loads + the already-loaded document. cdp("Page.addScriptToEvaluateOnNewDocument", { source: sourceFor(adapter) }) cdp("Runtime.evaluate", { expression: sourceFor(adapter) }) + keepAlive() + }) + wireToastAndDrop(ws, target) + } + + // Service-worker side-channel (t067). Slack/Teams/Outlook deliver many notifications from + // their service worker's `push` handler via `registration.showNotification` — a realm the + // page hook (`window.Notification`) can't reach. A service_worker target supports Runtime + // (not Page), so we patch via a one-shot Runtime.evaluate into the running worker rather + // than Page.addScriptToEvaluateOnNewDocument. Best-effort: a worker that spins up fresh on + // a push and fires before the next 5s reconcile attaches is missed (no SW-start barrier + // here). The page keep-alive (t066) keeps the registration warm, which keeps the worker + // listed in /json across reconciles. No keep-alive on the worker itself (no web lifecycle). 
+ function attachServiceWorker(target) { + const adapter = adapterFor(target.url) + if (!adapter?.swScript || !target.webSocketDebuggerUrl) return + const ws = new WebSocketCtor(target.webSocketDebuggerUrl) + let cmdId = 1 + const cdp = (method, params) => + ws.send(JSON.stringify({ id: cmdId++, method, params: params || {} })) + sideChannels.set(target.id, { ws, keepAlive: () => {} }) + ws.on("open", () => { + cdp("Runtime.enable") + cdp("Runtime.addBinding", { name: NOTIFY_BINDING }) + cdp("Runtime.evaluate", { expression: swSourceFor(adapter) }) }) + wireToastAndDrop(ws, target) + } + + // Shared toast ingest + self-removal wiring for both the page and service-worker channels. + function wireToastAndDrop(ws, target) { ws.on("message", (data) => { try { const msg = JSON.parse(data.toString()) @@ -138,7 +199,8 @@ function createNotificationCenter(deps) { } catch {} }) const drop = () => { - if (sideChannels.get(target.id) === ws) sideChannels.delete(target.id) + const cur = sideChannels.get(target.id) + if (cur && cur.ws === ws) sideChannels.delete(target.id) } ws.on("close", drop) ws.on("error", drop) @@ -156,9 +218,11 @@ function createNotificationCenter(deps) { } } if (!Array.isArray(list)) return - const matched = list.filter((t) => t.type === "page" && adapterFor(t.url)) - const liveIds = new Set(matched.map((t) => t.id)) - for (const [id, ws] of sideChannels) { + const pages = list.filter((t) => t.type === "page" && adapterFor(t.url)) + // Service-worker targets whose adapter declares a swScript (t067). 
+ const workers = list.filter((t) => t.type === "service_worker" && adapterFor(t.url)?.swScript) + const liveIds = new Set([...pages, ...workers].map((t) => t.id)) + for (const [id, { ws }] of sideChannels) { if (!liveIds.has(id)) { try { ws.close() @@ -166,7 +230,11 @@ function createNotificationCenter(deps) { sideChannels.delete(id) } } - for (const t of matched) if (!sideChannels.has(t.id)) attach(t) + for (const t of pages) if (!sideChannels.has(t.id)) attach(t) + for (const t of workers) if (!sideChannels.has(t.id)) attachServiceWorker(t) + // Re-apply keep-alive to every live side-channel each cycle — the browser may have + // re-frozen a backgrounded Tab since the last pass (t066). SW channels no-op. + for (const [, ch] of sideChannels) ch.keepAlive() } return { @@ -195,7 +263,7 @@ function createNotificationCenter(deps) { }, unreadCount: () => unreadCount(notifications), close: () => { - for (const [, ws] of sideChannels) { + for (const [, { ws }] of sideChannels) { try { ws.close() } catch {} diff --git a/core/notifications-sidechain.test.ts b/core/notifications-sidechain.test.ts index a43dd0f..abc3bd0 100644 --- a/core/notifications-sidechain.test.ts +++ b/core/notifications-sidechain.test.ts @@ -172,6 +172,36 @@ describe("reconcile — idempotent / drop", () => { }) }) +describe("keep-alive (t066) — prevent background-tab freeze", () => { + it("forces the page web lifecycle to active on open so a backgrounded Tab keeps firing notifications", async () => { + const { center } = makeCenter() + await center.reconcile([teamsTarget()]) + const ws = FakeWs.instances[0] + ws.open() + const keepAlive = ws.sent.filter((m) => m.method === "Page.setWebLifecycleState") + expect(keepAlive).toHaveLength(1) + expect(keepAlive[0].params.state).toBe("active") + }) + + it("does not send keep-alive before the socket opens", async () => { + const { center } = makeCenter() + await center.reconcile([teamsTarget()]) + const ws = FakeWs.instances[0] + const count = 
ws.sent.filter((m) => m.method === "Page.setWebLifecycleState").length + expect(count).toBe(0) + }) + + it("re-applies keep-alive on each reconcile (browser may re-freeze the Tab)", async () => { + const { center } = makeCenter() + await center.reconcile([teamsTarget()]) + const ws = FakeWs.instances[0] + ws.open() + await center.reconcile([teamsTarget()]) // same target, socket already open + const count = ws.sent.filter((m) => m.method === "Page.setWebLifecycleState").length + expect(count).toBe(2) + }) +}) + describe("ingest dedup", () => { it("drops a duplicate toast within the dedup window — one stored entry, one onEntry", async () => { const { center, onEntry } = makeCenter() @@ -308,6 +338,85 @@ describe("slack adapter — per-workspace grouping (t064)", () => { }) }) +describe("service-worker capture (t067)", () => { + const slackSwTarget = (id = "sw1", over = {}) => ({ + id, + type: "service_worker" as const, + url: "https://app.slack.com/service-worker.js", + webSocketDebuggerUrl: `ws://host/devtools/worker/${id}`, + ...over, + }) + + it("attaches to a service_worker target whose adapter declares a swScript and injects it via Runtime.evaluate (no Page domain)", async () => { + const { center } = makeCenter() + await center.reconcile([slackSwTarget()]) + expect(FakeWs.instances).toHaveLength(1) + const ws = FakeWs.instances[0] + ws.open() + const methods = ws.sent.map((m) => m.method) + expect(methods).toContain("Runtime.enable") + expect(methods).toContain("Runtime.addBinding") + expect(methods).toContain("Runtime.evaluate") + expect(methods).not.toContain("Page.enable") + expect(methods).not.toContain("Page.addScriptToEvaluateOnNewDocument") + const evalCmd = ws.sent.find((m) => m.method === "Runtime.evaluate") + expect(evalCmd.params.expression).toContain("slack-sw-notify.js") + }) + + it("does not attach to a service_worker whose adapter has no swScript (Teams)", async () => { + const { center } = makeCenter() + await center.reconcile([ + { + id: "tsw", 
+ type: "service_worker", + url: "https://teams.microsoft.com/sw.js", + webSocketDebuggerUrl: "ws://host/devtools/worker/tsw", + }, + ]) + expect(FakeWs.instances).toHaveLength(0) + }) + + it("ingests a toast from the SW channel, stamped with the slack adapter + payload groupKey", async () => { + const { center } = makeCenter() + await center.reconcile([slackSwTarget()]) + const ws = FakeWs.instances[0] + ws.open() + // SW URL has no team id, so the worker script supplies the per-workspace groupKey. + ws.notify({ id: "swn", title: "@bob: ping", groupKey: "slack:T999" }) + expect(center.list()[0].adapter).toBe("slack") + expect(center.list()[0].groupKey).toBe("slack:T999") + }) + + it("never sends keep-alive on a SW channel (no web lifecycle on a worker)", async () => { + const { center } = makeCenter() + await center.reconcile([slackSwTarget()]) + const ws = FakeWs.instances[0] + ws.open() + await center.reconcile([slackSwTarget()]) // triggers the keep-alive re-apply pass + expect(ws.sent.filter((m) => m.method === "Page.setWebLifecycleState")).toHaveLength(0) + }) + + it("drops the SW channel when the worker vanishes", async () => { + const { center } = makeCenter() + await center.reconcile([slackSwTarget()]) + const ws = FakeWs.instances[0] + await center.reconcile([]) + expect(ws.closed).toBe(true) + }) + + it("attaches both the page and the SW channel for the same workspace", async () => { + const { center } = makeCenter() + const slackPage = { + id: "p1", + type: "page" as const, + url: "https://app.slack.com/client/T1/C1", + webSocketDebuggerUrl: "ws://host/devtools/page/p1", + } + await center.reconcile([slackPage, slackSwTarget()]) + expect(FakeWs.instances).toHaveLength(2) + }) +}) + describe("store mutations + persistence", () => { async function seeded() { const ctx = makeCenter() diff --git a/docs/tasks/066-keep-notif-tabs-alive-dock-favicon.md b/docs/tasks/066-keep-notif-tabs-alive-dock-favicon.md new file mode 100644 index 0000000..1c9bda9 --- 
/dev/null +++ b/docs/tasks/066-keep-notif-tabs-alive-dock-favicon.md @@ -0,0 +1,57 @@ +# 066 — keep notification tabs alive + favicon on dock/banner + +- **Status:** in-progress +- **Mode:** HITL +- **Estimate:** 1d +- **Depends on:** none +- **Blocks:** 067 (service-worker push capture) + +## Goal + +Background Tabs on the remote browser silently stop delivering notifications: Chromium +freezes idle background tabs (~5 min), which pauses the page JS that the capture script +hooks (`window.Notification`), so only the *active* Tab keeps notifying. After this task, +every Notification-Adapter Tab (Teams / Outlook / Slack) is held in the "active" web +lifecycle state via the side-channel, so background Tabs keep firing notifications. And the +OS notification + the macOS dock icon now carry the source app's favicon, so you can tell +*which* app pinged you at a glance. + +## Why now + +This is the root cause of "only the focused tab notifies my real machine". It also unblocks +067 (service-worker push capture), which only matters once the page stays alive long enough +to be worth supplementing. + +## Acceptance criteria + +- [ ] Every adapter-matching Tab's side-channel sends `Page.setWebLifecycleState({state:"active"})` on open. +- [ ] Keep-alive is re-applied on every `reconcile` cycle (browser can re-freeze). +- [ ] Keep-alive does NOT make the page "visible" (Slack must keep firing desktop notifications). +- [ ] OS notification banner shows the source adapter's favicon. +- [ ] macOS dock icon shows the newest-unread app's favicon composited bottom-right; cleared when unread → 0. +- [ ] Dock overlay restores from persisted unread on launch and updates on mark-read/clear. + +## Test plan + +### Layer 1 — Pure logic (TDD) + +- [x] `core/notifications-sidechain.js` keep-alive — sends `setWebLifecycleState active` on open, re-applies per reconcile, not before open. 
+- [x] `core/notifications.js` `dockOverlayIcon(list)` — newest-unread icon, null when all read / empty / no icon. + +### Layer 2 — Manual smoke (CDP/IPC) + +- [ ] Open ≥2 Slack workspace Tabs; background one for >5 min; send it a message → OS notification fires on the Mac. +- [ ] Notification banner shows the Slack favicon. +- [ ] Dock icon shows the Slack favicon badge; mark-all-read → badge clears. +- [ ] Teams (`.ico` favicon) renders in both banner and dock (renderer decodes `.ico`). + +### Layer 3 — Visual review + +- [ ] Dock icon composite looks crisp at retina (white plate + favicon bottom-right). + +## Design notes + +- **Contracts changed:** `core/notifications-sidechain.js` `sideChannels` map value `ws → { ws, keepAlive }`; new pure `dockOverlayIcon(list)` in `core/notifications.js`. +- **CDP fact (verified vs protocol docs):** `Page.setWebLifecycleState` accepts only `"frozen"|"active"` and governs freeze state, not `document.visibilityState` — so "active" un-freezes without un-hiding. +- **Compositing:** done in the chrome renderer via `executeJavaScript` (its `<img>` decodes `.ico`; the adapter icon is read in main from the bundled `build/notif-icons/` PNGs and passed as a data URL, so the canvas is never cross-origin-tainted). main turns the returned PNG data URL into a `nativeImage` for `app.dock.setIcon`; the notification `icon` is loaded straight from the same bundled PNG. +- **New ADR needed?** no — tuning inside ADR-0003 (notifications side-channel). diff --git a/docs/tasks/067-service-worker-push-capture.md b/docs/tasks/067-service-worker-push-capture.md new file mode 100644 index 0000000..cd9a914 --- /dev/null +++ b/docs/tasks/067-service-worker-push-capture.md @@ -0,0 +1,49 @@ +# 067 — capture service-worker push notifications (Slack) + +- **Status:** in-progress +- **Mode:** HITL +- **Estimate:** 1d +- **Depends on:** 066 (keep-alive keeps the SW registration warm) +- **Blocks:** none + +## Goal + +The page hook (`window.Notification`, `slack-notify.js`) only sees notifications Slack fires +from the **page** realm.
Slack also delivers notifications from its **service worker's** +`push` handler via `registration.showNotification(...)`, a separate realm the page script +can't reach — these were silently missed. After this task the side-channel also attaches to +the matching `service_worker` target and patches `showNotification` there, shipping the same +`__cdpNotify` toasts so SW-delivered notifications are captured too. + +## Why now + +Builds directly on t066: once background tabs stay alive, the remaining gap is the +SW-`push`-only deliveries. This closes the last "notification didn't show up" hole. + +## Acceptance criteria + +- [ ] `reconcile` attaches a side-channel to a `service_worker` target whose adapter declares `swScript`. +- [ ] The SW channel injects via `Runtime.evaluate` (no `Page` domain on a worker). +- [ ] SW channels never receive the t066 page keep-alive (`setWebLifecycleState`). +- [ ] A SW-`__cdpNotify` toast is ingested + grouped + fired through the same store path. +- [ ] Per-workspace `groupKey` comes from the payload (the SW URL has no team id). +- [ ] SW channel is dropped when the worker disappears from `/json`. + +## Test plan + +### Layer 1 — Pure logic (TDD) + +- [x] sidechain SW attach — attaches to a `service_worker` w/ swScript, evaluates the SW script, no Page domain, no keep-alive, drops on vanish, page+SW coexist. + +### Layer 2 — Manual smoke (CDP/IPC) — **REQUIRED, blind-spots below** + +- [ ] With Slack push enabled on the remote browser, trigger a SW push (close all Slack tabs' focus / use a push that routes through the SW) → notification is captured. +- [ ] Inspect the one-time `[cdp-sw-notify] sample options:` log in the worker console; **tighten `TEAM_RE`/`probe` keys to the real payload** if workspace grouping is wrong. +- [ ] Confirm per-workspace `groupKey` resolves (else all workspaces merge under the SW origin). 
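For the HITL tightening step above, a minimal sketch of the team-id probe as a standalone function may help when checking a logged sample against the regex outside the worker. It mirrors the `TEAM_RE` pattern and key list used by `inject/slack-sw-notify.js` in this change. The sample payloads are hypothetical, since Slack's real push `data` shape is undocumented, and `probeTeam` / `groupKeyFor` are illustrative names, not exports of the repo.

```javascript
// Standalone sketch of the team-id probe. Sample payloads below are HYPOTHETICAL;
// paste the real `[cdp-sw-notify] sample options:` log in their place.
const TEAM_RE = /\b([TE][A-Z0-9]{6,})\b/

// Collect candidate strings from the notification options and return the first
// substring that looks like a Slack team id, or null when none is found.
function probeTeam(opts) {
  if (!opts || typeof opts !== "object") return null
  const probes = []
  if (opts.tag != null) probes.push(String(opts.tag))
  const d = opts.data
  if (d != null) {
    if (typeof d === "object") {
      for (const k of ["team", "teamId", "team_id", "channel", "channelId", "channel_id", "id"]) {
        if (d[k] != null) probes.push(String(d[k]))
      }
      try {
        probes.push(JSON.stringify(d)) // catches ids nested deeper in the payload
      } catch {
        /* circular structure: skip the dump */
      }
    } else {
      probes.push(String(d))
    }
  }
  for (const p of probes) {
    const m = p.match(TEAM_RE)
    if (m) return m[1]
  }
  return null
}

// Per-workspace bucket, or undefined so the caller falls back to the SW origin.
function groupKeyFor(opts) {
  const team = probeTeam(opts)
  return team ? `slack:${team}` : undefined
}

console.log(groupKeyFor({ data: { team_id: "T0123ABCD" } })) // slack:T0123ABCD
console.log(groupKeyFor({ tag: "dm-notification", data: "opaque" })) // undefined
console.log(groupKeyFor({ tag: "EXTERNAL-42" })) // slack:EXTERNAL (false positive)
```

The last call shows why the pattern may need tightening: any uppercase token that starts with `T` or `E` and runs to seven or more characters passes as a team id, so a tag such as `EXTERNAL-42` would create a bogus workspace bucket.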
+ +## Design notes + +- **Contracts changed:** adapter gains optional `swScript`; `reconcile` matches `service_worker` targets; new `attachServiceWorker` path + `inject/slack-sw-notify.js`. +- **Known limitation (documented, not fixed here):** a worker that spins up fresh on a push and fires `showNotification` *before* the next 5s reconcile attaches is missed — there's no SW-start barrier. A hardened version would use a browser-level CDP session with `Target.setAutoAttach({ waitForDebuggerOnStart: true })` to attach + inject before the worker runs. Deferred; the t066 page keep-alive keeps the registration warm enough to be listed across reconciles in the common case. +- **Payload shape is a guess:** Slack's SW push `data` isn't publicly documented; `slack-sw-notify.js` probes defensively and logs a sample once for HITL tightening. +- **New ADR needed?** no — extends ADR-0003 (notifications side-channel). diff --git a/inject/slack-sw-notify.js b/inject/slack-sw-notify.js new file mode 100644 index 0000000..da763ff --- /dev/null +++ b/inject/slack-sw-notify.js @@ -0,0 +1,106 @@ +// Injected into Slack's SERVICE WORKER target (t067), not the page. Slack delivers many +// notifications from its service worker's `push` handler via +// `self.registration.showNotification(...)` — a realm the page hook (slack-notify.js's +// `window.Notification` patch) can't reach. This script patches +// `ServiceWorkerRegistration.prototype.showNotification` in the worker's global scope and +// ships the same `__cdpNotify` toast the side-channel ingests. +// +// EXPERIMENTAL / HITL: Slack's push payload shape is not publicly documented, so the +// team/channel extraction below probes the common carriers defensively and degrades to a +// tab-/origin-only toast when it can't find them. Verify against a live Slack SW push and +// tighten the probes (the captured `options.data` is logged to the worker console once). 
+// +// Realm notes: +// - No `window` / `document` here — only `self`, `self.registration`, the global +// `ServiceWorkerRegistration`, and the `__cdpNotify` binding the side-channel registers +// via Runtime.addBinding before this runs. +// - The single Slack SW serves EVERY workspace (app.slack.com origin-level), so the SW URL +// carries no team id. The per-workspace groupKey must come from the payload, not the URL. +// - We deliberately let the real showNotification still run so the user isn't left with a +// silently-swallowed push (the remote browser's own toast is harmless and offscreen); +// capture is purely additive. +;(() => { + if (self.__cdpSwNotifyArmed) return + self.__cdpSwNotifyArmed = true + + let seq = 0 + let loggedShape = false + + const CHANNEL_RE = /\b([CDG][A-Z0-9]{6,})\b/ + const TEAM_RE = /\b([TE][A-Z0-9]{6,})\b/ + + // Pull the first well-formed id matching `re` out of a grab-bag of probe strings drawn + // from the notification options (tag + data, object or scalar, plus a JSON dump). + const probe = (opts, re) => { + if (!opts || typeof opts !== "object") return null + const probes = [] + if (opts.tag != null) probes.push(String(opts.tag)) + const d = opts.data + if (d != null) { + if (typeof d === "object") { + for (const k of ["team", "teamId", "team_id", "channel", "channelId", "channel_id", "id"]) { + if (d[k] != null) probes.push(String(d[k])) + } + try { + probes.push(JSON.stringify(d)) + } catch { + /* circular — skip */ + } + } else { + probes.push(String(d)) + } + } + for (const p of probes) { + const m = p.match(re) + if (m) return m[1] + } + return null + } + + const capture = (title, opts) => { + if (!loggedShape) { + loggedShape = true + try { + // One-time aid for tightening the probes against the real payload (HITL). 
+ console.log("[cdp-sw-notify] sample options:", JSON.stringify(opts)) + } catch {} + } + const team = probe(opts, TEAM_RE) + const channel = probe(opts, CHANNEL_RE) + const body = opts && typeof opts.body === "string" ? opts.body : "" + const payload = { + id: `slack-sw:${team || "?"}:${Date.now()}:${seq++}`, + source: "Slack", + title: title != null ? String(title) : "", + body, + // Per-workspace bucket from the payload (the SW URL has no team id). When absent the + // side-channel falls back to the SW origin — all workspaces merge, but still captured. + groupKey: team ? `slack:${team}` : undefined, + activate: channel && team ? { type: "spa-link", url: `/client/${team}/${channel}` } : null, + ts: Date.now(), + } + try { + self.__cdpNotify(JSON.stringify(payload)) + } catch { + /* binding not registered (shouldn't happen) */ + } + } + + const proto = + typeof ServiceWorkerRegistration !== "undefined" && ServiceWorkerRegistration.prototype + if (!proto || typeof proto.showNotification !== "function") return + const real = proto.showNotification + proto.showNotification = function patchedShowNotification(title, opts, ...rest) { + try { + capture(title, opts) + } catch { + /* never break the worker */ + } + // Still call through so the push handler's contract is honoured. 
+ try { + return real.call(this, title, opts, ...rest) + } catch { + return Promise.resolve() + } + } +})() diff --git a/main.js b/main.js index 72ec844..a34f051 100644 --- a/main.js +++ b/main.js @@ -11,6 +11,7 @@ const { desktopCapturer, systemPreferences, dialog, + nativeImage, } = require("electron") const path = require("node:path") const fs = require("node:fs") @@ -18,6 +19,7 @@ const WebSocket = require("ws") const { emulatedMediaParams } = require("./core/theme-emulation") const { createSettingsStore } = require("./core/settings-store") const endpoints = require("./core/cdp-endpoints") +const { mimeForName: clipboardMime } = require("./core/clipboard") const { tierToParams, DEFAULT_TIER } = require("./core/quality-tier") // The window is a BaseWindow composed of a chrome view (the React UI, full @@ -206,6 +208,42 @@ ipcMain.handle("cdp:read-clipboard-image", () => { return img.isEmpty() ? null : img.toDataURL() }) +// Real files on the clipboard (e.g. a video copied in Finder) — read the actual bytes +// rather than clipboard.readImage(), which only yields the file's icon/thumbnail. Returns +// [{ name, type, dataUrl }]; empty when the clipboard holds no file reference. Reading +// happens in main (full fs access, no browser CORS/permission wall). +const MAX_CLIPBOARD_FILE_BYTES = 150 * 1024 * 1024 // 150 MB — guards against OOM on huge media +ipcMain.handle("cdp:read-clipboard-files", () => { + let paths = [] + try { + // macOS: a single copied file exposes public.file-url (file:///…); read it directly. 
+ const fileUrl = clipboard.read("public.file-url") + if (fileUrl) { + const p = decodeURIComponent(fileUrl.replace(/^file:\/\//, "")) + if (p) paths = [p] + } + } catch { + // format absent on this platform / clipboard — fall through to empty + } + const out = [] + for (const p of paths) { + try { + const stat = fs.statSync(p) + if (!stat.isFile() || stat.size > MAX_CLIPBOARD_FILE_BYTES) continue + const buf = fs.readFileSync(p) + const type = clipboardMime(path.basename(p)) + out.push({ + name: path.basename(p), + type, + dataUrl: `data:${type};base64,${buf.toString("base64")}`, + }) + } catch { + // unreadable path — skip + } + } + return out +}) + ipcMain.handle("cdp:list-tabs", async () => { try { const { url, method } = endpoints.list(cdpHost, cdpPort) @@ -501,6 +539,131 @@ function updateBadge() { if (typeof app.setBadgeCount === "function") app.setBadgeCount(notificationCenter.unreadCount()) } +// --- Dock icon composite (t066): overlay the notifying app's favicon on the bottom-right +// of CDP Browser's dock icon, so the dock tells you WHICH app pinged you (Slack vs Teams), +// not just a number. The compositing runs in the chrome renderer (its <img> decodes .ico +// + data-URL inputs don't taint the canvas), driven from main via executeJavaScript. +const APP_ICON_PATH = path.join(__dirname, "build", "icon.png") +let baseIconDataUrl = null +function baseIcon() { + if (baseIconDataUrl != null) return baseIconDataUrl + try { + baseIconDataUrl = `data:image/png;base64,${fs.readFileSync(APP_ICON_PATH).toString("base64")}` + } catch { + baseIconDataUrl = "" + } + return baseIconDataUrl +} + +// Bundled per-adapter app icons (Slack/Teams/Outlook), keyed by adapter name. Local files — +// NO network: remote favicon URLs were unreliable (Slack's returned HTTP 403, corporate +// proxies black-holed the fetch, and the first notification always missed the cache). Local +// PNGs render on the very first notification and can't hang. See build/notif-icons/.
+const NOTIF_ICONS_DIR = path.join(__dirname, "build", "notif-icons") +// Human label per adapter for the macOS notification subtitle (text fallback when the OS +// banner ignores the custom icon). +const NOTIF_APP_LABELS = { slack: "Slack", teams: "Microsoft Teams", outlook: "Outlook" } +const localIconImageCache = new Map() // adapter -> NativeImage | null +function localIconImage(adapter) { + if (!adapter) return null + if (localIconImageCache.has(adapter)) return localIconImageCache.get(adapter) + let img = null + try { + const p = path.join(NOTIF_ICONS_DIR, `${adapter}.png`) + if (fs.existsSync(p)) { + const candidate = nativeImage.createFromPath(p) + if (!candidate.isEmpty()) img = candidate + } + } catch {} + localIconImageCache.set(adapter, img) + return img +} +const localIconDataUrlCache = new Map() // adapter -> dataURL | "" +function localIconDataUrl(adapter) { + if (!adapter) return "" + if (localIconDataUrlCache.has(adapter)) return localIconDataUrlCache.get(adapter) + let out = "" + try { + const p = path.join(NOTIF_ICONS_DIR, `${adapter}.png`) + if (fs.existsSync(p)) out = `data:image/png;base64,${fs.readFileSync(p).toString("base64")}` + } catch {} + localIconDataUrlCache.set(adapter, out) + return out +} + +// In the renderer: draw base app icon + the adapter icon in the bottom-right corner. +// Returns the composited dock PNG data URL, or null on failure. Inputs are local data URLs, +// so the canvas is never cross-origin-tainted; a per-image timeout prevents any hang. 
+async function composeDockIcon(iconDataUrl) { + const wc = chromeWc() + const base = baseIcon() + if (!wc || !base || !iconDataUrl) return null + const expr = `(async () => { + const load = (src) => new Promise((res) => { + const img = new Image() + const done = (v) => res(v) + img.onload = () => done(img); img.onerror = () => done(null) + setTimeout(() => done(null), 2500) + img.src = src + }) + try { + const [base, fav] = await Promise.all([load(${JSON.stringify(base)}), load(${JSON.stringify(iconDataUrl)})]) + if (!base || !fav) return null + const S = base.naturalWidth || 1024 + const c = document.createElement("canvas"); c.width = S; c.height = S + const x = c.getContext("2d") + x.drawImage(base, 0, 0, S, S) + const bs = Math.round(S * 0.42), pad = Math.round(S * 0.04) + const bx = S - bs - pad, by = S - bs - pad, r = Math.round(bs * 0.22) + x.save() + x.beginPath() + x.moveTo(bx + r, by) + x.arcTo(bx + bs, by, bx + bs, by + bs, r) + x.arcTo(bx + bs, by + bs, bx, by + bs, r) + x.arcTo(bx, by + bs, bx, by, r) + x.arcTo(bx, by, bx + bs, by, r) + x.closePath() + x.shadowColor = "rgba(0,0,0,0.35)"; x.shadowBlur = Math.round(S * 0.02) + x.fillStyle = "#fff"; x.fill() + x.restore() + const inset = Math.round(bs * 0.12) + x.drawImage(fav, bx + inset, by + inset, bs - 2 * inset, bs - 2 * inset) + return c.toDataURL("image/png") + } catch { return null } + })()` + try { + return await wc.executeJavaScript(expr) + } catch { + return null + } +} + +function setDockIcon(dataUrl) { + if (!app.dock || !dataUrl) return + try { + app.dock.setIcon(nativeImage.createFromDataURL(dataUrl)) + } catch {} +} +function clearDockIcon() { + if (!app.dock) return + try { + app.dock.setIcon(nativeImage.createFromPath(APP_ICON_PATH)) + } catch {} +} + +// Reconcile the dock icon with the store: overlay the newest-unread app's icon, or restore +// the plain icon when nothing is unread. Fire-and-forget — never awaited on a path that +// gates a notification. 
Uses bundled local icons, so it can't hang on the network. +async function syncDockIcon() { + try { + const newest = notificationCenter.list().find((n) => !n.read) + const iconDataUrl = newest ? localIconDataUrl(newest.adapter) : "" + const dock = iconDataUrl ? await composeDockIcon(iconDataUrl) : null + if (dock) setDockIcon(dock) + else clearDockIcon() + } catch {} +} + // Retain shown Notification objects: Electron/V8 garbage-collects a Notification with no // live reference, and the collected object never delivers its `click` event — the banner // shows but clicking it does nothing. Held until the user clicks or it closes. @@ -521,6 +684,10 @@ const notificationCenter = createNotificationCenter({ updateBadge() chromeSend("cdp:notification", entry) + // Fire the OS notification FIRST and synchronously — it must NEVER be gated by favicon + // or network work (a hung favicon fetch previously swallowed every toast). The banner + // icon is best-effort: use the bundled local adapter icon when one exists, else fire + // without one (the app icon shows regardless). The dock sync below is fire-and-forget. const windowFocused = !!(mainWindow && !mainWindow.isDestroyed() && mainWindow.isFocused()) if ( shouldNotifyOs(entry, { @@ -530,7 +697,17 @@ const notificationCenter = createNotificationCenter({ }) && Notification.isSupported() ) { - const osN = new Notification({ title: entry.title || entry.source, body: entry.body }) + const opts = { title: entry.title || entry.source, body: entry.body } + // Tell the user WHICH app pinged them, two ways for robustness: + // - icon: bundled local adapter logo (Slack/Teams/Outlook). Works on Windows/Linux and + // in macOS Notification Center; macOS banners may fall back to the app icon. + // - subtitle (macOS): the app label in TEXT — always visible even if the icon is + // suppressed, so "Slack" vs "Teams" is never ambiguous.
+ const img = localIconImage(entry.adapter) + if (img) opts.icon = img + const label = NOTIF_APP_LABELS[entry.adapter] + if (label) opts.subtitle = label + const osN = new Notification(opts) liveNotifications.add(osN) const cleanupN = () => liveNotifications.delete(osN) osN.on("click", () => { @@ -544,34 +721,45 @@ const notificationCenter = createNotificationCenter({ osN.on("close", cleanupN) osN.show() } + + // Update the dock icon overlay. Fire-and-forget, so the renderer-side composite can + // never delay or swallow the notification above. + void syncDockIcon() }, }) setInterval(() => notificationCenter.reconcile(), 5000) app.whenReady().then(() => { updateBadge() // restore dock badge from persisted unread - setTimeout(() => notificationCenter.reconcile(), 1000) + setTimeout(() => { + notificationCenter.reconcile() + syncDockIcon() // restore dock favicon overlay once the chrome renderer can composite + }, 1000) }) ipcMain.handle("cdp:get-notifications", () => notificationCenter.list()) ipcMain.handle("cdp:mark-notification-read", (_, id) => { const list = notificationCenter.markRead(id) updateBadge() + syncDockIcon() return list }) ipcMain.handle("cdp:mark-notification-unread", (_, id) => { const list = notificationCenter.markUnread(id) updateBadge() + syncDockIcon() return list }) ipcMain.handle("cdp:mark-notifications-read", () => { const list = notificationCenter.markAllRead() updateBadge() + syncDockIcon() return list }) ipcMain.handle("cdp:clear-notifications", () => { const list = notificationCenter.clear() updateBadge() + syncDockIcon() return list }) diff --git a/package.json b/package.json index d2a9ac6..c38b6db 100644 --- a/package.json +++ b/package.json @@ -58,6 +58,8 @@ "!core/**/*.test.js", "inject/**/*", "dist/**/*", + "build/icon.png", + "build/notif-icons/**/*", "!node_modules/**/*", "node_modules/ws/**/*" ], diff --git a/preload.js b/preload.js index e0bd0e0..0208ce2 100644 --- a/preload.js +++ b/preload.js @@ -23,6
+23,7 @@ contextBridge.exposeInMainWorld("cdp", { copyToClipboard: (text) => ipcRenderer.invoke("cdp:copy-to-clipboard", text), readClipboard: () => ipcRenderer.invoke("cdp:read-clipboard"), readClipboardImage: () => ipcRenderer.invoke("cdp:read-clipboard-image"), + readClipboardFiles: () => ipcRenderer.invoke("cdp:read-clipboard-files"), onSwipe: (cb) => ipcRenderer.on("cdp:swipe", (_, direction) => cb(direction)), // Pins getPins: () => ipcRenderer.invoke("cdp:get-pins"), diff --git a/src/app.tsx b/src/app.tsx index a0f27be..a4233a3 100644 --- a/src/app.tsx +++ b/src/app.tsx @@ -1179,16 +1179,25 @@ export default function App() { if (caps.web) break e.preventDefault() e.stopPropagation() - // Electron: read the local clipboard in main (image first, then text) and inject. + // Electron: read the local clipboard in main and inject. Order matters: a copied + // *file* (e.g. a video from Finder) must be read as a real file — readClipboardImage + // would otherwise return the file's icon thumbnail and paste that instead. So: + // files → image bits → text. window.cdp - .readClipboardImage() - .then((dataUrl) => { - if (dataUrl) { - page.pasteImage(dataUrl) + .readClipboardFiles() + .then((files) => { + if (files?.length) { + for (const f of files) page.pasteFile(f.dataUrl, f.name, f.type) return } - return window.cdp.readClipboard().then((text) => { - if (text) page.paste(text, { rich: false }) + return window.cdp.readClipboardImage().then((dataUrl) => { + if (dataUrl) { + page.pasteImage(dataUrl) + return + } + return window.cdp.readClipboard().then((text) => { + if (text) page.paste(text, { rich: false }) + }) }) }) .catch(() => { @@ -1238,16 +1247,17 @@ export default function App() { return const dt = e.clipboardData if (!dt) return - const imageItem = Array.from(dt.items).find( - (it) => it.kind === "file" && it.type.startsWith("image/"), - ) - if (imageItem) { - const file = imageItem.getAsFile() + // Any file (image, video, doc) — not just images. 
Preserve name + type so the + // remote upload target accepts a video instead of receiving a bare image. + const fileItem = Array.from(dt.items).find((it) => it.kind === "file") + if (fileItem) { + const file = fileItem.getAsFile() if (file) { e.preventDefault() const reader = new FileReader() reader.onload = () => { - if (typeof reader.result === "string") page.pasteImage(reader.result) + if (typeof reader.result === "string") + page.pasteFile(reader.result, file.name, file.type) } reader.readAsDataURL(file) return diff --git a/src/components/viewport.tsx b/src/components/viewport.tsx index 82c3dbf..14c0335 100644 --- a/src/components/viewport.tsx +++ b/src/components/viewport.tsx @@ -1,4 +1,5 @@ import { useCallback, useEffect, useRef, useState } from "react" +import { toast } from "sonner" import type { SwitchEffect } from "@/components/settings-dialog" import { useAnyPointerFine } from "@/hooks/use-pointer-coarse" import { type Event as AdaptiveEvent, type Bounds, initial, reduce } from "@/lib/adaptive-viewport" @@ -13,7 +14,7 @@ import { } from "@/lib/echo-cursor" import { isOsReservedKey } from "@/lib/key-routing" import { perfFrame, perfMark } from "@/lib/perf-mark" -import type { RemotePage } from "@/lib/remote-page" +import type { DropFileSpec, RemotePage } from "@/lib/remote-page" import { createTouchGesture, type GestureEvent, LONGPRESS_MS } from "@/lib/touch-gesture" import { drawFrame, type Size, toRemoteCoords } from "@/lib/viewport-transform" import { @@ -28,6 +29,34 @@ import { * the reflow as finished and reveal the tab. Adapts the freeze to connection speed and * page complexity (a heavy page like Outlook keeps emitting frames until it's done). */ const FRAMES_QUIET_MS = 200 + +/** Files dropped onto the canvas cross the CDP wire as a base64 data URL inside a + * Runtime.evaluate expression — fine for screenshots/clips, ruinous for a multi-GB movie. + * Cap each file and toast the ones skipped so a huge drop fails loudly, not silently. 
*/ +const MAX_DROP_FILE_BYTES = 100 * 1024 * 1024 // 100 MB + +/** Reads dropped files into data-URL specs, skipping files over the size cap. + * Returns the readable specs plus the names of any skipped (too-large) files. */ +async function readDroppedFiles( + files: File[], +): Promise<{ specs: DropFileSpec[]; skipped: string[] }> { + const specs: DropFileSpec[] = [] + const skipped: string[] = [] + for (const file of files) { + if (file.size > MAX_DROP_FILE_BYTES) { + skipped.push(file.name) + continue + } + const dataUrl = await new Promise<string | null>((resolve) => { + const reader = new FileReader() + reader.onload = () => resolve(typeof reader.result === "string" ? reader.result : null) + reader.onerror = () => resolve(null) + reader.readAsDataURL(file) + }) + if (dataUrl) specs.push({ dataUrl, name: file.name, type: file.type }) + } + return { specs, skipped } +} /** Safety cap on the tab-switch freeze, in case frames never go quiet (animated page). */ const SETTLE_CAP_MS = 1500 /** Blur applied to the frozen frame during a tab-switch settle; eased back to 0 on @@ -94,6 +123,11 @@ export function Viewport({ }: ViewportProps) { const canvasRef = useRef(null) const containerRef = useRef(null) + // True while an OS file drag hovers the canvas — drives the drop-target highlight. + const [dragOver, setDragOver] = useState(false) + // Nested dragenter/dragleave fire per child; a depth counter keeps the highlight stable + // until the drag truly leaves the container (leave at depth 0). + const dragDepthRef = useRef(0) const imgRef = useRef(new Image()) // The single frame-view snapshot, captured the moment a frame is painted: the painted // frame's image px, plus its remote DIP geometry (device size + vertical offset) when @@ -653,8 +687,77 @@ export function Viewport({ } }, [page, maybeRearm]) + // An OS file dropped anywhere on an Electron window otherwise navigates it to file://.
+ // Swallow the default for file drags outside an editable field / local webview (those + // own their drops); the canvas's own onDrop still runs and forwards the files. + useEffect(() => { + const hasFiles = (e: DragEvent) => Array.from(e.dataTransfer?.types ?? []).includes("Files") + const isOwnTarget = (t: EventTarget | null) => { + const el = t as HTMLElement | null + if (!el?.tagName) return false + return ( + el.tagName === "INPUT" || + el.tagName === "TEXTAREA" || + el.tagName === "WEBVIEW" || + el.isContentEditable + ) + } + const guard = (e: DragEvent) => { + if (hasFiles(e) && !isOwnTarget(e.target)) e.preventDefault() + } + window.addEventListener("dragover", guard) + window.addEventListener("drop", guard) + return () => { + window.removeEventListener("dragover", guard) + window.removeEventListener("drop", guard) + } + }, []) + + const hasDragFiles = (e: React.DragEvent) => + Array.from(e.dataTransfer?.types ?? []).includes("Files") + return ( -
+
{ + if (!hasDragFiles(e)) return + e.preventDefault() + dragDepthRef.current += 1 + setDragOver(true) + }} + onDragLeave={(e) => { + if (!hasDragFiles(e)) return + dragDepthRef.current = Math.max(0, dragDepthRef.current - 1) + if (dragDepthRef.current === 0) setDragOver(false) + }} + onDragOver={(e) => { + if (!hasDragFiles(e)) return + e.preventDefault() + e.dataTransfer.dropEffect = "copy" + }} + onDrop={(e) => { + if (!e.dataTransfer?.files?.length) return + e.preventDefault() + dragDepthRef.current = 0 + setDragOver(false) + const { clientX, clientY } = e + const files = Array.from(e.dataTransfer.files) + void readDroppedFiles(files).then(({ specs, skipped }) => { + if (specs.length) { + const names = specs.map((s) => s.name).join(", ") + // The bytes stream to the remote in chunks (seconds for a big video); keep the + // toast up so the wait reads as progress, not a hang. + const tid = toast.loading(`Sending ${names} to the page…`) + page + .dropFiles(specs, clientX, clientY) + .then(() => toast.success(`Sent ${names} to the page`, { id: tid })) + .catch(() => toast.error(`Couldn't send ${names}`, { id: tid })) + } + if (skipped.length) toast(`Too large to drop (max 100 MB): ${skipped.join(", ")}`) + }) + }} + ref={containerRef} + > { @@ -747,6 +850,13 @@ export function Viewport({ ref={canvasRef} /> {showVirtualPointer && } + {dragOver && ( +
+
+ Drop files to send to the page +
+
+ )}
) } diff --git a/src/lib/CLAUDE.md b/src/lib/CLAUDE.md index 61ae936..2fbd645 100644 --- a/src/lib/CLAUDE.md +++ b/src/lib/CLAUDE.md @@ -4,7 +4,7 @@ Domain modules that form the renderer's logic layer, plus a React hook that wire ## Modules -**`remote-page.ts`** — the Remote Page. `createRemotePage(transport)` wraps the CDP Transport seam into named intentions (`navigate`, `navigateSpa`, `openTeamsThread`, `back`, `forward`, `reload`, `selectAll`, `copySelection`, `getNavState`, `isLoading`, `paste`, `pasteImage`, `find`/`findStep`/`clearFind`) — `navigateSpa` drives client-side SPA routing (`pushState`+`popstate`, full-navigation fallback) for deep-opening, e.g. an Outlook message from a notification; `openTeamsThread` deep-opens a Teams conversation by clicking the chat row carrying the thread id (Teams has no URL route — see ADR-0003); `paste(text, {rich?})` inserts the local clipboard text into the remote focused element: `rich:false` (default) uses `Input.insertText` (plain, fires `input` events on React-controlled inputs); `rich:true` pre-seeds the remote clipboard via `Runtime.evaluate` and forwards Cmd+V so the page's `onpaste` handler runs; `pasteImage(dataUrl)` synthesizes a `paste` ClipboardEvent on the remote's focused element carrying the image as a `File` in a `DataTransfer` (rich editors — Slack, Gmail, Docs — read `clipboardData.files`); `find`/`findStep`/`clearFind` are the in-page find seam (t001) — they inject a per-document `window.__cdpFind` helper via `Runtime.evaluate` (`window.find` reports only a boolean, so the helper owns counting, stepping-with-wrap, scroll-into-view, and clearing) and report `{ total }` / `{ index }` via `returnByValue`; and the two subscription surfaces (`on` for typed events, `onFrame` for Screencast Frames). One registration on the raw transport; subscribers come and go — no re-registration, no leaks. Auto-acks every Screencast Frame before passing it to `onFrame` listeners. 
`forwardInput(InputIntent)` is the single Input Forwarding extension seam: new input kinds (IME, paste, drag) become new variants on `InputIntent` plus one `case` in `forwardInput`; no other interface changes. +**`remote-page.ts`** — the Remote Page. `createRemotePage(transport)` wraps the CDP Transport seam into named intentions (`navigate`, `navigateSpa`, `openTeamsThread`, `back`, `forward`, `reload`, `selectAll`, `copySelection`, `getNavState`, `isLoading`, `paste`, `pasteImage`, `pasteFile`, `dropFiles`, `find`/`findStep`/`clearFind`) — `navigateSpa` drives client-side SPA routing (`pushState`+`popstate`, full-navigation fallback) for deep-opening, e.g. an Outlook message from a notification; `openTeamsThread` deep-opens a Teams conversation by clicking the chat row carrying the thread id (Teams has no URL route — see ADR-0003); `paste(text, {rich?})` inserts the local clipboard text into the remote focused element: `rich:false` (default) uses `Input.insertText` (plain, fires `input` events on React-controlled inputs); `rich:true` pre-seeds the remote clipboard via `Runtime.evaluate` and forwards Cmd+V so the page's `onpaste` handler runs; `pasteFile(dataUrl, name, type)` synthesizes a `paste` ClipboardEvent on the remote's focused element carrying the file as a `File` in a `DataTransfer` (rich editors / upload surfaces — Slack, Gmail, Drive — read `clipboardData.files`), preserving the real name + MIME so a video/doc is accepted (not just an image); `pasteImage(dataUrl)` is a thin wrapper over it for raw image bits; `dropFiles(specs, clientX, clientY)` injects OS-dropped files by mapping the drop point through the coord resolver and synthesizing `dragenter`/`dragover`/`drop` DragEvents carrying the files in a `DataTransfer` on the remote element under the cursor (no-op when empty) — `viewport.tsx` reads the dropped `File`s via `FileReader`. 
Both `pasteFile` and `dropFiles` are **async** and **stream the payload in ~2 MB base64 chunks** (`streamFiles` → `window.__cdpFiles[key]`, then `assembleFilesExpr` joins + decodes into a real `File`); a whole 62 MB video as one `Runtime.evaluate` source literal froze the renderer parsing it (and the screencast shares the CDP socket), so chunking is load-bearing, not an optimization; `find`/`findStep`/`clearFind` are the in-page find seam (t001) — they inject a per-document `window.__cdpFind` helper via `Runtime.evaluate` (`window.find` reports only a boolean, so the helper owns counting, stepping-with-wrap, scroll-into-view, and clearing) and report `{ total }` / `{ index }` via `returnByValue`; and the two subscription surfaces (`on` for typed events, `onFrame` for Screencast Frames). One registration on the raw transport; subscribers come and go — no re-registration, no leaks. Auto-acks every Screencast Frame before passing it to `onFrame` listeners. `forwardInput(InputIntent)` is the single Input Forwarding extension seam: new input kinds (IME, paste, drag) become new variants on `InputIntent` plus one `case` in `forwardInput`; no other interface changes. **`tabs.ts`** — Tab ordering and lifecycle. `reconcile(order, remoteTabs)` merges the Remote Browser's tab list against the locally-owned order: existing tabs keep position, gone tabs drop out, new tabs append. `nextTab`/`prevTab` wrap around. `stripTitleBadge(title)` strips a leading `(N)` unread count that some apps (e.g. Teams) prepend to the document title — the app surfaces unread counts via its own tab badge, so the title shouldn't duplicate it. 
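
The chunked streaming described for `remote-page.ts` above can be illustrated in isolation. The following is a minimal sketch, not the module's actual code: `chunkBase64`, `assembleBytes`, and the tiny `CHUNK_CHARS` value are hypothetical names chosen for illustration (the patch itself uses `FILE_CHUNK_CHARS = 2_000_000` and keeps the chunks in `window.__cdpFiles`). It assumes a runtime with global `btoa`/`atob` and `TextDecoder` (a browser or a recent Node).

```typescript
// Hypothetical chunk size, kept tiny so the split is visible in a short example.
const CHUNK_CHARS = 8

/** Splits a base64 string into ordered slices of at most `chunkChars` characters. */
function chunkBase64(b64: string, chunkChars: number): string[] {
  if (chunkChars <= 0) throw new RangeError("chunkChars must be positive")
  const chunks: string[] = []
  for (let i = 0; i < b64.length; i += chunkChars) {
    chunks.push(b64.slice(i, i + chunkChars))
  }
  return chunks
}

/** Joins the slices back into one base64 string, then decodes it to raw bytes. */
function assembleBytes(chunks: string[]): Uint8Array {
  const binary = atob(chunks.join(""))
  const bytes = new Uint8Array(binary.length)
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i)
  return bytes
}

// Round trip: encode, split, reassemble, decode.
const payload = btoa("hello, chunked world")
const chunks = chunkBase64(payload, CHUNK_CHARS)
const decoded = new TextDecoder().decode(assembleBytes(chunks))
console.log(chunks.length, decoded)
```

Because the slices are joined before decoding, the chunk size does not need to be a multiple of 4; only the order of the slices matters, which is why the sender pushes them sequentially.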
diff --git a/src/lib/cdp-web-transport.ts b/src/lib/cdp-web-transport.ts index 0ad62e7..d23b849 100644 --- a/src/lib/cdp-web-transport.ts +++ b/src/lib/cdp-web-transport.ts @@ -1205,6 +1205,7 @@ export function createWebCdp(deps: WebTransportDeps = resolveDeps()): CdpBridge // Web reads the clipboard from the native `paste` event (app.tsx), not here — the // async Clipboard API can't reliably read images on Safari/iPad. Stub returns null. readClipboardImage: async () => null, + readClipboardFiles: async () => [], onSwipe: () => {}, // no trackpad swipe over the web getPins: () => rest.getJson("/api/pins"), addPin: (pin) => rest.postJson("/api/pins/add", pin), diff --git a/src/lib/remote-page.test.ts b/src/lib/remote-page.test.ts index 6a79e46..814ebb7 100644 --- a/src/lib/remote-page.test.ts +++ b/src/lib/remote-page.test.ts @@ -354,19 +354,63 @@ describe("RemotePage clipboard paste", () => { expect(t.sends[0].params).toMatchObject({ type: "keyDown", key: "v", commandKey: true }) }) - it("pasteImage evaluates a synthetic paste event injecting the data URL", () => { + it("pasteImage streams the payload in chunks then dispatches a paste event", async () => { const t = fakeTransport() const page = createRemotePage(t.transport) - page.pasteImage("data:image/png;base64,ABC") + await page.pasteImage("data:image/png;base64,ABC") - expect(t.invoke).toHaveBeenCalledWith( - "Runtime.evaluate", - expect.objectContaining({ awaitPromise: true }), + const exprs = t.invoke.mock.calls.map((c) => c[1].expression as string) + const dispatch = exprs[exprs.length - 1] + expect(dispatch).toContain('ClipboardEvent("paste"') + expect(dispatch).toContain("pasted-image.png") + // The base64 payload rides the chunk pushes, never the final dispatch (no giant literal). 
+ expect(exprs.some((e) => e.includes("ABC"))).toBe(true) + expect(dispatch).not.toContain("ABC") + }) + + it("pasteFile carries the file name and mime type into the synthesized File", async () => { + const t = fakeTransport() + const page = createRemotePage(t.transport) + + await page.pasteFile("data:video/mp4;base64,XYZ", "clip.mp4", "video/mp4") + + const exprs = t.invoke.mock.calls.map((c) => c[1].expression as string) + const dispatch = exprs[exprs.length - 1] + expect(dispatch).toContain('ClipboardEvent("paste"') + expect(dispatch).toContain("clip.mp4") + expect(dispatch).toContain("video/mp4") + expect(exprs.some((e) => e.includes("XYZ"))).toBe(true) + expect(dispatch).not.toContain("XYZ") + }) + + it("dropFiles streams chunks then dispatches a drop at the resolved remote coords", async () => { + const t = fakeTransport() + const page = createRemotePage(t.transport) + page.setCoordResolver(() => ({ x: 100, y: 200 })) + + await page.dropFiles( + [{ dataUrl: "data:video/mp4;base64,XYZ", name: "clip.mp4", type: "video/mp4" }], + 10, + 20, ) - const expr = t.invoke.mock.calls[0][1].expression as string - expect(expr).toContain('ClipboardEvent("paste"') - expect(expr).toContain("data:image/png;base64,ABC") + + const exprs = t.invoke.mock.calls.map((c) => c[1].expression as string) + const dispatch = exprs[exprs.length - 1] + expect(dispatch).toContain('DragEvent("drop"') + expect(dispatch).toContain("elementFromPoint(100, 200)") + expect(dispatch).toContain("clip.mp4") + expect(exprs.some((e) => e.includes("XYZ"))).toBe(true) + expect(dispatch).not.toContain("XYZ") + }) + + it("dropFiles is a no-op when no files are given", async () => { + const t = fakeTransport() + const page = createRemotePage(t.transport) + + await page.dropFiles([], 10, 20) + + expect(t.invoke).not.toHaveBeenCalled() }) }) diff --git a/src/lib/remote-page.ts b/src/lib/remote-page.ts index 2ef449d..3c905fe 100644 --- a/src/lib/remote-page.ts +++ b/src/lib/remote-page.ts @@ -94,6 +94,13 @@ 
export type InputIntent =
     }
   | { kind: "wheel"; event: WheelEventLike }
 
+/** A file to inject into the remote page (paste or drop), as a base64 data URL. */
+export interface DropFileSpec {
+  dataUrl: string
+  name: string
+  type: string
+}
+
 export interface RemotePageOptions {
   /** Maps a client point to Remote Page pixels (the injected Viewport Transform). */
   resolveCoords?: (clientX: number, clientY: number) => { x: number; y: number }
@@ -242,7 +249,25 @@ export interface RemotePage {
    * ClipboardEvent carrying the image as a File (rich editors read clipboardData.files).
    * `dataUrl` is a `data:image/...;base64,…` string.
    */
-  pasteImage(dataUrl: string): void
+  pasteImage(dataUrl: string): Promise<void>
+  /**
+   * Pastes an arbitrary file (video, audio, doc, image) into the remote page's
+   * focused element by synthesizing a `paste` ClipboardEvent carrying the file as a
+   * `File` in a `DataTransfer`. Unlike `pasteImage` this preserves the original file
+   * name + MIME type so upload targets that sniff extension/type (Slack, Drive) accept
+   * it. `dataUrl` is a `data:<mime>;base64,…` string. The payload is streamed to the
+   * remote in chunks (see `streamFiles`) so a large file can't freeze the CDP link.
+   */
+  pasteFile(dataUrl: string, name: string, type: string): Promise<void>
+  /**
+   * Drops one or more files onto the remote page at the given client coordinates by
+   * synthesizing dragenter/dragover/drop DragEvents on the element under the cursor,
+   * each carrying the files in a `DataTransfer` (upload dropzones — Slack, Gmail, Drive —
+   * read `dataTransfer.files`). `clientX`/`clientY` are canvas-relative client px, mapped
+   * to Remote Page DIP via the injected coord resolver (the same path as Input Forwarding).
+   * Payload is streamed in chunks so a large video can't freeze the link. No-op when empty.
+   */
+  dropFiles(files: DropFileSpec[], clientX: number, clientY: number): Promise<void>
   /**
    * In-page find (t001).
The remote-side search is an injected per-document routine * (`window.find` reports only a boolean — it can't count or step deterministically), @@ -265,6 +290,62 @@ function normalizeUrl(url: string): string { return /^https?:\/\//.test(url) ? url : `https://${url}` } +/** + * ~2 MB of base64 chars per Runtime.evaluate. Small enough that the remote parses each + * chunk's source literal in a single fast tick (a whole-file literal — tens of MB — would + * block the renderer parsing it, freezing the screencast that shares the CDP socket), big + * enough to keep the round-trip count low. See `streamFiles`. + */ +const FILE_CHUNK_CHARS = 2_000_000 + +/** + * Streams each file's base64 payload into the remote page in bounded chunks, accumulating + * them under `window.__cdpFiles[key]`. Returns the per-file keys (in input order) so the + * caller's assemble step can join + decode them. Sequential by design — order matters for + * the join, and interleaving small evaluates with screencast frames is what keeps the link + * from stalling. The bytes never travel inside an evaluate's source as one giant literal. 
+ */
+async function streamFiles(
+  transport: Transport,
+  files: DropFileSpec[],
+  seq: number,
+): Promise<string[]> {
+  const keys: string[] = []
+  for (let fi = 0; fi < files.length; fi++) {
+    const key = `s${seq}_${fi}`
+    keys.push(key)
+    const b64 = files[fi].dataUrl.slice(files[fi].dataUrl.indexOf(",") + 1)
+    await transport.invoke("Runtime.evaluate", {
+      expression: `((window.__cdpFiles||(window.__cdpFiles={}))[${JSON.stringify(key)}]=[])`,
+    })
+    for (let i = 0; i < b64.length; i += FILE_CHUNK_CHARS) {
+      await transport.invoke("Runtime.evaluate", {
+        expression: `window.__cdpFiles[${JSON.stringify(key)}].push(${JSON.stringify(
+          b64.slice(i, i + FILE_CHUNK_CHARS),
+        )})`,
+      })
+    }
+  }
+  return keys
+}
+
+/**
+ * In-page JS that reconstructs a `DataTransfer` (`dt`) from the streamed chunks: joins each
+ * file's base64, decodes to bytes, and builds a real `File` (name + MIME preserved) the
+ * remote page reads from `clipboardData.files` / `dataTransfer.files`. Frees `__cdpFiles`
+ * after. Returns a statement block defining `dt`; pairs with `streamFiles`.
+ */
+function assembleFilesExpr(keys: string[], files: DropFileSpec[]): string {
+  const metas = keys.map((key, i) => ({ key, name: files[i].name, type: files[i].type }))
+  return `const dt = new DataTransfer();
+    for (const m of ${JSON.stringify(metas)}) {
+      const b64 = ((window.__cdpFiles||{})[m.key]||[]).join("");
+      const bytes = Uint8Array.from(atob(b64), (c) => c.charCodeAt(0));
+      dt.items.add(new File([bytes], m.name, { type: m.type || "application/octet-stream" }));
+      try { delete window.__cdpFiles[m.key]; } catch (e) {}
+    }`
+}
+
 export function createRemotePage(
   transport: Transport,
   options: RemotePageOptions = {},
@@ -283,6 +364,8 @@ export function createRemotePage(
   // parentId); seeded from the first loading event when still unknown, and reset on
   // disconnect so each tab tracks its own frame.
let mainFrameId: string | undefined + // Monotonic id so concurrent file injections never clobber each other's `__cdpFiles` keys. + let fileSeq = 0 // One registration on the raw transport, demuxed to typed subscribers. Subscribers // come and go via `on`'s unsubscribe — the transport listener is registered once. @@ -481,20 +564,42 @@ export function createRemotePage( } }, pasteImage(dataUrl) { - // Input.insertText can't carry images, so synthesize a paste event on the remote's - // focused element with a DataTransfer holding the image File — rich editors (Slack, - // Gmail, Docs) that listen for `paste` read it from clipboardData.files. - transport.invoke("Runtime.evaluate", { - expression: `(async () => { - const res = await fetch(${JSON.stringify(dataUrl)}); - const blob = await res.blob(); - const file = new File([blob], "pasted-image.png", { type: blob.type || "image/png" }); - const dt = new DataTransfer(); - dt.items.add(file); + return this.pasteFile(dataUrl, "pasted-image.png", "image/png") + }, + async pasteFile(dataUrl, name, type) { + // Input.insertText can't carry binary, so synthesize a paste event on the remote's + // focused element with a DataTransfer holding the File — rich editors / upload + // surfaces (Slack, Gmail, Drive) that listen for `paste` read it from + // clipboardData.files. Name + type are preserved so the target accepts the file + // (a video needs its real extension/MIME, not a generic image). Streamed in chunks + // so a large file never lands as one giant Runtime.evaluate literal (that freezes CDP). 
+ const files = [{ dataUrl, name, type }] + const keys = await streamFiles(transport, files, fileSeq++) + await transport.invoke("Runtime.evaluate", { + expression: `(() => { + ${assembleFilesExpr(keys, files)} const el = document.activeElement || document.body; el.dispatchEvent(new ClipboardEvent("paste", { clipboardData: dt, bubbles: true, cancelable: true })); })()`, - awaitPromise: true, + }) + }, + async dropFiles(files, clientX, clientY) { + if (!files.length) return + // Map the drop point to Remote Page DIP (same resolver as Input Forwarding), then + // synthesize dragenter/dragover/drop on the element under the cursor. Many upload + // dropzones only listen for `drop` + `dataTransfer.files`; the lead-in drag events + // satisfy the ones that gate on a prior dragover. Streamed in chunks (see streamFiles). + const { x, y } = resolveCoords(clientX, clientY) + const keys = await streamFiles(transport, files, fileSeq++) + await transport.invoke("Runtime.evaluate", { + expression: `(() => { + ${assembleFilesExpr(keys, files)} + const el = document.elementFromPoint(${x}, ${y}) || document.body; + const base = { bubbles: true, cancelable: true, composed: true, clientX: ${x}, clientY: ${y}, dataTransfer: dt }; + el.dispatchEvent(new DragEvent("dragenter", base)); + el.dispatchEvent(new DragEvent("dragover", base)); + el.dispatchEvent(new DragEvent("drop", base)); + })()`, }) }, async find(query) { diff --git a/src/vite-env.d.ts b/src/vite-env.d.ts index 57337d1..191cc2e 100644 --- a/src/vite-env.d.ts +++ b/src/vite-env.d.ts @@ -99,6 +99,11 @@ interface CdpBridge { readClipboard: () => Promise /** Electron-only: the local clipboard's image as a data URL, or null if none. */ readClipboardImage: () => Promise + /** + * Electron-only: real files referenced on the local clipboard (e.g. a video copied in + * Finder), read as `{ name, type, dataUrl }`. Empty when no file reference is present. 
+ */ + readClipboardFiles: () => Promise> onSwipe: (cb: (direction: string) => void) => void // Pins getPins: () => Promise