From 714c0edd7dce2c486014b463bd5a96d8b2994c88 Mon Sep 17 00:00:00 2001
From: Arul Sharma <31745423+arul28@users.noreply.github.com>
Date: Mon, 11 May 2026 16:30:56 -0400
Subject: [PATCH 01/10] Add autoresearch perf loop foundation + skills
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Instruments ADE for an iterative perf-optimization loop driven by an
agentskills.io SKILL. Adds main-process metrics sampler (app metrics +
memory), renderer-side perfMark/perfMeasure + PerformanceObserver web
vitals, a tap into the existing IPC trace wrapper, and a JSONL + summary
aggregator that produces a single fitness score per scenario run.
Includes per-tab scenario harness (lanes + boot), launch shortcuts
(perf-launch.mjs / run-perf-scenario.mjs with --no-project and --tab),
and the ade-autoresearch / ade-perf-lanes / ade-perf-boot skills under
.agents/skills/. Smoke-verified end-to-end on the lanes tab —
summary.json populated with real signal across all six fitness
components.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .agents/skills/ade-autoresearch/SKILL.md      | 192 +++++++++
 .agents/skills/ade-perf-boot/SKILL.md         |  64 +++
 .agents/skills/ade-perf-lanes/SKILL.md        |  47 +++
 apps/desktop/src/main/main.ts                 |  14 +
 .../src/main/services/ipc/registerIpc.ts      |  11 +
 .../src/main/services/perf/aggregator.ts      | 378 ++++++++++++++++++
 .../src/main/services/perf/metricsSampler.ts  |  43 ++
 .../desktop/src/main/services/perf/perfIpc.ts |  64 +++
 .../desktop/src/main/services/perf/perfLog.ts |  95 +++++
 apps/desktop/src/preload/global.d.ts          |  21 +
 apps/desktop/src/preload/preload.ts           |  11 +
 apps/desktop/src/renderer/main.tsx            |   3 +
 .../src/renderer/perf/harness/index.ts        |  44 ++
 apps/desktop/src/renderer/perf/markers.ts     |  81 ++++
 .../src/renderer/perf/scenarios/boot.ts       | 101 +++++
 .../src/renderer/perf/scenarios/index.ts      | 164 ++++++++
 .../src/renderer/perf/scenarios/lanes.ts      |  94 +++++
 apps/desktop/src/renderer/perf/webVitals.ts   |  81 ++++
 apps/desktop/src/shared/ipc.ts                |   4 +
 scripts/perf-launch.mjs                       |  96 +++++
 scripts/reset-perf-pass.sh                    |  32 ++
 scripts/run-perf-scenario.mjs                 | 108 +++++
 22 files changed, 1748 insertions(+)
 create mode 100644 .agents/skills/ade-autoresearch/SKILL.md
 create mode 100644 .agents/skills/ade-perf-boot/SKILL.md
 create mode 100644 .agents/skills/ade-perf-lanes/SKILL.md
 create mode 100644 apps/desktop/src/main/services/perf/aggregator.ts
 create mode 100644 apps/desktop/src/main/services/perf/metricsSampler.ts
 create mode 100644 apps/desktop/src/main/services/perf/perfIpc.ts
 create mode 100644 apps/desktop/src/main/services/perf/perfLog.ts
 create mode 100644 apps/desktop/src/renderer/perf/harness/index.ts
 create mode 100644 apps/desktop/src/renderer/perf/markers.ts
 create mode 100644 apps/desktop/src/renderer/perf/scenarios/boot.ts
 create mode 100644 apps/desktop/src/renderer/perf/scenarios/index.ts
 create mode 100644 apps/desktop/src/renderer/perf/scenarios/lanes.ts
 create mode 100644 apps/desktop/src/renderer/perf/webVitals.ts
 create
mode 100755 scripts/perf-launch.mjs create mode 100755 scripts/reset-perf-pass.sh create mode 100755 scripts/run-perf-scenario.mjs diff --git a/.agents/skills/ade-autoresearch/SKILL.md b/.agents/skills/ade-autoresearch/SKILL.md new file mode 100644 index 000000000..36c9978bf --- /dev/null +++ b/.agents/skills/ade-autoresearch/SKILL.md @@ -0,0 +1,192 @@ +--- +name: ade-autoresearch +description: Iteratively optimize an ADE tab's CPU/memory/IPC/render performance. + Runs predefined scenarios, identifies bottlenecks from JSONL metrics, makes ONE + targeted code change per iteration, gates on tests + smoke, keeps wins on a + branch, and distills patterns into per-tab perf skills. Invoke when the user + says "optimize ", "autoresearch ", or "perf pass on ". Drives + ADE pointed at the perf-pass throwaway repo; full liberty inside that repo. + Uses Codex/GPT models for any in-ADE AI activity unless a scenario opts into + Claude. +metadata: + author: ADE + version: 0.1.0 +--- + +# ade-autoresearch + +A Karpathy-style autoresearch loop for ADE perf. You (the agent) ARE the loop runner — there is no hidden script. Follow this algorithm exactly. + +## Inputs + +- ``: the tab to optimize. Must be one of: `boot`, `lanes`, `missions`, `prs`, `work`, `files`, `run`, `graph`, `review`, `history`, `automations`, `cto`, `settings`. (`boot` = cold launch + welcome + project open + remote runtime + iOS pairing — the "main ADE screen" surface above any specific tab.) +- ``: throwaway git repo path. Defaults to `/Users/admin/Projects/perf pass` (note the space — quote it). Must exist, must be a git repo, must have a `perf-pass-seed` tag (or you create one on first run). Override via `ADE_PERF_PASS_DIR` env var. + +## Shortcuts — prefer IPC and direct launch over computer use + +Computer use (screenshots + clicks) is slow and noisy. Always prefer these in this order: + +1. **Direct IPC calls** in scenarios. 
Most ADE actions are exposed on `window.ade.*` — call them directly instead of clicking buttons. Examples: + ```ts + await window.ade.project.openRepo({ rootPath: "/some/repo" }); + await window.ade.lanes.create({ name: "x", ... }); + await window.ade.project.listRecent(); + await window.ade.remoteRuntime.listTargets(); + ``` + Read `apps/desktop/src/preload/global.d.ts` for the full IPC surface. + +2. **Warm launch on a specific tab** without running any scenario: + ```bash + node scripts/perf-launch.mjs --tab lanes + node scripts/perf-launch.mjs --tab boot --no-project + node scripts/perf-launch.mjs --route /settings/integrations + ``` + This boots ADE with perf instrumentation, navigates to the target route, and stays open. Ideal for hands-on inspection of metrics as they stream into `~/.ade/perf-runs//events.jsonl`. + +3. **Single-scenario run** for a measured cycle: + ```bash + node scripts/run-perf-scenario.mjs lanes.cold-list run-id + node scripts/run-perf-scenario.mjs boot.idle-welcome run-id --no-project + ``` + +4. **Computer use is the last resort.** Use only when no IPC exists and a click is unavoidable. Even then, prefer adding a `data-testid` and a `clickTestId` step in the scenario instead of pixel coordinates. + +## Setup (do once at start of run) + +1. **Read prior wins** at `.agents/skills/ade-perf-/SKILL.md` if it exists. These are patterns past runs discovered — do not redo them, and prefer NOT to undo them. If conflict, prefer the prior win unless your new change strictly improves on it. +2. **Read scenario definitions** at `apps/desktop/src/renderer/perf/scenarios/.ts`. These are the *contract*. Do NOT edit them. +3. **Verify perf-pass repo** is clean and on its seed tag: + ```bash + scripts/reset-perf-pass.sh + ``` + Refuse to start if perf-pass doesn't exist — instruct the user to create it. +4. **Create a working branch** off main: + ```bash + git checkout -b autoresearch/-$(date +%Y%m%d-%H%M) + ``` +5. 
**Set the model override** for all in-ADE AI activity: export `ADE_MODEL_OVERRIDE=gpt-5-codex` (or another GPT/Codex model id available in ADE). Don't touch this during the run. + +## Baseline (iteration 0) + +Run all scenarios for the tab. For lanes that's: + +```bash +node scripts/run-perf-scenario.mjs lanes.cold-list baseline-cold +node scripts/run-perf-scenario.mjs lanes.switch-rapid baseline-switch +node scripts/run-perf-scenario.mjs lanes.idle-at-rest baseline-idle +node scripts/run-perf-scenario.mjs lanes.scroll-list baseline-scroll +node scripts/run-perf-scenario.mjs lanes.stress-poll baseline-stress +``` + +For boot (which has a mix of project-loaded and no-project scenarios): + +```bash +node scripts/run-perf-scenario.mjs boot.cold-paint baseline-paint --no-project +node scripts/run-perf-scenario.mjs boot.recent-projects baseline-recent --no-project +node scripts/run-perf-scenario.mjs boot.open-project baseline-open --no-project +node scripts/run-perf-scenario.mjs boot.remote-runtime baseline-remote +node scripts/run-perf-scenario.mjs boot.idle-welcome baseline-idle --no-project +node scripts/run-perf-scenario.mjs boot.stress-launch baseline-stress --no-project +``` + +Each writes `~/.ade/perf-runs//summary.json`. Read all summaries. Compute the **per-tab fitness** as the sum of all scenario fitness scores. Record this as `baseline_fitness`. Also record per-component breakdown so you can target the worst component. + +Tag the baseline commit: +```bash +git tag perf-baseline--$(date +%Y%m%d) +``` + +## Iteration loop + +Stop conditions: **no fitness improvement for 10 consecutive iterations** OR user kills the run OR 50 iterations OR 4 hours wall-clock. + +For each iteration: + +### 1. Analyze +- Read the latest set of `summary.json` files for this tab. +- Pick the **#1 bottleneck**: the component contributing most to fitness. Tie-break by reproducibility (the bottleneck that appears across multiple scenarios > single-scenario). 
+- Common bottleneck categories: + - **Slow IPC channel**: a channel in `summary.ipc.slowChannels` with p95 ≥ 120ms + - **Long task spam**: `webVitals.longTaskCount` > 5 per minute + - **Memory growth**: `process.rendererHeapGrowthMB` > 10 over a scenario + - **Render-on-scroll cost**: `marks.scroll.*` p95 high + - **Route transition cost**: `marks.nav.*` or `marks.switch.*` p95 high + - **Main CPU**: `process.mainCpuPercentP95` > 30 during idle scenarios → background pollers +- Read the code that owns the bottleneck. Form a hypothesis. + +### 2. Propose ONE change + +Legal moves (examples — not exhaustive): +- Memoize a hot selector with `useMemo` / `useCallback` +- Batch IPC calls (collapse N independent invokes into one) +- Debounce / throttle a poller +- Virtualize a long list (`@tanstack/react-virtual` or similar) +- Lazy-load a heavy component (`React.lazy`) +- Replace `O(n²)` work with a Map lookup +- Hoist a stable callback out of render +- Skip re-renders with `React.memo` + stable props +- Move work off the render thread (`requestIdleCallback`, microtask deferral) +- Replace a polling interval with an event-driven subscription +- Cache an expensive derive (only invalidate on deps change) + +**Forbidden moves:** +- Editing anything under `apps/desktop/src/main/services/perf/**` +- Editing anything under `apps/desktop/src/renderer/perf/**` +- Editing `scripts/run-perf-scenario.mjs` or `scripts/reset-perf-pass.sh` +- Editing test files to make them pass +- Disabling polling/sync features outright (only debounce/throttle) +- Removing UI features or hiding elements to bypass scenarios +- Changing fitness weights or scenario definitions + +### 3. Apply the change +One commit, focused. Conventional message: `perf(): `. + +### 4. Test gate +Run **only the affected test files**. Never the full suite. Use the per-tab Vitest projects. 
+```bash +npm --prefix apps/desktop run typecheck +npm --prefix apps/desktop run test -- --run path/to/affected.test.ts +``` +If tests fail: **revert** the commit (`git reset --hard HEAD~1`), do NOT count toward plateau, try a different change targeting the same or next bottleneck. + +### 5. Measure +Re-run all scenarios for the tab. Compute new per-tab fitness. + +### 6. Smoke gate +For each scenario's summary, check `summary.scenarios..ok === true` and `smokeFailures.length === 0`. If any scenario failed smoke: **revert**, increment plateau counter. + +### 7. Decide +- Improvement threshold: `new_fitness < best_fitness * 0.98` (≥2% better) +- If improvement: **keep**. Update best. Reset plateau to 0. Amend the commit message with `fitness `. +- Else: **revert** (`git reset --hard HEAD~1`). Plateau += 1. + +### 8. Soft iteration cap +If this iteration has been running >15 minutes wall clock (build loops, scenario flakes, etc.), abort it: revert any in-progress change, mark as a missed iteration (don't count toward plateau), move on. + +## Termination + +When stop condition hits: +1. Print run summary: starting fitness, final fitness, %-improvement, list of kept commits (sha + message + fitness delta). +2. Suggest the user merge the working branch into main via PR. +3. Proceed to codification (next section). + +## Codify (after the run ends) + +Read all kept commits (`git log --oneline perf-baseline--... HEAD`). For each, extract the **pattern** (the technique used, not the literal change). Update `.agents/skills/ade-perf-/SKILL.md`: + +- One entry per pattern. If a similar pattern already exists, append a refinement instead of duplicating. +- Each entry: + - **Pattern**: one-line name (e.g. "Debounce git-status pollers behind window visibility"). + - **Why it helped**: which bottleneck it addressed, with the metric delta from the summary. + - **How to recognize when to apply**: signs in future code that the same pattern is needed. 
+ - **Anti-pattern to avoid**: what NOT to do. + - **Verification**: which scenario + metric this affected. +- Append-only. Do not delete prior entries. + +## Notes on agent behavior + +- **Stay focused.** One bottleneck at a time. Resist the urge to "while I'm here also fix..." — that breaks attribution. +- **Trust the metric.** If fitness went up but you "feel" the code is better, revert anyway. The metric is the contract. +- **The perf-pass repo is your sandbox.** Inside it, you may create lanes, open chats, run automations, anything that exercises ADE. The scenarios already drive this; you may extend them ONLY by adding new scenarios in `apps/desktop/src/renderer/perf/scenarios/.ts` — never by editing existing ones. +- **Codex model only.** If a scenario invokes an in-ADE chat, that chat uses the `ADE_MODEL_OVERRIDE` model (gpt-5-codex by default). Scenarios opting into Claude must declare `requiresClaude: true` and you must set `ADE_PERF_ALLOW_CLAUDE=1` for them. +- **Concurrency**: only one perf run on the machine at a time. If `~/.ade/perf-runs/` contains a `/lock` file with a live pid, refuse to start. diff --git a/.agents/skills/ade-perf-boot/SKILL.md b/.agents/skills/ade-perf-boot/SKILL.md new file mode 100644 index 000000000..1f69521e4 --- /dev/null +++ b/.agents/skills/ade-perf-boot/SKILL.md @@ -0,0 +1,64 @@ +--- +name: ade-perf-boot +description: Performance patterns discovered for ADE's cold launch and "main + screen" surfaces — welcome / project picker, recent projects list, project + open flow, remote runtime connect, iOS pairing. Read before editing files in + apps/desktop/src/main/main.ts, the App shell (apps/desktop/src/renderer/components/app/**), + project bootstrap services (apps/desktop/src/main/services/projects/**, lanes/**), + remote runtime services, or anything in the app's pre-tab boot path. + Append-only knowledge base populated by ade-autoresearch runs. 
+metadata:
+  author: ade-autoresearch
+  version: 0.1.0
+  status: seed
+---
+
+# ade-perf-boot
+
+Patterns discovered for ADE's cold launch and main-screen surfaces. Each entry has run-traced provenance — do not delete entries without explicit user approval.
+
+## Scope
+
+This tab covers the code paths that run before any per-tab UI is mounted, plus the project-level chrome that appears regardless of which tab is open:
+
+- **Cold launch** — process spawn → main process bootstrap → renderer first paint → first interactive.
+- **Welcome / project picker** — when no project is loaded.
+- **Recent projects** — listing, rendering, icon resolution.
+- **Project open flow** — `project.openRepo` IPC, the load + bind path.
+- **Remote runtime** — `remoteRuntime.listTargets`, snapshot, connection lifecycle.
+- **iOS / phone pairing** — pairing status, auth, etc.
+- **App shell chrome** — sidebar, header, route transitions (the AppShell wrapping all tabs).
+
+Use `ade-autoresearch boot` to run an optimization cycle against this surface.
+
+## Scenarios this tab is benchmarked against
+
+Defined in `apps/desktop/src/renderer/perf/scenarios/boot.ts`:
+
+- `boot.cold-paint` — measures FCP / LCP / INP during cold launch (no driving).
+- `boot.recent-projects` — `project.listRecent` IPC + welcome render.
+- `boot.open-project` — `project.openRepo` round-trip for perf-pass.
+- `boot.remote-runtime` — `remoteRuntime.listTargets` cost.
+- `boot.idle-welcome` — 20s idle on welcome (no project) — catches background pollers.
+- `boot.stress-launch` — 2min idle from cold — catches startup leaks.
+
+The "no project" scenarios should be run with `--no-project` so the welcome screen renders:
+
+```bash
+node scripts/run-perf-scenario.mjs boot.idle-welcome run-id --no-project
+```
+
+## Patterns
+
+_No patterns recorded yet — populated by the first `ade-autoresearch boot` run._
+
+
diff --git a/.agents/skills/ade-perf-lanes/SKILL.md b/.agents/skills/ade-perf-lanes/SKILL.md
new file mode 100644
index 000000000..df30ee355
--- /dev/null
+++ b/.agents/skills/ade-perf-lanes/SKILL.md
@@ -0,0 +1,47 @@
+---
+name: ade-perf-lanes
+description: Performance patterns discovered for ADE's Lanes tab. Read before
+  editing files under apps/desktop/src/renderer/components/lanes/** or
+  apps/desktop/src/main/services/lanes/**. Append-only knowledge base populated
+  by ade-autoresearch runs. Skip patterns that contradict the current scenario
+  contract.
+metadata:
+  author: ade-autoresearch
+  version: 0.1.0
+  status: seed
+---
+
+# ade-perf-lanes
+
+Patterns discovered for the Lanes tab. Each entry has run-traced provenance — do not delete entries without explicit user approval.
+
+## How to use this file
+
+- Read all entries before making any change in lanes code.
+- If a proposed change conflicts with an entry: prefer the entry. If you believe you can do better, run `ade-autoresearch lanes` and prove it with metrics.
+- New entries are appended by `ade-autoresearch` at the end of each run.
+
+## Scenarios this tab is benchmarked against
+
+Defined in `apps/desktop/src/renderer/perf/scenarios/lanes.ts`:
+
+- `lanes.cold-list` — cold open of /lanes route.
+- `lanes.switch-rapid` — fast route switching to/from /lanes.
+- `lanes.idle-at-rest` — 30s on /lanes, measures background polling cost.
+- `lanes.stress-poll` — 2min on /lanes, catches leaks.
+- `lanes.scroll-list` — scroll the lanes list repeatedly.
+ +## Patterns + +_No patterns recorded yet — populated by the first `ade-autoresearch lanes` run._ + + diff --git a/apps/desktop/src/main/main.ts b/apps/desktop/src/main/main.ts index 2a69c7240..e7b2eaa73 100644 --- a/apps/desktop/src/main/main.ts +++ b/apps/desktop/src/main/main.ts @@ -8,6 +8,9 @@ type NodePtyType = typeof NodePty; import { isAdeMcpNamedPipePath } from "../shared/adeMcpIpc"; import { registerIpc } from "./services/ipc/registerIpc"; import { createFileLogger } from "./services/logging/logger"; +import { initPerfRunFromEnv } from "./services/perf/perfLog"; +import { startMetricsSampler } from "./services/perf/metricsSampler"; +import { registerPerfIpcHandlers } from "./services/perf/perfIpc"; import { openKvDb } from "./services/state/kvDb"; import { ensureAdeDirs } from "./services/state/projectState"; import { @@ -816,6 +819,12 @@ app.on("open-file", (event, filePath) => { }); app.whenReady().then(async () => { + // Perf run init — must come first so subsequent IPC + sampler hooks can see the active run. + const perfRun = initPerfRunFromEnv(); + if (perfRun) { + startMetricsSampler(); + } + /** Canonical artifacts dir for the active project; ade-artifact:// only serves under this path. */ let adeArtifactAllowedDir: string | null = null; @@ -5569,6 +5578,11 @@ app.whenReady().then(async () => { // Explicit project launches still bind a project before the renderer boots; // normal launches stay on the welcome/recent-project surface. + + registerPerfIpcHandlers(); + + // Restore the startup project before the renderer boots so packaged launches + // do not flash into the welcome state and lose the previous project context. 
if (shouldOpenStartupProject && startupProject.rootPath) { try { await switchProjectFromDialog(startupProject.rootPath); diff --git a/apps/desktop/src/main/services/ipc/registerIpc.ts b/apps/desktop/src/main/services/ipc/registerIpc.ts index 23497d91b..bbed2fdc8 100644 --- a/apps/desktop/src/main/services/ipc/registerIpc.ts +++ b/apps/desktop/src/main/services/ipc/registerIpc.ts @@ -9,6 +9,7 @@ import path from "node:path"; import { fileURLToPath } from "node:url"; import { IPC } from "../../../shared/ipc"; import { getModelById } from "../../../shared/modelRegistry"; +import { appendEvent as perfAppend, isRunActive as isPerfRunActive } from "../perf/perfLog"; import { buildPrAiResolutionContextKey } from "../../../shared/types"; import { launchPrIssueResolutionChat, previewPrIssueResolutionPrompt } from "../prs/prIssueResolver"; import { launchRebaseResolutionChat } from "../prs/prRebaseResolver"; @@ -2072,6 +2073,16 @@ export function registerIpc({ durationMs: number; failed: boolean; }) => { + if (isPerfRunActive()) { + perfAppend({ + ts: Date.now(), + kind: "ipcInvoke", + channel: input.channel, + winId: input.winId, + durationMs: input.durationMs, + failed: input.failed, + }); + } const key = `${input.winId ?? "none"}:${input.channel}`; const existing = ipcInvokeAggregates.get(key) ?? 
{ channel: input.channel, diff --git a/apps/desktop/src/main/services/perf/aggregator.ts b/apps/desktop/src/main/services/perf/aggregator.ts new file mode 100644 index 000000000..fd3f1dea1 --- /dev/null +++ b/apps/desktop/src/main/services/perf/aggregator.ts @@ -0,0 +1,378 @@ +import { readFileSync, writeFileSync, existsSync } from "node:fs"; +import { homedir } from "node:os"; +import { join } from "node:path"; + +type Event = { ts: number; kind: string; [key: string]: unknown }; + +type ProcessMetricSample = { + ts: number; + processes: Array<{ + pid: number; + type: string; + cpuPercent: number; + workingSetSizeKb: number; + }>; + mainRss: number; + mainHeapUsed: number; +}; + +type RendererMemSample = { ts: number; usedMB: number }; + +type WebVitalSample = { + ts: number; + metric: "INP" | "FCP" | "CLS" | "LCP"; + value: number; + scenario: string | null; +}; + +type IpcSample = { + ts: number; + channel: string; + durationMs: number; + failed: boolean; + scenario: string | null; +}; + +type MeasureSample = { + ts: number; + name: string; + durationMs: number; + scenario: string | null; +}; + +type LongTaskSample = { ts: number; durationMs: number; scenario: string | null }; + +type Summary = { + runId: string; + startedAt: number; + endedAt: number; + durationMs: number; + scenarios: Record< + string, + { + startedAt: number; + endedAt: number; + durationMs: number; + ok: boolean; + smokeFailures: string[]; + } + >; + fitness: { + score: number; + components: { + interactionLatencyP95: number; + inpP95: number; + rendererHeapGrowthMB: number; + mainCpuSeconds: number; + ipcP95TopChannels: number; + longTaskCount: number; + }; + }; + ipc: { + perChannel: Array<{ + channel: string; + count: number; + p50: number; + p95: number; + max: number; + failedCount: number; + }>; + slowChannels: Array<{ channel: string; p95: number; count: number }>; + }; + marks: Array<{ name: string; count: number; p50: number; p95: number; max: number }>; + webVitals: { + inpP95: 
number;
+    inpSamples: number;
+    fcp: number | null;
+    cls: number;
+    longTaskCount: number;
+    longTaskTotalMs: number;
+  };
+  process: {
+    mainCpuPercentP95: number;
+    mainCpuSecondsApprox: number;
+    rendererPeakRssKb: number;
+    gpuPeakRssKb: number;
+    rendererHeapGrowthMB: number;
+    rendererHeapPeakMB: number;
+  };
+};
+
+function percentile(sorted: number[], p: number): number {
+  if (sorted.length === 0) return 0;
+  const idx = Math.min(sorted.length - 1, Math.max(0, Math.floor((sorted.length - 1) * p)));
+  return sorted[idx]!;
+}
+
+function readEvents(path: string): Event[] {
+  if (!existsSync(path)) return [];
+  const text = readFileSync(path, "utf8");
+  const out: Event[] = [];
+  for (const line of text.split("\n")) {
+    if (!line.trim()) continue;
+    try {
+      out.push(JSON.parse(line) as Event);
+    } catch {
+      // skip malformed line
+    }
+  }
+  return out;
+}
+
+export function aggregate(runId: string): Summary {
+  const dir = join(homedir(), ".ade", "perf-runs", runId);
+  const eventsPath = join(dir, "events.jsonl");
+  const events = readEvents(eventsPath);
+
+  if (events.length === 0) {
+    throw new Error(`No events found at ${eventsPath}`);
+  }
+
+  const startedAt = events[0]!.ts;
+  const endedAt = events[events.length - 1]!.ts;
+
+  // Group scenarios.
+  type ScenarioState = {
+    startedAt: number;
+    endedAt: number;
+    ok: boolean;
+    smokeFailures: string[];
+  };
+  const scenarios: Record<string, ScenarioState> = {};
+  let currentScenario: string | null = null;
+
+  const processSamples: ProcessMetricSample[] = [];
+  const rendererMem: RendererMemSample[] = [];
+  const webVitals: WebVitalSample[] = [];
+  const longTasks: LongTaskSample[] = [];
+  const ipcCalls: IpcSample[] = [];
+  const measures: MeasureSample[] = [];
+
+  for (const ev of events) {
+    switch (ev.kind) {
+      case "scenarioStart": {
+        const name = String(ev.scenario ??
"unknown"); + scenarios[name] = { + startedAt: ev.ts, + endedAt: ev.ts, + ok: false, + smokeFailures: [], + }; + currentScenario = name; + break; + } + case "scenarioEnd": { + const name = String(ev.scenario ?? currentScenario ?? "unknown"); + const state = scenarios[name]; + if (state) { + state.endedAt = ev.ts; + state.ok = ev.ok === true; + if (Array.isArray(ev.smokeFailures)) { + state.smokeFailures = ev.smokeFailures.map(String); + } + } + currentScenario = null; + break; + } + case "processMetrics": { + processSamples.push(ev as unknown as ProcessMetricSample); + break; + } + case "rendererMemory": { + rendererMem.push({ ts: ev.ts, usedMB: Number(ev.usedMB ?? 0) }); + break; + } + case "webVital": { + webVitals.push({ + ts: ev.ts, + metric: ev.metric as WebVitalSample["metric"], + value: Number(ev.value ?? 0), + scenario: currentScenario, + }); + break; + } + case "longTask": { + longTasks.push({ + ts: ev.ts, + durationMs: Number(ev.durationMs ?? 0), + scenario: currentScenario, + }); + break; + } + case "ipcInvoke": { + ipcCalls.push({ + ts: ev.ts, + channel: String(ev.channel ?? "unknown"), + durationMs: Number(ev.durationMs ?? 0), + failed: ev.failed === true, + scenario: currentScenario, + }); + break; + } + case "measure": { + measures.push({ + ts: ev.ts, + name: String(ev.name ?? "unknown"), + durationMs: Number(ev.durationMs ?? 0), + scenario: currentScenario, + }); + break; + } + default: + break; + } + } + + // IPC per-channel stats. + const ipcByChannel = new Map(); + const ipcFailed = new Map(); + for (const c of ipcCalls) { + const arr = ipcByChannel.get(c.channel) ?? []; + arr.push(c.durationMs); + ipcByChannel.set(c.channel, arr); + if (c.failed) ipcFailed.set(c.channel, (ipcFailed.get(c.channel) ?? 
0) + 1);
+  }
+  const ipcPerChannel = [...ipcByChannel.entries()].map(([channel, durations]) => {
+    const sorted = [...durations].sort((a, b) => a - b);
+    return {
+      channel,
+      count: sorted.length,
+      p50: Math.round(percentile(sorted, 0.5)),
+      p95: Math.round(percentile(sorted, 0.95)),
+      max: Math.round(Math.max(...sorted)),
+      failedCount: ipcFailed.get(channel) ?? 0,
+    };
+  });
+  ipcPerChannel.sort((a, b) => b.p95 - a.p95);
+  const slowChannels = ipcPerChannel.filter((c) => c.p95 >= 120).map((c) => ({
+    channel: c.channel,
+    p95: c.p95,
+    count: c.count,
+  }));
+  const ipcP95TopChannels = ipcPerChannel
+    .slice(0, 5)
+    .reduce((sum, c) => sum + c.p95, 0);
+
+  // Marks per-name stats.
+  const measureByName = new Map<string, number[]>();
+  for (const m of measures) {
+    const arr = measureByName.get(m.name) ?? [];
+    arr.push(m.durationMs);
+    measureByName.set(m.name, arr);
+  }
+  const marks = [...measureByName.entries()].map(([name, durations]) => {
+    const sorted = [...durations].sort((a, b) => a - b);
+    return {
+      name,
+      count: sorted.length,
+      p50: Math.round(percentile(sorted, 0.5)),
+      p95: Math.round(percentile(sorted, 0.95)),
+      max: Math.round(Math.max(...sorted)),
+    };
+  });
+  marks.sort((a, b) => b.p95 - a.p95);
+  const interactionLatencyP95 = marks.slice(0, 3).reduce((sum, m) => sum + m.p95, 0);
+
+  // Web Vitals stats.
+  const inpValues = webVitals.filter((w) => w.metric === "INP").map((w) => w.value);
+  const inpSorted = [...inpValues].sort((a, b) => a - b);
+  const inpP95 = Math.round(percentile(inpSorted, 0.95));
+  const fcpVals = webVitals.filter((w) => w.metric === "FCP").map((w) => w.value);
+  const fcp = fcpVals.length > 0 ? Math.round(fcpVals[0]!) : null;
+  const cls = webVitals
+    .filter((w) => w.metric === "CLS")
+    .reduce((max, w) => Math.max(max, w.value), 0);
+  const longTaskCount = longTasks.length;
+  const longTaskTotalMs = Math.round(longTasks.reduce((sum, t) => sum + t.durationMs, 0));
+
+  // Process stats.
+  const mainCpu = processSamples.map(
+    (s) => s.processes.find((p) => p.type === "Browser")?.cpuPercent ?? 0
+  );
+  const mainCpuSorted = [...mainCpu].sort((a, b) => a - b);
+  const mainCpuPercentP95 = percentile(mainCpuSorted, 0.95);
+  // Approximate CPU-seconds: avg percent * total run seconds * 0.01.
+  const runSeconds = (endedAt - startedAt) / 1000;
+  const mainCpuAvg =
+    mainCpu.length > 0 ? mainCpu.reduce((a, b) => a + b, 0) / mainCpu.length : 0;
+  const mainCpuSecondsApprox = Math.round(mainCpuAvg * runSeconds * 0.01 * 100) / 100;
+
+  const rendererPeakRssKb = processSamples.reduce((peak, s) => {
+    const r = s.processes
+      .filter((p) => p.type !== "Browser" && p.type !== "GPU" && p.type !== "Utility")
+      .reduce((m, p) => Math.max(m, p.workingSetSizeKb), 0);
+    return Math.max(peak, r);
+  }, 0);
+  const gpuPeakRssKb = processSamples.reduce((peak, s) => {
+    const g = s.processes
+      .filter((p) => p.type === "GPU")
+      .reduce((m, p) => Math.max(m, p.workingSetSizeKb), 0);
+    return Math.max(peak, g);
+  }, 0);
+
+  const rendererHeapPeakMB = rendererMem.reduce((m, s) => Math.max(m, s.usedMB), 0);
+  const rendererHeapStartMB = rendererMem.length > 0 ? rendererMem[0]!.usedMB : 0;
+  const rendererHeapEndMB =
+    rendererMem.length > 0 ? rendererMem[rendererMem.length - 1]!.usedMB : 0;
+  const rendererHeapGrowthMB = Math.max(0, rendererHeapEndMB - rendererHeapStartMB);
+
+  // Fitness (lower = better).
+ const fitnessScore = + 1.0 * interactionLatencyP95 + + 0.8 * inpP95 + + 0.5 * rendererHeapGrowthMB + + 0.3 * mainCpuSecondsApprox + + 0.2 * ipcP95TopChannels + + 0.2 * longTaskCount; + + const summary: Summary = { + runId, + startedAt, + endedAt, + durationMs: endedAt - startedAt, + scenarios: Object.fromEntries( + Object.entries(scenarios).map(([name, s]) => [ + name, + { + startedAt: s.startedAt, + endedAt: s.endedAt, + durationMs: s.endedAt - s.startedAt, + ok: s.ok, + smokeFailures: s.smokeFailures, + }, + ]) + ), + fitness: { + score: Math.round(fitnessScore * 100) / 100, + components: { + interactionLatencyP95, + inpP95, + rendererHeapGrowthMB, + mainCpuSeconds: mainCpuSecondsApprox, + ipcP95TopChannels, + longTaskCount, + }, + }, + ipc: { perChannel: ipcPerChannel, slowChannels }, + marks, + webVitals: { + inpP95, + inpSamples: inpValues.length, + fcp, + cls: Math.round(cls * 1000) / 1000, + longTaskCount, + longTaskTotalMs, + }, + process: { + mainCpuPercentP95: Math.round(mainCpuPercentP95 * 100) / 100, + mainCpuSecondsApprox, + rendererPeakRssKb, + gpuPeakRssKb, + rendererHeapGrowthMB, + rendererHeapPeakMB, + }, + }; + + writeFileSync(join(dir, "summary.json"), JSON.stringify(summary, null, 2)); + return summary; +} diff --git a/apps/desktop/src/main/services/perf/metricsSampler.ts b/apps/desktop/src/main/services/perf/metricsSampler.ts new file mode 100644 index 000000000..f7a3e3383 --- /dev/null +++ b/apps/desktop/src/main/services/perf/metricsSampler.ts @@ -0,0 +1,43 @@ +import { app } from "electron"; +import { appendEvent, isRunActive } from "./perfLog"; + +const SAMPLE_INTERVAL_MS = 1000; + +let timer: NodeJS.Timeout | null = null; + +export function startMetricsSampler(): void { + if (!isRunActive() || timer) return; + timer = setInterval(() => { + const ts = Date.now(); + try { + const metrics = app.getAppMetrics().map((m) => ({ + pid: m.pid, + type: m.type, + cpuPercent: m.cpu?.percentCPUUsage ?? 
0, + cpuIdleWakeups: m.cpu?.idleWakeupsPerSecond ?? 0, + workingSetSizeKb: m.memory?.workingSetSize ?? 0, + peakWorkingSetSizeKb: m.memory?.peakWorkingSetSize ?? 0, + })); + const mem = process.memoryUsage(); + appendEvent({ + ts, + kind: "processMetrics", + processes: metrics, + mainRss: mem.rss, + mainHeapUsed: mem.heapUsed, + mainHeapTotal: mem.heapTotal, + mainExternal: mem.external, + }); + } catch { + // Ignore — metrics are best effort. + } + }, SAMPLE_INTERVAL_MS); + timer.unref?.(); +} + +export function stopMetricsSampler(): void { + if (timer) { + clearInterval(timer); + timer = null; + } +} diff --git a/apps/desktop/src/main/services/perf/perfIpc.ts b/apps/desktop/src/main/services/perf/perfIpc.ts new file mode 100644 index 000000000..d2d62ab0f --- /dev/null +++ b/apps/desktop/src/main/services/perf/perfIpc.ts @@ -0,0 +1,64 @@ +import { ipcMain } from "electron"; +import { IPC } from "../../../shared/ipc"; +import { aggregate } from "./aggregator"; +import { appendEvent, getActiveRun, isRunActive } from "./perfLog"; + +export type PerfRunConfigForRenderer = { + active: boolean; + runId: string | null; + scenario: string | null; + initialRoute: string | null; + allowClaude: boolean; + modelOverride: string | null; +}; + +export type PerfRecordEventArgs = { + ts?: number; + kind: string; + [key: string]: unknown; +}; + +export function registerPerfIpcHandlers(): void { + ipcMain.handle(IPC.perfGetConfig, (): PerfRunConfigForRenderer => { + const run = getActiveRun(); + return { + active: run !== null, + runId: run?.runId ?? null, + scenario: run?.scenario ?? null, + initialRoute: run?.initialRoute ?? null, + allowClaude: run?.allowClaude ?? false, + modelOverride: run?.modelOverride ?? null, + }; + }); + + ipcMain.handle(IPC.perfRecordEvent, (_event, args: PerfRecordEventArgs) => { + if (!isRunActive()) return { ok: false, reason: "no-active-run" }; + const ts = typeof args.ts === "number" ? 
args.ts : Date.now();
+    const { kind, ts: _ignored, ...rest } = args;
+    appendEvent({ ts, kind: kind as never, ...rest });
+    return { ok: true };
+  });
+
+  ipcMain.handle(IPC.perfScenarioComplete, (_event, args: {
+    scenario: string;
+    ok: boolean;
+    smokeFailures?: string[];
+  }) => {
+    if (!isRunActive()) return { ok: false, reason: "no-active-run" };
+    appendEvent({
+      ts: Date.now(),
+      kind: "scenarioEnd",
+      scenario: args.scenario,
+      ok: args.ok,
+      smokeFailures: args.smokeFailures ?? [],
+    });
+    return { ok: true };
+  });
+
+  ipcMain.handle(IPC.perfFinalize, () => {
+    const run = getActiveRun();
+    if (!run) return { ok: false, reason: "no-active-run" };
+    const summary = aggregate(run.runId);
+    return { ok: true, summary };
+  });
+}
diff --git a/apps/desktop/src/main/services/perf/perfLog.ts b/apps/desktop/src/main/services/perf/perfLog.ts
new file mode 100644
index 000000000..2bd57d07f
--- /dev/null
+++ b/apps/desktop/src/main/services/perf/perfLog.ts
@@ -0,0 +1,95 @@
+import { appendFileSync, mkdirSync, existsSync } from "node:fs";
+import { homedir } from "node:os";
+import { join } from "node:path";
+
+export type PerfEventKind =
+  | "scenarioStart"
+  | "scenarioEnd"
+  | "mark"
+  | "measure"
+  | "webVital"
+  | "longTask"
+  | "ipcInvoke"
+  | "processMetrics"
+  | "rendererMemory"
+  | "note";
+
+export type PerfEvent = {
+  ts: number;
+  kind: PerfEventKind;
+} & Record<string, unknown>;
+
+type PerfRunConfig = {
+  runId: string;
+  scenario: string | null;
+  initialRoute: string | null;
+  allowClaude: boolean;
+  modelOverride: string | null;
+  startedAt: number;
+  dir: string;
+  eventsPath: string;
+  summaryPath: string;
+};
+
+let active: PerfRunConfig | null = null;
+
+function readEnvRunId(): string | null {
+  const raw = process.env.ADE_PERF_RUN_ID;
+  return raw && raw.length > 0 ? raw : null;
+}
+
+function readEnvScenario(): string | null {
+  const raw = process.env.ADE_PERF_SCENARIO;
+  return raw && raw.length > 0 ?
raw : null;
+}
+
+function ensureDir(path: string): void {
+  if (!existsSync(path)) mkdirSync(path, { recursive: true });
+}
+
+export function initPerfRunFromEnv(): PerfRunConfig | null {
+  if (active) return active;
+  const runId = readEnvRunId();
+  if (!runId) return null;
+  const dir = join(homedir(), ".ade", "perf-runs", runId);
+  ensureDir(dir);
+  active = {
+    runId,
+    scenario: readEnvScenario(),
+    initialRoute: process.env.ADE_PERF_INITIAL_ROUTE ?? null,
+    allowClaude: process.env.ADE_PERF_ALLOW_CLAUDE === "1",
+    modelOverride: process.env.ADE_MODEL_OVERRIDE ?? null,
+    startedAt: Date.now(),
+    dir,
+    eventsPath: join(dir, "events.jsonl"),
+    summaryPath: join(dir, "summary.json"),
+  };
+  appendEvent({
+    kind: "note",
+    ts: active.startedAt,
+    note: "perfRunStarted",
+    runId: active.runId,
+    scenario: active.scenario,
+    initialRoute: active.initialRoute,
+    allowClaude: active.allowClaude,
+    modelOverride: active.modelOverride,
+  });
+  return active;
+}
+
+export function getActiveRun(): PerfRunConfig | null {
+  return active;
+}
+
+export function isRunActive(): boolean {
+  return active !== null;
+}
+
+export function appendEvent(event: PerfEvent): void {
+  if (!active) return;
+  try {
+    appendFileSync(active.eventsPath, JSON.stringify(event) + "\n");
+  } catch {
+    // Swallow — perf logging never breaks the app.
+  }
+}
diff --git a/apps/desktop/src/preload/global.d.ts b/apps/desktop/src/preload/global.d.ts
index e4e59fbb9..4993c9c09 100644
--- a/apps/desktop/src/preload/global.d.ts
+++ b/apps/desktop/src/preload/global.d.ts
@@ -2395,6 +2395,27 @@ declare global {
       updateQuitAndInstall: () => Promise;
       updateDismissInstalledNotice: () => Promise;
       onUpdateEvent: (cb: (snapshot: AutoUpdateSnapshot) => void) => () => void;
+      perf: {
+        getConfig: () => Promise<{
+          active: boolean;
+          runId: string | null;
+          scenario: string | null;
+          initialRoute: string | null;
+          allowClaude: boolean;
+          modelOverride: string | null;
+        }>;
+        recordEvent: (event: {
+          kind: string;
+          ts?: number;
+          [key: string]: unknown;
+        }) => Promise<{ ok: boolean; reason?: string }>;
+        scenarioComplete: (args: {
+          scenario: string;
+          ok: boolean;
+          smokeFailures?: string[];
+        }) => Promise<{ ok: boolean; reason?: string }>;
+        finalize: () => Promise<{ ok: boolean; reason?: string; summary?: unknown }>;
+      };
     };
   }
 }
diff --git a/apps/desktop/src/preload/preload.ts b/apps/desktop/src/preload/preload.ts
index 1b5c87ef7..075eaebe0 100644
--- a/apps/desktop/src/preload/preload.ts
+++ b/apps/desktop/src/preload/preload.ts
@@ -8398,4 +8398,15 @@ contextBridge.exposeInMainWorld("ade", {
     ipcRenderer.on(IPC.updateEvent, listener);
     return () => ipcRenderer.removeListener(IPC.updateEvent, listener);
   },
+  perf: {
+    getConfig: () => ipcRenderer.invoke(IPC.perfGetConfig),
+    recordEvent: (event: { kind: string; ts?: number; [k: string]: unknown }) =>
+      ipcRenderer.invoke(IPC.perfRecordEvent, event),
+    scenarioComplete: (args: {
+      scenario: string;
+      ok: boolean;
+      smokeFailures?: string[];
+    }) => ipcRenderer.invoke(IPC.perfScenarioComplete, args),
+    finalize: () => ipcRenderer.invoke(IPC.perfFinalize),
+  },
 });
diff --git a/apps/desktop/src/renderer/main.tsx b/apps/desktop/src/renderer/main.tsx
index ce1040f7f..9cf5cb28e 100644
--- a/apps/desktop/src/renderer/main.tsx
+++ b/apps/desktop/src/renderer/main.tsx
@@ -8,6 +8,7 @@
import geistMonoVariableUrl from "../../node_modules/geist/dist/fonts/geist-mono
 import { App } from "./components/app/App";
 import { RendererErrorBoundary } from "./components/app/RendererErrorBoundary";
 import { logRendererDebugEvent } from "./lib/debugLog";
+import { initPerfRuntime } from "./perf/harness";
 
 (function injectFontFaces() {
   const style = document.createElement("style");
@@ -161,3 +162,5 @@ createRoot(document.getElementById("root")!).render(
 );
+
+void initPerfRuntime();
diff --git a/apps/desktop/src/renderer/perf/harness/index.ts b/apps/desktop/src/renderer/perf/harness/index.ts
new file mode 100644
index 000000000..72af81573
--- /dev/null
+++ b/apps/desktop/src/renderer/perf/harness/index.ts
@@ -0,0 +1,44 @@
+import { setPerfActive, startRendererMemorySampler } from "../markers";
+import { installWebVitalsObservers } from "../webVitals";
+import { runScenario } from "../scenarios";
+
+export async function initPerfRuntime(): Promise<void> {
+  const cfg = await window.ade?.perf?.getConfig?.();
+  if (!cfg?.active) return;
+
+  setPerfActive(true);
+  installWebVitalsObservers();
+  startRendererMemorySampler();
+
+  window.ade?.perf?.recordEvent({
+    kind: "note",
+    ts: Date.now(),
+    note: "rendererPerfReady",
+    initialRoute: cfg.initialRoute ?? null,
+    scenario: cfg.scenario ?? null,
+  });
+
+  if (cfg.initialRoute) {
+    // Navigate immediately so the warm-mode shortcut lands on the requested tab.
+    const target = cfg.initialRoute.startsWith("/")
+      ? `#${cfg.initialRoute}`
+      : cfg.initialRoute.startsWith("#")
+        ? cfg.initialRoute
+        : `#/${cfg.initialRoute}`;
+    if (window.location.hash !== target) {
+      window.location.hash = target;
+    }
+  }
+
+  if (!cfg.scenario) return;
+
+  // Defer scenario kick-off until after first paint so we measure realistic cold flows.
+  const start = () => {
+    setTimeout(() => runScenario(cfg.scenario as string), 1500);
+  };
+  if (document.readyState === "complete") {
+    start();
+  } else {
+    window.addEventListener("load", start, { once: true });
+  }
+}
diff --git a/apps/desktop/src/renderer/perf/markers.ts b/apps/desktop/src/renderer/perf/markers.ts
new file mode 100644
index 000000000..6d9022980
--- /dev/null
+++ b/apps/desktop/src/renderer/perf/markers.ts
@@ -0,0 +1,81 @@
+let perfActive = false;
+
+export function setPerfActive(active: boolean): void {
+  perfActive = active;
+}
+
+export function isPerfActive(): boolean {
+  return perfActive;
+}
+
+export function perfMark(name: string, extra?: Record<string, unknown>): void {
+  if (!perfActive) return;
+  try {
+    performance.mark(name);
+  } catch {
+    // best effort
+  }
+  window.ade?.perf?.recordEvent({
+    kind: "mark",
+    ts: Date.now(),
+    name,
+    extra: extra ?? null,
+  });
+}
+
+export function perfMeasure(name: string, startMark: string, endMark?: string): number | null {
+  if (!perfActive) return null;
+  try {
+    const entry = endMark
+      ? performance.measure(name, startMark, endMark)
+      : performance.measure(name, startMark);
+    const durationMs = Math.round(entry.duration * 100) / 100;
+    window.ade?.perf?.recordEvent({
+      kind: "measure",
+      ts: Date.now(),
+      name,
+      durationMs,
+      startMark,
+      endMark: endMark ?? null,
+    });
+    return durationMs;
+  } catch {
+    return null;
+  }
+}
+
+const memoryReader = () => {
+  const perf = performance as Performance & {
+    memory?: { usedJSHeapSize?: number; totalJSHeapSize?: number };
+  };
+  if (!perf.memory) return null;
+  return {
+    usedMB: Math.round((perf.memory.usedJSHeapSize ?? 0) / 1024 / 1024),
+    totalMB: Math.round((perf.memory.totalJSHeapSize ??
0) / 1024 / 1024),
+  };
+};
+
+let memorySamplerId: number | null = null;
+
+export function startRendererMemorySampler(): void {
+  if (memorySamplerId !== null || !perfActive) return;
+  const sample = () => {
+    const mem = memoryReader();
+    if (!mem) return;
+    window.ade?.perf?.recordEvent({
+      kind: "rendererMemory",
+      ts: Date.now(),
+      usedMB: mem.usedMB,
+      totalMB: mem.totalMB,
+    });
+  };
+  sample();
+  memorySamplerId = window.setInterval(sample, 1000);
+}
+
+export function stopRendererMemorySampler(): void {
+  if (memorySamplerId !== null) {
+    window.clearInterval(memorySamplerId);
+    memorySamplerId = null;
+  }
+}
diff --git a/apps/desktop/src/renderer/perf/scenarios/boot.ts b/apps/desktop/src/renderer/perf/scenarios/boot.ts
new file mode 100644
index 000000000..b34d2f20d
--- /dev/null
+++ b/apps/desktop/src/renderer/perf/scenarios/boot.ts
@@ -0,0 +1,101 @@
+import { registerScenario } from "./index";
+
+/**
+ * Boot-tab scenarios. These exercise the "main ADE screen" — cold launch,
+ * project picker, opening projects, remote runtime status, iOS pairing.
+ * Most of these prefer direct IPC calls over DOM clicks so they are fast
+ * and reproducible. Scenarios that need a no-project state should be launched
+ * with `--no-project` on the runner so the welcome screen renders.
+ */
+
+registerScenario({
+  id: "boot.cold-paint",
+  description: "Cold launch — measures time-to-first-paint and to-interactive without driving any UI.",
+  run: async (ctx) => {
+    ctx.mark("boot.cold-paint.tick");
+    await ctx.idle(8_000);
+    // FCP/LCP/INP samples accumulate in webVitals during this idle window.
+    ctx.assert(document.body.children.length > 0, "no body content after cold launch");
+  },
+});
+
+registerScenario({
+  id: "boot.recent-projects",
+  description: "Measures recent-projects IPC + welcome list render.",
+  run: async (ctx) => {
+    ctx.mark("recent.fetch.start");
+    const recent = await window.ade?.project?.listRecent();
+    ctx.mark("recent.fetch.done");
+    ctx.measure("recent.fetch", "recent.fetch.start", "recent.fetch.done");
+    ctx.assert(Array.isArray(recent), "project.listRecent did not return an array");
+    await ctx.idle(1_500);
+  },
+});
+
+registerScenario({
+  id: "boot.open-project",
+  description: "Times opening the perf-pass project via IPC openRepo.",
+  run: async (ctx) => {
+    const perfPass = "/Users/admin/Projects/perf pass";
+    ctx.mark("openRepo.start");
+    try {
+      // openRepo opens a directory chooser in some flows; we call with a path arg if supported,
+      // otherwise the IPC errors and we record that.
+      // The renderer-side preload signature varies; record what we got.
+      const result = await (window.ade?.project as unknown as {
+        openRepo?: (args?: { rootPath?: string }) => Promise;
+      })?.openRepo?.({ rootPath: perfPass });
+      ctx.mark("openRepo.done");
+      ctx.measure("openRepo", "openRepo.start", "openRepo.done");
+      ctx.assert(result !== undefined, "openRepo returned undefined");
+    } catch (err) {
+      ctx.mark("openRepo.done");
+      ctx.measure("openRepo", "openRepo.start", "openRepo.done");
+      ctx.assert(false, `openRepo threw: ${err instanceof Error ?
err.message : String(err)}`);
+    }
+    await ctx.idle(3_000);
+  },
+});
+
+registerScenario({
+  id: "boot.remote-runtime",
+  description: "Times the remote runtime list-targets + snapshot path.",
+  run: async (ctx) => {
+    ctx.mark("remote.list.start");
+    try {
+      const targets = await (window.ade as unknown as {
+        remoteRuntime?: { listTargets?: () => Promise };
+      }).remoteRuntime?.listTargets?.();
+      ctx.mark("remote.list.done");
+      ctx.measure("remote.list", "remote.list.start", "remote.list.done");
+      ctx.assert(targets === undefined || Array.isArray(targets), "remote.listTargets returned a non-array");
+    } catch (err) {
+      ctx.mark("remote.list.done");
+      ctx.measure("remote.list", "remote.list.start", "remote.list.done");
+      ctx.assert(false, `remote.listTargets threw: ${err instanceof Error ? err.message : String(err)}`);
+    }
+    await ctx.idle(1_500);
+  },
+});
+
+registerScenario({
+  id: "boot.idle-welcome",
+  description: "Sit on the welcome screen for 20s — measures at-rest CPU and background pollers when no project is loaded.",
+  run: async (ctx) => {
+    // The harness skips ADE_PERF_INITIAL_ROUTE for this scenario by not setting it.
+    await ctx.idle(20_000);
+    ctx.assert(document.body.children.length > 0, "body emptied during idle");
+  },
+});
+
+registerScenario({
+  id: "boot.stress-launch",
+  description: "Long-tail at boot — 2 minutes of cold-launched idle to catch leaks in startup pollers.",
+  run: async (ctx) => {
+    for (let i = 0; i < 6; i++) {
+      await ctx.idle(20_000);
+      ctx.mark(`boot.stress.tick.${i}`);
+    }
+    ctx.assert(document.body.children.length > 0, "body emptied during boot stress");
+  },
+});
diff --git a/apps/desktop/src/renderer/perf/scenarios/index.ts b/apps/desktop/src/renderer/perf/scenarios/index.ts
new file mode 100644
index 000000000..8dc21ee1a
--- /dev/null
+++ b/apps/desktop/src/renderer/perf/scenarios/index.ts
@@ -0,0 +1,164 @@
+import { perfMark, perfMeasure } from "../markers";
+
+export type ScenarioContext = {
+  mark: (name: string) => void;
+  measure: (name: string, startMark: string, endMark?: string) => number | null;
+  /** Wait until the test condition returns truthy, or timeoutMs elapses. Returns true if found. */
+  waitFor: (testFn: () => boolean, timeoutMs?: number) => Promise<boolean>;
+  /** Wait until a DOM element with [data-testid=id] appears. */
+  waitForTestId: (id: string, timeoutMs?: number) => Promise<HTMLElement | null>;
+  /** Click an element by testid. Throws if not found within timeout. */
+  clickTestId: (id: string, timeoutMs?: number) => Promise<void>;
+  /** Navigate (hash-router style). */
+  navigate: (route: string) => Promise<void>;
+  /** Idle for ms, lets the event loop breathe and metrics samplers tick. */
+  idle: (ms: number) => Promise<void>;
+  /** Assert an invariant; pushes failure to smokeFailures if false.
*/
+  assert: (condition: boolean, message: string) => void;
+};
+
+export type Scenario = {
+  id: string;
+  description: string;
+  requiresClaude?: boolean;
+  run: (ctx: ScenarioContext) => Promise<void>;
+};
+
+const registry = new Map<string, Scenario>();
+
+export function registerScenario(scenario: Scenario): void {
+  if (registry.has(scenario.id)) {
+    throw new Error(`Duplicate perf scenario id: ${scenario.id}`);
+  }
+  registry.set(scenario.id, scenario);
+}
+
+export function getScenario(id: string): Scenario | null {
+  return registry.get(id) ?? null;
+}
+
+export function listScenarios(): string[] {
+  return [...registry.keys()];
+}
+
+function delay(ms: number): Promise<void> {
+  return new Promise((resolve) => setTimeout(resolve, ms));
+}
+
+async function waitFor(testFn: () => boolean, timeoutMs = 10_000): Promise<boolean> {
+  const deadline = performance.now() + timeoutMs;
+  while (performance.now() < deadline) {
+    if (testFn()) return true;
+    await delay(50);
+  }
+  return false;
+}
+
+async function waitForTestId(id: string, timeoutMs = 10_000): Promise<HTMLElement | null> {
+  const deadline = performance.now() + timeoutMs;
+  while (performance.now() < deadline) {
+    const el = document.querySelector<HTMLElement>(`[data-testid="${id}"]`);
+    if (el && el.offsetParent !== null) return el;
+    await delay(50);
+  }
+  return null;
+}
+
+function makeContext(smokeFailures: string[]): ScenarioContext {
+  return {
+    mark: (name) => perfMark(name),
+    measure: (name, startMark, endMark) => perfMeasure(name, startMark, endMark),
+    waitFor,
+    waitForTestId,
+    clickTestId: async (id, timeoutMs = 5_000) => {
+      const el = await waitForTestId(id, timeoutMs);
+      if (!el) {
+        throw new Error(`clickTestId timeout: ${id}`);
+      }
+      el.click();
+    },
+    navigate: async (route) => {
+      window.location.hash = route.startsWith("#") ?
route : `#${route}`;
+      await delay(50);
+    },
+    idle: (ms) => delay(ms),
+    assert: (condition, message) => {
+      if (!condition) smokeFailures.push(message);
+    },
+  };
+}
+
+async function loadScenarioModule(scenarioId: string): Promise<void> {
+  const prefix = scenarioId.split(".", 1)[0] ?? scenarioId;
+  switch (prefix) {
+    case "lanes":
+      await import("./lanes");
+      return;
+    case "boot":
+      await import("./boot");
+      return;
+    default:
+      // Best-effort: try direct import. If it fails, the scenario lookup will report not-found.
+      try {
+        await import(/* @vite-ignore */ `./${prefix}`);
+      } catch {
+        // ignore — handled by caller
+      }
+  }
+}
+
+export async function runScenario(scenarioId: string): Promise<void> {
+  await loadScenarioModule(scenarioId);
+
+  const scenario = getScenario(scenarioId);
+  if (!scenario) {
+    window.ade?.perf?.recordEvent({
+      kind: "note",
+      ts: Date.now(),
+      note: "scenarioNotFound",
+      scenario: scenarioId,
+    });
+    return;
+  }
+
+  const cfg = await window.ade?.perf?.getConfig?.();
+  if (scenario.requiresClaude && !cfg?.allowClaude) {
+    await window.ade?.perf?.scenarioComplete({
+      scenario: scenarioId,
+      ok: false,
+      smokeFailures: ["requiresClaude but ADE_PERF_ALLOW_CLAUDE not set"],
+    });
+    return;
+  }
+
+  const smokeFailures: string[] = [];
+  const ctx = makeContext(smokeFailures);
+
+  window.ade?.perf?.recordEvent({
+    kind: "scenarioStart",
+    ts: Date.now(),
+    scenario: scenarioId,
+    description: scenario.description,
+  });
+
+  let ok = true;
+  try {
+    await scenario.run(ctx);
+  } catch (err) {
+    ok = false;
+    smokeFailures.push(err instanceof Error ? err.message : String(err));
+  }
+
+  await window.ade?.perf?.scenarioComplete({
+    scenario: scenarioId,
+    ok: ok && smokeFailures.length === 0,
+    smokeFailures,
+  });
+
+  // Auto-finalize so the harness script can detect summary.json appearing.
+  try {
+    await window.ade?.perf?.finalize();
+  } catch {
+    // ignore
+  }
+}
diff --git a/apps/desktop/src/renderer/perf/scenarios/lanes.ts b/apps/desktop/src/renderer/perf/scenarios/lanes.ts
new file mode 100644
index 000000000..87f4c2df0
--- /dev/null
+++ b/apps/desktop/src/renderer/perf/scenarios/lanes.ts
@@ -0,0 +1,94 @@
+import { registerScenario, type ScenarioContext } from "./index";
+
+async function navigateToLanes(ctx: ScenarioContext): Promise<void> {
+  ctx.mark("nav.lanes.start");
+  await ctx.navigate("/lanes");
+  const present = await ctx.waitFor(
+    () => document.querySelector("[data-route='lanes'], main") !== null,
+    8_000
+  );
+  ctx.mark("nav.lanes.done");
+  ctx.measure("nav.lanes", "nav.lanes.start", "nav.lanes.done");
+  ctx.assert(present, "lanes route did not render main content");
+}
+
+registerScenario({
+  id: "lanes.cold-list",
+  description: "Cold open of /lanes — measures route mount, first render, settle.",
+  run: async (ctx) => {
+    await navigateToLanes(ctx);
+    await ctx.idle(5_000);
+    // Smoke: at least one heading-like element exists.
+    const hasHeading = !!document.querySelector("h1, h2, [role='heading']");
+    ctx.assert(hasHeading, "no heading visible after lanes mount");
+  },
+});
+
+registerScenario({
+  id: "lanes.switch-rapid",
+  description: "Rapid route switching to/from lanes — measures route transition cost.",
+  run: async (ctx) => {
+    const routes = ["/lanes", "/work", "/lanes", "/files", "/lanes", "/prs", "/lanes"];
+    for (let i = 0; i < routes.length; i++) {
+      const route = routes[i]!;
+      const tag = route.replace(/\W+/g, "_") + "_" + i;
+      ctx.mark(`switch.${tag}.start`);
+      await ctx.navigate(route);
+      await ctx.waitFor(() => document.querySelector("main") !== null, 5_000);
+      ctx.mark(`switch.${tag}.done`);
+      ctx.measure(`switch.${tag}`, `switch.${tag}.start`, `switch.${tag}.done`);
+      await ctx.idle(300);
+    }
+    ctx.assert(
+      window.location.hash.includes("lanes") || window.location.pathname.includes("lanes"),
+      "final route is not /lanes"
+    );
+  },
+});
+
+registerScenario({
+  id: "lanes.idle-at-rest",
+  description: "Sit on /lanes for 30s — measures at-rest CPU/memory and background polling.",
+  run: async (ctx) => {
+    await navigateToLanes(ctx);
+    await ctx.idle(30_000);
+    ctx.assert(document.querySelector("main") !== null, "main vanished during idle");
+  },
+});
+
+registerScenario({
+  id: "lanes.stress-poll",
+  description: "Sit on /lanes for 2 minutes — catches memory leaks and polling regressions.",
+  run: async (ctx) => {
+    await navigateToLanes(ctx);
+    // Force window focus loops to exercise visibility-driven pollers.
+    for (let i = 0; i < 6; i++) {
+      await ctx.idle(20_000);
+      ctx.mark(`stress.tick.${i}`);
+    }
+    ctx.assert(document.querySelector("main") !== null, "main vanished during stress");
+  },
+});
+
+registerScenario({
+  id: "lanes.scroll-list",
+  description: "Scroll the lanes list rapidly — measures render-on-scroll cost.",
+  run: async (ctx) => {
+    await navigateToLanes(ctx);
+    await ctx.idle(2_000);
+    const scrollable =
+      document.querySelector("[data-scrollable], main [class*='overflow']") ||
+      document.querySelector("main");
+    if (!scrollable) {
+      ctx.assert(false, "no scrollable container found in lanes");
+      return;
+    }
+    ctx.mark("scroll.start");
+    for (let i = 0; i < 20; i++) {
+      scrollable.scrollTop = (i % 2 === 0 ? scrollable.scrollHeight : 0);
+      await ctx.idle(150);
+    }
+    ctx.mark("scroll.done");
+    ctx.measure("scroll.20-cycles", "scroll.start", "scroll.done");
+  },
+});
diff --git a/apps/desktop/src/renderer/perf/webVitals.ts b/apps/desktop/src/renderer/perf/webVitals.ts
new file mode 100644
index 000000000..355db72c0
--- /dev/null
+++ b/apps/desktop/src/renderer/perf/webVitals.ts
@@ -0,0 +1,81 @@
+import { isPerfActive } from "./markers";
+
+const LONG_TASK_THRESHOLD_MS = 50;
+
+let installed = false;
+
+function send(kind: string, payload: Record<string, unknown>): void {
+  if (!isPerfActive()) return;
+  window.ade?.perf?.recordEvent({
+    kind,
+    ts: Date.now(),
+    ...payload,
+  });
+}
+
+function observe(type: string, callback: (entries: PerformanceEntryList) => void): void {
+  try {
+    const po = new PerformanceObserver((list) => callback(list.getEntries()));
+    po.observe({ type, buffered: true } as PerformanceObserverInit);
+  } catch {
+    // observer type not supported in this runtime — ignore.
+  }
+}
+
+export function installWebVitalsObservers(): void {
+  if (installed) return;
+  installed = true;
+
+  observe("longtask", (entries) => {
+    for (const entry of entries) {
+      if (entry.duration < LONG_TASK_THRESHOLD_MS) continue;
+      send("longTask", {
+        durationMs: Math.round(entry.duration),
+        startTime: Math.round(entry.startTime),
+      });
+    }
+  });
+
+  observe("paint", (entries) => {
+    for (const entry of entries) {
+      if (entry.name === "first-contentful-paint") {
+        send("webVital", {
+          metric: "FCP",
+          value: Math.round(entry.startTime),
+        });
+      }
+    }
+  });
+
+  observe("largest-contentful-paint", (entries) => {
+    for (const entry of entries) {
+      send("webVital", {
+        metric: "LCP",
+        value: Math.round(entry.startTime),
+      });
+    }
+  });
+
+  observe("layout-shift", (entries) => {
+    let total = 0;
+    for (const entry of entries as unknown as Array) {
+      if (entry.hadRecentInput) continue;
+      total += entry.value;
+    }
+    if (total > 0) {
+      send("webVital", { metric: "CLS", value: total });
+    }
+  });
+
+  observe("event", (entries) => {
+    for (const entry of entries as unknown as Array) {
+      if (!entry.interactionId) continue;
+      if (entry.duration < 16) continue;
+      send("webVital", {
+        metric: "INP",
+        value: Math.round(entry.duration),
+        eventName: entry.name,
+      });
+    }
+  });
+}
diff --git a/apps/desktop/src/shared/ipc.ts b/apps/desktop/src/shared/ipc.ts
index ed8845357..b54bbfe5e 100644
--- a/apps/desktop/src/shared/ipc.ts
+++ b/apps/desktop/src/shared/ipc.ts
@@ -773,6 +773,10 @@ export const IPC = {
   notificationsApnsUploadKey: "ade.notifications.apns.uploadKey",
   notificationsApnsClearKey: "ade.notifications.apns.clearKey",
   notificationsApnsSendTestPush: "ade.notifications.apns.sendTestPush",
+  perfGetConfig: "ade.perf.getConfig",
+  perfRecordEvent: "ade.perf.recordEvent",
+  perfFinalize: "ade.perf.finalize",
+  perfScenarioComplete: "ade.perf.scenarioComplete",
 } as const;
 
 export type IpcChannel = (typeof IPC)[keyof typeof IPC];
diff --git a/scripts/perf-launch.mjs
b/scripts/perf-launch.mjs
new file mode 100755
index 000000000..8a1deb2cb
--- /dev/null
+++ b/scripts/perf-launch.mjs
@@ -0,0 +1,96 @@
+#!/usr/bin/env node
+/**
+ * Warm-launch ADE in perf mode pointing at perf-pass on a specific tab.
+ * Perf instrumentation is ON but no scenario auto-runs — for Codex to inspect
+ * and drive interactively. Quits when you SIGINT/kill it.
+ *
+ * Usage:
+ *   scripts/perf-launch.mjs --tab lanes
+ *   scripts/perf-launch.mjs --tab missions --run-id manual
+ *   scripts/perf-launch.mjs --tab boot --no-project
+ *   scripts/perf-launch.mjs --route /settings --project "/path/to/repo"
+ */
+import { spawn } from "node:child_process";
+import { mkdirSync, mkdtempSync } from "node:fs";
+import { homedir, tmpdir } from "node:os";
+import { join } from "node:path";
+
+const argv = process.argv.slice(2);
+const args = {
+  tab: null,
+  route: null,
+  runId: null,
+  projectRoot: null,
+  noProject: false,
+};
+for (let i = 0; i < argv.length; i++) {
+  const a = argv[i];
+  if (a === "--tab") args.tab = argv[++i] ?? null;
+  else if (a === "--route") args.route = argv[++i] ?? null;
+  else if (a === "--run-id") args.runId = argv[++i] ?? null;
+  else if (a === "--project") args.projectRoot = argv[++i] ?? null;
+  else if (a === "--no-project") args.noProject = true;
+  else if (a === "--help" || a === "-h") {
+    console.log(`Usage: perf-launch.mjs --tab [--run-id ] [--project ] [--no-project]
+       perf-launch.mjs --route / [--run-id ] [--project ] [--no-project]`);
+    process.exit(0);
+  }
+}
+
+const route = args.route ?? (args.tab ? `/${args.tab}` : null);
+if (!route) {
+  console.error("perf-launch: must pass --tab or --route /");
+  process.exit(2);
+}
+
+const runId = args.runId ?? `warm-${Date.now()}`;
+mkdirSync(join(homedir(), ".ade", "perf-runs", runId), { recursive: true });
+
+const defaultPerfPass = "/Users/admin/Projects/perf pass";
+const projectRoot = args.noProject
+  ? mkdtempSync(join(tmpdir(), "ade-perf-no-project-"))
+  : (args.projectRoot ??
process.env.ADE_PERF_PASS_DIR ?? defaultPerfPass);
+
+console.log(`[perf-launch] tab=${args.tab ?? "(route)"} route=${route} runId=${runId}`);
+console.log(`[perf-launch] project=${projectRoot}${args.noProject ? " (no-project mode)" : ""}`);
+console.log(`[perf-launch] events → ${join(homedir(), ".ade", "perf-runs", runId, "events.jsonl")}`);
+console.log(`[perf-launch] press Ctrl-C to quit`);
+
+const env = {
+  ...process.env,
+  ADE_PERF_RUN_ID: runId,
+  ADE_PERF_INITIAL_ROUTE: route,
+  ADE_PERF_PASS_DIR: projectRoot,
+  ADE_TRACE_IPC: "verbose",
+  ADE_MODEL_OVERRIDE: process.env.ADE_MODEL_OVERRIDE ?? "gpt-5-codex",
+  ELECTRON_ENABLE_LOGGING: "1",
+};
+delete env.ADE_PERF_SCENARIO;
+
+const child = spawn(
+  "npm",
+  ["run", "dev:desktop", "--", "--project-root", projectRoot],
+  {
+    stdio: ["ignore", "inherit", "inherit"],
+    env,
+    detached: false,
+  }
+);
+
+const stop = (code) => {
+  try {
+    if (!child.killed) {
+      child.kill("SIGTERM");
+      setTimeout(() => {
+        if (!child.killed) child.kill("SIGKILL");
+      }, 5_000).unref();
+    }
+  } catch {
+    // ignore
+  }
+  process.exit(code);
+};
+
+process.on("SIGINT", () => stop(130));
+process.on("SIGTERM", () => stop(143));
+child.on("exit", (code) => process.exit(code ?? 0));
diff --git a/scripts/reset-perf-pass.sh b/scripts/reset-perf-pass.sh
new file mode 100755
index 000000000..cc292a7bd
--- /dev/null
+++ b/scripts/reset-perf-pass.sh
@@ -0,0 +1,32 @@
+#!/usr/bin/env bash
+# Reset the perf-pass throwaway repo to a known seed state.
+# Usage: scripts/reset-perf-pass.sh [path-to-perf-pass-repo]
+set -euo pipefail
+
+PERF_PASS="${1:-${ADE_PERF_PASS_DIR:-/Users/admin/Projects/perf pass}}"
+SEED_TAG="${ADE_PERF_PASS_SEED_TAG:-perf-pass-seed}"
+
+if [[ ! -d "$PERF_PASS/.git" ]]; then
+  echo "[perf-pass] $PERF_PASS is not a git repo" >&2
+  exit 2
+fi
+
+cd "$PERF_PASS"
+
+if !
git rev-parse --verify "$SEED_TAG" >/dev/null 2>&1; then
+  echo "[perf-pass] tagging current HEAD as $SEED_TAG"
+  git tag "$SEED_TAG"
+fi
+
+echo "[perf-pass] resetting to $SEED_TAG"
+git reset --hard "$SEED_TAG"
+
+echo "[perf-pass] clearing untracked"
+git clean -fdx
+
+if [[ -d "$PERF_PASS/.ade" ]]; then
+  echo "[perf-pass] clearing .ade/"
+  rm -rf "$PERF_PASS/.ade"
+fi
+
+echo "[perf-pass] ready at $PERF_PASS ($(git rev-parse --short HEAD))"
diff --git a/scripts/run-perf-scenario.mjs b/scripts/run-perf-scenario.mjs
new file mode 100755
index 000000000..1beb95c29
--- /dev/null
+++ b/scripts/run-perf-scenario.mjs
@@ -0,0 +1,108 @@
+#!/usr/bin/env node
+/**
+ * Run a single perf scenario end-to-end:
+ *   1. Set ADE_PERF_RUN_ID + ADE_PERF_SCENARIO
+ *   2. Launch `npm run dev:desktop` as a child process
+ *   3. Poll for ~/.ade/perf-runs//summary.json to appear
+ *   4. Print summary + kill the dev process
+ *
+ * Usage:
+ *   scripts/run-perf-scenario.mjs [runId]
+ *   scripts/run-perf-scenario.mjs lanes.cold-list
+ *   scripts/run-perf-scenario.mjs lanes.stress-poll my-baseline
+ */
+import { spawn } from "node:child_process";
+import { existsSync, readFileSync, rmSync, mkdirSync, mkdtempSync } from "node:fs";
+import { homedir, tmpdir } from "node:os";
+import { join } from "node:path";
+
+const positional = [];
+let noProject = false;
+let initialRoute = null;
+for (const arg of process.argv.slice(2)) {
+  if (arg === "--no-project") noProject = true;
+  else if (arg.startsWith("--route=")) initialRoute = arg.slice("--route=".length);
+  else positional.push(arg);
+}
+const scenarioId = positional[0];
+if (!scenarioId) {
+  console.error("Usage: run-perf-scenario.mjs [runId] [--no-project] [--route=/path]");
+  process.exit(2);
+}
+const runId = positional[1] ?? `run-${Date.now()}`;
+const dir = join(homedir(), ".ade", "perf-runs", runId);
+const summaryPath = join(dir, "summary.json");
+const timeoutMs = Number(process.env.ADE_PERF_TIMEOUT_MS ??
300_000);
+
+mkdirSync(dir, { recursive: true });
+if (existsSync(summaryPath)) rmSync(summaryPath);
+
+console.log(`[perf] scenario=${scenarioId} runId=${runId}`);
+console.log(`[perf] events → ${join(dir, "events.jsonl")}`);
+console.log(`[perf] summary → ${summaryPath}`);
+
+const defaultPerfPass = "/Users/admin/Projects/perf pass";
+const perfPassDir = noProject
+  ? mkdtempSync(join(tmpdir(), "ade-perf-no-project-"))
+  : (process.env.ADE_PERF_PASS_DIR ?? defaultPerfPass);
+
+const env = {
+  ...process.env,
+  ADE_PERF_RUN_ID: runId,
+  ADE_PERF_SCENARIO: scenarioId,
+  ADE_PERF_PASS_DIR: perfPassDir,
+  ADE_TRACE_IPC: "verbose",
+  ADE_MODEL_OVERRIDE: process.env.ADE_MODEL_OVERRIDE ?? "gpt-5-codex",
+  ELECTRON_ENABLE_LOGGING: "1",
+};
+if (initialRoute) env.ADE_PERF_INITIAL_ROUTE = initialRoute;
+
+const child = spawn(
+  "npm",
+  ["run", "dev:desktop", "--", "--project-root", perfPassDir],
+  {
+    stdio: ["ignore", "inherit", "inherit"],
+    env,
+    detached: false,
+  }
+);
+
+const deadline = Date.now() + timeoutMs;
+const cleanup = (code) => {
+  try {
+    if (!child.killed) {
+      child.kill("SIGTERM");
+      setTimeout(() => {
+        if (!child.killed) child.kill("SIGKILL");
+      }, 5_000).unref();
+    }
+  } catch {
+    // ignore
+  }
+  process.exit(code);
+};
+
+process.on("SIGINT", () => cleanup(130));
+process.on("SIGTERM", () => cleanup(143));
+
+(async function pollForSummary() {
+  while (Date.now() < deadline) {
+    if (existsSync(summaryPath)) {
+      try {
+        const text = readFileSync(summaryPath, "utf8");
+        const summary = JSON.parse(text);
+        console.log("\n[perf] summary:");
+        console.log(JSON.stringify(summary, null, 2));
+        cleanup(0);
+        return;
+      } catch (err) {
+        console.error(`[perf] failed to parse summary: ${err.message}`);
+        cleanup(1);
+        return;
+      }
+    }
+    await new Promise((r) => setTimeout(r, 1_000));
+  }
+  console.error(`[perf] timed out waiting for ${summaryPath}`);
+  cleanup(1);
+})();
From f99fe39d09db44c909e853527a4b75f530b19b0d Mon Sep 17 00:00:00 2001
From: Arul Sharma
<31745423+arul28@users.noreply.github.com> Date: Mon, 11 May 2026 18:51:54 -0400 Subject: [PATCH 02/10] docs(skills): codify lanes autoresearch guidance --- .agents/skills/ade-autoresearch/SKILL.md | 65 +++++++------- .agents/skills/ade-perf-lanes/SKILL.md | 104 ++++++++++++++++------- 2 files changed, 109 insertions(+), 60 deletions(-) diff --git a/.agents/skills/ade-autoresearch/SKILL.md b/.agents/skills/ade-autoresearch/SKILL.md index 36c9978bf..0a074c396 100644 --- a/.agents/skills/ade-autoresearch/SKILL.md +++ b/.agents/skills/ade-autoresearch/SKILL.md @@ -10,7 +10,7 @@ description: Iteratively optimize an ADE tab's CPU/memory/IPC/render performance Claude. metadata: author: ADE - version: 0.1.0 + version: 0.2.0 --- # ade-autoresearch @@ -22,44 +22,44 @@ A Karpathy-style autoresearch loop for ADE perf. You (the agent) ARE the loop ru - ``: the tab to optimize. Must be one of: `boot`, `lanes`, `missions`, `prs`, `work`, `files`, `run`, `graph`, `review`, `history`, `automations`, `cto`, `settings`. (`boot` = cold launch + welcome + project open + remote runtime + iOS pairing — the "main ADE screen" surface above any specific tab.) - ``: throwaway git repo path. Defaults to `/Users/admin/Projects/perf pass` (note the space — quote it). Must exist, must be a git repo, must have a `perf-pass-seed` tag (or you create one on first run). Override via `ADE_PERF_PASS_DIR` env var. -## Shortcuts — prefer IPC and direct launch over computer use +## Real UI audit is the primary loop -Computer use (screenshots + clicks) is slow and noisy. Always prefer these in this order: +The job is to find what a person actually feels in the tab. Deterministic scenarios are guardrails and regression checks, not a substitute for driving the product. -1. **Direct IPC calls** in scenarios. Most ADE actions are exposed on `window.ade.*` — call them directly instead of clicking buttons. 
Examples: - ```ts - await window.ade.project.openRepo({ rootPath: "/some/repo" }); - await window.ade.lanes.create({ name: "x", ... }); - await window.ade.project.listRecent(); - await window.ade.remoteRuntime.listTargets(); - ``` - Read `apps/desktop/src/preload/global.d.ts` for the full IPC surface. +Use this order: -2. **Warm launch on a specific tab** without running any scenario: +1. **Warm launch the real Electron UI on the target tab** and keep it open while auditing: ```bash - node scripts/perf-launch.mjs --tab lanes - node scripts/perf-launch.mjs --tab boot --no-project - node scripts/perf-launch.mjs --route /settings/integrations + NO_DEVTOOLS=1 ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1 ADE_LOCAL_RUNTIME_FALLBACK=1 ADE_MODEL_OVERRIDE=gpt-5-codex \ + node scripts/perf-launch.mjs --tab --run-id -ui-audit-$(date +%Y%m%d-%H%M) ``` - This boots ADE with perf instrumentation, navigates to the target route, and stays open. Ideal for hands-on inspection of metrics as they stream into `~/.ade/perf-runs//events.jsonl`. + Confirm the Electron surface is on the requested tab. The visible active tab must match ``; do not audit a related embedded surface from another tab. -3. **Single-scenario run** for a measured cycle: - ```bash - node scripts/run-perf-scenario.mjs lanes.cold-list run-id - node scripts/run-perf-scenario.mjs boot.idle-welcome run-id --no-project +2. **Build an action inventory from the visible UI and source.** Start with the tab's actual first screen, then cover every safe user action, subpane, menu, picker, dialog, mode switch, list interaction, empty state, error/preflight state, expand/minimize/fullscreen state, keyboard/search/filter path, and tab-specific destructive/external preflight. For destructive or externally visible actions, open and measure the prompt/preflight unless the user has explicitly allowed final execution. + + The inventory must be tab-derived. 
For example, a Work pass should cover Work sidebar/session list, chat/CLI/shell start surfaces, session tabs/grid/layout controls, running/ended session actions, model/attachment/command/parallel pickers, terminal/chat panes, context menus, filters/search, and ADE tools drawers because those are Work-tab surfaces. A Lanes pass should cover lane list, stack graph, lane dialogs, Git Actions, and lane Work panes because those are Lanes-tab surfaces. + +3. **Mark each UI segment in the perf log** before and after exercising it: + ```ts + window.ade.perf.recordEvent({ kind: "manualStep", ts: Date.now(), name: "git-actions-stage", phase: "start" }); + // drive the visible UI + window.ade.perf.recordEvent({ kind: "manualStep", ts: Date.now(), name: "git-actions-stage", phase: "end" }); ``` + Segment names should describe the workflow, not the implementation detail. + +4. **Use direct IPC only for setup, cleanup, and analysis.** It is fine to create fixture data, reset a throwaway repo, query status, or extract metrics through IPC/shell. Do not replace a UI audit action with `window.ade.*` unless the UI is genuinely impossible to drive; if you must, say so in the run notes. -4. **Computer use is the last resort.** Use only when no IPC exists and a click is unavoidable. Even then, prefer adding a `data-testid` and a `clickTestId` step in the scenario instead of pixel coordinates. +5. **Run deterministic scenarios after UI findings.** Scenarios catch regressions and quantify broad fitness. They do not prove the tab is clean unless the UI action inventory was also covered. ## Setup (do once at start of run) -1. **Read prior wins** at `.agents/skills/ade-perf-/SKILL.md` if it exists. These are patterns past runs discovered — do not redo them, and prefer NOT to undo them. If conflict, prefer the prior win unless your new change strictly improves on it. +1. **Read prior wins** at `.agents/skills/ade-perf-/SKILL.md` if it exists. 
These are optional best-practice notes from earlier audits, not prerequisites. If no per-tab skill exists, derive the checklist from the tab UI and source and create the per-tab skill only during codification after you have measured real behavior. 2. **Read scenario definitions** at `apps/desktop/src/renderer/perf/scenarios/.ts`. These are the *contract*. Do NOT edit them. -3. **Verify perf-pass repo** is clean and on its seed tag: +3. **Verify perf-pass repo** exists, has a seed tag, and can exercise real GitHub paths when needed: ```bash scripts/reset-perf-pass.sh ``` - Refuse to start if perf-pass doesn't exist — instruct the user to create it. + Refuse to start if perf-pass doesn't exist. If the tab uses GitHub behavior, publish the repo as a private `perf-pass` remote before measuring push/pull/fetch UI. 4. **Create a working branch** off main: ```bash git checkout -b autoresearch/-$(date +%Y%m%d-%H%M) @@ -68,6 +68,8 @@ Computer use (screenshots + clicks) is slow and noisy. Always prefer these in th ## Baseline (iteration 0) +Start with one deterministic scenario sweep so you know the existing guardrail fitness, then do the real UI inventory. The baseline is not complete until both exist. + Run all scenarios for the tab. For lanes that's: ```bash @@ -91,6 +93,8 @@ node scripts/run-perf-scenario.mjs boot.stress-launch baseline-stress --no-pro Each writes `~/.ade/perf-runs//summary.json`. Read all summaries. Compute the **per-tab fitness** as the sum of all scenario fitness scores. Record this as `baseline_fitness`. Also record per-component breakdown so you can target the worst component. +Then launch the real tab with `perf-launch`, drive the action inventory, and analyze `~/.ade/perf-runs//events.jsonl` by manualStep segments. Record the worst UI segment, the slow IPC channels inside it, and whether the cost is expected work (for example network push/fetch) or avoidable tab work. 
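A minimal sketch of the per-segment analysis described above, assuming each line of `events.jsonl` is one JSON object. Only the `manualStep` fields (`kind`, `ts`, `name`, `phase`) come from the `recordEvent` example in this skill; the `ipc` event shape (`channel`, `durationMs`) and the helper names `parseJsonl` and `summarizeSegments` are assumptions for illustration, so adjust them to whatever the perf log actually emits.

```ts
// Sketch only. The "ipc" event shape below is an ASSUMPTION, not the confirmed
// events.jsonl schema; the manualStep fields mirror the recordEvent example.
type ManualStepEvent = { kind: "manualStep"; ts: number; name: string; phase: "start" | "end" };
type IpcEvent = { kind: "ipc"; ts: number; channel: string; durationMs: number };
type PerfEvent = ManualStepEvent | IpcEvent;

interface SegmentReport {
  name: string;
  wallMs: number; // end.ts - start.ts for the manualStep pair
  ipcCalls: number; // IPC events whose ts falls inside the segment
  ipcMs: number; // summed durationMs of those IPC events
  slowestChannel: string | null; // channel with the largest summed duration
}

// Parse JSONL text, keeping only the event kinds this sketch understands.
function parseJsonl(text: string): PerfEvent[] {
  const events: PerfEvent[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed.length === 0) continue;
    try {
      const parsed = JSON.parse(trimmed);
      if (parsed && (parsed.kind === "manualStep" || parsed.kind === "ipc")) {
        events.push(parsed as PerfEvent);
      }
    } catch {
      // A run that is still writing may leave a partial last line; skip it.
    }
  }
  return events;
}

// Pair start/end markers by name and attribute IPC events to each segment.
function summarizeSegments(events: PerfEvent[]): SegmentReport[] {
  const openStarts: Record<string, number> = {};
  const reports: SegmentReport[] = [];
  for (const event of events) {
    if (event.kind !== "manualStep") continue;
    if (event.phase === "start") {
      openStarts[event.name] = event.ts;
      continue;
    }
    const startTs = openStarts[event.name];
    if (startTs === undefined) continue; // "end" marker without a "start"
    delete openStarts[event.name];

    const msByChannel: Record<string, number> = {};
    let ipcCalls = 0;
    let ipcMs = 0;
    for (const candidate of events) {
      if (candidate.kind !== "ipc") continue;
      if (candidate.ts < startTs || candidate.ts > event.ts) continue;
      ipcCalls += 1;
      ipcMs += candidate.durationMs;
      msByChannel[candidate.channel] = (msByChannel[candidate.channel] ?? 0) + candidate.durationMs;
    }

    let slowestChannel: string | null = null;
    let slowestMs = -1;
    for (const channel of Object.keys(msByChannel)) {
      const total = msByChannel[channel] ?? 0;
      if (total > slowestMs) {
        slowestMs = total;
        slowestChannel = channel;
      }
    }

    reports.push({ name: event.name, wallMs: event.ts - startTs, ipcCalls, ipcMs, slowestChannel });
  }
  // Worst segment first, ranked by total IPC time.
  reports.sort((a, b) => b.ipcMs - a.ipcMs);
  return reports;
}
```

Feeding the file contents to `summarizeSegments(parseJsonl(text))` gives a ranked list whose first entry is a candidate for the "worst UI segment". Whether that cost is expected work (network push/fetch) or avoidable tab work still needs a human or agent judgment.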
+ Tag the baseline commit: ```bash git tag perf-baseline--$(date +%Y%m%d) @@ -103,8 +107,8 @@ Stop conditions: **no fitness improvement for 10 consecutive iterations** OR use For each iteration: ### 1. Analyze -- Read the latest set of `summary.json` files for this tab. -- Pick the **#1 bottleneck**: the component contributing most to fitness. Tie-break by reproducibility (the bottleneck that appears across multiple scenarios > single-scenario). +- Read the latest scenario summaries and the latest real-UI `events.jsonl`. +- Pick the **#1 bottleneck**: the avoidable cost that appears in real UI segments or scenario summaries. Tie-break by user-visible workflow first, then reproducibility across scenarios. - Common bottleneck categories: - **Slow IPC channel**: a channel in `summary.ipc.slowChannels` with p95 ≥ 120ms - **Long task spam**: `webVitals.longTaskCount` > 5 per minute @@ -112,6 +116,7 @@ For each iteration: - **Render-on-scroll cost**: `marks.scroll.*` p95 high - **Route transition cost**: `marks.nav.*` or `marks.switch.*` p95 high - **Main CPU**: `process.mainCpuPercentP95` > 30 during idle scenarios → background pollers +- UI segment waste: heavy refreshes, duplicate mounted panes, hidden pollers, repeated global status checks, or expensive dialog prefetches that are not needed for the action the user took - Read the code that owns the bottleneck. Form a hypothesis. ### 2. Propose ONE change @@ -150,7 +155,7 @@ npm --prefix apps/desktop run test -- --run path/to/affected.test.ts If tests fail: **revert** the commit (`git reset --hard HEAD~1`), do NOT count toward plateau, try a different change targeting the same or next bottleneck. ### 5. Measure -Re-run all scenarios for the tab. Compute new per-tab fitness. +First re-drive the same UI segment with the same markers and compare the IPC/render/memory delta. Then re-run the smallest scenario subset that covers the changed surface. Re-run all scenarios before declaring the run done. ### 6. 
Smoke gate For each scenario's summary, check `summary.scenarios..ok === true` and `smokeFailures.length === 0`. If any scenario failed smoke: **revert**, increment plateau counter. @@ -174,19 +179,19 @@ When stop condition hits: Read all kept commits (`git log --oneline perf-baseline--... HEAD`). For each, extract the **pattern** (the technique used, not the literal change). Update `.agents/skills/ade-perf-/SKILL.md`: -- One entry per pattern. If a similar pattern already exists, append a refinement instead of duplicating. +- Write this as future engineering guidance for agents editing that tab, not as an audit transcript. One entry per pattern. If a similar pattern already exists, append a refinement instead of duplicating. - Each entry: - **Pattern**: one-line name (e.g. "Debounce git-status pollers behind window visibility"). - **Why it helped**: which bottleneck it addressed, with the metric delta from the summary. - **How to recognize when to apply**: signs in future code that the same pattern is needed. - **Anti-pattern to avoid**: what NOT to do. - **Verification**: which scenario + metric this affected. -- Append-only. Do not delete prior entries. +- Preserve proven history, but keep the top of the file readable as best practices for future code changes. ## Notes on agent behavior - **Stay focused.** One bottleneck at a time. Resist the urge to "while I'm here also fix..." — that breaks attribution. - **Trust the metric.** If fitness went up but you "feel" the code is better, revert anyway. The metric is the contract. -- **The perf-pass repo is your sandbox.** Inside it, you may create lanes, open chats, run automations, anything that exercises ADE. The scenarios already drive this; you may extend them ONLY by adding new scenarios in `apps/desktop/src/renderer/perf/scenarios/.ts` — never by editing existing ones. 
+- **The perf-pass repo is your sandbox.** Inside it, you may create lanes, open chats, push/pull throwaway branches, run automations, stash changes, and delete fixtures when needed to exercise ADE. Scenarios are guardrails; real UI audit coverage is required before you call the tab optimized. You may extend scenarios ONLY by adding new scenarios in `apps/desktop/src/renderer/perf/scenarios/.ts` — never by editing existing ones. - **Codex model only.** If a scenario invokes an in-ADE chat, that chat uses the `ADE_MODEL_OVERRIDE` model (gpt-5-codex by default). Scenarios opting into Claude must declare `requiresClaude: true` and you must set `ADE_PERF_ALLOW_CLAUDE=1` for them. - **Concurrency**: only one perf run on the machine at a time. If `~/.ade/perf-runs/` contains a `/lock` file with a live pid, refuse to start. diff --git a/.agents/skills/ade-perf-lanes/SKILL.md b/.agents/skills/ade-perf-lanes/SKILL.md index df30ee355..a681ace02 100644 --- a/.agents/skills/ade-perf-lanes/SKILL.md +++ b/.agents/skills/ade-perf-lanes/SKILL.md @@ -1,47 +1,91 @@ --- name: ade-perf-lanes -description: Performance patterns discovered for ADE's Lanes tab. Read before - editing files under apps/desktop/src/renderer/components/lanes/** or - apps/desktop/src/main/services/lanes/**. Append-only knowledge base populated - by ade-autoresearch runs. Skip patterns that contradict the current scenario - contract. +description: Performance practices for ADE's Lanes tab. Read before editing + files under apps/desktop/src/renderer/components/lanes/**, + apps/desktop/src/renderer/state/appStore.ts, or + apps/desktop/src/main/services/lanes/**. Preserve these patterns unless a new + measured UI audit proves a better one. metadata: author: ade-autoresearch - version: 0.1.0 - status: seed + version: 0.2.0 + status: active --- # ade-perf-lanes -Patterns discovered for the Lanes tab. Each entry has run-traced provenance — do not delete entries without explicit user approval. 
+Use this as engineering guidance for keeping the Lanes tab fast while adding features. The Lanes tab is a dense workspace: lane list, branch selector, stack graph, Work pane, Git Actions, dialogs, history, diff viewer, and runtime/session state all coexist. Small refresh choices can easily multiply into visible UI noise. -## How to use this file +## Testing posture -- Read all entries before making any change in lanes code. -- If a proposed change conflicts with an entry: prefer the entry. If you believe you can do better, run `ade-autoresearch lanes` and prove it with metrics. -- New entries are appended by `ade-autoresearch` at the end of each run. +- Test the actual `/lanes` route in the Electron dev app. Do not treat the Work tab with a lane selector as Lanes parity. +- Drive visible UI actions and mark each segment with `window.ade.perf.recordEvent({ kind: "manualStep", ... })`. Deterministic scenarios are regression guards, not a substitute for clicking through the tab. +- Keep a private `perf-pass` GitHub repo available for real fetch/push/pull behavior. It is safe to create throwaway lanes, commits, stashes, and branches there. +- When an action is destructive or externally visible, exercise the prompt/preflight by default. Execute the final action only when the user has allowed it or the target is clearly disposable. -## Scenarios this tab is benchmarked against +## Refresh rules -Defined in `apps/desktop/src/renderer/perf/scenarios/lanes.ts`: +- Use full decorated snapshots only when runtime decorations, conflict status, rebase suggestions, or auto-rebase state are truly needed. +- For Git Actions local operations such as stage, commit, fetch, push, pull, and history refresh, prefer `refreshLanes({ includeStatus: true, includeSnapshots: false })`. This updates lane Git status without rebuilding runtime/rebase/conflict snapshot decorations. 
+- For runtime-only updates from Work pane sessions, use `refreshLanes({ includeStatus: false, includeSnapshots: true, includeConflictStatus: false, includeRebaseSuggestions: false, includeAutoRebaseStatus: false })`. Preserve prior lane Git status while refreshing runtime buckets. +- For metadata-only updates such as lane color/appearance, use `refreshLanes({ includeStatus: false })` and preserve prior `status` / `parentStatus` in the store. A color change must not recompute Git status. +- Avoid calling bare `refreshLanes()` from new Lanes UI handlers. Treat it as the expensive path and document why a full refresh is required. -- `lanes.cold-list` — cold open of /lanes route. -- `lanes.switch-rapid` — fast route switching to/from /lanes. -- `lanes.idle-at-rest` — 30s on /lanes, measures background polling cost. -- `lanes.stress-poll` — 2min on /lanes, catches leaks. -- `lanes.scroll-list` — scroll the lanes list repeatedly. +## Pane and poller rules -## Patterns +- Expanded/fullscreen panes must unmount the corresponding inline pane body when the duplicate would keep effects alive. CSS hiding is not enough. +- Git Actions should have at most one active polling/effect owner per visible lane. Timers must clean up on lane switch, pane minimize, and fullscreen transitions. +- Poll only visible or active surfaces. Hidden lanes, hidden panes, minimized panes, and closed dialogs should not keep expensive Git, PR, Linear, AI, or runtime status requests alive. +- Background sync/local-runtime failures should be fast-pathed in disabled perf/dev modes at the IPC boundary. Do not make every renderer caller catch slow "service unavailable" failures. +- Presence updates should be idempotent and de-duped by lane/signature so filter changes, layout switches, and tab clicks do not spam sync IPC. -_No patterns recorded yet — populated by the first `ade-autoresearch lanes` run._ +## Dialog and menu rules - +## Git Actions rules + +- Keep local change operations scoped. 
Stage, unstage, commit, stash, and discard should refresh the active lane's change model and lane Git status, not all snapshot decorations. +- History and diff controls should fetch only the selected commit/file data. Split/unified, wrap, line-number, and copy-path controls should be renderer-local after the file/patch is loaded. +- Network actions are allowed to cost real time. `fetch`, `push`, and child-lane creation can dominate a trace; do not optimize them by hiding progress or skipping correctness checks. +- Save Changes currently stashes tracked changes and can leave untracked files visible. If changing that behavior, treat it as functionality work and add tests before using it as a perf cleanup. + +## Proven patterns + +### Skip disabled local runtime bridge calls +- **Why it helped**: When `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1`, preload still attempted local-runtime action/sync/event IPC before falling back to desktop IPC. Real `/lanes` runs showed slow `ade.localRuntime.*` spans and `lanes.idle-at-rest` hit V8 OOM before summary. +- **Apply when**: Perf/dev launches disable the daemon but traces show slow `ade.localRuntime.callAction`, `ade.localRuntime.callSync`, or `ade.localRuntime.streamEvents`. +- **Avoid**: Renderer-only caches that hide the symptom while the unavailable transport still burns time. +- **Verification**: Baseline `lanes-20260511-1721-real-baseline-*` had local-runtime slow channels and idle OOM. Post-change `lanes-20260511-1725-real-optimized-{cold,switch,idle,scroll,stress}` passed with total fitness `7028.82` and no `ade.localRuntime.*` channels. + +### Suppress hidden duplicate fullscreen pane bodies +- **Why it helped**: Expanding Git Actions mounted the fullscreen pane while leaving the inline `LaneGitActionsPane` body alive, producing duplicate toolbars and duplicated effects. 
+- **Apply when**: A Lanes expanded/fullscreen overlay reuses pane configs and a DOM snapshot shows duplicate pane bodies or repeated test regions while only one is visible. +- **Avoid**: CSS-only hiding for duplicate pane bodies. +- **Verification**: Git Actions expand went from 2 toolbars to 1. IPC dropped from 30 calls / 223 ms in `lanes-expand-prefix-20260511` to 29 calls / 192 ms in `lanes-expand-postfix-20260511`. + +### Fast-path disabled sync status and presence +- **Why it helped**: Perf-mode Lanes traces hit `ade.sync.getStatus` and `ade.sync.setActiveLanePresence` even though the local runtime daemon and in-process sync service were unavailable. Failed calls cost about 250-370 ms each. +- **Apply when**: `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1` and traces show failed `ade.sync.*` calls with "Sync service is not available." +- **Avoid**: Removing Lanes presence calls globally or catching every failure in the renderer. +- **Verification**: `lanes-expand-postfix-20260511` had 4 failed sync IPC calls totaling 1145 ms. `lanes-sync-postfix-20260511` had 3 successful sync IPC calls totaling 2 ms. + +### Scope Git Actions refreshes to lane status +- **Why it helped**: Stage/commit/fetch used to call full `listSnapshots`, rebuilding runtime and decoration state for local Git actions. The scoped path keeps status fresh and skips snapshot decorations. +- **Apply when**: A Git Actions handler finishes local Git work and calls bare `refreshLanes()`. +- **Avoid**: Recomputing runtime/rebase/conflict decorations after every stage, commit, stash, fetch, or history refresh. +- **Verification**: Before the change, `lanes-full-ui-audit-20260511-01` stage cycle spent 551 ms in `ade.lanes.listSnapshots`; commit spent 278 ms in `listSnapshots`. After the change, `lanes-refresh-light-20260511` stage used `ade.lanes.list` at 40 ms with no `listSnapshots`, and typed commit used `ade.lanes.list` at 213 ms with no `listSnapshots`. 
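The refresh rules in this skill can be sketched as a single lookup so that a handler names why it refreshes instead of calling bare `refreshLanes()`. The option names and the three option sets are taken from the Refresh rules section; the `RefreshLanesOptions` interface, the `RefreshReason` union, and `refreshOptionsFor` are hypothetical and do not describe the real `appStore` API.

```ts
// Hypothetical helper: only the option names and values come from the
// Refresh rules section; the types and function name are illustrative.
interface RefreshLanesOptions {
  includeStatus?: boolean;
  includeSnapshots?: boolean;
  includeConflictStatus?: boolean;
  includeRebaseSuggestions?: boolean;
  includeAutoRebaseStatus?: boolean;
}

type RefreshReason = "git-action" | "runtime-event" | "appearance";

const REFRESH_OPTIONS: Record<RefreshReason, RefreshLanesOptions> = {
  // Stage, commit, fetch, push, pull, history refresh: fresh Git status,
  // no runtime/rebase/conflict snapshot decorations.
  "git-action": { includeStatus: true, includeSnapshots: false },
  // Work pane session updates: runtime buckets only, prior Git status kept.
  "runtime-event": {
    includeStatus: false,
    includeSnapshots: true,
    includeConflictStatus: false,
    includeRebaseSuggestions: false,
    includeAutoRebaseStatus: false,
  },
  // Lane color/appearance: metadata only, never recompute Git status.
  appearance: { includeStatus: false },
};

// Return a copy so callers cannot mutate the shared table.
function refreshOptionsFor(reason: RefreshReason): RefreshLanesOptions {
  return { ...REFRESH_OPTIONS[reason] };
}
```

A Git Actions handler would then call something like `refreshLanes(refreshOptionsFor("git-action"))`, which keeps the expensive full refresh an explicit, reviewable choice.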
+ +### Split runtime refresh from Git status refresh +- **Why it helped**: Work pane pty/chat updates need runtime buckets, not fresh Git status for every lane. Runtime-only snapshot refresh avoids Git-status recompute while preserving prior status in store. +- **Apply when**: A session/runtime event updates running/awaiting/ended counts. +- **Avoid**: Calling full snapshots with `includeStatus:true` for runtime-only changes. +- **Verification**: `lanes-refresh-light-20260511` runtime snapshot refresh used `ade.lanes.listSnapshots` with `includeStatus:false`; the only runtime snapshot call was 76 ms while prior Git status stayed intact. + +### Keep appearance refresh metadata-only +- **Why it helped**: Lane color changes in manage/context flows previously refreshed full decorated snapshots. Appearance is metadata and should use the statusless list path while preserving previous Git status in the store. +- **Apply when**: New lane metadata or appearance handlers update color/name/description without changing branch state. +- **Avoid**: Bare `refreshLanes()` after appearance-only updates. +- **Verification**: Real UI manage-dialog trace `lanes-refresh-light-20260511` showed `ade.lanes.updateAppearance` at 1 ms followed by an unnecessary `ade.lanes.listSnapshots` at 308 ms. The lightweight path uses `refreshLanes({ includeStatus: false })` instead. From 8819e9f123447873afd3b575f5a297268d335e6c Mon Sep 17 00:00:00 2001 From: Arul Sharma <31745423+arul28@users.noreply.github.com> Date: Mon, 11 May 2026 20:09:03 -0400 Subject: [PATCH 03/10] ade/optimizing lanes tab ed588b8d (#282) * fix(app): honor hash routes in browser router * perf(lanes): skip disabled local runtime bridge real lanes baseline failed at lanes.idle-at-rest with V8 OOM; optimized scenarios all passed, fitness 7028.82 * perf(lanes): suppress duplicate git actions fullscreen pane UI audit: Git Actions expand went from 2 toolbars to 1. 
Post-marker IPC changed from 30 calls / 223ms to 29 calls / 192ms, removing one duplicate ade.git.getSyncStatus. * perf(lanes): fast-path disabled sync runtime UI audit: sync IPC in local-runtime-disabled perf mode went from 4 failed calls / 1145ms to 3 successful calls / 2ms. * perf(lanes): scope lane refreshes during UI actions * ship: prepare lane for review * ship: iteration 1 - address snapshot refresh review * ship: iteration 2 - address refresh review * ship: iteration 3 - refresh git action overlays --- .../src/main/services/ipc/registerIpc.ts | 101 +++++++++++++-- .../main/services/ipc/runtimeBridge.test.ts | 121 +++++++++++++++++- apps/desktop/src/preload/preload.test.ts | 46 +++++++ apps/desktop/src/preload/preload.ts | 18 ++- .../src/renderer/components/app/App.tsx | 38 ++++-- .../components/app/App.workKeepAlive.test.tsx | 18 +++ .../lanes/LaneGitActionsPane.test.tsx | 6 + .../components/lanes/LaneGitActionsPane.tsx | 14 +- .../components/lanes/LanesPage.test.ts | 29 +++++ .../renderer/components/lanes/LanesPage.tsx | 36 ++++-- .../src/renderer/state/appStore.test.ts | 95 +++++++++++++- apps/desktop/src/renderer/state/appStore.ts | 50 ++++++-- docs/ARCHITECTURE.md | 9 +- docs/features/lanes/README.md | 29 +++-- docs/features/remote-runtime/README.md | 8 +- .../remote-runtime/internal-architecture.md | 8 +- 16 files changed, 553 insertions(+), 73 deletions(-) diff --git a/apps/desktop/src/main/services/ipc/registerIpc.ts b/apps/desktop/src/main/services/ipc/registerIpc.ts index bbed2fdc8..028e12577 100644 --- a/apps/desktop/src/main/services/ipc/registerIpc.ts +++ b/apps/desktop/src/main/services/ipc/registerIpc.ts @@ -1796,14 +1796,86 @@ export function registerIpc({ if (getSyncService) return getSyncService() ?? null; return getCtx().syncService ?? null; }; + const resolveOptionalSyncService = async (): Promise | null> => + resolveSyncService + ? (await resolveSyncService()) ?? 
null + : getOptionalSyncService(); + const localRuntimeDaemonDisabled = process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON === "1"; const allowLocalRuntimeFallback = - process.env.ADE_LOCAL_RUNTIME_FALLBACK === "1" || - process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON === "1"; + process.env.ADE_LOCAL_RUNTIME_FALLBACK === "1"; + + const unavailableSyncSnapshotCreatedAt = new Date().toISOString(); + const unavailableSyncPlatform = + process.platform === "darwin" + ? "macOS" + : process.platform === "win32" + ? "windows" + : process.platform === "linux" + ? "linux" + : "unknown"; + const unavailableSyncDevice: SyncDeviceRecord = { + deviceId: "local-runtime-disabled", + siteId: "local-runtime-disabled", + name: "Local desktop", + platform: unavailableSyncPlatform, + deviceType: "desktop", + createdAt: unavailableSyncSnapshotCreatedAt, + updatedAt: unavailableSyncSnapshotCreatedAt, + lastSeenAt: unavailableSyncSnapshotCreatedAt, + lastHost: null, + lastPort: null, + tailscaleIp: null, + ipAddresses: [], + metadata: { unavailableReason: "local_runtime_daemon_disabled" }, + }; + const unavailableSyncSnapshot: SyncRoleSnapshot = { + mode: "standalone", + role: "brain", + localDevice: unavailableSyncDevice, + currentBrain: unavailableSyncDevice, + clusterState: null, + bootstrapToken: null, + pairingPin: null, + pairingPinConfigured: false, + pairingConnectInfo: null, + connectedPeers: [], + tailnetDiscovery: { + state: "disabled", + serviceName: "ade-sync", + servicePort: 0, + target: null, + updatedAt: null, + error: null, + stderr: null, + }, + client: { + state: "disconnected", + host: null, + port: null, + connectedAt: null, + lastSeenAt: null, + latencyMs: null, + syncLag: null, + lastRemoteDbVersion: 0, + brainDeviceId: unavailableSyncDevice.deviceId, + hostName: unavailableSyncDevice.name, + error: null, + message: "Sync service unavailable in local runtime disabled mode.", + savedDraft: null, + }, + transferReadiness: { + ready: true, + blockers: [], + survivableState: [], 
+ }, + survivableStateText: "Sync service unavailable in local runtime disabled mode.", + blockingStateText: "", + }; + + const buildUnavailableSyncSnapshot = (): SyncRoleSnapshot => unavailableSyncSnapshot; const requireSyncService = async (): Promise> => { - const service = resolveSyncService - ? await resolveSyncService() - : getOptionalSyncService(); + const service = await resolveOptionalSyncService(); if (!service) { throw new Error("Sync service is not available."); } @@ -1823,6 +1895,7 @@ export function registerIpc({ event: { sender: Electron.WebContents }, action: (pool: LocalRuntimeConnectionPool, rootPath: string) => Promise, ): Promise => { + if (localRuntimeDaemonDisabled) return null; if (!localRuntimeConnectionPool) return null; const rootPath = getLocalRuntimeRootForEvent(event); if (!rootPath) return null; @@ -4107,7 +4180,12 @@ export function registerIpc({ pool.syncStatusForRoot(rootPath, arg ?? {}) ); if (runtimeStatus) return runtimeStatus; - return await (await requireSyncService()).getStatus({ + const service = await resolveOptionalSyncService(); + if (!service) { + if (localRuntimeDaemonDisabled) return buildUnavailableSyncSnapshot(); + throw new Error("Sync service is not available."); + } + return await service.getStatus({ includeTransferReadiness: arg?.includeTransferReadiness, forceTransferReadiness: arg?.forceTransferReadiness, }); @@ -4235,7 +4313,7 @@ export function registerIpc({ async (event, arg: { laneIds?: string[] | null }): Promise => { const laneIds = Array.isArray(arg?.laneIds) ? 
arg.laneIds : []; const rootPath = getLocalRuntimeRootForEvent(event); - if (localRuntimeConnectionPool && rootPath) { + if (!localRuntimeDaemonDisabled && localRuntimeConnectionPool && rootPath) { try { await localRuntimeConnectionPool.callSyncForRoot(rootPath, "sync.setActiveLanePresence", { laneIds }); return; @@ -4245,9 +4323,12 @@ export function registerIpc({ } } } - await (await requireSyncService()).setActiveLanePresence( - laneIds, - ); + const service = await resolveOptionalSyncService(); + if (!service) { + if (localRuntimeDaemonDisabled) return; + throw new Error("Sync service is not available."); + } + await service.setActiveLanePresence(laneIds); }, ); diff --git a/apps/desktop/src/main/services/ipc/runtimeBridge.test.ts b/apps/desktop/src/main/services/ipc/runtimeBridge.test.ts index 881115a02..7dbf8b67d 100644 --- a/apps/desktop/src/main/services/ipc/runtimeBridge.test.ts +++ b/apps/desktop/src/main/services/ipc/runtimeBridge.test.ts @@ -1,4 +1,4 @@ -import { beforeEach, describe, expect, it, vi } from "vitest"; +import { afterEach, beforeEach, describe, expect, it, vi } from "vitest"; import { IPC } from "../../../shared/ipc"; import type { OpenProjectBinding, @@ -22,14 +22,39 @@ const remoteCallMachineForTargetMock = vi.hoisted(() => vi.fn()); const remoteDisconnectMock = vi.hoisted(() => vi.fn()); vi.mock("electron", () => ({ + app: { + getPath: vi.fn(() => "/tmp"), + getVersion: vi.fn(() => "1.0.0"), + isPackaged: false, + }, BrowserWindow: { fromWebContents: browserWindowFromWebContents, getAllWindows: browserWindowGetAllWindows, }, + clipboard: { + readImage: vi.fn(() => ({ isEmpty: () => true })), + readText: vi.fn(() => ""), + writeText: vi.fn(), + }, + desktopCapturer: { + getSources: vi.fn(async () => []), + }, + dialog: { + showOpenDialog: vi.fn(), + }, ipcMain: { handle: vi.fn((channel: string, handler: (...args: any[]) => unknown) => { ipcHandlers.set(channel, handler); }), + on: vi.fn(), + }, + nativeImage: { + createFromPath: vi.fn(() 
=> ({ isEmpty: () => true })), + }, + shell: { + openExternal: vi.fn(), + openPath: vi.fn(), + showItemInFolder: vi.fn(), }, })); @@ -62,6 +87,7 @@ vi.mock("../git/git", () => ({ })); import { registerRuntimeBridge } from "./runtimeBridge"; +import { registerIpc } from "./registerIpc"; const target: RemoteRuntimeTarget = { id: "target-1", @@ -99,6 +125,7 @@ function localBinding(rootPath = "/repo"): OpenProjectBinding { describe("registerRuntimeBridge", () => { beforeEach(() => { + delete process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON; ipcHandlers.clear(); browserWindowFromWebContents.mockReset(); browserWindowGetAllWindows.mockReset().mockReturnValue([]); @@ -373,3 +400,95 @@ describe("registerRuntimeBridge", () => { ); }); }); + +describe("registerIpc sync bridge", () => { + beforeEach(() => { + delete process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON; + ipcHandlers.clear(); + browserWindowFromWebContents.mockReset().mockReturnValue({ id: 7 }); + }); + + afterEach(() => { + vi.useRealTimers(); + }); + + it("returns an unavailable sync snapshot without probing local runtime when the daemon is disabled", async () => { + process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON = "1"; + vi.useFakeTimers(); + vi.setSystemTime(new Date("2026-05-11T12:00:00.000Z")); + const localRuntimeConnectionPool = { + syncStatusForRoot: vi.fn(), + callSyncForRoot: vi.fn(), + }; + registerIpc({ + getCtx: () => ({ + syncService: null, + }) as any, + getWindowSession: () => ({ + windowId: 7, + project: { rootPath: "/repo", displayName: "Repo" } as any, + binding: localBinding("/repo"), + }), + localRuntimeConnectionPool: localRuntimeConnectionPool as any, + switchProjectFromDialog: vi.fn(), + closeCurrentProject: vi.fn(), + closeProjectByPath: vi.fn(), + globalStatePath: "/tmp/ade-state.json", + }); + + const snapshot = await ipcHandlers.get(IPC.syncGetStatus)?.( + eventForSender(), + { includeTransferReadiness: true }, + ) as any; + vi.setSystemTime(new Date("2026-05-11T12:00:05.000Z")); + const 
secondSnapshot = await ipcHandlers.get(IPC.syncGetStatus)?.( + eventForSender(), + { includeTransferReadiness: true }, + ) as any; + + expect(localRuntimeConnectionPool.syncStatusForRoot).not.toHaveBeenCalled(); + expect(secondSnapshot).toBe(snapshot); + expect(snapshot.mode).toBe("standalone"); + expect(snapshot.localDevice.createdAt).toBe("2026-05-11T12:00:00.000Z"); + expect(secondSnapshot.localDevice.updatedAt).toBe(snapshot.localDevice.updatedAt); + expect(secondSnapshot.localDevice.lastSeenAt).toBe(snapshot.localDevice.lastSeenAt); + expect(snapshot.localDevice.metadata).toEqual({ + unavailableReason: "local_runtime_daemon_disabled", + }); + expect(snapshot.client.message).toBe("Sync service unavailable in local runtime disabled mode."); + }); + + it("drops active lane presence updates instead of probing unavailable sync services when the daemon is disabled", async () => { + process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON = "1"; + const localRuntimeConnectionPool = { + callSyncForRoot: vi.fn(), + }; + const resolveSyncService = vi.fn(async () => null); + registerIpc({ + getCtx: () => ({ + syncService: null, + }) as any, + resolveSyncService, + getWindowSession: () => ({ + windowId: 7, + project: { rootPath: "/repo", displayName: "Repo" } as any, + binding: localBinding("/repo"), + }), + localRuntimeConnectionPool: localRuntimeConnectionPool as any, + switchProjectFromDialog: vi.fn(), + closeCurrentProject: vi.fn(), + closeProjectByPath: vi.fn(), + globalStatePath: "/tmp/ade-state.json", + }); + + await expect( + ipcHandlers.get(IPC.syncSetActiveLanePresence)?.( + eventForSender(), + { laneIds: ["lane-1"] }, + ), + ).resolves.toBeUndefined(); + + expect(localRuntimeConnectionPool.callSyncForRoot).not.toHaveBeenCalled(); + expect(resolveSyncService).toHaveBeenCalledTimes(1); + }); +}); diff --git a/apps/desktop/src/preload/preload.test.ts b/apps/desktop/src/preload/preload.test.ts index 4f24c0a80..2ae50a087 100644 --- a/apps/desktop/src/preload/preload.test.ts +++ 
b/apps/desktop/src/preload/preload.test.ts @@ -10,6 +10,7 @@ describe("preload OAuth bridge", () => { afterEach(() => { vi.resetModules(); vi.doUnmock("electron"); + delete process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON; delete (globalThis as any).__adeBridge; }); @@ -325,6 +326,51 @@ describe("preload OAuth bridge", () => { expect(invoke).toHaveBeenCalledWith(IPC.lanesOpenFolder, { laneId: "lane-1" }); }); + it("skips local runtime IPC when the local runtime daemon is disabled", async () => { + process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON = "1"; + const binding = { + kind: "local", + key: "local:/repo", + rootPath: "/repo", + displayName: "Project", + }; + const invoke = vi.fn(async (channel: string) => { + if (channel === IPC.appGetWindowSession) { + return { windowId: 1, project: { rootPath: "/repo", displayName: "Project" }, binding }; + } + if (channel === IPC.lanesList) return []; + throw new Error(`unexpected IPC: ${channel}`); + }); + const on = vi.fn(); + const removeListener = vi.fn(); + const exposeInMainWorld = vi.fn((name: string, value: unknown) => { + (globalThis as any).__bridgeName = name; + (globalThis as any).__adeBridge = value; + }); + + vi.doMock("electron", () => ({ + contextBridge: { exposeInMainWorld }, + ipcRenderer: { invoke, on, removeListener }, + webFrame: { + getZoomLevel: vi.fn(() => 0), + setZoomLevel: vi.fn(), + getZoomFactor: vi.fn(() => 1), + }, + })); + + await import("./preload"); + + const bridge = (globalThis as any).__adeBridge; + await expect(bridge.lanes.list()).resolves.toEqual([]); + + expect(invoke).toHaveBeenCalledWith(IPC.appGetWindowSession); + expect(invoke).toHaveBeenCalledWith(IPC.lanesList, {}); + expect(invoke).not.toHaveBeenCalledWith( + IPC.localRuntimeCallAction, + expect.anything(), + ); + }); + it("routes project local-data cleanup through a remote project runtime when bound", async () => { const binding = { kind: "remote", diff --git a/apps/desktop/src/preload/preload.ts b/apps/desktop/src/preload/preload.ts 
index 075eaebe0..3669626d2 100644 --- a/apps/desktop/src/preload/preload.ts +++ b/apps/desktop/src/preload/preload.ts @@ -1049,11 +1049,14 @@ const gitBranchesCache = createKeyedShortIpcCache( 2_000, ); +const localRuntimeDaemonDisabled = + process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON === "1"; + const allowLocalRuntimeFallback = process.env.ADE_LOCAL_RUNTIME_FALLBACK !== "0" && ( process.env.ADE_LOCAL_RUNTIME_FALLBACK === "1" || - process.env.ADE_DISABLE_LOCAL_RUNTIME_DAEMON === "1" || + localRuntimeDaemonDisabled || process.env.ADE_PACKAGE_CHANNEL === "alpha" ); @@ -1082,7 +1085,10 @@ function rememberProjectBinding(binding: OpenProjectBinding | null): void { projectBindingGeneration += 1; resetRemoteRuntimeEventDedup(nextKey); } - if (binding?.kind === "remote" || binding?.kind === "local") { + if ( + binding?.kind === "remote" || + (binding?.kind === "local" && !localRuntimeDaemonDisabled) + ) { ensureRemoteRuntimeEventPump(); } } @@ -1148,6 +1154,7 @@ async function callLocalProjectActionIfBound( action: string, request: Omit = {}, ): Promise<{ handled: true; result: T } | { handled: false }> { + if (localRuntimeDaemonDisabled) return { handled: false }; const binding = await getLocalProjectBinding(); if (!binding) return { handled: false }; try { @@ -1214,6 +1221,7 @@ async function callLocalProjectSyncIfBound( method: string, params: Record = {}, ): Promise<{ handled: true; result: T } | { handled: false }> { + if (localRuntimeDaemonDisabled) return { handled: false }; const binding = await getLocalProjectBinding(); if (!binding) return { handled: false }; try { @@ -1396,7 +1404,11 @@ async function pollRemoteRuntimeEvents(): Promise { let nextDelayMs: number | null = null; try { const binding = await getProjectRuntimeBinding(); - if (!binding || (binding.kind !== "remote" && binding.kind !== "local")) { + if ( + !binding || + (binding.kind !== "remote" && binding.kind !== "local") || + (binding.kind === "local" && localRuntimeDaemonDisabled) + ) { 
remoteRuntimeEventCursor = 0; remoteRuntimeEventBindingKey = null; remoteRuntimeEventGeneration = projectBindingGeneration; diff --git a/apps/desktop/src/renderer/components/app/App.tsx b/apps/desktop/src/renderer/components/app/App.tsx index 005bc28aa..f821c820a 100644 --- a/apps/desktop/src/renderer/components/app/App.tsx +++ b/apps/desktop/src/renderer/components/app/App.tsx @@ -10,16 +10,6 @@ import { useNavigate } from "react-router-dom"; -// Use path-based routes on http(s) (Vite in Chrome, Cursor Simple Browser, etc.). -// Use hash routes for non-http(s) surfaces (e.g. packaged Electron `file://`) where -// the history API is not tied to a normal origin. -// Relying only on `__adeBrowserMock` breaks when the flag is not set at module-eval -// time, which can strand Cursor's embedded browser on a single path. -const Router = - typeof window !== "undefined" && - (window.location.protocol === "http:" || window.location.protocol === "https:") - ? BrowserRouter - : HashRouter; import { AppShell } from "./AppShell"; import { RunPage } from "../run/RunPage"; import { ProjectSetupPage } from "../onboarding/ProjectSetupPage"; @@ -70,6 +60,16 @@ import { getAiStatusCached } from "../../lib/aiDiscoveryCache"; import { dispatchWorkSurfaceRevealed } from "../terminals/workSurfaceVisibility"; import type { AppNavigationRequest } from "../../../shared/types"; +// Use path-based routes on http(s) (Vite in Chrome, Cursor Simple Browser, etc.). +// Use hash routes for non-http(s) surfaces (e.g. packaged Electron `file://`) where +// the history API is not tied to a normal origin. +// Relying only on `__adeBrowserMock` breaks when the flag is not set at module-eval +// time, which can strand Cursor's embedded browser on a single path. +const usesBrowserRouter = + typeof window !== "undefined" && + (window.location.protocol === "http:" || window.location.protocol === "https:"); +const Router = usesBrowserRouter ? BrowserRouter : HashRouter; + const StartupSplashScreen = (
{/* Background glow */} @@ -328,6 +328,23 @@ function AppNavigationBridge() { return null; } +function BrowserHashRouteBridge() { + const navigate = useNavigate(); + + React.useEffect(() => { + const syncHashRoute = () => { + const hash = window.location.hash; + if (!hash.startsWith("#/")) return; + navigate(hash.slice(1), { replace: true }); + }; + syncHashRoute(); + window.addEventListener("hashchange", syncHashRoute); + return () => window.removeEventListener("hashchange", syncHashRoute); + }, [navigate]); + + return null; +} + export function App() { const theme = useAppStore((s) => s.theme); const projectRoot = useAppStore((s) => s.project?.rootPath ?? null); @@ -356,6 +373,7 @@ export function App() {
+ {usesBrowserRouter ? <BrowserHashRouteBridge /> : null} } /> }> diff --git a/apps/desktop/src/renderer/components/app/App.workKeepAlive.test.tsx b/apps/desktop/src/renderer/components/app/App.workKeepAlive.test.tsx index e7db1d3ad..6cb110485 100644 --- a/apps/desktop/src/renderer/components/app/App.workKeepAlive.test.tsx +++ b/apps/desktop/src/renderer/components/app/App.workKeepAlive.test.tsx @@ -94,6 +94,10 @@ vi.mock("../files/FilesPage", async () => { }; }); +vi.mock("../lanes/LanesPage", () => ({ + LanesPage: () =>
<div data-testid="lanes-page" />, +})); + describe("App Work route keep-alive", () => { beforeEach(() => { vi.clearAllMocks(); @@ -187,4 +191,18 @@ describe("App Work route keep-alive", () => { expect(screen.queryByTestId("work-page")).toBeNull(); expect(workLifecycle.mounts).toBe(0); }); + + it("converts legacy hash app routes into BrowserRouter paths", async () => { + window.history.replaceState({}, "", "/work#/lanes"); + const { App } = await import("./App"); + + render(<App />); + + await screen.findByTestId("lanes-page"); + await waitFor(() => { + expect(window.location.pathname).toBe("/lanes"); + expect(window.location.hash).toBe(""); + }); + expect(screen.getByTestId("work-page").getAttribute("data-active")).toBe("false"); + }); }); diff --git a/apps/desktop/src/renderer/components/lanes/LaneGitActionsPane.test.tsx b/apps/desktop/src/renderer/components/lanes/LaneGitActionsPane.test.tsx index f4dd90dab..6c48f8c3e 100644 --- a/apps/desktop/src/renderer/components/lanes/LaneGitActionsPane.test.tsx +++ b/apps/desktop/src/renderer/components/lanes/LaneGitActionsPane.test.tsx @@ -321,6 +321,12 @@ describe("LaneGitActionsPane rescue action", () => { path: ".claude/worktrees/fix-session-auto-naming", }); }); + await waitFor(() => { + expect(mockStoreState.refreshLanes).toHaveBeenCalledWith({ + includeStatus: true, + includeSnapshots: true, + }); + }); await waitFor(() => { expect(screen.queryByText(".claude/worktrees/fix-session-auto-naming")).toBeNull(); }); diff --git a/apps/desktop/src/renderer/components/lanes/LaneGitActionsPane.tsx b/apps/desktop/src/renderer/components/lanes/LaneGitActionsPane.tsx index 487bd9e26..3acf3a868 100644 --- a/apps/desktop/src/renderer/components/lanes/LaneGitActionsPane.tsx +++ b/apps/desktop/src/renderer/components/lanes/LaneGitActionsPane.tsx @@ -717,6 +717,14 @@ export function LaneGitActionsPane({ } }; + const refreshLaneGitState = useCallback(async (targetLaneId: string | null) => { + await Promise.all([ + refreshChanges(targetLaneId), + refreshLanes({
includeStatus: true, includeSnapshots: true }), + refreshGitMeta(targetLaneId), + ]); + }, [refreshChanges, refreshGitMeta, refreshLanes]); + const refreshAll = async (options?: { fetchRemote?: boolean }, targetLaneId: string | null = laneId) => { if (targetLaneId && options?.fetchRemote) { try { @@ -725,7 +733,7 @@ export function LaneGitActionsPane({ // best effort } } - await Promise.all([refreshChanges(targetLaneId), refreshLanes(), refreshGitMeta(targetLaneId)]); + await refreshLaneGitState(targetLaneId); if (isViewingLane(targetLaneId)) { setCommitTimelineKey((prev) => prev + 1); } @@ -843,13 +851,13 @@ export function LaneGitActionsPane({ }; const completeCommitRefresh = useCallback(async (targetLaneId: string) => { - await Promise.all([refreshChanges(targetLaneId), refreshLanes(), refreshGitMeta(targetLaneId)]); + await refreshLaneGitState(targetLaneId); if (isViewingLane(targetLaneId)) { setCommitTimelineKey((prev) => prev + 1); setCommitMessage(""); setAmendCommit(false); } - }, [isViewingLane, refreshChanges, refreshGitMeta, refreshLanes]); + }, [isViewingLane, refreshLaneGitState]); const submitCommit = useCallback(async () => { if (!laneId || (!hasStaged && !amendCommit) || busyAction != null) return; diff --git a/apps/desktop/src/renderer/components/lanes/LanesPage.test.ts b/apps/desktop/src/renderer/components/lanes/LanesPage.test.ts index e4b36c523..6010fb22b 100644 --- a/apps/desktop/src/renderer/components/lanes/LanesPage.test.ts +++ b/apps/desktop/src/renderer/components/lanes/LanesPage.test.ts @@ -10,6 +10,7 @@ import { selectLanePrTag, sortLaneListRows, } from "./lanePageModel"; +import { shouldMountGitActionsPane } from "./LanesPage"; import type { GitHubPrListItem, LaneSummary, PrSummary } from "../../../shared/types"; type LanePrTarget = Pick; @@ -385,3 +386,31 @@ describe("sortLaneListRows", () => { expect(result.map((lane) => lane.id)).toEqual(["running-pinned", "running-a"]); }); }); + +describe("shouldMountGitActionsPane", () => { + 
it("mounts one Git Actions pane owner when a lane is expanded", () => { + expect(shouldMountGitActionsPane({ + laneId: "lane-1", + expandedGitActionsLaneId: "lane-1", + surface: "inline", + })).toBe(false); + + expect(shouldMountGitActionsPane({ + laneId: "lane-1", + expandedGitActionsLaneId: "lane-1", + surface: "git-actions-fullscreen", + })).toBe(true); + + expect(shouldMountGitActionsPane({ + laneId: "lane-2", + expandedGitActionsLaneId: "lane-1", + surface: "inline", + })).toBe(true); + + expect(shouldMountGitActionsPane({ + laneId: "lane-1", + expandedGitActionsLaneId: null, + surface: "inline", + })).toBe(true); + }); +}); diff --git a/apps/desktop/src/renderer/components/lanes/LanesPage.tsx b/apps/desktop/src/renderer/components/lanes/LanesPage.tsx index 65c62ed81..0fb70196b 100644 --- a/apps/desktop/src/renderer/components/lanes/LanesPage.tsx +++ b/apps/desktop/src/renderer/components/lanes/LanesPage.tsx @@ -89,6 +89,20 @@ type RebaseScopePromptState = { resolve: (scope: RebaseScope | null) => void; }; +type LanePaneSurface = "inline" | "git-actions-fullscreen" | "lane-fullscreen"; + +export function shouldMountGitActionsPane({ + laneId, + expandedGitActionsLaneId, + surface, +}: { + laneId: string | null; + expandedGitActionsLaneId: string | null; + surface: LanePaneSurface; +}): boolean { + return surface !== "inline" || !laneId || expandedGitActionsLaneId !== laneId; +} + type RebasePushReviewState = { runId: string; lanes: Array<{ laneId: string; laneName: string; selected: boolean }>; @@ -872,7 +886,8 @@ export function LanesPage() { let timer: ReturnType | null = null; const refreshRuntimeOnly = () => refreshLanes({ - includeStatus: true, + includeStatus: false, + includeSnapshots: true, includeConflictStatus: false, includeRebaseSuggestions: false, includeAutoRebaseStatus: false, @@ -2108,13 +2123,18 @@ export function LanesPage() { /* ---- Pane configs ---- */ - const getPaneConfigs = useCallback((laneId: string | null) => { + const getPaneConfigs 
= useCallback((laneId: string | null, surface: LanePaneSurface = "inline") => { const laneDetail = laneId ? lanePaneDetails[laneId] ?? EMPTY_LANE_PANE_DETAIL : EMPTY_LANE_PANE_DETAIL; const laneSnapshot = laneId ? laneSnapshotByLaneId.get(laneId) ?? null : null; const pendingLinearIssueContext = laneId && linearIssueChatContextRequest?.laneId === laneId ? linearIssueChatContextRequest : null; + const mountGitActionsPane = shouldMountGitActionsPane({ + laneId, + expandedGitActionsLaneId, + surface, + }); return { "git-actions": { title: "Git Actions", @@ -2143,7 +2163,7 @@ export function LanesPage() { ), bodyClassName: "overflow-hidden", - children: ( + children: mountGitActionsPane ? ( handleClearLanePaneDetailSelection(laneId) : undefined} /> - ) + ) : null }, "work": { title: "Work", @@ -3272,7 +3292,7 @@ export function LanesPage() {
@@ -3294,7 +3314,7 @@ export function LanesPage() {
@@ -3328,7 +3348,7 @@ export function LanesPage() { setActiveLaneIds(allIds); }} onBatchManage={openBatchManage} - onAppearanceChanged={() => refreshLanes().catch(() => {})} + onAppearanceChanged={() => refreshLanes({ includeStatus: false }).catch(() => {})} /> ) : null} @@ -3359,7 +3379,7 @@ export function LanesPage() { }} onArchive={() => { archiveManagedLanes().catch(() => {}); }} onDelete={() => { deleteManagedLanes().catch(() => {}); }} - onAppearanceChanged={() => refreshLanes().catch(() => {})} + onAppearanceChanged={() => refreshLanes({ includeStatus: false }).catch(() => {})} /> diff --git a/apps/desktop/src/renderer/state/appStore.test.ts b/apps/desktop/src/renderer/state/appStore.test.ts index 6b82cb4d6..1d62d1ff2 100644 --- a/apps/desktop/src/renderer/state/appStore.test.ts +++ b/apps/desktop/src/renderer/state/appStore.test.ts @@ -286,9 +286,11 @@ describe("appStore", () => { expect(useAppStore.getState().lanes).toEqual([snapshots[0].lane]); }); - it("refreshLanes can request the cheaper snapshot bootstrap path", async () => { - const lanes = [{ id: "lane-lite", name: "Lane lite" }] as any[]; - (window.ade.lanes.list as any).mockResolvedValueOnce(lanes); + it("refreshLanes can request the cheaper lane-list path while preserving prior git status", async () => { + useAppStore.setState({ + lanes: [{ id: "lane-lite", name: "Lane lite", status: { dirty: true }, parentStatus: { ahead: 1 } }] as any[], + }); + (window.ade.lanes.list as any).mockResolvedValueOnce([{ id: "lane-lite", name: "Lane lite", color: "#7dd3fc" }] as any[]); await useAppStore.getState().refreshLanes({ includeStatus: false }); @@ -297,9 +299,73 @@ describe("appStore", () => { includeStatus: false, }); expect(window.ade.lanes.listSnapshots).not.toHaveBeenCalled(); + expect(useAppStore.getState().lanes[0]).toEqual( + expect.objectContaining({ + id: "lane-lite", + color: "#7dd3fc", + status: { dirty: true }, + parentStatus: { ahead: 1 }, + }), + ); + }); + + it("refreshLanes can update 
lane git status without snapshot decorations", async () => { + const lanes = [{ id: "lane-status", name: "Lane status", status: { dirty: true } }] as any[]; + (window.ade.lanes.list as any).mockResolvedValueOnce(lanes); + + await useAppStore.getState().refreshLanes({ includeStatus: true, includeSnapshots: false }); + + expect(window.ade.lanes.list).toHaveBeenCalledWith({ + includeArchived: false, + includeStatus: true, + }); + expect(window.ade.lanes.listSnapshots).not.toHaveBeenCalled(); expect(useAppStore.getState().lanes).toEqual(lanes); }); + it("refreshLanes can update runtime snapshots without recomputing lane git status", async () => { + useAppStore.setState({ + lanes: [{ id: "lane-1", name: "Lane 1", status: { dirty: true }, parentStatus: { dirty: false } }] as any[], + }); + const snapshots = [ + { + lane: { id: "lane-1", name: "Lane 1", status: { dirty: false }, parentStatus: null }, + runtime: { + bucket: "running", + runningCount: 1, + awaitingInputCount: 0, + endedCount: 0, + sessionCount: 1, + }, + rebaseSuggestion: null, + autoRebaseStatus: null, + conflictStatus: null, + stateSnapshot: null, + adoptableAttached: false, + }, + ] as any[]; + (window.ade.lanes.listSnapshots as any).mockResolvedValueOnce(snapshots); + + await useAppStore.getState().refreshLanes({ + includeStatus: false, + includeSnapshots: true, + includeConflictStatus: false, + includeRebaseSuggestions: false, + includeAutoRebaseStatus: false, + }); + + expect(window.ade.lanes.listSnapshots).toHaveBeenCalledWith({ + includeArchived: false, + includeStatus: false, + includeConflictStatus: false, + includeRebaseSuggestions: false, + includeAutoRebaseStatus: false, + }); + expect(useAppStore.getState().laneSnapshots[0].runtime.bucket).toBe("running"); + expect(useAppStore.getState().lanes[0].status).toEqual({ dirty: true }); + expect(useAppStore.getState().laneSnapshots[0].lane).toBe(useAppStore.getState().lanes[0]); + }); + it("refreshLanes can skip conflict status for cheaper warmup 
snapshots", async () => { (window.ade.lanes.listSnapshots as any).mockResolvedValueOnce([]); @@ -342,11 +408,17 @@ describe("appStore", () => { }); }); - it("refreshLanes preserves compatible lane snapshots during lightweight refresh", async () => { + it("refreshLanes syncs retained snapshots to statusless lane metadata during lightweight refresh", async () => { useAppStore.setState({ laneSnapshots: [ { - lane: { id: "lane-1", name: "Lane 1" }, + lane: { + id: "lane-1", + name: "Lane 1", + color: "#0f172a", + status: { dirty: true }, + parentStatus: { ahead: 1 }, + }, runtime: { bucket: "running", runningCount: 1, @@ -377,13 +449,22 @@ describe("appStore", () => { }, ] as any[], }); - (window.ade.lanes.list as any).mockResolvedValueOnce([{ id: "lane-1", name: "Lane 1" }] as any[]); + (window.ade.lanes.list as any).mockResolvedValueOnce([ + { id: "lane-1", name: "Lane 1", color: "#7dd3fc" }, + ] as any[]); await useAppStore.getState().refreshLanes({ includeStatus: false }); + expect(window.ade.lanes.listSnapshots).not.toHaveBeenCalled(); expect(useAppStore.getState().laneSnapshots).toEqual([ expect.objectContaining({ - lane: expect.objectContaining({ id: "lane-1" }), + lane: expect.objectContaining({ + id: "lane-1", + color: "#7dd3fc", + status: { dirty: true }, + parentStatus: { ahead: 1 }, + }), + runtime: expect.objectContaining({ bucket: "running" }), }), ]); }); diff --git a/apps/desktop/src/renderer/state/appStore.ts b/apps/desktop/src/renderer/state/appStore.ts index 0e7e0f29e..365aa530f 100644 --- a/apps/desktop/src/renderer/state/appStore.ts +++ b/apps/desktop/src/renderer/state/appStore.ts @@ -618,6 +618,7 @@ type AppState = { refreshProject: () => Promise; refreshLanes: (options?: { includeStatus?: boolean; + includeSnapshots?: boolean; includeConflictStatus?: boolean; includeRebaseSuggestions?: boolean; includeAutoRebaseStatus?: boolean; @@ -632,6 +633,7 @@ export type LaneInspectorTab = "terminals" | "context" | "stack" | "merge"; type 
LaneRefreshRequest = { includeStatus: boolean; + includeSnapshots: boolean; includeConflictStatus: boolean; includeRebaseSuggestions: boolean; includeAutoRebaseStatus: boolean; @@ -647,28 +649,43 @@ let pendingLaneRefreshRequest: LaneRefreshRequest | null = null; function normalizeLaneRefreshRequest(options?: { includeStatus?: boolean; + includeSnapshots?: boolean; includeConflictStatus?: boolean; includeRebaseSuggestions?: boolean; includeAutoRebaseStatus?: boolean; }): LaneRefreshRequest { const includeStatus = options?.includeStatus ?? true; + const includeSnapshots = options?.includeSnapshots ?? includeStatus; return { includeStatus, - includeConflictStatus: includeStatus && (options?.includeConflictStatus ?? true), - includeRebaseSuggestions: includeStatus && (options?.includeRebaseSuggestions ?? true), - includeAutoRebaseStatus: includeStatus && (options?.includeAutoRebaseStatus ?? true), + includeSnapshots, + includeConflictStatus: includeSnapshots && (options?.includeConflictStatus ?? true), + includeRebaseSuggestions: includeSnapshots && (options?.includeRebaseSuggestions ?? true), + includeAutoRebaseStatus: includeSnapshots && (options?.includeAutoRebaseStatus ?? true), }; } function mergeLaneRefreshRequests(current: LaneRefreshRequest, next: LaneRefreshRequest): LaneRefreshRequest { return { includeStatus: current.includeStatus || next.includeStatus, + includeSnapshots: current.includeSnapshots || next.includeSnapshots, includeConflictStatus: current.includeConflictStatus || next.includeConflictStatus, includeRebaseSuggestions: current.includeRebaseSuggestions || next.includeRebaseSuggestions, includeAutoRebaseStatus: current.includeAutoRebaseStatus || next.includeAutoRebaseStatus, }; } +function withPreservedLaneStatus( + lane: LaneSummary, + previousLanesById: Map, + previousSnapshotsById: Map, +): LaneSummary { + const previousLane = previousLanesById.get(lane.id) ?? previousSnapshotsById.get(lane.id)?.lane; + return previousLane + ? 
{ ...lane, status: previousLane.status, parentStatus: previousLane.parentStatus } + : lane; +} + function scheduleProjectHydration(get: () => AppState) { if (warmupTimer != null) { window.clearTimeout(warmupTimer); @@ -951,21 +968,31 @@ export const useAppStore = create((set, get) => ({ const runRefresh = async (currentRequest: LaneRefreshRequest) => { const requestedProjectKey = normalizeProjectKey(get().project?.rootPath); const token = ++laneRefreshVersion; - const laneSnapshots = currentRequest.includeStatus + const previousLanesById = new Map(get().lanes.map((lane) => [lane.id, lane] as const)); + const previousSnapshotsById = new Map(get().laneSnapshots.map((snapshot) => [snapshot.lane.id, snapshot] as const)); + const rawLaneSnapshots = currentRequest.includeSnapshots ? await window.ade.lanes.listSnapshots({ includeArchived: false, - includeStatus: true, + includeStatus: currentRequest.includeStatus, includeConflictStatus: currentRequest.includeConflictStatus, includeRebaseSuggestions: currentRequest.includeRebaseSuggestions, includeAutoRebaseStatus: currentRequest.includeAutoRebaseStatus, }) : null; - const lanes = laneSnapshots != null + const laneSnapshots = rawLaneSnapshots?.map((snapshot) => { + if (currentRequest.includeStatus) return snapshot; + const lane = withPreservedLaneStatus(snapshot.lane, previousLanesById, previousSnapshotsById); + return lane === snapshot.lane ? snapshot : { ...snapshot, lane }; + }) ?? null; + const rawLanes = laneSnapshots != null ? laneSnapshots.map((snapshot) => snapshot.lane) : await window.ade.lanes.list({ includeArchived: false, - includeStatus: false, + includeStatus: currentRequest.includeStatus, }); + const lanes = laneSnapshots != null || currentRequest.includeStatus + ? 
rawLanes + : rawLanes.map((lane) => withPreservedLaneStatus(lane, previousLanesById, previousSnapshotsById)); // Discard stale response: a newer refresh was issued while this one was in-flight if (token !== laneRefreshVersion) { return; @@ -993,9 +1020,15 @@ export const useAppStore = create((set, get) => ({ nextLaneWorkViews[scopeKey] = viewState; } } + const lanesById = new Map(lanes.map((lane) => [lane.id, lane] as const)); const nextSnapshots: LaneListSnapshot[] = laneSnapshots ?? - prev.laneSnapshots.filter((snapshot) => allowed.has(snapshot.lane.id)); + prev.laneSnapshots + .filter((snapshot) => allowed.has(snapshot.lane.id)) + .map((snapshot) => { + const nextLane = lanesById.get(snapshot.lane.id); + return nextLane ? { ...snapshot, lane: nextLane } : snapshot; + }); persistWorkViewState({ workViewByProject: prev.workViewByProject, laneWorkViewByScope: nextLaneWorkViews, @@ -1015,6 +1048,7 @@ export const useAppStore = create((set, get) => ({ const activeSatisfies = activeRequest != null && (activeRequest.includeStatus || !request.includeStatus) + && (activeRequest.includeSnapshots || !request.includeSnapshots) && (activeRequest.includeConflictStatus || !request.includeConflictStatus) && (activeRequest.includeRebaseSuggestions || !request.includeRebaseSuggestions) && (activeRequest.includeAutoRebaseStatus || !request.includeAutoRebaseStatus); diff --git a/docs/ARCHITECTURE.md b/docs/ARCHITECTURE.md index 310052c14..8e180ee4a 100644 --- a/docs/ARCHITECTURE.md +++ b/docs/ARCHITECTURE.md @@ -119,7 +119,7 @@ The desktop app is a **client of the runtime**. It owns a trusted main process, | Directory | Role | |-----------|------| | `apps/desktop/src/main/` | Node process with full OS access. Hosts windows, registers IPC handlers, routes runtime-backed APIs through local/remote runtime pools, spawns the local runtime daemon when needed, and runs the legacy in-process services that have not yet been migrated to the runtime. Entry: `main.ts`. 
| -| `apps/desktop/src/preload/` | Typed bridge. Entry: `preload.ts`. Uses `contextBridge.exposeInMainWorld("ade", { ... })`. Runtime-backed APIs route through `LocalRuntimeConnectionPool` (local) or `RemoteConnectionPool` (SSH-bound window). | +| `apps/desktop/src/preload/` | Typed bridge. Entry: `preload.ts`. Uses `contextBridge.exposeInMainWorld("ade", { ... })`. Runtime-backed APIs route through `LocalRuntimeConnectionPool` (local) or `RemoteConnectionPool` (SSH-bound window); when `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1`, local-bound windows skip the daemon/event pump and use guarded in-process IPC fallbacks. | | `apps/desktop/src/renderer/` | React 18 SPA. No Node access, no filesystem access, no direct process/network. Everything goes through `window.ade`. Entry: `main.tsx`. | | `apps/desktop/src/shared/` | Types, IPC channel constants (`ipc.ts`), model registry (`modelRegistry.ts`), keybindings, and other DTOs. Imported by both desktop and `apps/ade-cli`. New runtime-facing types live in `shared/types/remoteRuntime.ts` and `shared/types/core.ts`. | | `apps/desktop/src/generated/` | Build-time generated code (e.g., bootstrap SQL snapshots). | @@ -130,7 +130,7 @@ The desktop app is a **client of the runtime**. It owns a trusted main process, **Runtime binding pools.** -- `apps/desktop/src/main/services/localRuntime/localRuntimeConnectionPool.ts` — desktop-side client for the local `ade serve` daemon. Spawns or attaches to the machine socket, registers local projects with `projects.add`, dispatches local runtime actions, and best-effort installs the background service in packaged builds. +- `apps/desktop/src/main/services/localRuntime/localRuntimeConnectionPool.ts` — desktop-side client for the local `ade serve` daemon. Spawns or attaches to the machine socket, registers local projects with `projects.add`, dispatches local runtime actions, and best-effort installs the background service in packaged builds. 
`ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1` is a development/diagnostic escape hatch: preload does not pump local runtime events or issue local runtime actions, and main-process sync IPC returns a standalone unavailable snapshot or no-ops lane-presence updates instead of spawning the daemon. - `apps/desktop/src/main/services/remoteRuntime/` — SSH-bound runtime pool. `remoteTargetRegistry.ts` stores saved machines under `~/.ade/secrets/remote-machines.json`; `sshTransport.ts` handles ssh-agent / key based transport; `remoteBootstrap.ts` does first-connect runtime upload + version negotiation against the bundled `ade-` binary; `remoteConnectionPool.ts` keeps the per-window remote runtime binding alive with reconnect / eviction; `runtimeRpcClient.ts` is the JSON-RPC client; `runtimeDiscovery.ts` discovers reachable runtimes on the network. Build outputs (configured in `apps/desktop/tsup.config.ts`): @@ -521,13 +521,14 @@ On startup the main process also invokes `recoverManagedOpenCodeOrphans({ force: | Pane layouts | `react-resizable-panels`, in-house `PaneTilingLayout` | | Virtualization | `@tanstack/react-virtual` | -Electron renderer runtime does **not** wrap the app in `React.StrictMode`. Browser-mock development (outside Electron) still uses Strict Mode. +Electron renderer runtime does **not** wrap the app in `React.StrictMode`. Browser-mock development (outside Electron) still uses Strict Mode. The app uses `BrowserRouter` on normal `http(s)` origins and `HashRouter` inside Electron/file-like contexts; `App.tsx` also bridges legacy `#/route` fragments into BrowserRouter paths so old ADE deep links keep working in the browser-hosted dev shell. ### 7.2 Global store -`apps/desktop/src/renderer/state/appStore.ts` (~868 lines) — Zustand store holding project, lanes, selected lane, theme, provider mode, keybindings, per-project work-view state. 
Patterns: +`apps/desktop/src/renderer/state/appStore.ts` (~1,325 lines) — Zustand store holding project, lanes, selected lane, theme, provider mode, keybindings, per-project work-view state. Patterns: - Narrow selectors on components to minimize re-renders. +- `refreshLanes` accepts independent lane-status and lane-snapshot flags. Callers can refresh cheap runtime snapshot decorations without recomputing git status, or update git status without rebuilding conflict/rebase/auto-rebase overlays; statusless refreshes preserve the previous `LaneStatus`/`parentStatus` in store so the UI does not flicker to unknown git state. - Per-project work-view state keyed by project root (`WorkProjectViewState`). Includes the right-edge Work sidebar fields `workSidebarOpen`, `workSidebarTab` (`"git" | "files" | "ios" | "app-control" | "browser"`), and `workSidebarWidthPct` (clamped 26–55) — persisted alongside the rest of the work-view state under `ade.workViewState.v1`. The sidebar consolidates lane-scoped tools that were previously split across separate floating panes; per-chat iOS / App Control drawers still exist on `AgentChatPane` but are suppressed when the chat is mounted as a Work tile so the sidebar owns those surfaces at lane scope. The `browser` tab is the only sidebar tab that is not lane-scoped — the built-in browser is one shared instance per app. - Store-owned event subscriptions for high-frequency streams (e.g., missions). - `projectRevision` is a monotonically incrementing counter bumped inside `setProject` whenever the active project root actually changes. Long-lived renderer-side caches (most notably the module-level xterm runtime cache in `TerminalView.tsx`) subscribe to it and tear down any entries whose `projectRoot`/`projectRevision` no longer match, so PTYs never bleed between projects. All project-transition paths (`refreshProject`, `openRepo`, `switchProjectToPath`, `closeProject`) go through `setProject` to keep the counter honest. 
diff --git a/docs/features/lanes/README.md b/docs/features/lanes/README.md index de91b030c..5291a292e 100644 --- a/docs/features/lanes/README.md +++ b/docs/features/lanes/README.md @@ -24,15 +24,17 @@ remote-bound windows. The legacy in-process `laneService.ts` still exists on the desktop main process as a fallback target so older callers and tests keep working — preload calls the runtime first via `callProjectRuntimeActionOr("lane", …)` and only invokes the local IPC -handler if no runtime is bound. For remote-bound windows the worktree is -created on the remote machine; the desktop renders the same UX but the -git operations, file watchers, PTYs, and processes execute on the remote -host. The desktop main process keeps a thin `laneListSnapshotService.ts` -helper for assembling per-window lane snapshots that overlay sync -presence on top of runtime-supplied lane summaries. Multi-window: each -desktop window has its own project binding, so a lane-creation request -in window A targets window A's runtime (local or remote) regardless of -what window B is bound to. +handler if no runtime is bound. When `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1` +is set for local development/diagnostics, preload skips the local daemon +route entirely and goes straight to those in-process IPC fallbacks. For +remote-bound windows the worktree is created on the remote machine; the +desktop renders the same UX but the git operations, file watchers, PTYs, +and processes execute on the remote host. The desktop main process keeps +a thin `laneListSnapshotService.ts` helper for assembling per-window lane +snapshots that overlay sync presence on top of runtime-supplied lane +summaries. Multi-window: each desktop window has its own project binding, +so a lane-creation request in window A targets window A's runtime (local +or remote) regardless of what window B is bound to. 
## Source file map @@ -67,7 +69,7 @@ Renderer components: | File | Responsibility | |------|---------------| -| `renderer/components/lanes/LanesPage.tsx` | 3-pane cockpit, tab management, dialog coordination. Each lane row in the lane list optionally renders a state-aware PR tag (`PR #N` / `DRAFT #N` / `MERGED #N` / `CLOSED #N`) when the lane's current branch matches an existing PR. The pure selectors in `lanePageModel.ts` prefer ADE-linked PR rows, then fall back to `prs.getGitHubSnapshot().repoPullRequests` so merged or externally created PRs stay visible by branch match; linked PRs route to the PR workspace, while unlinked GitHub-only matches open externally. Lane delete kicks off optimistically: the page subscribes to `lanes.delete.event`, tracks per-lane `LaneDeleteProgress` in `deleteProgressByLaneId`, immediately closes the manage dialog, and excludes deleting lanes from the selectable lane id sets used by keyboard navigation (`selectableFilteredLaneIds`, `sortedSelectableLaneIds`). Lane tabs for deleting lanes render a non-interactive overlay with a spinning `CircleNotch` and a `Deleting` / `Deleted` label; selection / pinning / context menu / split / git-actions surfaces are all suppressed for those rows. `resolveLaneDeleteStartSelection` (also used by tests) computes a fallback selection so the user is moved to the next available lane the moment delete starts, and a top-bar "Lane action failed" chip surfaces any failure or cancellation through `laneActionError`. | +| `renderer/components/lanes/LanesPage.tsx` | 3-pane cockpit, tab management, dialog coordination. Each lane row in the lane list optionally renders a state-aware PR tag (`PR #N` / `DRAFT #N` / `MERGED #N` / `CLOSED #N`) when the lane's current branch matches an existing PR. 
The pure selectors in `lanePageModel.ts` prefer ADE-linked PR rows, then fall back to `prs.getGitHubSnapshot().repoPullRequests` so merged or externally created PRs stay visible by branch match; linked PRs route to the PR workspace, while unlinked GitHub-only matches open externally. Runtime activity refreshes use `refreshLanes({ includeStatus: false, includeSnapshots: true, ... })` so PTY/chat/process buckets update without recomputing git status. Expanding Git Actions suppresses the hidden inline duplicate pane via `shouldMountGitActionsPane` while keeping the fullscreen pane mounted. Lane delete kicks off optimistically: the page subscribes to `lanes.delete.event`, tracks per-lane `LaneDeleteProgress` in `deleteProgressByLaneId`, immediately closes the manage dialog, and excludes deleting lanes from the selectable lane id sets used by keyboard navigation (`selectableFilteredLaneIds`, `sortedSelectableLaneIds`). Lane tabs for deleting lanes render a non-interactive overlay with a spinning `CircleNotch` and a `Deleting` / `Deleted` label; selection / pinning / context menu / split / git-actions surfaces are all suppressed for those rows. `resolveLaneDeleteStartSelection` (also used by tests) computes a fallback selection so the user is moved to the next available lane the moment delete starts, and a top-bar "Lane action failed" chip surfaces any failure or cancellation through `laneActionError`. | | `renderer/components/lanes/lanePageModel.ts` | Pure lane-page selectors and URL/deletion helpers used by `LanesPage` and unit tests. Owns lane branch/PR matching, ADE-vs-GitHub PR tag precedence, deep-link lane selection, create-lane request normalization, and delete-start selection fallback. | | `renderer/components/lanes/laneUtils.ts` | Pure lane list/filter helpers plus default pane trees, including the work-focused tiling tree used by parallel chat launch deep links. 
| | `renderer/components/lanes/laneColorPalette.ts` | Curated 12-swatch lane color palette (`LANE_COLOR_PALETTE`) plus helpers (`getLaneAccent`, `colorsInUse`, `nextAvailableColor`, `laneColorName`). The first 8 hexes form `LANE_FALLBACK_COLORS`, the legacy index-based fallback used for lanes that don't have an explicit color assigned. | @@ -76,7 +78,7 @@ Renderer components: | `renderer/components/lanes/LaneContextMenu.tsx` | Right-click menu on the lane list. Hosts the inline color swatch row that calls `lanes.updateAppearance` directly, "Reveal/Copy path", manage/adopt/open-in-Run actions, split-tab actions, and batch manage. | | `renderer/components/lanes/LaneStackPane.tsx` | Stack graph sidebar, integration source chips, canvas jump | | `renderer/components/lanes/LaneDiffPane.tsx` | Lane diff list + per-file stage/unstage/discard; file content uses shared `AdeDiffViewer` (commit comparisons read-only; working-tree file can be editable when unstaged) | -| `renderer/components/lanes/LaneGitActionsPane.tsx` | Commit, stash, fetch, sync, push, recent commits. Seeds its `autoRebaseStatus` from the `autoRebaseStatusSnapshot` prop that `LanesPage` passes from the lane list (`laneSnapshot.autoRebaseStatus`), so opening a lane does not trigger a per-lane probe. A fallback `refreshAutoRebaseStatus` runs only when the snapshot is `undefined`, after a 3.5 s delay, and only while the document is visible. | +| `renderer/components/lanes/LaneGitActionsPane.tsx` | Commit, stash, fetch, sync, push, recent commits. After commit/stash operations it refreshes changes, lane git status, and git metadata while skipping snapshot decorations (`refreshLanes({ includeStatus: true, includeSnapshots: false })`). Seeds its `autoRebaseStatus` from the `autoRebaseStatusSnapshot` prop that `LanesPage` passes from the lane list (`laneSnapshot.autoRebaseStatus`), so opening a lane does not trigger a per-lane probe. 
A fallback `refreshAutoRebaseStatus` runs only when the snapshot is `undefined`, after a 3.5 s delay, and only while the document is visible. | | `renderer/components/lanes/LaneWorkPane.tsx` | Terminal/chat toggle work surface | | `renderer/components/lanes/LaneRebaseBanner.tsx` | Inline banner driven by `rebaseSuggestionService` | | `renderer/components/lanes/LaneEnvInitProgress.tsx` | Env init step progress inside create dialog | @@ -414,7 +416,10 @@ open lanes; primary lanes render with a home icon. (`LaneTerminalsPanel`) and an agent chat view (`AgentChatPane`). Chat sessions inherit `cwd = lane.worktreePath`. - The Lanes page reads pane overlay data from `appStore` (`lanes`, - `refreshLanes`) and from the per-lane `useLaneWorkSessions` hook. + `laneSnapshots`, `refreshLanes`) and from the per-lane + `useLaneWorkSessions` hook. `refreshLanes` can refresh lane rows, + git status, and snapshot overlays independently; statusless refreshes + preserve the previous git status in store. - `LaneRuntimeBar` (Run page) renders lane runtime state: health dot, proxy/preview status, OAuth callback URL, active processes. It parallelizes six IPC calls and debounces via an in-flight sequence diff --git a/docs/features/remote-runtime/README.md b/docs/features/remote-runtime/README.md index ca41ee3c6..90805cf4c 100644 --- a/docs/features/remote-runtime/README.md +++ b/docs/features/remote-runtime/README.md @@ -18,7 +18,9 @@ The wire transport is the same JSON-RPC the local daemon answers. The remote-run confirmation dialog before opening a remote project, surfaces local matches with uncommitted changes. - `apps/desktop/src/preload/preload.ts` — routes runtime-backed renderer APIs to - local or remote JSON-RPC actions based on the active project binding. + local or remote JSON-RPC actions based on the active project binding. When + `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1`, local-bound windows skip local runtime + actions and event polling and use guarded Electron IPC fallbacks. 
- `apps/ade-cli/src/multiProjectRpcServer.ts` — runtime-level project catalog and sync methods plus project-scoped action dispatch. - `apps/ade-cli/src/services/projects/` — machine project registry and @@ -91,7 +93,7 @@ After install, the headless machine can already serve clients. Desktop ADE on a Remote project bindings route lanes, agent chat, PTYs, terminal IO, file operations, file-watch notifications, git actions, PR actions, PR queue automation, PR AI conflict-resolution sessions, PR issue-resolution launch flows, Path to Merge orchestration, AI PR summaries, issue inventory, and event streaming through the remote runtime. Agent CLI failures (Claude / Codex / Cursor / Droid not installed or not authenticated) surface as inline `AgentCliAuthCard` cards in chat; the install / login buttons open a tracked terminal in the active runtime, so a remote project runs the install or login command on the remote machine. -Local project bindings prefer the local `ade serve` daemon for the same surfaces — agent chat, session history, PTYs, terminal reads/writes, file operations and watchers, diffs, lanes, PRs, PR queues, PR issue-resolution launch flows, Path to Merge, PR AI conflict-resolution sessions, issue inventory, tests, processes, project config, and most git operations. The legacy in-process Electron services remain only as a guarded fallback while the last IPC surfaces are migrated. +Local project bindings prefer the local `ade serve` daemon for the same surfaces — agent chat, session history, PTYs, terminal reads/writes, file operations and watchers, diffs, lanes, PRs, PR queues, PR issue-resolution launch flows, Path to Merge, PR AI conflict-resolution sessions, issue inventory, tests, processes, project config, and most git operations. The legacy in-process Electron services remain only as a guarded fallback while the last IPC surfaces are migrated. 
Setting `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1` disables that local daemon path for development/diagnostics: preload avoids local runtime action calls and the event pump, and desktop sync IPC reports a standalone unavailable snapshot instead of starting the daemon. Memory and embedding features are disabled for remote runtimes in v1. The static remote runtime does not bundle `onnxruntime-node`. @@ -99,7 +101,7 @@ Memory and embedding features are disabled for remote runtimes in v1. The static iOS does not SSH into a machine. The phone connects to the runtime daemon's sync WebSocket advertised on the LAN or over a Tailscale tailnet. Install Tailscale on the phone and the ADE machine when they are not on the same local network. -On desktop, phone pairing and sync status are managed by the local `ade serve` daemon. The legacy in-process desktop sync host is disabled by default and can be re-enabled only for diagnostics with `ADE_ENABLE_DESKTOP_SYNC_HOST=1`. +On desktop, phone pairing and sync status are managed by the local `ade serve` daemon. If the local daemon is disabled with `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1`, sync status remains visible as a standalone unavailable snapshot and lane-presence updates no-op. The legacy in-process desktop sync host is disabled by default and can be re-enabled only for diagnostics with `ADE_ENABLE_DESKTOP_SYNC_HOST=1`. 
## Troubleshooting diff --git a/docs/features/remote-runtime/internal-architecture.md b/docs/features/remote-runtime/internal-architecture.md index 3edff1d12..0c6e292b2 100644 --- a/docs/features/remote-runtime/internal-architecture.md +++ b/docs/features/remote-runtime/internal-architecture.md @@ -6,7 +6,7 @@ Remote runtime support is built on the same JSON-RPC runtime the local `ade serv `OpenProjectBinding` records the active runtime for a window: -- `kind: "local"` — actions go through `LocalRuntimeConnectionPool`, which connects to the machine socket (`~/.ade/sock/ade.sock`) and spawns `ade serve` if it is not running. +- `kind: "local"` — actions normally go through `LocalRuntimeConnectionPool`, which connects to the machine socket (`~/.ade/sock/ade.sock`) and spawns `ade serve` if it is not running. With `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1`, preload treats the local runtime route as unavailable and falls back to Electron IPC without spawning or polling the daemon. - `kind: "remote"` — actions go through `RemoteConnectionPool` keyed by `{ targetId, projectId }`. The binding is established when a project is opened. Local bindings are created from the current desktop project (the desktop calls `LocalRuntimeConnectionPool.ensureProject(rootPath)` to register the project with the daemon and capture its `projectId`). Remote bindings are created by `remoteRuntimeOpenProject` after the selected target is connected and the remote project record is confirmed. @@ -33,7 +33,7 @@ Project-scoped operations are routed through `ade/actions/call` and carry `param `ade/initialize` advertises `runtimeInfo.multiProject: true` and `capabilities.projects: true`. Clients use that to decide whether to send `projectId` per request (multi-project runtime) or treat the runtime as already bound to one project (embedded `ade code --embedded`). `validateRemoteRuntimeInitializeResult` enforces both flags on the remote side and rejects mismatched runtime versions. 
-Runtime event streaming uses `ade/actions/call` with `name: "stream_events"` for one-shot pulls, and `runtimeEvents.subscribe` (with `runtime/event` notifications) for live streaming. For remote bindings the desktop reconnects the SSH transport before re-subscribing, matching normal remote action behavior after disconnects. For local bindings, preload polls the local daemon through `localRuntimeStreamEvents` so daemon-owned chat, terminal, pty, lane, file-watch, process, and test events are delivered through the same renderer fanout used by remote projects. +Runtime event streaming uses `ade/actions/call` with `name: "stream_events"` for one-shot pulls, and `runtimeEvents.subscribe` (with `runtime/event` notifications) for live streaming. For remote bindings the desktop reconnects the SSH transport before re-subscribing, matching normal remote action behavior after disconnects. For local bindings, preload polls the local daemon through `localRuntimeStreamEvents` so daemon-owned chat, terminal, pty, lane, file-watch, process, and test events are delivered through the same renderer fanout used by remote projects. The local event pump is not started when `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1`. ## SSH transport @@ -72,13 +72,13 @@ Before opening a remote project, `remoteRuntimeCheckLocalWork` compares the remo The sync WebSocket host is owned by the `ade serve` daemon in normal desktop operation. `ProjectScopeRegistry.ensureSyncHost` elects the most-recently-opened registered project as the active sync host and re-elects when projects are added or removed. -Desktop sync Settings IPC first talks to the local runtime daemon for status, discovery, device registry, and PIN operations, then falls back to the legacy in-process sync service only when the daemon route is unavailable. The old desktop-host path is guarded by `ADE_ENABLE_DESKTOP_SYNC_HOST=1` for diagnostics and migration debugging. 
+Desktop sync Settings IPC first talks to the local runtime daemon for status, discovery, device registry, and PIN operations, then falls back to the legacy in-process sync service only when the daemon route is unavailable. When `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1`, IPC skips the daemon route; if no in-process sync service is available, status returns a standalone unavailable snapshot and `sync.setActiveLanePresence` no-ops. The old desktop-host path is guarded by `ADE_ENABLE_DESKTOP_SYNC_HOST=1` for diagnostics and migration debugging. The sync command registry labels descriptors as `runtime` or `project` scope. Project-bound hosts reject project-scoped commands that arrive without a matching `projectId`, while runtime-scoped commands operate on the daemon as a whole. This keeps mobile/controller commands explicit in the multi-project runtime. ## Local daemon routing -Local desktop windows go through the runtime binding before falling back to legacy Electron-hosted handlers. `callProjectRuntimeActionOr` and `callProjectRuntimeSyncOr` in `apps/desktop/src/preload/preload.ts` try the runtime path first and fall back to the in-process IPC only on a safe local-runtime fallback error. +Local desktop windows go through the runtime binding before falling back to legacy Electron-hosted handlers. `callProjectRuntimeActionOr` and `callProjectRuntimeSyncOr` in `apps/desktop/src/preload/preload.ts` try the runtime path first and fall back to the in-process IPC only on a safe local-runtime fallback error. The exception is `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1`: preload returns "not handled" for local runtime calls immediately, so local windows use the fallback handlers directly. 
The runtime path covers: From 3541071ef0f4da57deff3ef72b0ef90d016d9094 Mon Sep 17 00:00:00 2001 From: Arul Sharma <31745423+arul28@users.noreply.github.com> Date: Tue, 12 May 2026 03:52:26 -0400 Subject: [PATCH 04/10] Optimize Work tab runtime and tools pane Squash-merge Work tab optimization lane into ade/autoresearch-d23e8745 after automate/finalize and review fixes. --- .agents/skills/ade-autoresearch/SKILL.md | 187 +- .agents/skills/ade-perf-work/SKILL.md | 458 ++++ .../ade-cli/src/multiProjectRpcServer.test.ts | 7 +- apps/ade-cli/src/stdioRpcDaemon.test.ts | 7 +- .../services/git/gitOperationsService.test.ts | 62 +- .../main/services/git/gitOperationsService.ts | 30 +- .../services/github/githubService.test.ts | 25 +- .../src/main/services/github/githubService.ts | 72 +- .../src/main/services/ipc/registerIpc.ts | 12 +- .../usage/usageTrackingService.test.ts | 13 + .../services/usage/usageTrackingService.ts | 36 +- apps/desktop/src/preload/global.d.ts | 4 + apps/desktop/src/preload/preload.ts | 32 +- .../src/renderer/components/app/AppShell.tsx | 36 +- .../components/app/CommandPalette.test.tsx | 21 +- .../renderer/components/app/TopBar.test.tsx | 15 + .../chat/AgentChatComposer.test.tsx | 81 + .../chat/AgentChatMessageList.test.tsx | 190 ++ .../components/chat/AgentChatMessageList.tsx | 12 +- .../AgentChatPane.companionDrawers.test.tsx | 427 ++++ .../chat/ChatAppControlPanel.test.tsx | 296 +++ .../chat/ChatBuiltInBrowserPanel.test.tsx | 57 +- .../components/chat/ChatGitToolbar.test.tsx | 126 + .../chat/ChatIosSimulatorPanel.test.tsx | 399 +++- .../components/chat/ChatIosSimulatorPanel.tsx | 6 + .../chat/ChatSubagentsPanel.test.tsx | 109 + .../chat/ChatTerminalDrawer.test.tsx | 77 +- .../components/chat/ChatWorkLogBlock.tsx | 21 +- .../chat/CursorCloudInlineLaunch.test.tsx | 66 + .../components/files/FilesPage.test.tsx | 58 + .../renderer/components/files/FilesPage.tsx | 52 +- .../components/lanes/CommitTimeline.test.tsx | 38 + 
.../components/lanes/CommitTimeline.tsx | 12 +- .../components/lanes/LaneDiffPane.test.tsx | 142 ++ .../lanes/LaneGitActionsPane.test.tsx | 84 + .../components/lanes/LaneGitActionsPane.tsx | 10 +- .../components/terminals/LaneCombobox.tsx | 4 + .../terminals/MacosVmPanel.test.tsx | 186 ++ .../terminals/SessionContextMenu.tsx | 39 +- .../terminals/SessionListPane.test.tsx | 44 + .../components/terminals/SessionListPane.tsx | 25 +- .../components/terminals/WorkSidebar.tsx | 27 +- .../terminals/WorkStartSurface.test.tsx | 38 + .../terminals/WorkViewArea.test.tsx | 121 +- .../renderer/lib/useGithubProjectRemote.ts | 2 +- apps/desktop/src/shared/ipc.ts | 1 + docs/ARCHITECTURE.md | 4 +- .../files-and-editor/editor-surfaces.md | 7 + docs/features/lanes/README.md | 2 +- .../features/terminals-and-sessions/README.md | 11 +- docs/perf/work-tab-action-inventory.md | 2128 +++++++++++++++++ 51 files changed, 5762 insertions(+), 157 deletions(-) create mode 100644 .agents/skills/ade-perf-work/SKILL.md create mode 100644 apps/desktop/src/renderer/components/chat/AgentChatPane.companionDrawers.test.tsx create mode 100644 apps/desktop/src/renderer/components/chat/ChatAppControlPanel.test.tsx create mode 100644 apps/desktop/src/renderer/components/chat/ChatGitToolbar.test.tsx create mode 100644 apps/desktop/src/renderer/components/chat/ChatSubagentsPanel.test.tsx create mode 100644 apps/desktop/src/renderer/components/chat/CursorCloudInlineLaunch.test.tsx create mode 100644 apps/desktop/src/renderer/components/lanes/CommitTimeline.test.tsx create mode 100644 apps/desktop/src/renderer/components/lanes/LaneDiffPane.test.tsx create mode 100644 apps/desktop/src/renderer/components/terminals/MacosVmPanel.test.tsx create mode 100644 apps/desktop/src/renderer/components/terminals/WorkStartSurface.test.tsx create mode 100644 docs/perf/work-tab-action-inventory.md diff --git a/.agents/skills/ade-autoresearch/SKILL.md b/.agents/skills/ade-autoresearch/SKILL.md index 0a074c396..ccae8ba12 
100644 --- a/.agents/skills/ade-autoresearch/SKILL.md +++ b/.agents/skills/ade-autoresearch/SKILL.md @@ -1,16 +1,17 @@ --- name: ade-autoresearch description: Iteratively optimize an ADE tab's CPU/memory/IPC/render performance. - Runs predefined scenarios, identifies bottlenecks from JSONL metrics, makes ONE - targeted code change per iteration, gates on tests + smoke, keeps wins on a - branch, and distills patterns into per-tab perf skills. Invoke when the user - says "optimize ", "autoresearch ", or "perf pass on ". Drives - ADE pointed at the perf-pass throwaway repo; full liberty inside that repo. - Uses Codex/GPT models for any in-ADE AI activity unless a scenario opts into - Claude. + Drives the real UI, builds tab-specific probes from the visible product and + perf-pass repo, identifies bottlenecks from JSONL metrics, makes ONE targeted + code change per iteration, gates on tests + smoke, keeps wins on a branch, and + distills patterns into per-tab perf skills. Invoke when the user says + "optimize ", "autoresearch ", or "perf pass on ". Drives ADE + pointed at the perf-pass throwaway repo; full liberty inside that repo. Uses + Codex/GPT models for in-ADE AI activity unless the run explicitly opts into + another configured provider for comparison. metadata: author: ADE - version: 0.2.0 + version: 0.3.0 --- # ade-autoresearch @@ -24,7 +25,12 @@ A Karpathy-style autoresearch loop for ADE perf. You (the agent) ARE the loop ru ## Real UI audit is the primary loop -The job is to find what a person actually feels in the tab. Deterministic scenarios are guardrails and regression checks, not a substitute for driving the product. +The job is to find what a person actually feels in the tab. Do not predefine a +fixed deterministic scenario suite and mistake that for the audit. 
Build the +measurement plan from the live UI, source, and perf-pass repo; then create +whatever repeatable probes, scripts, seeded repo states, or tests are needed to +show load-time, CPU, heap, IPC, render, and interaction deltas for the surfaces +the user actually exercises. Use this order: @@ -39,6 +45,19 @@ Use this order: The inventory must be tab-derived. For example, a Work pass should cover Work sidebar/session list, chat/CLI/shell start surfaces, session tabs/grid/layout controls, running/ended session actions, model/attachment/command/parallel pickers, terminal/chat panes, context menus, filters/search, and ADE tools drawers because those are Work-tab surfaces. A Lanes pass should cover lane list, stack graph, lane dialogs, Git Actions, and lane Work panes because those are Lanes-tab surfaces. + Do not claim complete coverage until the inventory itself says every row is + measured, prompt-only, external-skip, or explicitly deferred with a reason. + A handful of representative clicks is a partial smoke pass, not an audit. If + an action matrix already exists for the tab, update that matrix as evidence + arrives instead of replacing it with narrative notes. + + Treat the matrix as the work queue. Pick the next unresolved action/state, + drive that exact UI path, record evidence, then either promote the row or + mark why it needs a fixture, sandbox, prompt, or external dependency. When a + row exposes slowness, churn, overflow, broken behavior, or missing + accessibility, make one targeted product change, re-drive that same row, and + only then move to the next row. + 3. **Mark each UI segment in the perf log** before and after exercising it: ```ts window.ade.perf.recordEvent({ kind: "manualStep", ts: Date.now(), name: "git-actions-stage", phase: "start" }); @@ -49,12 +68,20 @@ Use this order: 4. 
**Use direct IPC only for setup, cleanup, and analysis.** It is fine to create fixture data, reset a throwaway repo, query status, or extract metrics through IPC/shell. Do not replace a UI audit action with `window.ade.*` unless the UI is genuinely impossible to drive; if you must, say so in the run notes. -5. **Run deterministic scenarios after UI findings.** Scenarios catch regressions and quantify broad fitness. They do not prove the tab is clean unless the UI action inventory was also covered. +5. **Create UI-derived probes after findings.** If existing scenarios cover a + surface, you may run them. If they do not, write a tab-specific probe, + scenario, fixture, or test that reproduces the measured workflow against the + perf-pass repo. The probe is evidence, not a product requirement: it exists to + quantify a real UI bottleneck and compare before/after behavior. ## Setup (do once at start of run) 1. **Read prior wins** at `.agents/skills/ade-perf-/SKILL.md` if it exists. These are optional best-practice notes from earlier audits, not prerequisites. If no per-tab skill exists, derive the checklist from the tab UI and source and create the per-tab skill only during codification after you have measured real behavior. -2. **Read scenario definitions** at `apps/desktop/src/renderer/perf/scenarios/.ts`. These are the *contract*. Do NOT edit them. +2. **Inspect existing perf probes** under `apps/desktop/src/renderer/perf/scenarios/`, + scripts, and tab tests. Reuse what matches the tab, but do not treat missing or + incomplete scenarios as a blocker and do not let scenario availability define + the work. It is acceptable to add new tab-specific scenarios/probes when they + help quantify a real UI workflow. 3. 
**Verify perf-pass repo** exists, has a seed tag, and can exercise real GitHub paths when needed: ```bash scripts/reset-perf-pass.sh @@ -68,32 +95,29 @@ Use this order: ## Baseline (iteration 0) -Start with one deterministic scenario sweep so you know the existing guardrail fitness, then do the real UI inventory. The baseline is not complete until both exist. +Start with the real UI inventory. The baseline is complete when every safe tab +surface has been exercised or explicitly marked unsafe/external/destructive, and +each measured segment has corresponding perf evidence. -Run all scenarios for the tab. For lanes that's: +For each important workflow, capture at least one of: -```bash -node scripts/run-perf-scenario.mjs lanes.cold-list baseline-cold -node scripts/run-perf-scenario.mjs lanes.switch-rapid baseline-switch -node scripts/run-perf-scenario.mjs lanes.idle-at-rest baseline-idle -node scripts/run-perf-scenario.mjs lanes.scroll-list baseline-scroll -node scripts/run-perf-scenario.mjs lanes.stress-poll baseline-stress -``` +- A real UI perf-launch run with `manualStep` markers in + `~/.ade/perf-runs//events.jsonl` +- A purpose-built probe or scenario that drives the workflow against the + perf-pass repo and writes `summary.json` / `events.jsonl` +- A focused unit/component/integration test that reproduces the expensive derive, + mount, or IPC behavior +- A shell/IPC setup script only for fixture creation and cleanup -For boot (which has a mix of project-loaded and no-project scenarios): - -```bash -node scripts/run-perf-scenario.mjs boot.cold-paint baseline-paint --no-project -node scripts/run-perf-scenario.mjs boot.recent-projects baseline-recent --no-project -node scripts/run-perf-scenario.mjs boot.open-project baseline-open --no-project -node scripts/run-perf-scenario.mjs boot.remote-runtime baseline-remote -node scripts/run-perf-scenario.mjs boot.idle-welcome baseline-idle --no-project -node scripts/run-perf-scenario.mjs boot.stress-launch baseline-stress 
--no-project -``` +Record `baseline_metrics` as a small table, not a single mandatory fitness +score. Include the metrics that matter to the surface: route/load time, segment +duration, main CPU p95, renderer CPU or long tasks, heap growth, IPC count/p95, +render-on-scroll time, or panel mount cost. If an existing scenario reports a +fitness score, keep it as one data point; do not let it override real UI evidence. -Each writes `~/.ade/perf-runs//summary.json`. Read all summaries. Compute the **per-tab fitness** as the sum of all scenario fitness scores. Record this as `baseline_fitness`. Also record per-component breakdown so you can target the worst component. - -Then launch the real tab with `perf-launch`, drive the action inventory, and analyze `~/.ade/perf-runs//events.jsonl` by manualStep segments. Record the worst UI segment, the slow IPC channels inside it, and whether the cost is expected work (for example network push/fetch) or avoidable tab work. +Then analyze `events.jsonl` by manualStep segment. Record the worst UI segment, +the slow IPC channels inside it, and whether the cost is expected work (for +example network push/fetch) or avoidable tab work. Tag the baseline commit: ```bash @@ -102,26 +126,31 @@ git tag perf-baseline--$(date +%Y%m%d) ## Iteration loop -Stop conditions: **no fitness improvement for 10 consecutive iterations** OR user kills the run OR 50 iterations OR 4 hours wall-clock. +Stop conditions: **no measurable improvement on the current bottleneck for 10 +consecutive attempts** OR user kills the run OR 50 iterations OR 4 hours +wall-clock. For each iteration: ### 1. Analyze -- Read the latest scenario summaries and the latest real-UI `events.jsonl`. -- Pick the **#1 bottleneck**: the avoidable cost that appears in real UI segments or scenario summaries. Tie-break by user-visible workflow first, then reproducibility across scenarios. 
+- Read the latest real-UI `events.jsonl`, probe outputs, scenario summaries, and
+  focused test results.
+- Pick the **#1 bottleneck**: the avoidable cost that appears in real UI segments
+  or repeatable probes. Tie-break by user-visible workflow first, then
+  reproducibility and metric severity.
 - Common bottleneck categories:
   - **Slow IPC channel**: a channel in `summary.ipc.slowChannels` with p95 ≥ 120ms
   - **Long task spam**: `webVitals.longTaskCount` > 5 per minute
-  - **Memory growth**: `process.rendererHeapGrowthMB` > 10 over a scenario
+  - **Memory growth**: `process.rendererHeapGrowthMB` > 10 over a measured workflow
   - **Render-on-scroll cost**: `marks.scroll.*` p95 high
   - **Route transition cost**: `marks.nav.*` or `marks.switch.*` p95 high
-  - **Main CPU**: `process.mainCpuPercentP95` > 30 during idle scenarios → background pollers
+  - **Main CPU**: `process.mainCpuPercentP95` > 30 during idle or panel-open probes → background pollers
   - UI segment waste: heavy refreshes, duplicate mounted panes, hidden pollers, repeated global status checks, or expensive dialog prefetches that are not needed for the action the user took
 - Read the code that owns the bottleneck. Form a hypothesis.
 
 ### 2. Propose ONE change
-Legal moves (examples — not exhaustive):
+Legal moves (examples, not a complete list):
 - Memoize a hot selector with `useMemo` / `useCallback`
 - Batch IPC calls (collapse N independent invokes into one)
 - Debounce / throttle a poller
@@ -136,12 +165,23 @@ Legal moves (examples — not exhaustive):
 
 **Forbidden moves:**
 - Editing anything under `apps/desktop/src/main/services/perf/**`
-- Editing anything under `apps/desktop/src/renderer/perf/**`
-- Editing `scripts/run-perf-scenario.mjs` or `scripts/reset-perf-pass.sh`
+- Editing metrics plumbing under `apps/desktop/src/renderer/perf/harness/**`,
+  `apps/desktop/src/renderer/perf/markers.ts`, or
+  `apps/desktop/src/renderer/perf/webVitals.ts` to make results look better
+- Editing `scripts/run-perf-scenario.mjs` or `scripts/reset-perf-pass.sh` to
+  weaken measurement or setup
 - Editing test files to make them pass
 - Disabling polling/sync features outright (only debounce/throttle)
-- Removing UI features or hiding elements to bypass scenarios
-- Changing fitness weights or scenario definitions
+- Removing UI features or hiding elements to bypass measured workflows
+- Changing metric weights, summaries, or existing probes to mask a regression
+
+Allowed measurement moves:
+- Add new tab-specific scenarios/probes under `apps/desktop/src/renderer/perf/scenarios/`
+  when they drive a real UI-derived workflow.
+- Add scripts under `scripts/` or tests under the touched feature area to seed the
+  perf-pass repo or reproduce an expensive UI path.
+- Expand a probe to cover a newly discovered tab surface, provided it remains
+  honest about what it measures.
 
 ### 3. Apply the change
 One commit, focused. Conventional message: `perf(): `.
@@ -155,14 +195,27 @@ npm --prefix apps/desktop run test -- --run path/to/affected.test.ts
 If tests fail: **revert** the commit (`git reset --hard HEAD~1`), do NOT count toward plateau, try a different change targeting the same or next bottleneck.
 
 ### 5. Measure
-First re-drive the same UI segment with the same markers and compare the IPC/render/memory delta. Then re-run the smallest scenario subset that covers the changed surface. Re-run all scenarios before declaring the run done.
+First re-drive the same UI segment with the same markers and compare the
+IPC/render/memory/load/CPU delta. Then re-run the smallest probe, scenario, or
+test that covers the changed surface. Before declaring the run done, re-run the
+final measured sweep that covers the audited surfaces; this can be a mix of real
+UI markers, custom probes, and existing scenarios.
 
 ### 6. Smoke gate
-For each scenario's summary, check `summary.scenarios..ok === true` and `smokeFailures.length === 0`. If any scenario failed smoke: **revert**, increment plateau counter.
+For each probe or scenario that writes a summary, check
+`summary.scenarios..ok === true` when present and `smokeFailures.length === 0`.
+For tests, require the targeted tests to pass. If the workflow breaks or smoke
+fails because of the code change: **revert**, increment the missed-attempt
+counter.
 
 ### 7. Decide
-- Improvement threshold: `new_fitness < best_fitness * 0.98` (≥2% better)
-- If improvement: **keep**. Update best. Reset plateau to 0. Amend the commit message with `fitness `.
+- Improvement threshold: at least one primary metric for the bottleneck improves
+  by ≥2% without regressing the surrounding smoke metrics. Use the most relevant
+  metric for the workflow (duration, CPU p95, heap, long tasks, IPC count/p95,
+  render cost), not a mandatory global fitness score.
+- If improvement: **keep**. Update best. Reset plateau to 0. Amend the commit
+  message with the metric delta, e.g. `work open 1840ms → 1210ms` or
+  `ipc p95 160ms → 70ms`.
 - Else: **revert** (`git reset --hard HEAD~1`). Plateau += 1.
 
 ### 8. Soft iteration cap
@@ -171,10 +224,31 @@ If this iteration has been running >15 minutes wall clock (build loops, scenario
 ## Termination
 
 When stop condition hits:
-1. Print run summary: starting fitness, final fitness, %-improvement, list of kept commits (sha + message + fitness delta).
+1. Print run summary: baseline metrics, final metrics, %-improvement for each
+   kept bottleneck, and list of kept commits (sha + message + metric delta).
 2. Suggest the user merge the working branch into main via PR.
 3. Proceed to codification (next section).
 
+## Completion and handoff discipline
+
+Do not describe the run as "done", "complete", or "covered" while the tab
+inventory still has unresolved rows (`source`, `fixture-needed`, `sandbox-only`,
+or unvisited `prompt-only` / `external-skip`) unless the user explicitly narrowed
+the objective. Open rows mean the run is still in progress.
+
+If there is a feasible next measured iteration and the user has not asked you to
+stop, continue the loop instead of ending with a future-work summary. If you must
+pause because the user asked for a handoff, the environment needs cleanup, or a
+blocking decision is required, do all of the following before the final response:
+
+- Stop any perf/dev/Electron processes you started and record the latest run id.
+- Update the tab audit matrix with what is measured, invalid, skipped, and next.
+- Update the per-tab perf skill with any measured win that future agents must
+  preserve.
+- State clearly that the audit is incomplete and name the next concrete loop.
+- Include a ready-to-run follow-up prompt that points to the matrix, run ids,
+  current bottleneck, validation commands, and "do not claim full coverage" rule.
+
 ## Codify (after the run ends)
 
 Read all kept commits (`git log --oneline perf-baseline--... HEAD`). For each, extract the **pattern** (the technique used, not the literal change). Update `.agents/skills/ade-perf-/SKILL.md`:
@@ -185,13 +259,22 @@ Read all kept commits (`git log --oneline perf-baseline--... HEAD`). For ea
   - **Why it helped**: which bottleneck it addressed, with the metric delta from the summary.
   - **How to recognize when to apply**: signs in future code that the same pattern is needed.
   - **Anti-pattern to avoid**: what NOT to do.
-  - **Verification**: which scenario + metric this affected.
+  - **Verification**: which UI segment, probe, scenario, or test metric this affected.
 - Preserve proven history, but keep the top of the file readable as best practices for future code changes.
 
 ## Notes on agent behavior
 
 - **Stay focused.** One bottleneck at a time. Resist the urge to "while I'm here also fix..." — that breaks attribution.
-- **Trust the metric.** If fitness went up but you "feel" the code is better, revert anyway. The metric is the contract.
-- **The perf-pass repo is your sandbox.** Inside it, you may create lanes, open chats, push/pull throwaway branches, run automations, stash changes, and delete fixtures when needed to exercise ADE. Scenarios are guardrails; real UI audit coverage is required before you call the tab optimized. You may extend scenarios ONLY by adding new scenarios in `apps/desktop/src/renderer/perf/scenarios/.ts` — never by editing existing ones.
-- **Codex model only.** If a scenario invokes an in-ADE chat, that chat uses the `ADE_MODEL_OVERRIDE` model (gpt-5-codex by default). Scenarios opting into Claude must declare `requiresClaude: true` and you must set `ADE_PERF_ALLOW_CLAUDE=1` for them.
+- **Trust the metric.** If the relevant measured workflow does not improve, revert
+  even when the code feels cleaner. The metric-backed user workflow is the
+  contract.
+- **The perf-pass repo is your sandbox.** Inside it, you may create lanes, open
+  chats, push/pull throwaway branches, run automations, stash changes, and delete
+  fixtures when needed to exercise ADE. Purpose-built probes are encouraged when
+  fixed scenarios do not cover the tab. Real UI audit coverage is required before
+  you call the tab optimized.
+- **Codex model preference.** If a probe or in-ADE action invokes chat/agent work,
+  use the `ADE_MODEL_OVERRIDE` model (gpt-5-codex by default) for the majority of
+  chat work and for deep performance-fix work. Other configured providers may be
+  sampled for comparison when the user asks for broad coverage.
 - **Concurrency**: only one perf run on the machine at a time. If `~/.ade/perf-runs/` contains a `/lock` file with a live pid, refuse to start.
diff --git a/.agents/skills/ade-perf-work/SKILL.md b/.agents/skills/ade-perf-work/SKILL.md
new file mode 100644
index 000000000..8b46b955e
--- /dev/null
+++ b/.agents/skills/ade-perf-work/SKILL.md
@@ -0,0 +1,458 @@
+---
+name: ade-perf-work
+description: Performance and UX patterns discovered for ADE's Work tab, including chat/CLI/shell launch surfaces, the Work tools pane, Git/Files/iOS/App Control/Browser/Mac VM panels, and local-runtime-disabled perf runs. Read before editing Work tab code.
+metadata:
+  author: ADE
+  version: 0.1.0
+---
+
+# ade-perf-work
+
+Read this before editing Work tab surfaces:
+
+- `apps/desktop/src/renderer/components/terminals/**`
+- `apps/desktop/src/renderer/components/chat/**` when mounted from Work
+- `apps/desktop/src/renderer/components/lanes/LaneGitActionsPane.tsx`
+- `apps/desktop/src/renderer/components/lanes/CommitTimeline.tsx`
+- `apps/desktop/src/renderer/components/files/FilesPage.tsx` when embedded
+- `apps/desktop/src/preload/preload.ts`
+- `apps/desktop/src/main/services/ipc/registerIpc.ts`
+- Work-facing tool services for iOS Simulator, App Control, built-in browser, and macOS VM
+
+## Measurement pattern
+
+Use the real Work tab first. For local perf runs, reset and open the perf-pass repo:
+
+```bash
+scripts/reset-perf-pass.sh
+NO_DEVTOOLS=1 ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1 ADE_LOCAL_RUNTIME_FALLBACK=1 ADE_MODEL_OVERRIDE=gpt-5-codex \
+  node scripts/perf-launch.mjs --tab work --run-id work-ui-audit--
+```
+
+Keep the Work audit matrix at `docs/perf/work-tab-action-inventory.md` current.
+Rows start as source inventory. Promote them only when a real UI run, UI-derived
+probe, or focused fixture test covers that exact control/state. Do not describe a
+run as complete while rows are still `source`, `fixture-needed`,
+`sandbox-only`, `prompt-only`, or `external-skip` without evidence or an explicit
+reason.
+
+Use the matrix as the Work pass queue. Enumerate the visible Work actions and
+states, then handle rows one by one: drive the real control, record markers,
+promote or classify the row, fix at most one discovered bottleneck/bug, re-drive
+the same row to prove the change, and only then advance. Backend/source review is
+supporting evidence; it does not replace trying the UI action.
+
+Drive actual Work UI actions and record `work.audit.*` markers for:
+
+- session search/filter, tab/grid, Chat/CLI/Shell mode switches
+- model picker, attachment picker, slash command picker, parallel model configuration
+- Work tools pane open/close
+- Git: status, More menu, history refresh, diff selection
+- Files: mount and path filtering
+- iOS Sim, App Control, Browser, Mac VM panel mounts
+
+Do not start from a fixed deterministic scenario list. Scenario files are only
+optional evidence after the Work UI inventory exposes a real workflow. The
+important proof is a UI-derived run over
+`~/.ade/perf-runs//events.jsonl` plus focused tests for any fixed
+behavior.
+
+## Current known wins
+
+### Skip classic GitHub repo probes during shell status
+
+`githubService.getStatus()` must still validate the token and keep the
+fine-grained-token repo access probe. For classic tokens, required scopes are
+inspectable from `/user`, so do not also probe the active repo on the Work
+startup status path.
+
+Measured Work runs:
+
+- Before: `ade.github.getStatus` `644ms` during Work startup.
+- After: post-fix samples were `473ms` and `192ms` in the hot-reload run, and
+  `524ms` in a clean Work launch.
+
+Keep `repoAccessOk` / `repoAccessError` as `null` for classic-token statuses.
+Do not remove the fine-grained-token repo probe; it prevents a false connected
+state when a fine-grained token cannot read the active repo.
+
+### Keep GitHub auth status out of the first Work startup window
+
+The top-bar Publish pill only needs local origin state. It must call the
+lightweight `github.getRemoteStatus` path, not full `github.getStatus`, so Work
+startup does not validate GitHub auth just to decide whether to show Publish.
+
+The AppShell GitHub banner/avatar refresh may still call full `github.getStatus`,
+but keep it post-startup. Settings save/clear broadcasts still update the shell
+immediately through `onStatusChanged`.
+
+Measured Work runs:
+
+- Before shell auth delay: first `10s` IPC total `901ms`, with
+  `ade.github.getStatus` at `293ms` in the startup window.
+- After: first `10s` IPC total `633ms`; `ade.github.getStatus` was absent from
+  the first startup summary. The auth check still ran later at `301ms`.
+- `ade.github.getRemoteStatus` stayed `0ms` in the startup path.
+
+### Bound local usage cost-log scans
+
+The usage tracker starts in Work perf launches after the startup delay and scans
+local Claude/Codex JSONL logs for cost estimates. Do not read every recent log
+file into memory. Scan only bounded recent files, skip oversized files, stream
+line-by-line, and cap retained token entries.
+
+Measured Work runs:
+
+- Before: clean Work launch crashed with V8 heap OOM; main-process heap reached
+  `1,867.7MB` around the `usage.start` window.
+- After: clean Work launch stayed alive past the same window with main-process
+  heap max `130.9MB` and CPU p95 `0.18%`.
+
+If changing usage/cost telemetry, test with a large local `~/.codex` /
+`~/.claude` history and re-run a Work perf launch for at least 45 seconds.
+
+### Skip the local runtime bridge when the local daemon is disabled
+
+In Work perf runs with `ADE_DISABLE_LOCAL_RUNTIME_DAEMON=1`, the preload bridge must not try `ade.localRuntime.callAction`, `ade.localRuntime.callSync`, or `ade.localRuntime.streamEvents` for local project bindings.
+
+Measured Work cold run:
+
+- Before: `67` failed `ade.localRuntime.*` IPC calls, `19,427ms` aggregate failed IPC time.
+- After: `0` failed `ade.localRuntime.*` IPC calls.
+
+Keep the preload guard in `apps/desktop/src/preload/preload.ts` intact. If changing project binding or remote runtime event pump logic, re-run a local-runtime-disabled Work audit and confirm local runtime IPC remains zero.
+
+### Fast-path sync status when the local runtime daemon is disabled
+
+`ade.sync.getStatus` is called during Work startup and periodically from shell chrome. In local-runtime-disabled runs it must not spawn the disabled runtime or fail before falling back.
+
+Measured Work run:
+
+- Before sync fix: `ade.sync.getStatus` failed and consumed `606ms` across two calls in the startup window.
+- After: the same status path returned successfully in `0-1ms`.
+
+Keep the unavailable sync snapshot in `apps/desktop/src/main/services/ipc/registerIpc.ts`. It is a perf-mode/status fallback, not a replacement for real sync service behavior.
+
+### Work tools pane must remain operable when narrow
+
+The Work tools pane can be narrow after the session list, chat surface, and tools pane are all visible. Do not assume all tab labels fit. The tab strip should collapse to icon buttons under narrow widths while preserving `aria-label`, tooltips, and stable hit targets.
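Editor's illustration (not part of the patch): the "no clipped or unreachable tool tabs" target for the narrow tools pane could be checked with a small pure helper like the sketch below. Everything here is an assumption made for illustration: `findClippedTabs`, the `Rect`/`Viewport` shapes, and the sample numbers are hypothetical and do not come from the ADE codebase or from a measured run.

```typescript
// Hypothetical probe helper; names and shapes are illustrative only and are
// not taken from the ADE codebase.
interface Rect {
  left: number;
  top: number;
  right: number;
  bottom: number;
}

interface Viewport {
  width: number;
  height: number;
}

interface ClippedTab {
  label: string;
  overflowLeft: number;
  overflowTop: number;
  overflowRight: number;
  overflowBottom: number;
}

// Returns one entry per tab whose rect extends past any viewport edge.
// An empty result means every tab rect is fully inside the viewport.
function findClippedTabs(
  tabs: Array<{ label: string; rect: Rect }>,
  viewport: Viewport,
): ClippedTab[] {
  const clipped: ClippedTab[] = [];
  for (const { label, rect } of tabs) {
    const overflowLeft = Math.max(0, -rect.left);
    const overflowTop = Math.max(0, -rect.top);
    const overflowRight = Math.max(0, rect.right - viewport.width);
    const overflowBottom = Math.max(0, rect.bottom - viewport.height);
    if (overflowLeft > 0 || overflowTop > 0 || overflowRight > 0 || overflowBottom > 0) {
      clipped.push({ label, overflowLeft, overflowTop, overflowRight, overflowBottom });
    }
  }
  return clipped;
}

// Made-up sample values: a 600x400 viewport where the last tab is clipped.
const sampleViewport: Viewport = { width: 600, height: 400 };
const sampleTabs = [
  { label: "Git", rect: { left: 520, top: 8, right: 552, bottom: 40 } },
  { label: "Files", rect: { left: 556, top: 8, right: 588, bottom: 40 } },
  { label: "Browser", rect: { left: 592, top: 8, right: 624, bottom: 40 } },
];

const clippedTabs = findClippedTabs(sampleTabs, sampleViewport);
console.log(clippedTabs.map((tab) => `${tab.label}: +${tab.overflowRight}px right`));
```

In a live probe the rects would presumably come from `element.getBoundingClientRect()` on each tab button and the viewport from `window.innerWidth` / `window.innerHeight`. Keeping the comparison itself pure makes it easy to unit-test without a renderer.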
+
+Measured UI pass after compact tabs:
+
+- Git, Files, iOS Sim, App Control, Browser, and Mac VM tabs were all visible and clickable in the narrow tools pane.
+- No tools tab had a bounding rect beyond the renderer viewport.
+
+If changing `WorkSidebar`, verify with a small Work pane and a larger audit window. The target is no clipped or unreachable tool tabs, not merely no TypeScript errors.
+
+### Session list filters must wrap in narrow panes
+
+The sessions pane can be squeezed when Work, the sessions list, and the tools pane are all visible. Status/group filter pills must wrap inside the filter panel instead of assuming a single row, and embedded lane selectors must be able to fill a narrow parent without overflowing it.
+
+Measured Work run `work-inventory-shell-session-20260512-01`:
+
+- Before: a `120.2px` filter panel had status/group controls extending `122.8px` past the panel edge.
+- After: the rightmost control ended `8.0px` inside the same narrow panel.
+
+When editing `SessionListPane` filter controls or `LaneCombobox` trigger sizing, verify a small Work viewport with the filter panel open and record the panel/control bounds in the action inventory.
+
+### Context menus must clamp to the renderer viewport
+
+Session and Work-tab context menus are fixed-position menus created from the event `clientX/clientY`. Do not place them directly at the click point without measuring the rendered menu size and clamping to the viewport.
+
+Measured Work run `work-context-menu-edge-20260512-01`:
+
+- Before: a Work-tab context menu opened at `left=534.0px` in a `582px` viewport and overflowed right by `132.0px`.
+- After: the same probe placed the `180px` menu at `left=394.0px`, ending `8.0px` inside the viewport.
+
+When changing `SessionContextMenu`, verify with a small Work viewport and a context menu opened near the right edge. Keep the inset stable so users can still reach every menu item.
+
+Files context menus have the same requirement. The Files tree context menu is
+rendered by `FilesPage.tsx`; measure its actual rendered width/height and clamp
+both axes to the viewport.
+
+Measured Work run `work-chat-other-controls-20260512-01`:
+
+- Before: right-clicking the `README.md` row near the viewport edge opened the
+  menu at `x=1163.0px` with `width=200px` in a `1164px` viewport, overflowing
+  right by `199px` and bottom by `159.1px`.
+- After: the same probe placed it at `x=956.0px`, `right=1156.0px`,
+  `bottom=737.0px` in the `1164x745` viewport.
+
+When changing Files context-menu contents or row context-menu handling, verify
+with a right-click near the renderer's right/bottom edges.
+
+Embedded Files has a narrower parent than the full Files route. Keep the Work
+tools pane layout responsive: the explorer/editor split should stack when
+`FilesPage` is embedded rather than preserving a fixed `320px` explorer column
+that pushes editor controls offscreen.
+
+Measured Work run `work-chat-other-controls-20260512-01`:
+
+- Before: the embedded explorer ended at `right=1170.8px` in a `1164px`
+  viewport, the editor collapsed to `1.8px`, and the `CODE` button overflowed
+  right by `92.2px`.
+- After: explorer and editor both fit inside the tools pane at `right=1151.6px`;
+  the `CODE` button ended at `right=953.8px` with no overflow.
+
+### Chat controls depend on provider/model state
+
+The Work Chat composer changes its visible controls after provider/model
+selection. Do not mark a control missing until the matching provider state is
+set through the real model picker.
+
+Measured Work run `work-chat-controls-20260512-02`:
+
+- `Fast mode` appeared only after selecting a Codex model with a fast service
+  tier (`GPT-5.4` in the measured run). The valid marker toggled
+  `aria-pressed` from `false` to `true`.
+- The Codex approval preset menu exposed `Default permissions`, `Plan mode`,
+  `Full access`, and `Custom (config.toml)` after a Codex model was selected.
+- Parallel setup starts with two visible model slots; `Add model` should
+  increase the visible slot count, `Configure` should become `Editing`, and the
+  uppercase `FOCUSED` / `PARALLEL` execution controls should be checked with a
+  viewport-aware probe before promotion.
+- In a fresh empty perf-pass chat, the slash command menu may expose only
+  `/clear Clear chat history`. Do not promote `work.chat.command.select` from
+  that state; use a fixture or real session state with a non-clear command.
+- Handoff controls are not an empty-draft surface. They appear only for standard
+  locked Work chats. Use focused `AgentChatPane.submit.test.tsx -t "handoff"`
+  evidence for open/permission/launch logic unless you intentionally create a
+  real throwaway chat in perf-pass.
+- The proof/artifacts drawer is also not exposed on the empty draft surface in
+  Work. Use a selected standard chat/session or a focused companion-toolbar
+  fixture before measuring proof drawer open/close or right-pane resize.
+- In the Work tab embedding, `AgentChatPane` is rendered with
+  `hideLaneToolDrawers` from `WorkViewArea` / `WorkStartSurface`, so the
+  chat-toolbar iOS and App Control drawer buttons are intentionally absent.
+  Verify Work coverage through the WorkSidebar tabs and panels; keep exact
+  drawer-button rows as fixture-needed/non-Work `AgentChatPane` fixture
+  evidence if they remain in the inventory.
+- Cursor permission/mode controls and Cursor Cloud actions only appear after a
+  Cursor-backed model is selected through the real model picker. In the measured
+  run, selecting a Cursor SDK model exposed a native mode select with `Agent`,
+  `Ask`, `Plan`, and `Full auto`; the `Cursor Cloud actions` menu exposed
+  `Send to Cursor Cloud` and `Open existing cloud chat` without launching a
+  cloud session. `CursorCloudInlineLaunch.test.tsx` covers canceling an inline
+  Cursor Cloud send without creating a run.
+- `work.chat.composer.clear` is not an empty-draft affordance in the current
+  Work composer. The visible `Clear` button is gated by `turnActive` alongside
+  steer/stop controls, so measure it with an active-turn fixture or sandbox chat
+  session rather than by typing into the fresh draft surface.
+- The Browser tools panel can be measured from the empty Work surface for tab
+  creation, tab switching/closing, URL typing, and inspect toggling. Browser
+  screenshot crop is different: in the empty tools panel it is disabled with
+  `Chat context is unavailable here`, so measure screenshot start/cancel from a
+  selected chat/context-capable surface.
+- Chat attachment search uses the composer lane, not the Files tools workspace.
+  In perf-pass, switch the composer lane to `Primary` before searching for
+  `README.md`; a missing lane worktree legitimately returns no file results.
+  For React-controlled picker inputs, prefer focused CDP text input over simply
+  assigning `input.value`, because DOM-only value changes can leave the picker
+  state at `Type to search files...`. Text-file chips cover select/remove only;
+  `open-preview` and `copy-image` need an image attachment fixture.
+- `ChatAttachmentTray.test.tsx` is valid focused fixture evidence for image
+  attachment preview/copy behavior: it mocks `getImageDataUrl`, verifies the
+  image lightbox opens, and verifies `writeClipboardImage` is called from the
+  copy control.
+- `AgentChatMessageList.test.tsx` has valid focused fixture evidence for
+  transcript message copy, cloud PR browser navigation, file-link routing,
+  transcript code-block copy, tool-call disclosure expansion, full-prompt
+  toggles, memory/thought disclosure toggles, manual transcript scroll,
+  jump-to-latest, user-message minimap jump, and inline question tab/prev-next
+  controls. Keep assistant markdown code blocks routed through
+  `HighlightedCode`; that component owns the `Copy code` control and copy-button
+  placement preference. Match the inventory row to the exact clicked control; do
+  not use the localhost URL test as PR-browser evidence. Grouped tool results
+  render through `ChatWorkLogBlock`, not the old standalone `ToolResultCard`;
+  keep long-result `show all` / `collapse` behavior on that reachable
+  `ChatWorkLogBlock` row path.
+- `AgentChatPane.submit.test.tsx` has valid focused fixture evidence for
+  selecting chat tabs in the Work chat pane. Its CLI-created terminal tests only
+  auto-reveal terminal tabs; use `ChatTerminalDrawer.test.tsx` for terminal
+  drawer toggle, resize, and manual tab-switch evidence.
+- `AgentChatPane.companionDrawers.test.tsx` covers chat companion iOS, App
+  Control, proof drawer open/close, right-pane split resize, archived-chat
+  restore, and persistent identity `Clear view` through the real
+  `AgentChatPane` chrome. This is non-Work fixture evidence for the lane tool
+  drawers: normal Work embedding passes `hideLaneToolDrawers`, so iOS/App
+  Control drawer buttons are intentionally absent there.
+- `ChatSubagentsPanel.test.tsx` can cover the subagents drawer toggle, detail
+  view, Back navigation, hidden timeline expansion, and copy-id behavior without
+  spawning real subagents.
+- `AgentChatComposer.test.tsx` can cover the active-turn `Clear` composer
+  control, queued steer edit/remove callbacks, prompt-suggestion Tab
+  acceptance, rich visual-context chip select/remove, slash-command selection,
+  and dismissing an attachment error. `FilesPage.test.tsx` can cover the
+  non-embedded Files editor theme toggle, primary-workspace `TRUST & EDIT`
+  toggle, and Files tree context-menu `COPY PATH`. Embedded Work Files does not
+  render those non-embedded chrome controls, so keep live embedded fail markers
+  separate from focused Files chrome coverage.
+- `SessionListPane.test.tsx` can cover the stale running-session warning, child
+  shell section collapse/expand, and bulk restore footer wiring.
+  `PackedSessionGrid.test.tsx` covers persisted tile resize, while
+  `WorkViewArea.test.tsx` covers selecting a tiled grid session by clicking its
+  body, closing an ended tab from the tab strip, and embedded floating-pane
+  minimize/expand through the actual `FloatingPane` chrome context.
+- `WorkStartSurface.test.tsx` can cover the `lanes=[]` no-lanes empty state
+  without mutating the perf-pass lane database.
+- Browser back/forward/reload can use a localhost two-page fixture. For
+  `Stop loading`, do not count a URL-open click while the toolbar is still
+  `busy`; that leaves the real Stop button disabled. Use direct browser API
+  navigation only as setup for a slow localhost page, wait for the visible Stop
+  button to become enabled, then click the real button and verify loading
+  returns to false. If `captureScreenshot` times out before crop mode appears,
+  record it as invalid screenshot evidence and keep screenshot start/cancel
+  fixture-needed.
+- Browser selected-context rows can use direct `setBounds`/`selectPoint` only to
+  seed a localhost selected element; measure the real UI controls afterward.
+  The composer may be a `textarea[aria-label="Type to vibecode"]` rather than a
+  rich contenteditable after cleanup, so verify `Insert draft` against the
+  actual textarea value. Auto-attached browser context chips must be removed
+  before handing off.
+- `ChatBuiltInBrowserPanel.test.tsx` can cover starting screenshot crop mode,
+  canceling crop mode, and clicking the visible Attach control for an already
+  selected browser element. It does not cover dragging a crop region; keep crop
+  drag sandbox-only or measure it in a browser-capable throwaway run.
+- After forcing BrowserView bounds, switch the Work tools pane away from
+  Browser and verify `builtInBrowser.getStatus().visible === false` before
+  measuring unrelated chat-toolbar controls. The chat Git commit-message input
+  can then be measured with CDP mouse input without submitting; clear the input
+  value afterward, and treat close/submit behavior as separate evidence. Broad
+  Work UI coordinate probes did not open the Quick Run Radix menu, but
+  `ChatGitToolbar.test.tsx` covers the current-lane navigation button and the
+  embedded `QuickRunMenu` trigger with pointerdown/up; use that as fixture
+  evidence unless a future real-UI probe opens the menu reliably.
+- The detached App Control tools pane can safely measure local text entry in
+  `App Control launch command` and `CDP port` if the fields are restored and
+  Run/Connect are not clicked. `Help wire CDP` is not rendered there unless
+  `canAttachToChat` is true, and Control/Inspect mode plus focused-element
+  typing need an active app session.
+- `ChatAppControlPanel.test.tsx` can cover safe App Control state controls
+  without launching or driving an app: selecting a configured Run-tab command,
+  inserting the Help wire CDP draft, showing the launch terminal, refreshing a
+  snapshot, dismissing the message, toggling Control/Inspect, typing in the
+  local focused-element text field, switching controlled windows, and re-scanning
+  controlled windows. It can also cover hovering an inspected screenshot
+  element, selecting a source-context point, and re-attaching that selected
+  point. Do not use it as evidence for screenshot control clicks,
+  element-crop attachment, Type/send, Stop, Run, or Connect.
+- Mac VM provisioning fields are safe to measure as state-only controls when
+  restored afterward: CPU, memory, disk, display, source mode, source image, and
+  local viewer checkbox. Do not click Start, Provision, Stop, Focus,
+  Screenshot, Delete, or Setup docs during a normal inventory pass unless the
+  row state explicitly allows that sandbox/external/prompt path.
+- `MacosVmPanel.test.tsx` can cover selecting a mocked VM screenshot point and
+  editing the local VM text input without clicking the externally visible
+  Click/Type actions. Keep actual VM click/type/send rows sandbox-only unless a
+  throwaway VM run is explicitly allowed.
+- For iOS Simulator tools, switch the Work lane to a real existing worktree
+  before measuring status, launch-target, or Preview Lab refresh. Missing lane
+  worktrees produce useful fixture evidence but should not be the only proof for
+  `refresh-state` or `preview-refresh`. Surface switches, Preview Control /
+  Capture mode, and the preview agent-action selector are safe state-only
+  controls; Ask-agent/#Preview prompt buttons need a chat-capable fixture or
+  explicit clipboard/draft handling.
+- `ChatIosSimulatorPanel.test.tsx` can cover iOS stream retry/recovery by
+  emitting a `stream-error` event and verifying the panel restarts the stream
+  for the selected device. It can also cover state-only launch-target select,
+  Preview-target select, setup install-command copy, Preview `Ask agent`, and
+  no-target `Ask agent to add a #Preview` without launching Simulator or Xcode.
+  The same fixture can cover live simulator Control/Inspect mode switches,
+  active-app text-field entry before Send, inspector refresh, and starting
+  screenshot capture mode. It can also cover selecting a mocked ADE-inspector
+  element and opening Preview Lab scoped to that element's source file; keep
+  Preview rendering as separate sandbox-only evidence. It also covers clearing
+  stale launch-target/project-root errors after the project root changes and a
+  later `listLaunchTargets` call succeeds; preserve that behavior when editing
+  iOS refresh or lane/project-root handling. It does not cover sending text or
+  dragging a screenshot crop.
+- The iOS Simulator device selector can be measured without booting a simulator:
+  switch to the iOS Sim tools tab, change the populated device `