perf(startup): continue reducing desktop startup and interactive shell readiness time #949

## Global Optimization Scoreboard

Last updated: 2026-05-30

Performance PR acceptance rule:

- Every performance PR must include repeated A/B timing data for the target business scenario and any materially affected user-facing scenario.
- If work is deferred from startup to a later interaction, that later interaction must be measured. Bundle size alone is supporting evidence, not an accepted user-perceived performance result.
- If one scenario improves while another regresses, the regression must be listed with severity. Material UX regressions require product confirmation before the PR is considered an effective optimization.
- PR descriptions must list measured wins, measured regressions, test method, risk/functional impact, and unresolved data gaps.
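
The p50 and delta figures required by this rule can be derived from repeated run samples as in the minimal sketch below. The helper names (`p50`, `compareP50`, `formatDelta`) are hypothetical and are not part of the project's benchmark harness; the sketch only shows the arithmetic behind entries such as `-402.9 ms (-26.8%)`.

```typescript
// Hypothetical helpers for illustration; not the project's benchmark code.

/** Median (p50) of a non-empty list of timing samples in milliseconds. */
function p50(samples: number[]): number {
  if (samples.length === 0) throw new Error("p50 requires at least one sample");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // Odd count: middle element. Even count: mean of the two middle elements.
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

interface Delta {
  deltaMs: number;  // candidate minus baseline, rounded to 0.1 ms
  deltaPct: number; // change relative to baseline, rounded to 0.1 %
}

/** Round to one decimal place, matching the precision used in the tables. */
function round1(value: number): number {
  return Math.round(value * 10) / 10;
}

/** Compare a candidate p50 against a baseline p50. Negative means faster. */
function compareP50(baselineMs: number, candidateMs: number): Delta {
  const deltaMs = candidateMs - baselineMs;
  return {
    deltaMs: round1(deltaMs),
    deltaPct: round1((deltaMs / baselineMs) * 100),
  };
}

/** Format a delta in the table style used in this issue. */
function formatDelta(delta: Delta): string {
  const sign = delta.deltaMs > 0 ? "+" : "";
  return `${sign}${delta.deltaMs.toFixed(1)} ms (${sign}${delta.deltaPct.toFixed(1)}%)`;
}
```

For example, `formatDelta(compareP50(1504.7, 1101.8))` reproduces the `main_window_shown` entry for #934. Deltas computed from already-rounded p50 values can differ by 0.1 ms from deltas computed on raw samples.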

| PR | Status | Target scenario | Key result data | Known regression / risk | Decision state |
| --- | --- | --- | ---: | --- | --- |
| #934 | Merged | Release-fast cold startup / first visible shell | `main_window_shown` p50 1504.7 ms -> 1101.8 ms, -402.9 ms (-26.8%); `start_application_end` p50 1455.7 ms -> 1035.6 ms, -420.1 ms (-28.9%) | `interactive_shell_ready` p50 1538.9 ms -> 1628.4 ms, +89.4 ms (+5.8%) | Merged, but interactive readiness remains unresolved |
| #958 | Merged | First open of larger historical sessions | Backend first restore avg 69.8 ms -> 5.1 ms, about -92.7% before first render | Deferred full hydrate still costs 35.1-99.5 ms after latest-content rendering | Merged, latest-content-first tradeoff accepted |
| #971 | Open draft / mitigation measured | Cold startup, by deferring editor and root AI startup paths; first editor-heavy interactions protected by post-splash idle warmup | Current head p50: `first_script_eval` -577.9 ms (-49.0%), `main_window_shown` -598.2 ms (-45.7%), `interactive_shell_ready` -660.6 ms (-38.4%); first code editor -180.7 ms (-47.0%) and first Git diff -127.4 ms (-32.0%) in the explicit-warmup path | Latest-main strict A/B not rerun after rebase; editor warmup consumes about 230 ms idle time; users clicking before warmup finishes can still pay partial cold cost | Draft; candidate positive, but keep open for latest-main A/B decision and continued `interactive_shell_ready` work |

## Follow-Up Metrics From PR #971
Related PR: #971

Benchmark method: Windows desktop `release-fast`. Current PR head is `20d6a235`, rebased on `gcwing/main@18aa2df4`. The comparison baseline remains #949's continuity baseline `gcwing/main@978e079f`; latest-main strict A/B was not rerun after this rebase. Startup and long-session data come from `tests/e2e/specs/performance/startup-session-perf.spec.ts`; editor first-open data come from `tests/e2e/specs/performance/editor-first-open-perf.spec.ts`. Editor timing is measured from the app event trigger to Monaco DOM ready, which isolates the deferred editor/rendering cost.

### PR #971 Business Scenario Coverage

| Business scenario | Baseline p50 | PR #971 p50 | Delta | Decision impact |
| --- | ---: | ---: | ---: | --- |
| Cold startup `first_script_eval` | 1179.0 ms | 601.1 ms | -577.9 ms (-49.0%) | Positive startup result. |
| Cold startup `main_window_shown` | 1309.5 ms | 711.3 ms | -598.2 ms (-45.7%) | Positive first-shell result. |
| Cold startup `interactive_shell_ready` | 1720.4 ms | 1059.8 ms | -660.6 ms (-38.4%) | Positive readiness result in current head data; issue remains open for deeper `interactive_shell_ready` optimization. |
| Non-critical initialization completion | 1667.2 ms | 1315.9 ms | -351.3 ms (-21.1%) | Positive in this run despite deferred scheduling. |
| First open of generated long session, backend restore | 45.6 ms | 41.5 ms | -4.1 ms (-9.0%) | Neutral to positive. |
| First open of generated long session, latest frame | 45.7 ms | 46.5 ms | +0.8 ms (+1.8%) | Small regression, below material UX threshold in this run. |
| First open of generated long session, initial hydrate | 105.9 ms | 102.9 ms | -3.0 ms (-2.8%) | Neutral to positive. |
| First open of generated long session, full hydrate | 51.5 ms | 52.7 ms | +1.2 ms (+2.3%) | Small regression, track only. |
| First code editor open, no explicit warmup wait | 384.1 ms | 207.4 ms | -176.7 ms (-46.0%) | Previous PR regression mitigated. |
| First code editor open, explicit warmup wait | 384.1 ms | 203.4 ms | -180.7 ms (-47.0%) | Positive first-use result after idle warmup. |
| First Git diff editor open, no explicit warmup wait | 398.6 ms | 283.7 ms | -114.9 ms (-28.8%) | Previous PR regression mitigated. |
| First Git diff editor open, explicit warmup wait | 398.6 ms | 271.2 ms | -127.4 ms (-32.0%) | Positive first-use result after idle warmup. |

### PR #971 Notes

- The earlier first-use editor regression is now mitigated in current release-fast measurements: code-editor p50 improved from the previous PR measurement of 503.7 ms to 207.4 ms without explicit wait, and Git-diff p50 improved from 526.9 ms to 283.7 ms without explicit wait.
- Background editor warmup duration p50 is about 226.7-237.0 ms across the measured editor scenarios. It is scheduled only after splash exit, uses low-priority idle scheduling, and yields between editor-surface preload stages, Monaco init, and theme sync.
- In every current no-wait editor measurement, `editor_startup_warmup_end` fired before the trigger, so those runs exercised the already-warmed path. The measured common path is good, but a user who opens an editor immediately after splash exit, before warmup finishes, can still pay part of the cold cost.
- Startup splash behavior changed from immediate loading text to logo-only first. `Loading workspace...` appears below the logo only after 1800 ms if workspace loading is still active, so normal startup does not show a short text flash.
- Current state: keep PR #971 draft until the team decides whether the existing continuity baseline is enough, or whether a fresh latest-main strict A/B run is required before marking ready.
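
The staged warmup described in the notes above (run one stage, yield, run the next) can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `scheduleWarmup`, `IdleScheduler`, and the stage names are hypothetical, and the scheduler is injected so the example does not depend on a browser.

```typescript
type Task = () => void;

/**
 * Defers a task to a later low-priority slot. Hypothetical abstraction:
 * in a WebView this could wrap requestIdleCallback, with a timer fallback.
 */
type IdleScheduler = (task: Task) => void;

/**
 * Runs warmup stages one per scheduled slot, yielding back to the scheduler
 * between stages so user input can be handled in between.
 */
function scheduleWarmup(stages: Task[], schedule: IdleScheduler, onDone: Task): void {
  const remaining = [...stages];
  const runNext = (): void => {
    const stage = remaining.shift();
    if (stage === undefined) {
      onDone(); // all stages finished; a real app might record a warmup-end mark here
      return;
    }
    stage();
    schedule(runNext); // yield before the next stage instead of running back-to-back
  };
  // Nothing runs synchronously: even the first stage waits for a scheduled slot.
  schedule(runNext);
}
```

Because each stage waits for its own slot, a click that arrives mid-warmup is not blocked behind the full warmup, but it can still hit stages that have not run yet, which matches the partial cold cost noted above.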

## Required Metrics For Next Startup Stage

| Scenario | Required comparison | Acceptance note |
| --- | --- | --- |
| Cold startup to first visible shell | Latest `main` vs PR, repeated Windows `release-fast` cold launches | Must improve or be neutral with data. |
| Cold startup to `interactive_shell_ready` | Latest `main` vs PR, repeated Windows `release-fast` cold launches | Regressions need explicit confirmation. |
| First open of long historical session | Latest `main` vs PR, long-session fixture and real local sample if available | Latest visible turns must render first; total hydrate cost must be reported. |
| First code editor open | Latest `main` vs PR, first-use timing after fresh startup | Required for any PR that defers Monaco/editor work. |
| First Git diff editor open | Latest `main` vs PR, first-use timing after fresh startup | Required for any PR that lazy-loads diff editor paths. |

## Feature Goal
Reduce overall desktop cold-start time and first-screen wait time without hiding existing navigation content or sacrificing feature behavior. Performance work should improve the time to a usable shell, not only move work out of one metric.

## Original Problem
Cold startup currently pays for too much work before or around the first visible shell:

- Startup eagerly loads systems that are not always needed before first interaction, including IDE / MCP / ACP startup probes and MiniApp catalog refresh.
- Session navigation previously loaded far more session metadata than was immediately visible or clickable.
- The startup path showed a black-background logo and then a white-background logo before the main UI, causing a visible flash.
- Earlier measurements showed that improving isolated marks is not enough: `interactive_shell_ready` still needs direct follow-up work.

## Related Stage 1 PR
Related PR: #934

This issue should remain open after #934 is merged. PR #934 is the stage-1 first-screen optimization and does not complete the broader startup performance goal.

## Stage 1 Performance Metrics From PR #934
Benchmark method: Windows desktop `release-fast` cold launches, 10 runs per build, p50 comparison using the same packaged WebView path.

| P50 metric | Baseline | Stage 1 PR | Delta |
| --- | ---: | ---: | ---: |
| `main_window_shown` | 1504.7 ms | 1101.8 ms | -402.9 ms (-26.8%) |
| `start_application_end` | 1455.7 ms | 1035.6 ms | -420.1 ms (-28.9%) |
| `startApplicationDurationMs` | 614.7 ms | 27.7 ms | -586.9 ms (-95.5%) |
| `afterRenderDurationMs` | 584.1 ms | 461.4 ms | -122.7 ms (-21.0%) |
| Primary startup API count | 15.5 | 7.0 | -8.5 (-54.8%) |
| Primary startup response bytes | 12,891 | 7,699 | -5,192 (-40.3%) |
| Session metadata loaded | 187 | 27 | -160 (-85.6%) |
| Max session metadata duration | 175.1 ms | 53.3 ms | -121.8 ms (-69.6%) |
| `interactive_shell_ready` | 1538.9 ms | 1628.4 ms | +89.4 ms (+5.8%) |

Additional PR #934 backend microbenchmark: paged session metadata for 1000 sessions reports about `page5_avg_ms=11.550` versus `full_list_avg_ms=88.142`, about 7.6x faster for the paged backend path.

## Stage 1 Scope Already Covered By PR #934
- Show the native desktop main window earlier while keeping the frontend show path as a fallback.
- Defer non-critical IDE / MCP / ACP / MiniApp startup work behind first visible shell or idle boundaries.
- Page session metadata so startup fetches only initially visible rows, while explicit show-more loads the next page with loading feedback.
- Keep workspace and session sections visually expanded by default instead of hiding existing navigation content to obtain the win.
- Remove the extra pre-React static logo layer to avoid the double-logo flash.
- Keep editor and LSP startup lazy, with a first-open loading prompt for the editor initialization path.
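
The paged metadata behavior listed above can be sketched with an in-memory stand-in. The real backend command is `list_persisted_sessions_page`, whose signature is not shown in this issue; `listSessionsPage`, `SessionMeta`, and the newest-first ordering below are assumptions made for illustration.

```typescript
// Hypothetical shapes; the real session metadata fields are not shown in this issue.
interface SessionMeta {
  id: string;
  updatedAt: number; // assumed recency key
}

interface SessionPage {
  items: SessionMeta[];
  nextOffset: number | null; // null when there is no further page to show
}

/**
 * Returns one page of session metadata, newest first. Startup would request
 * only the first page; "show more" would request the page at nextOffset.
 */
function listSessionsPage(all: SessionMeta[], offset: number, pageSize: number): SessionPage {
  const sorted = [...all].sort((a, b) => b.updatedAt - a.updatedAt);
  const items = sorted.slice(offset, offset + pageSize);
  const end = offset + items.length;
  return { items, nextOffset: end < sorted.length ? end : null };
}
```

A caller would render `items` immediately and show the show-more control, with loading feedback, only while `nextOffset` is not `null`.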

## Remaining Follow-Up Work
- Directly optimize `interactive_shell_ready`; this is the main remaining blocker because PR #934 improves first-visible metrics but regresses this readiness mark by about 89 ms p50.
- Profile React mount and first-shell effects to identify long tasks, synchronous store hydration, expensive render paths, and readiness gating that still run before the shell is actually usable.
- Revisit startup marks so first-visible, first-interactive, and deferred-work completion are clearly separated and comparable across strict A/B runs.
- Add or refine measurements for the first on-demand cost of deferred ACP / IDE / MCP / LSP work, especially editor first-open and agent flows that require language-service indexing.
- Validate future stages with strict cold-start A/B data on the latest main before merging.

## Constraints
- Do not trade away visible functionality or default navigation behavior for startup metrics without explicit product confirmation.
- Lazy loading is acceptable only when the user gets clear loading feedback for actions that can exceed about 1 second.
- LSP base configuration can be prepared cheaply, but language servers and indexing should start only when the editor or an agent flow actually requires language services.
- Future PRs should update this issue with the same metric table style so the trend is visible over time.
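
The LSP constraint above (cheap configuration up front, expensive start only on first real use) is an instance of start-once lazy initialization. A minimal sketch, assuming a hypothetical `lazyOnce` helper and a synchronous start function:

```typescript
/**
 * Wraps an expensive start function so it runs at most once, on first use.
 * Hypothetical helper; a real language-server start would likely be async.
 */
function lazyOnce<T>(start: () => T): () => T {
  let started = false;
  let value: T | undefined;
  return () => {
    if (!started) {
      value = start(); // if start throws, started stays false and a later call retries
      started = true;
    }
    return value as T;
  };
}
```

Usage under these assumptions: create `const getLanguageService = lazyOnce(startLanguageServer)` at startup, where `startLanguageServer` is a placeholder name, and call `getLanguageService()` only from the editor-open and agent-flow paths that need language services.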

## Follow-Up Metrics From PR #958
Related PR: #958

Benchmark method: Windows desktop debug preview for startup/session-list observations, plus real persistence restore-path measurements on the 5 largest local sessions by turn count. The restore benchmark compares the pre-PR full restore behavior with PR #958's `tailTurnCount=3` first restore, then explicitly records that full history is still hydrated later in the background.

### PR #958 Business Scenario Coverage

| Business scenario | Baseline / reference behavior | PR #958 result | Product impact |
| --- | ---: | ---: | --- |
| Debug startup to visible native window | Existing native show path | Main window `show` at 944 ms after main-window create start | No latency win claimed; unsafe delayed-window strategy was removed. |
| Debug startup to frontend script | Existing Vite/WebView cold module path | `first_script_eval` at 9799.4 ms | Still slow; static preload only reduces blank-screen perception. |
| Debug startup to shell readiness | Existing readiness path | `interactive_shell_ready` at 10557 ms | Still open follow-up work for this issue. |
| Startup workspace/session navigation APIs | Existing paged metadata path | `get_opened_workspaces` 15.1 ms, `get_recent_workspaces` 13.9 ms, 7 `list_persisted_sessions_page` calls max 14.6 ms | Confirms workspace/history list remains visible and responsive. |
| First open of small/current session | Full restore is effectively unavoidable | 1/1 turn restored in 133.4 ms, convert 1.0 ms, state commit 0.5 ms | Neutral; no tail-load benefit for short sessions. |
| First open of larger historical sessions | Full history read before first render | Tail-3 first restore reads only latest turns, then full history hydrates after 150 ms if state is unchanged | Improves time to latest visible conversation, while preserving eventual full history. |
| Completed large model round rendering | Previously rendered from oldest visible groups first | Completed rounds render newest visible groups first; streaming rounds remain head-anchored | Improves perceived latest-content priority; unit-test covered, no separate browser timing claimed. |

### PR #958 Historical Session Restore Measurements

| Local sample | Turns | Size | Baseline full restore avg | PR first restore avg (`tail=3`) | Deferred full hydrate cost still paid | Turn files avoided before first render |
| --- | ---: | ---: | ---: | ---: | ---: | ---: |
| A | 26 | 3.02 MB | 75.5 ms | 4.2 ms | 75.5 ms | 23 |
| B | 23 | 4.29 MB | 99.5 ms | 5.8 ms | 99.5 ms | 20 |
| C | 17 | 1.33 MB | 35.1 ms | 2.4 ms | 35.1 ms | 14 |
| D | 14 | 2.95 MB | 59.3 ms | 2.9 ms | 59.3 ms | 11 |
| E | 11 | 3.80 MB | 79.7 ms | 10.0 ms | 79.7 ms | 8 |

Summary:
- Baseline full restore avg range: 35.1-99.5 ms.
- PR first restore avg range: 2.4-10.0 ms.
- Average initial backend restore reduction across these samples: 69.8 ms to 5.1 ms, about 92.7% lower before first render.
- Full-history hydration still costs 35.1-99.5 ms after the 150 ms delay; PR #958 moves that work behind latest-content rendering rather than eliminating it.
- Startup `interactive_shell_ready` remains unresolved and should continue to be tracked by this issue.
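
The latest-content-first restore summarized above can be sketched as follows. Only the `tailTurnCount=3` value, the 150 ms delay, and the state-unchanged condition come from this issue; `restoreTailFirst`, `RestoreOptions`, and the injected timer are hypothetical names used so the sketch stays self-contained.

```typescript
interface Turn {
  index: number;
}

interface RestoreOptions {
  tailTurnCount: number;   // e.g. 3, as in PR #958
  hydrateDelayMs: number;  // e.g. 150, as in PR #958
  isStateUnchanged: () => boolean; // assumed guard against clobbering newer state
  render: (turns: Turn[], complete: boolean) => void;
  setTimer: (callback: () => void, delayMs: number) => void; // e.g. a setTimeout wrapper
}

/**
 * Renders the latest turns first, then hydrates the full history later.
 * readTurns(fromIndex) is assumed to read turns fromIndex..end from storage.
 */
function restoreTailFirst(
  readTurns: (fromIndex: number) => Turn[],
  totalTurns: number,
  options: RestoreOptions,
): void {
  const tailStart = Math.max(0, totalTurns - options.tailTurnCount);
  // First render pays only for the tail; older turn files are not read yet.
  options.render(readTurns(tailStart), tailStart === 0);
  if (tailStart === 0) return; // short session: the tail is already the full history

  // The full read still happens; it is moved behind the first render.
  options.setTimer(() => {
    if (options.isStateUnchanged()) {
      options.render(readTurns(0), true);
    }
  }, options.hydrateDelayMs);
}
```

For a 26-turn session with `tailTurnCount` of 3, the first render reads 3 turns and skips 23, which matches the "turn files avoided" column for sample A. The deferred read is why the full hydrate cost is reported separately rather than treated as eliminated.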

