Diagnostic tracing and levers for dev session hot reload (#2417) #7337
Open
MitchLillie wants to merge 2 commits into main
/snapit
🫰✨ Thanks @MitchLillie! Your snapshot has been published to npm. Test the snapshot by installing your package globally:
pnpm i -g --@shopify:registry=https://registry.npmjs.org @shopify/cli@0.0.0-snapshot-20260416212105
Caution: After installing, validate the version by running
Tracing: every file change gets a monotonic [build:N] ID that traces
through the full pipeline in --verbose output. Logs at every stage:
onChange → handleWatcherEvents → buildExtensions → generateExtensionTypes
→ emit('all') → DevSession.onEvent → validateAppEvent → processEvents
→ bundleExtensionsAndUpload → result
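A minimal sketch of how a monotonic build ID could be allocated and threaded through trace lines. The names `createBuildTracer`, `start`, and `trace` are hypothetical and are not taken from the CLI source; the `[build:N] stage` line layout is likewise an assumption based on the description above.

```typescript
// Hypothetical sketch: these names are illustrative, not the CLI's real API.
function createBuildTracer(log: (line: string) => void) {
  let lastId = 0;
  return {
    // Allocate the next build ID; IDs only ever increase within one session.
    start(): number {
      lastId += 1;
      return lastId;
    },
    // Emit one trace line for a pipeline stage of the given build.
    trace(buildId: number, stage: string, detail = ""): void {
      const suffix = detail ? ` ${detail}` : "";
      log(`[build:${buildId}] ${stage}${suffix}`);
    },
  };
}

// Usage: collect lines in memory instead of writing to verbose output.
const lines: string[] = [];
const tracer = createBuildTracer((line) => {
  lines.push(line);
});
const first = tracer.start();
tracer.trace(first, "onChange", "src/index.tsx");
tracer.trace(first, "handleWatcherEvents");
const second = tracer.start();
tracer.trace(second, "onChange");
```

Because the ID is allocated once per file change and passed along explicitly, interleaved output from overlapping builds can still be grouped by build.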
Levers (env vars to isolate the permanent failure in #2417):
  DEV_SKIP_GENERATE_TYPES=1
    Skip generateExtensionTypes() after build. Tests if this call
    hangs and permanently blocks emit('all').
  DEV_SKIP_RESCAN_IMPORTS=1
    Skip rescanImports() which can restart the file watcher. Tests
    if a watcher restart during vite mid-write permanently loses
    the admin extension's watched files.
  DEV_SERIALIZE_ONCHANGE=1
    Serialize onChange handlers with a mutex. Tests if concurrent
    .then() chains corrupting shared state (this.app, bundle dir)
    cause the permanent break.
  DEV_UPLOAD_TIMEOUT_MS=15000
    Add a timeout to bundleExtensionsAndUpload. Tests if a hung
    GCS upload or API call permanently blocks the SerialBatchProcessor.
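To illustrate what serializing handlers "with a mutex" can look like, here is a sketch of a promise-chain mutex. It is an assumption about the general technique, not the PR's actual implementation; `createMutex`, `runExclusive`, and the demo handlers are hypothetical names.

```typescript
// Hypothetical sketch of a promise-chain mutex; not the CLI's real code.
function createMutex() {
  // Tail of the queue: each new task waits for the previous one to settle.
  let tail: Promise<void> = Promise.resolve();
  return function runExclusive<T>(task: () => Promise<T>): Promise<T> {
    const result = tail.then(task);
    // Swallow the outcome on the queue itself so one failure cannot block later tasks.
    tail = result.then(
      () => undefined,
      () => undefined,
    );
    return result;
  };
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Demo: two overlapping "onChange" handlers run strictly one after the other.
async function demoSerializedHandlers(): Promise<string[]> {
  const runExclusive = createMutex();
  const order: string[] = [];
  const handler = (name: string, ms: number) => async () => {
    order.push(`${name}:start`);
    await sleep(ms);
    order.push(`${name}:end`);
  };
  await Promise.all([runExclusive(handler("a", 20)), runExclusive(handler("b", 1))]);
  return order;
}
```

Without the mutex, handler `b` would start while `a` is still sleeping; with it, `b` starts only after `a` has finished, which is the isolation the lever is meant to test.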
Force-pushed from 0c62648 to 62660b2.
Toggles source code strings in a hosted app every N seconds to trigger repeated vite rebuilds. Used alongside the diagnostic levers to isolate the permanent failure in #2417.
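The core of such a toggle can be sketched as a pure function that swaps one marker string for another on each call. This is a TypeScript illustration only: the actual script is a shell script, and `toggleMarker` plus the marker values are hypothetical. File I/O and the N-second interval are deliberately left out.

```typescript
// Hypothetical sketch: swap marker `a` for `b`, or `b` back to `a`.
function toggleMarker(source: string, a: string, b: string): string {
  if (source.includes(a)) return source.split(a).join(b);
  if (source.includes(b)) return source.split(b).join(a);
  return source; // Neither marker present: leave the source untouched.
}

// Usage: each call flips the text, so every write is a real change on disk.
const original = "<s-text>Hello A</s-text>";
const flipped = toggleMarker(original, "Hello A", "Hello B");
const restored = toggleMarker(flipped, "Hello A", "Hello B");
```

Flipping between two known strings guarantees that every iteration produces a genuine content change, so the watcher cannot skip it as a no-op.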
/snapit
🫰✨ Thanks @MitchLillie! Your snapshot has been published to npm. Test the snapshot by installing your package globally:
pnpm i -g --@shopify:registry=https://registry.npmjs.org @shopify/cli@0.0.0-snapshot-20260416214923
Caution: After installing, validate the version by running
WHY are these changes introduced?
Investigating shop/issues-admin-extensibility#2417 — dev session hot reloading permanently stops working after ~12-15 file changes. GCS uploads and websocket messages both stop, even though vite rebuilds and admin extension builds keep succeeding. Requires a restart to recover.
This PR adds tracing and diagnostic levers to isolate where the pipeline permanently breaks. No behavioral changes.
WHAT is this pull request doing?
Build ID tracing
Every file change gets a monotonic [build:N] ID that traces through --verbose output at every stage of the pipeline.
When the issue occurs, the gap between the last trace line and the missing next one reveals exactly what hung/threw/died.
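The "gap" reasoning can be sketched as a small log scan: find the highest build ID, the last stage it reached, and the stage that should have come next. The stage names come from the pipeline listed in the commit message; the exact `[build:N] stage` line layout and the `findStall` helper are assumptions for illustration.

```typescript
// Stage order as listed in the commit message; line format is an assumption.
const STAGES: string[] = [
  "onChange",
  "handleWatcherEvents",
  "buildExtensions",
  "generateExtensionTypes",
  "emit('all')",
  "DevSession.onEvent",
  "validateAppEvent",
  "processEvents",
  "bundleExtensionsAndUpload",
  "result",
];

interface Stall {
  buildId: number;
  lastStage: string;
  missingStage: string | null; // null means the latest build completed
}

// Hypothetical helper: report where the most recent build stopped.
function findStall(traceLines: string[]): Stall | null {
  const pattern = /^\[build:(\d+)\]\s+(\S+)/;
  let latest: { buildId: number; stageIndex: number } | null = null;
  for (const line of traceLines) {
    const match = pattern.exec(line);
    if (match === null) continue; // Not a trace line.
    const buildId = Number(match[1]);
    const stageIndex = STAGES.indexOf(match[2] ?? "");
    if (stageIndex === -1) continue; // Unknown stage name.
    const isNewer =
      latest === null ||
      buildId > latest.buildId ||
      (buildId === latest.buildId && stageIndex > latest.stageIndex);
    if (isNewer) latest = { buildId, stageIndex };
  }
  if (latest === null) return null;
  return {
    buildId: latest.buildId,
    lastStage: STAGES[latest.stageIndex] ?? "",
    missingStage: STAGES[latest.stageIndex + 1] ?? null,
  };
}
```

Given trace output where build 2 logs `buildExtensions` and nothing after it, `findStall` would report `generateExtensionTypes` as the missing stage, pointing at the first lever to try.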
Diagnostic levers (env vars)
Each lever tests a theory about what permanently breaks:
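DEV_SKIP_GENERATE_TYPES=1
    What it does: skips generateExtensionTypes() after build.
    Theory tested: the call hangs and permanently blocks emit('all'), so every subsequent onChange chain piles up behind it.
DEV_SKIP_RESCAN_IMPORTS=1
    What it does: skips rescanImports() (which can restart chokidar).
    Theory tested: a watcher restart while vite is mid-write to ./dist permanently loses the admin extension's watched files.
DEV_SERIALIZE_ONCHANGE=1
    What it does: serializes onChange handlers with a mutex.
    Theory tested: concurrent .then() chains corrupt shared state (this.app, bundle dir), causing a permanent break.
DEV_UPLOAD_TIMEOUT_MS=15000
    What it does: adds a timeout to bundleExtensionsAndUpload.
    Theory tested: a hung GCS upload or API call permanently blocks the SerialBatchProcessor.
If a lever prevents the issue, that theory is confirmed. If none do, the tracing output will show us something new.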
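As an illustration of the upload-timeout lever, the sketch below wraps a promise with `Promise.race` and reads the timeout from an environment-like object. `withTimeout` and `readUploadTimeout` are hypothetical helpers, not the PR's code, and the env object is passed in explicitly (for example `process.env`) to keep the sketch self-contained.

```typescript
// Hypothetical sketch: reject if `task` has not settled within `timeoutMs`.
function withTimeout<T>(task: Promise<T>, timeoutMs: number, label: string): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => {
      reject(new Error(`${label} timed out after ${timeoutMs}ms`));
    }, timeoutMs);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([task, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}

// Parse DEV_UPLOAD_TIMEOUT_MS; undefined means "lever off, no timeout".
function readUploadTimeout(env: Record<string, string | undefined>): number | undefined {
  const raw = env["DEV_UPLOAD_TIMEOUT_MS"];
  if (raw === undefined) return undefined;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : undefined;
}

// Usage: only wrap the upload when the lever is set.
async function uploadWithLever<T>(
  upload: () => Promise<T>,
  env: Record<string, string | undefined>,
): Promise<T> {
  const timeoutMs = readUploadTimeout(env);
  return timeoutMs === undefined
    ? upload()
    : withTimeout(upload(), timeoutMs, "bundleExtensionsAndUpload");
}
```

Note that a timeout of this kind only unblocks the caller; it does not cancel the underlying request, so a truly hung upload would still be running in the background.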
Stress test script
scripts/dev-session-stress-test.sh: toggles source strings in a hosted app at a configurable interval to trigger repeated rebuilds.
How to test
1. Get a snapshot build:
2. In a hosted app directory (Preact template with home/), start dev with tracing:
3. In another terminal, start the stress test:
   # From the CLI repo:
   ./scripts/dev-session-stress-test.sh /path/to/your/app 5 10
4. Wait for it to break (~12-15 changes), then check the trace:
5. Test levers one at a time to isolate the cause:
Checklist
pnpm changeset add