This was generated by AI during triage.
Summary
In CI mode the analyzer hard-fails the check on a transient initial WebSocket connect blip, with no retry. A one-off handshake failure to the Site API trips the check red and requires a manual re-run.
Detail
runInCI calls ApiClient.connect(...) once (src/main.ts). A failed WebSocket handshake throws, propagates to the top-level try/catch, and ends in process.exit(1), which turns the check red. Observed in the wild: Error: WebSocket connection failed. → Process completed with exit code 1. The same check passed cleanly on a re-run with identical code (the prod relay was momentarily unreachable).
This is the same transient-vs-terminal distinction already handled for the baseline fetch (fetchPreviousRun retries timeouts/5xx/network before giving up) and for ingest failures, but the initial connect has no such handling.
Proposal
Give ApiClient.connect a couple of retries with backoff on transient connect/handshake failures (mirror fetchPreviousRun's retry shape) before failing the run. A genuine, persistent connect failure should still fail; a single blip should not.
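A minimal sketch of what such a wrapper could look like. This is not the analyzer's code: connectWithRetry, RetryOptions, sleep, and the isTransient predicate are hypothetical names, and the exponential backoff is an assumption, since the actual retry shape of fetchPreviousRun is not shown here.

```typescript
// Hypothetical helper: none of these names come from the analyzer's codebase.
interface RetryOptions {
  attempts: number; // total tries, including the first one
  baseDelayMs: number; // delay before the second try; doubles after each failure
  isTransient?: (err: unknown) => boolean; // which failures are worth retrying
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

async function connectWithRetry<T>(
  connect: () => Promise<T>,
  { attempts, baseDelayMs, isTransient = () => true }: RetryOptions,
): Promise<T> {
  let lastError: unknown;
  let tried = 0;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    tried = attempt;
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      // Stop on a terminal error or once the attempt budget is spent.
      if (attempt === attempts || !isTransient(err)) break;
      await sleep(baseDelayMs * 2 ** (attempt - 1));
    }
  }
  const reason =
    lastError instanceof Error ? lastError.message : String(lastError);
  // Persistent failures still fail the run, with the attempt count in the message.
  throw new Error(`Connect failed after ${tried} attempt(s): ${reason}`);
}
```

In runInCI, the existing ApiClient.connect(...) call would be passed in as the connect callback, assuming it returns a promise. The top-level try/catch and process.exit(1) stay as they are, so a persistent failure still turns the check red, only after the retries are exhausted.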
Acceptance criteria
- A transient connect failure that succeeds on retry within N attempts lets the run proceed normally.
- A persistent connect failure still fails the run after retries are exhausted, with a clear message.
Provenance
Observed on a Query-Doctor/Site "Merge to prod" CI run; diagnosed as a transient prod relay blip unrelated to the change under test.