chore: release - merge dev into main by zbigniewsobiecki · Pull Request #1461 · mongrel-intelligence/cascade

zbigniewsobiecki · 2026-06-26T10:06:37Z

Automated release PR created by the release workflow.

Commits (8):

e67e8dcd fix(worker): disable setup.sh idle timeout, bump to Node 24, scope agent lint to changed files (#1460)
38f052ac fix(web): remove @theme inline so dark mode CSS tokens propagate correctly (#1459)
767dd948 feat(web): persist debug-analysis in-progress state and disable trigger button (MNG-1668) (#1458)
1ebc5661 docs(architecture): document cross-process debug-analysis status model (#1457)
f8088dc9 test(cli): cover debug-analysis --wait failed branch json + no-timeout (MNG-1669) (#1456)
48bdb263 fix(api): make debug-analysis status durable and cross-process (#1453)
f412c16d feat(web): show "Run is starting…" for ack-time work-item runs links (#1455)
284c6ffb feat(web): show "Run is starting…" pending state for fresh run links (#1454)

…1454)

A freshly-shared /runs/<id> URL can point at a run row the worker has not committed yet (the window between "URL shared in a GitHub/PM comment" and "worker commits the run row"). Previously the page flashed a misleading terminal "Run not found" immediately.

- Add shared RunPendingState component (Loader2 spinner + "Run is starting…" heading + subtext, optional message prop) for reuse by the work-item runs page in a follow-up story.
- Add NOT_FOUND-aware retry to the runs.getById query (retry only while the error is NOT_FOUND and within RUN_PENDING_MAX_RETRIES, retryDelay RUN_PENDING_POLL_MS) so the page self-heals to the real run within seconds.
- Replace the terminal "Run not found" branch with resolveRunDetailView() branching (pending / loading / not-found / error / ready); soften the not-found copy to acknowledge the run may have been cancelled/removed.
- Preserve the existing 5s refetchInterval while status === running.

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
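The NOT_FOUND-aware retry and the view branching described above can be sketched as two pure functions. This is only an illustration under assumptions: the constant values, the simplified `{ code }` error shape, and the `RunQueryState` input are hypothetical, and the real `resolveRunDetailView()` signature is not shown in this PR. The `retry` predicate follows the `(failureCount, error)` shape that TanStack Query accepts.

```typescript
// Hypothetical values: the PR names these constants but does not state them.
const RUN_PENDING_MAX_RETRIES = 10;
const RUN_PENDING_POLL_MS = 2000;

// Simplified error shape (assumption); the real query error may nest the code.
type RunQueryError = { code?: string } | null | undefined;

type RunDetailView = "pending" | "loading" | "not-found" | "error" | "ready";

// Retry only while the failure is NOT_FOUND and the retry budget is not spent.
function shouldRetryRunQuery(failureCount: number, error: RunQueryError): boolean {
  return error?.code === "NOT_FOUND" && failureCount < RUN_PENDING_MAX_RETRIES;
}

// Options in the shape a query hook could consume (illustrative only).
const runQueryRetryOptions = {
  retry: shouldRetryRunQuery,
  retryDelay: RUN_PENDING_POLL_MS,
};

// Hypothetical input: what the page knows about the query at render time.
interface RunQueryState {
  hasRun: boolean;
  error: RunQueryError;
  failureCount: number;
}

// "pending" means NOT_FOUND with retries left; "not-found" means the budget is spent.
function resolveRunDetailView(state: RunQueryState): RunDetailView {
  if (state.hasRun) return "ready";
  if (state.error?.code === "NOT_FOUND") {
    return shouldRetryRunQuery(state.failureCount, state.error) ? "pending" : "not-found";
  }
  if (state.error) return "error";
  return "loading";
}
```

Keeping the decision in pure functions means the pending-versus-terminal boundary can be tested without rendering the page.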

…1455)

* feat(web): show "Run is starting…" for ack-time work-item runs links

Work-item runs links (`/work-items/$projectId/$workItemId`) are posted at ack time (before the worker creates the run row), so `workItems.runs` returns `[]` and the page flashed a terminal "No runs found". Track mount time, keep polling through the bounded grace window, and render the shared `RunPendingState` while empty-within-grace.

- Route tracks `mountedAt = useRef(Date.now())` and computes `elapsedMs`.
- Replace the bespoke refetchInterval with the shared `workItemRunsRefetchInterval({ hasRunning, isEmpty, elapsedMs })` so the page polls through the ack-before-create window.
- Derive `isPending` from `resolveWorkItemRunsView(...)` and pass it into `WorkItemRunsTable`; the empty branch renders `RunPendingState` when pending, else keeps "No runs found" (default preserved for the PR page).
- Reuses the `run-pending.ts` helper + `RunPendingState` from the earlier stories; no new logic primitives.
- Doc: add a run-links note to docs/architecture/08-config-credentials.md about the transient "starting" state for links shared before the run row.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(web): re-render work-item runs page at the pending grace boundary

The work-item runs page derived `isPending` from elapsed time during render but had no driver to re-render once the grace window elapsed with a still-empty list. React Query structural-shares the same empty-array reference across polls and the component reads only data/isLoading/isError/error, so no tracked prop changed after the first empty result, leaving the page stuck on "Run is starting…" forever instead of reverting to "No runs found".

Schedule a one-shot timer (a `useState` tick) for the moment the grace window elapses while pending, so `isPending` is recomputed to false exactly at the boundary and the table falls back to "No runs found".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
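A minimal sketch of the grace-window logic from the two commits above, assuming hypothetical values for the grace window and poll cadences (the PR does not state them). The argument shape of `workItemRunsRefetchInterval` follows the commit message; the input of `resolveWorkItemRunsView` and the `msUntilGraceBoundary` helper are illustrative guesses, not the shipped code.

```typescript
// Hypothetical values; the PR does not state the grace window or poll cadences.
const RUN_PENDING_GRACE_MS = 30000;
const RUN_PENDING_POLL_MS = 2000;
const RUNNING_POLL_MS = 5000;

type WorkItemRunsView = "pending" | "empty" | "ready";

interface RunsViewInput {
  isEmpty: boolean;
  elapsedMs: number; // time since the route mounted
}

// An empty list is "pending" only while the grace window is still open.
function resolveWorkItemRunsView({ isEmpty, elapsedMs }: RunsViewInput): WorkItemRunsView {
  if (!isEmpty) return "ready";
  return elapsedMs < RUN_PENDING_GRACE_MS ? "pending" : "empty";
}

// Keep polling while a run is running, or while empty-within-grace; otherwise stop.
function workItemRunsRefetchInterval(
  input: RunsViewInput & { hasRunning: boolean },
): number | false {
  if (input.hasRunning) return RUNNING_POLL_MS;
  if (resolveWorkItemRunsView(input) === "pending") return RUN_PENDING_POLL_MS;
  return false;
}

// Delay for the one-shot timer that forces a re-render at the grace boundary.
function msUntilGraceBoundary(elapsedMs: number): number {
  return Math.max(0, RUN_PENDING_GRACE_MS - elapsedMs);
}
```

In a component, the value from `msUntilGraceBoundary` would feed a `setTimeout` that bumps a state counter, so the view is recomputed once the window closes even though the polled data reference never changes.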

* fix(api): make debug-analysis status durable and cross-process

In production the debug-analysis job runs in a separate worker process, so the dashboard API's in-memory `isAnalysisRunning()` Set never reports `running`. Derive running/queued/failed status from the durable BullMQ dashboard-jobs queue (keyed by the deterministic `debug-analysis-<runId>` id) when REDIS_URL is set, and keep the in-memory + DB path for local dev.

- getDebugAnalysisStatus: queue-mode precedence (in-flight job -> running short-circuiting the DB; DB record -> completed; failed job -> failed; else idle). Status union gains an additive `failed` literal.
- triggerDebugAnalysis: queue-mode already-running guard throws CONFLICT on an in-flight job; re-run removes any prior terminal job then re-enqueues with the deterministic id so it is idempotent and reusable.
- Read the REDIS_URL gate at call time (isQueueMode()) instead of a module-load const so it reflects the live env and is unit-testable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(api): make debug-analysis status reflect the analysis lifecycle, not spawn

The durable signal the previous approach relied on (the cascade-dashboard-jobs BullMQ job) reaches `completed` at container *spawn*, not at analysis completion: guardedSpawn resolves once `container.start()` returns, while the debug agent runs for tens of seconds to minutes inside the spawned container and the `debug_analyses` row is written only at the very end. So `running` was observable only during the brief spawn window: during the actual analysis, status read `idle` (frontend stopped polling and showed the empty "Run Analysis" state) and the double-press guard lapsed (a second trigger re-enqueued a duplicate: 2× LLM cost + duplicate PM comment).

Replace the job-state signal with a worker-owned, durable, cross-process lifecycle signal that models the analysis itself:

- New `debug_analysis_status` table (PK on analyzed_run_id, ON DELETE CASCADE), migration 0056.
The worker marks `running` around `triggerDebugAnalysis`, clears it on success (a present `debug_analyses` row is then `completed`), and marks `failed` on error. The dashboard also marks `running` at trigger time to cover the enqueue→spawn window.
- `getDebugAnalysisStatus` and the re-trigger guard now read this row uniformly in queue and local-dev mode; no more BullMQ job-state reads. A stale `running` row (crashed worker, > DEBUG_ANALYSIS_RUNNING_STALE_MS = 2h, above the 30 min worker timeout) is ignored so a crash never wedges the run permanently.
- The deterministic job id + remove-before-submit re-enqueue is kept so a near-simultaneous second trigger that slips past the guard cannot spawn a duplicate container. `running` is marked only after a successful enqueue so a failed enqueue cannot leave a phantom marker blocking retries.
- Remove the worker-local in-memory tracker (`debug-status.ts`), which was never visible to the dashboard process, and the now-unused `getDashboardJobState` helper.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(cli): surface terminal failed status in debug-analysis --wait loop

The `cascade runs debug <id> --analyze --wait` poll loop only returned on `completed` and `idle`. After the durable-status PR added a real `failed` status, a failed analysis was no longer surfaced as `idle`, so the loop ignored it and polled for the full 5-minute deadline before printing a misleading "Timed out waiting for debug analysis to complete." message.

Add a `failed` branch mirroring the existing `idle` branch so a failed analysis returns promptly with an accurate "Debug analysis failed." message. Also corrects the imprecise comment in runs.ts: the leftover terminal status row is overwritten by the markDebugAnalysisRunning upsert, not deleted by deleteDebugAnalysisByRunId.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(debug): cover durable failed-write error path in debug analysis runner

Pins the defensive invariant in triggerDebugAnalysis that a failing markDebugAnalysisFailed write logs a warning but does not mask or replace the original agent error. This is the catch-handler branch that codecov flagged as the uncovered patch lines in src/triggers/shared/debug-runner.ts. Test-only; no production code change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(api): correct debug-analysis guard comment to reflect uniform durable read

The already-running guard comment claimed it was 'durable in queue mode, in-memory in local dev', leftover narrative from the abandoned approach. The shipped guard (assertDebugAnalysisNotInFlight) reads the durable debug_analysis_status row uniformly in both queue and local-dev mode; the in-memory isAnalysisRunning mechanism was deleted. Aligns the comment with the function's own docstring and the rest of the durable-table design.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(db): exercise debug_analysis_status ON DELETE CASCADE with a targeted delete

The ON DELETE CASCADE test used truncateAll(), which issues `TRUNCATE ... CASCADE` and clears every referencing table regardless of the FK action, so it would pass even if the constraint were NO ACTION/RESTRICT, never actually exercising the clause it names. Delete only the parent agent_runs row instead: the DELETE now succeeds and removes the status row solely because the FK cascades; a non-cascading FK would raise a foreign-key violation. Also assert the status row exists before the delete to make the cascade observable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(debug): document failed-status coverage caveat for hard kills

The `failed` debug-analysis status is written only from the in-process catch in triggerDebugAnalysis, so it covers catchable in-process errors only.
A hard kill (watchdog/OOM) or a throw before the runner is reached (e.g. processDashboardJob failing to load the project config after the dashboard already marked running) leaves a `running` row that self-stales to `idle` rather than surfacing `failed`. Documents this deliberate tradeoff in-code per repeated review feedback (non-blocking).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(debug): note getRunById-null early-return in failed-coverage caveat

The latest review observed that the runner's own early getRunById-null return is another path where the dashboard-written 'running' marker is never cleared or flipped to 'failed' (it self-stales to 'idle' after DEBUG_ANALYSIS_RUNNING_STALE_MS). That early return is a bare return before the try block, so the existing catch-block caveat did not cover it. Document the gap at the early-return site and extend the catch-block caveat to list it. Comment-only; no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(db): document the durable debug_analysis_status table

Records the shipped cross-process debug-analysis status design (new table from MNG-1667) in the database architecture doc: ER relationship, key-tables row, and repository purpose. Captures the durable-table approach that actually shipped, not the abandoned queue-derived design the reviewer flagged, so the permanent in-repo record is accurate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(db): note failed-status coverage caveat on debug_analysis_status schema

Surface the reviewer's persistently-flagged caveat in the canonical table definition: `failed` is written only for catchable in-process errors (the debug-runner's `catch`), so a hard kill (watchdog/OOM) or a throw before the runner is reached self-stales the `running` row to `idle` rather than surfacing `failed`.
Also names DEBUG_ANALYSIS_RUNNING_STALE_MS and points at isDebugAnalysisRunActive so the staleness threshold is discoverable from the schema. Comment-only; no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(db): pin exact staleness boundary for isDebugAnalysisRunActive

Lock the strict `<` against DEBUG_ANALYSIS_RUNNING_STALE_MS so a future refactor to `<=` (or back) is caught. Existing tests covered only a fresh (~0ms) and a clearly-stale (threshold + 1s) row; this adds the boundary itself, exactly at the threshold (stale) and one ms under (active), under a frozen clock with real-timer cleanup. That boundary is what guarantees a crashed/OOM-killed worker never wedges a debug analysis as permanently `running`, which the durable debug_analysis_status design relies on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(worker): mark debug analysis failed on pre-runner config-load failure

When `processDashboardJob` cannot load the project config for a `debug-analysis` job, it throws before `triggerDebugAnalysis` (whose `catch` writes the durable `failed` status) is ever reached. The dashboard has already marked the run `running` at trigger time, so the `running` row would otherwise linger and self-stale to `idle` after `DEBUG_ANALYSIS_RUNNING_STALE_MS` rather than surfacing `failed`, blocking re-trigger with CONFLICT for that ~2h window.

Mark the run `failed` (best-effort, so a status-write error never masks the original project-not-found failure) at that site, extending `failed` coverage to this catchable in-process worker error. Hard kills (watchdog/OOM) remain the documented deliberate follow-up (router-side reconciliation on non-zero container exit).

Keeps the in-code coverage caveats accurate (debug-runner `catch` block and the `debug_analysis_status` schema docstring) and adds worker-entry tests asserting the run is marked failed and that the original error still propagates when the failed-write itself fails.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(db): document debug-analysis status precedence and failed caveat

Record the shipped durable-table behavior in the merged architecture doc: the getDebugAnalysisStatus precedence (active running → completed (DB-wins) → failed → idle, uniform in queue + local dev) and the failed-coverage caveat (catchable in-process errors only; hard-kill/OOM self-stales the running row to idle, with router-side reconciliation the deliberate follow-up). Documentation-only; no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: cascade-bot <cascade-bot@users.noreply.github.com>
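The status precedence and the strict staleness boundary described in this commit series can be illustrated with a small pure resolver. It is a sketch under assumptions: the `StatusRow` shape, the `updatedAtMs` field, and the name `resolveDebugAnalysisStatus` are hypothetical, and the signature given to `isDebugAnalysisRunActive` here is a guess. Only the 2h threshold, the strict `<`, and the ordering (active running, then completed, then failed, then idle) come from the text above.

```typescript
// 2h, as stated in the commit message.
const DEBUG_ANALYSIS_RUNNING_STALE_MS = 2 * 60 * 60 * 1000;

type DebugAnalysisStatus = "running" | "completed" | "failed" | "idle";

// Hypothetical row shape; the real column names are not shown in this PR.
interface StatusRow {
  status: "running" | "failed";
  updatedAtMs: number;
}

// Strict `<`: a row exactly at the threshold is already stale.
function isDebugAnalysisRunActive(row: StatusRow, nowMs: number): boolean {
  return row.status === "running" && nowMs - row.updatedAtMs < DEBUG_ANALYSIS_RUNNING_STALE_MS;
}

// Precedence: active running -> completed (analysis record wins) -> failed -> idle.
function resolveDebugAnalysisStatus(
  row: StatusRow | null,
  hasAnalysisRecord: boolean,
  nowMs: number,
): DebugAnalysisStatus {
  if (row !== null && isDebugAnalysisRunActive(row, nowMs)) return "running";
  if (hasAnalysisRecord) return "completed";
  if (row !== null && row.status === "failed") return "failed";
  return "idle";
}
```

Under this ordering a stale `running` row falls through to `idle` (or `completed` if an analysis record exists), which is the self-healing behavior the caveats above rely on.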

…t (MNG-1669) (#1456)

The `failed` branch in the `cascade runs debug --analyze --wait` poll loop shipped in #1453 (commit 48bdb26), but two of MNG-1669's acceptance criteria lacked dedicated regression coverage:

- AC4 ("returns on failed without waiting for the timeout"): the existing test only asserted the failure message. Add a test pinning that the loop polls exactly once and never prints the timeout message.
- AC3 ("--json output for the failed case is consistent with other terminal branches"): no test existed. Add a test asserting that under --json the failed branch logs the plain message, does not emit a JSON object (outputJson not called), and does not fetch analysis content, mirroring the idle/timeout branches.

Test-only; no production behavior change.

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
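The per-poll decision that these tests pin could look roughly like the following. This is a hypothetical reduction of the `--wait` loop to one pure step: `decideWaitStep`, `WaitDecision`, and `WAIT_DEADLINE_MS` are invented names, and the real loop in the CLI is not shown in this PR. The two messages and the 5-minute deadline are taken from the commit text.

```typescript
type DebugAnalysisStatus = "running" | "completed" | "failed" | "idle";

type WaitDecision =
  | { done: false }
  | { done: true; outcome: "completed" | "idle" | "failed" | "timeout"; message?: string };

// 5-minute deadline, as described in the fix(cli) commit.
const WAIT_DEADLINE_MS = 5 * 60 * 1000;

// Decide what the loop does after a single status poll.
function decideWaitStep(status: DebugAnalysisStatus, elapsedMs: number): WaitDecision {
  switch (status) {
    case "completed":
      return { done: true, outcome: "completed" };
    case "idle":
      return { done: true, outcome: "idle" };
    case "failed":
      // Terminal: return promptly instead of polling until the deadline.
      return { done: true, outcome: "failed", message: "Debug analysis failed." };
    default:
      return elapsedMs >= WAIT_DEADLINE_MS
        ? { done: true, outcome: "timeout", message: "Timed out waiting for debug analysis to complete." }
        : { done: false };
  }
}
```

With this shape, a `failed` status on the first poll ends the loop after exactly one poll, and the timeout message is reachable only from a non-terminal status.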

#1457)

Record the shipped durable, cross-process debug-analysis status model in the architecture deep-dives and the trigger README, replacing the stale "in-memory Set fallback" framing the work item inherited from an abandoned design.

- 01-services.md: add "Cross-process debug-analysis status" subsection documenting the durable `debug_analysis_status` table read uniformly in queue + local dev, the writer/reader split, status precedence + 2h staleness self-heal, and the deterministic `debug-analysis-<runId>` dashboard job as a dedup (not status) mechanism.
- 03-trigger-system.md: expand the auto-debug bullet and add a "Debug-analysis status" subsection cross-linking the service + DB docs.
- src/triggers/README.md: note the durable status on the auto-debug row and add a focused subsection.

Aligns the docs with the durable-table design already documented in 09-database.md (MNG-1667), which deleted the worker-local in-memory `debug-status.ts` Set because it was invisible to the dashboard process.

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

…er button (MNG-1668) (#1458)

Keep the Debug Analysis panel's in-progress affordance and the disabled trigger button visible from the moment the analysis is triggered until a terminal status, instead of flickering back the instant the trigger mutation settles.

- Extract pure decision helpers into web/src/lib/debug-analysis.ts (computeDebugAnalysisRunning, debugAnalysisRefetchInterval) so the polling/in-progress logic is node-testable.
- Poll the status query from trigger until a terminal status via a polling-active flag (set on trigger success, cleared on completed/failed) in addition to status === 'running'.
- isRunning now covers in-flight mutation + just-triggered + queued/running.
- Split the component into a data wrapper + presentational DebugAnalysisView so the rendered UX is render-testable; disable both Run / Re-run buttons while running and show a spinner affordance.
- Render an error for the failed status and re-enable the button; preserve the existing synchronous trigger-error handling.

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
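A sketch of what the extracted decision helpers might look like. The function names `computeDebugAnalysisRunning` and `debugAnalysisRefetchInterval` come from the commit message, but their argument shapes, the poll interval, and the `nextPollingActive` helper are assumptions made for illustration.

```typescript
type DebugAnalysisStatus = "idle" | "queued" | "running" | "completed" | "failed";

// Hypothetical poll cadence; the PR does not state it.
const DEBUG_ANALYSIS_POLL_MS = 3000;

// Hypothetical input shape for the helpers.
interface DebugAnalysisUiState {
  isTriggerPending: boolean; // trigger mutation is in flight
  pollingActive: boolean;    // set on trigger success, cleared on a terminal status
  status: DebugAnalysisStatus | undefined;
}

function isInFlightStatus(status: DebugAnalysisStatus | undefined): boolean {
  return status === "queued" || status === "running";
}

// In-flight mutation + just-triggered + queued/running all count as running.
function computeDebugAnalysisRunning(state: DebugAnalysisUiState): boolean {
  return state.isTriggerPending || state.pollingActive || isInFlightStatus(state.status);
}

// Poll while the flag is set or the server reports an in-flight status.
function debugAnalysisRefetchInterval(
  state: Pick<DebugAnalysisUiState, "pollingActive" | "status">,
): number | false {
  return state.pollingActive || isInFlightStatus(state.status) ? DEBUG_ANALYSIS_POLL_MS : false;
}

// The polling flag is cleared once a terminal status is observed.
function nextPollingActive(current: boolean, status: DebugAnalysisStatus | undefined): boolean {
  return status === "completed" || status === "failed" ? false : current;
}
```

Because the flag survives the moment the mutation settles, the button stays disabled through the gap before the server first reports `running`.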

…ectly (#1459)

Tailwind v4's `@theme inline` bakes token values directly into utility classes (e.g. bg-card → background-color: oklch(1 0 0)) instead of emitting CSS custom properties. The .dark { --color-card: ... } overrides had no effect on those utilities, leaving cards, tab bars, and other token-backed components stuck at their light-mode colours in dark mode.

Removing `inline` makes Tailwind emit :root CSS custom properties so the existing .dark { } block takes effect for every token-based utility across the dashboard: bg-card, bg-muted, bg-input, bg-sidebar, border-border, etc.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

…ent lint to changed files (#1460)

* fix(worker): disable setup.sh idle timeout, bump to Node 24, scope agent lint to changed files

Three related fixes after a worker run where .cascade/setup.sh was killed mid-Ruby-compilation and the agent then stalled on a full-tree eslint:fix:

1. setup.sh idle timeout: ruby-build suppresses make output by design, causing the 120s idle timeout to fire during compilation. Disable the idle timeout for setup.sh; the wall timeout (10min) remains the safety net for truly hung scripts. Also surface `reason` in the log so wall-timeout kills are distinguishable from idle-timeout kills.
2. Worker image Node 24: bump builder and production stages from node:22 to node:24, eliminating the EBADENGINE mismatch for projects that require Node >=24 (and reducing what setup.sh needs to install via asdf).
3. implementation.eta step 7: replace vague "run linting" instruction with an explicit changed-files-only lint pattern. Full-tree eslint --fix can take 10+ minutes on large TypeScript codebases and kills sessions. Also add a kill-on-timeout instruction to tmux.eta partial.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(prompts): include untracked files in scoped agent lint

Review feedback on PR #1460: the scoped-lint guidance in step 7 still relies on `git diff --name-only HEAD`, which lists only tracked changes. At step 7 the agent has not committed yet (commit happens inside CreatePR at step 8) and nothing was `git add`ed, so brand-new source/test files are untracked and get excluded from the lint scope. Local lint then passes while the full-project `lint-and-test` CI job fails on exactly those new files, the opposite of this step's intent.

Pick up untracked files via `git ls-files --others --exclude-standard`, dedupe with `sort -u`, and use `xargs -r` so the linter is not invoked with zero args when nothing matches. Note the same treatment applies to the Biome/other-linter variant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Cascade Bot <bot@cascade.dev>
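The shipped guidance is a shell pipeline inside a prompt template, which is not reproduced in this PR description. The same scoping logic can be expressed in TypeScript to make the three steps explicit: merge tracked and untracked paths, dedupe and sort (the `sort -u` step), and skip the linter when nothing matches (the `xargs -r` step). The function names, the extension filter, and the `npx eslint --fix` command shape are assumptions for illustration.

```typescript
// Hypothetical filter: which changed files the linter should see.
const LINTABLE_FILE = /\.(ts|tsx|js|jsx)$/;

// trackedChanged: output lines of `git diff --name-only HEAD`
// untracked:      output lines of `git ls-files --others --exclude-standard`
function collectLintTargets(trackedChanged: string[], untracked: string[]): string[] {
  const cleaned = trackedChanged
    .concat(untracked)
    .map((path) => path.trim())
    .filter((path) => path.length > 0);
  // Dedupe and sort, mirroring `sort -u`.
  const unique = Array.from(new Set(cleaned));
  return unique.filter((path) => LINTABLE_FILE.test(path)).sort();
}

// Returns null when there is nothing to lint, mirroring `xargs -r`.
function buildLintCommand(targets: string[]): string[] | null {
  return targets.length === 0 ? null : ["npx", "eslint", "--fix"].concat(targets);
}
```

For example, a new untracked test file appears in the targets even though `git diff --name-only HEAD` alone would miss it, which is the gap the review feedback pointed out.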

codecov · 2026-06-26T10:11:40Z

Codecov Report

✅ All modified and coverable lines are covered by tests.


aaight and others added 8 commits June 25, 2026 15:13

zbigniewsobiecki merged commit da22a5b into main Jun 26, 2026
13 of 14 checks passed

zbigniewsobiecki merged 8 commits into main from dev