chore: release - merge dev into main #1461
Merged
…1454)

A freshly-shared /runs/<id> URL can point at a run row the worker has not committed yet (the window between "URL shared in a GitHub/PM comment" and "worker commits the run row"). Previously the page flashed a misleading terminal "Run not found" immediately.

- Add shared RunPendingState component (Loader2 spinner + "Run is starting…" heading + subtext, optional message prop) for reuse by the work-item runs page in a follow-up story.
- Add NOT_FOUND-aware retry to the runs.getById query (retry only while the error is NOT_FOUND and within RUN_PENDING_MAX_RETRIES, retryDelay RUN_PENDING_POLL_MS) so the page self-heals to the real run within seconds.
- Replace the terminal "Run not found" branch with resolveRunDetailView() branching (pending / loading / not-found / error / ready); soften the not-found copy to acknowledge the run may have been cancelled/removed.
- Preserve the existing 5s refetchInterval while status === running.

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
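The retry rule and view branching described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the shipped code: the constant values, the `QueryErrorLike` error shape, and the `resolveRunDetailViewSketch` signature are assumptions, since the commit names `RUN_PENDING_MAX_RETRIES`, `RUN_PENDING_POLL_MS`, and `resolveRunDetailView()` without showing them.

```typescript
// Hypothetical values: the commit names these constants but does not state them.
const RUN_PENDING_MAX_RETRIES = 10;
const RUN_PENDING_POLL_MS = 2_000;

// Assumed error shape: a tRPC-style error that carries a string code.
interface QueryErrorLike {
  data?: { code?: string } | null;
}

function isNotFound(error: QueryErrorLike | null | undefined): boolean {
  return error?.data?.code === "NOT_FOUND";
}

// Matches the (failureCount, error) => boolean shape accepted by the
// `retry` option of a TanStack Query query.
function shouldRetryRunQuery(failureCount: number, error: QueryErrorLike): boolean {
  return isNotFound(error) && failureCount < RUN_PENDING_MAX_RETRIES;
}

type RunDetailView = "pending" | "loading" | "not-found" | "error" | "ready";

// Hypothetical stand-in for resolveRunDetailView(); the input fields are assumed.
function resolveRunDetailViewSketch(q: {
  hasData: boolean;
  isLoading: boolean;
  failureReason: QueryErrorLike | null; // last failure while still retrying
  error: QueryErrorLike | null; // final error once retries are exhausted
}): RunDetailView {
  if (q.hasData) return "ready";
  // Still retrying after a NOT_FOUND: the run row may simply not exist yet.
  if (q.isLoading) return isNotFound(q.failureReason) ? "pending" : "loading";
  if (q.error) return isNotFound(q.error) ? "not-found" : "error";
  return "loading";
}
```

In a query definition these would plug in as `retry: shouldRetryRunQuery` and `retryDelay: RUN_PENDING_POLL_MS`, so only the "row not committed yet" case is retried and other errors surface immediately.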
…1455)

* feat(web): show "Run is starting…" for ack-time work-item runs links

Work-item runs links (`/work-items/$projectId/$workItemId`) are posted at ack time, before the worker creates the run row, so `workItems.runs` returns `[]` and the page flashed a terminal "No runs found". Track mount time, keep polling through the bounded grace window, and render the shared `RunPendingState` while empty-within-grace.

- Route tracks `mountedAt = useRef(Date.now())` and computes `elapsedMs`.
- Replace the bespoke refetchInterval with the shared `workItemRunsRefetchInterval({ hasRunning, isEmpty, elapsedMs })` so the page polls through the ack-before-create window.
- Derive `isPending` from `resolveWorkItemRunsView(...)` and pass it into `WorkItemRunsTable`; the empty branch renders `RunPendingState` when pending, else keeps "No runs found" (default preserved for the PR page).
- Reuses the `run-pending.ts` helper + `RunPendingState` from the earlier stories; no new logic primitives.
- Doc: add a run-links note to docs/architecture/08-config-credentials.md about the transient "starting" state for links shared before the run row.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(web): re-render work-item runs page at the pending grace boundary

The work-item runs page derived `isPending` from elapsed time during render but had no driver to re-render once the grace window elapsed with a still-empty list. React Query structural-shares the same empty-array reference across polls and the component reads only data/isLoading/isError/error, so no tracked prop changed after the first empty result, leaving the page stuck on "Run is starting…" forever instead of reverting to "No runs found".

Schedule a one-shot timer (a `useState` tick) for the moment the grace window elapses while pending, so `isPending` is recomputed to false exactly at the boundary and the table falls back to "No runs found".

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
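The grace-window logic from the two commits above can be sketched as pure functions. This is an illustration under assumptions: the grace and poll durations are invented, and the function names are hypothetical stand-ins for `workItemRunsRefetchInterval` and `resolveWorkItemRunsView`, whose real bodies are not shown in the commit messages.

```typescript
// Hypothetical durations: the commits do not state the actual values.
const RUN_PENDING_GRACE_MS = 30_000;
const RUNS_POLL_MS = 5_000;

interface RunsPollInput {
  hasRunning: boolean; // at least one run is still running
  isEmpty: boolean; // the runs list came back as []
  elapsedMs: number; // time since the route mounted
}

function withinGrace(elapsedMs: number): boolean {
  return elapsedMs < RUN_PENDING_GRACE_MS;
}

// Sketch of the shared refetch-interval decision: keep polling while a run
// is running, or while the list is empty but the grace window is still open.
function runsRefetchIntervalSketch(input: RunsPollInput): number | false {
  if (input.hasRunning) return RUNS_POLL_MS;
  if (input.isEmpty && withinGrace(input.elapsedMs)) return RUNS_POLL_MS;
  return false;
}

// "Pending" means: nothing to show yet, but it is too early to say "No runs found".
function isRunsPendingSketch(input: { isEmpty: boolean; elapsedMs: number }): boolean {
  return input.isEmpty && withinGrace(input.elapsedMs);
}

// Delay for the one-shot re-render timer: milliseconds until the grace
// boundary while pending, or null when no timer is needed.
function msUntilGraceBoundary(isPending: boolean, elapsedMs: number): number | null {
  if (!isPending) return null;
  return Math.max(0, RUN_PENDING_GRACE_MS - elapsedMs);
}
```

In a component, the value from `msUntilGraceBoundary` would feed a `setTimeout` inside an effect that bumps a `useState` counter, which forces the re-render that polling alone does not trigger when the data reference never changes.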
* fix(api): make debug-analysis status durable and cross-process

In production the debug-analysis job runs in a separate worker process, so the dashboard API's in-memory `isAnalysisRunning()` Set never reports `running`. Derive running/queued/failed status from the durable BullMQ dashboard-jobs queue (keyed by the deterministic `debug-analysis-<runId>` id) when REDIS_URL is set, and keep the in-memory + DB path for local dev.

- getDebugAnalysisStatus: queue-mode precedence (in-flight job -> running short-circuiting the DB; DB record -> completed; failed job -> failed; else idle). Status union gains an additive `failed` literal.
- triggerDebugAnalysis: queue-mode already-running guard throws CONFLICT on an in-flight job; re-run removes any prior terminal job then re-enqueues with the deterministic id so it is idempotent and reusable.
- Read the REDIS_URL gate at call time (isQueueMode()) instead of a module-load const so it reflects the live env and is unit-testable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(api): make debug-analysis status reflect the analysis lifecycle, not spawn

The durable signal the previous approach relied on (the cascade-dashboard-jobs BullMQ job) reaches `completed` at container *spawn*, not at analysis completion: guardedSpawn resolves once `container.start()` returns, while the debug agent runs for tens of seconds to minutes inside the spawned container and the `debug_analyses` row is written only at the very end. So `running` was observable only during the brief spawn window. During the actual analysis, status read `idle` (frontend stopped polling and showed the empty "Run Analysis" state) and the double-press guard lapsed (a second trigger re-enqueued a duplicate: 2× LLM cost + duplicate PM comment).

Replace the job-state signal with a worker-owned, durable, cross-process lifecycle signal that models the analysis itself:

- New `debug_analysis_status` table (PK on analyzed_run_id, ON DELETE CASCADE), migration 0056. The worker marks `running` around `triggerDebugAnalysis`, clears it on success (a present `debug_analyses` row is then `completed`), and marks `failed` on error. The dashboard also marks `running` at trigger time to cover the enqueue→spawn window.
- `getDebugAnalysisStatus` and the re-trigger guard now read this row uniformly in queue and local-dev mode; no more BullMQ job-state reads. A stale `running` row (crashed worker, > DEBUG_ANALYSIS_RUNNING_STALE_MS = 2h, above the 30 min worker timeout) is ignored so a crash never wedges the run permanently.
- The deterministic job id + remove-before-submit re-enqueue is kept so a near-simultaneous second trigger that slips past the guard cannot spawn a duplicate container. `running` is marked only after a successful enqueue so a failed enqueue cannot leave a phantom marker blocking retries.
- Remove the worker-local in-memory tracker (`debug-status.ts`), which was never visible to the dashboard process, and the now-unused `getDashboardJobState` helper.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(cli): surface terminal failed status in debug-analysis --wait loop

The `cascade runs debug <id> --analyze --wait` poll loop only returned on `completed` and `idle`. After the durable-status PR added a real `failed` status, a failed analysis was no longer surfaced as `idle`, so the loop ignored it and polled for the full 5-minute deadline before printing a misleading "Timed out waiting for debug analysis to complete." message.

Add a `failed` branch mirroring the existing `idle` branch so a failed analysis returns promptly with an accurate "Debug analysis failed." message.

Also corrects the imprecise comment in runs.ts: the leftover terminal status row is overwritten by the markDebugAnalysisRunning upsert, not deleted by deleteDebugAnalysisByRunId.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(debug): cover durable failed-write error path in debug analysis runner

Pins the defensive invariant in triggerDebugAnalysis that a failing markDebugAnalysisFailed write logs a warning but does not mask or replace the original agent error. This is the catch-handler branch that codecov flagged as the uncovered patch lines in src/triggers/shared/debug-runner.ts. Test-only; no production code change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(api): correct debug-analysis guard comment to reflect uniform durable read

The already-running guard comment claimed it was 'durable in queue mode, in-memory in local dev', leftover narrative from the abandoned approach. The shipped guard (assertDebugAnalysisNotInFlight) reads the durable debug_analysis_status row uniformly in both queue and local-dev mode; the in-memory isAnalysisRunning mechanism was deleted. Aligns the comment with the function's own docstring and the rest of the durable-table design.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(db): exercise debug_analysis_status ON DELETE CASCADE with a targeted delete

The ON DELETE CASCADE test used truncateAll(), which issues `TRUNCATE ... CASCADE` and clears every referencing table regardless of the FK action, so it would pass even if the constraint were NO ACTION/RESTRICT, never actually exercising the clause it names. Delete only the parent agent_runs row instead: the DELETE now succeeds and removes the status row solely because the FK cascades; a non-cascading FK would raise a foreign-key violation. Also assert the status row exists before the delete to make the cascade observable.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(debug): document failed-status coverage caveat for hard kills

The `failed` debug-analysis status is written only from the in-process catch in triggerDebugAnalysis, so it covers catchable in-process errors only. A hard kill (watchdog/OOM) or a throw before the runner is reached (e.g. processDashboardJob failing to load the project config after the dashboard already marked running) leaves a `running` row that self-stales to `idle` rather than surfacing `failed`. Documents this deliberate tradeoff in-code per repeated review feedback (non-blocking).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(debug): note getRunById-null early-return in failed-coverage caveat

The latest review observed that the runner's own early getRunById-null return is another path where the dashboard-written 'running' marker is never cleared or flipped to 'failed' (it self-stales to 'idle' after DEBUG_ANALYSIS_RUNNING_STALE_MS). That early return is a bare return before the try block, so the existing catch-block caveat did not cover it. Document the gap at the early-return site and extend the catch-block caveat to list it. Comment-only; no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(db): document the durable debug_analysis_status table

Records the shipped cross-process debug-analysis status design (new table from MNG-1667) in the database architecture doc: ER relationship, key-tables row, and repository purpose. Captures the durable-table approach that actually shipped, not the abandoned queue-derived design the reviewer flagged, so the permanent in-repo record is accurate.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(db): note failed-status coverage caveat on debug_analysis_status schema

Surface the reviewer's persistently-flagged caveat in the canonical table definition: `failed` is written only for catchable in-process errors (the debug-runner's `catch`), so a hard kill (watchdog/OOM) or a throw before the runner is reached self-stales the `running` row to `idle` rather than surfacing `failed`. Also names DEBUG_ANALYSIS_RUNNING_STALE_MS and points at isDebugAnalysisRunActive so the staleness threshold is discoverable from the schema. Comment-only; no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* test(db): pin exact staleness boundary for isDebugAnalysisRunActive

Lock the strict `<` against DEBUG_ANALYSIS_RUNNING_STALE_MS so a future refactor to `<=` (or back) is caught. Existing tests covered only a fresh (~0ms) and a clearly-stale (threshold + 1s) row; this adds the boundary itself, exactly at the threshold (stale) and one ms under (active), under a frozen clock with real-timer cleanup. That boundary is what guarantees a crashed/OOM-killed worker never wedges a debug analysis as permanently `running`, which the durable debug_analysis_status design relies on.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(worker): mark debug analysis failed on pre-runner config-load failure

When `processDashboardJob` cannot load the project config for a `debug-analysis` job, it throws before `triggerDebugAnalysis` (whose `catch` writes the durable `failed` status) is ever reached. The dashboard has already marked the run `running` at trigger time, so the `running` row would otherwise linger and self-stale to `idle` after `DEBUG_ANALYSIS_RUNNING_STALE_MS` rather than surfacing `failed`, blocking re-trigger with CONFLICT for that ~2h window.

Mark the run `failed` (best-effort, so a status-write error never masks the original project-not-found failure) at that site, extending `failed` coverage to this catchable in-process worker error. Hard kills (watchdog/OOM) remain the documented deliberate follow-up (router-side reconciliation on non-zero container exit).

Keeps the in-code coverage caveats accurate (debug-runner `catch` block and the `debug_analysis_status` schema docstring) and adds worker-entry tests asserting the run is marked failed and that the original error still propagates when the failed-write itself fails.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(db): document debug-analysis status precedence and failed caveat

Record the shipped durable-table behavior in the merged architecture doc: the getDebugAnalysisStatus precedence (active running → completed (DB-wins) → failed → idle, uniform in queue + local dev) and the failed-coverage caveat (catchable in-process errors only; hard-kill/OOM self-stales the running row to idle, with router-side reconciliation the deliberate follow-up). Documentation-only; no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
Co-authored-by: cascade-bot <cascade-bot@users.noreply.github.com>
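The status precedence and staleness rule described in the commits above (active running → completed → failed → idle, with a strict `<` against the 2h threshold) can be sketched as below. The row shape, its field names, and the function names ending in `Sketch` are assumptions for illustration; only the constant name, its 2h value, and the precedence order come from the commit messages.

```typescript
// 2h, as stated in the commit message.
const DEBUG_ANALYSIS_RUNNING_STALE_MS = 2 * 60 * 60 * 1000;

type DebugAnalysisStatus = "idle" | "running" | "completed" | "failed";

// Assumed shape of a debug_analysis_status row; the column names are hypothetical.
interface StatusRow {
  status: "running" | "failed";
  updatedAtMs: number;
}

// A running row counts as active only while strictly younger than the
// threshold, so a row left behind by a crashed worker eventually stops counting.
function isRunActiveSketch(row: StatusRow | null, nowMs: number): boolean {
  if (row === null || row.status !== "running") return false;
  return nowMs - row.updatedAtMs < DEBUG_ANALYSIS_RUNNING_STALE_MS;
}

// Precedence: active running, then completed (a stored analysis wins),
// then failed, then idle.
function resolveStatusSketch(
  row: StatusRow | null,
  hasAnalysisRecord: boolean,
  nowMs: number,
): DebugAnalysisStatus {
  if (isRunActiveSketch(row, nowMs)) return "running";
  if (hasAnalysisRecord) return "completed";
  if (row !== null && row.status === "failed") return "failed";
  return "idle";
}
```

Under this sketch a stale `running` row with no stored analysis resolves to `idle`, which matches the self-staling behavior the caveat commits describe for hard kills.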
…t (MNG-1669) (#1456)

The `failed` branch in the `cascade runs debug --analyze --wait` poll loop shipped in #1453 (commit 48bdb26), but two of MNG-1669's acceptance criteria lacked dedicated regression coverage:

- AC4 ("returns on failed without waiting for the timeout"): the existing test only asserted the failure message. Add a test pinning that the loop polls exactly once and never prints the timeout message.
- AC3 ("--json output for the failed case is consistent with other terminal branches"): no test existed. Add a test asserting that under --json the failed branch logs the plain message, does not emit a JSON object (outputJson not called), and does not fetch analysis content, mirroring the idle/timeout branches.

Test-only; no production behavior change.

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
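The per-poll decision in the `--wait` loop can be sketched as a pure function, which makes the "returns on failed without waiting for the timeout" criterion easy to pin. The function and type names here are hypothetical; the commit messages describe the branches (`completed`, `idle`, `failed`, timeout) but not the CLI's actual code.

```typescript
type PolledStatus = "idle" | "running" | "completed" | "failed";

// What the loop should do after one poll.
type WaitAction = "completed" | "idle" | "failed" | "timeout" | "poll-again";

// Hypothetical helper: decide the next step from the polled status and the clock.
function nextWaitAction(status: PolledStatus, nowMs: number, deadlineMs: number): WaitAction {
  // Terminal statuses return immediately, regardless of remaining time.
  if (status === "completed") return "completed";
  if (status === "idle") return "idle";
  if (status === "failed") return "failed"; // would print "Debug analysis failed."
  // Still running: give up only once the deadline has passed.
  if (nowMs >= deadlineMs) return "timeout"; // would print the timed-out message
  return "poll-again";
}
```

A loop built on this helper would return as soon as the result is anything other than `poll-again`, so a `failed` status on the first poll ends the wait after exactly one status request.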
#1457)

Record the shipped durable, cross-process debug-analysis status model in the architecture deep-dives and the trigger README, replacing the stale "in-memory Set fallback" framing the work item inherited from an abandoned design.

- 01-services.md: add "Cross-process debug-analysis status" subsection documenting the durable `debug_analysis_status` table read uniformly in queue + local dev, the writer/reader split, status precedence + 2h staleness self-heal, and the deterministic `debug-analysis-<runId>` dashboard job as a dedup (not status) mechanism.
- 03-trigger-system.md: expand the auto-debug bullet and add a "Debug-analysis status" subsection cross-linking the service + DB docs.
- src/triggers/README.md: note the durable status on the auto-debug row and add a focused subsection.

Aligns the docs with the durable-table design already documented in 09-database.md (MNG-1667), which deleted the worker-local in-memory `debug-status.ts` Set because it was invisible to the dashboard process.

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
…er button (MNG-1668) (#1458)

Keep the Debug Analysis panel's in-progress affordance and the disabled trigger button visible from the moment the analysis is triggered until a terminal status, instead of flickering back the instant the trigger mutation settles.

- Extract pure decision helpers into web/src/lib/debug-analysis.ts (computeDebugAnalysisRunning, debugAnalysisRefetchInterval) so the polling/in-progress logic is node-testable.
- Poll the status query from trigger until a terminal status via a polling-active flag (set on trigger success, cleared on completed/failed) in addition to status === 'running'.
- isRunning now covers in-flight mutation + just-triggered + queued/running.
- Split the component into a data wrapper + presentational DebugAnalysisView so the rendered UX is render-testable; disable both Run / Re-run buttons while running and show a spinner affordance.
- Render an error for the failed status and re-enable the button; preserve the existing synchronous trigger-error handling.

Co-authored-by: Cascade Bot <bot@cascade.dev>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
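The in-progress and polling decisions listed in the commit above might look roughly like this as pure helpers. The names ending in `Sketch`, the input shapes, and the poll interval are assumptions; the commit names `computeDebugAnalysisRunning` and `debugAnalysisRefetchInterval` without showing their signatures.

```typescript
type PanelStatus = "idle" | "queued" | "running" | "completed" | "failed";

// Hypothetical poll interval; the commit does not state the real value.
const DEBUG_ANALYSIS_POLL_MS = 3_000;

interface PanelState {
  mutationPending: boolean; // the trigger mutation is in flight
  pollingActive: boolean; // set on trigger success, cleared on a terminal status
  status: PanelStatus | undefined; // latest status from the status query
}

// In-progress covers: in-flight mutation + just-triggered + queued/running.
function computeRunningSketch(s: PanelState): boolean {
  return s.mutationPending || s.pollingActive || s.status === "queued" || s.status === "running";
}

// Keep polling while just-triggered or while the server reports running.
function refetchIntervalSketch(s: Pick<PanelState, "pollingActive" | "status">): number | false {
  return s.pollingActive || s.status === "running" ? DEBUG_ANALYSIS_POLL_MS : false;
}

// The polling-active flag clears once a terminal status is observed.
function nextPollingActive(current: boolean, status: PanelStatus | undefined): boolean {
  if (status === "completed" || status === "failed") return false;
  return current;
}
```

Keeping these as plain functions of state is what makes them testable in Node without rendering the component, which is the stated reason for extracting them.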
…ectly (#1459)

Tailwind v4's `@theme inline` bakes token values directly into utility classes (e.g. bg-card → background-color: oklch(1 0 0)) instead of emitting CSS custom properties. The .dark { --color-card: ... } overrides had no effect on those utilities, leaving cards, tab bars, and other token-backed components stuck at their light-mode colours in dark mode.

Removing `inline` makes Tailwind emit :root CSS custom properties so the existing .dark { } block takes effect for every token-based utility across the dashboard: bg-card, bg-muted, bg-input, bg-sidebar, border-border, etc.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
…ent lint to changed files (#1460)

* fix(worker): disable setup.sh idle timeout, bump to Node 24, scope agent lint to changed files

Three related fixes after a worker run where .cascade/setup.sh was killed mid-Ruby-compilation and the agent then stalled on a full-tree eslint:fix:

1. setup.sh idle timeout: ruby-build suppresses make output by design, causing the 120s idle timeout to fire during compilation. Disable the idle timeout for setup.sh; the wall timeout (10min) remains the safety net for truly hung scripts. Also surface `reason` in the log so wall-timeout kills are distinguishable from idle-timeout kills.
2. Worker image Node 24: bump builder and production stages from node:22 to node:24, eliminating the EBADENGINE mismatch for projects that require Node >=24 (and reducing what setup.sh needs to install via asdf).
3. implementation.eta step 7: replace vague "run linting" instruction with an explicit changed-files-only lint pattern. Full-tree eslint --fix can take 10+ minutes on large TypeScript codebases and kills sessions. Also add a kill-on-timeout instruction to tmux.eta partial.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(prompts): include untracked files in scoped agent lint

Review feedback on PR #1460: the scoped-lint guidance in step 7 still relies on `git diff --name-only HEAD`, which lists only tracked changes. At step 7 the agent has not committed yet (commit happens inside CreatePR at step 8) and nothing was `git add`ed, so brand-new source/test files are untracked and get excluded from the lint scope. Local lint then passes while the full-project `lint-and-test` CI job fails on exactly those new files, the opposite of this step's intent.

Pick up untracked files via `git ls-files --others --exclude-standard`, dedupe with `sort -u`, and use `xargs -r` so the linter is not invoked with zero args when nothing matches. Note the same treatment applies to the Biome/other-linter variant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Cascade Bot <bot@cascade.dev>
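The scoped-lint rule from the second commit above (tracked changes plus untracked files, deduplicated, and no linter call when nothing matches) is a shell pipeline in the prompt itself. The same logic can be sketched in TypeScript to make the three steps explicit. The function names and the extension filter are hypothetical and not part of the shipped prompt.

```typescript
// Inputs stand for the line-split output of:
//   git diff --name-only HEAD                  (tracked changes)
//   git ls-files --others --exclude-standard   (untracked, not ignored)
function scopedLintTargets(
  trackedChanged: string[],
  untracked: string[],
  extensions: string[] = [".ts", ".tsx"], // assumed filter for an ESLint run
): string[] {
  const cleaned = [...trackedChanged, ...untracked]
    .map((file) => file.trim())
    .filter((file) => file.length > 0);
  // Deduplicate and sort, mirroring `sort -u`.
  const unique = Array.from(new Set(cleaned)).sort();
  return unique.filter((file) => extensions.some((ext) => file.endsWith(ext)));
}

// Mirrors `xargs -r`: build no command at all when there are no targets.
function lintCommand(targets: string[]): string[] | null {
  if (targets.length === 0) return null;
  return ["npx", "eslint", "--fix", ...targets];
}
```

Returning `null` for an empty target list matters because ESLint invoked without file arguments would either error out or fall back to a broader scope, which is the outcome the scoped pattern is meant to avoid.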
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Automated release PR created by the release workflow.
Commits (8):