Fail started agent runs on handler errors#4640

Open
KyleAMathews wants to merge 2 commits into main from fix-started-runs-handler-errors

Conversation

@KyleAMathews

Contributor

Executive Summary

Mark newly-started agent runs as failed when a wake handler errors before ending them. This prevents the chat UI from showing "Thinking" indefinitely for runtime failure paths where the run start was persisted but no terminal run status was written.

Root Cause

The UI treats the latest run with status: started as actively streaming. When a handler created a run and then threw before calling the run's end path, processWake wrote an error event but did not also update that newly-started run to a terminal status. If the run was not yet visible through the local run collection, the existing failure association logic could not find it either.
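The stuck state described above can be illustrated with a minimal sketch. The `Run` shape and the `isThinking` helper are hypothetical stand-ins, not the actual UI code; only the `started` and `failed` status values come from this description, and `completed` is an assumed name for the success status.

```typescript
// Hypothetical run record; the real schema likely carries more fields.
type RunStatus = 'started' | 'completed' | 'failed'

interface Run {
  id: string
  status: RunStatus
}

// Assumed UI rule: the latest run alone drives the "Thinking" indicator.
function isThinking(runs: readonly Run[]): boolean {
  const latest = runs.length > 0 ? runs[runs.length - 1] : undefined
  return latest !== undefined && latest.status === 'started'
}

// The handler started a run, then threw before ending it. Nothing moves
// the run out of `started`, so the indicator never clears.
const beforeFix: Run[] = [{ id: 'run-1', status: 'started' }]

// With a terminal update written on handler failure, the indicator clears.
const afterFix: Run[] = [{ id: 'run-1', status: 'failed' }]

console.log(isThinking(beforeFix), isThinking(afterFix))
```

Under this assumed rule, writing an error event alone changes nothing for the UI, which is why the run itself needs a terminal status.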

Approach

  • Track run status events emitted through processWake during the current wake.
  • On handler failure, find the latest new run that is still started:
    • prefer DB-visible new started runs, and
    • fall back to produced run events that have not yet materialized in the local collection.
  • Write a terminal run update before the error event:
    • status: 'failed'
    • finish_reason: 'error'

The failure path deliberately checks that DB-visible runs are still started, so a run that already completed successfully is not regressed to failed if later cleanup throws.
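The selection logic above might look roughly like the following sketch. It is not the code in process-wake.ts: `pickRunToFail`, `RunRow`, `RunStatusEvent`, and the `knownBeforeWake` set are hypothetical names chosen for illustration, and `completed` is an assumed name for the success status. Only `status: 'failed'` and `finish_reason: 'error'` are taken from the description.

```typescript
type RunStatus = 'started' | 'completed' | 'failed'

// Hypothetical shapes, for illustration only.
interface RunRow { id: string; status: RunStatus }
interface RunStatusEvent { runId: string; status: RunStatus }
interface TerminalUpdate { runId: string; status: 'failed'; finish_reason: 'error' }

function pickRunToFail(
  dbRuns: readonly RunRow[],            // runs visible in the local collection, oldest first
  knownBeforeWake: ReadonlySet<string>, // run ids that already existed when the wake began
  produced: readonly RunStatusEvent[],  // status events emitted during this wake, in order
): TerminalUpdate | undefined {
  const isNew = (id: string): boolean => !knownBeforeWake.has(id)

  // 1. Prefer DB-visible new runs, and only if they are still `started`.
  for (let i = dbRuns.length - 1; i >= 0; i--) {
    const run = dbRuns[i]
    if (run !== undefined && isNew(run.id) && run.status === 'started') {
      return { runId: run.id, status: 'failed', finish_reason: 'error' }
    }
  }

  // 2. Fall back to produced events for runs that have not materialized yet.
  const visible = new Set(dbRuns.map((run) => run.id))
  const seen = new Set<string>()
  for (let i = produced.length - 1; i >= 0; i--) {
    const event = produced[i]
    if (event === undefined || seen.has(event.runId)) continue
    seen.add(event.runId) // first hit from the end is this run's latest status
    if (event.status === 'started' && isNew(event.runId) && !visible.has(event.runId)) {
      return { runId: event.runId, status: 'failed', finish_reason: 'error' }
    }
  }
  return undefined
}
```

In this sketch a run that the collection already shows as completed is never selected, even if a stale `started` event for it is still in the produced list, which mirrors the guard against regressing finished runs.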

Key Invariants

  • Every run started during a handler that then errors should become terminal if the runtime is still alive to handle the error.
  • A completed or already-failed run must not be overwritten by generic handler cleanup failure handling.
  • Error events should remain associated with the run that actually failed when such a run exists.
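One way to picture the ordering and association invariants is as the list of writes the failure path would emit. This is a sketch under assumptions: the `Write` union, the `kind` discriminator, the `runId` field on the error event, and `buildFailureWrites` are all hypothetical names, not the runtime's actual event schema.

```typescript
// Hypothetical write shapes, for illustration only.
type Write =
  | { kind: 'run_update'; runId: string; status: 'failed'; finish_reason: 'error' }
  | { kind: 'error'; message: string; runId?: string }

// `runToFail` is the id of the still-started run chosen by the failure
// path, or undefined when no such run exists.
function buildFailureWrites(error: Error, runToFail: string | undefined): Write[] {
  if (runToFail === undefined) {
    // No run to fail: the error event is written on its own.
    return [{ kind: 'error', message: error.message }]
  }
  return [
    // The terminal run update comes first...
    { kind: 'run_update', runId: runToFail, status: 'failed', finish_reason: 'error' },
    // ...followed by the error event, associated with the same run.
    { kind: 'error', message: error.message, runId: runToFail },
  ]
}
```

Because the caller passes `undefined` for runs that are already completed or failed, this step never touches a terminal run.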

Non-goals

Trade-offs

The runtime keeps small per-wake bookkeeping of run status events in addition to querying the local collection. This is intentionally narrow: it only covers events emitted in the current wake and avoids broader persisted lease/heartbeat semantics, which need a separate design for crash recovery.
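The per-wake bookkeeping could be as small as the sketch below. `WakeRunTracker` and its methods are hypothetical, not the names used in process-wake.ts, and `completed` is an assumed name for the success status; the point is only that the state is created for one wake and discarded afterwards.

```typescript
type RunStatus = 'started' | 'completed' | 'failed'

// Hypothetical tracker: one instance per wake, never persisted.
class WakeRunTracker {
  private readonly latest = new Map<string, RunStatus>()
  private readonly order: string[] = [] // run ids in first-seen order

  // Record a run status event emitted during the current wake.
  record(runId: string, status: RunStatus): void {
    if (!this.latest.has(runId)) this.order.push(runId)
    this.latest.set(runId, status)
  }

  // The most recently started run whose latest recorded status is still `started`.
  latestStillStarted(): string | undefined {
    for (let i = this.order.length - 1; i >= 0; i--) {
      const runId = this.order[i]
      if (runId !== undefined && this.latest.get(runId) === 'started') return runId
    }
    return undefined
  }
}

const tracker = new WakeRunTracker()
tracker.record('run-a', 'started')
tracker.record('run-b', 'started')
tracker.record('run-b', 'completed')
console.log(tracker.latestStillStarted())
```

Since the tracker lives only in memory for the duration of a wake, it cannot help if the process dies mid-run, which is the crash-recovery gap this section leaves to a separate design.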

Verification

pnpm --dir packages/agents-runtime test process-wake.test.ts --run --reporter=dot

Files changed

  • packages/agents-runtime/src/process-wake.ts: fail the latest new still-started run when handler errors escape, with produced-event fallback for unmaterialized local writes.
  • packages/agents-runtime/test/process-wake.test.ts: add regression coverage for failing newly-started runs and for not regressing completed runs to failed.
  • .changeset/fail-started-agent-runs.md: patch changeset for @electric-ax/agents-runtime.

Refs #4633

@github-actions

github-actions Bot commented Jun 19, 2026

Contributor

Electric Agents Desktop Builds

Build artifacts for commit e6b9c58.

Platform | Status | Artifact
macOS Apple Silicon | Passed | DMG
macOS Intel | Passed | DMG
Windows x64 | Passed | Installer
Linux x64 | Passed | AppImage / deb

Workflow run

@codecov

codecov Bot commented Jun 19, 2026


Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 58.00%. Comparing base (ee0da19) to head (e6b9c58).
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4640      +/-   ##
==========================================
- Coverage   59.46%   58.00%   -1.46%     
==========================================
  Files         385      342      -43     
  Lines       43039    40351    -2688     
  Branches    12383    11779     -604     
==========================================
- Hits        25591    23404    -2187     
+ Misses      17371    16909     -462     
+ Partials       77       38      -39     
Flag | Coverage Δ
packages/agents | 72.64% <ø> (ø)
packages/agents-mcp | ?
packages/agents-mobile | 80.67% <ø> (ø)
packages/agents-runtime | 83.51% <100.00%> (+0.04%) ⬆️
packages/agents-server | 75.45% <ø> (-0.03%) ⬇️
packages/agents-server-ui | 7.51% <ø> (ø)
packages/electric-ax | 51.06% <ø> (ø)
packages/experimental | ?
packages/react-hooks | ?
packages/start | ?
packages/typescript-client | ?
packages/y-electric | ?
typescript | 58.00% <100.00%> (-1.46%) ⬇️
unit-tests | 58.00% <100.00%> (-1.46%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.


@github-actions

Contributor

Electric Agents Mobile Build

Local mobile checks ran for commit e6b9c58.

The EAS Android preview build was skipped because the mobile-eas-build label is not present.
Add the mobile-eas-build label to this PR to produce an installable preview build.

Workflow run
