Skip to content

Fix Codex runtime and goal event handling#500

Merged
arul28 merged 1 commit into
mainfrom
ade/quail-codex-goal-loop-682e2913
Jun 1, 2026
Merged

Fix Codex runtime and goal event handling#500
arul28 merged 1 commit into
mainfrom
ade/quail-codex-goal-loop-682e2913

Conversation

@arul28
Copy link
Copy Markdown
Owner

@arul28 arul28 commented Jun 1, 2026

Summary

Fixes the ADE-side failure modes behind the Codex app-server instability seen in Work chats: a timed-out runtime action could tear down the whole project runtime, repeated goal sync notifications were rendered as visible timeline spam, Computer Use backend status was not exposed through ADE actions, and virtualized chat rows could keep stale heights after row identities changed.

What changed

  • Stop killing the owned ade-cli serve runtime on a single ade/actions/call timeout; ADE now drops the timed-out client socket and reconnects while keeping the runtime/PTY/chat process tree alive.
  • Keep unhandled promise rejections in ade-cli serve logged but non-fatal, while leaving uncaught exceptions fatal.
  • Expose computer_use_artifacts.getBackendStatus through ADE actions and soften the Codex Computer Use directive so agents do not stall when a helper is not exposed.
  • De-duplicate Codex goal timeline chips: ADE still refreshes stored goal usage/state, but only emits visible goal updates when objective/status/budget changes, and only emits clears for a known goal.
  • Key virtualized chat row measurements by row identity instead of row index to prevent stale measurements from creating blank transcript space.

The upstream Codex app-server README documents goal APIs as state-change notifications: thread/goal/set emits thread/goal/updated, thread/goal/get can return goal: null, and thread/goal/clear emits thread/goal/cleared when state changes. This patch treats those notifications as state sync, not as unconditional visible transcript events.

Docs reference: https://github.com/openai/codex/blob/main/codex-rs/app-server/README.md

Validation

  • npx vitest run src/main/services/chat/agentChatService.test.ts
  • npx vitest run src/main/services/localRuntime/localRuntimeConnectionPool.test.ts src/main/services/adeActions/registry.test.ts src/renderer/components/chat/AgentChatMessageList.test.tsx
  • npm --prefix apps/desktop run typecheck -- --pretty false
  • npm --prefix apps/ade-cli run typecheck -- --pretty false
  • npm --prefix apps/ade-cli run test -- src/cli.test.ts
  • env -u ADE_CHAT_SESSION_ID -u ADE_LANE_ID -u ADE_RUN_ID -u ADE_STEP_ID -u ADE_ATTEMPT_ID -u ADE_DEFAULT_ROLE npm --prefix apps/ade-cli run test -- src/adeRpcServer.test.ts
  • env -u ADE_CHAT_SESSION_ID -u ADE_LANE_ID -u ADE_RUN_ID -u ADE_STEP_ID -u ADE_ATTEMPT_ID -u ADE_DEFAULT_ROLE npm --prefix apps/ade-cli run test
  • npm run test:desktop:sharded
  • npm --prefix apps/ade-cli run build
  • npm --prefix apps/desktop run build
  • npm --prefix apps/desktop run lint (0 errors, existing warnings)
  • git diff --check
  • ade doctor --text

Risks

  • If a runtime action times out because the runtime is actually wedged, ADE now reconnects instead of killing the process immediately. A later explicit shutdown/disposal path still owns and cleans up the runtime child.
  • Goal usage fields can update without a visible timeline chip by design; the session summary still keeps the latest stored goal state.

ADE   Open in ADE  ·  ade/quail-codex-goal-loop-682e2913 branch  ·  PR #500

Summary by CodeRabbit

Release Notes

  • Bug Fixes

    • Unhandled async rejections no longer cause the CLI to exit unexpectedly
    • Chat transcript virtualization properly refreshes cached heights when messages change
    • Local runtime connections improved during timeout recovery
  • New Features

    • Added backend status capability for computer-use artifact handling
    • Enhanced computer-use backend selection with improved instructions and fallback behavior for unavailable backends

Greptile Summary

This PR hardens the Codex runtime and goal event pipeline across five distinct areas: runtime timeout recovery, rejection resilience in ade-cli serve, computer-use backend status exposure, Codex goal timeline deduplication, and virtualized chat row height keying.

  • Timeout recovery (localRuntimeConnectionPool.ts): A timed-out ade/actions/call now drops only the client socket, preserving the spawned runtime/PTY/chat tree; preserveOwnedRuntimeChildOnNextConnect signals the next tryConnect to re-attach ownership to the kept-alive child rather than treating it as an external daemon.
  • Goal event deduplication (agentChatService.ts): setCodexGoalAndMaybeEmitUpdate and clearKnownCodexGoal gate all visible timeline chips behind a diff of (objective, status, tokenBudget), so usage-only ticks and repeated sync notifications no longer flood the transcript; four new test cases cover clear-on-unknown, clear-on-known, usage dedup, and silent null-refresh paths.
  • Virtualized heights (AgentChatMessageList.tsx): measuredHeights is now keyed by row identity string instead of index, with synchronous GC during render to prevent stale cross-session measurements from creating phantom scroll space.

Confidence Score: 5/5

Safe to merge; all five fix areas are well-reasoned and covered by new targeted tests.

Each change is narrowly scoped and validated: the connection-pool ownership flag is tested for both the preserve and discard paths, the goal-deduplication helpers are covered by four new test cases across all meaningful event transitions, and the row-key fix is backed by a JSDOM height measurement test. No data loss, stale ownership, or incorrect event suppression paths were found.

No files require special attention.

Important Files Changed

Filename Overview
apps/ade-cli/src/cli.ts Adds a sliding-window rate limiter (5 logs per 60 s, then suppress with periodic summaries) for unhandled promise rejections, making them non-fatal while preventing unbounded log flooding; uncaught exceptions remain fatal.
apps/desktop/src/main/services/localRuntime/localRuntimeConnectionPool.ts Adds preserveOwnedRuntimeChildOnNextConnect flag so a timed-out client is dropped without killing the runtime process; tryConnect correctly re-attaches ownership only when the flag is set, and falls back to null otherwise.
apps/desktop/src/main/services/chat/agentChatService.ts Introduces setCodexGoalAndMaybeEmitUpdate and clearKnownCodexGoal helpers to deduplicate goal timeline chips; visible updates are gated on changes to objective/status/tokenBudget, and clears only fire when a goal was previously stored.
apps/desktop/src/renderer/components/chat/AgentChatMessageList.tsx Switches measuredHeights from index-keyed to row-identity-keyed and adds synchronous render-phase GC of stale keys, preventing phantom scroll space when session content changes.
apps/desktop/src/main/services/adeActions/registry.ts Adds getBackendStatus to the computer_use_artifacts action allowlist, exposing the broker's backend status through ADE actions.
apps/desktop/src/main/services/chat/agentChatService.test.ts Adds four new test cases covering: silent clear when no goal is known, event emission on known-goal clear, usage-only deduplication, and null-refresh silent path; also asserts new directive text.
apps/desktop/src/main/services/localRuntime/localRuntimeConnectionPool.test.ts Adds two new tests verifying ownership preservation (flag set) vs. stale-child clearance (flag unset) in tryConnect, and updates existing timeout test to assert no kill signal on timeout.
apps/desktop/src/renderer/components/chat/AgentChatMessageList.test.tsx New test validates that virtual sizer height shrinks when row identities change (different message prefix), confirming stale height measurements are not reused.
apps/desktop/src/main/services/adeActions/registry.test.ts Updates allowlist test and integration test to cover getBackendStatus in the computer_use_artifacts domain, including return-value assertion against the mock broker.

Reviews (2): Last reviewed commit: "Fix Codex runtime and goal event handlin..." | Re-trigger Greptile

@vercel
Copy link
Copy Markdown

vercel Bot commented Jun 1, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
ade Ignored Ignored Preview Jun 1, 2026 5:41pm

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Jun 1, 2026

Review Change Stack

📝 Walkthrough

Walkthrough

The PR integrates computer-use backend status checking into the action allowlist and chat directives, refactors goal state management with visible-state deduplication helpers, preserves runtime ownership on action timeouts, fixes virtualization height caching to use row identities instead of indices, and makes CLI unhandled rejections non-fatal while keeping the process alive.

Changes

Runtime service integration and chat improvements

Layer / File(s) Summary
Computer-use backend status action support
apps/desktop/src/main/services/adeActions/registry.ts, apps/desktop/src/main/services/adeActions/registry.test.ts, apps/desktop/src/main/services/chat/agentChatService.ts, apps/desktop/src/main/services/chat/agentChatService.test.ts
computer_use_artifacts allowlist now includes getBackendStatus action; backend selection directives updated to reference the action when available; allowlist and runtime service wiring validated in test suite.
Codex goal visible state abstraction
apps/desktop/src/main/services/chat/agentChatService.ts
New CodexGoalVisibleState and equality helper normalize goals to user-relevant fields (objective, status, tokenBudget) for deduplication. setCodexGoalAndMaybeEmitUpdate and clearKnownCodexGoal helpers centralize goal updates and clearing with conditional event emission only when visible state changes.
Codex goal update call site refactoring
apps/desktop/src/main/services/chat/agentChatService.ts, apps/desktop/src/main/services/chat/agentChatService.test.ts
All goal update and clearing paths refactored to use the new helpers for consistent deduplication behavior across budget updates, fallback objective updates, explicit goal clearing, and goal retrieval nulling; new tests verify codex_goal_cleared and codex_goal_updated event emission behavior and deduplication.
Local runtime ownership preservation on timeout
apps/desktop/src/main/services/localRuntime/localRuntimeConnectionPool.ts, apps/desktop/src/main/services/localRuntime/localRuntimeConnectionPool.test.ts
resetConnectionAfterActionTimeout now closes only the RPC client without disposing the child process; tryConnect returns the preserved ownedRuntimeChild when reconnecting to an existing socket, ensuring cleanup continues for the owned runtime instance after timed-out actions.
Chat message virtualization height cache
apps/desktop/src/renderer/components/chat/AgentChatMessageList.tsx, apps/desktop/src/renderer/components/chat/AgentChatMessageList.test.tsx
Virtualization measurement cache refactored from index-based to row-key-based (Map<string, number>); cleanup effect prunes stale entries not in the current row set; rowHeight and handleMeasure use row keys to avoid stale measurements after row identity changes; new test verifies cache correctness after identity changes.
CLI unhandledRejection non-fatal handling
apps/ade-cli/src/cli.ts, apps/ade-cli/src/cli.test.ts
unhandledRejection handler now logs late rejections as diagnostic events while keeping the process alive; uncaughtException remains fatal. Test fixture version reads from ADE_CLI_VERSION environment variable with "0.0.0" fallback.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • arul28/ADE#461: Changes to agentChatService, adeActions/registry, and LocalRuntimeConnectionPool address objectives related to backend status integration and runtime ownership management.

Possibly related PRs

  • arul28/ADE#401: Both PRs modify LocalRuntimeConnectionPool connection/timeout handling, updating resetConnectionAfterActionTimeout and tryConnect ownership preservation logic.
  • arul28/ADE#85: Both PRs modify computer-use chat directive logic in agentChatService.ts and related tests to enhance backend status handling behavior.

Suggested labels

desktop

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Title check ✅ Passed The title 'Fix Codex runtime and goal event handling' directly aligns with the main changes across the changeset, which broadly address Codex runtime error handling and goal event emission logic.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ade/quail-codex-goal-loop-682e2913

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@arul28 arul28 changed the title Quail Codex Goal Loop Fix Codex runtime and goal event handling Jun 1, 2026
@arul28
Copy link
Copy Markdown
Owner Author

arul28 commented Jun 1, 2026

@copilot review but do not make fixes

@arul28
Copy link
Copy Markdown
Owner Author

arul28 commented Jun 1, 2026

Post-rebase validation after updating onto origin/main (Linear Lane Launch Fixes #499):\n\n- npx vitest run src/main/services/chat/agentChatService.test.ts src/main/services/localRuntime/localRuntimeConnectionPool.test.ts src/main/services/adeActions/registry.test.ts src/renderer/components/chat/AgentChatMessageList.test.tsx\n- env -u ADE_CHAT_SESSION_ID -u ADE_LANE_ID -u ADE_RUN_ID -u ADE_STEP_ID -u ADE_ATTEMPT_ID -u ADE_DEFAULT_ROLE npm --prefix apps/ade-cli run test -- src/cli.test.ts src/adeRpcServer.test.ts\n- npm --prefix apps/desktop run typecheck -- --pretty false\n- npm --prefix apps/ade-cli run typecheck -- --pretty false\n- npm --prefix apps/ade-cli run build\n- npm --prefix apps/desktop run build

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
apps/desktop/src/renderer/components/chat/AgentChatMessageList.test.tsx (1)

1290-1292: ⚡ Quick win

Avoid fixed virtual-sizer thresholds in this regression test.

This couples the test to current sizing constants. Compare the rerendered height against the measured “Tall” baseline instead.

Proposed test hardening
-      await waitFor(() => {
-        expect(virtualSizerHeight(rendered.container)).toBeGreaterThan(6_200);
-      });
+      let tallHeight = 0;
+      await waitFor(() => {
+        tallHeight = virtualSizerHeight(rendered.container);
+        expect(tallHeight).toBeGreaterThan(0);
+      });
@@
-      await waitFor(() => {
-        expect(virtualSizerHeight(rendered.container)).toBeLessThan(6_200);
-      });
+      await waitFor(() => {
+        expect(virtualSizerHeight(rendered.container)).toBeLessThan(tallHeight);
+      });

Also applies to: 1301-1303

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@apps/desktop/src/renderer/components/chat/AgentChatMessageList.test.tsx`
around lines 1290 - 1292, The test uses a hard-coded threshold in the waitFor
assertion (calling virtualSizerHeight(rendered.container)
toBeGreaterThan(6_200)); change it to compare against the measured "Tall"
baseline value instead of the fixed 6_200 constant: obtain the baseline height
used elsewhere in the test harness (the measured “Tall” baseline variable or
function) and assert virtualSizerHeight(rendered.container) is greater than that
baseline; apply the same replacement for the similar assertion at lines
1301-1303 so both checks compare to the Tall baseline rather than a magic
number.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@apps/desktop/src/renderer/components/chat/AgentChatMessageList.test.tsx`:
- Around line 1290-1292: The test uses a hard-coded threshold in the waitFor
assertion (calling virtualSizerHeight(rendered.container)
toBeGreaterThan(6_200)); change it to compare against the measured "Tall"
baseline value instead of the fixed 6_200 constant: obtain the baseline height
used elsewhere in the test harness (the measured “Tall” baseline variable or
function) and assert virtualSizerHeight(rendered.container) is greater than that
baseline; apply the same replacement for the similar assertion at lines
1301-1303 so both checks compare to the Tall baseline rather than a magic
number.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 38ae58b3-f00f-4cc7-b6c5-03ffc603fef9

📥 Commits

Reviewing files that changed from the base of the PR and between 48f946b and 0d5e41e.

📒 Files selected for processing (10)
  • apps/ade-cli/src/cli.test.ts
  • apps/ade-cli/src/cli.ts
  • apps/desktop/src/main/services/adeActions/registry.test.ts
  • apps/desktop/src/main/services/adeActions/registry.ts
  • apps/desktop/src/main/services/chat/agentChatService.test.ts
  • apps/desktop/src/main/services/chat/agentChatService.ts
  • apps/desktop/src/main/services/localRuntime/localRuntimeConnectionPool.test.ts
  • apps/desktop/src/main/services/localRuntime/localRuntimeConnectionPool.ts
  • apps/desktop/src/renderer/components/chat/AgentChatMessageList.test.tsx
  • apps/desktop/src/renderer/components/chat/AgentChatMessageList.tsx
👮 Files not reviewed due to content moderation or server errors (1)
  • apps/desktop/src/main/services/chat/agentChatService.test.ts

Comment thread apps/ade-cli/src/cli.ts
Comment thread apps/desktop/src/renderer/components/chat/AgentChatMessageList.tsx Outdated
@arul28 arul28 force-pushed the ade/quail-codex-goal-loop-682e2913 branch from 9419c86 to 7bd249a Compare June 1, 2026 17:41
@arul28
Copy link
Copy Markdown
Owner Author

arul28 commented Jun 1, 2026

Addressed review follow-ups in the latest push:\n\n- Added a rolling rate limit for contained unhandled-rejection logs so a hot async loop does not flood stderr while keeping the runtime alive.\n- Scoped owned-runtime-child preservation to the explicit timeout reconnect path, and added coverage that external runtime connects do not inherit stale child ownership.\n- Made chat row-height stale-key cleanup synchronous and hardened the virtualizer regression test to compare against the measured tall baseline.\n\nAdditional validation:\n- npx vitest run src/main/services/localRuntime/localRuntimeConnectionPool.test.ts src/renderer/components/chat/AgentChatMessageList.test.tsx\n- npm --prefix apps/ade-cli run test -- src/cli.test.ts\n- npm --prefix apps/ade-cli run typecheck -- --pretty false\n- npm --prefix apps/desktop run typecheck -- --pretty false\n- npm --prefix apps/ade-cli run build\n- git diff --check

@arul28 arul28 merged commit 172d8f2 into main Jun 1, 2026
27 checks passed
@arul28 arul28 deleted the ade/quail-codex-goal-loop-682e2913 branch June 1, 2026 18:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant