Summary
When an agent responds to a <task-notification> (background task completion), the response text and any subsequent tool calls are never rendered in the chat UI. The agent does process the notification — it reads output, makes tool calls, generates text — but all of that is invisibly dropped before reaching the frontend.
Reproduction
- An Ambient session runs a Bash command with
run_in_background: true
- The script eventually exits (success or failure)
- A
<task-notification> is delivered to the agent with the task ID, output file, status, and summary
- The agent processes the notification and generates a text response (e.g. "Poll exited with code 3, let me check the comments...")
- Bug: That response is NOT rendered in the Ambient chat UI. The user sees nothing.
Root Cause (Two-Layer Failure)
Layer 1 — Runner: stream_between_run_events is implemented but never reachable
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py:374 implements stream_between_run_events(). This method has the correct logic:
- Consumes
TaskNotificationMessage from _between_run_queue → emits a task:completed CUSTOM event
- Picks up the agent's subsequent response messages from
_between_run_queue as the next non-task message
- Opens a synthetic
RunStartedEvent envelope around the response
- Pipes it through
_stream_claude_sdk → produces TEXT_MESSAGE_* events + MESSAGES_SNAPSHOT
- Closes with
RunFinishedEvent
This method is never called from anywhere. No HTTP endpoint, no gRPC listener task, no test invokes it. The between-run queue fills up with the agent's response and nobody drains it.
Additionally, app.py includes the events router twice (lines 280 and 325), with the second inclusion even commented # Between-run event stream (always registered) — a leftover from a previous incomplete attempt to wire this up.
Layer 2 — Backend: the between-run listener connects to a non-existent URL
components/backend/websocket/agui_proxy.go:1175 — listenBetweenRunEvents — is the backend goroutine designed to capture between-run events. It constructs:
eventsURL := strings.TrimSuffix(runnerURL, "/") + "/events"
This produces http://runner:8000/events — no thread_id path parameter. The runner only registers GET /events/{thread_id} (a required FastAPI path parameter), so the runner returns 404 for every attempt. After 30 retries with exponential backoff the goroutine exits. Failures are logged but never surfaced to users.
Even if the URL were corrected to include a thread_id, GET /events/{thread_id} reads from _active_streams[thread_id], which is populated exclusively during bridge.run() (user-initiated turns). Between-run queue messages are never placed there.
Also: gRPC transport has the same gap
When AMBIENT_GRPC_ENABLED=true, GRPCSessionListener._listen_loop only processes event_type == "user" messages. There is no background task draining worker._between_run_queue and feeding GRPCMessageWriter. Between-run events are equally lost in the gRPC transport path.
Message Flow (Broken)
Background task exits
→ SDK delivers TaskNotificationMessage via receive_messages()
→ _read_messages_forever: active_output_queue is None → goes to _between_run_queue
→ Claude responds to the notification
→ Response messages (StreamEvents, AssistantMessage, ResultMessage) → _between_run_queue
_between_run_queue now holds:
[TaskNotificationMessage, StreamEvent..., AssistantMessage, ResultMessage]
stream_between_run_events() has correct logic to drain this — but is never called.
Backend listenBetweenRunEvents goroutine:
→ Calls GET http://runner:8000/events (no thread_id → 404)
→ Retries 30× with backoff, logs failures
→ Gives up and exits
Result: agent response never leaves the runner. Frontend sees nothing. No DB record written.
Proposed Fix
Runner — expose stream_between_run_events via HTTP
Add a GET /events endpoint (no thread_id parameter) to the runner that serves stream_between_run_events:
# In ambient_runner/endpoints/events.py
@router.get("/events")
async def get_between_run_events(request: Request):
bridge = request.app.state.bridge
context = bridge.context
if not context:
raise HTTPException(503, "Context not initialized")
thread_id = context.session_id
encoder = EventEncoder(accept="text/event-stream")
async def event_stream():
async for event in bridge.stream_between_run_events(thread_id):
yield encoder.encode(event)
return StreamingResponse(
event_stream(),
media_type="text/event-stream",
headers={"Cache-Control": "no-cache", "X-Accel-Buffering": "no"},
)
stream_between_run_events runs a while True loop until the worker shuts down, so one persistent SSE connection covers the pod's lifetime. The backend's existing persistStreamedEvent + publishLine will correctly handle the incoming events (CUSTOM task events, TEXT_MESSAGE events, MESSAGES_SNAPSHOT, RUN_FINISHED).
Also remove the duplicate router registration from app.py:325.
gRPC path — wire between-run queue to GRPCMessageWriter
In GRPCSessionListener, after a worker is ready, spawn a background task that:
- Calls
bridge.stream_between_run_events(session_id)
- For each
RUN_STARTED: creates a new GRPCMessageWriter instance
- For each
MESSAGES_SNAPSHOT: feeds it to the writer (accumulates messages)
- For each
RUN_FINISHED: calls writer._write_message(status="completed") to persist to DB
This ensures between-run agent responses are also persisted in gRPC-mode deployments.
Secondary Issues Found During Investigation
ensureBetweenRunListener only starts on first user message — no restart if the listener gives up
cleanupStaleSessions deletes the tracking key without stopping the goroutine, allowing potential duplicate listeners
_stream_claude_sdk is called with frontend_tool_names=set() in between-run context, so HITL tools in between-run responses would not be detected as halting
Affected Files
| File |
Issue |
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py:374 |
stream_between_run_events correct but never called |
components/runners/ambient-runner/ambient_runner/endpoints/events.py |
Missing GET /events route for between-run stream |
components/runners/ambient-runner/ambient_runner/app.py:325 |
Duplicate events router registration |
components/backend/websocket/agui_proxy.go:1179 |
/events URL missing required thread_id; even correct URL would read wrong queue |
components/runners/ambient-runner/ambient_runner/bridges/claude/grpc_transport.py |
No between-run queue integration in gRPC listener |
Severity
High. Any session using run_in_background: true (including all background tool calls from subagents) produces agent responses that are invisibly dropped. No error is shown to the user. The agent does work but appears silent.
Summary
When an agent responds to a
<task-notification>(background task completion), the response text and any subsequent tool calls are never rendered in the chat UI. The agent does process the notification — it reads output, makes tool calls, generates text — but all of that is invisibly dropped before reaching the frontend.Reproduction
run_in_background: true<task-notification>is delivered to the agent with the task ID, output file, status, and summaryRoot Cause (Two-Layer Failure)
Layer 1 — Runner:
stream_between_run_eventsis implemented but never reachablecomponents/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py:374implementsstream_between_run_events(). This method has the correct logic:TaskNotificationMessagefrom_between_run_queue→ emits atask:completedCUSTOM event_between_run_queueas the next non-task messageRunStartedEventenvelope around the response_stream_claude_sdk→ producesTEXT_MESSAGE_*events +MESSAGES_SNAPSHOTRunFinishedEventThis method is never called from anywhere. No HTTP endpoint, no gRPC listener task, no test invokes it. The between-run queue fills up with the agent's response and nobody drains it.
Additionally,
app.pyincludes the events router twice (lines 280 and 325), with the second inclusion even commented# Between-run event stream (always registered)— a leftover from a previous incomplete attempt to wire this up.Layer 2 — Backend: the between-run listener connects to a non-existent URL
components/backend/websocket/agui_proxy.go:1175—listenBetweenRunEvents— is the backend goroutine designed to capture between-run events. It constructs:This produces
http://runner:8000/events— nothread_idpath parameter. The runner only registersGET /events/{thread_id}(a required FastAPI path parameter), so the runner returns 404 for every attempt. After 30 retries with exponential backoff the goroutine exits. Failures are logged but never surfaced to users.Even if the URL were corrected to include a thread_id,
GET /events/{thread_id}reads from_active_streams[thread_id], which is populated exclusively duringbridge.run()(user-initiated turns). Between-run queue messages are never placed there.Also: gRPC transport has the same gap
When
AMBIENT_GRPC_ENABLED=true,GRPCSessionListener._listen_looponly processesevent_type == "user"messages. There is no background task drainingworker._between_run_queueand feedingGRPCMessageWriter. Between-run events are equally lost in the gRPC transport path.Message Flow (Broken)
Proposed Fix
Runner — expose
stream_between_run_eventsvia HTTPAdd a
GET /eventsendpoint (no thread_id parameter) to the runner that servesstream_between_run_events:stream_between_run_eventsruns awhile Trueloop until the worker shuts down, so one persistent SSE connection covers the pod's lifetime. The backend's existingpersistStreamedEvent+publishLinewill correctly handle the incoming events (CUSTOM task events, TEXT_MESSAGE events, MESSAGES_SNAPSHOT, RUN_FINISHED).Also remove the duplicate router registration from
app.py:325.gRPC path — wire between-run queue to GRPCMessageWriter
In
GRPCSessionListener, after a worker is ready, spawn a background task that:bridge.stream_between_run_events(session_id)RUN_STARTED: creates a newGRPCMessageWriterinstanceMESSAGES_SNAPSHOT: feeds it to the writer (accumulates messages)RUN_FINISHED: callswriter._write_message(status="completed")to persist to DBThis ensures between-run agent responses are also persisted in gRPC-mode deployments.
Secondary Issues Found During Investigation
ensureBetweenRunListeneronly starts on first user message — no restart if the listener gives upcleanupStaleSessionsdeletes the tracking key without stopping the goroutine, allowing potential duplicate listeners_stream_claude_sdkis called withfrontend_tool_names=set()in between-run context, so HITL tools in between-run responses would not be detected as haltingAffected Files
components/runners/ambient-runner/ambient_runner/bridges/claude/bridge.py:374stream_between_run_eventscorrect but never calledcomponents/runners/ambient-runner/ambient_runner/endpoints/events.pyGET /eventsroute for between-run streamcomponents/runners/ambient-runner/ambient_runner/app.py:325components/backend/websocket/agui_proxy.go:1179/eventsURL missing required thread_id; even correct URL would read wrong queuecomponents/runners/ambient-runner/ambient_runner/bridges/claude/grpc_transport.pySeverity
High. Any session using
run_in_background: true(including all background tool calls from subagents) produces agent responses that are invisibly dropped. No error is shown to the user. The agent does work but appears silent.