Skip to content

Unify worker turn input over IPC#130

Merged
t-kalinowski merged 38 commits into
mainfrom
opaque-worker-ipc-protocol
Jun 20, 2026
Merged

Unify worker turn input over IPC#130
t-kalinowski merged 38 commits into
mainfrom
opaque-worker-ipc-protocol

Conversation

@t-kalinowski

Copy link
Copy Markdown
Member

Summary

This standardizes worker communication around an opaque IPC turn protocol so the server can treat R, Python, and protocol workers the same way. Managed input now flows through turn_start and turn_input; workers own how that input reaches their runtime and report completion with idle or stdin_wait.

The change removes the old dual-path stdin/PTY feed accounting that made completion and interrupt behavior harder to reason about. Reset and teardown now use an explicit shutdown lifecycle message before falling back to stdin close and bounded process termination.

User-facing changes

  • Stdin-style waits use a consistent continuation path across R and Python.
  • Python help and stdin prompts are surfaced more consistently in transcript output.
  • Reset and worker teardown are more reliable because shutdown no longer depends on stdin EOF.

Internal changes

  • Replaced private stdin/PTY feed messages with shared turn_start, turn_input, interrupt, and shutdown IPC messages.
  • Removed the built-in adapter request-state layer.
  • Moved R and Python managed input behind worker-owned queues.
  • Preserved active turn state across stdin_wait so follow-up input routes as turn_input.
  • Updated protocol, architecture, debugging, testing, ADR, plan, and future-work docs.

Route managed R, Python, and protocol-worker input through shared IPC
turn messages. Move runtime stdin handling behind worker-owned queues,
add explicit shutdown control, and remove the old stdin/PTY feed
accounting path.

Update protocol docs, supersede the old stdin-close shutdown ADR, and
refresh Python help snapshots for the new stdin prompt behavior.
Increase the worker readiness wait for slower Windows startup and make timing-sensitive interrupt and Python backend tests wait for their required markers instead of assuming fast CI scheduling.
Avoid macOS CI pager timeouts caused by several R-backed pager integration tests producing large output concurrently under the capped Rust test runner.
Finding 1:
[P2] Preserve case-insensitive Windows env overrides — /Users/tomasz/.codex/worktrees/66bb/mcp-repl/src/worker_supervisor.rs:2001-2004. On Windows, environment names are case-insensitive, but this merge uses case-sensitive HashMap insert/remove. For non-sandboxed ConPTY workers, an override such as PATH will not replace an inherited Path, so make_env_block can emit both entries and the worker may keep the old value. Use a case-insensitive upsert/removal here like the existing Windows helpers.

Finding 2:
[P2] Keep direct ConPTY workers in a kill-on-close job — /Users/tomasz/.codex/worktrees/66bb/mcp-repl/src/worker_supervisor.rs:2019-2021. For non-sandboxed Windows ConPTY workers, this bypasses the wrapper path that assigned the child to a kill-on-close job object. WindowsProcess::kill() only terminates the root process, so user code that starts a child process can survive repl_reset or shutdown and keep running after the session is gone. Assign the direct process to the same job containment or retain the wrapper behavior.
[P1] Move worker_ready before R startup output — /Users/tomasz/.codex/worktrees/66bb/mcp-repl/src/r_session.rs:154-154. When a user or site R profile writes or prompts during setup_Rmainloop, the R console callbacks are already installed in setup_r, but worker_ready is emitted only after initialize_r() finishes. The server rejects any first sideband message that is not worker_ready, so startup files with cat(...), warnings, or readline() can make the R worker fail to spawn with a protocol error. Emit readiness before startup can produce sideband output, or buffer those events until after worker_ready.
[P1] Abort pending R readline input on interrupt — /Users/tomasz/.codex/worktrees/66bb/mcp-repl/src/worker.rs:65-67. When R is blocked in `readline()` after emitting `input_wait`, a Ctrl-C sideband message only clears queued text via `clear_pending_input()`. The main R thread remains parked in `r_read_console`, so the next non-empty tool input is consumed as the readline answer instead of running after the interrupt. The interrupt path needs to mark or wake the pending read as interrupted, not only clear the queue.
[P2] Defer input_wait until partial raw reads return — /Users/tomasz/.codex/worktrees/66bb/mcp-repl/src/python_session/unix_stdin.rs:255-257. On Unix, when Python code uses byte-oriented stdin APIs such as `os.read(0, 10)` or `sys.stdin.buffer.read(10)` and the follow-up batch supplies fewer bytes than requested, the branch emits `input_wait` before returning the already-read partial bytes to Python. This lets the server complete the follow-up while code after the read is still running, detaching later output to a poll. Return the partial bytes without emitting `input_wait` once any bytes have been read.
Finding addressed:
[P2] Stop treating input_line events as visible echoes — /Users/tomasz/.codex/worktrees/66bb/mcp-repl/src/output_timeline.rs:46-50. With v6, input_line is structural and no longer means the worker emitted prompt + input on stdout, but only the image anchoring path is guarded to raw echoes here; the echo-collapse path still matches IPC output_text against those IPC input events. For a worker/R/Python request where real stdout begins with the submitted line, e.g. input hello followed by output_text p> hello\nVISIBLE\n, the p> hello line is elided as a fake echo, so user output is lost.
@t-kalinowski t-kalinowski marked this pull request as ready for review June 20, 2026 14:08
@t-kalinowski t-kalinowski merged commit 106ca52 into main Jun 20, 2026
5 checks passed
@t-kalinowski t-kalinowski deleted the opaque-worker-ipc-protocol branch June 20, 2026 14:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant