Skip to content

fix(sandbox): length-frame exec-stdin writes so file I/O works over a…#3719

Open
imran31415 wants to merge 5 commits into
openai:mainfrom
imran31415:fix/docker-sandbox-tls-stdin-hang
Open

fix(sandbox): length-frame exec-stdin writes so file I/O works over a…#3719
imran31415 wants to merge 5 commits into
openai:mainfrom
imran31415:fix/docker-sandbox-tls-stdin-hang

Conversation

@imran31415

@imran31415 imran31415 commented Jul 1, 2026

Copy link
Copy Markdown

What

Fixes a hang in the Docker sandbox: session.write() / apply_manifest() blocks forever when DOCKER_HOST is reached over TLS (Docker-in-Docker sidecars, remote daemons).

Root cause

DockerSandboxSession._stream_into_exec streams the payload into a docker exec running tar -x / cat from stdin, then signals end-of-input with raw_sock.shutdown(socket.SHUT_WR) (inside except Exception: pass). Over a TLS transport that half-close never delivers a stdin-EOF to the container, so the in-container reader blocks forever, the exec never exits, and the client's drain loop hangs. Over a unix socket the half-close works, which is why this only bites TLS daemons.

Fix

Frame the payload by length and pipe the real command through head -c <n>, so the reader terminates on a byte count instead of a stdin half-close:

payload_length, read_stream, spool = _measure_stream(stream)
framed_cmd = ["sh", "-c", 'n=$1; shift; head -c "$n" | "$@"', "sh", str(payload_length), *cmd]
  • Transport-agnostic: works identically over unix sockets, TLS TCP, and DinD.
  • Keeps the deliberate avoidance of put_archive() (volume-driver-mount plugins reject the duplicate Mount, per the existing comments in read/write).
  • The existing shutdown(SHUT_WR) is left in place as a harmless best-effort no-op; termination no longer depends on it.
  • Scope: only the file-materialization path (_stream_into_exec). The interactive PTY path is untouched.

Length measurement (off the event loop, bounded memory)

_measure_stream runs entirely on the executor thread (never the event loop):

  • Seekable streams (the common case — in-memory tar, sized file payloads) are measured in place via seek/tell with zero copy.
  • Non-seekable streams (e.g. an HTTP response body or pipe) are copied into a SpooledTemporaryFile — kept in memory up to 16 MiB, then spilled to disk — so the length can be framed without buffering the whole payload in RAM. The spool is closed in a finally.

Tests

Adds regression tests to tests/sandbox/test_docker.py:

  • test_stream_into_exec_length_frames_stdin_payload — asserts the exec command is head -c <n>-framed with <n> == payload byte length, and that exactly the payload is streamed (payload includes NUL / non-UTF-8 bytes).
  • test_stream_into_exec_frames_non_seekable_stream — a non-seekable stream is spooled and framed with the correct byte count.

Validation:

  • tests/sandbox/817 passed, 20 skipped
  • ruff format --check and ruff check → clean

Notes

Discovered while running the Strix pentest tool (which uses this SDK's Docker sandbox) inside a Kubernetes dev workspace whose only Docker access is a TLS-only DinD sidecar — the scan hung during workspace materialization until this fix.

… TLS DOCKER_HOST

The Docker sandbox materializes files with `docker exec` + `tar -x`/`cat`
reading the payload from stdin, then signals end-of-input with
`raw_sock.shutdown(SHUT_WR)` (wrapped in `except Exception: pass`). Over a TLS
DOCKER_HOST that half-close never delivers a stdin EOF to the container, so the
in-container `tar`/`cat` blocks forever and `session.write()` /
`apply_manifest()` hang. This reproduces against Docker-in-Docker sidecars and
remote daemons reached over TLS (both common setups).

Fix without falling back to `put_archive()` (deliberately avoided for
volume-driver-mounted containers, see `write`): frame the payload by length and
pipe the real command through `head -c <n>`, so the reader stops after exactly
<n> bytes regardless of whether the stdin half-close is delivered. Works
identically over unix sockets, TLS TCP, and DinD.

Adds regression tests asserting the exec command is length-framed and the exact
payload is streamed (seekable and non-seekable streams).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b9dd8f0e13

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/sandbox/sandboxes/docker.py Outdated
…emory

Addresses review feedback: `_measure_stream` previously drained a non-seekable
stream into a BytesIO on the event-loop thread, which both blocked the loop and
was unbounded (a large HTTP-body/pipe upload could OOM the process).

Now all measuring/spooling runs inside the executor thread (`_write`), and a
non-seekable stream is copied into a `SpooledTemporaryFile` — kept in memory up
to 16 MiB, then spilled to disk — so the length can be framed without buffering
the whole payload in RAM. Seekable streams (the common case: in-memory tar,
sized file payloads) are still measured in place with zero copy. The spool is
closed in a finally.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7d491ceed

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/sandbox/sandboxes/docker.py Outdated
Addresses review feedback. `head -c "$n" | "$@"` reports only the consumer's
exit status, so on an image without `head` the `head: not found` error was
discarded while `cat`/`tar` saw an empty pipe, exited 0, and the write
"succeeded" after creating an empty file — silent data loss.

Capture the producer's status via a temp file and exit non-zero (98) when it
fails, so a missing/broken `head` surfaces as a WorkspaceArchiveWriteError
instead. Uses a portable POSIX pattern (no `pipefail`, which dash lacks).

Verified end-to-end against a real container: normal writes still exit 0 with
correct contents; with `head` removed the exec now exits 98 instead of silently
writing an empty file.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dfe259dc47

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/sandbox/sandboxes/docker.py Outdated
Addresses review feedback. The send loop read the payload to EOF while the exec
was framed with the length measured earlier by _measure_stream. If a seekable
input changed in between (e.g. an open file appended or truncated during
write()), that desynced: extra bytes piled up behind a `head -c` that had
already stopped, and a short read left `head` blocked on a TLS stdin that never
EOFs — re-introducing the hang this change removes.

Send at most `payload_length` bytes (the framed count) and raise
WorkspaceArchiveWriteError if the stream ends early, instead of truncating
silently or blocking.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a7f0b5a76e

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread src/agents/sandbox/sandboxes/docker.py Outdated
stream.seek(0, io.SEEK_END)
end = stream.tell()
stream.seek(start)
return end - start, stream, None

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Clamp negative measured lengths before framing

When a seekable stream's current position is past its end (BytesIO(b"abc").seek(10) or an open file left beyond current EOF), end - start is negative. That negative value is passed to head -c "$n"; head --help documents --bytes=[-]NUM as printing all but the last NUM bytes for a leading -, so over the TLS Docker transport the in-container head again waits for stdin EOF while the send loop sends nothing (remaining > 0 is false), recreating the hang this patch is fixing. Clamp this to 0 or fail before starting the exec.

Useful? React with 👍 / 👎.

Addresses review feedback. A seekable stream positioned past its end
(e.g. BytesIO(b"abc").seek(10), or a file left beyond EOF) makes
`end - start` negative. That negative count becomes `head -c -N`, which
means "all but the last N bytes" and reads to EOF — re-hanging over a TLS
stdin — while the send loop (`remaining > 0` false) sends nothing.

Clamp the measured length to 0: a position past end has no readable bytes,
so a 0-byte write (`head -c 0`, which exits immediately) is the correct and
safe result.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant