Inkbox tunnel#2

Draft
alex-w-99 wants to merge 7 commits into main from inkbox-tunnel

@alex-w-99
Contributor

No description provided.

alex-w-99 and others added 7 commits May 2, 2026 20:28
Replace the pyngrok bootstrap with a persistent HTTP/2 client against
the Inkbox tunnels data plane (`/_system/{connect,hello,intake,response,ws}`).
Adds `bootstrap_tunnel()` (control-plane CRUD via raw httpx wrapper —
no SDK support yet for `/tunnels/*`), the `InkboxTunnelClient` runtime
(parked intake pool, ASGI dispatch, response posting, RFC-8441
extended-CONNECT WS bridge, jittered exponential reconnect), and the
`TLSTerminator` for passthrough mode (in-memory ssl.MemoryBIO with
LE-signed cert via `POST /tunnels/{id}/sign-csr`). Drops `pyngrok`,
adds `h2`, `httpx[http2]`, `cryptography`.
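The jittered exponential reconnect mentioned above could look roughly like the sketch below. The base delay, cap, attempt count, and the "full jitter" strategy are assumptions for illustration; the commit does not state which values or jitter variant the client uses, and `backoff_delays` is a hypothetical name.

```python
import random


def backoff_delays(base=0.5, cap=30.0, attempts=6, rng=None):
    """Yield one delay per reconnect attempt.

    Full jitter: each delay is drawn uniformly from
    [0, min(cap, base * 2**attempt)], so clients that drop at the same
    moment do not all reconnect at the same moment.
    """
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        yield rng.uniform(0.0, ceiling)


# Seeded only so the sketch is reproducible; real code would not seed.
delays = list(backoff_delays(rng=random.Random(0)))
```

A reconnect loop would sleep for each delay in turn and reset the sequence after a successful `/_system/hello`.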

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- _intake_loop retries on transient errors instead of shrinking the
  parked-stream pool one slot at a time
- flow control: _mark_window_blocked only clears the conn-window event
  when the conn window is actually zero; _await_window cancels the
  loser wait-tasks instead of leaking them
- _post_response strips inbound content-length / transfer-encoding
  before forwarding under inkbox-h-* (avoids sending duplicate headers
  to the third party)
- ASGI receive() blocks on a disconnect_event after the body, set by
  the dispatcher in finally — fixes routes that poll for disconnect
- bootstrap persists state.json immediately after POST /tunnels/
  succeeds; connect secret printed once to stderr (not the logger)
- passthrough keypair switched to EC P-256
- 401/403 from /_system/hello propagates out of serve_forever via
  _TunnelAuthError instead of hot-looping bad credentials
- TLSTerminator tempfiles use mkstemp + explicit fchmod 0o600
- _pump_ws shuts the sender down via a queue sentinel, only falls back
  to cancel() after a 2s grace
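As an illustration of the `_post_response` header change in the list above, here is a minimal sketch of stripping the framing headers before re-emitting the rest under the `inkbox-h-` prefix. The function name, the `(name, value)` pair representation, and the lowercasing of names are assumptions; only the two stripped headers and the prefix come from the commit message.

```python
# Headers that describe the framing of the app's response body. The
# commit drops them so the third party does not see duplicates.
STRIPPED = {"content-length", "transfer-encoding"}
PREFIX = "inkbox-h-"


def forwardable_headers(headers):
    """Return the app's response headers renamed under PREFIX.

    `headers` is an iterable of (name, value) string pairs. Names are
    compared case-insensitively and emitted lowercase.
    """
    out = []
    for name, value in headers:
        lname = name.lower()
        if lname in STRIPPED:
            continue
        out.append((PREFIX + lname, value))
    return out


example = forwardable_headers([
    ("Content-Type", "application/json"),
    ("Content-Length", "17"),
    ("Transfer-Encoding", "chunked"),
])
```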

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The bridge between the customer agent and the tunnel server is a real
RFC-6455 WebSocket negotiated over RFC-8441 extended CONNECT
(`:protocol=inkbox-tunnel-ws`, `sec-websocket-version: 13`). Hypercorn
was sending us standard WS BINARY frames inside the h2 DATA frames of
the CONNECT stream, but the customer-side pump was treating those bytes
as raw envelope payload. The first 4 bytes of every frame (the WS
header) became a multi-GB "envelope length", so we never decoded a
single envelope across hundreds of inbound frames per call. The
outbound side had the symmetric problem: we sent bare envelopes that
were not WS-framed and not masked, which the server would have rejected
the moment we tried to use it.

Add a minimal WS frame codec and plumb it into both sides of `_pump_ws`:
inbound, drain WS frames first and feed their payloads into the
length-prefixed envelope decoder; outbound, wrap each envelope as a
masked WS BINARY frame (mask is mandatory for client→server). Handle
PING/PONG and echo CLOSE on shutdown.
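To make the framing bug concrete, here is a minimal RFC 6455 frame codec sketch. It is not the codec added in this commit: the function names are hypothetical, fragmentation and control frames are ignored, and the 4-byte big-endian envelope length prefix is an assumption inferred from the description above.

```python
import os
import struct
from typing import Optional, Tuple

OP_BINARY = 0x2


def encode_masked_binary(payload: bytes, mask_key: Optional[bytes] = None) -> bytes:
    """Wrap payload as a single masked BINARY frame (client to server)."""
    key = mask_key if mask_key is not None else os.urandom(4)
    head = bytearray([0x80 | OP_BINARY])  # FIN=1, opcode=binary
    n = len(payload)
    if n < 126:
        head.append(0x80 | n)             # MASK bit + 7-bit length
    elif n < 65536:
        head.append(0x80 | 126)
        head += struct.pack("!H", n)      # 16-bit extended length
    else:
        head.append(0x80 | 127)
        head += struct.pack("!Q", n)      # 64-bit extended length
    masked = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return bytes(head) + key + masked


def decode_frame(buf: bytes) -> Optional[Tuple[int, bytes, int]]:
    """Return (opcode, payload, bytes_consumed), or None if buf is partial."""
    if len(buf) < 2:
        return None
    opcode = buf[0] & 0x0F
    masked = bool(buf[1] & 0x80)
    n = buf[1] & 0x7F
    pos = 2
    if n == 126:
        if len(buf) < pos + 2:
            return None
        (n,) = struct.unpack_from("!H", buf, pos)
        pos += 2
    elif n == 127:
        if len(buf) < pos + 8:
            return None
        (n,) = struct.unpack_from("!Q", buf, pos)
        pos += 8
    key = b""
    if masked:
        if len(buf) < pos + 4:
            return None
        key = buf[pos:pos + 4]
        pos += 4
    if len(buf) < pos + n:
        return None
    data = buf[pos:pos + n]
    if masked:
        data = bytes(b ^ key[i % 4] for i, b in enumerate(data))
    return opcode, data, pos + n


# An unmasked server frame carrying 200 bytes, as a server would send it.
server_frame = bytes([0x80 | OP_BINARY, 126]) + struct.pack("!H", 200) + b"x" * 200
# The old pump read the first 4 bytes as an envelope length (assumed big-endian).
misread_length = struct.unpack("!I", server_frame[:4])[0]
```

Here `misread_length` comes out above 2 GB, which matches the multi-GB "envelope length" described above: the decoder would wait for that many bytes and never emit an envelope.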

Also add an optional OpenAI Realtime bridge (`USE_OPENAI_REALTIME=true`)
that opts the call WS handshake out of Inkbox-managed STT/TTS and pumps
g711_ulaw audio between Inkbox and OpenAI. Useful for isolating tunnel
transport from STT/TTS plumbing during end-to-end testing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds the data-plane runtime for INKBOX_TUNNEL_TLS_MODE=passthrough so
the agent can terminate TLS itself. Third-party TCP rides a dedicated
extended-CONNECT bridge stream (`/_system/tcp/{tcp_id}`) carrying raw
bytes inside WS BINARY frames; TLSSession decrypts; plaintext is fed
into a loopback hypercorn ASGI server that mirrors the public app's
routes; the response is encrypted and sent back the same way.

- _dispatch_tcp_stream orchestrator with CONNECT-first ordering,
  :status=200 wait, loopback dial, inbound/outbound pumps with deferred
  h2 ack, asymmetric half-close grace, drain-and-ack cleanup,
  cleanup-send timeouts to avoid flow-control park.
- _StreamEvent.flow_controlled_length + per-bridge eager-ack
  suppression so back-pressure rides through to third-party TCP RWND.
- TLSTerminator advertises ALPN `h2` and `http/1.1`; cert/keypair
  lifecycle hardened (state-dir chmod 0o700, re-sign on key/cert
  pubkey mismatch).
- Loopback FastAPI wrapper with /__loopback_health; hypercorn started
  in lifespan against a pre-bound listening socket via fd:// bind, and
  tunnel-client startup is gated on a real HTTP health probe.
- Removes the dead one-shot passthrough branch from _dispatch_http
  along with its orphaned HTTP/1.1 parse/build helpers.
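The pre-bound listening socket from the list above can be sketched with the standard library alone. `prebind_loopback` is a hypothetical helper, and the backlog of 128 is an arbitrary choice. The resulting `fd://N` string is the bind form Hypercorn documents for serving on an inherited file descriptor; how this PR wires it into the Hypercorn config is not shown in the commit message.

```python
import socket


def prebind_loopback():
    """Bind and listen on an ephemeral loopback port before the server starts.

    Binding first means the port is known, and connections queue in the
    kernel backlog, before the ASGI server has finished starting up.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    sock.listen(128)
    _host, port = sock.getsockname()
    return sock, port, f"fd://{sock.fileno()}"


# Keep `sock` referenced for the server's lifetime; if it is garbage
# collected the descriptor is closed and the fd:// bind becomes invalid.
sock, port, bind = prebind_loopback()
```

The health probe described in the commit would then poll `http://127.0.0.1:{port}/__loopback_health` until it answers before starting the tunnel client.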

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The tunnels API server runs the full ACME flow synchronously inside
/sign-csr (Route53 TXT write + INSYNC waiter + LE order polling).
30s isn't enough — LE order polling alone can eat that much. Bump the
client-side timeout for that call only to 180s; other endpoints keep
the 30s default.
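One way to express a per-endpoint timeout like this is a small lookup that the HTTP wrapper consults for each call. This is a sketch only: the helper name and the suffix match are assumptions. With httpx, the returned value could be passed as the per-request `timeout=` argument.

```python
DEFAULT_TIMEOUT_S = 30.0
SIGN_CSR_TIMEOUT_S = 180.0  # ACME runs synchronously inside /sign-csr


def timeout_for(path: str) -> float:
    """Pick the client-side timeout in seconds for a control-plane path."""
    if path.rstrip("/").endswith("/sign-csr"):
        return SIGN_CSR_TIMEOUT_S
    return DEFAULT_TIMEOUT_S
```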

Also stream the loopback hypercorn accesslog to stdout. With the
dispatch path going only through hypercorn in passthrough mode, that's
the canonical signal for per-request method/path/status — silencing
it leaves debugging blind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…orming

When the tunnel server returns 401 ("Unknown or stale owner_token") on
/_system/intake, retrying with the same owner_token will keep failing —
the server has forgotten our session (worker recycle, sibling worker
without our state, owner-token reaper, etc.). The previous behavior
treated 401 as a transient "park failed, return None" and looped
immediately; under load this produced ~600 re-park attempts in 3
seconds before something else (GOAWAY, etc.) finally tore the
connection down.

Now: _park_one_intake raises _OwnerTokenInvalidError on 401, and
_intake_loop catches it, calls _force_reconnect() to close the h2
transport, and exits the slot. Closing the writer lets _read_loop see
EOF and return cleanly, _run_once finalizes, serve_forever's outer
reconnect loop picks up — a fresh /_system/hello mints a new
owner_token and re-parks all 32 slots. Idempotent if multiple slots
race on the same 401 (first to call _force_reconnect wins; subsequent
writer.close()s are no-ops).
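The control flow above can be simulated with plain asyncio. This is a toy model, not the client's code: `FakeSession` and its methods are stand-ins, and every park is made to fail with a 401 so the exit path is exercised.

```python
import asyncio


class OwnerTokenInvalid(Exception):
    """Stand-in for the _OwnerTokenInvalidError named in the commit."""


class FakeSession:
    def __init__(self):
        self.closed = False
        self.close_calls = 0
        self.parks = 0

    def force_reconnect(self):
        # Idempotent: the first caller closes the transport, later
        # callers find it already closed and change nothing.
        self.close_calls += 1
        self.closed = True

    async def park_one_intake(self):
        self.parks += 1
        await asyncio.sleep(0)  # yield, as a real network call would
        raise OwnerTokenInvalid("401 Unknown or stale owner_token")


async def intake_slot(session):
    while not session.closed:
        try:
            await session.park_one_intake()
        except OwnerTokenInvalid:
            session.force_reconnect()
            return  # exit the slot instead of re-parking with a dead token


async def main(slots=32):
    session = FakeSession()
    await asyncio.gather(*(intake_slot(session) for _ in range(slots)))
    return session


session = asyncio.run(main())
```

Each slot parks at most once before the session is torn down, in contrast to the roughly 600 re-park attempts in 3 seconds that the old retry loop produced.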

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>