
fix(sandbox): resolve symlinked binary paths in network policy matching#774

Open
johntmyers wants to merge 4 commits into main from fix/770-symlink-binary-resolution

Conversation

@johntmyers
Collaborator

@johntmyers johntmyers commented Apr 6, 2026

Summary

Policy binary paths specified as symlinks (e.g., /usr/bin/python3) were silently denied because the kernel reports the canonical path via /proc/<pid>/exe (e.g., /usr/bin/python3.11). This fix resolves symlinks through the container filesystem after the entrypoint starts, expanding the OPA policy data so both the original and resolved paths match.

Related Issue

Closes #770

Changes

  • opa.rs: Added resolve_binary_in_container() helper that resolves symlinks via /proc/<pid>/root/ on Linux using iterative read_link (not canonicalize, which resolves the procfs mount itself). Added from_proto_with_pid() and reload_from_proto_with_pid() methods that expand binary paths during OPA data construction. Existing from_proto() / reload_from_proto() delegate with pid=0 (backward-compatible, no expansion). Added normalize_path() for relative symlink targets with .. components.
  • lib.rs: load_policy() now retains the proto for post-start OPA rebuild. After entrypoint_pid.store(), triggers a one-shot OPA rebuild with the real PID. run_policy_poll_loop() passes the PID on each hot-reload so symlinks are re-resolved.
  • sandbox-policy.rego: Deny reason for binary mismatches now leads with SYMLINK HINT and includes actionable fix guidance (readlink -f command, what to check in logs).
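The data expansion described for opa.rs can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `expand_binaries` and the `resolve` closure are hypothetical names, the closure stands in for a helper such as `resolve_binary_in_container()`, and treating `*` as the glob marker is an assumption.

```rust
/// Hypothetical sketch: enrich a policy's binary list with resolved symlink targets.
/// `resolve` stands in for a helper like `resolve_binary_in_container`; it returns
/// `None` whenever resolution is skipped or fails.
fn expand_binaries<F>(binaries: &[String], pid: u32, resolve: F) -> Vec<String>
where
    F: Fn(&str, u32) -> Option<String>,
{
    let mut expanded = Vec::with_capacity(binaries.len());
    for path in binaries {
        // The original, user-specified entry is always kept.
        expanded.push(path.clone());
        // pid == 0 means "no entrypoint yet", so nothing is expanded.
        // Glob patterns are not file paths (assumption: `*` marks a glob).
        if pid == 0 || path.contains('*') {
            continue;
        }
        if let Some(resolved) = resolve(path, pid) {
            // Add the resolved path only if it differs and is not already listed.
            if &resolved != path && !expanded.contains(&resolved) {
                expanded.push(resolved);
            }
        }
    }
    expanded
}
```

With a resolver that maps /usr/bin/python3 to /usr/bin/python3.11, the expanded list holds both paths, so a strict equality check in Rego can match either one without any rule changes.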

Design decisions

  • Expand policy data, not evaluation logic — the Rego rules and per-request evaluation path are untouched. Only the OPA data (binary list) is enriched at load time. This avoids introducing new code in the security-critical hot path.
  • Graceful degradation — if symlink resolution fails for any reason, the original path is preserved and behavior is identical to before this change. Resolution is best-effort.
  • No Rego changes needed — the existing b.path == exec.path strict equality naturally matches the expanded entry.
  • read_link over canonicalize: std::fs::canonicalize resolves /proc/<pid>/root itself (a kernel pseudo-symlink to /), stripping the prefix needed for path extraction. We use an iterative read_link loop, which reads only the specified symlink's target, staying within the container namespace.
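To make the read_link approach concrete, here is a minimal sketch of an iterative resolver. It assumes a Unix host; `resolve_under_root`, `normalize`, and `MAX_HOPS` are illustrative names rather than the PR's actual identifiers. The 40-hop cap mirrors the chain limit mentioned in this description. Only symlinks in the final path component are followed explicitly; intermediate directory symlinks are left to the kernel.

```rust
use std::fs;
use std::path::{Component, Path, PathBuf};

/// Upper bound on symlink hops, so cyclic chains terminate.
const MAX_HOPS: usize = 40;

/// Lexically collapse `.` and `..` in an absolute path without touching the filesystem.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::from("/");
    for comp in path.components() {
        match comp {
            // `pop` on "/" is a no-op, so `..` cannot climb above the root.
            Component::ParentDir => {
                out.pop();
            }
            Component::Normal(part) => out.push(part),
            // RootDir, CurDir and Prefix add nothing to an absolute path.
            _ => {}
        }
    }
    out
}

/// Follow the symlink chain for `binary` (a container-absolute path) beneath `root`
/// (for example `/proc/<pid>/root`). Returns the final container-absolute path if at
/// least one link was followed, and `None` otherwise (not a symlink, or any error).
fn resolve_under_root(root: &Path, binary: &Path) -> Option<PathBuf> {
    let mut current = normalize(binary);
    let mut followed = false;
    for _ in 0..MAX_HOPS {
        // Re-anchor every lookup beneath `root` instead of canonicalizing it.
        let host_path = root.join(current.strip_prefix("/").ok()?);
        let meta = fs::symlink_metadata(&host_path).ok()?;
        if !meta.file_type().is_symlink() {
            return followed.then_some(current);
        }
        // read_link returns only this link's target; it does not resolve `root`.
        let target = fs::read_link(&host_path).ok()?;
        current = if target.is_absolute() {
            normalize(&target)
        } else {
            normalize(&current.parent()?.join(&target))
        };
        followed = true;
    }
    None // chain too long or cyclic
}
```

Because every hop is re-joined onto `root`, an absolute target such as /usr/bin/python3.11 is interpreted inside the container's filesystem rather than the host's.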

Best-effort approach and known risks

Symlink resolution is opportunistic — it improves the common case but cannot be guaranteed in all environments. When resolution fails, we are loud about it: per-binary WARN-level logs explain exactly what failed and what the operator should do. Deny reasons include prominent SYMLINK HINT text with actionable fix commands. Both flow through the gRPC LogPushLayer and are visible via openshell logs.

Environments where resolution will not work:

| Environment | Reason | User impact |
| --- | --- | --- |
| Restricted ptrace scope (kernel.yama.ptrace_scope >= 2) | /proc/<pid>/root/ returns EACCES even for own PID | Symlinks must be specified as canonical paths in policy |
| Rootless containers (rootless Docker, Podman) | User namespace isolation prevents procfs root traversal | Same: canonical paths required |
| Kubernetes pods without elevated security context | Default seccomp/AppArmor profiles may block procfs root access | Same: canonical paths required |
| Standalone/local mode (--policy-rules/--policy-data, no --sandbox-id) | No retained proto to rebuild, no gRPC log push | Resolution doesn't run; deny reasons appear on stdout only |
| Multi-level symlinks through /etc/alternatives | Should work (iterative loop handles chains up to 40 levels), but unusual layouts may produce unexpected resolved paths | Verify with readlink -f inside sandbox |
| Dynamically created symlinks after container start | Resolution runs at startup and on policy reload, not continuously | New symlinks won't be resolved until next policy reload |

In all failure cases: the original user-specified path is preserved, the deny behavior is identical to pre-fix, and the operator gets a clear warning log explaining why resolution didn't work and what to do about it.

Testing

  • mise run pre-commit passes
  • 19 new unit tests covering:
    • normalize_path helper for .. and . resolution
    • resolve_binary_in_container edge cases (glob skip, pid=0, nonexistent paths)
    • Expanded binary matching (resolved path allowed, original preserved, unrelated binaries denied)
    • Ancestor matching with expanded paths
    • Proto round-trips with _with_pid variants
    • Hot-reload behavior (engine replacement, symlink expansion on reload, LKG preservation)
    • Deny reason includes SYMLINK HINT and readlink -f command
    • Linux-specific e2e tests with real symlinks (single-level, multi-level, non-symlink, full proto-to-decision, hot-reload before/after) — gracefully skip in restricted environments
  • All 452 existing + new tests pass (449 sandbox + 5 integration)
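For the tests that skip in restricted environments, a guard along these lines is one possibility. This is a hedged sketch: it borrows the name `procfs_root_accessible` from the commit messages, but the body is an assumption about how such a probe could work, not the code in this PR.

```rust
use std::fs;

/// Hypothetical test guard: true only if a real lookup beneath `/proc/<pid>/root/`
/// succeeds, rather than merely checking that the path exists.
fn procfs_root_accessible(pid: u32) -> bool {
    let root = format!("/proc/{pid}/root/");
    // Listing the directory exercises the same traversal that resolution needs,
    // so EACCES from a restricted ptrace scope shows up here as `false`.
    match fs::read_dir(&root) {
        Ok(mut entries) => entries.next().is_some(),
        Err(_) => false,
    }
}
```

A Linux-only test could begin by calling `procfs_root_accessible(std::process::id())` and return early with a skip message when it yields `false`.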

Checklist

  • Follows Conventional Commits
  • Commits are signed off (DCO)
  • Architecture docs updated (if applicable)

@johntmyers johntmyers requested a review from a team as a code owner April 6, 2026 22:44
@johntmyers johntmyers self-assigned this Apr 6, 2026
@johntmyers johntmyers added the test:e2e Requires end-to-end coverage label Apr 6, 2026
@mjamiv

mjamiv commented Apr 9, 2026

Confirming this affects real deployments. We run 51+ CLI tools in an OpenShell sandbox, all symlinked from /sandbox/.local/bin/ → actual binaries elsewhere. When the proxy resolves through the symlink before checking the binary allowlist, tools that should be permitted get blocked.

Current workaround: list both the symlink path AND the resolved binary path in the policy binaries: section. This is brittle and scales poorly with tool count.

Would be great to see this merged — it would simplify our policy config significantly.

@johntmyers
Collaborator Author

Thanks @mjamiv, any chance you were able to build this branch and verify?

Policy binary paths specified as symlinks (e.g., /usr/bin/python3) were
silently denied because the kernel reports the canonical path via
/proc/<pid>/exe (e.g., /usr/bin/python3.11). The strict string equality
in Rego never matched.

Expand policy binary paths by resolving symlinks through the container
filesystem (/proc/<pid>/root/) after the entrypoint starts. The OPA data
now contains both the original and resolved paths, so Rego's existing
strict equality check naturally matches either.

- Add resolve_binary_in_container() helper for Linux symlink resolution
- Add from_proto_with_pid() and reload_from_proto_with_pid() to OpaEngine
- Trigger one-shot OPA rebuild after entrypoint_pid is stored
- Thread entrypoint_pid through run_policy_poll_loop for hot-reloads
- Improve deny reason with symlink debugging hint
- Add 18 new tests including hot-reload and Linux symlink e2e tests

Closes #770
…naccessible

The Linux-specific symlink resolution tests depend on /proc/<pid>/root/
being readable, which requires CAP_SYS_PTRACE or permissive ptrace
scope. This is unavailable in CI containers, rootless containers, and
hardened hosts. Add a procfs_root_accessible() guard that skips these
tests gracefully instead of failing.
…improve deny messages

When /proc/<pid>/root/ is inaccessible (restricted ptrace, rootless
containers, hardened hosts), resolve_binary_in_container now logs a
per-binary warning with the specific error, the path it tried, and
actionable guidance (use canonical path or grant CAP_SYS_PTRACE).
Previously this was completely silent.

The Rego deny reason for binary mismatches now leads with 'SYMLINK HINT'
and includes a concrete fix command ('readlink -f' inside the sandbox)
plus what to look for in logs if automatic resolution isn't working.
…ution

std::fs::canonicalize resolves /proc/<pid>/root itself (a kernel
pseudo-symlink to /) which strips the prefix needed for path extraction.
This caused resolution to silently fail in all environments, not just CI.

Replace with an iterative read_link loop that walks the symlink chain
within the container namespace without resolving the /proc mount point.
Add normalize_path helper for relative symlink targets containing ..
components. Update procfs_root_accessible test guard to actually probe
the full resolution path instead of just checking path existence.
@johntmyers johntmyers force-pushed the fix/770-symlink-binary-resolution branch from 8253c6c to 907b9fe Compare April 9, 2026 22:12
@copy-pr-bot

copy-pr-bot bot commented Apr 9, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@mjamiv

mjamiv commented Apr 10, 2026

@johntmyers Haven't been able to build from source — no Rust toolchain on this host and it's a production VPS I'd rather not clutter.

Happy to test a pre-built binary or release candidate if one's available. Here's what I'd verify:

Test case: 51 CLI tools symlinked as /sandbox/.local/bin/<tool> → various targets (/sandbox/clawd/tools/<tool>/bin, /sandbox/.local/lib/node_modules/.bin/<tool>, /usr/local/bin/<tool>). Policy has binaries: entries using the symlink paths. On v0.0.25, the proxy resolves through the symlink before checking the allowlist, so only the canonical target path matches — the symlink paths in the policy get silently rejected.

Current workaround: Listing both symlink AND resolved paths in the binaries: section. Scales poorly with 51+ tools.

What I'd verify:

  1. Policy with only symlink paths in binaries: allows traffic
  2. The resolve_binary_in_container via /proc/<pid>/root/ resolves correctly
  3. Policy hot-reload (via openshell policy set) also picks up the symlink resolution
  4. No regression on non-symlinked binaries

If there's a dev release tag or binary I can drop in, I'll test same-day.

@mjamiv

mjamiv commented Apr 10, 2026

@johntmyers Built, deployed, and tested. Here's what I found — the fix compiles cleanly and runs, but does not actually resolve the bug in our environment. The warning path fires unconditionally and the 403 reproduces. Details:

Environment

  • OpenShell v0.0.25 cluster (Docker container: ghcr.io/nvidia/openshell/cluster:0.0.25) on Ubuntu 24.04 VPS
  • K3s-managed sandbox (claw-test), supervisor running as root in the cluster container
  • Supervisor PID and mount namespaces match the sandbox child (verified — they share pid:[4026533081] and mnt:[4026533080])
  • CAP_SYS_PTRACE IS present in CapEff (0x00000004a82c35fb → CAP_SYS_PTRACE bit set)
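The CapEff check above can be reproduced mechanically. The sketch below decodes a hex mask as printed on the `CapEff:` line of /proc/<pid>/status; `has_capability` is an illustrative helper name, while the capability number 19 for CAP_SYS_PTRACE comes from the Linux capability header.

```rust
/// Linux capability number for CAP_SYS_PTRACE (from <linux/capability.h>).
const CAP_SYS_PTRACE: u32 = 19;

/// Parse a hex capability mask as shown on the `CapEff:` line of `/proc/<pid>/status`
/// and report whether the given capability bit is set. Returns `None` on bad input.
fn has_capability(cap_eff_hex: &str, cap: u32) -> Option<bool> {
    // Accept the mask with or without a leading "0x".
    let digits = cap_eff_hex.trim().trim_start_matches("0x");
    let mask = u64::from_str_radix(digits, 16).ok()?;
    Some((mask & (1u64 << cap)) != 0)
}
```

Calling `has_capability("0x00000004a82c35fb", CAP_SYS_PTRACE)` yields `Some(true)`, consistent with the observation that the bit is set in this environment.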

Build

  • rustc 1.94.1, cargo build --release -p openshell-sandbox on host (4 cores, CARGO_BUILD_JOBS=2)
  • ~6m35s build, 16MB binary, 0.0.27-dev.8+g907b9fe
  • 14 warnings (mostly dead-code), no errors

Deployment

  • docker cp the patched binary to /opt/openshell/bin/openshell-sandbox in the cluster container
  • kill -9 the running supervisor, K3s respawned with the new binary
  • Backup of stock v0.0.25 retained, rollback was clean

Test case

  • Policy lists /usr/bin/python3 in binaries: (not /usr/bin/python3.12)
  • Inside the sandbox: /usr/bin/python3 -> python3.12 (standard Debian symlink)
  • Policy group brave_search allows api.search.brave.com
  • Before: python3 -c 'import urllib.request; urllib.request.urlopen("https://api.search.brave.com")' → Tunnel connection failed: 403 Forbidden

With the patched supervisor

Same 403. The supervisor logs 23 warnings at startup (once per policy binary ref), all identical:

WARN openshell_sandbox::opa: Cannot access container filesystem for symlink resolution;
binary paths in policy will be matched literally. If a policy binary is a symlink
(e.g., /usr/bin/python3 -> python3.11), use the canonical path instead, or run with CAP_SYS_PTRACE

No "Resolved policy binary symlink" info logs: every call to resolve_binary_in_container() hits the error branch in symlink_metadata(). It fires for non-symlinks too (/usr/bin/bash, /usr/bin/gh), which means symlink_metadata() itself is erroring on the /proc/<pid>/root/<path> lookup, not failing on the symlink check.

The puzzle

Running the exact same access manually from within the supervisor's PID + mount namespace works fine:

$ nsenter -t <supervisor_pid> -p -m sh
$ ls -la /proc/<child_pid>/root/usr/bin/python3
lrwxrwxrwx 1 root root 10 Nov 12 12:15 /proc/<child_pid>/root/usr/bin/python3 -> python3.12

So the path exists, the namespaces are right, CAP_SYS_PTRACE is held, and the supervisor is root. Yet std::fs::symlink_metadata("/proc/<pid>/root/usr/bin/bash") returns Err.

Best guesses

  1. Timing: reload_from_proto_with_pid(proto, handle.pid()) runs immediately after ProcessHandle::spawn in lib.rs:728. At that moment the child may have the PID allocated but /proc/<pid>/root/ may not be populated yet (pre-exec or pre-namespace-setup). The poll loop at 10s intervals never re-runs the resolve either — it calls reload_from_proto_with_pid with entrypoint_pid.load(), but I see zero Resolved policy binary symlink logs after 15+ seconds of uptime.
  2. PID translation: handle.pid() might be returning a PID in a namespace different from the one the /proc fs lookup uses (though our ns verification says they match).
  3. The error = %e field isn't showing up in log output — the default tracing subscriber elides fields. It'd be very helpful if the error message was appended to the warning body itself, or if there was a RUST_LOG=openshell_sandbox::opa=debug path that dumps the raw io::Error.
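One way to test the timing hypothesis in item 1 would be to wrap the first filesystem probe in a short retry. The helper below is a hypothetical sketch using only the standard library; `retry_with_delay` is not part of the PR, and whether a delay actually helps here is unverified.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical helper: run `probe` at least once and retry on error, sleeping
/// `delay` between tries, for at most `attempts` total calls.
fn retry_with_delay<T, E, F>(attempts: u32, delay: Duration, mut probe: F) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    let mut last = probe();
    for _ in 1..attempts {
        if last.is_ok() {
            break;
        }
        // Give the child process time to finish exec and namespace setup.
        thread::sleep(delay);
        last = probe();
    }
    // Either the first success or the error from the final attempt.
    last
}
```

If the probe starts succeeding after a few hundred milliseconds, that would point to a race with child startup; if it keeps failing, the PID translation guess in item 2 becomes more likely.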

What would help

  • Include error = %e inline in the warning string so we can see ENOENT vs EACCES vs something stranger
  • Add a tracing::debug log right before the symlink_metadata call printing the exact container_path being tested
  • Consider calling resolve_binary_in_container later in the startup sequence, after the child has had a few hundred ms to initialize, or from the first policy poll tick rather than immediately after spawn
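The first two suggestions could be combined by building the warning text from the io::Error itself, so the details survive subscribers that drop structured fields. This is a minimal sketch; `resolution_warning` is a hypothetical function and the message wording is illustrative, not the supervisor's actual log line.

```rust
use std::io;
use std::path::Path;

/// Hypothetical formatter: put the probed path, error kind, and OS errno in the
/// message body so they do not depend on structured-field rendering.
fn resolution_warning(container_path: &Path, err: &io::Error) -> String {
    // `raw_os_error` is `None` for errors that did not originate from the OS.
    let errno = err
        .raw_os_error()
        .map_or_else(|| "n/a".to_string(), |code| code.to_string());
    format!(
        "Cannot access container filesystem for symlink resolution: path={} kind={:?} errno={} ({})",
        container_path.display(),
        err.kind(),
        errno,
        err
    )
}
```

The resulting string can be passed to whatever logging macro is in use, which would make ENOENT, EACCES, and other failures distinguishable in the plain log output.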

Happy to rebuild with an instrumented version if you want me to patch in a few extra logs and re-test. I also kept the build cache so re-runs are ~1 min.

Rollback verified clean, sandbox is back on stock v0.0.25.


Labels

test:e2e Requires end-to-end coverage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Sandbox egress proxy checks resolved binary path, not symlink — python3 silently blocked

2 participants