supervisor: unify per-process lifecycle and bundle per-process state#24

Merged
congwang-mk merged 3 commits into main from process-lifecycle
Apr 26, 2026

@congwang-mk
Contributor

Summary

Three-step refactor of the supervisor's per-process state, originally motivated by a leak I noticed in commit 25afe7f (Linux arm64 sandbox runtime support):

  • `cow: prune virtual_cwds/dir_cache for exited PIDs` is the minimal fix for the leak: `prune_reused_pid` only fires when a new process recycles a numeric PID, so without periodic cleanup `CowState::virtual_cwds` and `dir_cache` grow with the number of distinct child PIDs over the supervisor's lifetime. Adds a 30s GC sweep that drops entries whose process is gone.
  • `supervisor: unify per-process lifecycle and identity` — replaces the GC backstop with proper lifecycle: a `ProcessIndex` (pid → PidKey) on `SupervisorCtx` is the source of truth for sandbox membership; every notification's pid is registered with a pidfd watcher (`AsyncFd`) that runs unified cleanup on process exit. `brk_bases` migrates from `HashMap<i32, u64>` to `HashMap<PidKey, u64>` so a recycled PID can't inherit stale brk accounting. `ProcfsState::proc_pids` is removed — sandbox membership now has a single home, and /proc handlers query `ProcessIndex` directly via a synchronous `std::sync::RwLock` (no async lock on the hot path). The fast path of `register_child_if_new` is a single read-lock acquire on already-registered pids; pidfd liveness covers identity correctness.
  • `supervisor: bundle per-process state, collapse cleanup` — final consolidation: `PerProcessState` bundles `virtual_cwd`, `brk_base`, `cow_dir_cache`, and `procfs_dir_cache` into one struct held per-pid as `Arc<AsyncMutex>` inside `ProcessIndex`. `cleanup_pid` collapses to a single `unregister(key)` — dropping the `Arc` releases everything; the previous four sequential lock acquires (cow, procfs, resource, processes) are gone. Domain structs simplify accordingly: `CowState` keeps only `branch`, `ProcfsState` keeps only `vdso_patched_addr`, `ResourceState` loses `brk_bases`. COW handlers compress into a single `cow_call!` macro because every handler now has the uniform `(notif, cow, processes, fd)` signature.
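
To make the end state of the three commits concrete, here is a minimal sketch of the index shape described above. It is not the code from this PR: the layout of `PidKey` (pid plus a generation counter), the field types in `PerProcessState`, and the `Inner` and `state` names are assumptions for illustration, and `std::sync::Mutex` stands in for the async mutex so the sketch needs only the standard library.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, RwLock};

/// Hypothetical identity key: the numeric pid plus a generation counter that
/// distinguishes successive processes reusing the same pid.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct PidKey {
    pid: i32,
    generation: u64,
}

/// Per-process state bundled in one struct (field types are guesses).
#[derive(Default, Debug)]
struct PerProcessState {
    virtual_cwd: Option<String>,
    brk_base: Option<u64>,
}

#[derive(Default)]
struct Inner {
    by_pid: HashMap<i32, PidKey>,
    state: HashMap<PidKey, Arc<Mutex<PerProcessState>>>,
    next_generation: u64,
}

/// Single home for sandbox membership, guarded by a synchronous RwLock.
#[derive(Default)]
struct ProcessIndex {
    inner: RwLock<Inner>,
}

impl ProcessIndex {
    /// Returns the key for `pid`, registering the pid if it is not yet known.
    fn register_child_if_new(&self, pid: i32) -> PidKey {
        // Fast path: an already-registered pid costs one read-lock acquire.
        let existing = self.inner.read().unwrap().by_pid.get(&pid).copied();
        if let Some(key) = existing {
            return key;
        }
        // Slow path: take the write lock and re-check, since another thread
        // may have registered the pid between the two lock acquisitions.
        let mut inner = self.inner.write().unwrap();
        if let Some(key) = inner.by_pid.get(&pid).copied() {
            return key;
        }
        let key = PidKey { pid, generation: inner.next_generation };
        inner.next_generation += 1;
        inner.by_pid.insert(pid, key);
        inner.state.insert(key, Arc::new(Mutex::new(PerProcessState::default())));
        key
    }

    /// Looks up the bundled state for a key, if the process is still registered.
    fn state(&self, key: PidKey) -> Option<Arc<Mutex<PerProcessState>>> {
        self.inner.read().unwrap().state.get(&key).cloned()
    }

    /// Cleanup is one removal: dropping the index's Arc releases the bundle
    /// once no handler holds another clone of it.
    fn unregister(&self, key: PidKey) {
        let mut inner = self.inner.write().unwrap();
        inner.state.remove(&key);
        // Only clear the pid mapping if it still points at this identity.
        if inner.by_pid.get(&key.pid) == Some(&key) {
            inner.by_pid.remove(&key.pid);
        }
    }
}
```

In this sketch, registering pid 100, setting its `brk_base`, calling `unregister`, and registering pid 100 again yields a different `PidKey` with a fresh default `PerProcessState`, which is the property that keeps a recycled PID from inheriting stale accounting.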

Notable design points

  • pidfd ownership lives with the watcher, not the index. `AsyncFd` couples the kernel fd's lifetime to its tokio IO-driver registration. An earlier iteration that put the `OwnedFd` in the index and gave the watcher a bare `RawFd` produced a SIGSEGV in parallel tests when cleanup raced fd recycling.
  • `handle_cow_chdir`'s optimistic `virtual_cwd` insert is now bounded. It still runs before the kernel chdir, but pidfd cleanup guarantees the inconsistency window can't outlive the process.
  • A defensive 5-minute sweep stays as a backstop for the rare case where `pidfd_open` fails (for example, on a very old kernel).
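
The backstop sweep can be pictured with a short sketch. This is an illustration under stated assumptions rather than the supervisor's code: `sweep_exited` and `proc_entry_exists` are hypothetical names, the map is keyed by raw pid for brevity, and the liveness probe is passed in as a closure so it can be swapped out.

```rust
use std::collections::HashMap;
use std::path::Path;

/// Hypothetical liveness probe: checks for a /proc/<pid> entry.
/// Only meaningful on Linux with procfs mounted.
fn proc_entry_exists(pid: i32) -> bool {
    Path::new(&format!("/proc/{pid}")).exists()
}

/// Drops every entry whose pid fails the liveness probe and returns how
/// many entries were removed.
fn sweep_exited<V>(map: &mut HashMap<i32, V>, is_alive: impl Fn(i32) -> bool) -> usize {
    let before = map.len();
    map.retain(|pid, _| is_alive(*pid));
    before - map.len()
}
```

A periodic task would call something like `sweep_exited(&mut cache, proc_entry_exists)` on a timer. A bare existence check cannot tell a live process from a new one that reused the same pid, which is one reason a sweep of this kind works as a backstop and not as the primary cleanup path.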

Test plan

  • `cargo test --workspace --lib --tests` — 217 unit + 183 integration tests passing
  • `pytest python/tests/` — 231 tests passing
  • No new compiler warnings
  • CI on arm64 runner

🤖 Generated with Claude Code

Signed-off-by: Cong Wang <cwang@multikernel.io>
@congwang-mk congwang-mk merged commit bfbc980 into main Apr 26, 2026
8 checks passed
@congwang-mk congwang-mk deleted the process-lifecycle branch April 26, 2026 07:14