Skip to content

Feat/agent os sandbox create options#4549

Draft
brittianwarner wants to merge 3 commits intomainfrom
feat/agent-os-sandbox-create-options
Draft

Feat/agent os sandbox create options#4549
brittianwarner wants to merge 3 commits intomainfrom
feat/agent-os-sandbox-create-options

Conversation

@brittianwarner
Copy link
Copy Markdown

Description

Please include a summary of the changes and the related issue. Please also include relevant motivation and context.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

…unting

The agentOs() sandbox config previously only accepted static options,
meaning all actor instances shared a single sandbox container. This adds
createOptions as a mutually exclusive alternative that is called
per-actor-instance so each gets its own dedicated sandbox.

Core implementation:
- Add createOptions to Zod schema with XOR .refine() enforcement
- Create AgentOsActorOptionsConfig exclusive union type (?: never pattern)
- Rewrite ensureVm() to resolve options from createOptions(c) or config.options
- Add promise-caching guard (vmBootPromise) to prevent concurrent actions
  from creating duplicate VMs via race condition
- Add defensive null-check on createOptions return with clear error message
- Wrap createOptions errors with context message and { cause } for stack traces
- Add debug log before factory invocation for slow-boot observability
- Clear vmBootPromise in onSleep and onDestroy

DX improvements:
- Add JSDoc to options and createOptions union members for IDE hover docs
- Fix any leak in AgentOsActorContext and AgentOsActionContext database
  param (any -> DatabaseProvider<RawAccess>)
- Fix zFunction() Zod error message ('Expected a function' instead of
  generic 'Invalid input')
- Export AgentOsActorContext, AgentOsCreateContext alias, AgentOsActorConfigInput

Example and docs:
- Rewrite sandbox example to use createOptions with context param
- Fix 'Sandbox extension' -> 'Sandbox mounting' terminology in client.ts
- Add missing deps to examples/agent-os/package.json
- Add 'Per-instance options' section to configuration.mdx
- Fix stale sandbox: { enabled: true } API in index.mdx crash course
- Fix missing .ts import extension in sandbox.mdx

Tests:
- Add agentOsCreateOptionsTestActor fixture to driver-test-suite
- Add 3 driver tests: writeFile round-trip, exec, two-instance isolation
- Add agent-os-config-validation.test.ts with 7 tests (Zod XOR
  enforcement, runtime factory boot, context availability)
- Add 9 type-level assertions to actor-types.test.ts (exclusive union,
  context typing, c.db not any, async callback, generic propagation)
Mirrors the sandbox actor pattern where the sandbox ID is stored in
actor state so that after a sleep/wake cycle the createOptions callback
can reconnect to the same container instead of provisioning a new one.

State changes:
- Add sandboxId: string | null to AgentOsActorState (persisted across
  sleep/wake). The createOptions callback reads c.state.sandboxId and
  passes it to SandboxAgent.start(), then persists the returned ID back.

Config changes:
- Add destroyOptions callback (only valid with createOptions, enforced
  by both TypeScript union and Zod refine). Called during onDestroy so
  the user can clean up external resources like sandbox containers.

Lifecycle hardening:
- Wrap dispose() in try/finally in both onSleep and onDestroy so vars
  are always cleaned up even if dispose throws, matching the sandbox
  actor's teardownAgentRuntime pattern.
- Clear sandboxId in a finally block in onDestroy so state is always
  consistent regardless of whether destroyOptions succeeds or fails.

Tests:
- Type tests for sandboxId typing, destroyOptions union exclusivity
- Zod schema tests for destroyOptions acceptance, rejection with
  options, and non-function rejection
- Runtime tests for sandboxId state persistence and destroyOptions
  config acceptance

Docs:
- Update sandbox.mdx, configuration.mdx, and index.mdx with the
  sandboxId reuse pattern and destroyOptions documentation
- Fix 'extension' -> 'sandbox mounting' terminology in sandbox.mdx
@railway-app
Copy link
Copy Markdown

railway-app bot commented Apr 3, 2026

This PR was not deployed automatically as @brittianwarner does not have access to the Railway project.

In order to get automatic PR deploys, please add @brittianwarner to your workspace on Railway.

@claude
Copy link
Copy Markdown

claude bot commented Apr 3, 2026

PR Review: Feat/agent os sandbox create options

This PR adds a createOptions factory function to agentOs() as a per-instance alternative to the existing static options field, along with a concurrent-boot guard, improved type safety, and updated docs/examples. The overall design is sound and follows established patterns in the codebase.


Critical Issue: c.getSandbox() Is Not Implemented

The example (examples/agent-os/src/sandbox/server.ts) and the documentation (website/src/content/docs/agent-os/sandbox.mdx, website/src/content/docs/agent-os/index.mdx) both reference c.getSandbox(), but this method does not appear to exist on AgentOsContext or the underlying ActorContext.

If a user follows the documented sandbox mounting pattern with createOptions, this will throw a runtime error. Either the implementation of getSandbox needs to be added to AgentOsContext, or the docs and example must be rewritten to show the manual approach without referencing an unimplemented method.


Boot-Guard Correctness: Resolved Promise Not Cleared

Once the boot promise resolves successfully, c.vars.vmBootGuard is never set back to null. The fast path (if (c.vars.agentOs)) correctly short-circuits on subsequent calls, so this is not a functional bug. However, leaving the resolved promise in vmBootGuard after the VM is ready is misleading and could mask future refactors. Consider clearing vmBootGuard immediately after setting c.vars.agentOs inside the boot IIFE.


Error Message Inaccuracy in Falsy Guard

The error thrown when resolvedOptions is falsy blames createOptions specifically, but the same code path is reached even when the static options branch was taken. The check and message should be split to distinguish the two paths.


Minor: Missing @ nocheck on Doc Code Blocks

Per CLAUDE.md, all TypeScript code blocks in docs are typechecked during the website build. The new primary examples in sandbox.mdx and index.mdx import from packages like sandbox-agent/docker. Verify these blocks are marked with @nocheck if they cannot be compiled in the typecheck environment (the bottom examples in the diff already carry this annotation).


Positive Observations

  • The concurrent-boot guard (vmBootGuard) is a solid improvement. The IIFE pattern correctly assigns the promise before awaiting it, so racing callers join the existing boot rather than starting a second one. Error cleanup is correct (guard cleared on rejection).
  • The try/finally pattern in onSleep and onDestroy ensures agentOs and vmBootGuard are always reset even if dispose() throws.
  • Replacing the biome-ignore empty-type comment with an actual interface AgentOsActorState definition is cleaner and more forward-compatible.
  • The TypeScript exclusive-union approach (createOptions?: never / options?: never) gives excellent compile-time ergonomics.
  • The new type tests in actor-types.test.ts are thorough and well-structured.
  • "Sandbox extension" is correctly replaced with "sandbox mounting" per CLAUDE.md.

Required before merge:

  1. Implement c.getSandbox() on AgentOsContext, or rmeve all references to it from docs/examples and replace with the correct manual pattern.
  2. Verify all new doc code blocks in sandbox.mdx and index.mdx have @nocheck annotations where needed.

@@ -51,6 +55,60 @@ export const registry = setup({ use: { vm } });
registry.start();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simplify to:

  • static
    vs
  • dynamic

then just include a brief snippet showing dynamic options. show using the key to dynamically determine what to do.

example: pass env var based on c.key

Remove destroyOptions (deferred to c.getSandbox() auto-destroy).
Rename AgentOsActorContext to AgentOsContext, vmBootPromise to
vmBootGuard. Fix any types in zFunction. Simplify docs to static
vs dynamic sections with c.key example. Update sandbox docs and
example to c.getSandbox() API. Delete config validation tests.
Remove destroyOptions type tests and @rivet-dev/agent-os-sandbox dep.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants