agent_sandbox: Kubernetes agent sandbox resource implementation by geojaz · Pull Request #6732 · GoogleCloudPlatform/PerfKitBenchmarker

geojaz · 2026-06-04T05:06:27Z

What

Second step of reshaping the agent sandbox into a PKB resource (follows the skeleton in #6730). This fills in the Kubernetes implementation: a config-driven spec, the install orchestration, the data manifests, and a stub benchmark that provisions the resource.

Stacked on #6730; review and merge that first. Until #6730 merges, the diff below also includes the skeleton commit; GitHub narrows it to this PR's own commits once #6730 lands.

Changes

K8sAgentSandboxConfigSpec (k8s_agent_sandbox_spec.py): config-driven, with nested controller, sandbox_template, and sandbox_warmpool sub-specs. The sandbox_template block models the upstream SandboxTemplateSpec (pod shape rendered; template-level toggles like network_policy_management / env_vars_injection_policy / service are accepted and validated as stubs). The existing agent_sandbox_* flags are bridged via _ApplyFlags; the old controller_ref flag is renamed to agent_sandbox_manifest_ref.
K8sAgentSandbox (k8s_agent_sandbox.py): _Create orchestrates gVisor install, controller install, SandboxTemplate apply, and SandboxWarmPool install (private methods reading self.spec). _Delete is a no-op (the ephemeral cluster teardown reclaims the stack). The controller-manifest configuration and install helpers are ported from the prior linux_packages implementation.
Data manifests (data/agent_sandbox/): gVisor installer assets, and sandbox-template.yaml.j2 parameterized for runtime class, image, resources, and labels. The SandboxTemplate and SandboxWarmPool share a single fixed name.
Stub benchmark (agent_sandbox_benchmark.py): a container_cluster.agent_sandbox config that constructs a K8sAgentSandbox. Run returns no samples yet.
Tests (k8s_agent_sandbox_test.py): spec decode + flag overrides, controller-manifest injection, _Create orchestration, _Delete no-op, and benchmark-config construction.
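
For readers skimming the list above, the _Create ordering and the no-op _Delete can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: `SketchSpec`, `SketchAgentSandbox`, the private step names, and the default values are all hypothetical, and each step only records that it ran instead of touching a cluster.

```python
from dataclasses import dataclass, field


@dataclass
class SketchSpec:
    """Hypothetical stand-in for the config-driven spec (values are assumptions)."""

    runtime_class: str = "gvisor"
    warmpool_replicas: int = 2


@dataclass
class SketchAgentSandbox:
    """Hypothetical resource whose _Create runs the install steps in order."""

    spec: SketchSpec
    calls: list = field(default_factory=list)

    def _InstallGvisor(self):
        self.calls.append(f"gvisor:{self.spec.runtime_class}")

    def _InstallController(self):
        self.calls.append("controller")

    def _ApplySandboxTemplate(self):
        self.calls.append("sandbox_template")

    def _InstallWarmPool(self):
        self.calls.append(f"warmpool:{self.spec.warmpool_replicas}")

    def _Create(self):
        # Order follows the PR description: gVisor, controller,
        # SandboxTemplate, then SandboxWarmPool.
        self._InstallGvisor()
        self._InstallController()
        self._ApplySandboxTemplate()
        self._InstallWarmPool()

    def _Delete(self):
        # No-op: the ephemeral cluster teardown is assumed to reclaim the stack.
        pass
```

Each private step reads only `self.spec`, which is the property the description highlights; a test can then assert on the recorded order without a cluster.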

Scope

Reviewable but not yet runnable end to end. The benchmark's Run (load generator + metrics) lands in the next PR, and the GKE/EKS/AKS nodepool changes that make it actually provision come after that. Cloud-agnostic: no provider changes here.

Follow-ups

Run load generation + metrics; then GKE, EKS, AKS provider support.

Introduce the agent sandbox as a PKB resource modeled on the kubernetes inference server pattern, replacing the prior linux_package shape. This change adds only the class/spec/registration skeleton plus the cloud-agnostic container_cluster wiring. The install logic and the benchmark are added in follow-up changes.

- BaseAgentSandbox resource and GetAgentSandbox factory, keyed on SANDBOX_TYPE so additional sandbox implementations can coexist.
- BaseAgentSandboxConfigSpec and AgentSandboxConfigDecoder, embeddable under container_cluster in a benchmark config.
- K8sAgentSandbox / K8sAgentSandboxConfigSpec: the Kubernetes (kubernetes-sigs/agent-sandbox) implementation stubs.
- KubernetesCluster constructs and lifecycles cluster.agent_sandbox alongside cluster.inference_server.
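
A factory keyed on a class attribute such as SANDBOX_TYPE might look something like the sketch below. It is only an illustration: `BaseSketchSandbox`, `K8sSketchSandbox`, and `get_sketch_sandbox` are hypothetical names, and `__init_subclass__` is used purely for brevity; it is not claimed to be the registration mechanism PKB uses.

```python
_SANDBOX_REGISTRY = {}


class BaseSketchSandbox:
    """Hypothetical base class; subclasses register under SANDBOX_TYPE."""

    SANDBOX_TYPE = None

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Only concrete subclasses that declare a type are registered.
        if cls.SANDBOX_TYPE is not None:
            _SANDBOX_REGISTRY[cls.SANDBOX_TYPE] = cls


class K8sSketchSandbox(BaseSketchSandbox):
    """Hypothetical Kubernetes implementation."""

    SANDBOX_TYPE = "Kubernetes"


def get_sketch_sandbox(sandbox_type):
    """Returns the class registered for sandbox_type, or raises ValueError."""
    try:
        return _SANDBOX_REGISTRY[sandbox_type]
    except KeyError:
        raise ValueError(
            f"No sandbox registered for type {sandbox_type!r}"
        ) from None
```

Because lookup goes through the registry, a second implementation only needs to subclass the base and declare its own SANDBOX_TYPE.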

…b-specs and flags Add ControllerSpec / SandboxTemplateSpec / SandboxWarmPoolSpec nested sub-specs, the agent_sandbox_* stack and controller-tuning flags bridged via _ApplyFlags, and rename the old controller_ref flag to agent_sandbox_manifest_ref.

…spec register The concrete resource module must import its concrete spec module (as wg_serving_inference_server imports wg_serving_inference_server_spec) so the agent_sandbox_* flags and K8sAgentSandboxConfigSpec register at runtime. Without it, a real pkb.py run fails at flag parsing / config decode even though unit tests (which import the spec module directly) pass.
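
The failure mode described in this commit message (registration that only happens as a side effect of importing a module) can be reproduced in miniature. The sketch below is an assumption-laden toy: `SPEC_REGISTRY`, `register_spec`, `load_spec_module`, and `decode_spec` are hypothetical, and the "module" is simulated with `types.ModuleType` and `exec` so the example stays in one file.

```python
import types

SPEC_REGISTRY = {}


def register_spec(name):
    """Class decorator that records a spec class under a name."""

    def decorator(cls):
        SPEC_REGISTRY[name] = cls
        return cls

    return decorator


# Source of a hypothetical spec module. Registration happens only when this
# source is executed, i.e. when the module is imported.
_SPEC_MODULE_SOURCE = '''
@register_spec("Kubernetes")
class SketchK8sSpec:
    pass
'''


def load_spec_module():
    """Simulates importing the spec module."""
    module = types.ModuleType("sketch_spec_module")
    module.register_spec = register_spec
    exec(_SPEC_MODULE_SOURCE, module.__dict__)
    return module


def decode_spec(name):
    """Looks up a registered spec; fails if its module was never imported."""
    if name not in SPEC_REGISTRY:
        raise KeyError(f"spec {name!r} is not registered")
    return SPEC_REGISTRY[name]()
```

Calling `decode_spec("Kubernetes")` before `load_spec_module()` raises, which mirrors a run where nothing on the import path pulls in the spec module; a test that imports the spec module directly never sees the problem.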

hubatish · 2026-06-17T18:18:51Z

  inference_server: (
      kubernetes_inference_server_spec.BaseInferenceServerConfigSpec | None
  )
+  agent_sandbox: agent_sandbox_spec.BaseAgentSandboxConfigSpec | None


Inference server set the example here, but I'm not sure it's actually the correct location, as opposed to having this in the root benchmark_spec.py. Namely, this approach creates some circular dependency issues: agent_sandbox wants to reference a cluster (to call methods on it), and the cluster references agent_sandbox to create it.
I believe wg_serving_inference_server.py & kubernetes_inference_server.py get around this by having a parent/abstract service and a child, and/or by not using pytype at that top level. But yeah, putting this in benchmark_spec.py is probably the right place.
See 4daab75 for how example_resource.py was added to benchmark_spec.py. The ConstructAgentSandbox call can also take a container_cluster in its init there.
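
To make the suggestion concrete, here is a rough sketch of the ownership shape being proposed: a top-level spec object builds the cluster first and then hands it to the sandbox, so the cluster never needs to know about the sandbox. All class and method names below (`SketchCluster`, `SketchSandboxResource`, `SketchBenchmarkSpec`) are hypothetical and are not PKB's actual classes or signatures.

```python
class SketchCluster:
    """Hypothetical cluster; it has no reference to the sandbox."""

    def __init__(self, name):
        self.name = name
        self.applied = []

    def ApplyManifest(self, manifest):
        self.applied.append(manifest)


class SketchSandboxResource:
    """Hypothetical sandbox that receives its cluster at construction time."""

    def __init__(self, cluster):
        self.cluster = cluster

    def Create(self):
        # The dependency points one way only: sandbox -> cluster.
        self.cluster.ApplyManifest("sandbox-template.yaml.j2")


class SketchBenchmarkSpec:
    """Hypothetical top-level owner that constructs both resources."""

    def __init__(self):
        self.container_cluster = None
        self.agent_sandbox = None

    def ConstructContainerCluster(self):
        self.container_cluster = SketchCluster("sketch-cluster")

    def ConstructAgentSandbox(self):
        if self.container_cluster is None:
            raise RuntimeError("construct the container cluster first")
        self.agent_sandbox = SketchSandboxResource(self.container_cluster)
```

With this layout the cluster module would not need to import the sandbox module at all, which is the circular reference the comment is worried about.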

hubatish · 2026-06-17T20:21:43Z

+    'agent_sandbox_controller_otel_endpoint', None,
+    'OTLP exporter endpoint when tracing is enabled.')
+flags.DEFINE_boolean(
+    'agent_sandbox_controller_leader_elect', False,


We should be able to set all these variables just through config overrides, like --config_override=kubernetes_redis_memtier.container_cluster.agent_sandbox.image=yada; the flags are just a convenience. Having the convenience flags isn't bad, but it means we probably only need flags for the ones we're most likely to change manually.
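
As a rough picture of why a dotted override can reach any nested spec field, the sketch below applies a `path=value` string to a nested dict. It is not PKB's implementation: `apply_override` is a hypothetical helper, and it keeps the value as a plain string, whereas a real override mechanism may parse types.

```python
import copy


def apply_override(config, override):
    """Returns a copy of config with a dotted 'a.b.c=value' override applied."""
    path, sep, raw_value = override.partition("=")
    if not sep or not path:
        raise ValueError(f"expected 'path=value', got {override!r}")
    keys = path.split(".")
    result = copy.deepcopy(config)
    node = result
    for key in keys[:-1]:
        # Create intermediate mappings as needed while walking the path.
        node = node.setdefault(key, {})
        if not isinstance(node, dict):
            raise ValueError(f"{key!r} in {path!r} is not a mapping")
    node[keys[-1]] = raw_value  # Value stays a string in this sketch.
    return result
```

For example, applying `kubernetes_redis_memtier.container_cluster.agent_sandbox.image=yada` to a config dict sets only the nested `image` key and leaves the input dict unchanged.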

hubatish · 2026-06-17T20:22:48Z

+      config_values['runtime_class'] = flag_values.agent_sandbox_runtime_class
+
+
+class SandboxWarmPoolSpec(spec.BaseSpec):


Can you split some of these into child PRs? IDK, this seems like it is implementing the base sandbox plus a bunch of features. Each feature as an individual PR would be nice.

In general it seems like a ton of customization; we can likely hardcode many of these values.

Maybe I'm misinterpreting this (I note below they are all referenced by the parent spec)

hubatish · 2026-06-17T20:23:40Z

+#
+# Targets nodes labelled sandbox.gke.io/runtime=runsc (the label the
+# benchmark applies to the sandbox node pool).
+apiVersion: apps/v1


This turned into mostly a "setup spec" PR, so I don't think the YAMLs are used, right? Push them to the PR where they are used.

geojaz added 9 commits June 3, 2026 17:35

agent_sandbox: add gVisor installer assets and sandbox manifests

be2c8ba

agent_sandbox: port controller-manifest configuration helper with tests

1a41377

agent_sandbox: implement K8sAgentSandbox._Create orchestration and no-op _Delete

5567851

agent_sandbox: add stub benchmark that provisions the resource

bf852e2

agent_sandbox: pass bare manifest names to ApplyManifest and strengthen test

7a385ee

agent_sandbox: use a single shared name for SandboxTemplate and SandboxWarmPool

e03ceaa

This was referenced Jun 4, 2026

agent_sandbox: load generator, metrics, and runnable benchmark #6740

Open

gke: private nodes, DNS endpoint, Dataplane V2, cost allocation, monitoring, max_pods_per_node, nodepool labels/taints #6741

Open

hubatish reviewed Jun 17, 2026

geojaz wants to merge 9 commits into GoogleCloudPlatform:master from onix-net:geojaz/agent-sandbox-resource
