Skip to content

eks: support nodepool labels and taints#6744

Open
geojaz wants to merge 2 commits into
GoogleCloudPlatform:masterfrom
onix-net:geojaz/eks-nodepool-labels-taints
Open

eks: support nodepool labels and taints#6744
geojaz wants to merge 2 commits into
GoogleCloudPlatform:masterfrom
onix-net:geojaz/eks-nodepool-labels-taints

Conversation

@geojaz

@geojaz geojaz commented Jun 5, 2026

Copy link
Copy Markdown
Collaborator

Add support for node_labels and node_taints on EKS node groups.

What this adds

  • _ParseTaint(): parses taint strings in key=value:Effect or key:Effect format into the dict structure eksctl expects in its node group JSON.
  • BaseEksCluster._RenderNodeGroupJson(): merges nodepool.node_labels into the labels dict and appends parsed taints when nodepool.node_taints is set.

These were written to simplify managing EKS clusters end-to-end with PKB for the agent_sandbox benchmark, but are neither required by nor exclusively useful for that benchmark -- any PKB benchmark that provisions EKS nodepools with label or taint requirements can use them.

Also includes a fix for AWS key import to use the fileb:// URI scheme (--public-key-material=fileb://<path>), which avoids the AWS CLI's base64 interpretation of raw key material passed as a string argument.

Tests

Added testEksClusterNodepoolLabels, testEksClusterNodepoolTaints, and testParseTaint to ElasticKubernetesServiceTest. All 27 tests in the EKS test file pass.

Thanks to @shubham-gaur-x for the inspiration and testing

`aws ec2 import-key-pair --public-key-material` rejects raw OpenSSH
public keys as "Invalid base64" when the value is passed as a CLI
string. The CLI expects either base64-encoded bytes via the string
form, or raw bytes via a fileb://<path> URI.

PKB was passing the raw key string via cat -> --public-key-material=$keyfile,
which produces the rejected case. Switch to fileb://<path> so the
CLI reads the bytes directly.

Without this fix, AwsKeyFileManager.ImportKeyfile raises
PrepareException on every AWS provisioning, blocking EksCluster (and
AwsVm) Create().
@geojaz geojaz force-pushed the geojaz/eks-nodepool-labels-taints branch 2 times, most recently from e374b59 to 2a4de67 Compare June 5, 2026 16:19
Add _ParseTaint helper and update _RenderNodeGroupJson in BaseEksCluster
to pass node_labels and node_taints through to eksctl's node group JSON.
Taints are parsed from 'key=value:Effect' or 'key:Effect' strings into
the dict format eksctl expects.
@geojaz geojaz force-pushed the geojaz/eks-nodepool-labels-taints branch from 2a4de67 to c20ce1b Compare June 5, 2026 17:10
@hubatish

hubatish commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator

What are some sample values y'all are setting this to? We've certainly use(d) selectors on pkb_nodepool or similar in places, so I can see some value.. but of course there are questions of "should this go in spec config, or in hardcoded values in yaml or pkb code?" which can sometimes be unfortunately arbitrary. I have been trying to put more useful info in the nodepool spec, so I could see it but would also appreciate examples/reasoning.

@geojaz

geojaz commented Jun 8, 2026

Copy link
Copy Markdown
Collaborator Author

What are some sample values y'all are setting this to? [...] but would also appreciate examples/reasoning.

These labels and taints should live in the nodePoolSpec because it guarantees enforcement at all times. While it's technically possible to label nodes after they come up, that can get sketchy under node churn — and in the benchmarks I've come across, node labeling happens once during provisioning rather than as an ongoing activity.
Here's the sandbox nodepool from the agent_sandbox benchmark config, which is where this PR comes from:

yamlnodepools:
  sandbox:
    vm_count: 4
    vm_spec:
      GCP: {machine_type: c4-standard-16}
      AWS: {machine_type: m8i.4xlarge}
    node_labels:
      sandbox.gke.io/runtime: runsc
    node_taints:
      - sandbox.gke.io/runtime=runsc:NoSchedule

The gVisor-installer DaemonSet selects on the label and tolerates the taint to install runsc and register it with containerd; the sandbox pods use the same selector/toleration so they land on properly-configured nodes. And I expect utility beyond this benchmark — any benchmark with heterogeneous nodepools could use these fields.

That said — good question on where this should live, at least with respect to the agent_sandbox benchmark.

Right now that runtime selector appears in both the nodepool spec and the hardcoded DaemonSet/RuntimeClass manifests, with nothing keeping them in sync; if the spec value changed, the manifests wouldn't follow. The generic node_labels/node_taints fields are the right primitive to expose, but this particular selector is really an internal invariant of how the benchmark wires gVisor together, so I'd rather single-source it than hand-maintain the same string in four places. Let me make a couple tweaks to #6740 with that in mind — I think the pool can rendezvous on the pkb_nodepool label that's already injected, with the toleration derived from node_taints, so the spec stays the source of truth.

logging.info('Successfully applied S3 PVC.')


def _ParseTaint(taint_str: str) -> dict[str, str]:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add the matching GKE wiring in _AddNodeParamsToCmd as well? It would help symmetric setup for label/taint-based pinning for cross-cloud benchmarks.. You could merge node_labels
into --node-labels and --node-taints from node_taints in google_kubernetes_engine.py.

self, nodepool: container.BaseNodePoolConfig
) -> dict[str, Any]:
"""Constructs the node group json dictionary."""
labels = {'pkb_nodepool': nodepool.name}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will work for eksctl setup, but #6748 uses eks create-group explicitly which cannot leverage this.

@ashishsuneja could we converge on the approach?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ajay — agree this is worth converging on. A few things that might help:
The kubernetes_management EKS path (#6748) already sets the pkb_nodepool label on every nodegroup — in both the eksctl-config path and the create-nodegroup payload — and already selects nodes by it (jsonpath on .metadata.labels.pkb_nodepool). So the label-based selection of single-node benchmarks need already works against nodegroups it provisions.
On the divergence you flagged: #6748 uses eks create-nodegroup directly rather than eksctl because the benchmark measures the managed-nodegroup create/upgrade/delete operations themselves — eksctl wraps those in CloudFormation, which adds provisioning latency we can't attribute to the control plane. So the two paths can't share the eksctl-config label mechanism directly, but they converge on the same pkb_nodepool contract.
Your broader point — shared nodepool-provisioning helpers so single-node benchmarks can reuse the logic — is the right direction, but it spans this PR, #6744, and the existing provisioning benchmark. Could we come to a conclusion maybe by having a short discussion or chat to agree on the shared interface before reshaping any one PR? I'm happy to align to whatever shared contract we land on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants