Pass KE workload via mounted secret to workers #62129

Open

amoghrajesh wants to merge 6 commits into apache:main from astronomer:pass-jwt-to-ke-pods-via-secret

Conversation

amoghrajesh (Contributor) commented Feb 18, 2026


We used to pass the workload to a K8s worker using command-line args, which is not a good practice.

Through this PR, I am creating a K8s Secret to pass in the task workload: https://kubernetes.io/docs/concepts/configuration/secret/. The secret will contain the ExecuteTask workload JSON and it will be mounted into the worker pod at a fixed path. The pod reads the workload using --json-path instead of --json-string. The secret's lifecycle is tied to the pod via k8s ownerReferences, so it is automatically garbage collected when the pod is deleted. The cleanup CronJob acts as a fallback for any orphaned secrets.
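The flow described above can be sketched roughly as follows. This is an illustrative plain-dict manifest, not the PR's exact code; the helper name and the "workload.json" key are assumptions that mirror the description.

```python
# Illustrative sketch of the workload Secret described above, expressed as a
# plain-dict Kubernetes manifest (the kubernetes client also accepts dicts).
# Helper name and field layout are assumptions, not the PR's exact code.
def build_workload_secret(pod_name: str, namespace: str, workload_json: str) -> dict:
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {
            "name": f"airflow-workload-{pod_name}",
            "namespace": namespace,
            "labels": {"airflow-workload-secret": "true"},
        },
        # Mounted into the worker pod, which then reads it via
        # --json-path /run/secrets/airflow-workload/workload.json
        "stringData": {"workload.json": workload_json},
    }
```

The ownerReference tying the Secret to the pod would be patched on afterwards, once the pod exists and has a UID.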

Sizing implications?

Each Secret will be under 1 KB given the standard fields it carries and the structure we form, making the overhead negligible even at high concurrency.

Since the scheduler now requires creating a K8s Secret for the worker to mount, the helm chart pod-launcher RBAC role has been updated to grant the scheduler permission to create, get, and patch secrets.

This is needed to create the workload secret and to set the ownerReference on it after the pod is created. This doesn't seem too bad since the scheduler is a trusted component and already had the same verbs for the pods resource.
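As a rough illustration, the added permission amounts to a rule like the following, sketched here as a plain-dict fragment of a Role's rules list (the actual change lives in the helm chart's pod-launcher role template):

```python
# Sketch of the extra RBAC rule described above, as a plain-dict fragment of
# a Role's "rules" list. The real change is in the chart templates, not code.
secrets_rule = {
    "apiGroups": [""],  # "" is the core API group, where Secrets live
    "resources": ["secrets"],
    "verbs": ["create", "get", "patch"],
}
```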

Ran a few examples by deploying the change on K8s and this is what we see now:

  1. Fetched the args for a running worker:

(airflow) ➜  airflow git:(pass-jwt-to-ke-pods-via-secret) ✗ kubectl get pod -n airflow --sort-by=.metadata.creationTimestamp -o jsonpath='{.items[-1].spec.containers[0].args}'
["python","-m","airflow.sdk.execution_time.execute_workload","--json-path","/run/secrets/airflow-workload/workload.json"]

  2. Images showing the args and one of the secrets (screenshots omitted).

  3. Secrets have ownerRefs and get deleted once the task is done (screenshots omitted).
  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst or {issue_number}.significant.rst, in airflow-core/newsfragments.

potiuk (Member):

Nice!

jedcunningham (Member):

We should also add secrets cleanup into cleanup-pods (maybe rename it?). With ownerReferences the window is small, but it does still exist.

metadata=client.V1ObjectMeta(
    name=secret_name,
    namespace=self.namespace,
    labels={"airflow-workload-secret": "true"},
Member:

We should add more labels here, like dag_id, run_id, task_id, map_index, and/or the TI id.

Contributor:

Yeah, for cleanup reasons a label with the task UUID at least would be great!

amoghrajesh (Author):

Good call, handled it and added dag_id, task_id, run_id, map_index and even ti_id: b4a137b
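The label set landed in b4a137b could look roughly like this. This is a hedged sketch: the cncf.kubernetes provider ships its own label-sanitizing helper, and the simplified one below is only illustrative.

```python
import re


def make_safe_label_value(value: str) -> str:
    # Simplified sketch of a label sanitizer: Kubernetes label values must be
    # at most 63 characters of alphanumerics plus '-', '_', '.', and must
    # start and end with an alphanumeric character.
    safe = re.sub(r"[^a-zA-Z0-9._-]", "-", str(value))
    return safe[:63].strip("-_.")


def workload_secret_labels(dag_id, task_id, run_id, map_index, ti_id) -> dict:
    # Labels suggested in the review thread: dag_id, task_id, run_id,
    # map_index, ti_id. Function name is hypothetical.
    return {
        "airflow-workload-secret": "true",
        "dag_id": make_safe_label_value(dag_id),
        "task_id": make_safe_label_value(task_id),
        "run_id": make_safe_label_value(run_id),
        "map_index": str(map_index),
        "ti_id": make_safe_label_value(ti_id),
    }
```

With these labels, the cleanup CronJob (or an operator debugging a stuck task) can select the right Secrets with an ordinary label selector.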

if isinstance(command[0], ExecuteTask):
    workload = command[0]
    command = workload_to_command_args(workload)
    secret_name = f"{WORKLOAD_SECRET_VOLUME_NAME}-{pod_id}"
Member:

Using a "volume name" here is a bit odd...

amoghrajesh (Author):

Yes, semantically you are right. I extracted that as a constant, WORKLOAD_SECRET_NAME, and used it instead.

amoghrajesh (Author):

Handled in 3d773ea

if secret_name:
    if pod.spec.volumes is None:
        pod.spec.volumes = []
    pod.spec.volumes.append(
Member:

We should do this in construct_pod so that the final pod is sent to pod_mutation_hook.

amoghrajesh (Author):

Done, handled it in 0b3d3e9
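For illustration, wiring the Secret into the pod in construct_pod (so the finished pod still flows through pod_mutation_hook) amounts to roughly the following, sketched over a plain-dict pod manifest with a hypothetical helper name:

```python
def attach_workload_volume(pod: dict, secret_name: str) -> dict:
    # Sketch: add a Secret-backed volume and mount it read-only at the fixed
    # path the worker reads via --json-path. Helper name and the plain-dict
    # manifest shape are assumptions; the PR works with V1Pod objects instead.
    spec = pod["spec"]
    spec.setdefault("volumes", []).append(
        {"name": "airflow-workload", "secret": {"secretName": secret_name}}
    )
    spec["containers"][0].setdefault("volumeMounts", []).append(
        {
            "name": "airflow-workload",
            "mountPath": "/run/secrets/airflow-workload",
            "readOnly": True,
        }
    )
    return pod
```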

        raise
    self._delete_workload_secret(f"airflow-workload-{pod_name}", namespace)

def _delete_workload_secret(self, secret_name: str, namespace: str) -> None:
Member:

We should instead patch the secret and set ownerReferences to the pod that is using it. k8s will then automatically delete the secret when the pod is deleted.

amoghrajesh (Author):

Hmm, interesting take, let me try to do that.

amoghrajesh (Author):

That actually made sense. I had to look up ownerRefs and it was totally worth it; handled it in 0cb0b19 by making a patch API call for the secret.
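The patch body in question could look roughly like this. The field names follow the Kubernetes API's ownerReference schema; the helper itself is hypothetical:

```python
def owner_reference_patch(pod_name: str, pod_uid: str) -> dict:
    # Sketch of the ownerReferences patch described above. Once applied
    # (e.g. via CoreV1Api.patch_namespaced_secret), Kubernetes garbage
    # collection deletes the Secret automatically when the owning pod is
    # deleted. The pod UID is only known after the pod is created, which is
    # why this is a patch rather than part of the initial Secret.
    return {
        "metadata": {
            "ownerReferences": [
                {
                    "apiVersion": "v1",
                    "kind": "Pod",
                    "name": pod_name,
                    "uid": pod_uid,
                }
            ]
        }
    }
```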

workload = command[0]
command = workload_to_command_args(workload)
secret_name = f"{WORKLOAD_SECRET_VOLUME_NAME}-{pod_id}"
self.kube_client.create_namespaced_secret(
Contributor:

There needs to be a clear error message, especially in the transitional phase when installs are upgraded but the credentials lack the permissions to create/delete secrets.

jscheffl (Contributor):

Thanks for the PR, I believe this is a good improvement.

Nevertheless, this also adds a bit of additional complexity for cases where one or multiple remote clusters are used to distribute workload. That means (1) when upgrading the provider, existing installs might run into a pitfall and need to upgrade permissions to allow adding/deleting secrets, so this needs to be considered when upgrading, especially for distributed setups; and (2) remote clusters might not grant the additional permissions, so we would likely get a lot of trouble reports.

(3) Considering there are people using a distributed K8s setup, I'd be a bit worried if a "remote K8s admin" were reluctant to grant create/delete secret permissions. Would there be a way to force-configure the legacy secret sharing via the pod manifest?

amoghrajesh (Author):

> We should also add secrets cleanup into cleanup-pods (maybe rename it?). With ownerReferences the window is small, but it does still exist.

Thanks, I had missed that flow (there are so many sometimes :)), added it in e7b3b5f

amoghrajesh (Author):

@jscheffl all valid concerns, I wonder what's the best way to handle them here.

  1. For upgrade paths, when people upgrade to the new version of this provider, the new code will try to create the secret and fail with a 403. Would adding something like "you need to update your RBAC" to the scheduler logs be a good way to handle this?
  2. About remote clusters, I'm unsure how to handle it. This will involve a remote cluster admin granting some more permissions (especially in secure environments), so I wonder what's the best way forward there.

In such cases, maybe the best option would be to fall back to the legacy way (using CLI args), perhaps via a flag or a new configuration, to keep migration smooth and not break usage.

Any thoughts @jedcunningham @jscheffl @potiuk ?
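One possible shape for point 1, sketched with a stand-in exception: the real code would catch the kubernetes client's ApiException with status 403, and the helper name and message text below are only suggestions for discussion.

```python
def create_secret_or_explain(create_fn, secret_body: dict):
    # Stand-in sketch: PermissionError plays the role of an HTTP 403
    # ApiException from the kubernetes client. The message is a draft of the
    # "you need to update your RBAC" hint discussed above.
    try:
        return create_fn(secret_body)
    except PermissionError as exc:
        raise PermissionError(
            "Creating the workload Secret failed with 403 Forbidden. After "
            "upgrading the cncf.kubernetes provider, the scheduler's RBAC "
            "role must also grant create/get/patch on 'secrets'; update the "
            "pod-launcher Role (or use the legacy CLI-args path, if a "
            "fallback is added)."
        ) from exc
```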

jscheffl (Contributor):

> 1. For upgrade paths, when people upgrade to the new version of this provider, the new code will try to create the secret and fail with a 403. Would adding something like "you need to update your RBAC" to the scheduler logs be a good way to handle this?
> 2. About remote clusters, I'm unsure how to handle it. This will involve a remote cluster admin granting some more permissions (especially in secure environments), so I wonder what's the best way forward there.

Regarding (2), I have no strong opinion. Just by the arguments: an automated fallback with a logged warning might be the "nicest", but a security researcher might then complain that such an error could start dropping secrets to the CLI. It might be a configurable fallback?

Regarding (1), in theory it could follow whatever we decide for (2)?

amoghrajesh (Author):

Yeah, I think a configurable fallback to the "CLI" way of doing things might be the safest path forward in terms of compatibility. We should make it clear in the warnings that RBAC needs to be updated, though. Let me wait for others to chime in here too.


Labels

area:helm-chart (Airflow Helm Chart), area:providers, provider:cncf-kubernetes (Kubernetes (k8s) provider related issues)
