
Unify executor workload queues with tier-based scheduling#63482

Closed
anishgirianish wants to merge 1 commit into apache:main from anishgirianish:workload-queue-refactor

Conversation

@anishgirianish
Contributor


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

Summary

Refactors executor workload queue management for extensibility. No behavioral change: scheduling order, slot accounting, and all provider executors work identically to before.

Follows the direction proposed by @ferruzzi #62343 (comment).

Problem

Adding a new workload type (like ExecuteCallback or TestConnection) required touching ~6 places in BaseExecutor: a new queue dict, a new supports_* flag, slots calculation, an isinstance branch in queue_workload, a dedicated scheduling method, and isinstance branches in dequeue/trigger logic. Each provider executor that overrode queue_workload also needed updating. This made extending the executor interface unnecessarily painful.

What this does

Replaces the per-type queue dicts and boolean capability flags with three simple primitives:

  • executor_queues: a single dict keyed by workload type string (e.g. "ExecuteTask", "ExecuteCallback") instead of separate queued_tasks, queued_callbacks, queued_connection_tests dicts
  • supported_workload_types: a frozenset of type strings instead of individual supports_callbacks, supports_connection_test booleans
  • WorkloadQueueDef(scheduling_tier, sort_key), a small NamedTuple on each workload class that controls scheduling priority. Callbacks get tier 0, tasks get tier 1, same order as before, just explicit now.
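
A minimal sketch of how these three primitives might fit together. The names `executor_queues`, `supported_workload_types`, `WorkloadQueueDef`, `scheduling_tier`, and `sort_key` come from the PR description; the surrounding class scaffolding and the `priority_weight` sort key are illustrative assumptions, not the PR's actual code:

```python
from collections import defaultdict
from typing import NamedTuple


class WorkloadQueueDef(NamedTuple):
    """Scheduling metadata attached to each workload class."""

    scheduling_tier: int  # lower tier schedules first: callbacks=0, tasks=1
    sort_key: str  # attribute used to order workloads within a tier


class ExecuteCallback:
    key = "ExecuteCallback"
    queue_def = WorkloadQueueDef(scheduling_tier=0, sort_key="id")


class ExecuteTask:
    key = "ExecuteTask"
    # `priority_weight` is an assumed sort key for illustration only.
    queue_def = WorkloadQueueDef(scheduling_tier=1, sort_key="priority_weight")


class BaseExecutor:
    # A frozenset of type strings replaces the per-capability booleans.
    supported_workload_types = frozenset({"ExecuteTask"})

    def __init__(self):
        # One dict keyed by workload-type string replaces the per-type dicts.
        self.executor_queues: dict[str, dict] = defaultdict(dict)
```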

The base class queue_workload is now generic: validate the type, store by key. Four provider executors (K8s, ECS, Batch, Lambda) no longer need their own queue_workload overrides. trigger_tasks becomes trigger_workloads since it handles all workload types now.
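The "validate the type, store by key" path could look roughly like the following. This is a hedged sketch: the instance-level `id` attribute and the exact error raised are assumptions for illustration, not the PR's implementation:

```python
from collections import defaultdict
from dataclasses import dataclass


class BaseExecutor:
    supported_workload_types = frozenset({"ExecuteTask", "ExecuteCallback"})

    def __init__(self):
        self.executor_queues = defaultdict(dict)

    def queue_workload(self, workload):
        # Validate the type string against the executor's declared capabilities...
        if workload.key not in self.supported_workload_types:
            raise ValueError(f"{type(self).__name__} does not support {workload.key!r}")
        # ...then store by key. No isinstance branching per workload type.
        self.executor_queues[workload.key][workload.id] = workload


@dataclass
class ExecuteTask:
    id: str
    key: str = "ExecuteTask"
```

Because the generic path never inspects concrete classes, provider executors only need to declare which type strings they support.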

Adding a new workload type after this refactor

  1. Define key and queue_def on the workload dataclass
  2. Add the type string to supported_workload_types on supporting executors
  3. Handle the type in _process_workloads

No changes needed in BaseExecutor itself.
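The three steps above could be sketched as follows for a hypothetical TestConnection workload. The minimal `WorkloadQueueDef` and `BaseExecutor` stand-ins and the `_process_workloads` body are assumptions made to keep the example self-contained:

```python
from dataclasses import dataclass
from typing import ClassVar, NamedTuple


class WorkloadQueueDef(NamedTuple):
    scheduling_tier: int
    sort_key: str


class BaseExecutor:
    supported_workload_types = frozenset({"ExecuteTask"})


# Step 1: define `key` and `queue_def` on the workload dataclass.
@dataclass
class TestConnection:
    id: str
    key: ClassVar[str] = "TestConnection"
    queue_def: ClassVar[WorkloadQueueDef] = WorkloadQueueDef(scheduling_tier=0, sort_key="id")


# Step 2: add the type string to supported_workload_types on supporting executors.
class MyExecutor(BaseExecutor):
    supported_workload_types = BaseExecutor.supported_workload_types | {"TestConnection"}

    # Step 3: handle the type where workloads are actually processed.
    def _process_workloads(self, workloads):
        results = []
        for w in workloads:
            if w.key == "TestConnection":
                results.append(f"tested {w.id}")
        return results
```

Note that BaseExecutor itself is untouched; everything lives on the workload class and the concrete executor.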


  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding a dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

@boring-cyborg boring-cyborg bot added area:Executors-core LocalExecutor & SequentialExecutor area:providers provider:amazon AWS/Amazon - related issues provider:celery provider:cncf-kubernetes Kubernetes (k8s) provider related issues provider:edge Edge Executor / Worker (AIP-69) / edge3 labels Mar 12, 2026
```python
self.team_name: str | None = team_name
self.queued_tasks: dict[TaskInstanceKey, workloads.ExecuteTask] = {}
self.queued_callbacks: dict[str, workloads.ExecuteCallback] = {}
self.executor_queues: dict[str, dict] = defaultdict(dict)
```
Contributor


Sorry, this is not possible. This breaks compatibility between providers and core.

Contributor Author


@jscheffl Thank you very much for the quick heads-up. Closing this PR as I messed up while rebasing; I will create a new PR with that in mind. Thank you!

Contributor


@jscheffl - I really like the direction this PR was going and it's a huge step toward simplifying the scheduler. Once he fixes the rebase, are you willing to chat about what he needs to do to make this back-compat?

I think we can work something out where the old parameters are retained and flagged as deprecated, and the new executor_queues is assembled from them if they are present.

Contributor


I am not sure and have not taken a deep look into the potential simplification, but you are renaming fields in the BaseExecutor class which all executors inherit from.

Yes, the fields are adjusted in the PR for the in-tree executors. But you can mix versions of providers: you can upgrade Airflow core and leave providers at old versions (actually this is a best practice when upgrading, to keep triage simple), and you can also upgrade providers while keeping the same core version. In either of these two cases the renamed fields break the executors. So renaming the fields (and the semantics of how the fields are used) is breaking the (public) interface for executors.

Contributor

@ferruzzi Mar 12, 2026


He actually had a better solution already, I'll see you over on the new PR (#63491) when you get time. He's using an @property to cover that backcompat. I think that should work.
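
A property-based back-compat shim of the kind referenced above might look roughly like this. This is a hypothetical sketch, not the code from the follow-up PR; the deprecation message and warning mechanics are assumptions:

```python
import warnings
from collections import defaultdict


class BaseExecutor:
    def __init__(self):
        self.executor_queues = defaultdict(dict)

    @property
    def queued_tasks(self):
        """Deprecated alias: the old field name resolves to the unified dict."""
        warnings.warn(
            "queued_tasks is deprecated; use executor_queues['ExecuteTask']",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.executor_queues["ExecuteTask"]
```

Older provider executors that read `self.queued_tasks` would keep working against the same underlying storage, while emitting a deprecation warning that gives maintainers time to migrate.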

@anishgirianish anishgirianish marked this pull request as draft March 12, 2026 21:46
