-
Notifications
You must be signed in to change notification settings - Fork 101
💥 Add explicit Worker configuration for type (workflow/activity/nexus) #1059
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Tuner configuration allows conflicting settings.
The validation check for mutual exclusivity between tuner and max_outstanding_* fields doesn't include max_outstanding_nexus_tasks. This allows users to set both a tuner and max_outstanding_nexus_tasks, which should be mutually exclusive like the other max_outstanding_* fields, potentially causing configuration conflicts.
crates/common/src/worker.rs#L354-L362
sdk-core/crates/common/src/worker.rs
Lines 354 to 362 in 07538f4
| return Err( | |
| "Legacy build id-based versioning must have a non-empty build_id" | |
| .to_owned(), | |
| ); | |
| } | |
| } | |
| } | |
| } | |
crates/sdk-core/src/worker/mod.rs
Outdated
| future::pending::<()>().await; | ||
| unreachable!() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are in activity-only mode, I think this could hang shutdown. If non_local_activities_complete is set true up at line 867, in the next poll we'll end up inside the pending at 859 and be stuck there forever because local_activities_complete was never set true.
I think we need to set it true in this branch, probably.
At minimum, we should verify that shutting down an activity-only worker works properly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch, added unit and integration tests testing this shutdown
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Configuration conflict bypasses validation.
The validation check for mutual exclusivity between tuner and max_outstanding_* fields doesn't include max_outstanding_nexus_tasks. This allows users to set both a tuner and max_outstanding_nexus_tasks, which violates the documented constraint that these fields are mutually exclusive. The check should include self.max_outstanding_nexus_tasks.is_some() in the condition.
crates/common/src/worker.rs#L369-L373
sdk-core/crates/common/src/worker.rs
Lines 369 to 373 in 85ca74b
| { | |
| return Err("max_outstanding_* fields are mutually exclusive with `tuner`".to_owned()); | |
| } | |
| if let Some(wv) = self.versioning_strategy.as_ref() { |
| if let Some(la_mgr) = &self.local_act_mgr | ||
| && self | ||
| .handle_la_complete_action(la_mgr.complete(&task_token, la_res)) | ||
| .is_some() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bug: Inconsistent Local Activity Error Handling
The complete_local_act function silently succeeds when local_act_mgr is None (workflows disabled), but this is inconsistent with how other task completions are handled. When attempting to complete an activity task with a local activity task token on a worker without workflows enabled, it should return an error like CompleteActivityError::ActivityNotEnabled rather than silently doing nothing, matching the error handling pattern used for regular activities and other task types.
| let worker = starter.get_worker().await; | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we'll want to provide a variant of this test where there is one task of each of the enabled types that gets polled for and completed, just to make sure that once the machinery is active shutdown still works.
Sorry, I know it's a bit annoying but we've got to be super careful with the shutdown stuff.
You can also potentially use rstest to help with the parameterization here
| pub max_cached_workflows: u32, | ||
| pub tuner: TunerHolder, | ||
| pub no_remote_activities: bool, | ||
| pub task_types: u8, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Over the C boundary, I think it's definitely clearer to just use three bools instead of having these magic bitmask integers the SDK will hardcode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ended up making a struct of bools
| /// | ||
| /// Note: At least one task type must be specified or the worker will fail validation. | ||
| #[builder(default = "worker_task_types::all()")] | ||
| pub task_types: WorkerTaskTypes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figure just three bools is simplest, even if you have to make a hash internally, but I guess not a big deal. Also at this point it could be argued it shouldn't have a default and should be required to be set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanna say having default be all by default matches existing behavior today, but then again not sure it matters much since it's not user facing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong opinion on whether we should have a default since every SDK needs to start setting this anyways and the default therefore will never be used if done properly by SDKs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fair point, I can remove the default to force SDKs to be explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I'm ok w/ defaulted too, just that I hope SDKs will ignore that and always set.
| /// | ||
| /// Note: At least one task type must be specified or the worker will fail validation. | ||
| #[builder(default = "worker_task_types::all()")] | ||
| pub task_types: WorkerTaskTypes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there somewhere in the client code that has to be altered to properly account for this as part of the worker dedupe check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Planning on keeping that PR separate, #1059, since that feature introduces different behavior, whereas this PR introduces a new feature, but maintains existing behavior, and to make review easier
cretz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not review rest of logic beyond just the lang-facing structure
| uint32_t max_cached_workflows; | ||
| struct TemporalCoreTunerHolder tuner; | ||
| bool no_remote_activities; | ||
| struct TemporalCoreWorkerTaskTypes task_types; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't need to be a whole separate struct IMO just for the few fields within, but it's mostly harmless
| /// | ||
| /// Note: At least one task type must be specified or the worker will fail validation. | ||
| #[builder(default = "worker_task_types::all()")] | ||
| pub task_types: WorkerTaskTypes, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No strong opinion on whether we should have a default since every SDK needs to start setting this anyways and the default therefore will never be used if done properly by SDKs
| #[case::activity_and_nexus_idle(false, true, true, false, "activity-nexus-idle")] | ||
| #[tokio::test] | ||
| async fn test_task_type_combinations_unified( | ||
| #[case] enable_workflows: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could've used #[values] here to have it do the combination matrix for you, but, this is fine.
What was changed
Now when registering a worker, users can explicitly specify workflow/nexus/activity workers, instead of always waiting for polls for workflow/nexus workers.
💥 Removed the
no_remote_activitiesin favor of this new WorkerConfig optionWhy?
This came up in #1058, to allow for multiple workers with different types to be registered to the same TQ.
This should also simplify shutdown, and make things a little more efficient.
Checklist
Closes
How was this tested:
Added some tests
Note
Introduce
WorkerTaskTypesto selectively enable polling for workflows, activities, and nexus; removeno_remote_activities; update worker behavior, errors, C-bridge, and tests accordingly.WorkerTaskTypes(all,workflow_only,activity_only,nexus_only) and replaceWorkerConfig.no_remote_activitieswithtask_types.max_cached_workflowsonly when workflows enabled; tighten poller/cache constraints.task_types; workflows/local activities/nexus are optional.ShutDownwhen polling disabled.WorkflowNotEnabled,ActivityNotEnabled,NexusNotEnabledfor inappropriate completions.TemporalCoreWorkerTaskTypes; map to coreWorkerTaskTypes.task_types.task_typesand new behavior.Written by Cursor Bugbot for commit 5deef19. This will update automatically on new commits. Configure here.