[slice] Cluster scoped #827
Conversation
slice/internal/core/slice.go
Outdated
  ObjectMeta: metav1.ObjectMeta{
-     Name:      SliceName(wl.Name, podSetName),
-     Namespace: wl.Namespace,
+     Name:      SliceName(wl.Name, podSetName),
What if we have workloads with the same names and PodSet names in different namespaces? Should we include the namespace in the name?
Please also add a test case for this scenario.
I added a test, "should handle two JobSets with the same name in different namespaces", which passes without any additional changes. The thing is that each workload gets a different name, because it is built using a hash of the JobSet's UID: https://github.com/kubernetes-sigs/kueue/blob/36e330ef5aa07ec20157cd523805cd173dac488b/pkg/controller/jobframework/workload_names.go#L52.
Do you think it's possible to construct a test in which two workloads have the same name?
Yes, what if the user creates a prebuilt workload?
Oh, I see. I think you are right. So I prepended the namespace to the slice name to fix the potential name collision, and added a unit test, "should create a slice for another workload with the same name but in a different namespace", that should cover this scenario. Please take a look.
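For illustration, the fix could look roughly like the sketch below. The signature, separator, and lack of length/truncation handling are assumptions, not the actual implementation in slice/internal/core/slice.go.

```go
package core

import "fmt"

// SliceName builds a cluster-unique name for a cluster-scoped Slice by
// prepending the workload's namespace, so workloads with the same name and
// PodSet name in different namespaces no longer collide.
// Sketch only: the real helper may order or truncate the parts differently.
func SliceName(namespace, workloadName, podSetName string) string {
	return fmt.Sprintf("%s-%s-%s", namespace, workloadName, podSetName)
}
```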
Previously, we didn't need to handle removing slices when a workload was deleted, because the ownerReference took care of it. Now that the Slice is cluster-scoped, I think we should add this logic to the reconciler to handle workload deletion properly.
If the workload is deleted, then this is already handled here: xpk/slice/internal/controller/workload_controller.go, lines 109 to 112 (at cf114f1).
I can't think of a scenario that requires additional cleanup (also, none of our e2e tests fails with this change). What am I missing?
Before this, we try to get the workload, and if it doesn't exist, we skip the reconciliation. But that probably shouldn't be a problem because we have a finalizer, right?
I think so too. We remove the finalizer only after performing the cleanup, so the workload cannot be deleted while the slices still exist.
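As a rough illustration of the pattern discussed above, a finalizer-based cleanup in the workload reconciler could look like this. It is a sketch only: the reconciler fields, finalizer name, and `deleteSlicesFor` helper are hypothetical, not the code in workload_controller.go.

```go
package controller

import (
	"context"

	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
	kueue "sigs.k8s.io/kueue/apis/kueue/v1beta1"
)

const sliceFinalizer = "slice.accelerator.gke.io/cleanup" // hypothetical finalizer name

type WorkloadReconciler struct {
	client.Client
}

func (r *WorkloadReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	var wl kueue.Workload
	if err := r.Get(ctx, req.NamespacedName, &wl); err != nil {
		// The workload is already gone; nothing left to reconcile.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	if !wl.DeletionTimestamp.IsZero() {
		// The workload is being deleted: remove the cluster-scoped slices
		// first, then drop the finalizer so the deletion can complete.
		if err := r.deleteSlicesFor(ctx, &wl); err != nil {
			return ctrl.Result{}, err
		}
		controllerutil.RemoveFinalizer(&wl, sliceFinalizer)
		return ctrl.Result{}, r.Update(ctx, &wl)
	}

	// Normal path: make sure the finalizer is present before slices are created.
	if controllerutil.AddFinalizer(&wl, sliceFinalizer) {
		if err := r.Update(ctx, &wl); err != nil {
			return ctrl.Result{}, err
		}
	}
	return ctrl.Result{}, nil
}

// deleteSlicesFor is a placeholder: e.g. list slices by the owner-workload
// annotations and delete them.
func (r *WorkloadReconciler) deleteSlicesFor(ctx context.Context, wl *kueue.Workload) error {
	return nil
}
```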
  s.Annotations["slice.accelerator.gke.io/owner-workload-name"] = name
  s.Annotations["slice.accelerator.gke.io/owner-workload-namespace"] = ns
Let's use const vals here
Done.
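The resulting constants could look roughly like this; the constant names are hypothetical, only the annotation keys come from the diff above.

```go
// Annotation keys recording which workload owns a cluster-scoped Slice.
const (
	OwnerWorkloadNameAnnotation      = "slice.accelerator.gke.io/owner-workload-name"
	OwnerWorkloadNamespaceAnnotation = "slice.accelerator.gke.io/owner-workload-namespace"
)

// Usage:
//   s.Annotations[OwnerWorkloadNameAnnotation] = name
//   s.Annotations[OwnerWorkloadNamespaceAnnotation] = ns
```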
Overall LGTM
Description
The Slice CR has been changed from namespaced to cluster-scoped. This PR updates the code to align with that change.
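For context, the scope change itself is typically expressed with a kubebuilder marker on the CRD type, roughly as in the sketch below. The package and Spec/Status field names are assumptions, not the exact Slice definition from this repository.

```go
package v1alpha1

import metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

// +kubebuilder:object:root=true
// +kubebuilder:resource:scope=Cluster

// Slice is cluster-scoped, so it has no namespace of its own; the owning
// workload's namespace is carried in the slice name and annotations instead.
type Slice struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   SliceSpec   `json:"spec,omitempty"`
	Status SliceStatus `json:"status,omitempty"`
}

type SliceSpec struct{}   // fields omitted in this sketch
type SliceStatus struct{} // fields omitted in this sketch
```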
Issue
Testing