OCPBUGS-65621: add dedicated service account to crb, cvo and version pod #1266
Conversation
Walkthrough
The changes refactor service account configuration for the cluster-version-operator. A ClusterRoleBinding granting cluster-admin to the default service account is removed, replaced by two new named ServiceAccounts (cluster-version-operator and update-payload) with corresponding RBAC bindings. Pod and deployment manifests are updated to reference the new named service accounts.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~22 minutes
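Based on that summary, the two new ServiceAccounts would look roughly like this. This is a sketch reconstructed from the walkthrough, not the PR's actual manifest; only the names and namespace come from the summary above:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  # Used by the CVO deployment instead of the default SA.
  name: cluster-version-operator
  namespace: openshift-cluster-version
---
apiVersion: v1
kind: ServiceAccount
metadata:
  # Used by the version/update-payload pod.
  name: update-payload
  namespace: openshift-cluster-version
```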
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: ehearne-redhat
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing /approve in a comment.
/retest
/test e2e-aws-ovn-upgrade
/test e2e-aws-ovn-techpreview
/retest
/retest
@ehearne-redhat: This pull request references Jira Issue OCPBUGS-65621, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/jira refresh
@ehearne-redhat: This pull request references Jira Issue OCPBUGS-65621, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact.
@ehearne-redhat: This pull request references Jira Issue OCPBUGS-65621, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact.
/retest
/retest
/retest
/test e2e-aws-ovn-techpreview
/test okd-scos-images
install/0000_90_cluster-version-operator_02_roles-cluster-version-operator.yaml (resolved)
install/0000_00_cluster-version-operator_02_service_account.yaml (outdated, resolved)
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-version-operator-1
Redundant vs. the cluster-version-operator you declare down at the end of this file?
Actually, this one is interesting. In order to get the into-change and out-of-change tests to pass, these two CRBs play different roles.
The cluster-version-operator CRB is for the out-of-change test. That test looks for this binding so the default service account has the permissions it needs to conduct itself.
The cluster-version-operator-1 CRB is for the into-change test. This binding ensures the new cluster-version-operator service account has the necessary permissions to conduct itself.
Without the cluster-version-operator CRB, the default service account appears to lack the necessary permissions and fails, probably because of the naming of the CRBs. This is important for the out-of-change test to pass.
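The two bindings described in that comment differ only in name and subject. A sketch reconstructed from the discussion (annotations omitted; the actual file may differ):

```yaml
# Out-of-change test: binds the legacy default SA.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-version-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: openshift-cluster-version
---
# Into-change test: binds the new dedicated SA.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-version-operator-1
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: cluster-version-operator
  namespace: openshift-cluster-version
```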
  namespace: openshift-cluster-version
roleRef:
  kind: ClusterRole
  name: cluster-admin
Update-payload Pod doesn't make Kube API calls at all, so I don't think we need this cluster-version-operator-payload ClusterRoleBinding.
I've commented it out to test this. If it proves true in testing, I'll remove it entirely.
    k8s-app: cluster-version-operator
spec:
  automountServiceAccountToken: false
  serviceAccountName: cluster-version-operator
Can you either add this to the bootstrap manifest too, or have a commit message that mentions why we don't need a service account for that bootstrap manifest?
In that vein, you might want to reshuffle your existing commit stack to tell the transformation story in a more narrative arc. It is completely fine to use a bunch of commits if you need more space to talk about each pivot in a series. But at the moment there are things like fc55fa5, which sounds like useful context to include in a "why I did things this way..." commit message on the commit that adds the new role bindings. I don't see a benefit to keeping it completely separate, vs. having a single commit that brings in the finished role bindings and then explains all the context you need to explain that finished shape.
I've added it to the bootstrap manifest. I'll wait to see how tests behave before squashing the commits into a narrative commit, or a collection of them depending.
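A rough sketch of what wiring the dedicated service account into a bootstrap Pod manifest could look like. The surrounding fields and the image here are illustrative assumptions, not the PR's actual bootstrap manifest; only the SA name and namespace come from the diff above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: bootstrap-cluster-version-operator  # hypothetical name
  namespace: openshift-cluster-version
spec:
  # Dedicated SA instead of the implicit default SA, mirroring
  # the deployment change in this PR.
  serviceAccountName: cluster-version-operator
  automountServiceAccountToken: false
  containers:
  - name: cluster-version-operator
    image: registry.example/cvo:latest  # placeholder image
```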
/retest
/retest
Actionable comments posted: 1
In install/0000_00_cluster-version-operator_03_roles.yaml, around lines 31-45: the ClusterRoleBinding manifest is missing the required roleRef.apiGroup field. Update the ClusterRoleBinding (metadata.name: cluster-version-operator) to add roleRef.apiGroup: rbac.authorization.k8s.io alongside the existing roleRef.kind: ClusterRole and roleRef.name: cluster-admin, so the roleRef block is valid for Kubernetes RBAC. This affects the ClusterRoleBinding that grants cluster-admin to the ServiceAccount named default in the openshift-cluster-version namespace.
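Per that suggestion, a complete and valid binding would look like the following sketch (annotations omitted; names taken from the review comment):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-version-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io  # required in roleRef for RBAC v1
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: default
  namespace: openshift-cluster-version
```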
Force-pushed a1352a4 to aca5426.
Actionable comments posted: 2
In install/0000_90_cluster-version-operator_02_roles.yaml:
- Around lines 16-30: uncomment and enable the ClusterRoleBinding for the update-payload ServiceAccount (named cluster-version-operator-payload), or replace it with a least-privileged binding: restore the commented block that binds the ServiceAccount update-payload in namespace openshift-cluster-version to the ClusterRole cluster-admin (or create a new ClusterRole/Role with minimal permissions and reference it instead), ensure the include.release.openshift.io/self-managed-high-availability annotation remains if required, and verify there are no duplicate bindings elsewhere for update-payload.
- Around lines 31-46: remove the legacy ClusterRoleBinding that grants cluster-admin to the default ServiceAccount: delete the ClusterRoleBinding named cluster-version-operator whose roleRef is ClusterRole cluster-admin and whose subject is the ServiceAccount default in namespace openshift-cluster-version (the block currently annotated with release.openshift.io/delete: "true"). This proactively cleans up the legacy artifact, since the deployment uses the dedicated cluster-version-operator ServiceAccount and its proper binding.
# apiVersion: rbac.authorization.k8s.io/v1
# kind: ClusterRoleBinding
# metadata:
#   name: cluster-version-operator-payload
#   annotations:
#     include.release.openshift.io/self-managed-high-availability: "true"
# subjects:
# - kind: ServiceAccount
#   name: update-payload
#   namespace: openshift-cluster-version
# roleRef:
#   kind: ClusterRole
#   name: cluster-admin
#   apiGroup: rbac.authorization.k8s.io
# ---
Enable RBAC binding for the update-payload ServiceAccount.
This block is commented out (lines 16-30), so the update-payload SA has no ClusterRoleBinding here, which conflicts with the PR objective of a dedicated SA for version pods. If the payload pod needs cluster-level access, this will break. Please confirm the intent and enable the binding (or replace it with least-privileged RBAC).
🛠️ Proposed fix (uncomment the binding)
-# apiVersion: rbac.authorization.k8s.io/v1
-# kind: ClusterRoleBinding
-# metadata:
-# name: cluster-version-operator-payload
-# annotations:
-# include.release.openshift.io/self-managed-high-availability: "true"
-# subjects:
-# - kind: ServiceAccount
-# name: update-payload
-# namespace: openshift-cluster-version
-# roleRef:
-# kind: ClusterRole
-# name: cluster-admin
-# apiGroup: rbac.authorization.k8s.io
-# ---
+apiVersion: rbac.authorization.k8s.io/v1
+kind: ClusterRoleBinding
+metadata:
+ name: cluster-version-operator-payload
+ annotations:
+ include.release.openshift.io/self-managed-high-availability: "true"
+subjects:
+- kind: ServiceAccount
+ name: update-payload
+ namespace: openshift-cluster-version
+roleRef:
+ kind: ClusterRole
+ name: cluster-admin
+ apiGroup: rbac.authorization.k8s.io
+---

If you want to verify whether another binding already exists, you can run:
#!/bin/bash
# Look for any RBAC bindings referencing update-payload
rg -n -C2 'update-payload' -g '*.yaml'
rg -n -C2 'cluster-version-operator-payload' -g '*.yaml'
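As noted earlier in the review, the update-payload pod reportedly makes no Kube API calls at all, so one least-privileged alternative to a cluster-admin binding is to keep the ServiceAccount but disable token automount and bind nothing to it. This is a hedged sketch of that alternative, not the PR's actual change:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: update-payload
  namespace: openshift-cluster-version
# With token automount disabled, no credentials are mounted into the
# pod, so no Role/ClusterRole binding is required for this SA.
automountServiceAccountToken: false
```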
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: cluster-version-operator
  annotations:
    kubernetes.io/description: Grant the cluster-version operator permission to perform cluster-admin actions while managing the OpenShift core.
    include.release.openshift.io/self-managed-high-availability: "true"
    release.openshift.io/delete: "true"
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  namespace: openshift-cluster-version
  name: default
🧩 Analysis chain
🏁 Scripts executed:
# Find the file mentioned in the review
find . -name "*cluster-version-operator*" -type f | head -20
# List files in the install directory to understand structure
ls -la install/ 2>/dev/null | head -30
# Check if the exact file exists
cat -n install/0000_90_cluster-version-operator_02_roles.yaml 2>/dev/null | head -60
# Check if there are ServiceAccount definitions in the directory
grep -r "kind: ServiceAccount" install/ | head -10
# Look for other references to "default" ServiceAccount in cluster-version context
rg "default" install/ -A2 -B2 | head -40
# Check git history or comments to understand the intent
head -5 install/0000_90_cluster-version-operator_02_roles.yaml
# Check the service account definitions to understand which is actually used
cat -n install/0000_00_cluster-version-operator_02_service_account.yaml
# Check deployment to see which ServiceAccount it uses
grep -A5 -B5 "serviceAccountName\|serviceAccount:" install/0000_00_cluster-version-operator_03_deployment.yaml
The default ServiceAccount binding is already marked for deletion.
This binding (lines 31–46) does grant cluster-admin to the default SA, but the release.openshift.io/delete: "true" annotation marks it as a legacy artifact scheduled for deletion during cluster upgrades. The actual deployment uses the dedicated cluster-version-operator ServiceAccount (see deployment.yaml and the proper binding at lines 1–14), so this is not an active security issue. If you want to clean up the legacy binding proactively, removing it is safe, but it's not a security regression since automated deletion markers are in place.
/retest
Force-pushed aca5426 to 0b2dc38.
@ehearne-redhat: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests.
Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
What
openshift-cluster-version namespace.
Why