Fix docs usability and known issues #355

Merged
a-mccarthy merged 5 commits into NVIDIA:main from a-mccarthy:lifecycle-updates on Mar 4, 2026

@a-mccarthy (Collaborator) commented Mar 2, 2026

This PR includes:

  • a known issue for 25.10.1
  • a lifecycle terminology update
  • more prominent links to the OpenShift docs, especially on the install page

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>

github-actions bot commented Mar 2, 2026

Documentation preview

https://nvidia.github.io/cloud-native-docs/review/pr-355

Comment on lines +89 to +92
* When using RKE2 on RHEL 9.6 with SELinux enforcing mode enabled, the MIG Manager will not install correctly.
This happens because the GPU Operator doesn't have the correct permissions to read from the feature file directly on nodes in the cluster.
This prevents the GPU Operator from correctly evaluating if a node has MIG-capable GPUs on it and whether the GPU Operator should deploy the MIG Manager.
Work around this issue by enabling NVIDIA GPU Feature Discovery to use the Node Feature API by default in ClusterPolicy:
Contributor

Suggested change
* When deploying the GPU Operator on systems with SELinux in enforcing mode, the MIG Manager will not get scheduled to GPU nodes. This is because the GPU Feature Discovery pod, which adds a label to nodes indicating if a MIG-capable GPU is present, fails to label nodes due to insufficient permissions on Node Feature Discovery's feature file drop-in directory. Work around this issue by configuring NVIDIA GPU Feature Discovery to use the Node Feature API instead of feature files in ClusterPolicy:

Contributor

Here is my suggestion for how to word this. Feel free to push back.

Collaborator Author

@cdesiniotis thanks! I made some updates here. Can you take another look?

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
@a-mccarthy changed the title from "known issue for 25.10.1 and lifecycle terminology update" to "Fix docs usability and known issues" on Mar 3, 2026
Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
@a-mccarthy merged commit a2edf56 into NVIDIA:main on Mar 4, 2026
2 checks passed