From 293a03ca401c7a3768fe4b5ca30baaaa0c64ff2e Mon Sep 17 00:00:00 2001
From: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
Date: Sun, 1 Mar 2026 21:55:18 -0500
Subject: [PATCH 1/5] known issue for 25.10.1 and lifecycle terminology update

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
---
 gpu-operator/life-cycle-policy.rst |  6 +++---
 gpu-operator/release-notes.rst     | 17 +++++++++++++++++
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/gpu-operator/life-cycle-policy.rst b/gpu-operator/life-cycle-policy.rst
index dcb629ee6..f22b2f86b 100644
--- a/gpu-operator/life-cycle-policy.rst
+++ b/gpu-operator/life-cycle-policy.rst
@@ -56,13 +56,13 @@ The product life cycle and versioning are subject to change in the future.
      - Status

    * - 25.10.x
-     - Generally Available
+     - Supported

    * - 25.3.x
-     - Maintenance
+     - Deprecated

    * - 24.9.x and lower
-     - EOL
+     - End of Support

 .. _operator-component-matrix:

diff --git a/gpu-operator/release-notes.rst b/gpu-operator/release-notes.rst
index bd4634d96..7422126c7 100644
--- a/gpu-operator/release-notes.rst
+++ b/gpu-operator/release-notes.rst
@@ -83,6 +83,23 @@ Fixed Issues
 * Fixed a bug where the k8s-driver-manager would wait indefinitely when MOFED is enabled and ``USE_HOST_MOFED`` is set to true
   despite the MOFED being pre-installed on the host.

+Known Issues
+------------
+
+* When using RKE2 on RHEL 9.6 with SELinux enforcing mode enabled, the MIG Manager will not install correctly.
+  This happens because the GPU Operator doesn't have the correct permissions to read from the feature file directly on nodes in the cluster.
+  This prevents the GPU Operator from correctly evaluating if a node has MIG-capable GPUs on it and whether the GPU Operator should deploy the MIG Manager.
+  Workaround this issue by enabling NVIDIA GPU Feature Discovery to use the Node Feature API by default in ClusterPolicy:
+
+  .. code-block:: yaml
+
+     gfd:
+       env:
+         - name: USE_NODE_FEATURE_API
+           value: "true"
+
+
+
 .. _v25.10.0:

 25.10.0

From ecba520c0bcc5081a3c77bbe6bbdaa2e07d15782 Mon Sep 17 00:00:00 2001
From: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
Date: Tue, 3 Mar 2026 10:14:39 -0500
Subject: [PATCH 2/5] update release note, update full lifecycle section

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
---
 gpu-operator/life-cycle-policy.rst | 5 ++---
 gpu-operator/release-notes.rst     | 7 +++----
 2 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/gpu-operator/life-cycle-policy.rst b/gpu-operator/life-cycle-policy.rst
index f22b2f86b..69f71f832 100644
--- a/gpu-operator/life-cycle-policy.rst
+++ b/gpu-operator/life-cycle-policy.rst
@@ -39,9 +39,8 @@ Patch releases typically include critical bug and CVE fixes, but can include min
 NVIDIA GPU Operator Life Cycle
 ******************************

-When a major version of NVIDIA GPU Operator is released, the previous major version enters maintenance support
-and only receives patch release updates for critical bug and CVE fixes.
-All prior major versions enter end-of-life (EOL) and are no longer supported and do not receive patch release updates.
+When a new major version of NVIDIA GPU Operator is released, the previous major version enters deprecated support and only receives patch release updates for critical bug and CVE fixes.
+All prior major versions enter end of support and are no longer supported and do not receive patch release updates.

 The product life cycle and versioning are subject to change in the future.

diff --git a/gpu-operator/release-notes.rst b/gpu-operator/release-notes.rst
index 7422126c7..4a468697c 100644
--- a/gpu-operator/release-notes.rst
+++ b/gpu-operator/release-notes.rst
@@ -86,10 +86,9 @@ Fixed Issues
 Known Issues
 ------------

-* When using RKE2 on RHEL 9.6 with SELinux enforcing mode enabled, the MIG Manager will not install correctly.
-  This happens because the GPU Operator doesn't have the correct permissions to read from the feature file directly on nodes in the cluster.
-  This prevents the GPU Operator from correctly evaluating if a node has MIG-capable GPUs on it and whether the GPU Operator should deploy the MIG Manager.
-  Workaround this issue by enabling NVIDIA GPU Feature Discovery to use the Node Feature API by default in ClusterPolicy:
+* When deploying the GPU Operator on systems with SELinux in enforcing mode, the MIG Manager does not get scheduled on GPU nodes.
+  This happens because the GPU Feature Discovery pod has insufficient permissions on Node Feature Discovery's feature-file drop-in directory, so it cannot add the label that indicates a MIG-capable GPU is present.
+  To work around this issue, configure NVIDIA GPU Feature Discovery to use the Node Feature API instead of feature files in ClusterPolicy:

   .. code-block:: yaml


From 1f788aab54a96711067c2c5a318b8abcbc930229 Mon Sep 17 00:00:00 2001
From: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
Date: Tue, 3 Mar 2026 14:44:16 -0500
Subject: [PATCH 3/5] Highlight openshift docs better

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
---
 gpu-operator/getting-started.rst |  5 +++++
 gpu-operator/index.rst           |  7 +++++++
 gpu-operator/overview.rst        | 15 ++++++---------
 3 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/gpu-operator/getting-started.rst b/gpu-operator/getting-started.rst
index 86d59c069..d9b932a08 100644
--- a/gpu-operator/getting-started.rst
+++ b/gpu-operator/getting-started.rst
@@ -31,6 +31,11 @@ Installing the NVIDIA GPU Operator

    The current patch release of this version of the NVIDIA GPU Operator is ``${version}``.

+.. admonition:: Red Hat OpenShift Container Platform Install
+   :class: tip
+
+   For installation on Red Hat OpenShift Container Platform, refer to :external+ocp:doc:`steps-overview`.
+
 *************
 Prerequisites
 *************
diff --git a/gpu-operator/index.rst b/gpu-operator/index.rst
index afa96c50b..15be5a617 100644
--- a/gpu-operator/index.rst
+++ b/gpu-operator/index.rst
@@ -67,6 +67,13 @@
    Air-Gapped Network
    Service Mesh

+.. toctree::
+   :caption: OpenShift Container Platform
+   :titlesonly:
+   :hidden:
+
+   NVIDIA GPU Operator on Red Hat OpenShift Container Platform
+
 .. toctree::
    :caption: CSP configurations
    :titlesonly:
diff --git a/gpu-operator/overview.rst b/gpu-operator/overview.rst
index d9ba3f422..8d2007e22 100644
--- a/gpu-operator/overview.rst
+++ b/gpu-operator/overview.rst
@@ -34,18 +34,15 @@ Kubernetes device plugin for GPUs, the `NVIDIA Container Toolkit `_, `DCGM `_ based monitoring and others.


-.. card:: Red Hat OpenShift Container Platform
-
-   For information about installing, managing, and upgrading the Operator,
-   refer to :external+ocp:doc:`index`.
-
-   Information about supported versions is available in :ref:`Supported Operating Systems and Kubernetes Platforms`.
-
-
 About This Documentation
 ========================

-Browse through the following documents for getting started, platform support and release notes.
+Browse through the following documents for getting started, platform support and release notes for the NVIDIA GPU Operator.
+
+.. admonition:: Red Hat OpenShift Container Platform
+   :class: tip
+
+   Refer to :external+ocp:doc:`index` for information about installing, managing, and upgrading the Operator on Red Hat OpenShift Container Platform.
 
 Getting Started
 ---------------

From 27cda4110611142c5f8ccb806c3af10f24d2a686 Mon Sep 17 00:00:00 2001
From: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
Date: Tue, 3 Mar 2026 14:52:53 -0500
Subject: [PATCH 4/5] Update section links

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
---
 gpu-operator/index.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/gpu-operator/index.rst b/gpu-operator/index.rst
index 15be5a617..841bd4cd2 100644
--- a/gpu-operator/index.rst
+++ b/gpu-operator/index.rst
@@ -68,20 +68,20 @@
    Service Mesh

 .. toctree::
-   :caption: OpenShift Container Platform
    :titlesonly:
    :hidden:

-   NVIDIA GPU Operator on Red Hat OpenShift Container Platform
+

 .. toctree::
-   :caption: CSP configurations
+   :caption: Cloud and Managed Kubernetes Platforms
    :titlesonly:
    :hidden:

    Amazon EKS
    Azure AKS
    Google GKE
+   NVIDIA GPU Operator on Red Hat OpenShift Container Platform

 .. include:: overview.rst


From 120716ab71e241ccac86902b6e92c51b65e53c32 Mon Sep 17 00:00:00 2001
From: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
Date: Tue, 3 Mar 2026 15:01:22 -0500
Subject: [PATCH 5/5] Update section title

Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
---
 gpu-operator/index.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gpu-operator/index.rst b/gpu-operator/index.rst
index 841bd4cd2..53f77b5c8 100644
--- a/gpu-operator/index.rst
+++ b/gpu-operator/index.rst
@@ -74,7 +74,7 @@


 .. toctree::
-   :caption: Cloud and Managed Kubernetes Platforms
+   :caption: Platform-Specific Configurations
    :titlesonly:
    :hidden:
