5 changes: 5 additions & 0 deletions gpu-operator/getting-started.rst
@@ -31,6 +31,11 @@ Installing the NVIDIA GPU Operator

The current patch release of this version of the NVIDIA GPU Operator is ``${version}``.

.. admonition:: Red Hat OpenShift Container Platform Install
:class: tip

For installation on Red Hat OpenShift Container Platform, refer to :external+ocp:doc:`steps-overview`.

*************
Prerequisites
*************
9 changes: 8 additions & 1 deletion gpu-operator/index.rst
@@ -68,13 +68,20 @@
Service Mesh <install-gpu-operator-service-mesh.rst>

.. toctree::
:caption: CSP configurations
:titlesonly:
:hidden:



.. toctree::
:caption: Platform-Specific Configurations
:titlesonly:
:hidden:

Amazon EKS <amazon-eks.rst>
Azure AKS <microsoft-aks.rst>
Google GKE <google-gke.rst>
NVIDIA GPU Operator on Red Hat OpenShift Container Platform <https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html>


.. include:: overview.rst
11 changes: 5 additions & 6 deletions gpu-operator/life-cycle-policy.rst
@@ -39,9 +39,8 @@ Patch releases typically include critical bug and CVE fixes, but can include min
NVIDIA GPU Operator Life Cycle
******************************

When a major version of NVIDIA GPU Operator is released, the previous major version enters maintenance support
and only receives patch release updates for critical bug and CVE fixes.
All prior major versions enter end-of-life (EOL) and are no longer supported and do not receive patch release updates.
When a new major version of NVIDIA GPU Operator is released, the previous major version enters deprecated support and only receives patch release updates for critical bug and CVE fixes.
All prior major versions enter end of support and are no longer supported and do not receive patch release updates.

The product life cycle and versioning are subject to change in the future.

@@ -56,13 +55,13 @@ The product life cycle and versioning are subject to change in the future.
- Status

* - 25.10.x
- Generally Available
- Supported

* - 25.3.x
- Maintenance
- Deprecated

* - 24.9.x and lower
- EOL
- End of Support


.. _operator-component-matrix:
15 changes: 6 additions & 9 deletions gpu-operator/overview.rst
@@ -34,18 +34,15 @@ Kubernetes device plugin for GPUs, the `NVIDIA Container Toolkit <https://github
automatic node labeling using `GFD <https://github.com/NVIDIA/gpu-feature-discovery>`_, `DCGM <https://developer.nvidia.com/dcgm>`_ based monitoring and others.


.. card:: Red Hat OpenShift Container Platform

For information about installing, managing, and upgrading the Operator,
refer to :external+ocp:doc:`index`.

Information about supported versions is available in :ref:`Supported Operating Systems and Kubernetes Platforms`.


About This Documentation
========================

Browse through the following documents for getting started, platform support and release notes.
Browse through the following documents for getting started, platform support and release notes for the NVIDIA GPU Operator.

.. admonition:: Red Hat OpenShift Container Platform
:class: tip

Refer to :external+ocp:doc:`index` for information about installing, managing, and upgrading the Operator on Red Hat OpenShift Container Platform.

Getting Started
---------------
16 changes: 16 additions & 0 deletions gpu-operator/release-notes.rst
@@ -83,6 +83,22 @@ Fixed Issues
* Fixed a bug where the k8s-driver-manager would wait indefinitely when MOFED is enabled and ``USE_HOST_MOFED`` is set to true despite the MOFED being pre-installed on the host.


Known Issues
------------

* When deploying the GPU Operator on systems with SELinux in enforcing mode, the MIG Manager does not get scheduled on GPU nodes.
This happens because the GPU Feature Discovery pod has insufficient permissions on Node Feature Discovery's feature-file drop-in directory, so it cannot add the label that indicates a MIG-capable GPU is present.
To work around this issue, configure NVIDIA GPU Feature Discovery to use the Node Feature API instead of feature files in ClusterPolicy:

.. code-block:: yaml

   gfd:
     env:
     - name: USE_NODE_FEATURE_API
       value: "true"
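
As a rough sketch of where this fragment sits, the following shows it in the context of a full ClusterPolicy manifest. It assumes that the ``gfd`` key lives under ``spec`` and that the resource is named ``cluster-policy``; neither detail is stated in this note, so verify both against your cluster before applying anything.

.. code-block:: yaml

   # Sketch only: the resource name and the placement under spec are assumptions.
   apiVersion: nvidia.com/v1
   kind: ClusterPolicy
   metadata:
     name: cluster-policy        # assumed name; check with `kubectl get clusterpolicies`
   spec:
     gfd:
       env:
       # Switch GPU Feature Discovery from feature files to the Node Feature API
       - name: USE_NODE_FEATURE_API
         value: "true"

If ``gfd.env`` already contains other entries, add this variable to the existing list rather than replacing the list.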



.. _v25.10.0:

25.10.0