.. _example-custom-mig-configuration:

Example: Custom MIG Configuration
Is this example making a distinction between custom MIG configuration after installation compared to the previous example?
@rajathagasthya Yes, I believe that was the original intention of the 2 examples. It feels a bit redundant though. Do you see these as separate use cases customers would need different examples for?
It's not too important, but maybe worth keeping. We can make it more obvious in the example title. I suggest the following:

- Move the step "Optional: Monitor the MIG Manager logs to confirm the new MIG geometry is applied" from this example to the previous example.
- Rename this example to something like "Configure a Custom MIG Configuration on an Existing GPU Operator Deployment" and say: "To configure a running GPU Operator deployment with a custom MIG config:" (keep steps 1 through 4; step 5 isn't necessary)
Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com> Co-authored-by: Rajath Agasthya <rajathagasthya@gmail.com>
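For reference, a custom MIG configuration is typically supplied to the MIG Manager as a ``mig-parted`` config inside a ConfigMap, with the active profile selected by a node label. The sketch below is a minimal, illustrative example; the ConfigMap name (``custom-mig-config``), profile name (``custom-config``), and the ``1g.10gb`` MIG profile are assumptions, not taken from the docs under review.

```yaml
# Hypothetical ConfigMap holding a custom mig-parted configuration.
# The MIG Manager applies the profile named by the
# nvidia.com/mig.config label on each GPU node.
apiVersion: v1
kind: ConfigMap
metadata:
  name: custom-mig-config        # illustrative name
  namespace: gpu-operator
data:
  config.yaml: |
    version: v1
    mig-configs:
      custom-config:             # illustrative profile name
        - devices: [0]
          mig-enabled: true
          mig-devices:
            "1g.10gb": 7
```

Selecting the profile would then be a matter of labeling the node, for example ``kubectl label node <node-name> nvidia.com/mig.config=custom-config --overwrite``.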
* Added support for including extra manifests with the Helm chart.
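A values-file sketch of what this might look like. The key name ``extraManifests`` is an assumption here, not confirmed by the release note; check the chart's ``values.yaml`` for the exact field before documenting it.

```yaml
# values.yaml fragment -- "extraManifests" is a hypothetical key name;
# verify against the chart's values.yaml. Each list entry is rendered
# and applied alongside the chart's own resources.
extraManifests:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: my-extra-config      # illustrative name
    data:
      example-key: example-value
```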
* Added the ``sandboxWorkloads.mode`` field to help manage sandbox workloads. Valid values are ``kubevirt`` and ``kata``.
need to add more context to this
Signed-off-by: Abigail McCarthy <20771501+a-mccarthy@users.noreply.github.com>
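To add the context the comment asks for, a values-file sketch could show where the field sits. This assumes ``sandboxWorkloads.mode`` is a top-level Helm value, per the release note; the surrounding structure is an assumption.

```yaml
# values.yaml fragment -- sets the sandbox workload mode described
# in the release note; accepts "kubevirt" or "kata".
sandboxWorkloads:
  mode: kubevirt
```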
- `570.195.03 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-570-195-03/index.html>`_
- `550.163.01 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-550-163-01/index.html>`_
- `535.274.02 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-535-274-03/index.html>`_
- `590.48.01 <https://docs.nvidia.com/datacenter/tesla/tesla-release-notes-590-48-01/index.html>`_
@tariq1890, what driver versions do we want to list for 26.3.0?
* Improved the GPU Operator to deploy on heterogeneous clusters with different operating systems on GPU nodes.
* Fixed issues where the GPU Operator was not detecting the correct operating system on heterogeneous clusters. The GPU Operator now uses the OS version labels that NFD adds to GPU worker nodes when determining which OS-specific paths to use for repository configuration files. (`PR #562 <https://github.com/NVIDIA/gpu-operator/issues/562>`_, `PR #2138 <https://github.com/NVIDIA/gpu-operator/pull/2138>`_)
"was not using getting" doesn't look right
* Fixed an issue on OpenShift clusters where the ``dcgm-exporter`` pod was bound to a different Security Context Constraint (SCC) than the ``nvidia-dcgm-exporter`` SCC that the GPU Operator creates. (`PR #2122 <https://github.com/NVIDIA/gpu-operator/pull/2122>`_)
* Fixed an issue where the GPU Operator was not correctly cleaning up daemonsets. (`PR #2081 <https://github.com/NVIDIA/gpu-operator/pull/2081>`_)
This probably needs to be a link