NVIDIA Platform Application Management Cluster

Instructions on enabling the NVIDIA platform application on a Management cluster

Enable NVIDIA Platform Application on Kommander for Management Cluster

If you intend to run applications that use GPUs on your cluster, install the NVIDIA GPU operator. To enable NVIDIA GPU support when installing Kommander on a management cluster, perform the following steps:

Create an installation configuration file:
```
dkp install kommander --init > install.yaml
```
Append the following to the apps section in the install.yaml file to enable NVIDIA platform services:
```
apps:
  nvidia-gpu-operator:
    enabled: true
```
Install Kommander using the configuration file you created:
```
dkp install kommander --installer-config ./install.yaml --kubeconfig=${CLUSTER_NAME}.conf
```
In the previous command, the --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you set the context to install Kommander on the right cluster. For alternatives and recommendations around setting your context, refer to Provide Context for Commands with a kubeconfig File.
Proceed to the section Select the Correct Toolkit Version for your NVIDIA GPU Operator.

TIP: Sometimes, applications require a longer period of time to deploy, which causes the installation to time out. Add the --wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment of applications.
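
As a minimal sketch of the tip above, the following snippet assembles the install command with a longer timeout and prints it for review before you run it. The `1h` value is the example from the tip; the `my-cluster` fallback for `CLUSTER_NAME` is a placeholder assumption, not a value from this guide.

```shell
# Placeholder fallback (assumption); replace with the name of your management cluster.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
# Time to wait for applications to deploy (1h is the example value from the tip).
WAIT_TIMEOUT="1h"

# Assemble the install command with the extended timeout.
install_cmd="dkp install kommander --installer-config ./install.yaml --kubeconfig=${CLUSTER_NAME}.conf --wait-timeout ${WAIT_TIMEOUT}"

# Print the command so you can review it before running it.
echo "${install_cmd}"
```

Once the printed command looks right for your environment, run it in the directory that contains install.yaml.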

Select the Correct Toolkit Version for your NVIDIA GPU Operator

The NVIDIA Container Toolkit allows users to run GPU-accelerated containers. The toolkit includes a container runtime library and utilities that automatically configure containers to use NVIDIA GPUs. You must configure the toolkit version to match your base operating system.

Kommander (Management Cluster) Customization

Select the correct Toolkit version based on your OS:
CentOS 7.9/RHEL 7.9:
If you’re using CentOS 7.9 or RHEL 7.9 as the base operating system for your GPU-enabled nodes, set the toolkit.version parameter in your Kommander Installer Configuration file or <kommander.yaml> to the following:
```
kind: Installation
apps:
  nvidia-gpu-operator:
    enabled: true
    values: |
      toolkit:
        version: v1.10.0-centos7
```
RHEL 8.4/8.6 and SLES 15 SP3:
If you’re using RHEL 8.4/8.6 or SLES 15 SP3 as the base operating system for your GPU-enabled nodes, set the toolkit.version parameter in your Kommander Installer Configuration file or <kommander.yaml> to the following:
```
kind: Installation
apps:
  nvidia-gpu-operator:
    enabled: true
    values: |
      toolkit:
        version: v1.10.0-ubi8
```
Ubuntu 18.04 and 20.04:
If you’re using Ubuntu 18.04 or 20.04 as the base operating system for your GPU-enabled nodes, set the toolkit.version parameter in your Kommander Installer Configuration file or <kommander.yaml> to the following:
```
kind: Installation
apps:
  nvidia-gpu-operator:
    enabled: true
    values: |
      toolkit:
        version: v1.11.0-ubuntu20.04
```
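
The OS-to-version mapping above can be sketched as a small shell helper. This is an illustrative sketch, not part of the dkp CLI: the function name `toolkit_version_for_os` is hypothetical, and it assumes the `ID` and `VERSION_ID` strings that these distributions typically report in /etc/os-release (for example, `sles` with `15.3` for SLES 15 SP3, and `centos` with `7` for CentOS 7.9). Verify those values on your own nodes before relying on it.

```shell
# Hypothetical helper: prints the toolkit.version value listed in this
# section for a given OS ID and version, or fails for anything else.
toolkit_version_for_os() {
  # $1 = OS ID (for example: centos, rhel, sles, ubuntu)
  # $2 = OS version (for example: 7.9, 8.6, 15.3, 20.04)
  case "$1:$2" in
    centos:7*|rhel:7.9*)
      echo "v1.10.0-centos7" ;;
    rhel:8.4*|rhel:8.6*|sles:15.3*)
      echo "v1.10.0-ubi8" ;;
    ubuntu:18.04*|ubuntu:20.04*)
      echo "v1.11.0-ubuntu20.04" ;;
    *)
      # Not one of the operating systems covered in this section.
      echo "no toolkit version listed for: $1 $2" >&2
      return 1 ;;
  esac
}

# Example: look up the value for an Ubuntu 20.04 node.
toolkit_version_for_os ubuntu 20.04
```

On a GPU node, you could source /etc/os-release and call `toolkit_version_for_os "$ID" "$VERSION_ID"`, then copy the printed value into the toolkit.version field of your configuration file.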
Install Kommander, using the configuration file you created:
```
dkp install kommander --installer-config ./install.yaml --kubeconfig=${CLUSTER_NAME}.conf
```
In the previous command, the --kubeconfig=${CLUSTER_NAME}.conf flag ensures that you set the context to install Kommander on the right cluster. For alternatives and recommendations around setting your context, refer to Provide Context for Commands with a kubeconfig File.

TIP: Sometimes, applications require a longer period of time to deploy, which causes the installation to time out. Add the --wait-timeout <time to wait> flag and specify a period of time (for example, 1h) to allocate more time to the deployment of applications.