Pre-provisioned Air-gapped GPU: Configure Environment
If the NVIDIA runfile installer has not been downloaded yet, retrieve it first by running the commands below. The first command downloads the runfile and the second moves it into the artifacts directory.

```bash
curl -O https://download.nvidia.com/XFree86/Linux-x86_64/470.82.01/NVIDIA-Linux-x86_64-470.82.01.run
mv NVIDIA-Linux-x86_64-470.82.01.run artifacts
```
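Optionally, you can confirm the runfile landed where the later upload step expects it. This check is not part of the documented flow; a minimal sketch, assuming the artifacts directory sits in your current working directory:

```bash
# Confirm the driver runfile is present in artifacts/ before continuing.
ls -l artifacts/NVIDIA-Linux-x86_64-470.82.01.run
```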
The instructions below outline how to fulfill the requirements for pre-provisioned infrastructure in an air-gapped environment. Before you can create a cluster, you must set up the environment with the necessary artifacts. All artifacts for pre-provisioned air-gapped environments must be placed on the bastion host, and the artifacts needed by the nodes must be unpacked and distributed from the bastion before any other provisioning can proceed without an internet connection.
A complete DKP air-gapped bundle is available for download and contains all the DKP components needed for an air-gapped installation (for example, `dkp-air-gapped-bundle_v2.5.2_linux_amd64.tar.gz`).
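If you want to see what the bundle contains before unpacking it, you can list the tarball's contents. A minimal sketch; the exact layout is release-dependent:

```bash
# List the first entries of the bundle without extracting it.
tar -tzf dkp-air-gapped-bundle_v2.5.2_linux_amd64.tar.gz | head
```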
Setup Process:
- Extract the bundle and load the bootstrap image onto the bastion host.
- Copy the artifacts onto the cluster hosts so the nodes can access them.
- If using GPU, place the GPU artifacts locally.
- Seed the local Docker registry with the images.
Load the Bootstrap Image
Assuming you have downloaded `dkp-air-gapped-bundle_v2.5.2_linux_amd64.tar.gz` from the download site mentioned above, extract the tarball to a local directory:

```bash
tar -xzvf dkp-air-gapped-bundle_v2.5.2_linux_amd64.tar.gz && cd dkp-v2.5.2
```
Load the bootstrap Docker image on your bastion machine:

```bash
docker load -i konvoy-bootstrap-image-v2.5.2.tar
```
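To confirm the image loaded, you can filter the local image list. A minimal sketch; the exact repository name and tag come from the tarball and may differ between releases:

```bash
# Look for the bootstrap image among the locally loaded Docker images;
# adjust the pattern if your release names the image differently.
docker images | grep konvoy-bootstrap
```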
Copy Air-gapped Artifacts onto Cluster Hosts
Using the Konvoy Image Builder, you can copy the required artifacts onto your cluster hosts. Assuming you have downloaded `dkp-air-gapped-bundle_v2.5.2_linux_amd64.tar.gz`, continue below:
The Kubernetes image bundle is located in `kib/artifacts/images`, and you will want to verify the images and artifacts. Verify that the image bundles exist in `artifacts/images`:

```bash
$ ls artifacts/images/
kubernetes-images-1.25.4-d2iq.1.tar  kubernetes-images-1.25.4-d2iq.1-fips.tar
```
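If you want to inspect what an image bundle carries, you can list its contents. This step is optional and the internal layout is release-dependent:

```bash
# List the first few entries of the non-FIPS image bundle.
tar -tf artifacts/images/kubernetes-images-1.25.4-d2iq.1.tar | head
```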
Verify the artifacts for your OS exist in the `artifacts/` directory and export the appropriate variables:

```bash
$ ls artifacts/
1.25.4_centos_7_x86_64.tar.gz
1.25.4_centos_7_x86_64_fips.tar.gz
1.25.4_redhat_7_x86_64.tar.gz
1.25.4_redhat_7_x86_64_fips.tar.gz
1.25.4_redhat_8_x86_64.tar.gz
1.25.4_redhat_8_x86_64_fips.tar.gz
1.25.4_rocky_9_x86_64.tar.gz
1.25.4_ubuntu_20_x86_64.tar.gz
containerd-1.6.17-d2iq.1-centos-7.9-x86_64.tar.gz
containerd-1.6.17-d2iq.1-centos-7.9-x86_64_fips.tar.gz
containerd-1.6.17-d2iq.1-rhel-7.9-x86_64.tar.gz
containerd-1.6.17-d2iq.1-rhel-7.9-x86_64_fips.tar.gz
containerd-1.6.17-d2iq.1-rhel-8.4-x86_64.tar.gz
containerd-1.6.17-d2iq.1-rhel-8.4-x86_64_fips.tar.gz
containerd-1.6.17-d2iq.1-rhel-8.6-x86_64.tar.gz
containerd-1.6.17-d2iq.1-rhel-8.6-x86_64_fips.tar.gz
containerd-1.6.17-d2iq.1-rocky-9.0-x86_64.tar.gz
containerd-1.6.17-d2iq.1-rocky-9.1-x86_64.tar.gz
containerd-1.6.17-d2iq.1-ubuntu-20.04-x86_64.tar.gz
images
NVIDIA-Linux-x86_64-470.82.01.run
pip-packages.tar.gz
```
For example, for RHEL 8.4 you would set:

```bash
export OS_PACKAGES_BUNDLE=1.25.4_redhat_8_x86_64.tar.gz
export CONTAINERD_BUNDLE=containerd-1.6.17-d2iq.1-rhel-8.4-x86_64.tar.gz
```
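As a quick optional check (not part of the documented flow), you can confirm the exported names point at real files before the upload step:

```bash
# Report whether each exported bundle actually exists in artifacts/.
for f in "artifacts/$OS_PACKAGES_BUNDLE" "artifacts/$CONTAINERD_BUNDLE"; do
  [ -f "$f" ] && echo "found: $f" || echo "MISSING: $f"
done
```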
Export the following environment variables, ensuring that all control plane and worker nodes are included (the GPU worker address is referenced by the GPU inventory generated below):

```bash
export CONTROL_PLANE_1_ADDRESS="<control-plane-address-1>"
export CONTROL_PLANE_2_ADDRESS="<control-plane-address-2>"
export CONTROL_PLANE_3_ADDRESS="<control-plane-address-3>"
export WORKER_1_ADDRESS="<worker-address-1>"
export WORKER_2_ADDRESS="<worker-address-2>"
export WORKER_3_ADDRESS="<worker-address-3>"
export WORKER_4_ADDRESS="<worker-address-4>"
export GPU_WORKER_1_ADDRESS="<gpu-worker-address-1>"
export SSH_USER="<ssh-user>"
export SSH_PRIVATE_KEY_FILE="<private key file>"
```
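Before generating the inventories, you may want to confirm that the SSH user and key actually work for every node. This check is not part of the documented flow; a minimal sketch, assuming the variables above are set:

```bash
# Attempt a non-interactive SSH login to each node and report the result.
for addr in "$CONTROL_PLANE_1_ADDRESS" "$CONTROL_PLANE_2_ADDRESS" "$CONTROL_PLANE_3_ADDRESS" \
            "$WORKER_1_ADDRESS" "$WORKER_2_ADDRESS" "$WORKER_3_ADDRESS" "$WORKER_4_ADDRESS" \
            "$GPU_WORKER_1_ADDRESS"; do
  ssh -i "$SSH_PRIVATE_KEY_FILE" -o BatchMode=yes "$SSH_USER@$addr" true \
    && echo "ok: $addr" || echo "unreachable: $addr"
done
```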
`SSH_PRIVATE_KEY_FILE` must be either the name of the SSH private key file in your working directory or an absolute path to the file in your user's home directory.

Generate a `gpu_inventory.yaml` for the GPU nodes and an `inventory.yaml` for the remaining nodes; these are picked up automatically by `konvoy-image upload` in the next step.

```bash
cat <<EOF > gpu_inventory.yaml
all:
  vars:
    ansible_port: 22
    ansible_ssh_private_key_file: $SSH_PRIVATE_KEY_FILE
    ansible_user: $SSH_USER
  hosts:
    $GPU_WORKER_1_ADDRESS:
      ansible_host: $GPU_WORKER_1_ADDRESS
EOF

cat <<EOF > inventory.yaml
all:
  vars:
    ansible_user: $SSH_USER
    ansible_port: 22
    ansible_ssh_private_key_file: $SSH_PRIVATE_KEY_FILE
  hosts:
    $CONTROL_PLANE_1_ADDRESS:
      ansible_host: $CONTROL_PLANE_1_ADDRESS
    $CONTROL_PLANE_2_ADDRESS:
      ansible_host: $CONTROL_PLANE_2_ADDRESS
    $CONTROL_PLANE_3_ADDRESS:
      ansible_host: $CONTROL_PLANE_3_ADDRESS
    $WORKER_1_ADDRESS:
      ansible_host: $WORKER_1_ADDRESS
    $WORKER_2_ADDRESS:
      ansible_host: $WORKER_2_ADDRESS
    $WORKER_3_ADDRESS:
      ansible_host: $WORKER_3_ADDRESS
    $WORKER_4_ADDRESS:
      ansible_host: $WORKER_4_ADDRESS
EOF
```
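Optionally, if Ansible is installed on the bastion, you can sanity-check the generated inventories before uploading. This is a sketch outside the documented flow:

```bash
# Render each inventory to confirm the YAML parses and all hosts appear.
ansible-inventory -i gpu_inventory.yaml --list
ansible-inventory -i inventory.yaml --list
```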
Upload the artifacts to the GPU node pool, passing the `--nvidia-runfile` flag, with the following command:

```bash
konvoy-image upload artifacts --inventory-file=gpu_inventory.yaml \
  --container-images-dir=./artifacts/images/ \
  --os-packages-bundle=./artifacts/$OS_PACKAGES_BUNDLE \
  --containerd-bundle=artifacts/$CONTAINERD_BUNDLE \
  --pip-packages-bundle=./artifacts/pip-packages.tar.gz \
  --nvidia-runfile=./artifacts/NVIDIA-Linux-x86_64-470.82.01.run
```
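The `inventory.yaml` generated earlier covers the non-GPU nodes, which presumably receive the same artifacts without the driver runfile. A hedged sketch reusing the flags above, not a command from the source:

```bash
# Same upload against the non-GPU nodes; the --nvidia-runfile flag is omitted.
konvoy-image upload artifacts --inventory-file=inventory.yaml \
  --container-images-dir=./artifacts/images/ \
  --os-packages-bundle=./artifacts/$OS_PACKAGES_BUNDLE \
  --containerd-bundle=artifacts/$CONTAINERD_BUNDLE \
  --pip-packages-bundle=./artifacts/pip-packages.tar.gz
```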
KIB uses variable overrides to specify the base image and container images to use in your new machine image. The variable override files for NVIDIA and FIPS can be ignored unless you are adding an overlay feature.
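As an illustration only, a FIPS deployment would presumably export the `_fips` bundle variants from `artifacts/` and pass the FIPS override manifests via the `--overrides` flag described below. This sketch is an assumption, not a documented command sequence:

```bash
# Assumption: FIPS variants of the bundles are used; adjust for your OS.
export OS_PACKAGES_BUNDLE=1.25.4_redhat_8_x86_64_fips.tar.gz
export CONTAINERD_BUNDLE=containerd-1.6.17-d2iq.1-rhel-8.4-x86_64_fips.tar.gz

# Same upload as above, with the FIPS override manifests appended.
konvoy-image upload artifacts --inventory-file=inventory.yaml \
  --container-images-dir=./artifacts/images/ \
  --os-packages-bundle=./artifacts/$OS_PACKAGES_BUNDLE \
  --containerd-bundle=artifacts/$CONTAINERD_BUNDLE \
  --pip-packages-bundle=./artifacts/pip-packages.tar.gz \
  --overrides overrides/fips.yaml,overrides/offline-fips.yaml
```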
Use the `--overrides overrides/fips.yaml,overrides/offline-fips.yaml` flag with the manifests located in the overrides directory, or see these pages in the documentation: