Create a New Custom Azure Cluster
Prerequisites
Before you begin, make sure you have created a Bootstrap cluster.
Name your cluster
Give your cluster a unique name suitable for your environment.
Set the environment variable:
CODEexport CLUSTER_NAME=<azure-example>
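A quick sanity check on the chosen name can catch problems before any objects are generated. The sketch below assumes Kubernetes-style DNS label rules (lowercase letters, digits, and hyphens, starting and ending with an alphanumeric character, at most 63 characters). The helper name is_valid_cluster_name is hypothetical and not part of DKP, and DKP or Azure may enforce additional or different limits.

```shell
# Hypothetical helper (not a DKP command): returns 0 when the name looks like
# a lowercase DNS-style label. The rules here are an assumption, see above.
is_valid_cluster_name() {
  name=$1
  [ -n "$name" ] || return 1            # must not be empty
  [ "${#name}" -le 63 ] || return 1     # assumed length limit
  case $name in
    *[!a-z0-9-]*) return 1 ;;           # character outside a-z, 0-9, '-'
    -*|*-)        return 1 ;;           # leading or trailing hyphen
  esac
  return 0
}

CLUSTER_NAME=${CLUSTER_NAME:-azure-example}
if is_valid_cluster_name "$CLUSTER_NAME"; then
  echo "cluster name OK: $CLUSTER_NAME"
else
  echo "cluster name rejected: $CLUSTER_NAME" >&2
fi
```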
Tips and Tricks
The options below are optional ways to customize your setup. For a basic setup, skip Tips and Tricks and proceed to the Create a new Azure Kubernetes cluster section.
Important for the options below: the value passed to --compute-gallery-id must be in the format /subscriptions/<subscription id>/resourceGroups/<resource group name>/providers/Microsoft.Compute/galleries/<gallery name>/images/<image definition name>/versions/<version id>.
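Because the image ID is long and easy to mistype, it can help to assemble it from its parts. The following is a minimal sketch of that idea. Every value and variable name in it is a placeholder invented for illustration, so substitute the identifiers from your own subscription.

```shell
# Placeholder values for illustration only; replace with your own.
SUBSCRIPTION_ID="00000000-0000-0000-0000-000000000000"
RESOURCE_GROUP="example-rg"
GALLERY_NAME="example_gallery"
IMAGE_DEFINITION="example-image"
IMAGE_VERSION="1.0.0"

# Assemble the ID in the documented path layout.
COMPUTE_GALLERY_ID="/subscriptions/${SUBSCRIPTION_ID}/resourceGroups/${RESOURCE_GROUP}/providers/Microsoft.Compute/galleries/${GALLERY_NAME}/images/${IMAGE_DEFINITION}/versions/${IMAGE_VERSION}"

# Rough shape check only; it does not verify that the image exists in Azure.
case $COMPUTE_GALLERY_ID in
  /subscriptions/*/resourceGroups/*/providers/Microsoft.Compute/galleries/*/images/*/versions/*)
    echo "$COMPUTE_GALLERY_ID" ;;
  *)
    echo "unexpected gallery image ID format" >&2 ;;
esac
```

The resulting variable can then be passed as --compute-gallery-id "${COMPUTE_GALLERY_ID}".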
Option 1: To create a cluster name that is unique, use the following command. This creates a unique name every time you run it, so use it with forethought:
CODEexport CLUSTER_NAME=azure-example-$(LC_CTYPE=C tr -dc 'a-z0-9' </dev/urandom | fold -w 5 | head -n1)
echo $CLUSTER_NAME
CODEazure-example-pf4a3
Option 2: To use a custom Azure Image when creating your cluster, you must create that Azure Image using KIB first.
CODEdkp create cluster azure --cluster-name=${CLUSTER_NAME} \
  --compute-gallery-id "<Managed Image Shared Image Gallery Id>" \
  --dry-run \
  --output=yaml \
  > ${CLUSTER_NAME}.yaml
If your environment uses HTTP/HTTPS proxies, you must include the flags --http-proxy, --https-proxy, and --no-proxy and their related values in this command for it to be successful. More information is available in Configuring an HTTP/HTTPS Proxy.
Option 3: To create individual files with different smaller manifests for ease in editing, you can add the --output-directory flag. This will create multiple files in the specified directory, which must already exist.
CODEdkp create cluster azure --cluster-name=${CLUSTER_NAME} \
  --compute-gallery-id "<Managed Image Shared Image Gallery Id>" \
  --dry-run \
  --output=yaml \
  --output-directory=<existing-directory>
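Since the directory passed to --output-directory must already exist, a small preparation step along these lines may be useful. The directory name and the OUTPUT_DIR variable are assumptions made for this sketch.

```shell
# Hypothetical directory name; any writable path works.
OUTPUT_DIR=${OUTPUT_DIR:-./cluster-manifests}

# Create the directory (and parents) if it is missing; no error if it exists.
mkdir -p "$OUTPUT_DIR"

if [ -d "$OUTPUT_DIR" ] && [ -w "$OUTPUT_DIR" ]; then
  echo "ready: pass --output-directory=$OUTPUT_DIR"
else
  echo "cannot use $OUTPUT_DIR as an output directory" >&2
fi
```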
Option 4: To use a custom DNS on Azure, you need a DNS name under your control. Then create a DKP cluster using the standard method described below with the --self-managed flag. Once the resource group has been created, you can create your hosted zone with the command below:
CODEaz network dns zone create --resource-group "d2iq-professional-services" --name
You no longer need to create a cluster issuer. There are several documents that explain custom DNS in the Kommander component.
Option 5: To allow DKP to create a cluster with Marketplace-based images, such as for Rocky Linux, the following flags are available: --plan-offer, --plan-publisher, and --plan-sku. If these fields were specified in the override file during image creation, the flags must be used in cluster creation:
CODE--plan-offer rockylinux-9 --plan-publisher erockyenterprisesoftwarefoundationinc1653071250513 --plan-sku rockylinux-9
If you see an error similar to "Creating a virtual machine from Marketplace image or a custom image sourced from a Marketplace image requires Plan information in the request." when creating a cluster, you must also set the --plan-offer, --plan-publisher, and --plan-sku flags. For example, when creating a cluster with Rocky Linux VMs, add all three flags to your dkp create cluster azure command.
For more information regarding these flags or others, refer to the dkp create cluster section of the CLI documentation and select your provider.
Create a new Azure Kubernetes cluster
Availability zones (AZs) are isolated locations within data center regions from which public cloud services originate and operate. Because all the nodes in a node pool are deployed in a single Availability Zone, you may wish to create additional node pools to ensure your cluster has nodes deployed in multiple Availability Zones.
By default, the control-plane Nodes will be created in 3 different zones. However, the default worker Nodes will reside in a single Availability Zone. You may create additional node pools in other Availability Zones with the dkp create nodepool command. See Microsoft’s documentation for more information on Availability Options for Azure VM.
If you are using Azure as a pre-provisioned environment: DKP uses localvolumeprovisioner as the default storage provider when creating a pre-provisioned Azure cluster. However, localvolumeprovisioner is not suitable for production use. You should use a Kubernetes CSI-compatible storage that is suitable for production.
You can choose from any of the storage options available for Kubernetes. To disable the default that Konvoy deploys, set the default StorageClass localvolumeprovisioner as non-default. Then set your newly created StorageClass to be the default by following the commands in the Kubernetes documentation called Changing the Default Storage Class.
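The swap of default StorageClass comes down to toggling the storageclass.kubernetes.io/is-default-class annotation, as described in the Kubernetes documentation. The sketch below only prints the two kubectl patch commands so they can be reviewed before running. The helper default_class_patch and the StorageClass name my-csi-storageclass are hypothetical.

```shell
# Hypothetical helper: emit the JSON patch that sets the default-class
# annotation to the given value ("true" or "false").
default_class_patch() {
  printf '{"metadata":{"annotations":{"storageclass.kubernetes.io/is-default-class":"%s"}}}' "$1"
}

OLD_SC=localvolumeprovisioner            # default named in this document
NEW_SC=${NEW_SC:-my-csi-storageclass}    # hypothetical name of your new StorageClass

# Print the commands instead of running them; review, then execute manually.
echo "kubectl patch storageclass $OLD_SC -p '$(default_class_patch false)'"
echo "kubectl patch storageclass $NEW_SC -p '$(default_class_patch true)'"
```

Running kubectl get storageclass afterwards should show which class is marked as the default.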
Ensure your Kubernetes subnets do not overlap with your host subnet, because they cannot be changed after cluster creation. If you need to change the Kubernetes subnets, you must do this at cluster creation. The default subnets used in DKP are:
CODEspec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
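Whether a host subnet collides with these defaults can be checked with simple integer arithmetic on the address ranges. The following is a sketch for IPv4 only. The function names and the HOST_SUBNET value are assumptions introduced for illustration and are not part of DKP.

```shell
# Convert a dotted-quad IPv4 address to an integer.
ip_to_int() {
  old_ifs=$IFS; IFS=.
  set -- $1                 # split the address into four octets
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Print "first last" integer bounds of a CIDR block such as 10.96.0.0/12.
cidr_range() {
  addr=${1%/*}
  bits=${1#*/}
  ip=$(ip_to_int "$addr")
  size=$(( 1 << (32 - $bits) ))
  first=$(( $ip / $size * $size ))      # align to the network boundary
  echo "$first $(( $first + $size - 1 ))"
}

# Return 0 when the two CIDR blocks share at least one address.
cidrs_overlap() {
  set -- $(cidr_range "$1") $(cidr_range "$2")
  [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

HOST_SUBNET=${HOST_SUBNET:-10.0.0.0/16}   # hypothetical host subnet
for cidr in 192.168.0.0/16 10.96.0.0/12; do
  if cidrs_overlap "$HOST_SUBNET" "$cidr"; then
    echo "overlap: $HOST_SUBNET and $cidr"
  else
    echo "ok: $HOST_SUBNET and $cidr"
  fi
done
```

If an overlap is reported, change the pod or service CIDR blocks in the generated cluster objects before creating the cluster.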
The steps below instead describe how to create a cluster that uses Azure as the infrastructure provider, in which case Azure Disks Container Storage Interface (CSI) is the default StorageClass.
Generate the Kubernetes cluster objects. The following example shows a common configuration. See dkp create cluster azure reference for the full list of cluster creation options.
CODEdkp create cluster azure --cluster-name=${CLUSTER_NAME} \
  --dry-run \
  --output=yaml \
  > ${CLUSTER_NAME}.yaml
CODEGenerating cluster resources
Refer to the Cluster Creation Customization Choices section for more information on how to use optional flags such as the --output-directory flag.
(Optional) To configure the Control Plane and Worker nodes to use an HTTP proxy:
CODEexport CONTROL_PLANE_HTTP_PROXY=http://example.org:8080
export CONTROL_PLANE_HTTPS_PROXY=http://example.org:8080
export CONTROL_PLANE_NO_PROXY="example.org,example.com,example.net,localhost,127.0.0.1,10.96.0.0/12,192.168.0.0/16,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,.svc,.svc.cluster,.svc.cluster.local,169.254.169.254,.cloudapp.azure.com"
export WORKER_HTTP_PROXY=http://example.org:8080
export WORKER_HTTPS_PROXY=http://example.org:8080
export WORKER_NO_PROXY="example.org,example.com,example.net,localhost,127.0.0.1,10.96.0.0/12,192.168.0.0/16,kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local,.svc,.svc.cluster,.svc.cluster.local,169.254.169.254,.cloudapp.azure.com"
Replace example.org,example.com,example.net with your internal addresses
localhost and 127.0.0.1 addresses should not use the proxy
10.96.0.0/12 is the default Kubernetes service subnet
192.168.0.0/16 is the default Kubernetes pod subnet
kubernetes,kubernetes.default,kubernetes.default.svc,kubernetes.default.svc.cluster,kubernetes.default.svc.cluster.local is the internal Kubernetes kube-apiserver service
.svc,.svc.cluster,.svc.cluster.local is the internal Kubernetes services
169.254.169.254 is the Azure metadata server
.cloudapp.azure.com is for the worker nodes to allow them to communicate directly to the kube-apiserver load balancer
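The no-proxy value is easier to review and maintain when it is built from one entry group per line rather than typed as a single long string. The sketch below shows one way to do that. The join_csv helper and the NO_PROXY_VALUE variable are hypothetical, and the entries are the same example values listed above.

```shell
# Hypothetical helper: join all arguments with commas.
join_csv() {
  old_ifs=$IFS; IFS=,
  joined="$*"               # "$*" joins arguments using the first IFS character
  IFS=$old_ifs
  printf '%s\n' "$joined"
}

# Entry groups, in order: internal addresses (replace with your own), loopback,
# service subnet, pod subnet, kube-apiserver service names, service suffixes,
# Azure metadata server, kube-apiserver load balancer domain.
NO_PROXY_VALUE=$(join_csv \
  example.org example.com example.net \
  localhost 127.0.0.1 \
  10.96.0.0/12 \
  192.168.0.0/16 \
  kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster kubernetes.default.svc.cluster.local \
  .svc .svc.cluster .svc.cluster.local \
  169.254.169.254 \
  .cloudapp.azure.com)

export CONTROL_PLANE_NO_PROXY="$NO_PROXY_VALUE"
export WORKER_NO_PROXY="$NO_PROXY_VALUE"
echo "$CONTROL_PLANE_NO_PROXY"
```

If you changed the pod or service subnets, update the corresponding entries here as well so in-cluster traffic continues to bypass the proxy.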
(Optional) Create a Kubernetes cluster with HTTP proxy configured. This step assumes you did not already create a cluster in the previous steps:
CODEdkp create cluster azure --cluster-name=${CLUSTER_NAME} \
  --control-plane-http-proxy="${CONTROL_PLANE_HTTP_PROXY}" \
  --control-plane-https-proxy="${CONTROL_PLANE_HTTPS_PROXY}" \
  --control-plane-no-proxy="${CONTROL_PLANE_NO_PROXY}" \
  --worker-http-proxy="${WORKER_HTTP_PROXY}" \
  --worker-https-proxy="${WORKER_HTTPS_PROXY}" \
  --worker-no-proxy="${WORKER_NO_PROXY}" \
  --dry-run \
  --output=yaml \
  > ${CLUSTER_NAME}.yaml
Inspect or edit the cluster objects:
NOTE: Familiarize yourself with Cluster API before editing the cluster objects as edits can prevent the cluster from deploying successfully.
The objects are Custom Resources defined by Cluster API components, and they belong in three different categories:
Cluster
A Cluster object has references to the infrastructure-specific and control plane objects. Because this is an Azure cluster, there is an AzureCluster object that describes the infrastructure-specific cluster properties. Here, this means the Azure region, the virtual network (VNet), subnets, and network security group rules required by the Pod network implementation.
Control Plane
A KubeadmControlPlane object describes the control plane, which is the group of machines that run the Kubernetes control plane components, which include the etcd distributed database, the API server, the core controllers, and the scheduler. The object describes the configuration for these components. The object also has a reference to an infrastructure-specific object that describes the properties of all control plane machines. Here, it references an AzureMachineTemplate object, which describes the instance type, the type of disk used, and the size of the disk, among other properties.
Node Pool
A Node Pool is a collection of machines with identical properties. For example, a cluster might have one Node Pool with large memory capacity and another Node Pool with GPU support. Each Node Pool is described by three objects: the MachineDeployment references an object that describes the configuration of Kubernetes components (for example, kubelet) deployed on each node pool machine, and an infrastructure-specific object that describes the properties of all node pool machines. Here, it references a KubeadmConfigTemplate and an AzureMachineTemplate object, which describes the instance type, the type of disk used, and the size of the disk, among other properties.
For in-depth documentation about the objects, read Concepts in the Cluster API Book.
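Before opening the generated file in an editor, a quick summary of the object kinds it contains can confirm that the expected Cluster, control plane, and node pool objects are present. This is a minimal sketch that assumes each object declares its kind on a top-level line starting with kind:. The list_kinds helper and the MANIFEST variable are hypothetical.

```shell
# Hypothetical helper: count how many objects of each kind a manifest contains.
# Only top-level "kind:" lines (no indentation) are counted.
list_kinds() {
  grep '^kind:' "$1" | sed 's/^kind:[[:space:]]*//' | sort | uniq -c
}

MANIFEST="${CLUSTER_NAME:-azure-example}.yaml"
if [ -f "$MANIFEST" ]; then
  list_kinds "$MANIFEST"
else
  echo "manifest not found: $MANIFEST" >&2
fi
```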
Modify Control Plane Audit logs settings using the information contained in the page Configuring the Control Plane.
Create the cluster from the objects. A warning will appear in the console if the resource already exists and will require you to remove the resource or update your YAML.
CODEkubectl create -f ${CLUSTER_NAME}.yaml
NOTE: If you used the --output-directory flag in your dkp create .. --dry-run step above, create the cluster from the objects you created by specifying the directory:
CODEkubectl create -f <existing-directory>/
Output will be similar to output below:
CODEcluster.cluster.x-k8s.io/azure-example created
azurecluster.infrastructure.cluster.x-k8s.io/azure-example created
kubeadmcontrolplane.controlplane.cluster.x-k8s.io/azure-example-control-plane created
azuremachinetemplate.infrastructure.cluster.x-k8s.io/azure-example-control-plane created
secret/azure-example-etcd-encryption-config created
machinedeployment.cluster.x-k8s.io/azure-example-md-0 created
azuremachinetemplate.infrastructure.cluster.x-k8s.io/azure-example-md-0 created
kubeadmconfigtemplate.bootstrap.cluster.x-k8s.io/azure-example-md-0 created
clusterresourceset.addons.cluster.x-k8s.io/calico-cni-installation-azure-example created
configmap/calico-cni-installation-azure-example created
configmap/tigera-operator-azure-example created
clusterresourceset.addons.cluster.x-k8s.io/azure-disk-csi-azure-example created
configmap/azure-disk-csi-azure-example created
clusterresourceset.addons.cluster.x-k8s.io/cluster-autoscaler-azure-example created
configmap/cluster-autoscaler-azure-example created
clusterresourceset.addons.cluster.x-k8s.io/node-feature-discovery-azure-example created
configmap/node-feature-discovery-azure-example created
clusterresourceset.addons.cluster.x-k8s.io/nvidia-feature-discovery-azure-example created
configmap/nvidia-feature-discovery-azure-example created
Wait for the cluster control-plane to be ready:
CODEkubectl wait --for=condition=ControlPlaneReady "clusters/${CLUSTER_NAME}" --timeout=20m
CODEcluster.cluster.x-k8s.io/azure-example condition met
After the objects are created on the API server, the Cluster API controllers reconcile them. They create infrastructure and machines. As they progress, they update the Status of each object. Konvoy provides a command to describe the current status of the cluster:
CODEdkp describe cluster -c ${CLUSTER_NAME}
CODENAME                                                          READY  SEVERITY  REASON  SINCE  MESSAGE
Cluster/azure-example                                             True                     3m4s
├─ClusterInfrastructure - AzureCluster/azure-example              True                     8m26s
├─ControlPlane - KubeadmControlPlane/azure-example-control-plane  True                     3m4s
│ ├─Machine/azure-example-control-plane-l8j9r                     True                     3m9s
│ ├─Machine/azure-example-control-plane-slprd                     True                     7m17s
│ └─Machine/azure-example-control-plane-xhxxg                     True                     5m9s
└─Workers
  └─MachineDeployment/azure-example-md-0                          True                     4m31s
    ├─Machine/azure-example-md-0-d67567c8b-2674r                  True                     5m19s
    ├─Machine/azure-example-md-0-d67567c8b-mbmhk                  True                     5m17s
    ├─Machine/azure-example-md-0-d67567c8b-pzg8k                  True                     5m17s
    └─Machine/azure-example-md-0-d67567c8b-z8km9                  True                     5m17s
As they progress, the controllers also create Events. List the Events using this command:
CODEkubectl get events | grep ${CLUSTER_NAME}
For brevity, the example uses grep. It is also possible to use separate commands to get Events for specific objects. For example, kubectl get events --field-selector involvedObject.kind="AzureCluster" and kubectl get events --field-selector involvedObject.kind="AzureMachine".
CODE15m   Normal   AzureClusterObjectNotFound   azurecluster   AzureCluster object default/azure-example not found
15m   Normal   AzureManagedControlPlaneObjectNotFound   azuremanagedcontrolplane   AzureManagedControlPlane object default/azure-example not found
15m   Normal   AzureClusterObjectNotFound   azurecluster   AzureCluster.infrastructure.cluster.x-k8s.io "azure-example" not found
8m22s   Normal   SuccessfulSetNodeRef   machine/azure-example-control-plane-bmc9b   azure-example-control-plane-fdvnm
10m   Normal   Machine controller dependency not yet met   azuremachine/azure-example-control-plane-fdvnm   Machine Controller has not yet set OwnerRef
12m   Normal   SuccessfulSetNodeRef   machine/azure-example-control-plane-msftd   azure-example-control-plane-z9q45
10m   Normal   SuccessfulSetNodeRef   machine/azure-example-control-plane-nrvff   azure-example-control-plane-vmqwx
12m   Normal   Machine controller dependency not yet met   azuremachine/azure-example-control-plane-vmqwx   Machine Controller has not yet set OwnerRef
14m   Normal   Machine controller dependency not yet met   azuremachine/azure-example-control-plane-z9q45   Machine Controller has not yet set OwnerRef
14m   Warning   VMIdentityNone   azuremachinetemplate/azure-example-control-plane   You are using Service Principal authentication for Cloud Provider Azure which is less secure than Managed Identity. Your Service Principal credentials will be written to a file on the disk of each VM in order to be accessible by Cloud Provider. To learn more, see https://capz.sigs.k8s.io/topics/identities-use-cases.html#azure-host-identity
12m   Warning   ControlPlaneUnhealthy   kubeadmcontrolplane/azure-example-control-plane   Waiting for control plane to pass preflight checks to continue reconciliation: [machine azure-example-control-plane-msftd does not have APIServerPodHealthy condition, machine azure-example-control-plane-msftd does not have ControllerManagerPodHealthy condition, machine azure-example-control-plane-msftd does not have SchedulerPodHealthy condition, machine azure-example-control-plane-msftd does not have EtcdPodHealthy condition, machine azure-example-control-plane-msftd does not have EtcdMemberHealthy condition]
11m   Warning   ControlPlaneUnhealthy   kubeadmcontrolplane/azure-example-control-plane   Waiting for control plane to pass preflight checks to continue reconciliation: [machine azure-example-control-plane-nrvff does not have APIServerPodHealthy condition, machine azure-example-control-plane-nrvff does not have ControllerManagerPodHealthy condition, machine azure-example-control-plane-nrvff does not have SchedulerPodHealthy condition, machine azure-example-control-plane-nrvff does not have EtcdPodHealthy condition, machine azure-example-control-plane-nrvff does not have EtcdMemberHealthy condition]
9m52s   Normal   SuccessfulSetNodeRef   machine/azure-example-md-0-84bd8b5f5b-b8cnq   azure-example-md-0-bsc82
9m53s   Normal   SuccessfulSetNodeRef   machine/azure-example-md-0-84bd8b5f5b-j8ldg   azure-example-md-0-mjcbn
9m52s   Normal   SuccessfulSetNodeRef   machine/azure-example-md-0-84bd8b5f5b-lx89f   azure-example-md-0-pmq8f
10m   Normal   SuccessfulSetNodeRef   machine/azure-example-md-0-84bd8b5f5b-pcv7q   azure-example-md-0-vzprf
15m   Normal   SuccessfulCreate   machineset/azure-example-md-0-84bd8b5f5b   Created machine "azure-example-md-0-84bd8b5f5b-j8ldg"
15m   Normal   SuccessfulCreate   machineset/azure-example-md-0-84bd8b5f5b   Created machine "azure-example-md-0-84bd8b5f5b-lx89f"
15m   Normal   SuccessfulCreate   machineset/azure-example-md-0-84bd8b5f5b   Created machine "azure-example-md-0-84bd8b5f5b-pcv7q"
15m   Normal   SuccessfulCreate   machineset/azure-example-md-0-84bd8b5f5b   Created machine "azure-example-md-0-84bd8b5f5b-b8cnq"
15m   Normal   Machine controller dependency not yet met   azuremachine/azure-example-md-0-bsc82   Machine Controller has not yet set OwnerRef
15m   Normal   Machine controller dependency not yet met   azuremachine/azure-example-md-0-mjcbn   Machine Controller has not yet set OwnerRef
15m   Normal   Machine controller dependency not yet met   azuremachine/azure-example-md-0-pmq8f   Machine Controller has not yet set OwnerRef
If changing the Calico encapsulation, D2iQ recommends changing it after cluster creation, but before production.
Known Limitations
Be aware of these limitations in the current release of Konvoy.
The Konvoy version used to create a bootstrap cluster must match the Konvoy version used to create a workload cluster.
Konvoy supports deploying one workload cluster.
Konvoy generates a set of objects for one Node Pool.
Konvoy does not validate edits to cluster objects.
Next, you can Explore the New Cluster or Make it Self-managed.