# Replace an AWS Node

## Prerequisites

Before you begin, you must:

## Replace a Worker Node
In certain situations, you may want to delete a worker node and have Cluster API replace it with a newly provisioned machine.
Identify the name of the node to delete.
List the nodes:
```bash
kubectl --kubeconfig ${CLUSTER_NAME}.conf get nodes
```
The output from this command resembles the following:
```
NAME                                         STATUS   ROLES                  AGE   VERSION
ip-10-0-100-85.us-west-2.compute.internal    Ready    <none>                 16m   v1.25.4
ip-10-0-106-183.us-west-2.compute.internal   Ready    control-plane,master   15m   v1.25.4
ip-10-0-158-104.us-west-2.compute.internal   Ready    control-plane,master   17m   v1.25.4
ip-10-0-203-138.us-west-2.compute.internal   Ready    control-plane,master   16m   v1.25.4
ip-10-0-70-169.us-west-2.compute.internal    Ready    <none>                 16m   v1.25.4
ip-10-0-77-176.us-west-2.compute.internal    Ready    <none>                 16m   v1.25.4
ip-10-0-96-61.us-west-2.compute.internal     Ready    <none>                 16m   v1.25.4
```
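Worker nodes are the rows whose ROLES column shows `<none>`. A quick way to pick them out is an awk filter; the sketch below runs against a subset of the sample output above, and in practice you would pipe the live `kubectl get nodes` output through the same filter:

```shell
# Filter worker node names (ROLES column "<none>") from `kubectl get nodes`
# output. The sample here is a subset of the output shown above.
nodes_output='NAME                                         STATUS   ROLES                  AGE   VERSION
ip-10-0-100-85.us-west-2.compute.internal    Ready    <none>                 16m   v1.25.4
ip-10-0-106-183.us-west-2.compute.internal   Ready    control-plane,master   15m   v1.25.4
ip-10-0-70-169.us-west-2.compute.internal    Ready    <none>                 16m   v1.25.4'
workers=$(printf '%s\n' "$nodes_output" | awk '$3 == "<none>" { print $1 }')
echo "$workers"
```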
Export a variable with the node name to use in the next steps. This example uses the name `ip-10-0-100-85.us-west-2.compute.internal`:

```bash
export NAME_NODE_TO_DELETE="ip-10-0-100-85.us-west-2.compute.internal"
```
Delete the Machine resource:

```bash
export NAME_MACHINE_TO_DELETE=$(kubectl --kubeconfig ${CLUSTER_NAME}.conf get machine -ojsonpath="{.items[?(@.status.nodeRef.name==\"$NAME_NODE_TO_DELETE\")].metadata.name}")

kubectl --kubeconfig ${CLUSTER_NAME}.conf delete machine "$NAME_MACHINE_TO_DELETE"
```

```
machine.cluster.x-k8s.io "aws-example-1-md-0-cb9c9bbf7-t894m" deleted
```
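The JSONPath expression above selects the Machine whose `.status.nodeRef.name` equals the node being deleted. The same lookup can be sketched locally with awk over an illustrative machine-to-node mapping; the pairs below are made up from names used in this example, while the real data comes from the Machine resources in the cluster:

```shell
# Sketch of the nodeRef lookup the JSONPath expression performs: given a
# node name, find the Machine that owns it. The mapping is illustrative only.
NAME_NODE_TO_DELETE="ip-10-0-100-85.us-west-2.compute.internal"
machine_to_node='aws-example-1-md-0-cb9c9bbf7-t894m ip-10-0-100-85.us-west-2.compute.internal
aws-example-1-md-0-cb9c9bbf7-hcl8z ip-10-0-70-169.us-west-2.compute.internal'
NAME_MACHINE_TO_DELETE=$(printf '%s\n' "$machine_to_node" \
  | awk -v node="$NAME_NODE_TO_DELETE" '$2 == node { print $1 }')
echo "$NAME_MACHINE_TO_DELETE"
```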
The command does not return immediately; it returns once the Machine resource has been deleted.
A few minutes after the Machine resource is deleted, the corresponding Node resource is also deleted.
Observe that the Machine resource is being replaced using this command:

```bash
kubectl --kubeconfig ${CLUSTER_NAME}.conf get machinedeployment
```

```
NAME               CLUSTER       REPLICAS   READY   UPDATED   UNAVAILABLE   PHASE       AGE     VERSION
aws-example-md-0   aws-example   4          3       4         1             ScalingUp   7m53s   v1.25.4
```
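In this sample, the UNAVAILABLE column is the difference between REPLICAS and READY. A quick local sanity check over the sample line above (a sketch, not a kubectl feature):

```shell
# Sanity check: in the machinedeployment line above, UNAVAILABLE equals
# REPLICAS (column 3) minus READY (column 4).
line='aws-example-md-0   aws-example   4   3   4   1   ScalingUp   7m53s   v1.25.4'
unavailable=$(printf '%s\n' "$line" | awk '{ print $3 - $4 }')
echo "$unavailable"
```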
In this example, there are four replicas, but only three are ready. One replica is unavailable, and the `ScalingUp` phase means a new Machine is being created.

Identify the replacement Machine using this command:
```bash
export NAME_NEW_MACHINE=$(kubectl --kubeconfig ${CLUSTER_NAME}.conf get machines \
  -l=cluster.x-k8s.io/deployment-name=${CLUSTER_NAME}-md-0 \
  -ojsonpath='{.items[?(@.status.phase=="Provisioning")].metadata.name}{"\n"}')

echo "$NAME_NEW_MACHINE"
```

```
aws-example-md-0-cb9c9bbf7-hcl8z
```
If the output is empty, the new Machine has probably exited the `Provisioning` phase and entered the `Running` phase.

Identify the replacement Node using this command:
```bash
kubectl --kubeconfig ${CLUSTER_NAME}.conf get nodes
```
```
NAME                                         STATUS   ROLES                  AGE   VERSION
ip-10-0-106-183.us-west-2.compute.internal   Ready    control-plane,master   20m   v1.25.4
ip-10-0-158-104.us-west-2.compute.internal   Ready    control-plane,master   23m   v1.25.4
ip-10-0-203-138.us-west-2.compute.internal   Ready    control-plane,master   22m   v1.25.4
ip-10-0-70-169.us-west-2.compute.internal    Ready    <none>                 22m   v1.25.4
ip-10-0-77-176.us-west-2.compute.internal    Ready    <none>                 22m   v1.25.4
ip-10-0-86-58.us-west-2.compute.internal     Ready    <none>                 57s   v1.25.4
ip-10-0-96-61.us-west-2.compute.internal     Ready    <none>                 22m   v1.25.4
```
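The replacement Node is the youngest one (AGE `57s`). Another way to spot it is to diff the node names from before and after the deletion; the sketch below uses the worker-node names from the two sample outputs on this page:

```shell
# Find the replacement node by diffing the sorted node-name lists from
# before and after the deletion (names taken from the sample outputs above).
printf '%s\n' \
  ip-10-0-100-85.us-west-2.compute.internal \
  ip-10-0-70-169.us-west-2.compute.internal \
  ip-10-0-77-176.us-west-2.compute.internal \
  ip-10-0-96-61.us-west-2.compute.internal | sort > nodes_before.txt
printf '%s\n' \
  ip-10-0-70-169.us-west-2.compute.internal \
  ip-10-0-77-176.us-west-2.compute.internal \
  ip-10-0-86-58.us-west-2.compute.internal \
  ip-10-0-96-61.us-west-2.compute.internal | sort > nodes_after.txt
# comm -13 prints lines unique to the second file: the newly added node.
new_node=$(comm -13 nodes_before.txt nodes_after.txt)
echo "$new_node"
```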
If the replacement Node does not appear in the output, it is not yet available, or does not yet have the expected annotation. Wait a few minutes, then repeat the command.
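The wait-and-repeat guidance can be automated with a small retry loop. `retry_until_output` below is a hypothetical helper, demonstrated with `echo` as a stand-in; in practice the command passed to it would be the `kubectl` query above:

```shell
# Hypothetical helper: run a command repeatedly until it prints non-empty
# output, then forward that output. Automates "wait, then repeat the command".
retry_until_output() {
  attempts=$1; shift
  i=0
  while [ "$i" -lt "$attempts" ]; do
    out=$("$@")
    if [ -n "$out" ]; then
      printf '%s\n' "$out"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Demo with echo standing in for the real kubectl command.
retry_until_output 3 echo "ip-10-0-86-58.us-west-2.compute.internal"
```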