In Creating your first cluster with Cluster API, we demonstrated how to create a cluster using a standard declarative configuration. Now we will show how to leverage this declarative configuration to update cluster topology with ease.
Table of Contents
- What is a Cluster Topology?
- Add a new pool of worker nodes
- Remove a worker node
- Scale out control plane nodes
- More scale operations
- Next: MachineHealthChecks and Remediation
- More information
We define a cluster topology as the set of configurations that describe your cluster: for example, the number of control plane and worker nodes, the type of Machine hardware that underlies those nodes, and the regional distribution of nodes or node pools. You may also see this described as "cluster shape" elsewhere.
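Concretely, a managed topology lives under spec.topology in the Cluster resource. The sketch below shows the relevant v1beta1 fields; the names and replica counts are illustrative placeholders rather than the exact spec used in this tutorial:
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: example-cluster             # illustrative name
spec:
  topology:
    class: example-clusterclass     # the ClusterClass this Cluster is stamped from
    version: v1.24.6                # Kubernetes version applied to the whole cluster
    controlPlane:
      replicas: 1                   # number of control plane nodes
    workers:
      machineDeployments:
        - class: default-worker     # worker "recipe" defined in the ClusterClass
          name: md-0                # name of this worker pool
          replicas: 1               # number of worker nodes in this pool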
Now we can demonstrate how easily you can leverage the flexibility of Cluster API to add and remove pools of nodes. We'll re-use the existing default-worker MachineDeployment class; in other words, we'll define a discrete, new node set based on a pre-existing, common worker machine recipe.
For reference, this is the declarative block that defines the default-worker MachineDeployment class in the ClusterClass spec we originally installed:
workers:
  machineDeployments:
    - class: default-worker
      template:
        bootstrap:
          ref:
            apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
            kind: KubeadmConfigTemplate
            name: quick-start-default-worker-bootstraptemplate
        infrastructure:
          ref:
            apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
            kind: DockerMachineTemplate
            name: quick-start-default-worker-machinetemplate
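If you want to confirm which classes your management cluster knows about, you can also list the installed ClusterClass objects directly; the exact ClusterClass name depends on what you installed earlier, so the second command uses a placeholder:
kubectl get clusterclasses
kubectl describe clusterclass <your-clusterclass-name>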
We'll reference the above default-worker class in an updated Cluster configuration that declares a second worker pool.
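To make the shape of that change concrete, here is a sketch of what the workers block of the updated Cluster spec declares with both pools present (indentation abbreviated and replica counts illustrative):
workers:
  machineDeployments:
    - class: default-worker   # the original worker pool
      name: md-0
      replicas: 1
    - class: default-worker   # the new, second worker pool
      name: md-1
      replicas: 1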
The difference between this spec and the original spec used to create the docker-cluster-one Cluster:
diff yamls/clusters/1-docker-cluster-one.yaml yamls/clusters/3-docker-cluster-one-second-worker-pool.yaml
Output:
35a36,38
>       - class: default-worker # Adding a 2nd worker pool
>         name: md-1
>         replicas: 1
The above shows that we've added a new MachineDeployment called md-1 (our existing MachineDeployment is named md-0) to our new spec. By applying this modified spec to our kind management cluster we will initiate the creation of a new node from this new node pool:
kubectl apply -f yamls/clusters/3-docker-cluster-one-second-worker-pool.yaml
Output:
cluster.cluster.x-k8s.io/docker-cluster-one configured
Let's watch those new nodes come online:
kubectl --kubeconfig cluster-one.kubeconfig get nodes -w
Output:
NAME STATUS ROLES AGE VERSION
docker-cluster-one-md-0-nrh7k-75ddc4778f-wwpg9 Ready <none> 25h v1.24.6
docker-cluster-one-xbqb2-bcjvj Ready control-plane 26h v1.24.6
docker-cluster-one-md-1-ksqwt-8489684d5b-fpgjs NotReady <none> 0s v1.24.6
docker-cluster-one-md-1-ksqwt-8489684d5b-fpgjs NotReady <none> 46s v1.24.6
docker-cluster-one-md-1-ksqwt-8489684d5b-fpgjs NotReady <none> 46s v1.24.6
docker-cluster-one-md-1-ksqwt-8489684d5b-fpgjs Ready <none> 76s v1.24.6
clusterctl describe now shows two MachineDeployments:
clusterctl describe cluster docker-cluster-one
Output:
NAME READY SEVERITY REASON SINCE MESSAGE
Cluster/docker-cluster-one True 11m
├─ClusterInfrastructure - DockerCluster/docker-cluster-one-4ffkk True 12m
├─ControlPlane - KubeadmControlPlane/docker-cluster-one-hfrd8 True 11m
│ └─Machine/docker-cluster-one-hfrd8-4r5mp True 11m
└─Workers
├─MachineDeployment/docker-cluster-one-md-0-mbbfs True 10m
│ └─Machine/docker-cluster-one-md-0-mbbfs-dbb74566d-pvww6 True 10m
└─MachineDeployment/docker-cluster-one-md-1-s2bdz True 23s
└─Machine/docker-cluster-one-md-1-s2bdz-7475cfb889-7jnf9 True 28s
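If you prefer plain kubectl over clusterctl, the MachineDeployment objects can also be listed directly on the management cluster (the exact columns shown depend on your Cluster API version):
kubectl get machinedeployments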
We can now demonstrate how easy it is to "roll back" such a change, as well as show how to remove an existing node pool from your Cluster configuration. In our case it's as easy as reapplying the original Cluster spec, which only declares one md-0 node pool:
kubectl apply -f yamls/clusters/1-docker-cluster-one.yaml
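While the md-1 pool is being torn down, you can optionally watch its Machine objects disappear from the management cluster's point of view:
kubectl get machines -w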
Once again, we should observe only one running worker node, in the md-0 pool:
kubectl --kubeconfig cluster-one.kubeconfig get nodes
Output:
NAME STATUS ROLES AGE VERSION
docker-cluster-one-md-0-lq4f8-b59497b9d-xchpm Ready <none> 99m v1.24.6
docker-cluster-one-mvthd-k7dwf Ready control-plane 174m v1.24.6
Note: If you're running this tutorial on Windows, or your system is running below the minimum resource requirements, it's highly recommended that you move on to the next section: MachineHealthChecks and Remediation. If you run into trouble at any point, please consult the troubleshooting guide.
A common Kubernetes cluster maintenance activity is scaling out (or in) the number of control plane nodes in response to cluster activity. Because Cluster API configuration interfaces are themselves Kubernetes resources, there are many ways to do this. We'll demonstrate a few of them.
Because we are leveraging the Kubernetes declarative model, we can simply submit a desired configuration specification and rely upon Cluster API to reconcile the difference between that desired state and the cluster's current state.
Assuming that your docker-cluster-one Cluster still has its original configuration, you should have one control plane node:
kubectl --kubeconfig cluster-one.kubeconfig get nodes
Output:
NAME STATUS ROLES AGE VERSION
docker-cluster-one-md-0-nrh7k-75ddc4778f-vlph9 Ready <none> 24m v1.24.6
docker-cluster-one-xbqb2-tlmpb Ready control-plane 24m v1.24.6
Let's use the idempotent model to submit a modified configuration of our docker-cluster-one Cluster with 3 control plane replicas instead of 1. We've provided a reference yaml of this updated configuration in this repo.
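Optionally, kubectl itself can report what would change on the server before you apply anything; this is purely a convenience and not required for the tutorial:
kubectl diff -f yamls/clusters/3-docker-cluster-one-3-control-plane-replicas.yaml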
If you're on macOS or Linux you can diff this modified spec from the original spec used to create the docker-cluster-one Cluster:
diff yamls/clusters/1-docker-cluster-one.yaml yamls/clusters/3-docker-cluster-one-3-control-plane-replicas.yaml
Output:
17c17
<     replicas: 1
---
>     replicas: 3 # Replicas changed from 1 to 3
The above shows that the yaml specs are almost identical, with the only change being the replicas value on line 17. By applying that modified spec to our kind management cluster we can achieve a control plane node scale out operation:
kubectl apply -f yamls/clusters/3-docker-cluster-one-3-control-plane-replicas.yaml
Output:
cluster.cluster.x-k8s.io/docker-cluster-one configured
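While the scale out proceeds, the KubeadmControlPlane object on the management cluster reports rollout progress; listing it is optional, and the column names vary by Cluster API version:
kubectl get kubeadmcontrolplane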
Now we can watch the new control plane nodes come online:
kubectl --kubeconfig cluster-one.kubeconfig get nodes -w
An interesting note! As your cluster transitions from 1 to 2 control plane nodes, it will temporarily lose etcd quorum and the apiserver running on your cluster will briefly go offline, so your kubectl -w command will be interrupted. Re-run the same command to resume watching your cluster nodes:
kubectl --kubeconfig cluster-one.kubeconfig get nodes -w
Output:
NAME STATUS ROLES AGE VERSION
docker-cluster-one-md-0-nrh7k-75ddc4778f-vlph9 Ready <none> 86m v1.24.6
docker-cluster-one-xbqb2-tlmpb Ready control-plane 86m v1.24.6
docker-cluster-one-xbqb2-8wcbj NotReady <none> 21s v1.24.6
docker-cluster-one-xbqb2-8wcbj NotReady <none> 21s v1.24.6
docker-cluster-one-xbqb2-8wcbj NotReady control-plane 21s v1.24.6
docker-cluster-one-xbqb2-8wcbj NotReady control-plane 23s v1.24.6
docker-cluster-one-xbqb2-8wcbj NotReady control-plane 23s v1.24.6
docker-cluster-one-xbqb2-8wcbj NotReady control-plane 30s v1.24.6
docker-cluster-one-xbqb2-8wcbj NotReady control-plane 30s v1.24.6
docker-cluster-one-xbqb2-8wcbj NotReady control-plane 30s v1.24.6
docker-cluster-one-xbqb2-8wcbj Ready control-plane 31s v1.24.6
docker-cluster-one-xbqb2-8wcbj Ready control-plane 48s v1.24.6
docker-cluster-one-xbqb2-8wcbj Ready control-plane 48s v1.24.6
docker-cluster-one-xbqb2-8wcbj Ready control-plane 53s v1.24.6
docker-cluster-one-xbqb2-sjlwm NotReady <none> 0s v1.24.6
docker-cluster-one-xbqb2-sjlwm NotReady <none> 0s v1.24.6
docker-cluster-one-xbqb2-sjlwm NotReady <none> 0s v1.24.6
docker-cluster-one-xbqb2-sjlwm NotReady <none> 0s v1.24.6
docker-cluster-one-xbqb2-sjlwm NotReady <none> 2s v1.24.6
docker-cluster-one-xbqb2-sjlwm NotReady <none> 2s v1.24.6
docker-cluster-one-xbqb2-sjlwm NotReady <none> 7s v1.24.6
docker-cluster-one-xbqb2-sjlwm NotReady <none> 7s v1.24.6
docker-cluster-one-xbqb2-sjlwm NotReady <none> 7s v1.24.6
docker-cluster-one-xbqb2-sjlwm Ready <none> 10s v1.24.6
docker-cluster-one-xbqb2-sjlwm Ready <none> 10s v1.24.6
docker-cluster-one-xbqb2-sjlwm Ready <none> 12s v1.24.6
For such a simple cluster topology change against a single configuration, it's also possible to update our Cluster resource in place. Let's use kubectl edit to do that.
kubectl edit cluster/docker-cluster-one
The above command will open your locally configured editor (for example, most macOS and Linux environments will be configured to launch vim). Look for the yaml configuration at the path spec.topology.controlPlane.replicas. Based on our prior scale out, the value should be 3. Go ahead and change it back to 1, and then save the changes in your editor:
Output after editing, saving and exiting your editor:
cluster.cluster.x-k8s.io/docker-cluster-one edited
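If kubectl edit drops you into an editor you'd rather not use, you can point it at another one via the standard KUBE_EDITOR environment variable, for example:
KUBE_EDITOR=nano kubectl edit cluster/docker-cluster-one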
After a few minutes we should see the cluster back to reporting 1 control plane node (note that we're pointing kubectl to our workload cluster below using the previously saved cluster-one.kubeconfig kubeconfig file):
kubectl --kubeconfig cluster-one.kubeconfig get nodes
Output:
NAME STATUS ROLES AGE VERSION
docker-cluster-one-md-0-nrh7k-75ddc4778f-vlph9 Ready <none> 118m v1.24.6
docker-cluster-one-xbqb2-8wcbj Ready control-plane 31m v1.24.6
We can use the same approach to scale worker nodes as well. This time we want to edit the configuration at the path spec.topology.workers.machineDeployments. We should only have one item in that array; change its replicas value from 1 to 3:
kubectl edit cluster/docker-cluster-one
Output after editing, saving, and exiting your editor:
cluster.cluster.x-k8s.io/docker-cluster-one edited
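For reference, the edited portion of the Cluster spec ends up looking roughly like this (a sketch of the relevant v1beta1 fields rather than your exact manifest):
spec:
  topology:
    workers:
      machineDeployments:
        - class: default-worker
          name: md-0
          replicas: 3   # changed from 1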
If you updated replicas from 1 to 3, you will see those 3 worker nodes gradually come online after the change to our Cluster's spec.topology.workers.machineDeployments replicas value:
kubectl --kubeconfig cluster-one.kubeconfig get nodes
Output:
NAME STATUS ROLES AGE VERSION
docker-cluster-one-md-0-nrh7k-75ddc4778f-lzfr4 Ready <none> 7m44s v1.24.6
docker-cluster-one-md-0-nrh7k-75ddc4778f-t28hd Ready <none> 7m40s v1.24.6
docker-cluster-one-md-0-nrh7k-75ddc4778f-vlph9 Ready <none> 131m v1.24.6
docker-cluster-one-xbqb2-8wcbj Ready control-plane 44m v1.24.6
Note: Because we're executing the above topology changes using the Docker provider, your local system may not have sufficient resources to run 4 nodes. If so, you can skip the control-plane scaling part and reduce the number of worker nodes from 3 to 2.
Feel free to continue experimenting with scaling. When you're done and ready to move forward, let's go back to our original topology configuration of 1 control plane node and 1 worker node. It's easy to do that by simply reapplying our original Cluster spec, which declares that configuration:
kubectl apply -f yamls/clusters/1-docker-cluster-one.yaml
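Once reconciliation settles, the node listing we've used throughout should again show just one control plane node and one worker node:
kubectl --kubeconfig cluster-one.kubeconfig get nodes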
You are now in control of your Cluster's topology configuration! Let's next explore MachineHealthChecks and Remediation for operational self-healing.
- To learn more about managing Kubernetes clusters using a Cluster Topology, see the topology section of the CAPI book.