KCP should not consider control plane initialized if there is only a machine being deleted. #12870

@fabriziopandini

Description

What steps did you take and what happened?

KCP supports remediating a single CP machine only until the control plane is considered initialized.

A control plane is considered initialized when KCP can connect to the workload cluster and detects that the kubeadm config has been created, which is a proxy signal that kubeadm init has completed.

However, in some edge cases, when users set an aggressive nodeStartupTimeout and the infrastructure is slow for any reason, deletion of the first CP machine might be triggered, and kubeadm init might then complete in the short window between when machine deletion is triggered and when the machine actually goes away.

This leads to an inconsistent state where the cluster is considered initialized, no CP machine exists, and the replacement CP machine fails when it tries to join.

What did you expect to happen?

KCP should not consider control plane initialized if there is only a machine being deleted.

Cluster API version

main

Kubernetes version

No response

Anything else you would like to add?

No response

Label(s) to be applied

/kind bug
One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels.

Metadata

Labels

area/provider/control-plane-kubeadm: Issues or PRs related to KCP
kind/bug: Categorizes issue or PR as related to a bug.
priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.
