Add snapshot restore validator #1191
base: main
jiaqiluo
left a comment
LGTM with some nits.
Force-pushed from ba25be2 to 88b2c50
jakefhyde
left a comment
1 nit
// parseSnapshotClusterSpec decodes snapshot.SnapshotFile.Metadata into a v1.ClusterSpec.
// The metadata is stored as a nested, gzipped, base64-encoded structure.
func parseSnapshotClusterSpec(snap *rkev1.ETCDSnapshot) (*v1.ClusterSpec, error) {
We copy this from Rancher, correct? I wonder if we could move this somewhere under github.com/rancher/rancher/pkg/apis/provisioning.cattle.io/v1.
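For readers following the thread, the "nested, gzipped, base64-encoded" layout named in the doc comment can be sketched roughly as below. This is an illustration under assumptions, not the code from this PR or from Rancher: `clusterSpec` is a stand-in for `v1.ClusterSpec`, the `specKey` map key is hypothetical, and the exact nesting (a base64-encoded JSON map whose value is base64-encoded gzip of a JSON spec) is assumed. `encodeClusterSpec` exists only so the decoder can be exercised.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"encoding/base64"
	"encoding/json"
	"fmt"
	"io"
)

// clusterSpec is a hypothetical stand-in for v1.ClusterSpec.
type clusterSpec struct {
	KubernetesVersion string `json:"kubernetesVersion,omitempty"`
}

// specKey is an assumed metadata key, chosen for illustration only.
const specKey = "cluster-spec"

// decodeClusterSpec assumes metadata is base64(JSON map) and that the map
// value under specKey is base64(gzip(JSON spec)).
func decodeClusterSpec(metadata string) (*clusterSpec, error) {
	if metadata == "" {
		return nil, fmt.Errorf("snapshot metadata is empty")
	}
	outer, err := base64.StdEncoding.DecodeString(metadata)
	if err != nil {
		return nil, fmt.Errorf("decoding metadata: %w", err)
	}
	var md map[string]string
	if err := json.Unmarshal(outer, &md); err != nil {
		return nil, fmt.Errorf("unmarshalling metadata: %w", err)
	}
	inner, ok := md[specKey]
	if !ok {
		return nil, fmt.Errorf("metadata has no %q entry", specKey)
	}
	compressed, err := base64.StdEncoding.DecodeString(inner)
	if err != nil {
		return nil, fmt.Errorf("decoding cluster spec: %w", err)
	}
	zr, err := gzip.NewReader(bytes.NewReader(compressed))
	if err != nil {
		return nil, fmt.Errorf("opening gzip stream: %w", err)
	}
	defer zr.Close()
	raw, err := io.ReadAll(zr)
	if err != nil {
		return nil, fmt.Errorf("decompressing cluster spec: %w", err)
	}
	spec := &clusterSpec{}
	if err := json.Unmarshal(raw, spec); err != nil {
		return nil, fmt.Errorf("unmarshalling cluster spec: %w", err)
	}
	return spec, nil
}

// encodeClusterSpec is the inverse of decodeClusterSpec, included only so
// the sketch can be round-tripped.
func encodeClusterSpec(spec *clusterSpec) (string, error) {
	raw, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		return "", err
	}
	if err := zw.Close(); err != nil {
		return "", err
	}
	inner := base64.StdEncoding.EncodeToString(buf.Bytes())
	outer, err := json.Marshal(map[string]string{specKey: inner})
	if err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(outer), nil
}
```

Each decoding layer returns its own wrapped error, which is the kind of failure the validator can surface at admission time instead of later in the restore flow.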
Issue: rancher/rancher#52574
Problem
When restoring from an ETCD snapshot, the webhook did not validate the snapshot metadata before accepting `spec.rkeConfig.etcdSnapshotRestore`. It was possible to request `"kubernetesVersion"` or `"all"` for `restoreRKEConfig` even when the referenced snapshot had missing or invalid metadata. This led to restore requests that passed admission but failed later in the restore flow with parse errors.
Solution
This PR adds a validator for `spec.rkeConfig.etcdSnapshotRestore` on `provisioning.cattle.io/v1, Cluster` and wires the RKE client into the webhook `Clients` struct.

The validator:
- Runs only when `etcdSnapshotRestore` changes from empty to a new non-empty value, so it does not block unrelated cluster updates.
- Checks that the snapshot named by `etcdSnapshotRestore.name` exists in the same namespace.
- Checks that `etcdSnapshotRestore.restoreRKEConfig` is one of `"none"`, `"kubernetesVersion"`, or `"all"`.
- For `"kubernetesVersion"`, requires a `kubernetesVersion`, and for `"all"`, requires both `kubernetesVersion` and `rkeConfig`.

In addition:
- In `pkg/server/handlers.go`, the validator was moved to a management-cluster-only list so that validation only runs where snapshot resources exist (local/management cluster). This avoids issues on downstream clusters that do not have the snapshot resources.

Docs are updated to describe the new validation behavior, and unit tests cover the main success and failure paths.
This partially addresses the linked issue by validating snapshot metadata before restore. The annotation-based mode filtering will be handled in a follow-up change.
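The validation rules described in the Solution can be sketched roughly as follows. This is a simplified illustration, not the validator added by this PR: `restoreRequest`, `snapshotSpec`, and the `lookup` callback are hypothetical names, and it assumes that the required `kubernetesVersion` and `rkeConfig` are read from the decoded snapshot metadata and that `"none"` needs no metadata at all.

```go
package main

import "fmt"

// restoreRequest is a hypothetical mirror of the etcdSnapshotRestore fields
// named in the description.
type restoreRequest struct {
	Name             string
	RestoreRKEConfig string
}

// snapshotSpec is a hypothetical view of the decoded snapshot metadata.
type snapshotSpec struct {
	KubernetesVersion string
	HasRKEConfig      bool
}

// validateRestore applies the rules from the description. lookup reports
// whether a snapshot with the given name exists in the cluster's namespace;
// a nil spec with found == true stands for missing or unparseable metadata.
func validateRestore(oldReq, newReq *restoreRequest, lookup func(name string) (*snapshotSpec, bool)) error {
	// Only validate the transition from empty to non-empty, so unrelated
	// cluster updates are never blocked.
	if newReq == nil || oldReq != nil {
		return nil
	}
	spec, found := lookup(newReq.Name)
	if !found {
		return fmt.Errorf("etcd snapshot %q not found", newReq.Name)
	}
	switch newReq.RestoreRKEConfig {
	case "none":
		// Assumed: no metadata is needed when no config is restored.
		return nil
	case "kubernetesVersion":
		if spec == nil || spec.KubernetesVersion == "" {
			return fmt.Errorf("snapshot %q has no kubernetesVersion in its metadata", newReq.Name)
		}
	case "all":
		if spec == nil || spec.KubernetesVersion == "" || !spec.HasRKEConfig {
			return fmt.Errorf("snapshot %q lacks kubernetesVersion or rkeConfig in its metadata", newReq.Name)
		}
	default:
		return fmt.Errorf("invalid restoreRKEConfig %q", newReq.RestoreRKEConfig)
	}
	return nil
}
```

Passing the snapshot lookup as a callback keeps the rule logic separate from the RKE client, which makes the success and failure paths straightforward to unit test with an in-memory map.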
CheckList