-
Notifications
You must be signed in to change notification settings - Fork 508
fix: don't try to clean up pvs on nodes that are gone #480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
69c6989 to
1e1388b
Compare
|
This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days. |
|
This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days. |
|
This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days. |
provisioner.go
Outdated
| if _, err := p.kubeClient.CoreV1().Nodes().Get(context.TODO(), node, metav1.GetOptions{}); err != nil { | ||
| logrus.Infof("Node %v does not exist, skipping cleanup of volume %v", node, pv.Name) | ||
| return nil | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@marcusramberg WDYT?
| if _, err := p.kubeClient.CoreV1().Nodes().Get(context.TODO(), node, metav1.GetOptions{}); err != nil { | |
| logrus.Infof("Node %v does not exist, skipping cleanup of volume %v", node, pv.Name) | |
| return nil | |
| } | |
| if _, err := p.kubeClient.CoreV1().Nodes().Get(context.TODO(), node, metav1.GetOptions{}); err != nil && apierrors.IsNotFound(err) { | |
| logrus.Infof("Node %v does not exist, skipping cleanup of volume %v", node, pv.Name) | |
| return nil | |
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That seems reasonable to me, I'll update the PR. It's imported in there as k8serror tho.
1e1388b to
bdf05c2
Compare
derekbit
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM @marcusramberg. Thanks for your contribution. The improvement will be in v0.0.33 that is scheduled in Oct
We're running local-provisioner to provide local storage for CI runners where nodes come and go pretty frequently.
We observe that the provisioner is trying run clean up on nodes that are already gone, which causes helper pods
to be stuck in pending state as they cannot be scheduled.
This PR adds a check to see if the node still exists before trying to clean up the node.