In #184, we decided that instead of marking the VolumeAttachment as detached, we would just requeue the volume so that the workqueue processes it again.
However, this doesn't work when the Node is deleted. In that scenario:
- ListVolumes() shows that the volume is no longer attached to the node
- ReconcileVA() sets force sync
- syncAttach() just tries to reattach the volume and fails because the node is gone
- In the k/k AD controller, we try to attach to the new node, but it fails the multi-attach check because the volume is still marked as attached in the actual state of world (asw).
What should happen is:
- ListVolumes() shows that the volume is no longer attached to the node
- We actually set VolumeAttachment.status.attached to false, i.e. mark it as detached
- In the k/k AD controller, VerifyVolumesAttached() sees that the VolumeAttachment is detached and updates the asw
- The AD reconciler allows the new Attach on the new node to proceed.
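
To make the first two steps of that flow concrete, here is a minimal sketch. It is not the external-attacher code: `VolumeAttachment`, `published`, and `reconcile` are hypothetical, pared-down stand-ins, and the map is assumed to hold whatever ListVolumes() reported.

```go
package main

// VolumeAttachment is a hypothetical stand-in for the real API object,
// reduced to the fields this sketch needs.
type VolumeAttachment struct {
	VolumeHandle string
	NodeID       string
	Attached     bool // mirrors VolumeAttachment.status.attached
}

// published maps a volume handle to the node IDs the driver reports it as
// published on (assumed to be built from the ListVolumes() response).
type published map[string][]string

// isPublished reports whether the driver still lists volume on node.
func isPublished(p published, volume, node string) bool {
	for _, n := range p[volume] {
		if n == node {
			return true
		}
	}
	return false
}

// reconcile flips Attached to false for every VolumeAttachment that claims
// to be attached while the driver no longer reports the volume on that node.
// It returns the objects it changed so a caller could persist the new status.
func reconcile(vas []*VolumeAttachment, p published) []*VolumeAttachment {
	var changed []*VolumeAttachment
	for _, va := range vas {
		if va.Attached && !isPublished(p, va.VolumeHandle, va.NodeID) {
			va.Attached = false
			changed = append(changed, va)
		}
	}
	return changed
}
```

In the real controller the status change would have to be written back to the API server; the sketch only shows the decision.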
I'm not sure of the best way to fix step 2. Some suggestions, in order of preference:
- We go back to actually updating the VolumeAttachment in ReconcileVA(), like the original PR did, but we call markAsDetached to make sure we update everything properly.
- We pass some more state to syncVA() so that it can call markAsDetached if csiAttach failed on the force sync.
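
The second suggestion could look roughly like the sketch below. The function names are borrowed from the discussion above, but the signatures, the `forceSync` parameter, the `attachFunc` type, and `errNodeGone` are all assumptions made for illustration, not the actual external-attacher API.

```go
package main

import "errors"

// VolumeAttachment is a hypothetical, simplified stand-in for the API object.
type VolumeAttachment struct {
	Name        string
	Attached    bool
	AttachError string
}

// errNodeGone is an illustrative error for an attach against a deleted node.
var errNodeGone = errors.New("node not found")

// attachFunc stands in for csiAttach so the sketch has no CSI dependency.
type attachFunc func(va *VolumeAttachment) error

// markAsDetached only clears the flag here; the real helper would also have
// to clean up the rest of the status and persist it.
func markAsDetached(va *VolumeAttachment) {
	va.Attached = false
}

// syncVA receives the extra state (forceSync) from the reconciler. When an
// attach triggered by a force sync fails, the VolumeAttachment is marked as
// detached instead of being left as attached.
func syncVA(va *VolumeAttachment, forceSync bool, attach attachFunc) error {
	if va.Attached && !forceSync {
		return nil // already attached and nothing asked for a resync
	}
	if err := attach(va); err != nil {
		va.AttachError = err.Error()
		if forceSync {
			markAsDetached(va)
		}
		return err
	}
	va.Attached = true
	va.AttachError = ""
	return nil
}
```

With something like this, a force sync against a deleted node would end with the VolumeAttachment reported as detached, which is what the AD controller needs to see before it allows the attach on the new node.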