@loktev-d loktev-d commented Sep 30, 2025

Description

Fix VirtualDisk remaining in the WaitForFirstConsumer phase even after a VM has been attached and provisioning has started.

Why do we need it, and what problem does it solve?

When using a WFFC (WaitForFirstConsumer) storage class with volume populators:

  1. The VD transitions to WaitForFirstConsumer and waits for a VM.
  2. A VM is created and attached to the VD.
  3. Volume provisioning starts (the importer pod is running).
  4. Issue: the VD controller keeps setting the phase to WaitForFirstConsumer because the DataVolume is in the PendingPopulation state, even though the "first consumer" (the VM) already exists.

This creates the perception of a hang: users see the VD stuck in WFFC for minutes while provisioning is actually running.
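
The difference between the behaviour described in step 4 and the intended behaviour can be sketched as two predicates. This is a minimal illustration, not the controller code: `reportsWFFCBefore`, `reportsWFFCAfter`, and the plain-string phase are hypothetical stand-ins, and only the `PendingPopulation` state and the attached-VM check come from the description above.

```go
package main

import "fmt"

// dvPendingPopulation is the DataVolume state named in the description.
const dvPendingPopulation = "PendingPopulation"

// reportsWFFCBefore (hypothetical) models the old behaviour: the disk is
// reported as WaitForFirstConsumer whenever the DataVolume is still pending
// population, regardless of whether a VM is already attached.
func reportsWFFCBefore(dvPhase string) bool {
	return dvPhase == dvPendingPopulation
}

// reportsWFFCAfter (hypothetical) models the fixed behaviour: the disk is
// reported as WaitForFirstConsumer only while no VM is attached.
func reportsWFFCAfter(dvPhase string, attachedVMs int) bool {
	return dvPhase == dvPendingPopulation && attachedVMs == 0
}

func main() {
	// Step 4: a VM is attached, but the DataVolume is still PendingPopulation.
	fmt.Println(reportsWFFCBefore(dvPendingPopulation))   // true
	fmt.Println(reportsWFFCAfter(dvPendingPopulation, 1)) // false
	// Step 1: no VM yet, so waiting for the first consumer is still correct.
	fmt.Println(reportsWFFCAfter(dvPendingPopulation, 0)) // true
}
```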

What is the expected result?

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

Changelog entries

section: vd
type: fix
summary: VirtualDisk no longer gets stuck in the WaitForFirstConsumer phase after VM attachment.

@loktev-d loktev-d added this to the v1.2.0 milestone Sep 30, 2025

sourcery-ai bot commented Sep 30, 2025

Reviewer's Guide

Updates WFFC handling in the VirtualDisk controller: the logic now fetches the StorageClass binding mode dynamically and no longer keeps disks with attached VMs in the WaitForFirstConsumer phase. Tests are adjusted accordingly.

Sequence diagram for VirtualDisk phase update after VM attachment

sequenceDiagram
    participant VDController
    participant VirtualDisk
    participant StorageClass
    participant VM
    participant DataVolume

    VDController->>VirtualDisk: Check Status.Phase
    VDController->>StorageClass: Fetch by StorageClassName
    StorageClass-->>VDController: Return VolumeBindingMode
    VDController->>VirtualDisk: Check AttachedToVirtualMachines
    alt VolumeBindingMode is WaitForFirstConsumer and no VM attached
        VDController->>VirtualDisk: Set Phase to WaitForFirstConsumer
    else VM is attached
        VDController->>VirtualDisk: Do not set Phase to WaitForFirstConsumer
    end
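
The branch in the sequence diagram can be sketched in Go roughly as follows. The types here are simplified local stand-ins rather than the real `storage/v1` and `v1alpha2` types, the map replaces the client lookup, and `shouldSetWFFC` is a hypothetical name; only the field names and the condition itself are taken from the diagram.

```go
package main

import "fmt"

const waitForFirstConsumer = "WaitForFirstConsumer"

// StorageClass is a simplified stand-in. VolumeBindingMode is a pointer
// because the field may be unset on a real StorageClass.
type StorageClass struct {
	Name              string
	VolumeBindingMode *string
}

// VirtualDiskStatus carries only the status fields used by the diagram.
type VirtualDiskStatus struct {
	StorageClassName          string
	AttachedToVirtualMachines []string
}

// shouldSetWFFC (hypothetical) returns true only when the storage class uses
// WaitForFirstConsumer binding and no VM is attached to the disk.
func shouldSetWFFC(status VirtualDiskStatus, classes map[string]StorageClass) bool {
	sc, found := classes[status.StorageClassName]
	if !found || sc.VolumeBindingMode == nil {
		return false
	}
	return *sc.VolumeBindingMode == waitForFirstConsumer &&
		len(status.AttachedToVirtualMachines) == 0
}

func main() {
	mode := waitForFirstConsumer
	classes := map[string]StorageClass{
		"local-wffc": {Name: "local-wffc", VolumeBindingMode: &mode},
	}

	unattached := VirtualDiskStatus{StorageClassName: "local-wffc"}
	attached := VirtualDiskStatus{
		StorageClassName:          "local-wffc",
		AttachedToVirtualMachines: []string{"vm-a"},
	}

	fmt.Println(shouldSetWFFC(unattached, classes)) // true
	fmt.Println(shouldSetWFFC(attached, classes))   // false
}
```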

Class diagram for updated VirtualDisk WaitForFirstConsumer logic

classDiagram
    class BlockDeviceHandler {
        +checkVirtualDisksToBeWFFC(ctx, s)
        client
    }
    class VirtualDisk {
        Status
        Status.Phase
        Status.StorageClassName
        Status.Conditions
        Status.AttachedToVirtualMachines
    }
    class StorageClass {
        VolumeBindingMode
    }
    BlockDeviceHandler --> VirtualDisk : checks
    BlockDeviceHandler --> StorageClass : fetches
    VirtualDisk --> StorageClass : references by StorageClassName

    class WaitForDVStep {
        +setForFirstConsumerIsAwaited(ctx, vd)
        dv
        cb
    }
    WaitForDVStep --> VirtualDisk : updates phase
    WaitForDVStep --> StorageClass : checks binding mode

    class DataVolume {
        Status.Phase
    }
    WaitForDVStep --> DataVolume : checks phase

File-Level Changes

Change: Prevent VirtualDisk from remaining in WaitForFirstConsumer when a VM is attached
Details:
  • Check the AttachedToVirtualMachines count before setting the WFFC phase
  • Set Phase=DiskWaitForFirstConsumer only when no VMs are attached
Files:
  • images/virtualization-artifact/pkg/controller/vd/internal/source/step/wait_for_dv_step.go

Change: Enhance the handler to fetch the StorageClass and use its binding mode
Details:
  • Import the storage/v1 API and the types package
  • Fetch the StorageClass by name from the client in block_device_condition
  • Check VolumeBindingMode and the disk's Ready condition status
Files:
  • images/virtualization-artifact/pkg/controller/vm/internal/block_device_condition.go

Change: Update tests to simulate WFFC storage class scenarios
Details:
  • Import storage/v1 in block_devices_test.go
  • Set StorageClassName on VirtualDiskStatus in tests
  • Add a getWFFCStorageClass helper to create a WFFC StorageClass
  • Include the StorageClass object in the fake client setup
Files:
  • images/virtualization-artifact/pkg/controller/vm/internal/block_devices_test.go

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!



@loktev-d loktev-d marked this pull request as draft September 30, 2025 17:15
Signed-off-by: Daniil Loktev <[email protected]>
@loktev-d loktev-d force-pushed the fix/vd/wffc-stuck-in-waiting-phase branch from 7cb3b44 to 1909f13 Compare October 3, 2025 13:25
loktev-d and others added 4 commits October 3, 2025 16:28
Signed-off-by: Daniil Loktev <[email protected]>
Signed-off-by: Daniil Loktev <[email protected]>
Signed-off-by: Daniil Loktev <[email protected]>
@loktev-d loktev-d added the e2e/run Run e2e test on cluster of PR author label Oct 3, 2025

deckhouse-BOaTswain commented Oct 3, 2025

Workflow has started.
Follow the progress here: Workflow Run

The target step completed with status: failure.

@deckhouse-BOaTswain deckhouse-BOaTswain removed the e2e/run Run e2e test on cluster of PR author label Oct 3, 2025
@loktev-d loktev-d marked this pull request as ready for review October 6, 2025 08:23

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes and they look great!

Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location> `images/virtualization-artifact/pkg/controller/vm/internal/block_device_condition.go:60-66` </location>
<code_context>
 	for _, vd := range vds {
-		if vd.Status.Phase == v1alpha2.DiskWaitForFirstConsumer {
-			return true, nil
+		scName := vd.Status.StorageClassName
+		sc, err := object.FetchObject(ctx, types.NamespacedName{Name: scName}, h.client, &storagev1.StorageClass{})
+		if err != nil {
+			return false, fmt.Errorf("fetch storage class %s: %w", scName, err)
+		}
+
+		if sc != nil && sc.VolumeBindingMode != nil && *sc.VolumeBindingMode == storagev1.VolumeBindingWaitForFirstConsumer {
+			readyCondition, _ := conditions.GetCondition(vdcondition.ReadyType, vd.Status.Conditions)
+			if readyCondition.Status != metav1.ConditionTrue {
</code_context>

<issue_to_address>
**suggestion:** Consider handling missing or empty StorageClassName more explicitly.

If vd.Status.StorageClassName is empty, FetchObject will try to fetch a storage class with an empty name, which could cause errors or unnecessary log entries. Consider adding a check to handle this case before calling FetchObject.
</issue_to_address>
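
One possible shape for the guard suggested in Comment 1 is sketched below. It is an illustration under assumptions, not the project's code: `classFetcher` and `bindingModeFor` are hypothetical names, and the fetch function stands in for the `object.FetchObject` call shown in the diff. The point is only that an empty name returns early and never reaches the client.

```go
package main

import (
	"errors"
	"fmt"
)

// classFetcher is a hypothetical stand-in for the client lookup in the diff.
// It returns the binding mode of the named storage class.
type classFetcher func(name string) (bindingMode string, err error)

// bindingModeFor (hypothetical) skips the lookup when the disk has no
// storage class name recorded; ok is false in that case and err is nil.
func bindingModeFor(scName string, fetch classFetcher) (mode string, ok bool, err error) {
	if scName == "" {
		return "", false, nil
	}
	mode, err = fetch(scName)
	if err != nil {
		return "", false, fmt.Errorf("fetch storage class %s: %w", scName, err)
	}
	return mode, true, nil
}

func main() {
	calls := 0
	fetch := func(name string) (string, error) {
		calls++
		if name == "broken" {
			return "", errors.New("connection refused")
		}
		return "WaitForFirstConsumer", nil
	}

	_, ok, err := bindingModeFor("", fetch)
	fmt.Println(ok, err, calls) // false <nil> 0

	mode, ok, err := bindingModeFor("local", fetch)
	fmt.Println(mode, ok, err, calls) // WaitForFirstConsumer true <nil> 1

	_, _, err = bindingModeFor("broken", fetch)
	fmt.Println(err) // fetch storage class broken: connection refused
}
```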

### Comment 2
<location> `images/virtualization-artifact/pkg/controller/vm/internal/block_device_condition.go:68` </location>
<code_context>
+		readyCondition, _ := conditions.GetCondition(vdcondition.ReadyType, vd.Status.Conditions)
</code_context>

<issue_to_address>
**issue (bug_risk):** Check for nil readyCondition before accessing Status.

Add a nil check for readyCondition before accessing its Status to prevent a potential panic.
</issue_to_address>
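
The two-value call in the diff suggests the lookup helper returns a condition plus a second value that is currently discarded with `_`. Assuming that second value is a "found" flag (the helper's signature is not shown here, so this is an assumption), the check in Comment 2 could use it instead of a nil comparison. `Condition`, `getCondition`, and `isReady` below are hypothetical local stand-ins.

```go
package main

import "fmt"

// Condition is a simplified stand-in for a status condition.
type Condition struct {
	Type   string
	Status string
}

// getCondition (hypothetical) returns the condition of the given type and
// whether it was present in the list.
func getCondition(condType string, conds []Condition) (Condition, bool) {
	for _, c := range conds {
		if c.Type == condType {
			return c, true
		}
	}
	return Condition{}, false
}

// isReady treats a missing Ready condition as "not ready" explicitly,
// instead of relying on the zero value of the returned condition.
func isReady(conds []Condition) bool {
	ready, found := getCondition("Ready", conds)
	return found && ready.Status == "True"
}

func main() {
	fmt.Println(isReady(nil))                                          // false
	fmt.Println(isReady([]Condition{{Type: "Ready", Status: "True"}})) // true
}
```

With a value return, a missing condition yields an empty `Status` and cannot panic; with a pointer return, the same `found` guard would avoid the nil dereference.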


@LopatinDmitr LopatinDmitr self-requested a review October 8, 2025 09:24
 	for _, vd := range vds {
-		if vd.Status.Phase == v1alpha2.DiskWaitForFirstConsumer {
-			return true, nil
+		scName := vd.Status.StorageClassName

Why do we check the disk's StorageClass type in the VM controller? Why can't we trust the VD phase?

3 participants