Skip to content

Conversation

@yfuruyama
Copy link

@yfuruyama yfuruyama commented Dec 15, 2025

Fixes: #4561

Details: Changes the guardrail (checkReplicasAvailable) to be called after SetHeaderRoute (and SetMirrorRoute) are executed so that setHeaderRoute step is not skipped when the Stable ReplicaSet is not ready.

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this is a chore.
  • The title of the PR is (a) conventional with a list of types and scopes found here, (b) states what changed, and (c) suffixes the related issues number. E.g. "fix(controller): Updates such and such. Fixes #1234".
  • I've signed my commits with DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My builds are green. Try syncing with master if they are not.
  • My organization is added to USERS.md.

@yfuruyama yfuruyama force-pushed the fix-skip-setheaderroute branch from 9a97406 to 9fedff5 Compare December 15, 2025 03:44
@github-actions
Copy link
Contributor

github-actions bot commented Dec 15, 2025

Published E2E Test Results

  4 files    4 suites   3h 26m 49s ⏱️
117 tests 107 ✅  7 💤 3 ❌
476 runs  440 ✅ 28 💤 8 ❌

For more details on these failures, see this check.

Results for commit 0128082.

♻️ This comment has been updated with latest results.

@github-actions
Copy link
Contributor

github-actions bot commented Dec 15, 2025

Published Unit Test Results

2 369 tests   2 369 ✅  3m 3s ⏱️
  129 suites      0 💤
    1 files        0 ❌

Results for commit 0128082.

♻️ This comment has been updated with latest results.

@codecov
Copy link

codecov bot commented Dec 15, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.36%. Comparing base (7518bde) to head (0128082).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #4562      +/-   ##
==========================================
- Coverage   84.36%   84.36%   -0.01%     
==========================================
  Files         164      164              
  Lines       18841    18841              
==========================================
- Hits        15896    15895       -1     
- Misses       2079     2080       +1     
  Partials      866      866              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@yfuruyama yfuruyama force-pushed the fix-skip-setheaderroute branch from 722b9b9 to 0128082 Compare December 16, 2025 01:54
@sonarqubecloud
Copy link

@zachaller
Copy link
Collaborator

Hm.... I think another possibility is maybe trying to get some validation that the step completed. If we look at the code below I think we essentially always mark the step completed this might be why we skip?

case currentStep.SetHeaderRoute != nil:
    return true
case currentStep.SetMirrorRoute != nil:
    return true

@yfuruyama
Copy link
Author

@zachaller Thank you for taking a look!

If we look at the code below I think we essentially always mark the step completed this might be why we skip?

Yes, that's "always-completed" logic made me confused, but I think that is not the core reason why the step is skipped.

Looking at reconcileTrafficRouting(), there are only 2 cases where reconcileTrafficRouting finishes without error before reaching SetHeaderRoute():

  1. When reconciler is empty:
    // ensure that trafficReconcilers list is healthy
    if len(reconcilers) == 0 {
    c.log.Info("No TrafficRouting Reconcilers found")
    c.newStatus.Canary.Weights = nil
    return nil
    }
  2. When Stable ReplicaSet is not ready (checkReplicasAvailable returned false):
    // check if the stable RS has enough pods before recalculating the new
    // weight status.
    if !c.checkReplicasAvailable(c.stableRS, weightutil.MaxTrafficWeight(c.rollout)-desiredWeight) {
    return nil
    }

The 1st reason was not the case here, and my test in #4561 can reproduce this, so I believe that the checkReplicasAvailable validation skipped the setHeaderRoute step.

@yfuruyama
Copy link
Author

@zachaller Hi Zach, could you take a look at this again when you have a chance? Thanks!

@yfuruyama
Copy link
Author

@zachaller Hi Zach, could you take a look at this again?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SetHeaderTraffic is skipped unexpectedly if Stable ReplicaSet is not ready

2 participants