Prevent infinite retries of autoscaling #9574

Draft
wants to merge 1 commit into base: 4.19

Pearl1594
Contributor

Description

This PR fixes: #9318

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)
  • build/CI
  • test (unit or integration test code)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

How did you try to break this feature and the system with this change?

@Pearl1594
Contributor Author

@weizhouapache I would like some advice on this issue. If the total number of VMs, including those in Error and Stopped states, is >= max size, do you think we should stop scaling any further? Or, if there are VMs in Error state, should we retry for a few iterations?
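
To make the first option in the question above concrete, here is a minimal sketch of how the set of counted states changes the scale-up decision. It is not CloudStack code: `ScaleUpCheckSketch`, `VmState`, `countVms`, and `shouldScaleUp` are hypothetical names, and the assumption that scale-up simply compares the counted VMs against max size is a simplification for illustration.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

final class ScaleUpCheckSketch {
    // Hypothetical stand-in for the VM states discussed in this PR.
    enum VmState { STARTING, RUNNING, STOPPING, MIGRATING, ERROR, STOPPED }

    // States counted before this PR (assumed from the diff in this PR).
    static final Set<VmState> ACTIVE_ONLY = EnumSet.of(
            VmState.STARTING, VmState.RUNNING, VmState.STOPPING, VmState.MIGRATING);

    // States counted after this PR: Error and Stopped are included as well.
    static final Set<VmState> INCLUDING_FAILED = EnumSet.allOf(VmState.class);

    // Count the VMs of a group whose state is in the counted set.
    static long countVms(List<VmState> groupVms, Set<VmState> counted) {
        return groupVms.stream().filter(counted::contains).count();
    }

    // Assumed rule: scale up only while the counted VMs are below max size.
    static boolean shouldScaleUp(List<VmState> groupVms, Set<VmState> counted, int maxSize) {
        return countVms(groupVms, counted) < maxSize;
    }
}
```

With one Running VM, two VMs in Error, and a max size of 3, the active-only count keeps asking for new VMs, while the count that includes Error and Stopped stops at the limit.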

codecov bot commented Aug 22, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 4.30%. Comparing base (6e6a276) to head (463d2a6).
Report is 5 commits behind head on 4.19.

❗ There is a different number of reports uploaded between BASE (6e6a276) and HEAD (463d2a6).

HEAD has 1 upload less than BASE
Flag        BASE (6e6a276)   HEAD (463d2a6)
unittests   1                0
Additional details and impacted files
@@             Coverage Diff              @@
##               4.19   #9574       +/-   ##
============================================
- Coverage     15.08%   4.30%   -10.79%     
============================================
  Files          5406     366     -5040     
  Lines        472889   29514   -443375     
  Branches      57738    5162    -52576     
============================================
- Hits          71352    1270    -70082     
+ Misses       393593   28100   -365493     
+ Partials       7944     144     -7800     
Flag        Coverage Δ
uitests     4.30% <ø> (ø)
unittests   ?

@@ -67,7 +67,7 @@ public int countAvailableVmsByGroup(long vmGroupId) {
         SearchCriteria<Integer> sc = CountBy.create();
         sc.setParameters("vmGroupId", vmGroupId);
         sc.setJoinParameters("vmSearch", "states",
-                State.Starting, State.Running, State.Stopping, State.Migrating);
+                State.Starting, State.Running, State.Stopping, State.Migrating, State.Error, State.Stopped);
Contributor

This may cause the autoscaler not to retry a deployment if any deployment goes into Error state intermittently. Should we include the Error state only after some n retries?

Contributor Author

Yeah, that was my worry too; I wanted some input on it. I'll do that. Thanks @shwstppr
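
One way the "include Error state after some n retries" idea could look is sketched below. This is only an illustration under stated assumptions, not what this PR implements: `ErrorRetryBudgetSketch`, `shouldScaleUp`, `failedAttempts`, and `maxRetries` are hypothetical, and how failed attempts would actually be tracked per autoscale group is left open.

```java
final class ErrorRetryBudgetSketch {
    /**
     * Decide whether another scale-up attempt is allowed.
     *
     * While failedAttempts is below maxRetries, VMs in Error state are ignored,
     * so an intermittent failure can still be retried. Once the retry budget is
     * used up, Error VMs count toward max size, which bounds further attempts.
     */
    static boolean shouldScaleUp(int activeVms, int errorVms, int failedAttempts,
                                 int maxSize, int maxRetries) {
        boolean budgetExhausted = failedAttempts >= maxRetries;
        int counted = budgetExhausted ? activeVms + errorVms : activeVms;
        return counted < maxSize;
    }
}
```

For example, with 1 active VM, 2 VMs in Error, and a max size of 3, the group keeps retrying while fewer than `maxRetries` attempts have failed and stops once that budget is spent.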

@DaanHoogland changed the title from "Prevent infinite retires of autoscaling" to "Prevent infinite retries of autoscaling" on Aug 23, 2024