Skip to content

Conversation

@tbg
Copy link
Member

@tbg tbg commented Dec 19, 2025

Backport 3/3 commits from #158473 on behalf of @tbg.


Bring replica rebalancing in line with the candidate filtering strategy
documented in rebalanceStore. Both lease transfers and replica transfers
now map relatively cleanly to the pre-means/post-means comment
on rebalanceStore. I also made both lease and replica transfers
principled about which of the existing replicas they should ignore
the disposition for (existing leaseholder for lease transfers, all
replicas for replica transfers).

Other changes:

  • Rename computeCandidatesForRange to computeCandidatesForReplicaTransfer
  • Consolidate exclusions into single postMeansExclusions parameter
  • Move candidate logging into computeCandidatesForReplicaTransfer
  • Add unit tests for retainReadyReplicaTargetStoresOnly
  • assorted comment improvements and small structural improvements.

Part of #156776.
^-- need to plumb the mmaprototype.Status inputs to mmaAllocator still, both in asim and production code.

Epic: CRDB-55052


Release justification: low risk mmaprototype changes only

tbg added 3 commits November 28, 2025 15:51
Previously, computeCandidatesForRange (now computeCandidatesForReplicaTransfer)
computed the mean over all constraint-satisfying stores, then applied
post-means exclusions before returning candidates. These exclusions were:
- Stores on nodes already housing replicas of the range
- The shedding store itself (handled via a separate check in the loop)
- If the shedding store's node was CPU overloaded, all other stores on
  that node

The last bullet was ad-hoc and asymmetric: we wouldn't apply the same logic
to other overloaded nodes that happened to have replicas. It also wasn't
exercised, and the downstream candidate selection logic wouldn't choose these
stores anyway (they'd be rejected for being on an overloaded node).

This commit makes two changes:

1. Simplified post-means exclusions: The parameter is renamed from
   storesToExclude to postMeansExclusions to clarify its role. All exclusions
   are now handled uniformly in this set:
   - The shedding store (we're moving away from it)
   - Stores on nodes with other existing replicas

   The ad-hoc "exclude other stores on shedding node if node is overloaded"
   logic is removed. Within-node rebalance (to another store on the shedding
   node) is now permitted at this stage.

2. Pre-means filtering: This is new. Before computing the mean, we now filter
   out stores that aren't ready to receive replicas based on disposition.
   As a legacy case, we manually filter on high disk utilization, but there
   is a TODO to fold this into the disposition as well - this is tracked and
   will be done separately.

   This is done by retainReadyReplicaTargetStoresOnly. The shedding store
   bypasses the disposition check since it already has the replica - its load
   should be in the mean regardless of its disposition for NEW replicas.

   When filtering occurs, we recompute the mean over the filtered set. The
   common case (nothing filtered) still uses the cached mean for efficiency.

Other cleanup:
- Fixed logging (VEventf instead of mixing V() and Infof)
- Added scratch fields to clusterState for the new filtering but also left
  a TODO to get out of scratch fields entirely.

Part of cockroachdb#156776.

Epic: CRDB-55052
- Improve documentation and TODOs
- Skip leaseholder for disposition checks
@tbg tbg requested review from a team as code owners December 19, 2025 13:48
@blathers-crl blathers-crl bot added blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot. labels Dec 19, 2025
@blathers-crl
Copy link

blathers-crl bot commented Dec 19, 2025

Thanks for opening a backport.

Before merging, please confirm that the change does not break backwards compatibility and otherwise complies with the backport policy. Include a brief release justification in the PR description explaining why the backport is appropriate. All backports must be reviewed by the TL for the owning area. While the stricter LTS policy does not yet apply, please exercise judgment and consider gating non-critical changes behind a disabled-by-default feature flag when appropriate.

@blathers-crl blathers-crl bot added backport Label PR's that are backports to older release branches T-kv KV Team labels Dec 19, 2025
@blathers-crl
Copy link

blathers-crl bot commented Dec 19, 2025

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@wenyihu6 wenyihu6 requested a review from arulajmani December 19, 2025 13:48
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@tbg tbg merged commit 9d0571b into cockroachdb:release-26.1 Dec 19, 2025
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport Label PR's that are backports to older release branches blathers-backport This is a backport that Blathers created automatically. O-robot Originated from a bot. T-kv KV Team target-release-26.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants