
Commit 5e994f7

egegunes and hors authored
K8SPSMDB-486: Fix cluster crash on losing majority due to downscale (#695)
* K8SPSMDB-486: Fix cluster crash on losing majority due to downscale

Co-authored-by: Viacheslav Sarzhan
1 parent b56d06c commit 5e994f7


3 files changed, +72 -0 lines changed

@@ -0,0 +1,46 @@
# Allow downscaling replicaSets in a safe manner

* Date: 2021-06-15

## Status

* Status: Accepted

## Context

Currently there is no mechanism to stop users from downscaling MongoDB replicaSets
in a manner that causes cluster failure. For example, if a user has a replicaSet
with 7 members and downscales it to 3 members in one step, the remaining 3 members
are fewer than the majority of the configured 7 (4 members). The cluster loses
majority and the operator won't be able to run `replSetReconfig` because of that.

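The arithmetic behind this example can be checked with a small sketch. The helper names `majority` and `keepsMajority` are illustrative and are not part of the operator; the sketch assumes every configured member is a voting member.

```go
package main

import "fmt"

// majority returns how many voting members are needed to form a majority
// in a replica set configured with n voting members (strictly more than half).
func majority(n int) int {
	return n/2 + 1
}

// keepsMajority reports whether a replica set configured with `configured`
// voting members still has a majority when only `reachable` of them remain.
func keepsMajority(configured, reachable int) bool {
	return reachable >= majority(configured)
}

func main() {
	fmt.Println(majority(7))         // 4
	fmt.Println(keepsMajority(7, 3)) // false: 7 -> 3 in one step
	fmt.Println(keepsMajority(7, 6)) // true: 7 -> 6 removes a single member
}
```
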
## Considered Options

* Compare the current statefulset size with the replicaSet size in the CR and
  downscale one by one on each reconciliation until reaching the target size.
* Compare the current statefulset size with the replicaSet size in the CR and,
  if the target size breaks the majority, throw an error.
* Rather than populating replicaSet members based on a set of pods that are selected
  according to labels, fetch pods one by one with respect to the replicaSet size,
  thus removing the excess pods from the replicaSet config *hopefully* before they
  become inaccessible.
* ~Use [admission controllers](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/) to validate size updates.~ For every CRD, there can be only one webhook endpoint in a cluster. The PSMDB Operator doesn't work cluster-wide.
* Document this issue and let the end users decide.

## Decision

Chosen option: We will compare the current statefulset size with the replicaSet size in
the CR and downscale one by one on each reconciliation until reaching the target
size.

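As a rough illustration of the chosen option, the sketch below lists the size the statefulset would reach after each reconciliation. `nextSize` and `downscaleSteps` are hypothetical helpers written for this note; only the comparison inside `nextSize` follows the rule described above.

```go
package main

import "fmt"

// nextSize applies the one-by-one rule: when the statefulset is more than
// one replica above the desired size, only one member is removed in this pass.
func nextSize(current, desired int32) int32 {
	if current-desired > 1 {
		return current - 1
	}
	return desired
}

// downscaleSteps lists the statefulset size after each reconciliation
// until the desired size is reached.
func downscaleSteps(current, desired int32) []int32 {
	var steps []int32
	for current > desired {
		current = nextSize(current, desired)
		steps = append(steps, current)
	}
	return steps
}

func main() {
	fmt.Println(downscaleSteps(7, 3)) // [6 5 4 3]
}
```

Going from 7 to 3 therefore takes four reconciliations, and each pass removes a single member.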
## Consequences

Users can request any replicaSet size in a single CR update; the operator
performs the downscale one member per reconciliation.

### Negative Consequences

We will be mutating the replicaSet size field in the CR to downscale one by one. If
the CR is updated by the operator before the replicaSet reaches the target size, we'll
overwrite the user's changes to the size field. For instance, in the `writeStatus`
method, we try to update the status subresource, but if that fails we update
the whole CR.
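
The overwrite risk can be pictured with a toy model. Everything in this sketch (`resource`, `apiServer`, `updateStatus`, `update`) is a simplified, hypothetical stand-in and not the operator's or Kubernetes' actual API; it only contrasts a status-only write with a whole-object write.

```go
package main

import "fmt"

// resource is a hypothetical, flattened stand-in for the custom resource.
type resource struct {
	SpecSize int32  // replicaSet size requested in the CR spec
	State    string // stands in for the status subresource
}

// apiServer is a hypothetical stand-in for the stored copy of the CR.
type apiServer struct{ stored resource }

// updateStatus writes only the status part, leaving the stored spec alone.
func (s *apiServer) updateStatus(r resource) { s.stored.State = r.State }

// update writes the whole object, spec included.
func (s *apiServer) update(r resource) { s.stored = r }

func main() {
	srv := &apiServer{stored: resource{SpecSize: 3}} // the user asked for 3 members

	inMemory := srv.stored
	inMemory.SpecSize = 6 // the operator's intermediate step, 7 -> 6
	inMemory.State = "initializing"

	srv.updateStatus(inMemory)
	fmt.Println(srv.stored.SpecSize) // 3: the user's value survives

	srv.update(inMemory)
	fmt.Println(srv.stored.SpecSize) // 6: the user's value is overwritten
}
```
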

docs/architecture/decisions/index.md

+1
@@ -3,5 +3,6 @@
 This log lists the architectural decisions for PSMDBO.

 - [ADR-0001](0001-record-architecture-decisions.md) - Use Markdown Architectural Decision Records
+- [ADR-0002](0002-allow-downscaling-in-a-safe-manner.md) - Allow downscaling replicaSets in a safe manner

 For new ADRs, please use [template.md](template.md) as basis.

pkg/controller/perconaservermongodb/psmdb_controller.go

+25
@@ -225,6 +225,11 @@ func (r *ReconcilePerconaServerMongoDB) Reconcile(request reconcile.Request) (re
 		return reconcile.Result{}, err
 	}

+	err = r.safeDownscale(cr)
+	if err != nil {
+		return reconcile.Result{}, errors.Wrap(err, "safe downscale")
+	}
+
 	if cr.ObjectMeta.DeletionTimestamp != nil {
 		err = r.checkFinalizers(cr)
 		return rr, err
@@ -498,6 +503,26 @@ func (r *ReconcilePerconaServerMongoDB) checkConfiguration(cr *api.PerconaServer
 	return nil
 }

+func (r *ReconcilePerconaServerMongoDB) safeDownscale(cr *api.PerconaServerMongoDB) error {
+	for _, rs := range cr.Spec.Replsets {
+		sf, err := r.getRsStatefulset(cr, rs.Name)
+		if err != nil && !k8serrors.IsNotFound(err) {
+			return errors.Wrap(err, "get rs statefulset")
+		}
+
+		if k8serrors.IsNotFound(err) {
+			continue
+		}
+
+		// downscale 1 pod on each reconciliation
+		if *sf.Spec.Replicas-rs.Size > 1 {
+			rs.Size = *sf.Spec.Replicas - 1
+		}
+	}
+
+	return nil
+}
+
 func (r *ReconcilePerconaServerMongoDB) getRemovedSfs(cr *api.PerconaServerMongoDB) ([]appsv1.StatefulSet, error) {
 	removed := make([]appsv1.StatefulSet, 0)
