Commit a132be5

egegunes and hors authored
K8SPSMDB-1219: PBM multi storage support (#1843)
* K8SPSMDB-1219: PBM multi storage support

The operator has always supported multiple storages, but there was no native support for multiple backup storages until PBM v2.6.0. The operator used to reconfigure PBM every time a user selected a storage for a backup/restore that differed from the previously used one. This caused long wait periods, especially for storages with lots of backups, because of the resync operation.

Another limitation was that users could have only 1 backup storage if they wanted to enable point-in-time recovery. With multiple storages, PBM would upload oplog chunks to whichever storage was last used by a backup/restore, which made consistent recovery impossible.

PBM v2.6.0 added native support for multiple storages and these changes introduce it to our operator:

* User can have one main storage in PBM configuration. Any other storages can be added as profiles.

Main storage can be found in:

```
kubectl exec cluster1-rs0-0 -c backup-agent -- pbm config
```

This commit introduces a new field `main` in storage spec:

```
storages:
  s3-us-west:
    main: true
    type: s3
    s3:
      bucket: operator-testing
      credentialsSecret: cluster1-s3-secrets
      region: us-west-2
```

If user has only 1 storage configured in `cr.yaml`, operator will automatically use it as main storage. If more than 1 storage is configured, exactly one of them must have `main: true`; user can't have more than 1 storage with `main: true`. If user changes main storage in `cr.yaml`, operator will configure PBM with the new storage and start resync.

Any other storage in `cr.yaml` will be added to PBM as a profile.

User can see profiles using cli:

```
kubectl exec cluster1-rs0-0 -c backup-agent -- pbm profile list
```

When user adds a new profile to `cr.yaml`, operator will add it to PBM but won't start resync.

**`pbm config --force-resync` only starts resync for the main storage.** To manually resync a profile:

```
kubectl exec cluster1-rs0-0 -c backup-agent -- pbm profile sync <storage-name>
```

If user starts a restore using a backup in a storage configured as a profile, operator will start a resync operation for that profile and block the restore until resync finishes.

Note: Profiles are also called external storages in PBM documentation.

If user has multiple storages in `cr.yaml` and changes main storage between them, operator:

1. configures PBM with the new main storage.
2. adds the old main as a profile.
3. deletes profile for the new main storage.

If user configures `backupSource` in backups/restores:

* if `cr.yaml` has no storages configured, operator configures PBM with storage data in `backupSource` field. This storage will effectively be the main storage until user adds a storage to `cr.yaml`. After a storage is configured, PBM configuration will be overwritten and `backupSource` storage will be gone.
* if `cr.yaml` has a storage configured, operator adds `backupSource` storage as a profile.

* Oplog chunks will only be uploaded to main storage. User can use any backup as base backup for point-in-time recovery.
* Incremental backup chains all need to be stored in the same storage. TBD after #1836 merged.

---

Other significant changes in operator behavior:

* Operator now automatically configures PBM on a fresh cluster. Before this change, PBM was not configured until user started a backup/restore after deploying a fresh cluster. Now, PBM will be configured directly with the main storage in `cr.yaml` and resync will be started in the background. There's a new field in `PerconaServerMongoDB` status: `backupConfigHash`. Operator will maintain a hash of the current PBM configuration and reconfigure PBM if the hash changes. Fields in `spec.backup.pitr` are excluded from hash calculation; they're handled separately.
* If `PerconaServerMongoDB` is annotated with `percona.com/resync-pbm=true`, operator will start resync operation both for main storage and profiles. Resync for profiles is started with the equivalent of `pbm profile sync --all`. These resync operations will run in the background and will not block the reconciliation.
* If a backup that has the `percona.com/delete-backup` finalizer is deleted, operator will only delete oplog chunks if the backup is in main storage.

* fix users
* fix repeatedly sending backup commands
* fix users
* address review comments
* assert err in main storage unit test
* don't use break labels

---------

Co-authored-by: Viacheslav Sarzhan <[email protected]>
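The main storage rules described in the commit message (a single storage is implicitly main; with several storages exactly one must set `main: true`) can be illustrated with a minimal sketch. The `Storage` type and `MainStorage` function below are hypothetical stand-ins written for this illustration; they are not the operator's actual types or validation code.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// Storage is a hypothetical, trimmed-down stand-in for a storage entry in cr.yaml.
type Storage struct {
	Type string
	Main bool
}

// MainStorage returns the name of the storage that would act as the PBM main
// storage under the rules stated in the commit message. Every other storage
// would be added to PBM as a profile.
func MainStorage(storages map[string]Storage) (string, error) {
	switch len(storages) {
	case 0:
		return "", errors.New("no storages configured")
	case 1:
		// A single storage is used as main even without `main: true`.
		for name := range storages {
			return name, nil
		}
	}

	// Collect every storage flagged as main; sort for stable error messages.
	var mains []string
	for name, s := range storages {
		if s.Main {
			mains = append(mains, name)
		}
	}
	sort.Strings(mains)

	switch len(mains) {
	case 0:
		return "", errors.New("multiple storages configured but none has main: true")
	case 1:
		return mains[0], nil
	default:
		return "", fmt.Errorf("more than one storage has main: true: %v", mains)
	}
}

func main() {
	name, err := MainStorage(map[string]Storage{
		"s3-us-west": {Type: "s3", Main: true},
		"minio":      {Type: "s3"},
	})
	fmt.Println(name, err)
}
```

Returning an error instead of silently picking a storage mirrors the commit message's wording that one storage "must" be marked as main; how the operator actually reports the violation is not shown in this excerpt.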
1 parent 72bf015 commit a132be5

48 files changed, +1779 -588 lines changed

build/Dockerfile

+2 -1
@@ -7,6 +7,7 @@ RUN go mod download
 
 ARG GIT_COMMIT
 ARG GIT_BRANCH
+ARG BUILD_TIME
 ARG GO_LDFLAGS
 ARG GOOS=linux
 ARG TARGETARCH
@@ -16,7 +17,7 @@ COPY . .
 
 RUN mkdir -p build/_output/bin \
     && GOOS=$GOOS GOARCH=${TARGETARCH} CGO_ENABLED=$CGO_ENABLED GO_LDFLAGS=$GO_LDFLAGS \
-    go build -ldflags "-w -s -X main.GitCommit=$GIT_COMMIT -X main.GitBranch=$GIT_BRANCH" \
+    go build -ldflags "-w -s -X main.GitCommit=$GIT_COMMIT -X main.GitBranch=$GIT_BRANCH -X main.BuildTime=$BUILD_TIME" \
     -o build/_output/bin/percona-server-mongodb-operator \
     cmd/manager/main.go \
     && cp -r build/_output/bin/percona-server-mongodb-operator /usr/local/bin/percona-server-mongodb-operator

cmd/manager/main.go

+2 -1
@@ -35,6 +35,7 @@ import (
 var (
 	GitCommit string
 	GitBranch string
+	BuildTime string
 	scheme = k8sruntime.NewScheme()
 	setupLog = ctrl.Log.WithName("setup")
 )
@@ -69,7 +70,7 @@ func main() {
 	ctrl.SetLogger(zap.New(zap.UseFlagOptions(&opts)))
 
 	setupLog.Info("Manager starting up", "gitCommit", GitCommit, "gitBranch", GitBranch,
-		"goVersion", runtime.Version(), "os", runtime.GOOS, "arch", runtime.GOARCH)
+		"buildTime", BuildTime, "goVersion", runtime.Version(), "os", runtime.GOOS, "arch", runtime.GOARCH)
 
 	namespace, err := k8s.GetWatchNamespace()
 	if err != nil {
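For readers unfamiliar with the linker's `-X` flag used in `build/Dockerfile`, here is a minimal, standalone sketch of how a package-level string such as `BuildTime` is populated at link time. The `buildTimeOrUnknown` helper and its "unknown" fallback are assumptions added for illustration; the operator itself logs the variable as-is.

```go
package main

import "fmt"

// BuildTime is left empty in source and is meant to be set at link time, e.g.:
//
//	go build -ldflags "-X main.BuildTime=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
//
// The -X flag only works on package-level string variables.
var BuildTime string

// buildTimeOrUnknown is a hypothetical helper that substitutes a placeholder
// when the binary was built without the -X flag.
func buildTimeOrUnknown(v string) string {
	if v == "" {
		return "unknown"
	}
	return v
}

func main() {
	fmt.Println("buildTime:", buildTimeOrUnknown(BuildTime))
}
```

Built with a plain `go build`, the variable stays empty, so the fallback applies; built with the `-ldflags` line from the comment, the supplied timestamp is printed instead.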

config/crd/bases/psmdb.percona.com_perconaservermongodbs.yaml

+4
@@ -321,6 +321,8 @@ spec:
   required:
   - path
   type: object
+main:
+  type: boolean
 s3:
   properties:
     bucket:
@@ -18790,6 +18792,8 @@ spec:
 properties:
   backup:
     type: string
+  backupConfigHash:
+    type: string
   backupVersion:
     type: string
   conditions:
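The `backupConfigHash` status field added above stores a hash of the PBM configuration, with `spec.backup.pitr` excluded. The following is only a sketch of that idea: the struct shapes, the JSON serialization and the choice of SHA-256 are assumptions made for illustration, since the commit excerpt does not show how the operator computes the hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// StorageSpec, PITRSpec and BackupSpec are hypothetical, simplified shapes.
type StorageSpec struct {
	Type   string `json:"type"`
	Main   bool   `json:"main,omitempty"`
	Bucket string `json:"bucket,omitempty"`
}

type PITRSpec struct {
	Enabled bool `json:"enabled"`
}

type BackupSpec struct {
	Storages map[string]StorageSpec `json:"storages"`
	PITR     PITRSpec               `json:"pitr"`
}

// ConfigHash hashes the backup spec with the PITR section blanked out, so
// PITR-only changes do not alter the result. spec is passed by value, so the
// caller's copy keeps its PITR settings.
func ConfigHash(spec BackupSpec) (string, error) {
	spec.PITR = PITRSpec{}
	// encoding/json sorts map keys, which keeps the output deterministic.
	data, err := json.Marshal(spec)
	if err != nil {
		return "", err
	}
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]), nil
}

// NeedsReconfigure reports whether the hash stored in status differs from
// the hash of the current spec.
func NeedsReconfigure(statusHash string, spec BackupSpec) (bool, error) {
	current, err := ConfigHash(spec)
	if err != nil {
		return false, err
	}
	return current != statusHash, nil
}

func main() {
	spec := BackupSpec{Storages: map[string]StorageSpec{
		"s3-us-west": {Type: "s3", Main: true, Bucket: "operator-testing"},
	}}
	hash, _ := ConfigHash(spec)

	spec.PITR.Enabled = true // PITR-only change
	changed, _ := NeedsReconfigure(hash, spec)
	fmt.Println("reconfigure after PITR change:", changed)
}
```

Comparing a stored hash against a freshly computed one lets a reconcile loop skip reconfiguration, and the resync it triggers, when nothing relevant has changed.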

deploy/bundle.yaml

+4
@@ -1017,6 +1017,8 @@ spec:
   required:
   - path
   type: object
+main:
+  type: boolean
 s3:
   properties:
     bucket:
@@ -19486,6 +19488,8 @@ spec:
 properties:
   backup:
     type: string
+  backupConfigHash:
+    type: string
   backupVersion:
     type: string
   conditions:

deploy/crd.yaml

+4
@@ -1017,6 +1017,8 @@ spec:
   required:
   - path
   type: object
+main:
+  type: boolean
 s3:
   properties:
     bucket:
@@ -19486,6 +19488,8 @@ spec:
 properties:
   backup:
     type: string
+  backupConfigHash:
+    type: string
   backupVersion:
     type: string
   conditions:

deploy/cw-bundle.yaml

+4
@@ -1017,6 +1017,8 @@ spec:
   required:
   - path
   type: object
+main:
+  type: boolean
 s3:
   properties:
     bucket:
@@ -19486,6 +19488,8 @@ spec:
 properties:
   backup:
     type: string
+  backupConfigHash:
+    type: string
   backupVersion:
     type: string
   conditions:

e2e-tests/custom-replset-name/conf/some-name.yml

+1
@@ -12,6 +12,7 @@ spec:
 serviceAccountName: percona-server-mongodb-operator
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/custom-users-roles-sharded/conf/some-name-rs0.yml

+1
@@ -63,6 +63,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:1.1.0-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/custom-users-roles/conf/some-name-rs0.yml

+1
@@ -63,6 +63,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:1.1.0-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/data-at-rest-encryption/conf/some-name-unencrypted.yml

+2 -1
@@ -12,6 +12,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:main-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret
@@ -82,4 +83,4 @@ spec:
     resources:
       requests:
         storage: 1Gi
-size: 3
+size: 3

e2e-tests/data-at-rest-encryption/conf/some-name.yml

+1
@@ -12,6 +12,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:main-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/demand-backup-physical-sharded/conf/some-name-sharded.yml

+1
@@ -11,6 +11,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:1.1.0-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/demand-backup-physical/conf/some-name-arbiter-nv.yml

+1
@@ -13,6 +13,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:1.1.0-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/demand-backup-physical/conf/some-name.yml

+1
@@ -13,6 +13,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:1.1.0-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/demand-backup-sharded/conf/some-name-rs0.yml

+1
@@ -11,6 +11,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:1.1.0-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/demand-backup/conf/some-name-rs0.yml

+1
@@ -11,6 +11,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:1.1.0-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/expose-sharded/conf/some-name-rs0.yml

+1
@@ -11,6 +11,7 @@ spec:
 image: perconalab/percona-server-mongodb-operator:1.1.0-backup
 storages:
   aws-s3:
+    main: true
     type: s3
     s3:
       credentialsSecret: aws-s3-secret

e2e-tests/functions

+32 -10
@@ -68,7 +68,9 @@ fi
 KUBE_VERSION=$(kubectl version -o json | jq -r '.serverVersion.major + "." + .serverVersion.minor' | $sed -r 's/[^0-9.]+//g')
 
 log() {
+	set +o xtrace
 	echo "[$(date +%Y-%m-%dT%H:%M:%S%z)]" $*
+	set_debug
 }
 
 set_debug() {
@@ -198,15 +200,14 @@ wait_backup_agent() {
 	local agent_pod=$1
 
 	set +o xtrace
-	retry=0
-	echo -n $agent_pod
-	until [ "$(kubectl_bin logs $agent_pod -c backup-agent | egrep -v "\[ERROR\] pitr: check if on:|node:|starting PITR routine|\[agentCheckup\]" | cut -d' ' -f3- | tail -n 1)" == "listening for the commands" ]; do
+	local retry=0
+	echo -n "waiting for pbm-agent to be ready in ${agent_pod}..."
+	until kubectl_bin logs $agent_pod -c backup-agent | grep "listening for the commands"; do
 		sleep 5
 		echo -n .
 		let retry+=1
 		if [ $retry -ge 360 ]; then
-			kubectl_bin logs $agent_pod -c backup-agent \
-				| tail -100
+			kubectl_bin logs $agent_pod -c backup-agent | tail -100
 			echo max retry count $retry reached. something went wrong with operator or kubernetes cluster
 			exit 1
 		fi
@@ -341,6 +342,7 @@ wait_restore() {
 	local target_state=${3:-"ready"}
 	local wait_cluster_consistency=${4:-1}
 	local wait_time=${5:-1780}
+	local ok_if_ready=${6:-0}
 
 	set +o xtrace
 	retry=0
@@ -351,6 +353,10 @@ wait_restore() {
 		echo -n .
 		let retry+=1
 		current_state=$(kubectl_bin get psmdb-restore restore-$backup_name -o jsonpath='{.status.state}')
+		if [[ ${ok_if_ready} == 1 && ${current_state} == 'ready' ]]; then
+			echo "OK"
+			break
+		fi
 		if [[ $retry -ge $wait_time || ${current_state} == 'error' ]]; then
 			kubectl_bin logs ${OPERATOR_NS:+-n $OPERATOR_NS} $(get_operator_pod) \
 				| grep -v 'level=info' \
@@ -364,7 +370,7 @@ wait_restore() {
 			exit 1
 		fi
 	done
-	echo
+	echo "OK"
 	set_debug
 
 	if [ $wait_cluster_consistency -eq 1 ]; then
@@ -460,10 +466,17 @@ deploy_minio() {
 		kubectl_bin create svc -n ${OPERATOR_NS} externalname minio-service --external-name="minio-service.${namespace}.svc.cluster.local" --tcp="9000"
 	fi
 
-	# create bucket
+	create_minio_bucket operator-testing
+}
+
+create_minio_bucket() {
+	local bucket=$1
+
 	kubectl_bin run -i --rm aws-cli --image=perconalab/awscli --restart=Never -- \
-		bash -c 'AWS_ACCESS_KEY_ID=some-access-key AWS_SECRET_ACCESS_KEY=some-secret-key AWS_DEFAULT_REGION=us-east-1 \
-		/usr/bin/aws --endpoint-url http://minio-service:9000 s3 mb s3://operator-testing'
+		bash -c "AWS_ACCESS_KEY_ID=some-access-key \
+		AWS_SECRET_ACCESS_KEY=some-secret-key \
+		AWS_DEFAULT_REGION=us-east-1 \
+		/usr/bin/aws --endpoint-url http://minio-service:9000 s3 mb s3://${bucket}"
 }
 
 deploy_vault() {
@@ -848,11 +861,20 @@ compare_mongo_cmd() {
 	local suffix="$4"
 	local database="${5:-myApp}"
 	local collection="${6:-test}"
+	local sort="$7"
 
-	run_mongo "use ${database}\n db.${collection}.${command}()" "$uri" "mongodb" "$suffix" \
+	local full_command="db.${collection}.${command}()"
+	if [[ ! -z ${sort} ]]; then
+		full_command="${full_command}.${sort}"
+	fi
+
+	log "running ${full_command} in ${database}"
+
+	run_mongo "use ${database}\n ${full_command}" "$uri" "mongodb" "$suffix" \
 		| egrep -v 'I NETWORK|W NETWORK|F NETWORK|Error saving history file|Percona Server for MongoDB|connecting to:|Unable to reach primary for set|Implicit session:|versions do not match|Error saving history file:' \
 		| $sed -re 's/ObjectId\("[0-9a-f]+"\)//; s/-[0-9]+.svc/-xxx.svc/' \
 		>$tmp_dir/${command}${postfix}
+
 	diff -u ${test_dir}/compare/${command}${postfix}.json $tmp_dir/${command}${postfix}
 }
 

e2e-tests/mongod-major-upgrade/run

+2
@@ -19,6 +19,8 @@ function main() {
 	kubectl_bin apply -f "${conf_dir}/client.yml" \
 		-f "${conf_dir}/secrets.yml"
 
+	apply_s3_storage_secrets
+
 	desc 'install version service'
 
 	cp $test_dir/conf/operator.main.psmdb-operator.dep.json ${tmp_dir}/operator.${OPERATOR_VERSION}.psmdb-operator.dep.json

@@ -0,0 +1,13 @@
+switched to db myApp
+{ "_id" : , "x" : 100500 }
+{ "_id" : , "x" : 100501 }
+{ "_id" : , "x" : 100502 }
+{ "_id" : , "x" : 100503 }
+{ "_id" : , "x" : 100504 }
+{ "_id" : , "x" : 100505 }
+{ "_id" : , "x" : 100506 }
+{ "_id" : , "x" : 100507 }
+{ "_id" : , "x" : 100508 }
+{ "_id" : , "x" : 100509 }
+{ "_id" : , "x" : 100510 }
+bye

@@ -0,0 +1,8 @@
+switched to db myApp
+{ "_id" : , "x" : 100500 }
+{ "_id" : , "x" : 100501 }
+{ "_id" : , "x" : 100502 }
+{ "_id" : , "x" : 100503 }
+{ "_id" : , "x" : 100504 }
+{ "_id" : , "x" : 100505 }
+bye

+9
@@ -0,0 +1,9 @@
+apiVersion: psmdb.percona.com/v1
+kind: PerconaServerMongoDBBackup
+metadata:
+  finalizers:
+  - percona.com/delete-backup
+  name: backup-minio
+spec:
+  clusterName: some-name
+  storageName: minio

+10
@@ -0,0 +1,10 @@
+apiVersion: psmdb.percona.com/v1
+kind: PerconaServerMongoDBRestore
+metadata:
+  name:
+spec:
+  clusterName: some-name
+  backupName:
+  pitr:
+    type: date
+    date:
