Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Fix PVC resizing issue and refactor PVC resizing logic #1268

Open
wants to merge 8 commits into
base: master
Choose a base branch
from

Conversation

fengyinqiao
Copy link
Contributor

Description

This PR addresses a critical issue where PVC resizing did not take effect when node-conf persistence was enabled. Additionally, the PVC resizing logic has been refactored for improved modularity and reliability. The main changes include:

  1. Label Correction

    • Updated the PVC selection label from "redis" to "middleware" to ensure the correct PVCs are targeted.
  2. VolumeClaimTemplate Handling

    • Previously, the code used a hardcoded VolumeClaimTemplates[0], which caused PVC resizing to fail when node-conf persistence was enabled.
    • The new logic filters out the VolumeClaimTemplate named "node-conf" and selects the appropriate template (for Redis RDB data), ensuring that PVC resizing is applied correctly.
  3. Enhanced Logging and Error Handling

    • Each PVC update now logs a success message, making it easier to trace which PVCs were successfully resized and which were not.
    • If any PVC update fails, the function returns an error, preventing silent failures.

These improvements enhance the reliability of the PVC resizing mechanism and increase code maintainability. Please review and let me know if any further adjustments are needed.

Copy link

codecov bot commented Mar 9, 2025

Codecov Report

Attention: Patch coverage is 64.34783% with 41 lines in your changes missing coverage. Please review.

Project coverage is 34.66%. Comparing base (60ef5a2) to head (76010a4).
Report is 59 commits behind head on master.

Files with missing lines Patch % Lines
pkg/k8sutils/pvc.go 71.62% 15 Missing and 6 partials ⚠️
pkg/k8sutils/statefulset.go 51.21% 16 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1268      +/-   ##
==========================================
+ Coverage   30.20%   34.66%   +4.46%     
==========================================
  Files          55       54       -1     
  Lines        6294     5962     -332     
==========================================
+ Hits         1901     2067     +166     
+ Misses       4200     3711     -489     
+ Partials      193      184       -9     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@shubham-cmyk
Copy link
Member

shubham-cmyk commented Mar 9, 2025

@fengyinqiao Great work.
i would suggest if you could write up e2e test using the chainsaw refer

we can make sure that PVC is successfully.
If you need any help regarding that ping us

@fengyinqiao
Copy link
Contributor Author

@fengyinqiao Great work. i would suggest if you could write up e2e test using the chainsaw refer

we can make sure that PVC is successfully. If you need any help regarding that ping us

ok,I will give it a try.

fengyinqiao added 2 commits March 10, 2025 10:01
- Fixed label error: updated PVC selection label from "redis" to "middleware".
- Corrected PVC expansion when node-conf persistence is enabled by avoiding hardcoded use of VolumeClaimTemplates[0].
- PVC resizing now filters out the "node-conf" template and targets the correct PVC for Redis RDB data.
- Enhanced error handling and added per-PVC update logging for better traceability.

Signed-off-by: fengyinqiao <[email protected]>
- todo: Handle scale-out scenario

Signed-off-by: fengyinqiao <[email protected]>
Signed-off-by: fengyinqiao <[email protected]>
@fengyinqiao
Copy link
Contributor Author

@shubham-cmyk What is the default storageClass for e2e, the field allowVolumeExpansion is true or false? Now e2e failed for pvc not resized, I think it was caused by the default storageClass that does not allow volume expansion.

Signed-off-by: fengyinqiao <[email protected]>
@fengyinqiao
Copy link
Contributor Author

@shubham-cmyk @drivebyer @iamabhishek-dubey can anyone help me with the e2e test?

@shubham-cmyk
Copy link
Member

@shubham-cmyk What is the default storageClass for e2e, the field allowVolumeExpansion is true or false? Now e2e failed for pvc not resized, I think it was caused by the default storageClass that does not allow volume expansion.

i don't think this should be caused by the storage class if you are using the kind cluster the storage class is provided by them. I think it should be expandable.

@shubham-cmyk
Copy link
Member

shubham-cmyk commented Mar 10, 2025

@shubham-cmyk @drivebyer @iamabhishek-dubey can anyone help me with the e2e test?

Your e2e test look perfect to me i think the operator it somehow not able to resize the volume here.
i have to try it first in the dev mode.
Also Have you tried this if you have locally deployed it ?

@fengyinqiao
Copy link
Contributor Author

fengyinqiao commented Mar 10, 2025

@shubham-cmyk @drivebyer @iamabhishek-dubey can anyone help me with the e2e test?

Your e2e test look perfect to me i think the operator it somehow not able to resize the volume here. i have to try it first in the dev mode. Also Have you tried this if you have locally deployed it ?

sure, see logs and the pvcs:

{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-3] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}
{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-4] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}
{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-5] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}
➜  kubectl -n ot-operators get pvc -w | grep redis-cluster-v7
node-conf-redis-cluster-v7-follower-0                           Bound    d-uf6ilczhgpczuvud8vxr   40Gi       RWO            alicloud-disk-essd-pl0   7d19h
node-conf-redis-cluster-v7-follower-1                           Bound    d-uf623hbv2xfwltcoocz3   40Gi       RWO            alicloud-disk-essd-pl0   7d19h
node-conf-redis-cluster-v7-follower-2                           Bound    d-uf6cue68hyydrdu4itmb   40Gi       RWO            alicloud-disk-essd-pl0   7d19h
node-conf-redis-cluster-v7-leader-0                             Bound    d-uf6gnvmvtdgrnc824965   40Gi       RWO            alicloud-disk-essd-pl0   7d19h
node-conf-redis-cluster-v7-leader-1                             Bound    d-uf6gnvmvtdgrnc82496b   40Gi       RWO            alicloud-disk-essd-pl0   7d19h
node-conf-redis-cluster-v7-leader-2                             Bound    d-uf6b2pznjf8yx6727616   40Gi       RWO            alicloud-disk-essd-pl0   7d19h
redis-cluster-v7-follower-redis-cluster-v7-follower-0           Bound    d-uf647w15hb4ac0tfcav4   54Gi       RWO            alicloud-disk-essd-pl0   7d19h
redis-cluster-v7-follower-redis-cluster-v7-follower-1           Bound    d-uf63h4h8by86ipv130re   54Gi       RWO            alicloud-disk-essd-pl0   7d19h
redis-cluster-v7-follower-redis-cluster-v7-follower-2           Bound    d-uf6gqbr8mbow2a3far2o   54Gi       RWO            alicloud-disk-essd-pl0   7d19h
redis-cluster-v7-leader-redis-cluster-v7-leader-0               Bound    d-uf62t5jpvumcf47tnmga   54Gi       RWO            alicloud-disk-essd-pl0   7d19h
redis-cluster-v7-leader-redis-cluster-v7-leader-1               Bound    d-uf630at4yg7fd6al4yra   54Gi       RWO            alicloud-disk-essd-pl0   7d19h
redis-cluster-v7-leader-redis-cluster-v7-leader-2               Bound    d-uf6ilczhgpczuvud8vxo   54Gi       RWO            alicloud-disk-essd-pl0   7d19h

@shubham-cmyk
Copy link
Member

@shubham-cmyk @drivebyer @iamabhishek-dubey can anyone help me with the e2e test?

Your e2e test look perfect to me i think the operator it somehow not able to resize the volume here. i have to try it first in the dev mode. Also Have you tried this if you have locally deployed it ?

sure, see logs:

{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-3] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}
{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-4] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}
{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-5] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}

can you confirm me this with the actual screenshot of the pvc too. that the storage is now 2Gi

@fengyinqiao
Copy link
Contributor Author

@shubham-cmyk @drivebyer @iamabhishek-dubey can anyone help me with the e2e test?

Your e2e test look perfect to me i think the operator it somehow not able to resize the volume here. i have to try it first in the dev mode. Also Have you tried this if you have locally deployed it ?

sure, see logs:

{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-3] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}
{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-4] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}
{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-5] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}

can you confirm me this with the actual screenshot of the pvc too. that the storage is now 2Gi

@shubham-cmyk @drivebyer @iamabhishek-dubey can anyone help me with the e2e test?

Your e2e test look perfect to me i think the operator it somehow not able to resize the volume here. i have to try it first in the dev mode. Also Have you tried this if you have locally deployed it ?

sure, see logs:

{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-3] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}
{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-4] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}
{"level":"info","ts":"2025-03-09T11:27:07Z","msg":"sts:redis-cluster-v7-leader resized pvc [redis-cluster-v7-leader-redis-cluster-v7-leader-5] from 42949672960 to 57982058496","controller":"rediscluster","controllerGroup":"redis.redis.opstreelabs.in","controllerKind":"RedisCluster","RedisCluster":{"name":"redis-cluster-v7","namespace":"ot-operators"},"namespace":"ot-operators","name":"redis-cluster-v7","reconcileID":"37d29098-9c19-4991-b565-1a612e6b16e6"}

can you confirm me this with the actual screenshot of the pvc too. that the storage is now 2Gi

I resize pvc from 40Gi to 54Gi, see the pvcs above

@shubham-cmyk
Copy link
Member

looks like it did on your local. i have to try on my setup too

@shubham-cmyk
Copy link
Member

image

@fengyinqiao I think you are right the default storage class don't have the expansion true here

@fengyinqiao
Copy link
Contributor Author

image

@fengyinqiao I think you are right the default storage class don't have the expansion true here

Alright...so is there any solution for this problem, otherwise we have to remove the e2e test?

@shubham-cmyk
Copy link
Member

I tried adding volume expansion field but test are still failing.
I have to try this on the cloud or something also try this locally

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants