-
Notifications
You must be signed in to change notification settings - Fork 4
[controller] Implement rvr-status-config-node-id-controller #341
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: astef-prototype
Are you sure you want to change the base?
[controller] Implement rvr-status-config-node-id-controller #341
Conversation
- Implement controller for assigning unique nodeId (0-7) to RVR replicas - Add request types and reconciler with conflict retry handling - Add comprehensive unit tests covering all scenarios - Add SPEC_COMPLIANCE documentation - Register controller in registry
- Use standard reconcile.Reconciler and .For() - Use structured logging - Remove custom Request types - Add CONTROLLER_STYLE_GUIDE.md
- Remove unused fields from Reconciler struct (rdr, sch) - Export Reconciler fields (Cl, Log, LogAlt) for testability - Remove NewReconciler constructor, create struct directly in BuildController - Update all field usages to exported fields (r.cl -> r.Cl, r.logAlt -> r.LogAlt) - Update tests to use exported fields directly - Remove unused runtime import Follows updated CONTROLLER_STYLE_GUIDE.md standards for simplified controller structure.
…d-controller - Replace standard testing package with Ginkgo/Gomega framework - Refactor all 9 tests to use Describe/It/BeforeEach pattern - Use GinkgoLogr for integrated logging with Ginkgo - Update SPEC_COMPLIANCE.md to reflect Ginkgo testing approach - Add ginkgo/v2 and gomega dependencies to go.mod
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Убираем
| WithEventFilter(predicate.Funcs{ | ||
| CreateFunc: func(ce event.CreateEvent) bool { | ||
| rvr, ok := ce.Object.(*v1alpha3.ReplicatedVolumeReplica) | ||
| if !ok { | ||
| return false | ||
| } | ||
| // Trigger only if nodeID is not set | ||
| return rvr.Status == nil || rvr.Status.Config == nil || rvr.Status.Config.NodeId == nil | ||
| }, | ||
| UpdateFunc: func(_ event.UpdateEvent) bool { | ||
| // No-op: nodeID is immutable once set, so we only care about CREATE | ||
| return false | ||
| }, | ||
| DeleteFunc: func(_ event.DeleteEvent) bool { | ||
| // No-op: deletion doesn't require nodeID assignment | ||
| return false | ||
| }, | ||
| GenericFunc: func(ge event.GenericEvent) bool { | ||
| rvr, ok := ge.Object.(*v1alpha3.ReplicatedVolumeReplica) | ||
| if !ok { | ||
| return false | ||
| } | ||
| // Trigger only if nodeID is not set (for reconciliation on startup) | ||
| return rvr.Status == nil || rvr.Status.Config == nil || rvr.Status.Config.NodeId == nil | ||
| }, | ||
| }). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Давай убирать
| LogAlt logr.Logger | ||
| } | ||
|
|
||
| var _ reconcile.Reconciler = &Reconciler{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| var _ reconcile.Reconciler = &Reconciler{} | |
| var _ reconcile.Reconciler = (*Reconciler)(nil) |
| Cl client.Client | ||
| Log *slog.Logger | ||
| LogAlt logr.Logger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make private
| // Filter by replicatedVolumeName | ||
| filteredRVRs := make([]v1alpha3.ReplicatedVolumeReplica, 0) | ||
| for _, item := range rvrList.Items { | ||
| if item.Spec.ReplicatedVolumeName == rvr.Spec.ReplicatedVolumeName { | ||
| filteredRVRs = append(filteredRVRs, item) | ||
| } | ||
| } | ||
| rvrList.Items = filteredRVRs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Есть slices.DeleteFunc
| for _, item := range rvrList.Items { | ||
| if item.Status != nil && item.Status.Config != nil && item.Status.Config.NodeId != nil { | ||
| nodeID := *item.Status.Config.NodeId | ||
| if nodeID >= minNodeID && nodeID <= maxNodeID { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
А в проверке выше это не проверялось. Это точно нужно?
| rvrList.Items = filteredRVRs | ||
|
|
||
| // Collect used nodeIDs | ||
| usedNodeIDs := make(map[uint]bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Вместо bool тут правильнее использовать пустую структуру
| // NOTE: Setting status condition is NOT in the spec. | ||
| // This was added to improve observability - administrators can see the problem | ||
| // in RVR status conditions instead of only in controller logs. | ||
| // To revert: remove the setNodeIDErrorCondition call and the function definition. | ||
| // The spec only requires returning an error, which we do below. | ||
| if err := r.setNodeIDErrorCondition(ctx, rvr, fmt.Sprintf( | ||
| "too many replicas for volume %s: %d (maximum is %d)", | ||
| rvr.Spec.ReplicatedVolumeName, | ||
| totalReplicas, | ||
| maxNodeID+1, | ||
| )); err != nil { | ||
| log.Info("failed to set error condition", "err", err) | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Надо прояснить этот момент и сделать как надо
| return reconcile.Result{}, fmt.Errorf("getting RVR for patch: %w", err) | ||
| } | ||
| if err := api.PatchStatusWithConflictRetry(ctx, r.Cl, freshRVR, func(currentRVR *v1alpha3.ReplicatedVolumeReplica) error { | ||
| // Check again if nodeID is already set (handles race condition where another worker set it during retry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Этого не может произойти. Одновременно два одинаковых объекта не рекосайлятся
| func createRVR(name, volumeName, nodeName string) *v1alpha3.ReplicatedVolumeReplica { | ||
| return &v1alpha3.ReplicatedVolumeReplica{ | ||
| ObjectMeta: metav1.ObjectMeta{ | ||
| Name: name, | ||
| }, | ||
| Spec: v1alpha3.ReplicatedVolumeReplicaSpec{ | ||
| ReplicatedVolumeName: volumeName, | ||
| NodeName: nodeName, | ||
| }, | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
То что внутри этой функции читаемее сигнатуры. Давай заинлайним
Description
Implements the
rvr-status-config-node-id-controlleras specified indocs/dev/spec_v1alpha3.md. The controller automatically assigns uniquenodeIdvalues (0-7) toReplicatedVolumeReplicaresources within the sameReplicatedVolume.Changes:
images/controller/internal/controllers/rvr_status_config_node_id/registry.godocs/dev/controllers/rvr_status_config_node_id/SPEC_COMPLIANCE.mdNo critical components affected: Only writes to RVR status subresource. No restarts or changes to existing components.
Why do we need it, and what problem does it solve?
DRBD requires each replica to have a unique
nodeId(0-7) for cluster identification. Without this controller,nodeIdassignment would be manual, error-prone, and not scalable.This controller automates the process, ensuring uniqueness, handling concurrent assignments safely, and providing error conditions when assignment fails (e.g., too many replicas).
Related spec:
docs/dev/spec_v1alpha3.md-rvr-status-config-node-id-controller[OK | priority: 5 | complexity: 2]What is the expected result?
After applying:
nodeIdinstatus.config.nodeId(range 0-7)ReplicatedVolumegets a uniquenodeIdConfigurationAdjusted=Falseif >8 replicas or all nodeIds usednodeIdHow to verify:
ReplicatedVolumeReplicawithoutnodeId→ check thatstatus.config.nodeIdis assignedMUST NOT change:
nodeIdremain unchangedChecklist