@IvanOgurchenok

Description

Implements the rvr-status-config-node-id-controller as specified in docs/dev/spec_v1alpha3.md. The controller automatically assigns unique nodeId values (0-7) to ReplicatedVolumeReplica resources within the same ReplicatedVolume.

Changes:

  • New controller in images/controller/internal/controllers/rvr_status_config_node_id/
  • Controller registered in registry.go
  • Documentation: docs/dev/controllers/rvr_status_config_node_id/SPEC_COMPLIANCE.md

No critical components affected: the controller only writes to the RVR status subresource. It requires no restarts and makes no changes to existing components.

Why do we need it, and what problem does it solve?

DRBD requires each replica to have a unique nodeId (0-7) for cluster identification. Without this controller, nodeId assignment would be manual, error-prone, and not scalable.

This controller automates the process, ensuring uniqueness, handling concurrent assignments safely, and providing error conditions when assignment fails (e.g., too many replicas).

Related spec: docs/dev/spec_v1alpha3.md - rvr-status-config-node-id-controller [OK | priority: 5 | complexity: 2]

What is the expected result?

After applying:

  • New RVR resources automatically receive a unique nodeId in status.config.nodeId (range 0-7)
  • Each replica within the same ReplicatedVolume gets a unique nodeId
  • Different volumes have independent nodeId pools
  • Error handling: returns an error and sets ConfigurationAdjusted=False when a volume has more than 8 replicas or all nodeIds (0-7) are already in use
  • Idempotent: an already assigned nodeId is never changed
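
The expected behaviour listed above can be illustrated with a minimal sketch. It assumes the controller picks the lowest free nodeId, which the description does not state. The function name `lowestFreeNodeID` and the sentinel `errNoFreeNodeID` are hypothetical and do not come from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// Bounds of the nodeId range from the description (0-7).
const (
	minNodeID uint = 0
	maxNodeID uint = 7
)

// errNoFreeNodeID is a hypothetical sentinel error used only in this sketch.
var errNoFreeNodeID = errors.New("no free nodeId in range 0-7")

// lowestFreeNodeID returns the smallest nodeId in [minNodeID, maxNodeID]
// that does not appear in used. Values above maxNodeID are ignored.
// Picking the lowest free value is an assumption, not the PR's stated policy.
func lowestFreeNodeID(used []uint) (uint, error) {
	var taken [maxNodeID + 1]bool
	for _, id := range used {
		if id <= maxNodeID {
			taken[id] = true
		}
	}
	for id := minNodeID; id <= maxNodeID; id++ {
		if !taken[id] {
			return id, nil
		}
	}
	return 0, errNoFreeNodeID
}

func main() {
	id, err := lowestFreeNodeID([]uint{0, 1, 3})
	fmt.Println(id, err) // prints "2 <nil>"
}
```

Because each ReplicatedVolume has its own pool, such a function would be called with only the nodeIds of replicas that belong to the same volume.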

How to verify:

  • Create a new ReplicatedVolumeReplica without nodeId → check that status.config.nodeId is assigned
  • Create multiple replicas for the same volume → verify unique nodeIds

MUST NOT change:

  • Existing RVR resources with assigned nodeId remain unchanged
  • No changes to other controllers or components

Checklist

  • The code is covered by unit tests.
  • e2e tests passed.
  • Documentation updated according to the changes.
  • Changes were tested in the Kubernetes cluster manually.

- Implement controller for assigning unique nodeId (0-7) to RVR replicas
- Add request types and reconciler with conflict retry handling
- Add comprehensive unit tests covering all scenarios
- Add SPEC_COMPLIANCE documentation
- Register controller in registry
@asergunov asergunov marked this pull request as draft November 27, 2025 11:21
@IvanOgurchenok IvanOgurchenok self-assigned this Nov 27, 2025
- Use standard reconcile.Reconciler and .For()
- Use structured logging
- Remove custom Request types
- Add CONTROLLER_STYLE_GUIDE.md
- Remove unused fields from Reconciler struct (rdr, sch)
- Export Reconciler fields (Cl, Log, LogAlt) for testability
- Remove NewReconciler constructor, create struct directly in BuildController
- Update all field usages to exported fields (r.cl -> r.Cl, r.logAlt -> r.LogAlt)
- Update tests to use exported fields directly
- Remove unused runtime import

Follows updated CONTROLLER_STYLE_GUIDE.md standards for simplified controller structure.
@IvanOgurchenok IvanOgurchenok linked an issue Nov 28, 2025 that may be closed by this pull request
…d-controller

- Replace standard testing package with Ginkgo/Gomega framework
- Refactor all 9 tests to use Describe/It/BeforeEach pattern
- Use GinkgoLogr for integrated logging with Ginkgo
- Update SPEC_COMPLIANCE.md to reflect Ginkgo testing approach
- Add ginkgo/v2 and gomega dependencies to go.mod
@IvanOgurchenok IvanOgurchenok requested a review from astef November 28, 2025 09:59
@IvanOgurchenok IvanOgurchenok marked this pull request as ready for review November 28, 2025 10:00
@IvanOgurchenok IvanOgurchenok marked this pull request as draft November 28, 2025 14:49

Let's remove this.

Comment on lines +26 to +51
WithEventFilter(predicate.Funcs{
	CreateFunc: func(ce event.CreateEvent) bool {
		rvr, ok := ce.Object.(*v1alpha3.ReplicatedVolumeReplica)
		if !ok {
			return false
		}
		// Trigger only if nodeID is not set
		return rvr.Status == nil || rvr.Status.Config == nil || rvr.Status.Config.NodeId == nil
	},
	UpdateFunc: func(_ event.UpdateEvent) bool {
		// No-op: nodeID is immutable once set, so we only care about CREATE
		return false
	},
	DeleteFunc: func(_ event.DeleteEvent) bool {
		// No-op: deletion doesn't require nodeID assignment
		return false
	},
	GenericFunc: func(ge event.GenericEvent) bool {
		rvr, ok := ge.Object.(*v1alpha3.ReplicatedVolumeReplica)
		if !ok {
			return false
		}
		// Trigger only if nodeID is not set (for reconciliation on startup)
		return rvr.Status == nil || rvr.Status.Config == nil || rvr.Status.Config.NodeId == nil
	},
}).

Let's remove this.

LogAlt logr.Logger
}

var _ reconcile.Reconciler = &Reconciler{}

Suggested change
- var _ reconcile.Reconciler = &Reconciler{}
+ var _ reconcile.Reconciler = (*Reconciler)(nil)
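
For context, both forms in the suggestion are compile-time interface assertions. The sketch below shows the typed-nil form with a hypothetical local interface `reconciler` and type `nodeIDReconciler`, standing in for `reconcile.Reconciler` and the PR's `Reconciler` so that the example compiles without controller-runtime.

```go
package main

import "fmt"

// reconciler is a hypothetical stand-in for reconcile.Reconciler.
type reconciler interface {
	Reconcile(name string) error
}

// nodeIDReconciler is a hypothetical stand-in for the PR's Reconciler type.
type nodeIDReconciler struct {
	calls int
}

// Reconcile has a pointer receiver, so only *nodeIDReconciler
// satisfies the reconciler interface.
func (r *nodeIDReconciler) Reconcile(name string) error {
	r.calls++
	return nil
}

// Compile-time check: the build fails if *nodeIDReconciler stops
// implementing reconciler. The typed nil pointer expresses the check
// without constructing a struct value.
var _ reconciler = (*nodeIDReconciler)(nil)

func main() {
	r := &nodeIDReconciler{}
	_ = r.Reconcile("rvr-0")
	fmt.Println(r.calls) // prints "1"
}
```

The `&Reconciler{}` form checks the same thing; the typed-nil form simply makes it clear that no value is needed.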

Comment on lines +21 to +23
Cl client.Client
Log *slog.Logger
LogAlt logr.Logger

Make private

Comment on lines +57 to +64
// Filter by replicatedVolumeName
filteredRVRs := make([]v1alpha3.ReplicatedVolumeReplica, 0)
for _, item := range rvrList.Items {
	if item.Spec.ReplicatedVolumeName == rvr.Spec.ReplicatedVolumeName {
		filteredRVRs = append(filteredRVRs, item)
	}
}
rvrList.Items = filteredRVRs

There is slices.DeleteFunc for this.
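
As a rough illustration of this suggestion, the filtering loop above could be written with `slices.DeleteFunc` from the standard library (Go 1.21+). The `replica` type and the `keepVolume` helper are hypothetical stand-ins for `v1alpha3.ReplicatedVolumeReplica` and the reconciler code, used here so the sketch is self-contained.

```go
package main

import (
	"fmt"
	"slices"
)

// replica is a hypothetical stand-in for v1alpha3.ReplicatedVolumeReplica,
// reduced to the two fields this sketch needs.
type replica struct {
	Name                 string
	ReplicatedVolumeName string
}

// keepVolume drops every replica that does not belong to volumeName.
// slices.DeleteFunc works in place and preserves the order of the
// remaining elements, so the returned slice must be used afterwards.
func keepVolume(items []replica, volumeName string) []replica {
	return slices.DeleteFunc(items, func(r replica) bool {
		return r.ReplicatedVolumeName != volumeName
	})
}

func main() {
	items := []replica{
		{Name: "rvr-a", ReplicatedVolumeName: "vol-1"},
		{Name: "rvr-b", ReplicatedVolumeName: "vol-2"},
		{Name: "rvr-c", ReplicatedVolumeName: "vol-1"},
	}
	items = keepVolume(items, "vol-1")
	fmt.Println(len(items), items[0].Name, items[1].Name) // prints "2 rvr-a rvr-c"
}
```

Note that the predicate is inverted compared with the original loop: it returns true for the elements to delete, not the ones to keep.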

for _, item := range rvrList.Items {
	if item.Status != nil && item.Status.Config != nil && item.Status.Config.NodeId != nil {
		nodeID := *item.Status.Config.NodeId
		if nodeID >= minNodeID && nodeID <= maxNodeID {

The check above did not verify this. Is it really needed?

rvrList.Items = filteredRVRs

// Collect used nodeIDs
usedNodeIDs := make(map[uint]bool)

An empty struct would be more appropriate here than bool.
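
A minimal sketch of the set idiom this comment refers to, assuming nodeIds arrive as optional `*uint` values as in the status field. The helper name `collectUsed` is hypothetical and does not come from the PR.

```go
package main

import "fmt"

// collectUsed builds a set of assigned nodeIds. A nil entry models a
// replica whose nodeId has not been assigned yet and is skipped.
// map[uint]struct{} stores no value per key, which makes it clear that
// only key membership matters.
func collectUsed(ids []*uint) map[uint]struct{} {
	used := make(map[uint]struct{}, len(ids))
	for _, id := range ids {
		if id != nil {
			used[*id] = struct{}{}
		}
	}
	return used
}

func main() {
	a, b := uint(0), uint(3)
	used := collectUsed([]*uint{&a, nil, &b, &a})
	_, has3 := used[3]
	_, has1 := used[1]
	fmt.Println(len(used), has3, has1) // prints "2 true false"
}
```

With `map[uint]bool`, a lookup such as `used[1]` returns false both for a missing key and for a key stored as false. The empty-struct form avoids that ambiguity because membership is always tested with the two-value lookup.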

Comment on lines +80 to +92
// NOTE: Setting status condition is NOT in the spec.
// This was added to improve observability - administrators can see the problem
// in RVR status conditions instead of only in controller logs.
// To revert: remove the setNodeIDErrorCondition call and the function definition.
// The spec only requires returning an error, which we do below.
if err := r.setNodeIDErrorCondition(ctx, rvr, fmt.Sprintf(
	"too many replicas for volume %s: %d (maximum is %d)",
	rvr.Spec.ReplicatedVolumeName,
	totalReplicas,
	maxNodeID+1,
)); err != nil {
	log.Info("failed to set error condition", "err", err)
}

We need to clarify this point and do it properly.

	return reconcile.Result{}, fmt.Errorf("getting RVR for patch: %w", err)
}
if err := api.PatchStatusWithConflictRetry(ctx, r.Cl, freshRVR, func(currentRVR *v1alpha3.ReplicatedVolumeReplica) error {
	// Check again if nodeID is already set (handles race condition where another worker set it during retry)

This cannot happen. The same object is never reconciled by two workers at the same time.

Comment on lines +35 to +45
func createRVR(name, volumeName, nodeName string) *v1alpha3.ReplicatedVolumeReplica {
	return &v1alpha3.ReplicatedVolumeReplica{
		ObjectMeta: metav1.ObjectMeta{
			Name: name,
		},
		Spec: v1alpha3.ReplicatedVolumeReplicaSpec{
			ReplicatedVolumeName: volumeName,
			NodeName:             nodeName,
		},
	}
}

The body of this function is more readable than its signature. Let's inline it.

Development

Successfully merging this pull request may close these issues.

[controller] Implement rvr-status-config-node-id-controller

3 participants