Skip to content

Conversation

@carlhoerberg
Copy link
Member

Summary

  • Adds OSR (out-of-sync replica) tracking alongside existing ISR tracking
  • Exposes follower sync state directly in API response for better cluster visibility
  • Updates web UI to clearly display sync status with visual indicators

Backend Changes

  • Clustering Server: Track OSR in etcd at lavinmq/osr key, updating when followers connect, sync, or disconnect
  • Follower Serialization: Add state property ("Syncing" or "Synced") to follower JSON output
  • Nodes API: Return all_followers (both synced and syncing) instead of only synced followers

Frontend Changes

  • Nodes View: Add "Status" column to followers table between ID and Remote address
  • Status Display:
    • Out-of-sync replicas shown in red and bold
    • In-sync replicas shown in green
    • Status determined directly from follower state property

Technical Details

The implementation mirrors the existing ISR tracking pattern:

  • OSR tracked with @dirty_osr flag and update_osr method
  • State updates triggered at follower lifecycle events
  • Clean API design - sync state embedded in each follower object, no separate arrays needed

Test Plan

  • Verify OSR tracking updates in etcd when followers connect
  • Confirm syncing followers appear as "Out-of-sync" in web UI
  • Verify followers transition to "In-sync" after full sync completes
  • Check OSR updates when followers disconnect
  • Test with multiple followers in various sync states

🤖 Generated with Claude Code

Tracks syncing followers alongside in-sync replicas for better cluster visibility. Each follower now exposes its sync state (Syncing/Synced) directly in the API response, eliminating the need for separate ISR/OSR arrays.

Backend changes:
- Track OSR in etcd at lavinmq/osr key alongside existing ISR tracking
- Add state property to Follower JSON serialization
- Return all followers (synced + syncing) in nodes API

Frontend changes:
- Add Status column to followers table showing sync state
- Display out-of-sync replicas in red/bold, in-sync in green
- Simplify rendering by using state property directly from API

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@carlhoerberg carlhoerberg requested a review from a team as a code owner November 22, 2025 07:15
@claude
Copy link

claude bot commented Nov 22, 2025

Claude finished @carlhoerberg's task —— View job


Reviewing OSR Tracking Implementation

Todo List:

  • Read and analyze clustering/follower.cr changes
  • Read and analyze clustering/server.cr changes
  • Read and analyze http/controller/nodes.cr changes
  • Read and analyze static/js/nodes.js changes
  • Read and analyze views/nodes.ecr changes
  • Identify any bugs, security issues, performance problems, or missing error handling
  • Post review findings

Review Findings

Bug in clustering/server.cr:238

  • Empty string handling issue: When no followers are syncing, ids will be an empty string but rstrip(',') is still called
  • This could cause etcd to store an empty key value instead of properly handling the empty case
  • Fix needed: Check if string is empty before calling rstrip

Missing Error Handling in clustering/server.cr:240

  • @etcd.put(osr_key, ids) lacks error handling for etcd connection failures
  • Unlike ISR updates, OSR updates could silently fail and leave cluster state inconsistent
  • Fix needed: Add proper error handling around etcd operations

Performance Concern in JavaScript

  • static/js/nodes.js:186: Using fallback 'Unknown' state could mask actual sync issues in UI
  • No validation that item.state contains expected values ('Syncing' or 'Synced')
  • Improvement needed: Add validation or logging for unexpected state values

No security vulnerabilities or Crystal anti-patterns found. The implementation follows existing patterns correctly.


@carlhoerberg
Copy link
Member Author

Fixes #1425

@carlhoerberg carlhoerberg added this to the 2.7.0 milestone Nov 24, 2025
@carlhoerberg carlhoerberg linked an issue Nov 24, 2025 that may be closed by this pull request
@carlhoerberg carlhoerberg marked this pull request as draft December 5, 2025 23:20
@carlhoerberg
Copy link
Member Author

Closed in favor of #1537

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

OSR(out of sync replicas) for better nodes interface

2 participants