[Bug] When one master hangs and is marked failed, Fred still tries to connect to the failed master #363

@jazzbearz

Fred version - 10.1.0
Redis version - 8.2.2
Platform - linux
Deployment type - cluster

Describe the bug
When one of the masters fails or hangs, Redis still reports its link state as connected. Even though the master is flagged as failed, Fred keeps trying to connect to it and fails.

State:

CLUSTER NODES
b22bb3e1c186c159ea28d980dd84c9287bb0d25b 10...217:6379@16379 myself,master - 0 0 14 connected 0-5460
0b1b63b9d18f3a9ea4dc05f7a9cd774622130df1 10...174:6379@16379 slave a4fa13948ff2786503d53c7a2d0ac9891df3e649 0 1760462603059 13 connected
64745849f9c6311f2244d68adab084a62871ea9a 10...181:6380@16380 slave a4fa13948ff2786503d53c7a2d0ac9891df3e649 0 1760462603060 13 connected
c55bd01313d3d3530c133f312eaa68b37a75ef85 10...217:6380@16380 slave 2bd6e49b47c5425d04347892c539960b95e91305 0 1760462603000 5 connected
66dcc2bf84dc7cb9bccc61402d02de88015f87ac 10...17:6379@16379 slave 2bd6e49b47c5425d04347892c539960b95e91305 0 1760462603059 5 connected
a4fa13948ff2786503d53c7a2d0ac9891df3e649 10...17:6380@16380 master - 0 1760462603562 13 connected 5461-10922
e218165bec05325b91096d6f5d7acdf57e40df67 10...174:6380@16380 slave b22bb3e1c186c159ea28d980dd84c9287bb0d25b 0 1760462603059 14 connected
ae39a1552ff5df0e6c635a5fe573c30e2fd22e29 10...93:6380@16380 slave,fail a4fa13948ff2786503d53c7a2d0ac9891df3e649 1760460320513 1760460319000 13 connected
2bd6e49b47c5425d04347892c539960b95e91305 10...181:6379@16379 master - 0 1760462603562 5 connected 10923-16383
c02388d3f5a1a3f52ee4add23a8ff13c397b70c3 10...93:6379@16379 master,fail - 1760460321516 1760460320010 1 connected
7cba3470ed9cf3a46b9779fb8bb1c72874777ecf 10...47:6379@16379 slave b22bb3e1c186c159ea28d980dd84c9287bb0d25b 0 1760462604065 14 connected
88d335f3d4450f30c4340cfc7e52e810250b3869 10...47:6380@16380 slave 2bd6e49b47c5425d04347892c539960b95e91305 0 1760462603562 5 connected
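
For anyone triaging this, here is a minimal sketch of how the output above can be read programmatically. It only uses the Rust standard library and is not Fred's internal parser; `NodeInfo`, `parse_node_line`, and `failed_masters` are hypothetical names made up for illustration. It assumes the usual `CLUSTER NODES` field order (id, address, flags, master id, ping-sent, pong-recv, config-epoch, link-state, slots).

```rust
/// One row of CLUSTER NODES output (hypothetical helper type, not part of Fred).
#[derive(Debug, PartialEq)]
struct NodeInfo {
    id: String,
    addr: String,
    flags: Vec<String>,
    link_state: String,
    slots: Vec<String>,
}

impl NodeInfo {
    /// True only for the confirmed `fail` flag, not the tentative `fail?` flag.
    fn is_failed(&self) -> bool {
        self.flags.iter().any(|f| f == "fail")
    }

    fn is_master(&self) -> bool {
        self.flags.iter().any(|f| f == "master")
    }
}

/// Parse a single whitespace-separated CLUSTER NODES line.
/// Returns None if the line has fewer than the 8 mandatory fields.
fn parse_node_line(line: &str) -> Option<NodeInfo> {
    let fields: Vec<&str> = line.split_whitespace().collect();
    if fields.len() < 8 {
        return None;
    }
    Some(NodeInfo {
        id: fields[0].to_string(),
        addr: fields[1].to_string(),
        flags: fields[2].split(',').map(str::to_string).collect(),
        link_state: fields[7].to_string(),
        // Anything after the link state is a slot or slot range; may be empty.
        slots: fields[8..].iter().map(|s| s.to_string()).collect(),
    })
}

/// Collect every node that is flagged both `master` and `fail`.
fn failed_masters(output: &str) -> Vec<NodeInfo> {
    output
        .lines()
        .filter_map(parse_node_line)
        .filter(|n| n.is_master() && n.is_failed())
        .collect()
}
```

Run against the state above, `failed_masters` returns only the `c02388d3...` node. Its link state is still `connected` even though it carries the `fail` flag and no longer owns any slots, which is the combination described in this report.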

To Reproduce
Steps to reproduce the behavior:

  1. Set up a cluster
  2. Initialize Fred via Pool
  3. Pause one of the masters with kill -STOP pid
  4. Try to set or get a key in that master's slot range
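
For step 4 it helps to know which slot a key maps to, so that the request really targets the paused master. Below is a rough standalone sketch of the Redis Cluster key-slot calculation (CRC16/XMODEM of the key, modulo 16384, with hash-tag handling). The function names `crc16_xmodem` and `key_slot` are made up for this example and are not Fred APIs. Running `CLUSTER KEYSLOT <key>` against the server gives the same answer.

```rust
/// CRC16/XMODEM (polynomial 0x1021, initial value 0), the variant Redis Cluster uses.
fn crc16_xmodem(data: &[u8]) -> u16 {
    let mut crc: u16 = 0;
    for &byte in data {
        crc ^= (byte as u16) << 8;
        for _ in 0..8 {
            crc = if crc & 0x8000 != 0 {
                (crc << 1) ^ 0x1021
            } else {
                crc << 1
            };
        }
    }
    crc
}

/// Hash slot for a key. If the key contains a non-empty `{...}` hash tag,
/// only the bytes between the first `{` and the next `}` are hashed.
fn key_slot(key: &[u8]) -> u16 {
    let hashed = match key.iter().position(|&b| b == b'{') {
        Some(open) => match key[open + 1..].iter().position(|&b| b == b'}') {
            Some(len) if len > 0 => &key[open + 1..open + 1 + len],
            _ => key, // no closing brace, or empty tag: hash the whole key
        },
        None => key,
    };
    crc16_xmodem(hashed) % 16384
}
```

Compare the result with the slot ranges in `CLUSTER NODES` (taken before pausing the master) to pick a key that belongs to the node being stopped.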

Logs
(If possible set RUST_LOG=fred=trace and run with --features debug-ids)
