WhiteListRoundRobinPolicy - handle nodes with same ip #1253
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
When using cqlsh on a cassandra cluster where each node has the same IP but different ports, running
SELECT * FROM system.peers_v2;
will result in one of the peers being swapped for one of the local nodes.I tracked this issue down to the
python-driver
sWhiteListRoundRobinPolicy
which is used by cqlsh.Reproduction
I can reproduce the issue with this
docker-compose.yaml
, which creates a cluster of nodes accessible on 127.0.0.1 on ports 9042, 9043 and 9044:And this python-driver sample:
You will observe that each attempt prints a different list of peers, which should not be possible.
The problem is that the
WhiteListRoundRobinPolicy
checks only the hostname/address, so if all nodes have the same address but different ports they will all pass the whitelist.The fix
The first commit moves the endpoints into their own file, this was needed to avoid a cyclic dependency issue.
Then the second commit alters the
WhiteListRoundRobinPolicy
constructor to allow the user to optionally specify a port along with the hostname, which is used to make the whitelist more specific.The change is completely backwards compatible with the existing interface, I had to make the implementation a little complicated to handle the new functionality in a backwards compatible way.
I'm open to any alternatives to the constructor interface I used here.
I didnt really understand what the deal is with the
self._allowed_hosts = tuple(hosts)
line, it seems to be used for some sort of serialization thing, so I just left it as is. Let me know if it requires any changes.Process
It seems like a jira ticket isnt strictly needed for small changes, let me know if you need a ticket for this PR to be merged. And if so let me know how I should go about getting access, I tried and couldn't figure it out.