-
Notifications
You must be signed in to change notification settings - Fork 2.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Feature] Introduce IP-level bans for connection spam targeting validators #3366
base: staging
Are you sure you want to change the base?
Conversation
Signed-off-by: ljedrz <[email protected]>
Signed-off-by: ljedrz <[email protected]>
Signed-off-by: ljedrz <[email protected]>
Thx for making this! @kpandl would you like to take the first pass review in the coming week? |
Generally looks good to me. Can you explain the setting of the const variables ( I've seen 24 hour bans in the Bitcoin protocol, but I've read other sources that suggest it could primarily limit DOS attacks over Tor. Either way, setting the values in a config could also be an interesting feature in the long term. |
@kpandl the values were chosen semi-arbitrarily; basically any limit imposed on the number of connection attempts will suffice to avoid a "flood", but we may decide to be as harsh as we like. Technically, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional testing/validation needed
|
||
// Check the previous low-level connection timestamp. | ||
if let Some(peer_stats) = self.tcp.known_peers().get(peer_addr.ip()) { | ||
if peer_stats.timestamp().elapsed().as_secs() <= MIN_CONNECTION_INTERVAL_IN_SECS { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meshiest wrote: "in my test environment this is permanently banning my local validators (they're connecting to eachother while booting up)"
@Meshiest can you describe your test in more detail? Most importantly how many validators, from genesis or not, and what rough machine specs? That would be useful for @ljedrz to reproduce and chime in on an alternative design or constants.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have an entirely local 4 validator, 4 bootstrap+core clients, and 4 fringe client network for testing another change & made a few snarkos changes to support 127.x.y.z IPs so the peers would have different addresses while still being local and having default ports (can probably push these somewhere). I also updated the bootstrap peers to be the core clients
When booting the nodes all at the same time, validators are starved for peers and are the initiators for all their requests. This results in the validators getting banned both by other validators as well as their own core clients in the non-gateway concurrent con check.
I'm not certain my IP changes to support other local addresses are contributing but they are part of the equation and it could be related to the other testing I was doing at the time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Meshiest just to rule out other possible issues, could you try starting the nodes one-by-one and see if this avoids the bans? I agree that we should also support concurrent launch, this is just to double-check that it is this edge case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebuilding via:
- checkout [Feature] Introduce IP-level bans for connection spam targeting validators #3366
- cherry-pick Testing features for custom loopback isonets #3397 (I'm running all my nodes locally)
- merge mainnet
And I wasn't able to reproduce the parallel booting ban, though after more testing and rapidly restarting nodes I wasn't able to achieve any bans. Not entirely sure what's going on. Out of time until later today (at least 12 hours)
This PR primarily targets #3311, but my intention is for the new setup to be extensible based on potential future needs (e.g. detecting and banning for message spam).
The 1st commit alters the (currently mostly unused) low-level
KnownPeers
collection to work on IPs only (instead of IP+port pairs) and extends the relatedStats
object with atimestamp
field which is updated whenever a (low-level) connection is established.The new IP-level bans are introduced only at the (higher-level) handshake level for the following reasons:
Router
setupHeartbeat
can be used to clean up expired bansTcp
doesn't need to do any housekeeping)The downside is that several concurrent low-level connection attempts may still be accepted even post-IP-ban. However, this would also (though to a lesser extent) be the case if the ban check was moved "to the lower level", as we would still have to accept an inbound connection before being able to check its IP. That being said, these are relatively lightweight and still subject to peer limits.
The ban-related
const
s were picked arbitrarily and we may choose any other values instead. Also, this new feature is disabled for tests and--dev
runs (since they almost exclusively involve a single localhost IP).CI run is here