Feature lighthouse host query filtering #1358

Draft: jampe wants to merge 4 commits into master from feature_lighthouse_host_query_filtering
Conversation

@jampe (Contributor) commented Mar 19, 2025

Note: This PR builds upon PR "Incoming Handshake filtering based on firewall rules" (#1357) and will remain in draft status until that PR is either merged or declined.

This pull request introduces the functionality to send the flattened firewall rules (hostnames, groups, group combinations, and CIDRs) to a node's lighthouses. After a node has transmitted this information, the lighthouse filters host queries for that node based on the provided whitelist. The node includes these flattened firewall rules in the initial HostUpdateNotification message and resends them whenever the data changes.
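As a rough illustration, the flattened data could be modeled along these lines (a sketch in Go; the field names below are placeholders, not the PR's actual protobuf definitions):

```go
// Sketch of the flattened whitelist a node could send to its lighthouses.
// Field names are illustrative; the PR defines the real structure in protobuf.
type HandshakeFilteringWhitelist struct {
	Hosts       []string   // allowed peer hostnames (certificate names)
	Groups      []string   // individually allowed groups
	GroupCombos [][]string // ANDed group combinations from the firewall rules
	CIDRs       []string   // allowed source networks, e.g. "10.0.0.0/8"
}
```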

Config changes:

A new configuration value has been added within the Lighthouse section to enable this feature:

# This setting on a lighthouse determines whether to enforce the host query protection
# whitelist received from a node. On a node, this setting controls whether the node
# sends its handshake filtering whitelist to the lighthouses at all.
#enable_host_query_protection: false
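For context, the flag would presumably be read through Nebula's config helper, roughly like so (a sketch; `config.C.GetBool` is an existing helper, the surrounding function is illustrative):

```go
import "github.com/slackhq/nebula/config"

// Sketch only: decide at startup whether the feature is active.
// The key sits in the lighthouse section, as in the config excerpt above.
func hostQueryProtectionEnabled(c *config.C) bool {
	return c.GetBool("lighthouse.enable_host_query_protection", false)
}
```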

Implementation details:

When sending a HostUpdateNotification message, HandshakeFilter:ToHandshakeFilteringWhitelist converts the whitelist from its map format into the corresponding protobuf structure; on the lighthouse side, HandshakeFilter:FromHandshakeFilteringWhitelist performs the inverse operation. The lighthouse then stores the resulting HandshakeFilter instance in the node's RemoteList by invoking RemoteList:unlockedSetHandshakeFilteringWhitelist().
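A rough sketch of that flow (only the method names above are from this PR; the message helpers and field names here are placeholders):

```go
// Node side: attach the flattened whitelist to the outgoing update.
func buildHostUpdate(hf *HandshakeFilter) *NebulaMeta {
	msg := newHostUpdateNotification() // hypothetical helper
	msg.Details.HandshakeFilteringWhitelist = hf.ToHandshakeFilteringWhitelist()
	return msg
}

// Lighthouse side: restore the filter and remember it for the sending node.
func handleHostUpdate(rl *RemoteList, msg *NebulaMeta) {
	hf := FromHandshakeFilteringWhitelist(msg.Details.HandshakeFilteringWhitelist)
	rl.Lock() // "unlocked" methods expect the caller to hold the RemoteList lock
	rl.unlockedSetHandshakeFilteringWhitelist(hf)
	rl.Unlock()
}
```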

Upon receiving host queries, the lighthouse checks for any stored rules associated with the queried node. If rules are present, the whitelist is consulted to determine whether the query should be permitted.
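In sketch form, the lighthouse-side check could look like this (the accessor and matcher names are placeholders; only the behaviour described above is from the PR):

```go
import "net/netip"

// Sketch of the query-time decision on the lighthouse. If the queried node
// never sent a whitelist, behave as before and answer the query.
func shouldAnswerHostQuery(queried *RemoteList, querierName string, querierGroups []string, querierIP netip.Addr) bool {
	wl := queried.GetHandshakeFilteringWhitelist() // placeholder accessor
	if wl == nil {
		return true // no whitelist stored for this node: no filtering
	}
	// Permit the query only if the querier's name, groups/group combinations,
	// or IP (checked against the CIDR list) appear in the whitelist.
	return wl.AllowsHost(querierName) ||
		wl.AllowsGroups(querierGroups) ||
		wl.AllowsCIDR(querierIP)
}
```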

To accommodate the flattened firewall rules sent to the lighthouse, the HostUpdateNotification message has been extended. The whitelist is sent to the lighthouses only on the initial request, when the whitelist changes locally, or when the tunnel to the lighthouse is rebuilt. This approach helps reduce unnecessary network load for deployments running Nebula with many nodes.
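One simple way to detect "the whitelist changed locally" is to compare a hash of the serialized whitelist against the last payload that was sent; the following is an illustrative sketch only, not necessarily the PR's mechanism:

```go
import (
	"bytes"
	"crypto/sha256"
)

var lastSentWhitelistHash []byte

// Sketch: resend only when the serialized whitelist actually changed.
func whitelistNeedsResend(serialized []byte) bool {
	sum := sha256.Sum256(serialized)
	if bytes.Equal(sum[:], lastSentWhitelistHash) {
		return false // unchanged since the last HostUpdateNotification
	}
	lastSentWhitelistHash = sum[:]
	return true
}
```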

To identify potentially malicious or infected nodes, a new metric, "lighthouse.hostqueries.filtered", has been introduced to track filtered host queries on a lighthouse.
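Nebula's metrics are built on rcrowley/go-metrics, so the counter would presumably be registered and incremented roughly like this (sketch; the surrounding function is illustrative):

```go
import "github.com/rcrowley/go-metrics"

// Registered once, e.g. while setting up the lighthouse handler.
var filteredQueries = metrics.GetOrRegisterCounter("lighthouse.hostqueries.filtered", nil)

func recordFilteredHostQuery() {
	// Bumped each time a host query is dropped by the whitelist check.
	filteredQueries.Inc(1)
}
```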

jampe added 2 commits March 19, 2025 18:48 (signed with the committer's verified signature)

@jampe force-pushed the feature_lighthouse_host_query_filtering branch from 864bc8e to f7f3ff3 on March 19, 2025 17:53
@JackDoan (Collaborator) commented:

This is a really interesting concept, thanks for putting this together!

If I understand this correctly, it looks like you're deciding whether or not to filter a host query based on if that host's firewall would permit a handshake, which definitely makes sense. I was curious if you had considered an implementation where hosts send their entire firewall to the lighthouse? The lighthouse wouldn't be able to evaluate protocol+port rules, obviously, but in theory, you could allow the lighthouse to filter out queries that it "knows" would be unable to communicate with the queried-for host.

One potential issue that comes to mind with both approaches is the potential size of the HandshakeFilteringWhitelist field. The internet is not very good at delivering fragmented packets, and if there are a lot of rules (or rules with very long group names!), the NebulaMeta packet would potentially be undeliverable. This problem technically already exists for certificates, but I think it's more likely to crop up with firewall rules, since they frequently end up needing to express many different combinations, rather than describe a single host, like a cert does.

@jampe (Contributor, Author) commented Mar 20, 2025

> If I understand this correctly, it looks like you're deciding whether or not to filter a host query based on if that host's firewall would permit a handshake, which definitely makes sense. I was curious if you had considered an implementation where hosts send their entire firewall to the lighthouse? The lighthouse wouldn't be able to evaluate protocol+port rules, obviously, but in theory, you could allow the lighthouse to filter out queries that it "knows" would be unable to communicate with the queried-for host.

Yes, that largely summarizes the functionality implemented by this PR. The foundational PR #1357 establishes filtering at the node level, covering connections from nodes that either already know the Nebula port or can infer it. This PR then extends the filtering capabilities to the lighthouse, implementing the approach you outlined.

Initially, my intention was to transmit the inbound firewall rule structure to the lighthouse. However, upon reviewing the code, I recognized that, given the information available to the lighthouse for filtering, I can omit some data. The HandshakeFilteringWhitelist structure contains deduplicated hosts, groups, combinations of groups (ANDed groups), CIDR blocks, and other relevant entities that the lighthouse can utilize for filtering purposes.

> One potential issue that comes to mind with both approaches is the potential size of the HandshakeFilteringWhitelist field. The internet is not very good at delivering fragmented packets, and if there are a lot of rules (or rules with very long group names!), the NebulaMeta packet would potentially be undeliverable. This problem technically already exists for certificates, but I think it's more likely to crop up with firewall rules, since they frequently end up needing to express many different combinations, rather than describe a single host, like a cert does.

Good input! While I considered strategies to minimize network and CPU load on both nodes and lighthouses - such as transmitting data only in the initial message or when local firewall rules are modified - I had not accounted for the issue of packet fragmentation.

A solution could be to create a dedicated message type. Since the user controls the MTU, I could bound packet sizes accordingly and split the data into multiple packets as necessary. Group names are also likely to be repeated across firewall rules, so compression could be effective in further reducing the size of the transmitted data. What do you think?
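To make that concrete, a dedicated message could carry the whitelist compressed and split into MTU-bounded chunks, roughly like this (an illustrative sketch only; chunk framing, reassembly, and the header margin are made up):

```go
import (
	"bytes"
	"compress/flate"
)

// Sketch of the proposal above: compress the serialized whitelist (repeated
// group names should compress well) and split the result into chunks that
// fit the configured MTU with some headroom for Nebula/protobuf headers.
func chunkWhitelist(serialized []byte, mtu int) ([][]byte, error) {
	var buf bytes.Buffer
	w, err := flate.NewWriter(&buf, flate.DefaultCompression)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(serialized); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}

	maxChunk := mtu - 200 // arbitrary header margin for this sketch
	data := buf.Bytes()
	var chunks [][]byte
	for len(data) > 0 {
		n := maxChunk
		if len(data) < n {
			n = len(data)
		}
		chunks = append(chunks, data[:n])
		data = data[n:]
	}
	return chunks, nil
}
```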
