-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bgpd: fix SRv6 L3VPN route leak to VPNv4 and VPNv6 SID nexthop validity #18322
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have several topotests to validate SRv6 L3VPN route leaking.
These are not showing the issue.
I assume you have a scenario / topology that is not covered by the current topotests.
Can you add a new topotest that reproduces your scenario and shows that your patch fixes the issue?
I will attempt to work on that. However, I have had some trouble recently with the topotests functioning locally. I originally suspected it may have to do with a neighbor within a VRF using link-local addressing. More specifically when also using extended nexthop. However, I seemed to continue to have issues regardless. I was able to determine that when it evaluated the SID as a nexthop during the leak to VPN it was not resulting in nh_valid being true. After applying the changes in this PR, everything functioned as expected. |
2627cf1
to
c00225e
Compare
c00225e
to
3c2c99a
Compare
@cscarpitta I have added a topotest to demonstrate the desired behavior and rebased on the lastest master. I noticed that none of the other SRv6 tests include learning routes from a CE and distributing them via VPNv4/VPNv6. They currently only test leaking connected routes which may explain why this has gone unnoticed. Additional odd behavior observed here is that the interface must enter promiscuous mode (tcpdump) and capture traffic from a "ce" before it marks a route as valid; even within the VRF itself. I encourage you to try this by commenting out these lines in the topotest and experimenting with it interactively: # tests/topotests/bgp_srv6l3vpn_to_bgp_vrf4/bgp_srv6l3vpn_to_bgp_vrf4.py
wait_for_ce_traffic("r1", "eth1")
wait_for_ce_traffic("r2", "eth1") However this may not be directly related to SRv6 and I have experienced this seemingly randomly before. |
3c2c99a
to
8a9d13e
Compare
@cscarpitta I've refactored this a bit more after additional testing. The I was also able to narrow down the previous behavior requiring tcpdump. It appears as though Zebra needs kicked after routes are exported to the VPN tables otherwise they will remain invalid indefinitely for some unknown reason. This is accomplished in the topotest by waiting for the routes to be exported and then re-issuing the default configuration command |
22ff833
to
3e188e0
Compare
When an IPv4 or IPv6 route belonging to a VRF is leaked to VPNv4 or VPNv6 using SRv6, the locally assigned SID never contains nexthops (bnc->nexthop_num > 0) and never passes bgp_isvalid_nexthop(). This change updates leak_update_nexthop_valid() to assume the SID nexthop is valid if SRv6 is enabled, a SID is allocated and the route is valid in the source table. It will still fail if SRv6 is enabled and neither a srv6_l3vpn or a srv6_vpn SID is assigned. Signed-off-by: Jonathan Voss <[email protected]>
3e188e0
to
86f71b6
Compare
Removed all ephemeral items from topotest json fixtures. One-off unrelated failure in: TopoTests Ubuntu 22.04 amd64 Part 9 only: |
When an IPv4 or IPv6 route belonging to a VRF is leaked to VPNv4 or VPNv6 using SRv6, the locally assigned SID never contains nexthops (bnc->nexthop_num > 0) and never passes bgp_isvalid_nexthop().
This change updates leak_update_nexthop_valid() to assume the SID nexthop is valid if SRv6 is enabled, a SID is allocated and the route is valid in the source table. It will still fail if SRv6 is enabled and neither a srv6_l3vpn or a srv6_vpn SID is assigned.
There may be a better way to check the validity of the SID itself, however this change assumes that if the source route was valid and a SID is able to be assigned, it is probable that it is functioning correctly.
Before details:
SRv6 details
Invalid nexthop leaked route to VPNv4
Invalid nexthop leaked route to VPNv6
After details:
Valid nexthop leaked route to VPNv4
Valid nexthop leaked route to VPNv6
Received IPv4 routing
Received IPv6 routing