bgpd: fix SRv6 L3VPN route leak to VPNv4 and VPNv6 SID nexthop validity #18322

Draft
wants to merge 1 commit into
base: master

Conversation

jvoss
Contributor

@jvoss jvoss commented Mar 6, 2025

When an IPv4 or IPv6 route belonging to a VRF is leaked to VPNv4 or VPNv6 using SRv6, the nexthop cache entry for the locally assigned SID never contains nexthops (bnc->nexthop_num > 0 is never true), so it never passes bgp_isvalid_nexthop().

This change updates leak_update_nexthop_valid() to assume the SID nexthop is valid if SRv6 is enabled, a SID is allocated, and the route is valid in the source table. It will still fail if SRv6 is enabled and neither a srv6_l3vpn nor a srv6_vpn SID is assigned.

There may be a better way to check the validity of the SID itself. For now, this change assumes that if the source route is valid and a SID could be assigned, the SID is probably functioning correctly.
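To make the intended rule easier to review, here is a minimal Python model of the decision described above. It is only a sketch: `LeakContext` and `leak_nexthop_valid` are hypothetical names, the fields are stand-ins for the state the C code consults, and the non-SRv6 branch is reduced to the `bnc->nexthop_num > 0` condition mentioned above rather than the full nexthop tracking logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeakContext:
    """Hypothetical stand-in for the state consulted during a leak; not FRR's structs."""

    srv6_enabled: bool
    srv6_l3vpn_sid: Optional[str]  # SID from the srv6_l3vpn attribute, if assigned
    srv6_vpn_sid: Optional[str]    # SID from the srv6_vpn attribute, if assigned
    source_route_valid: bool       # route is valid in its source (VRF) table
    bnc_nexthop_num: int           # nexthops resolved for the tracked nexthop


def leak_nexthop_valid(ctx: LeakContext) -> bool:
    """Model of the rule in this PR: with SRv6, trust the SID if one is
    assigned and the source route is valid; otherwise fall back to a
    simplified version of the resolved-nexthop check."""
    if ctx.srv6_enabled:
        sid_assigned = (
            ctx.srv6_l3vpn_sid is not None or ctx.srv6_vpn_sid is not None
        )
        return sid_assigned and ctx.source_route_valid
    # Simplified non-SRv6 path: valid only if the nexthop resolved to something.
    return ctx.bnc_nexthop_num > 0
```

Note that with SRv6 enabled the model never looks at `bnc_nexthop_num`, which is the point of the change: the locally assigned SID does not resolve to nexthops, so that check cannot succeed.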

Before details:

SRv6 details
R1# show ipv6 route
I>* fdff:def0:3::/48 [115/0] is directly connected, sr0, seg6local End -, weight 1, 00:11:17
B>* fdff:def0:3:412::/128 [20/0] is directly connected, TEST, seg6local End.DT46 table 1042, weight 1, 00:11:17
R1# show bgp segment-routing srv6
locator_name: MAIN
  prefix: fdff:def0:3::/48
  block-length: 32
  node-length: 16
  func-length: 16
  arg-length: 0
locator_chunks:
functions:
- sid: fdff:def0:3:412::
  locator: MAIN
- sid: fdff:def0:3:412::
  locator: MAIN
bgps:
- name: default
  vpn_policy[AFI_IP].tovpn_sid: (null)
  vpn_policy[AFI_IP6].tovpn_sid: (null)
  per-vrf tovpn_sid: (null)
- name: TEST
  vpn_policy[AFI_IP].tovpn_sid: (null)
  vpn_policy[AFI_IP6].tovpn_sid: (null)
  per-vrf tovpn_sid: fdff:def0:3:412::
Invalid nexthop leaked route to VPNv4
R1# show bgp ipv4 vpn 172.22.110.96/27
BGP routing table entry for 100.64.0.3:1042:172.22.110.96/27, version 0
not allocated
Paths: (1 available, no best path)
  Not advertised to any peer
  65010
    0.0.0.0 from :: (100.64.0.3) vrf TEST(4) announce-nh-self
    (fe80::100) (used)
      Origin IGP, metric 0, invalid, sourced, local
      Extended Community: RT:65000:1042
      Originator: 100.64.0.3
      Remote label: 16672
      Remote SID: fdff:def0:3::
      Last update: Wed Mar  5 19:00:12 2025
# debug bgp vpn leak-from-vrf

Mar 05 19:00:12 R1 bgpd[51149]: vpn_leak_from_vrf_update: from vrf VRF TEST
Mar 05 19:00:12 R1 bgpd[51149]: vpn_leak_from_vrf_update: post merge static_attr.ecommunity{65000:1042}
Mar 05 19:00:12 R1 bgpd[51149]: vpn_leak_from_vrf_update: new_attr->ecommunity{65000:1042}
Mar 05 19:00:12 R1 bgpd[51149]: leak_update: entry: leak-to=VRF default, p=172.22.110.96/27, type=10, sub_type=0
Mar 05 19:00:12 R1 bgpd[51149]: Found existing bnc fdff:def0:3:412::/128(0)(VRF TEST) flags 0x82 ifindex 0 #paths 2 peer 0x0, resolved prefix UNK prefix
Mar 05 19:00:12 R1 bgpd[51149]: leak_update_nexthop_valid: 172.22.110.96/27 nexthop is not valid (in VRF TEST)
Mar 05 19:00:12 R1 bgpd[51149]: leak_update: ->VRF default: 172.22.110.96/27: Added new route
Invalid nexthop leaked route to VPNv6
R1# show bgp ipv6 vpn fd0f:6965:e916::/48
BGP routing table entry for 100.64.0.3:1042:fd0f:6965:e916::/48, version 0
not allocated
Paths: (1 available, no best path)
  Not advertised to any peer
  65010
    fe80::100 from :: (100.64.0.3) vrf TEST(4) announce-nh-self
    (fe80::100) (used)
      Origin IGP, metric 0, invalid, sourced, local
      Extended Community: RT:65000:1042
      Originator: 100.64.0.3
      Remote label: 16672
      Remote SID: fdff:def0:3::
      Last update: Wed Mar  5 19:00:12 2025
# debug bgp vpn leak-from-vrf

Mar 05 19:02:12 R1 bgpd[12231]: vpn_leak_from_vrf_update: from vrf VRF TEST
Mar 05 19:02:12 R1 bgpd[12231]: vpn_leak_from_vrf_update: post merge static_attr.ecommunity{65000:1042}
Mar 05 19:02:12 R1 bgpd[12231]: vpn_leak_from_vrf_update: new_attr->ecommunity{65000:1042}
Mar 05 19:02:12 R1 bgpd[12231]: leak_update: entry: leak-to=VRF default, p=fd0f:6965:e916::/48, type=10, sub_type=0
Mar 05 19:02:12 R1 bgpd[12231]: Found existing bnc fdff:def0:3:412::/128(0)(VRF TEST) flags 0x82 ifindex 0 #paths 2 peer 0x0, resolved prefix UNK prefix
Mar 05 19:02:12 R1 bgpd[12231]: leak_update_nexthop_valid: fd0f:6965:e916::/48 nexthop is not valid (in VRF TEST)
Mar 05 19:02:12 R1 bgpd[12231]: leak_update: ->VRF default: fd0f:6965:e916::/48: Added new route
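For anyone comparing captures, the line that matters in the debug output above is the `leak_update_nexthop_valid` verdict. A small helper along these lines can pull it out of a saved log. This is a sketch that assumes the message format shown in the logs above; `nexthop_verdicts` is a hypothetical name, not part of FRR or the topotest library.

```python
import re
from typing import Dict

# Matches lines such as:
#   leak_update_nexthop_valid: 172.22.110.96/27 nexthop is not valid (in VRF TEST)
VERDICT_RE = re.compile(
    r"leak_update_nexthop_valid: (?P<prefix>\S+) nexthop is "
    r"(?P<neg>not )?valid \(in VRF (?P<vrf>[^)]+)\)"
)


def nexthop_verdicts(log_text: str) -> Dict[str, bool]:
    """Return {prefix: is_valid} for every verdict line found in log_text.

    If a prefix appears more than once, the last verdict wins.
    """
    verdicts: Dict[str, bool] = {}
    for line in log_text.splitlines():
        match = VERDICT_RE.search(line)
        if match:
            verdicts[match.group("prefix")] = match.group("neg") is None
    return verdicts
```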

After details:

Valid nexthop leaked route to VPNv4
R1# show bgp ipv4 vpn 172.22.110.96/27
BGP routing table entry for 100.64.0.3:1042:172.22.110.96/27, version 454
not allocated
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  fdff:def0:1::1 fdff:def0:2::1
  65010
    0.0.0.0 from :: (100.64.0.3) vrf TEST(4) announce-nh-self
    (fe80::100) (used)
      Origin IGP, metric 0, valid, sourced, local, best (First path received)
      Extended Community: RT:65000:1042
      Originator: 100.64.0.3
      Remote label: 16672
      Remote SID: fdff:def0:3::
      Last update: Wed Mar  5 20:33:47 2025
Mar 05 20:33:47 R1 bgpd[56322]: vpn_leak_from_vrf_update: from vrf VRF TEST
Mar 05 20:33:47 R1 bgpd[56322]: vpn_leak_from_vrf_update: post merge static_attr.ecommunity{65000:1042}
Mar 05 20:33:47 R1 bgpd[56322]: vpn_leak_from_vrf_update: new_attr->ecommunity{65000:1042}
Mar 05 20:33:47 R1 bgpd[56322]: leak_update: entry: leak-to=VRF default, p=172.22.110.96/27, type=10, sub_type=0
Mar 05 20:33:47 R1 bgpd[56322]: Found existing bnc fdff:def0:3:412::/128(0)(VRF TEST) flags 0x82 ifindex 0 #paths 2 peer 0x0, resolved prefix UNK prefix
Mar 05 20:33:47 R1 bgpd[56322]: leak_update_nexthop_valid: 172.22.110.96/27 nexthop is valid (in VRF TEST)
Mar 05 20:33:47 R1 bgpd[56322]: leak_update: ->VRF default: 172.22.110.96/27: Added new route
Valid nexthop leaked route to VPNv6
R1# show bgp ipv6 vpn fd0f:6965:e916::/48
BGP routing table entry for 100.64.0.3:1042:fd0f:6965:e916::/48, version 125
not allocated
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  fdff:def0:1::1 fdff:def0:2::1
  65010
    fe80::100 from :: (100.64.0.3) vrf TEST(4) announce-nh-self
    (fe80::100) (used)
      Origin IGP, metric 0, valid, sourced, local, best (First path received)
      Extended Community: RT:65000:1042
      Originator: 100.64.0.3
      Remote label: 16672
      Remote SID: fdff:def0:3::
      Last update: Wed Mar  5 20:33:47 2025
Mar 06 20:33:47 R1 bgpd[66511]: vpn_leak_from_vrf_update: from vrf VRF TEST
Mar 06 20:33:47 R1 bgpd[66511]: vpn_leak_from_vrf_update: post merge static_attr.ecommunity{65000:1042}
Mar 06 20:33:47 R1 bgpd[66511]: vpn_leak_from_vrf_update: new_attr->ecommunity{65000:1042}
Mar 06 20:33:47 R1 bgpd[66511]: leak_update: entry: leak-to=VRF default, p=fd0f:6965:e916::/48, type=10, sub_type=0
Mar 06 20:33:47 R1 bgpd[66511]: Found existing bnc fdff:def0:3:412::/128(0)(VRF TEST) flags 0x82 ifindex 0 #paths 4 peer 0x0, resolved prefix UNK prefix
Mar 06 20:33:47 R1 bgpd[66511]: leak_update_nexthop_valid: fd0f:6965:e916::/48 nexthop is valid (in VRF TEST)
Mar 06 20:33:47 R1 bgpd[66511]: leak_update: ->VRF default: fd0f:6965:e916::/48: Added new route
Received IPv4 routing
R2# show bgp ipv4 vpn 172.22.110.96/27
BGP routing table entry for 100.64.0.3:1042:172.22.110.96/27, version 454
not allocated
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  fdff:def0:2::1 fdff:def0:3::1 fdff:def0:4::1 fdff:def0:5::1
  65010, (Received from a RR-client)
    0.0.0.0 (metric 16) from fdff:def0:3::1 (100.64.0.3)
      Origin IGP, metric 0, localpref 100, valid, internal, best (First path received)
      Extended Community: RT:65000:1042
      Remote label: 16672
      Remote SID: fdff:def0:3::
      Last update: Wed Mar  5 20:33:47 2025
R2# show bgp vrf TEST ipv4 172.22.110.96/27
BGP routing table entry for 172.22.110.96/27, version 614
Paths: (1 available, best #1, vrf TEST)
  Not advertised to any peer
  Imported from 100.64.0.3:1042:172.22.110.96/27
  65010
    fdff:def0:3::1 (metric 16) from :: (172.20.19.64) vrf default(0) announce-nh-self
      Origin IGP, metric 0, localpref 100, valid, sourced, local, best (First path received), rpki validation-state: valid
      Extended Community: RT:65000:1042
      Remote label: 16672
      Remote SID: fdff:def0:3::
      Last update: Wed Mar  5 20:33:47 2025
R2# show ip route vrf TEST | grep 172.22.110.96/27
B>  172.22.110.96/27 [200/0] via fdff:def0:3::1 (vrf default) (recursive), label 16672, seg6 fdff:def0:3:412::, weight 1, 00:12:18

R2# show ip route vrf TEST 172.22.110.96/27
Routing entry for 172.22.110.96/27
  Known via "bgp", distance 200, metric 0, vrf TEST, best
  Last update 00:10:31 ago
    fdff:def0:3::1(vrf default) (recursive), label 16672, weight 1
  *   fe80::38d5:21ff:fea7:f3b1, via l2tpeth3(vrf default), label 16672, weight 1
R2# ping 172.22.110.97 vrf TEST source-address 172.20.19.64
PING 172.22.110.97 (172.22.110.97) from 172.20.19.64 : 56(84) bytes of data.
64 bytes from 172.22.110.97: icmp_seq=1 ttl=64 time=0.838 ms
64 bytes from 172.22.110.97: icmp_seq=2 ttl=64 time=0.806 ms
64 bytes from 172.22.110.97: icmp_seq=3 ttl=64 time=0.907 ms
Received IPv6 routing
R2# show bgp ipv6 vpn fd0f:6965:e916::/48
BGP routing table entry for 100.64.0.3:1042:fd0f:6965:e916::/48, version 1146
not allocated
Paths: (1 available, best #1)
  Advertised to non peer-group peers:
  fdff:def0:2::1 fdff:def0:3::1 fdff:def0:4::1 fdff:def0:5::1
  65010, (Received from a RR-client)
    fdff:def0:3::1 (metric 16) from fdff:def0:3::1 (100.64.0.3)
      Origin IGP, metric 0, localpref 0, valid, internal, best (First path received)
      Extended Community: RT:65000:1042
      Remote label: 16672
      Remote SID: fdff:def0:3::
      Last update: Wed Mar  5 20:33:47 2025
R2# show bgp vrf TEST ipv6 fd0f:6965:e916::/48
BGP routing table entry for fd0f:6965:e916::/48, version 1277
Paths: (1 available, best #1, vrf TEST)
  Not advertised to any peer
  Imported from 100.64.0.3:1042:fd0f:6965:e916::/48
  65010
    fdff:def0:3::1 (metric 16) from :: (172.20.19.64) vrf default(0) announce-nh-self
      Origin IGP, metric 0, localpref 0, valid, sourced, local, best (First path received), rpki validation-state: valid
      Extended Community: RT:65000:1042
      Remote label: 16672
      Remote SID: fdff:def0:3::
      Last update: Wed Mar  5 20:33:47 2025
R2# show ipv6 route vrf TEST | grep /48
B>  fd0f:6965:e916::/48 [200/0] via fdff:def0:3::1 (vrf default) (recursive), label 16672, seg6 fdff:def0:3:412::, weight 1, 00:10:30

R2# show ipv6 route vrf TEST fd0f:6965:e916::/48
Routing entry for fd0f:6965:e916::/48
  Known via "bgp", distance 200, metric 0, vrf TEST, best
  Last update 00:10:31 ago
    fdff:def0:3::1(vrf default) (recursive), label 16672, weight 1
  *   fe80::38d5:21ff:fea7:f3b1, via l2tpeth3(vrf default), label 16672, weight 1
R2# ping fd0f:6965:e916::1 vrf TEST source-address fdb1:e72a:343d::1
PING fd0f:6965:e916::1(fd0f:6965:e916::1) from fdb1:e72a:343d::1 : 56 data bytes
64 bytes from fd0f:6965:e916::1: icmp_seq=1 ttl=64 time=0.816 ms
64 bytes from fd0f:6965:e916::1: icmp_seq=2 ttl=64 time=0.995 ms
64 bytes from fd0f:6965:e916::1: icmp_seq=3 ttl=64 time=0.763 ms
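
A note on reading the output above: the BGP entries show `Remote SID: fdff:def0:3::` with `Remote label: 16672`, while R2 installs `seg6 fdff:def0:3:412::`. These numbers are consistent with the 16-bit function part being carried in the upper bits of the 20-bit label value (16672 >> 4 == 0x412) and placed back after the 48-bit block and node parts. The sketch below reconstructs the SID under that assumption. `untranspose_sid` is a hypothetical helper, and the offset and length are taken from the block-length, node-length, and func-length shown in the locator output, not from the code in this PR.

```python
import ipaddress


def untranspose_sid(
    locator_sid: str, label: int, offset_bits: int, length_bits: int
) -> str:
    """Rebuild a full SID from a locator-only SID plus the bits carried in
    the most significant part of a 20-bit label value."""
    if not 0 < length_bits <= 20:
        raise ValueError("length_bits must be between 1 and 20")
    if offset_bits < 0 or offset_bits + length_bits > 128:
        raise ValueError("transposed bits must fit inside a 128-bit SID")
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")

    # The transposed bits sit at the top of the 20-bit label value.
    func = (label >> (20 - length_bits)) & ((1 << length_bits) - 1)
    base = int(ipaddress.IPv6Address(locator_sid))
    # Shift the function so it starts offset_bits from the top of the address.
    shift = 128 - offset_bits - length_bits
    return str(ipaddress.IPv6Address(base | (func << shift)))
```

With the values from this capture, `untranspose_sid("fdff:def0:3::", 16672, 48, 16)` yields the same SID that R2 shows in `show ip route vrf TEST`.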

Contributor

@cscarpitta cscarpitta left a comment

We have several topotests to validate SRv6 L3VPN route leaking.
These are not showing the issue.

I assume you have a scenario / topology that is not covered by the current topotests.

Can you add a new topotest that reproduces your scenario and shows that your patch fixes the issue?

@jvoss
Contributor Author

jvoss commented Mar 9, 2025

I will attempt to work on that. However, I have had some trouble recently with the topotests functioning locally.

I originally suspected it had to do with a neighbor within a VRF using link-local addressing, specifically in combination with extended nexthop. However, the issue persisted regardless.

I was able to determine that when the SID was evaluated as a nexthop during the leak to VPN, nh_valid never ended up true. After applying the changes in this PR, everything functioned as expected.

@jvoss jvoss force-pushed the jvoss/srv6_l3vpn_fix branch from 2627cf1 to c00225e Compare March 10, 2025 23:00
@github-actions github-actions bot added size/XXL and removed size/S labels Mar 10, 2025
@jvoss jvoss force-pushed the jvoss/srv6_l3vpn_fix branch from c00225e to 3c2c99a Compare March 10, 2025 23:03
@jvoss
Contributor Author

jvoss commented Mar 10, 2025

@cscarpitta I have added a topotest to demonstrate the desired behavior and rebased on the latest master. I noticed that none of the other SRv6 tests include learning routes from a CE and distributing them via VPNv4/VPNv6. They currently only test leaking connected routes, which may explain why this has gone unnoticed.

Additional odd behavior observed here is that the interface must enter promiscuous mode (tcpdump) and capture traffic from a "ce" before it marks a route as valid; even within the VRF itself. I encourage you to try this by commenting out these lines in the topotest and experimenting with it interactively:

# tests/topotests/bgp_srv6l3vpn_to_bgp_vrf4/bgp_srv6l3vpn_to_bgp_vrf4.py

wait_for_ce_traffic("r1", "eth1")
wait_for_ce_traffic("r2", "eth1")

However, this may not be directly related to SRv6; I have seen it before, seemingly at random.

@jvoss jvoss force-pushed the jvoss/srv6_l3vpn_fix branch from 3c2c99a to 8a9d13e Compare March 12, 2025 03:08
@jvoss
Contributor Author

jvoss commented Mar 12, 2025

@cscarpitta I've refactored this a bit more after additional testing.

With SRv6 in use, the nh_valid check now marks a route valid in the VPN tables only if the route is valid in its source table. This avoids advertising potentially invalid routes. I have also updated the topotest to not use link-local addressing in its BGP peering to avoid other unrelated complications.

I was also able to narrow down the previous behavior requiring tcpdump. It appears that Zebra needs to be kicked after routes are exported to the VPN tables; otherwise they remain invalid indefinitely, for a reason I have not identified. The topotest does this by waiting for the routes to be exported and then re-issuing the default configuration command bgp network import-check. This seemed to be enough to cause Zebra to re-evaluate.
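
The wait-then-kick sequence described above can be sketched roughly as follows. This is not the code in the topotest: `wait_then_kick`, `routes_exported`, and `run_config` are hypothetical names, the callables stand in for whatever the test uses to query and configure the router, and the `router bgp <asn>` context is an assumption about where the command is re-issued.

```python
import time
from typing import Callable, Sequence


def wait_then_kick(
    routes_exported: Callable[[], bool],
    run_config: Callable[[Sequence[str]], None],
    asn: int,
    timeout: float = 60.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until routes_exported() is true, then re-issue
    'bgp network import-check' to prompt a re-evaluation.

    Returns False without touching the config if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while not routes_exported():
        if time.monotonic() >= deadline:
            return False
        sleep(interval)

    # Re-applying the (default) setting is the "kick" described above.
    run_config(
        ["configure terminal", f"router bgp {asn}", "bgp network import-check"]
    )
    return True
```

Passing the query and configuration steps in as callables keeps the sketch independent of the test harness, so the polling logic can be exercised without a running router.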

@jvoss jvoss force-pushed the jvoss/srv6_l3vpn_fix branch 4 times, most recently from 22ff833 to 3e188e0 Compare March 12, 2025 15:47
When an IPv4 or IPv6 route belonging to a VRF is leaked to VPNv4 or VPNv6 using
SRv6, the locally assigned SID never contains nexthops (bnc->nexthop_num > 0)
and never passes bgp_isvalid_nexthop().

This change updates leak_update_nexthop_valid() to assume the SID nexthop is
valid if SRv6 is enabled, a SID is allocated and the route is valid in the
source table. It will still fail if SRv6 is enabled and neither a srv6_l3vpn
nor a srv6_vpn SID is assigned.

Signed-off-by: Jonathan Voss <[email protected]>
@jvoss jvoss force-pushed the jvoss/srv6_l3vpn_fix branch from 3e188e0 to 86f71b6 Compare March 12, 2025 16:37
@jvoss
Contributor Author

jvoss commented Mar 12, 2025

Removed all ephemeral items from topotest json fixtures.

One-off unrelated failure in: TopoTests Ubuntu 22.04 amd64 Part 9 only:
pim_boundary_acl.test_pim_boundary_acl test_pim_asm_igmp_join_acl

@jvoss jvoss marked this pull request as draft March 13, 2025 19:00