bgpd: check rmac and nh of evpn imported routes #18372

dawkopagh · 2025-03-12T16:19:11Z

Fix for an issue which has been described here: #18240
Basically, in a setup where multiple peers advertise identical EVPN type-5 routes, if we clear the session with any of them (and at least one is still up), we observe route withdrawals from the kernel, even though we still receive those routes from the remaining peers. If the number of routes is significant, this causes a noticeable downtime before the routes are re-added to the kernel.

I'd appreciate feedback on why a route withdrawal is necessary in such a case. Per the comment above, I assume that an rmac entry/nh might be lingering somewhere, but where? The change which introduced the withdrawal was merged a long time ago; is it still necessary?

Also, the mentioned behavior (where routes are withdrawn immediately even if other peers advertise them) started to occur after "backpressure bgp zebra client" #15524.

donaldsharp · 2025-03-12T17:13:44Z

Could you please add the verbiage you added to the PR to the actual commit message? I do not want to lose this data.

Check rmac and nh of new bestpath for evpn imported prefix before withdrawal.

Currently, when a new bestpath is designated, evpn imported routes are withdrawn from the kernel, causing downtime whose length depends on the number of routes to process.

Basically, in a setup where multiple peers advertise identical evpn type-5 routes, if we clear the session with any of them (and at least one is still up), we observe route withdrawals from the kernel, even though we still receive those routes from the remaining peers. If the number of routes is significant, this causes a noticeable downtime before the routes are re-added to the kernel.

Also, the mentioned behavior (where routes are withdrawn immediately even if other peers advertise them) started to occur after backpressure bgp zebra client FRRouting#15524.

Let's check the rmac entry and nh of the newly selected bestpath and not actually withdraw the routes from the kernel if those two are the same.

This fixes an issue where the same routes are advertised by multiple peers and we clear the session with one of them: FRRouting#18240

Signed-off-by: Dawid Kopec <[email protected]>
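The check described in the commit message can be modeled roughly as follows. This is a minimal Python sketch for illustration only, not the bgpd C code: `ImportedPath`, its fields, and `must_withdraw_from_kernel` are hypothetical names, and the addresses are documentation examples.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ImportedPath:
    """Hypothetical stand-in for an EVPN-imported path (not an FRR structure)."""
    peer: str      # BGP peer the path was learned from
    nexthop: str   # next-hop (remote VTEP) address
    rmac: str      # router MAC carried with the type-5 route


def must_withdraw_from_kernel(old_best: ImportedPath,
                              new_best: Optional[ImportedPath]) -> bool:
    """Return True only if the kernel entry really has to be removed.

    Assumption: if the new bestpath resolves to the same next-hop and
    router MAC, the installed entry is still valid and can stay in place.
    """
    if new_best is None:
        # No remaining path for the prefix: the route has to go.
        return True
    old_key = (old_best.nexthop, old_best.rmac.lower())
    new_key = (new_best.nexthop, new_best.rmac.lower())
    return old_key != new_key


# Two peers advertise the same type-5 route with identical nh and rmac.
via_peer_a = ImportedPath("192.0.2.1", "10.0.0.10", "aa:bb:cc:00:00:01")
via_peer_b = ImportedPath("192.0.2.2", "10.0.0.10", "AA:BB:CC:00:00:01")
# A path whose next-hop and rmac actually differ.
via_other_vtep = ImportedPath("192.0.2.2", "10.0.0.20", "aa:bb:cc:00:00:02")
```

In this model, clearing the session with peer A while peer B still advertises the route leaves the kernel entry untouched, whereas a bestpath that moves to a different VTEP still triggers the withdrawal.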

ton31337 · 2025-03-12T20:45:09Z

@Mergifyio backport stable/10.3 stable/10.2 stable/10.1

mergify · 2025-03-12T20:45:12Z

backport stable/10.3 stable/10.2 stable/10.1

🟠 Waiting for conditions to match

merged [📌 backport requirement]

ton31337 · 2025-03-12T20:46:19Z

@raja-rajasekar opinion?

raja-rajasekar · 2025-03-12T23:18:25Z

The withdraw actually shouldn't happen at this point (an action item I have had on my plate for some time but never got a chance to do), and https://github.com/FRRouting/frr/pull/18158/files addresses that issue (not yet merged).

This is because we can always end up in a situation where we immediately withdraw the route while the install is pending for later (backpressure), thereby blackhole-ing the traffic.
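The window described here can be illustrated with a toy model. This is a simplified Python sketch under stated assumptions, not how bgpd or zebra are implemented: the kernel table is a plain dict, the backpressure queue is a `deque`, and `route_gap_observed` is a hypothetical helper.

```python
from collections import deque


def route_gap_observed(withdraw_immediately: bool) -> bool:
    """Model one bestpath change and report whether the prefix was ever
    missing from the (simulated) kernel table before the install ran."""
    prefix = "10.1.0.0/24"
    kernel = {prefix: "path-from-peer-A"}   # currently installed route
    pending = deque()                       # installs deferred by backpressure

    # Bestpath changes from peer A to peer B.
    if withdraw_immediately:
        kernel.pop(prefix, None)            # withdrawal is applied right away
    pending.append((prefix, "path-from-peer-B"))  # install is only queued

    # State seen by traffic before the queue is drained.
    gap = prefix not in kernel

    # Later, the queued installs are processed.
    while pending:
        queued_prefix, path = pending.popleft()
        kernel[queued_prefix] = path

    return gap
```

With an immediate withdrawal the prefix is absent until the queue drains, which is the blackhole window; without it, the old entry keeps forwarding until the queued install replaces it.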

So I am not sure why we need to patch this code; maybe we should rather go directly with the fix in https://github.com/FRRouting/frr/pull/18158/files. (Of course it does some other things as well, which I haven't looked at yet; the commit which removes this block is the one I am talking about.)

dawkopagh · 2025-03-13T07:17:54Z

Indeed, #18158 is a better fix and would be ideal to have. I have tested my case with an FRR build from that PR and there is no downtime (which is expected, as the snippet with the route withdrawal has been deleted).

However, there were concerns about breaking a standard pattern in zebra, and there were no updates under the PR, so I thought that fix might not eventually make it to master. I therefore went for a less invasive fix addressing the case where an update has old and new bestpaths with the same next-hop, which results in downtime due to kernel add/del operations.

frrbot bot added the bgp label Mar 12, 2025

github-actions bot added size/XS master labels Mar 12, 2025

dawkopagh force-pushed the check_rmac_nh_for_evpn_imported_route branch from 8833051 to 4865a82 Compare March 12, 2025 18:04

github-actions bot added the backport label Mar 12, 2025
