-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zebra: zebra crash for zapi stream #18359
base: master
Are you sure you want to change the base?
Conversation
Issue: If static route is created with a BGP route as nexthop, which recursively resolves over 512 ECMP v6 nexthops, zapi nexthop encode fails, as there is not enough memory allocated for stream. This causes assert/core dump in zebra. Right now we allocate fixed memory of ZEBRA_MAX_PACKET_SIZ size. Fix: 1)Dynamically calculate required memory size for the stream 2)try to optimize memory usage Testing: No crash happens anymore with the fix zebra: zebra crash for zapi stream Issue: If static route is created with a BGP route as nexthop, which recursively resolves over 512 ECMP v6 nexthops, zapi nexthop encode fails, as there is not enough memory allocated for stream. This causes assert/core dump in zebra. Right now we allocate fixed memory of ZEBRA_MAX_PACKET_SIZ size. Fix: 1)Dynamically calculate required memory size for the stream 2)try to optimize memory usage Testing: No crash happens anymore with the fix r1# r1# sharp install routes 2100:cafe:: nexthop 2001:db8::1 1000 r1# r2# conf r2(config)# ipv6 route 2503:feca::100/128 2100:cafe::1 r2(config)# exit r2# Signed-off-by: Soumya Roy <[email protected]> Signed-off-by: Soumya Roy <[email protected]>
Signed-off-by: Soumya Roy <[email protected]>
7c7ee72
to
b392bef
Compare
@@ -1086,6 +1086,93 @@ static void zapi_nexthop_group_sort(struct zapi_nexthop *nh_grp, | |||
&zapi_nexthop_cmp); | |||
} | |||
|
|||
/* size needed by a stream for a zapi nexthop */ | |||
size_t zapi_nh_encode_stream_size(const struct zapi_nexthop *api_nh) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, no, it's not maintainable/safe to try to capture the sizes of our structs in code in this way.
can you take a look at the arithmetic approach that was recently added for the zapi_route struct, and see whether that or some adaptation of it could work here too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That model is not directly applicable here. 1) In other model, they have bigger ref max size which is zebra route size(there they are encoding route, not nexthop), does msg_size = sizeof(struct zapi_route); then msg_size - nh size of unused struct zapi_nexthop, in my case, I dont have any reference to start except the max ZEBRA_MAX_PACKET_SIZ, size of zapi_nexthop == 320 bytes, for 520 it is 520x320 >150000, which is more than ZEBRA_MAX_PACKET_SIZ, so I can't do ZEBRA_MAX_PACKET_SIZ - no_of_unused_nhx320. (memory requirement in negative size) 2) Also for other model, it was just lucky that sizeof(struct zapi_route) ~= 320000, so that assuming sizeof(struct zapi_route) - no_of_unused_nh*320, will always work, but in my view this assumption itself can be wrong, considering there is variable part(pointer,array) in zapi_nexthop, i.e szieof(zapi_nexthop) may not reflect the actual size we need to encode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm- I'll try to say what I meant in a different way.
the receiving side of this zapi message in lib/zclient uses struct zapi_route
- and sizeof(struct zapi_route)
is a good proxy for the maximum message size you need - isn't it? there are a couple of fields that aren't used in this specific path, but those are not large, mostly. the 'opaque' blob is the exception, and you can math that out? you'll be left with the fixed fields, and the embedded nexthops. you can also math out the right number of regular and backup nexthops. since those are used on the decode side, they're a good proxy for the space needed on the encode side?
Issue:
If static route is created with a BGP route as nexthop, which recursively resolves over 512 ECMP v6 nexthops, zapi nexthop encode fails, as there is not enough memory allocated for stream. This causes assert/core dump in zebra. Right now we allocate fixed memory of ZEBRA_MAX_PACKET_SIZ size.
Fix:
1)Dynamically calculate required memory size for the stream 2)try to optimize memory usage
Testing:
No crash happens anymore with the fix
zebra: zebra crash for zapi stream
Issue:
If static route is created with a BGP route as nexthop, which recursively resolves over 512 ECMP v6 nexthops, zapi nexthop encode fails, as there is not enough memory allocated for stream. This causes assert/core dump in zebra. Right now we allocate fixed memory of ZEBRA_MAX_PACKET_SIZ size.
Fix:
1)Dynamically calculate required memory size for the stream 2)try to optimize memory usage
Testing:
No crash happens anymore with the fix
r1#
r1# sharp install routes 2100:cafe:: nexthop 2001:db8::1 1000 r1#
r2# conf
r2(config)# ipv6 route 2503:feca::100/128 2100:cafe::1 r2(config)# exit
r2#