Add support for rosidl::Buffer-aware per-endpoint pub/sub (backport #930) #987
Open
mergify[bot] wants to merge 1 commit into
* Add support for rosidl::Buffer-aware per-endpoint pub/sub
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Add buffer backend init/shutdown functions
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Fix lint error
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Rename backend aux info to backend metadata
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Update to use per-context buffer backend registry support
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Add support for legacy subscribers in the rosidl::buffer path
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Add CPU group endpoints
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Use per-topic CPU channels
  Signed-off-by: CY Chen <cyc@nvidia.com>
* fix(graph_cache): release graph_mutex_ before invoking discovery callbacks
  update_topic_map_for_put() collected discovery callbacks under discovery_mutex_ but still invoked them while graph_mutex_ was held (via the lock_guard in parse_put). Any callback that re-enters graph_cache (e.g. creating a per-endpoint subscription, which calls register_publisher_discovery_callback()) would attempt to re-acquire graph_mutex_ on the same thread, deadlocking immediately.
  Fix: change update_topic_map_for_put() and update_topic_maps_for_put() to return the collected callbacks instead of invoking them. parse_put() switches from lock_guard to unique_lock so it can call lock.unlock() before iterating over the returned callbacks.
  This is a defensive complement to the lock-order fix in e91c15a. While no current callback directly re-acquires graph_mutex_, invoking external callbacks under an internal mutex is an API contract violation that creates fragility for future changes.
  Signed-off-by: YuanYu Yuan <yuanyu.yuan@zettascale.tech>
* style: fix uncrustify line-length in is_cpu_only_backend_metadata
  Signed-off-by: YuanYu Yuan <yuanyu.yuan@zettascale.tech>
* Use single shared accelerated channel per buffer-aware subscriber
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Address review comments
  Signed-off-by: CY Chen <cyc@nvidia.com>
* fix(liveliness): escape '/' in backend names embedded in key expressions
* Update buffer size estimation to align with rmw_fastrtps_cpp
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Change logs for rosidl::Buffer to DEBUG level
  Signed-off-by: CY Chen <cyc@nvidia.com>
* Replace RCUTILS_LOG_DEBUG_NAMED with RMW_ZENOH_ROSIDL_BUFFER_LOG_DEBUG_NAMED
  Signed-off-by: CY Chen <cyc@nvidia.com>
---------
Signed-off-by: CY Chen <cyc@nvidia.com>
Signed-off-by: YuanYu Yuan <yuanyu.yuan@zettascale.tech>
Co-authored-by: yuanyuyuan <az6980522@gmail.com>
(cherry picked from commit e95c62d)
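The graph_cache fix described in the commit message above can be illustrated with a minimal sketch. It is not the rmw_zenoh_cpp code: `GraphCacheSketch`, `register_discovery_callback()` and `topic_count()` are hypothetical stand-ins, and the separate discovery_mutex_ is left out for brevity. It only shows the general pattern of collecting callbacks while the mutex is held and invoking them after `lock.unlock()`, so that a callback may re-enter the cache without deadlocking.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the graph cache; names only mirror the commit message.
class GraphCacheSketch
{
public:
  using DiscoveryCallback = std::function<void (const std::string &)>;

  // Takes graph_mutex_, so it would deadlock if called from a callback
  // that is invoked while the same thread still holds the mutex.
  void register_discovery_callback(DiscoveryCallback cb)
  {
    std::lock_guard<std::mutex> lock(graph_mutex_);
    callbacks_.push_back(std::move(cb));
  }

  void parse_put(const std::string & topic)
  {
    // unique_lock (rather than lock_guard) allows an explicit early unlock.
    std::unique_lock<std::mutex> lock(graph_mutex_);
    std::vector<DiscoveryCallback> pending = update_topic_map_for_put(topic);
    lock.unlock();

    // The mutex is released here, so callbacks may safely re-enter the cache.
    for (const auto & cb : pending) {
      cb(topic);
    }
  }

  std::size_t topic_count()
  {
    std::lock_guard<std::mutex> lock(graph_mutex_);
    return topics_.size();
  }

private:
  // Caller must hold graph_mutex_. Returns a copy of the callbacks to run
  // instead of invoking them under the lock.
  std::vector<DiscoveryCallback> update_topic_map_for_put(const std::string & topic)
  {
    topics_.push_back(topic);
    return callbacks_;
  }

  std::mutex graph_mutex_;
  std::vector<std::string> topics_;
  std::vector<DiscoveryCallback> callbacks_;
};
```

Returning a copy of the callback list also means a callback that registers further callbacks does not invalidate the list being iterated.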
Pulls: #987
CI Test Results
Linux-aarch64 — 2 failure(s)
Linux-rhel — 6 failure(s)
Windows — 42 failure(s)
Description
This pull request adds full `rosidl::Buffer` support to `rmw_zenoh_cpp`, enabling per-endpoint Zenoh publishers and subscribers for zero-copy buffer transport between compatible backends. When a publisher and subscriber share compatible non-CPU buffer backends, data can be transferred via a lightweight descriptor; when backends are incompatible, the system falls back to standard CPU-based buffer serialization.
This pull request consists of the following key changes:
* `rosidl_buffer_backend_registry::initialize_buffer_backends()` / `shutdown_buffer_backends()` during RMW init/shutdown to load and tear down buffer backend plugins.
* `create_descriptor_with_endpoint()` (`nullptr`: CPU fallback). Publisher creation explicitly adds `"cpu"` to `backend_aux_info`.
* `publish()` first sends endpoint-aware messages via `publish_buffer_aware()`, then conditionally falls through to the standard base-key publish path only when the total matched subscription count exceeds discovered buffer-aware subscribers -- avoiding unnecessary CPU conversion when all subscribers are buffer-aware.
* `Message` struct owns endpoint info via `std::optional<EndpointInfoStorage>`. Endpoint info is passed into deserialization for correct backend reconstruction.
* `acceptable_buffer_backends`: Parses the subscription option -- `NULL`/empty/`"cpu"`: CPU-only (advertises `"cpu"` in liveliness token); `"any"`: all installed; specific names: filtered. In `on_publisher_discovered()`, CPU is always added to the publisher's backend list.
Is this user-facing behavior change?
This pull request does not change existing `rmw_zenoh_cpp` behavior for standard (non-Buffer) messages. For messages with `uint8[]` fields, the per-endpoint transport is transparent -- publishers and subscribers share backend info automatically, and CPU fallback ensures correctness when backends are incompatible.
Did you use Generative AI?
Yes. Claude (claude-4.6-opus) via Cursor was used to assist with creating an initial prototype version of the changes contained in this PR.
Additional Information
This PR is part of the broader ROS 2 native buffer feature introduced in this post.
This is an automatic backport of pull request #930 done by Mergify.
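For readers of the Description, two of the rules listed there can be sketched as small free functions: how the `acceptable_buffer_backends` option resolves to a backend list, and when `publish()` still needs the standard base-key path. This is a rough illustration only. `resolve_acceptable_backends()` and `needs_base_key_publish()` are hypothetical names, the comma separator for specific backend names is an assumption (the PR text does not state the format), and `installed` is assumed to hold every usable backend name.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: resolve the subscription option to a list of backends.
//   NULL / empty / "cpu" -> CPU-only
//   "any"                -> all installed backends
//   specific names       -> only those that are installed
// ASSUMPTION: specific names are comma-separated with no surrounding spaces.
std::vector<std::string> resolve_acceptable_backends(
  const char * option, const std::vector<std::string> & installed)
{
  const std::string value = option ? option : "";
  if (value.empty() || value == "cpu") {
    return {"cpu"};
  }
  if (value == "any") {
    return installed;
  }

  std::vector<std::string> result;
  std::istringstream stream(value);
  std::string name;
  while (std::getline(stream, name, ',')) {
    // Keep a requested backend only if it is actually installed.
    if (std::find(installed.begin(), installed.end(), name) != installed.end()) {
      result.push_back(name);
    }
  }
  return result;
}

// Hypothetical helper: the base-key (CPU) publish is needed only when some
// matched subscriptions were not discovered as buffer-aware.
bool needs_base_key_publish(
  std::size_t total_matched_subscriptions, std::size_t buffer_aware_subscribers)
{
  return total_matched_subscriptions > buffer_aware_subscribers;
}
```

With three matched subscriptions of which two are buffer-aware, `needs_base_key_publish(3, 2)` is true and the base-key publish still runs; with `(2, 2)` it is skipped, which is the case where the CPU conversion is avoided.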