The RecordIntegrationTestFixture.can_record_again_after_stop test tends to be flaky
#1914
Comments
Preliminary RCA (Root Cause Analysis)
From the log it is clear that we are getting two identical messages, and one of them is extra and unexpected.
However, while their received timestamps differ, their sent timestamps are the same.
This means that the message was sent only once but received twice! In the Rosbag2 recorder, we clear the inner subscriptions list twice: once in Recorder::stop() and again just before the start of the recording. However, in practice, things are not so simple.
Potentially this is a race condition: when we reach our subscription cleanup code, the subscription's weak pointer has not expired, because the executor converted it to a shared pointer in void CallbackGroup::collect_all_ptrs(..) and hasn't released it yet.
Proposed solution
The straightforward solution would be to redesign the Rosbag2 recorder and eliminate using |
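The suspected race from the root cause analysis above can be illustrated with a minimal single-threaded sketch. It only replays the interleaving with plain standard-library smart pointers; FakeSubscription, RaceOutcome and simulate_cleanup_race are hypothetical names made up for this illustration and are not rclcpp or rosbag2 types, and the real ownership inside the executor may differ.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a subscription; NOT the rclcpp type.
struct FakeSubscription
{
  int received = 0;
  void on_message() {++received;}
};

// Hypothetical result type describing what the interleaving produced.
struct RaceOutcome
{
  bool expired_while_executor_held;     // weak pointer state right after cleanup
  int callbacks_after_cleanup;          // callbacks delivered after cleanup
  bool expired_after_executor_release;  // weak pointer state after the release
};

RaceOutcome simulate_cleanup_race()
{
  // The "recorder" owns the subscription through its subscriptions list.
  std::vector<std::shared_ptr<FakeSubscription>> subscriptions;
  subscriptions.push_back(std::make_shared<FakeSubscription>());

  // The "executor" tracks the subscription through a weak pointer only.
  std::weak_ptr<FakeSubscription> tracked = subscriptions.front();

  // Step 1: the executor promotes the weak pointer to a shared pointer
  // (assumed to mirror what happens while collecting entities).
  std::shared_ptr<FakeSubscription> held = tracked.lock();
  assert(held != nullptr);  // the owner is still alive at this point

  // Step 2: the recorder clears its list, expecting the subscription to die.
  subscriptions.clear();

  RaceOutcome outcome{};
  // The object is still alive because `held` keeps the reference count above zero.
  outcome.expired_while_executor_held = tracked.expired();

  // Step 3: the executor can still deliver a message to the "old" subscription.
  held->on_message();
  outcome.callbacks_after_cleanup = held->received;

  // Step 4: only after the executor releases its copy is the object destroyed.
  held.reset();
  outcome.expired_after_executor_release = tracked.expired();
  return outcome;
}
```

Calling simulate_cleanup_race() shows that clearing the owning list alone does not guarantee destruction while another shared pointer is alive, which would be consistent with one message being sent once but received twice.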
Description
Sometimes on CI the RecordIntegrationTestFixture.can_record_again_after_stop test fails because the recorder receives 5 messages while only 4 are expected.
It seems that the subscription from the first run hasn't been deleted and triggers the callback a second time.
Debug information was added in the #1871 PR.
Expected Behavior
The test shall pass without errors, and the Recorder shall not receive extra unexpected messages.
Actual Behavior
Sometimes the Rosbag2 recorder receives one extra message after calling recorder::stop() and then recorder::record() again.
To Reproduce
The issue reproduces on CI only.
Link to the failed CI job https://build.ros2.org/job/Rpr__rosbag2__ubuntu_noble_amd64/528/testReport/junit/rosbag2_transport/RecordIntegrationTestFixture/can_record_again_after_stop/
Full log with debug info available here https://build.ros2.org/job/Rpr__rosbag2__ubuntu_noble_amd64/528/consoleText