Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not crash on failed writes but skip the file #1933

Open
wants to merge 1 commit into
base: rolling
Choose a base branch
from

Conversation

Timple
Copy link

@Timple Timple commented Mar 7, 2025

In the current implementation when a write cannot be performed, the recorder node crashes.

But since we have it running in a composable container with other nodes, this takes down much more than it should.

So I propose to handle this a bit more graceful.

@Timple Timple force-pushed the no-crash-on-write-fail branch from af81890 to 479011a Compare March 7, 2025 10:23
Copy link
Contributor

@fujitatomoya fujitatomoya left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Timple

i am not sure if this is the right thing to do.

1st, after printing this error, following code will generate the segmentation fault to access NULL shared pointer. what the difference does it make?
this situation is pretty much similar with cannot open the storage, IMO this is really exceptional situation and it cannot proceed anymore. this case, i say exception should be generated.

CC: @MichaelOrlov what do you think?

@Timple
Copy link
Author

Timple commented Mar 9, 2025

Ah, the how is completely up to debate. I'm not that familiar with the codebase.

But crashing nodes bringing down composed containers is an issue.

@MichaelOrlov
Copy link
Contributor

MichaelOrlov commented Mar 10, 2025

@Timple @fujitatomoya Unable to open storage is a fatal unrecoverable error, and an exception should be thrown.
However, probably we shouldn't crash other nodes running in the container.
The reason why crash happened is because an unhandled exception.
Before adding composable recorder/player it was Okay to not handle unrecoverable exceptions because in this case our CLI recorder or player would exit with the error code. That is expected behavior.

However, for the composable recorder and player, it would be better to intercept all exceptions on the upper layer and terminate only the failed node.
I am curious if there is an option to do this on the composition container layer.
BTW it seems this is a generic problem for composable nodes. The crash may cause by any other node's unhandled exception not only by the Rosbag2 nodes.

I think we shall keep baseline behavior when terminating CLI ros2 bag record by unrecoverable exception and returning the error code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants