@marc-hanheide
Member

What type of PR is this? (check all applicable)

  • Refactor
  • Feature
  • Bug Fix
  • Optimization
  • Documentation Update

Description

This PR adds dual-level heartbeat monitoring in Sentor:

  1. Safety heartbeat (safety/heartbeat) – driven by safety_critical checks
  2. Warning heartbeat (warning/heartbeat) – driven by autonomy_critical checks

Other key pieces added:

  • NodeMonitor class to track ROS 2 node liveness.
  • Updated TopicMonitor and SafetyMonitor to expose both is_alive and is_autonomy_alive.
  • New YAML flags (autonomy_critical) + extended README with detailed config guidance.
  • Example config (test_monitor_config.yaml) covering topic + node monitors.
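
As a rough illustration of how the two heartbeats could be derived from the per-monitor flags, here is a plain-Python sketch. It is not the code in this PR: `MonitorState`, `compute_heartbeats`, and the rule that a heartbeat is true only while every monitor reports alive are assumptions made for illustration. Only the names `is_alive` and `is_autonomy_alive` come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass
class MonitorState:
    """Hypothetical snapshot of one topic or node monitor."""
    name: str
    is_alive: bool = True           # result of the safety_critical checks
    is_autonomy_alive: bool = True  # result of the autonomy_critical checks


def compute_heartbeats(monitors: Iterable[MonitorState]) -> Tuple[bool, bool]:
    """Return (safety_beat, warning_beat).

    Assumption: each beat is True only while every monitor passes the
    corresponding check; with no monitors configured, both are True.
    """
    monitors = list(monitors)
    safety_beat = all(m.is_alive for m in monitors)
    warning_beat = all(m.is_autonomy_alive for m in monitors)
    return safety_beat, warning_beat


states = [
    MonitorState("/scan"),
    MonitorState("/gps/fix", is_autonomy_alive=False),
]
safety_beat, warning_beat = compute_heartbeats(states)
print(safety_beat, warning_beat)  # True False
```

In this sketch a failing autonomy check lowers only the warning beat, which is one way to read the split between safety/heartbeat and warning/heartbeat.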

This pull request removes the ROS 1-specific implementation of the sentor package, transitioning it to a ROS 2-only codebase. The changes include removing ROS 1 dependencies, messages, configurations, and launch files, as well as adding comprehensive documentation for the ROS 2 YAML configuration structure.

Key Changes:

Removal of ROS 1-specific files and configurations:

  • Deleted the CMakeLists.txt file, which contained ROS 1-specific build instructions, including catkin dependencies, message/service generation, and installation rules.
  • Removed package.xml, which defined ROS 1 dependencies and metadata for the sentor package.
  • Deleted ROS 1 launch files, including launch/sentor.launch and launch/topic_mapping.launch. These files configured ROS 1 nodes and parameters. [1] [2]
  • Removed all ROS 1 message definitions, such as msg/MonitorArray.msg and msg/TopicMapArray.msg. [1] [2]

Removal of legacy configuration files:

  • Deleted several YAML configuration files (config/execute.yaml, config/map.yaml, config/rob_lindsey.yaml, config/test.yaml) that were designed for ROS 1 monitoring setups. These files included topic monitors, signal lambdas, and execution rules. [1] [2] [3] [4]

Documentation updates for ROS 2:

  • Added detailed documentation to README.md explaining the YAML configuration structure for ROS 2, including monitors, node_monitors, and heartbeat topics (safety/heartbeat and warning/heartbeat). The documentation also provides examples and a quick checklist for users migrating to ROS

@marc-hanheide marc-hanheide changed the title from "Frist version working for ROS2" to "First version working for ROS2" Jul 28, 2025
@marc-hanheide marc-hanheide added the enhancement New feature or request label Jul 28, 2025

@marc-hanheide marc-hanheide marked this pull request as ready for review July 28, 2025 10:38
@marc-hanheide
Member Author

I have pulled this back into draft, until Cyano0#1 is considered by @Cyano0 and eventually merged.

I suggest we need CI and devcontainers first to properly assess this PR.

marc-hanheide and others added 3 commits July 28, 2025 14:03
making the structure compliant with our usual structure and adding CI and devcontainer
@marc-hanheide
Member Author

Great to see Cyano0#1 is now merged and we have the first successful CI builds: https://github.com/LCAS/sentor/pull/60/checks

Now, I strongly suggest that basic functionality is tried out in the devcontainer. Once @Cyano0 confirms it works as intended, please tick the "ready for review" button

Thanks

@Cyano0

Cyano0 commented Jul 28, 2025

I will just replace the ROS 2 topic type today, test it locally, and then mark it ready for review.

@marc-hanheide
Member Author

Great, thanks, but please (also) test in the dev container. That is the cleanest environment: the wrong shebang I spotted earlier suggests you have a customised, non-standard Python environment. So it is always best to make sure it runs in the "safe space" of the dev container (at least as well, if you don't fancy developing in it).

@marc-hanheide marc-hanheide marked this pull request as ready for review July 29, 2025 12:58
@marc-hanheide marc-hanheide requested a review from Copilot July 29, 2025 12:58

@marc-hanheide
Member Author

Over to @ibrahimhroob and @LeonardoGuevara to chip in and approve if they're happy

@marc-hanheide marc-hanheide removed their assignment Jul 29, 2025
@marc-hanheide
Member Author

@Cyano0

Can you provide instructions on how to test the implementation?

I tried to run sentor_node with the default configuration, and it already fails at loading the config file. So I suspect I'm doing something fundamentally wrong?

I tried ros2 run sentor sentor_node.py --ros-args -p config_file:=/workspaces/sentor/install/sentor/share/sentor/config/test_monitor_config.yaml

and it already fails loading the configuration:

root@035423787dc7:/workspaces/sentor# ros2 run sentor sentor_node.py --ros-args -p config_file:=/workspaces/sentor/install/sentor/share/sentor/config/test_monitor_config.yaml
Monitoring topics:
Traceback (most recent call last):
  File "/workspaces/sentor/install/sentor/lib/sentor/sentor_node.py", line 66, in <module>
    if topic.get("include", True) is False:
AttributeError: 'str' object has no attribute 'get'
[ros2run]: Process exited with failure 1

I know how to fix that one, i.e., change https://github.com/LCAS/sentor/pull/60/files#diff-1a1c2b9ef95420336044e9bdfa777dd93ff406d38febd6a0c187b6053ef25f26R51 to this:

        topics = [item for sublist in items for item in sublist['monitors']]

Then it loads the file, but fails later with

[sentor_node.py-1] Traceback (most recent call last):
[sentor_node.py-1]   File "/workspaces/sentor/install/sentor/lib/sentor/sentor_node.py", line 79, in <module>
[sentor_node.py-1]     msg_type=multi_monitor.get_message_type(topic["message_type"]),
[sentor_node.py-1] AttributeError: 'MultiMonitor' object has no attribute 'get_message_type'

As I said, I am likely doing something fundamentally wrong, but I hope you can advise.

All of this should be directly reproducible in the dev container.
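
The first failure above can be reproduced outside ROS. The sketch below assumes that `items` is a list of loaded YAML documents, each a dict with a `monitors` list, and that the original line flattened the dicts directly. Both are guesses about the code at that line; the example topic entries are made up.

```python
# Assumed shape: one dict per loaded YAML document, each with a "monitors" list.
items = [
    {"monitors": [{"name": "/scan", "message_type": "sensor_msgs/msg/LaserScan"}]},
    {"monitors": [{"name": "/odom", "message_type": "nav_msgs/msg/Odometry",
                   "include": False}]},
]

# Iterating over a dict yields its keys, so this flattening collects
# strings, and a later topic.get(...) raises AttributeError on a str.
broken = [item for sublist in items for item in sublist]

# Indexing the "monitors" key first yields the topic dicts themselves.
topics = [item for sublist in items for item in sublist["monitors"]]

# Same filter as in the traceback: skip entries with include set to False.
active = [t for t in topics if t.get("include", True) is not False]

print(broken)                        # ['monitors', 'monitors']
print([t["name"] for t in active])   # ['/scan']
```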

@marc-hanheide
Member Author

I have also created Cyano0#2 to add a launch file to your PR. I suggest merging it, as it makes starting sentor easier.

@Cyano0

Cyano0 commented Jul 30, 2025

Sorry, I was using test_sentor.py to run sentor. The current sentor_node.py is an old version of the code (from before I added some new functions). I should replace it with the current test_sentor.py and then delete test_sentor.py.

I usually run it like this:
ros2 run sentor test_sentor.py --ros-args --param config_file:=/workspaces/sentor/src/sentor/config/test_monitor_config.yaml

Add launch files and update package.xml for Sentor monitoring system
@marc-hanheide
Member Author

I see! In that case, can you please update this, as the name of the node is then misleading? Or is it only a test node and not the one that should finally be used?

I'd like to have this as complete as possible before we merge it

@Cyano0

Cyano0 commented Jul 30, 2025

Sure, I have updated this. As far as I am concerned, it can be used as the final node.

@marc-hanheide marc-hanheide requested a review from Copilot July 30, 2025 16:42

Copilot AI left a comment

Pull Request Overview

This PR transitions the Sentor monitoring system from ROS 1 to ROS 2, implementing dual-level heartbeat monitoring with safety and warning beats. The change removes all ROS 1-specific code and configurations while introducing comprehensive ROS 2 implementations with node monitoring capabilities.

  • Complete migration from ROS 1 to ROS 2 architecture with updated message interfaces and node structures
  • Introduction of dual heartbeat system: safety-critical monitoring via safety/heartbeat and autonomy-critical monitoring via warning/heartbeat
  • Addition of NodeMonitor class for ROS 2 node liveness tracking alongside existing topic monitoring

Reviewed Changes

Copilot reviewed 56 out of 61 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| src/sentor_msgs/ | New ROS 2 message package with updated message definitions and service interfaces |
| src/sentor/sentor/ | Complete ROS 2 Python implementation of monitoring classes with dual-level heartbeat support |
| src/sentor/scripts/sentor_node.py | Main ROS 2 node orchestrating topic and node monitors with heartbeat publishers |
| src/sentor/config/test_monitor_config.yaml | Example configuration demonstrating ROS 2 YAML structure with safety/autonomy flags |
| Legacy files removed | All ROS 1 CMakeLists.txt, package.xml, launch files, and Python implementations |

Comments suppressed due to low confidence (1)

src/sentor_msgs/srv/GetTopicMaps.srv:3

  • The service response references sentor_msgs/TopicMapArray but this message type is not consistently used throughout the codebase. Consider verifying that all message type references use the correct package prefix.
sentor_msgs/TopicMapArray topic_maps

crit = [k for k,v in self.conditions.items() if v.get("safety_critical")]
lamb_ok = all(self.conditions[k]["satisfied"] for k in crit) if crit else True
alive = rate_ok and lamb_ok
self.node.get_logger().info(f"[LIVENESS] {self.topic_name}: rate={rate:.2f}/{self.rate}, lambdas={crit} all={lamb_ok}")

Copilot AI Jul 30, 2025

The liveness check logging at INFO level occurs every second (line 95 timer). This could generate excessive log output in production. Consider using DEBUG level or reducing the frequency of this logging.

Suggested change
- self.node.get_logger().info(f"[LIVENESS] {self.topic_name}: rate={rate:.2f}/{self.rate}, lambdas={crit} all={lamb_ok}")
+ self.node.get_logger().debug(f"[LIVENESS] {self.topic_name}: rate={rate:.2f}/{self.rate}, lambdas={crit} all={lamb_ok}")
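
For readers without the full file, the liveness logic in the reviewed snippet can be sketched in plain Python. This is an illustration only: the standard `logging` module stands in for the node logger, `evaluate_liveness` is a hypothetical helper, and the rule `rate_ok = rate >= min_rate` is an assumption, since the snippet does not show how `rate_ok` is computed.

```python
import logging

# Stand-in for the ROS 2 node logger used in the reviewed code.
logger = logging.getLogger("sentor.liveness")


def evaluate_liveness(conditions: dict, rate: float, min_rate: float) -> bool:
    """Combine the rate check with all safety-critical lambda conditions."""
    rate_ok = rate >= min_rate  # assumed rule, not shown in the snippet
    crit = [k for k, v in conditions.items() if v.get("safety_critical")]
    # all() over an empty list is True, so "if crit else True" is not needed.
    lamb_ok = all(conditions[k]["satisfied"] for k in crit)
    # DEBUG keeps the once-per-second message out of default log output.
    logger.debug("[LIVENESS] rate=%.2f/%s, lambdas=%s all=%s",
                 rate, min_rate, crit, lamb_ok)
    return rate_ok and lamb_ok


conditions = {
    "battery_ok": {"safety_critical": True, "satisfied": True},
    "estop_clear": {"safety_critical": True, "satisfied": False},
    "gps_fix": {"safety_critical": False, "satisfied": False},
}
alive = evaluate_liveness(conditions, rate=9.8, min_rate=5.0)
print(alive)  # False, because the safety-critical "estop_clear" is unsatisfied
```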

self.enabled = False
self.node.get_logger().info(f"[ROSTopicHz] Monitoring stopped for {self.topic_name}")

# def callback_hz(self, msg):

Copilot AI Jul 30, 2025

There are large blocks of commented code (lines 56-125 and 131-143) that should be removed to improve code readability and maintainability.

self._stop_event.clear()


# import rclpy

Copilot AI Jul 30, 2025

There are large blocks of commented-out code (lines 88-333) that should be removed to improve code maintainability and readability.

Comment on lines +114 to +115
# FIX: Get the service type manually
service_class = get_service_type(service_name)

Copilot AI Jul 30, 2025

The comment indicates this is a temporary fix. Consider implementing proper service type discovery or document why manual service type mapping is the intended approach.

Suggested change
- # FIX: Get the service type manually
- service_class = get_service_type(service_name)
+ # Dynamically discover the service type using ROS2 introspection
+ service_class = get_service_class_by_name(self, service_name)

def sleep(self, duration):
    """ Sleeps for a given duration in seconds """
    self.get_logger().info(f"Sleeping for {duration} seconds...")
    time.sleep(duration) # FIX: Replaced `rclpy.sleep()` with `time.sleep()`

Copilot AI Jul 30, 2025

The comment suggests this is a temporary fix. In ROS 2, consider using the node's timer mechanisms or rclpy.sleep() for better integration with the ROS 2 executor.

Suggested change
- time.sleep(duration) # FIX: Replaced `rclpy.sleep()` with `time.sleep()`
+ sleep_future = rclpy.task.Future()
+ self.create_timer(duration, lambda: sleep_future.set_result(True))
+ rclpy.spin_until_future_complete(self, sleep_future)
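
A third option, different from both the PR's `time.sleep()` and the suggestion above, is a wait on a `threading.Event`, which can be cut short on shutdown. The sketch below uses only the Python standard library; `InterruptibleSleeper` is a hypothetical class, and like `time.sleep()` it still blocks the calling thread, so it is not a drop-in answer to the executor concern.

```python
import threading
import time


class InterruptibleSleeper:
    """Hypothetical helper: a sleep that another thread can cancel."""

    def __init__(self) -> None:
        self._stop_event = threading.Event()

    def sleep(self, duration: float) -> bool:
        """Block for up to `duration` seconds; return True if interrupted."""
        return self._stop_event.wait(timeout=duration)

    def stop(self) -> None:
        """Wake any thread currently blocked in sleep()."""
        self._stop_event.set()


sleeper = InterruptibleSleeper()
# Request a stop after 50 ms from a background timer thread.
threading.Timer(0.05, sleeper.stop).start()

start = time.monotonic()
interrupted = sleeper.sleep(5.0)  # returns early once stop() is called
elapsed = time.monotonic() - start
print(interrupted)  # True
```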

@@ -0,0 +1,138 @@
monitors: #[]

Copilot AI Jul 30, 2025

The configuration file contains extensive commented-out sections (lines 2-56) that make it difficult to understand the actual configuration. Consider moving example configurations to separate documentation or example files.


@ibrahimhroob ibrahimhroob left a comment

Let's get this merged

@ibrahimhroob ibrahimhroob merged commit 7ea13f8 into LCAS:ros2-devel Sep 10, 2025
2 checks passed
Labels

enhancement New feature or request

3 participants