Skip to content

Segfault in MultiThreadedExecutor using rmw_fastrtps_cpp #728

@sea-bass

Description

@sea-bass

Bug report

Required Info:

  • Operating System:
    • Ubuntu 22.04 (inside a Docker container)
  • Installation type:
    • Binaries
  • Version or commit hash:
    • The latest available binaries on Humble
  • DDS implementation:
    • rmw_fastrtps_cpp
  • Client library (if applicable):
    • rclcpp

Steps to reproduce issue

We unfortunately don't have a full reproduction case, but we are creating a node with a multi-threaded executor as follows:

rclcpp::executors::MultiThreadedExecutor exec(rclcpp::ExecutorOptions(), 0, false, std::chrono::milliseconds(250));

The segfault we see is:

[objective_server_node_main-8] [ERROR] [1684327668.068245959] [objective_server_node]: Execution of behavior tree for objective `3 Waypoints Pick and Place` did not succeed.
[objective_server_node_main-8] Stack trace (most recent call last) in thread 833:
[objective_server_node_main-8] #8    Object "/usr/lib/aarch64-linux-gnu/ld-linux-aarch64.so.1", at 0xffffffffffffffff, in 
[objective_server_node_main-8] #7    Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffff9e5a5d1b, in 
[objective_server_node_main-8] #6    Object "/usr/lib/aarch64-linux-gnu/libc.so.6", at 0xffff9e53d5c7, in 
[objective_server_node_main-8] #5    Object "/usr/lib/aarch64-linux-gnu/libstdc++.so.6.0.30", at 0xffff9e7731fb, in 
[objective_server_node_main-8] [1684327680.555]: Sequence                  IDLE -> RUNNING
[objective_server_node_main-8] [1684327680.555]: ClearSnapshot             IDLE -> RUNNING
[objective_server_node_main-8] #4    Source "./src/rclcpp/executors/multi_threaded_executor.cpp", line 85, in run [0xffff9edbffc7]
[objective_server_node_main-8] #3    Source "./src/rclcpp/executor.cpp", line 912, in get_next_executable [0xffff9edb8b5b]
[objective_server_node_main-8] #2    Source "./src/rclcpp/executor.cpp", line 756, in wait_for_work [0xffff9edb5bdf]
[objective_server_node_main-8] #1    Object "/opt/ros/humble/lib/librcl.so", at 0xffff9e185b97, in rcl_wait
[objective_server_node_main-8] #0    Object "/opt/ros/humble/lib/librmw_fastrtps_shared_cpp.so", at 0xffff9d5f5718, in rmw_fastrtps_shared_cpp::__rmw_wait(char const*, rmw_subscriptions_s*, rmw_guard_conditions_s*, rmw_services_s*, rmw_clients_s*, rmw_events_s*, rmw_wait_set_s*, rmw_time_s const*)
[objective_server_node_main-8] Segmentation fault (Address not mapped to object [0xaaa1b04a5f08])
[ERROR] [objective_server_node_main-8]: process has died [pid 666, exit code -11, cmd '/opt/overlay_ws/install/moveit_studio_agent/lib/moveit_studio_agent/objective_server_node_main --ros-args --log-level error --ros-args -r __node:=objective_server_node --params-file /tmp/launch_params_j4cwbvx_ --params-file /tmp/launch_params_7oyj5lbr --params-file /tmp/launch_params_2aqbg8_q --params-file /tmp/launch_params_12q1oav_ --params-file /tmp/launch_params_h4pxzy2y --params-file /tmp/launch_params_8vn2ch2o --params-file /tmp/launch_params_1c4b41ac --params-file /tmp/launch_params_blie9quv'].

One of our engineers looked into this with a debugger and saw:

Thread 25 "objective_serve" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xffffa535e660 (LWP 74792)]
0x0000fffff6417588 in rmw_fastrtps_shared_cpp::__rmw_wait (identifier=, subscriptions=, guard_conditions=0xaaaaab4e0b30, services=, clients=, events=, wait_set=, wait_timeout=) at /opt/underlay_ws/src/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_wait.cpp:220
220 if (!condition->get_trigger_value()) {
(gdb) bt
#0 0x0000fffff6417588 in rmw_fastrtps_shared_cpp::__rmw_wait (
identifier=, subscriptions=,
guard_conditions=0xaaaaab4e0b30, services=,
clients=, events=, wait_set=,
wait_timeout=)
at /opt/underlay_ws/src/rmw_fastrtps/rmw_fastrtps_shared_cpp/src/rmw_wait.cpp:220
#1 0x0000fffff6fc5b98 in rcl_wait (wait_set=0xffffffffa110, timeout=250000000)
at ./src/rcl/wait.c:595
#2 0x0000fffff7bf5be0 in rclcpp::Executor::wait_for_work (
this=0xffffffffa060, timeout=...) at /usr/include/c++/11/chrono:521
#3 0x0000fffff7bf8b5c in rclcpp::Executor::get_next_executable (
this=0xffffffffa060, any_executable=..., timeout=...)
at ./src/rclcpp/executor.cpp:912
#4 0x0000fffff7bfffc8 in rclcpp::executors::MultiThreadedExecutor::run (
this=0xffffffffa060, this_thread_number=)
at ./src/rclcpp/executors/multi_threaded_executor.cpp:85
#5 0x0000fffff75b31fc in ?? () from /lib/aarch64-linux-gnu/libstdc++.so.6
#6 0x0000fffff737d5c8 in ?? () from /lib/aarch64-linux-gnu/libc.so.6
#7 0x0000fffff73e5d1c in ?? () from /lib/aarch64-linux-gnu/libc.so.6`

Additional information

This segfault does not occur using Cyclone DDS.

Implementation considerations

Metadata

Metadata

Assignees

No one assigned

    Labels

    backlogbugSomething isn't working

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions