The research toolkit for RL-based robot navigation on ROS 2.
RosNav-RL wraps any reinforcement learning framework - Stable-Baselines3, DreamerV3, or your own - with a unified, fully modular pipeline: sensor data collection and preprocessing, typed observation spaces, composable reward shaping, automatic dependency resolution between pipeline units, and Optuna-based hyperparameter search. Every layer is independently swappable so you can isolate and iterate on exactly the component you care about without rewriting the rest.
The core idea: define your robot's sensors, observations, reward, and algorithm as composable YAML configs. Switch from PPO to DreamerV3, add a new reward term, or plug in a custom observation generator - all without changing a single line of training code.
| Component | Details |
|---|---|
| ROS | Humble (ROS 2 only) |
| Python | 3.10+ |
| RL Backends | Stable-Baselines3, sb3-contrib, DreamerV3 |
| Config | Pydantic v2 - type-safe, auto-validated, YAML round-trip |
- Wrap any RL backend - the common RL_Model interface means you swap SB3 ↔ DreamerV3 ↔ your own implementation without touching training code. Same observations, same reward, same config.
- Declarative data pipeline - define collectors by ROS message type (sensor_msgs/LaserScan); preprocessing and topic wiring happen automatically.
- Dependency-resolved observation graph - generators declare what they need; a topological sort determines execution order at startup so you never manage it manually.
- Composable reward shaping - stack reward units in YAML, evaluated in parallel with safety categorization and schema-validated inter-unit dependencies.
- Built-in hyperparameter tuning - Optuna integration with MedianPruner / HyperbandPruner, framework-specific pruning callbacks, and automatic best-param export.
- One-command deployment - ros2 run rosnav_rl action_server.py wraps any trained agent behind a GetCommand service.
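The dependency resolution mentioned in the feature list can be illustrated with Python's standard-library graphlib module. This is a minimal sketch of the general technique, not RosNav-RL's actual implementation; the unit names and the `requires` mapping are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical units mapped to the units they depend on.
# These names are illustrative only, not RosNav-RL identifiers.
requires = {
    "laser_scan": set(),
    "robot_pose": set(),
    "goal_pose": set(),
    "distance_to_goal": {"robot_pose", "goal_pose"},
    "stacked_observation": {"laser_scan", "distance_to_goal"},
}


def execution_order(graph: dict[str, set[str]]) -> list[str]:
    """Return an order in which every unit runs after its dependencies."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as err:
        # A cycle means no valid order exists; fail at startup, not mid-episode.
        raise ValueError(f"cyclic dependency between units: {err.args[1]}") from err


order = execution_order(requires)
print(order)
```

Resolving the order once at startup means a misconfigured graph (a cycle, for instance) surfaces as a single clear error before any sensor data is processed.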
Arena-Rosnav users: just run arena feature training install - skip to Usage.
Requirements: ROS 2 Humble · Python 3.10+ · uv
# 1. Clone
cd ~/colcon_ws/src
git clone --depth 1 https://github.com/Arena-Rosnav/rosnav-rl.git
# 2. Install Python deps (uv creates and manages the venv)
cd rosnav-rl/rosnav_rl
uv sync && source .venv/bin/activate
# 3. Build & source
cd ~/colcon_ws
colcon build --packages-select rosnav_rl rosnav_rl_msgs
source install/setup.bash
That's it. You're ready to deploy or train.
The action server needs two things: a trained agent and an observations config that tells it which ROS topics to subscribe to and how to interpret them.
# Don't have an agent yet? Create one with random weights in seconds:
python3 scripts/create_test_agent.py --agent-name test_agent
The observations config (observations.yaml) maps your robot's sensors to collector types. A minimal example for a laser + goal setup:
# observations.yaml
datasources:
  front_laser:
    type: sensor_msgs/LaserScan
    params:
      topic: "scan" # your lidar topic
      up_to_date_required: true
  goal_pose:
    type: geometry_msgs/PoseStamped
    params:
      topic: "goal_pose"
      up_to_date_required: false
  robot_pose_from_tf:
    type: RobotPoseTFGenerator
    params: {}
A full annotated example with all available collectors and generators lives at rosnav_rl/observations/observations.yaml. Copy it, strip what you don't need, and point it at your topics.
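Before pointing the server at a config like the minimal example, a quick structural check can catch typos early. The sketch below works on the already-parsed mapping (what a YAML loader would return) and assumes only what the example suggests: entries are nested under datasources, each has a type and a params mapping, and entries whose type looks like a ROS message type (package/Message) carry a topic. The function name and the rules are illustrative assumptions, not part of RosNav-RL.

```python
def check_datasources(config: dict) -> list[str]:
    """Return human-readable problems found in a parsed observations config."""
    problems = []
    sources = config.get("datasources")
    if not isinstance(sources, dict) or not sources:
        return ["'datasources' must be a non-empty mapping"]
    for name, entry in sources.items():
        if not isinstance(entry, dict) or "type" not in entry:
            problems.append(f"{name}: missing 'type'")
            continue
        params = entry.get("params") or {}
        # Assumption: a '/' in the type marks a ROS message collector,
        # which needs a topic to subscribe to.
        if "/" in entry["type"] and not params.get("topic"):
            problems.append(f"{name}: message collector without 'topic'")
    return problems


# The minimal laser + goal example, written as the mapping a YAML loader would return.
config = {
    "datasources": {
        "front_laser": {
            "type": "sensor_msgs/LaserScan",
            "params": {"topic": "scan", "up_to_date_required": True},
        },
        "goal_pose": {
            "type": "geometry_msgs/PoseStamped",
            "params": {"topic": "goal_pose", "up_to_date_required": False},
        },
        "robot_pose_from_tf": {"type": "RobotPoseTFGenerator", "params": {}},
    }
}

print(check_datasources(config))
```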
# Start the action server
ros2 run rosnav_rl action_server.py --ros-args \
-p agent_name:=test_agent \
-p observations_config:=/path/to/observations.yaml
# Or via launch file
ros2 launch rosnav_rl action_server.launch.py \
agent_name:=test_agent \
observations_config:=/path/to/observations.yaml
The server reads sensor data from ROS 2 topics and returns a geometry_msgs/Twist via the rosnav_rl_msgs/srv/GetCommand service. On inference errors it logs a warning and returns zero velocity - it won't crash your robot.
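The fail-safe behaviour described here (log a warning, return zero velocity) can be sketched in plain Python. This illustrates the pattern only and is not the server's actual code: the Velocity tuple stands in for geometry_msgs/Twist, and safe_command and the two toy policies are hypothetical.

```python
import logging
from typing import Callable, NamedTuple, Sequence

logger = logging.getLogger("action_server_sketch")


class Velocity(NamedTuple):
    """Stand-in for the linear.x / angular.z part of a Twist message."""
    linear_x: float
    angular_z: float


ZERO_VELOCITY = Velocity(0.0, 0.0)


def safe_command(
    policy: Callable[[Sequence[float]], Velocity],
    observation: Sequence[float],
) -> Velocity:
    """Run the policy; on any inference error, warn and command a stop."""
    try:
        return policy(observation)
    except Exception as err:  # broad on purpose: any failure must stop the robot
        logger.warning("inference failed, returning zero velocity: %s", err)
        return ZERO_VELOCITY


def working_policy(obs: Sequence[float]) -> Velocity:
    return Velocity(0.5, 0.1)


def broken_policy(obs: Sequence[float]) -> Velocity:
    raise RuntimeError("model not loaded")


print(safe_command(working_policy, [1.0, 2.0]))
print(safe_command(broken_policy, [1.0, 2.0]))
```

Catching broadly is a deliberate choice in this sketch: for a moving robot, stopping on an unexpected error is usually safer than propagating the exception and leaving the last command active.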
# Poke it manually
ros2 service call /get_command rosnav_rl_msgs/srv/GetCommand {}
The cleanest way to train is via arena_training, which wires up the gym environments and launch files. For a standalone training loop, use the Python API directly:
import rosnav_rl
from rosnav_rl.cfg.action_spaces import DifferentialDriveActionSpace
from rosnav_rl.cfg.parameters import AgentParameters
from rosnav_rl.model.stable_baselines3.cfg import (
    StableBaselinesCfg, PPO_Cfg, PPO_Algorithm_Cfg,
)

spec = rosnav_rl.AgentConfig(
    robot="jackal",
    action_space=DifferentialDriveActionSpace(
        linear_range=(-2.0, 2.0),
        angular_range=(-4.0, 4.0),
    ),
    # Review and adjust these before every training run - especially
    # laser_num_beams / laser_max_range (must match your robot's LIDAR),
    # robot_radius / safety_distance, and goal_radius / max_steps.
    # When using arena_training these are auto-populated from the robot
    # description, but you should still verify them.
    parameters=AgentParameters(
        laser_num_beams=720,
        laser_max_range=30.0,
        robot_radius=0.215,
        safety_distance=0.3,
        goal_radius=0.35,
        max_steps=500,
    ),
    framework=StableBaselinesCfg(
        algorithm=PPO_Cfg(
            architecture_name="AGENT_1",
            parameters=PPO_Algorithm_Cfg(
                total_timesteps=5_000_000,
                learning_rate=3e-4,
            ),
        ),
    ),
    reward=rosnav_rl.RewardCfg(
        reward_function_dict={
            "goal_reached": {"reward": 15.0},
            "collision": {"reward": -10.0},
            "approach_goal": {"pos_factor": 0.3, "neg_factor": 0.5},
            "safe_distance": {"reward": -0.15},
        },
    ),
)

agent = rosnav_rl.RL_Agent(spec)
agent.initialize_model()
# train_envs and eval_envs are the gym environments you supply; arena_training builds them for you.
agent.train(train_envs=train_envs, eval_envs=eval_envs)
Before every training run: open the generated agent.yaml and verify the parameters: block. Key fields: laser_num_beams, laser_max_range, robot_radius, safety_distance, goal_radius, max_steps, and all velocity bounds. A mismatch between these and your actual robot will silently degrade policy quality. See the AgentParameters reference for a full field table.
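Some of these mismatches can be caught with a few checks before training starts. The sketch below uses a plain dataclass as a stand-in for the parameters block and compares it with values you would read from one real sensor_msgs/LaserScan message (the length of ranges, and range_max). The class, the function, and the specific rules are assumptions for illustration, not RosNav-RL API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """Hypothetical stand-in for the fields of the parameters block."""
    laser_num_beams: int
    laser_max_range: float
    robot_radius: float
    safety_distance: float
    goal_radius: float
    max_steps: int


def find_mismatches(p: Params, scan_len: int, scan_range_max: float) -> list[str]:
    """Compare configured values with what the real lidar reports."""
    issues = []
    if p.laser_num_beams != scan_len:
        issues.append(f"laser_num_beams={p.laser_num_beams}, but the scan has {scan_len} ranges")
    if abs(p.laser_max_range - scan_range_max) > 1e-6:
        issues.append(f"laser_max_range={p.laser_max_range}, but the scan reports {scan_range_max}")
    if p.safety_distance <= p.robot_radius:
        # Assumption: safety_distance is measured from the robot centre.
        issues.append("safety_distance should exceed robot_radius")
    if p.max_steps <= 0 or p.goal_radius <= 0:
        issues.append("max_steps and goal_radius must be positive")
    return issues


# Values from the training example, checked against a lidar that only has 360 beams.
params = Params(720, 30.0, 0.215, 0.3, 0.35, 500)
for issue in find_mismatches(params, scan_len=360, scan_range_max=30.0):
    print("WARNING:", issue)
```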
To swap to DreamerV3, replace StableBaselinesCfg(...) with DreamerV3Cfg(...) - everything else stays the same. With Arena:
arena launch sim:=gazebo local_planner:=rosnav_rl env_n:=2 \
train_config:=sb_training_config.yaml
It's a one-line YAML change - no Python rewrite required. See the config reference and tutorials for all options.
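A backend swap can stay a config change because training code talks to a shared interface rather than to a concrete framework. A minimal sketch of that idea follows; the Protocol, the two toy backends, and the registry are hypothetical and far simpler than the real RL_Model interface.

```python
from typing import Callable, Protocol


class Model(Protocol):
    """Toy stand-in for a common backend interface."""

    def train(self, total_timesteps: int) -> str: ...


class ToyPPO:
    def train(self, total_timesteps: int) -> str:
        return f"PPO trained for {total_timesteps} steps"


class ToyDreamer:
    def train(self, total_timesteps: int) -> str:
        return f"Dreamer trained for {total_timesteps} steps"


# Registry keyed by the name that would appear in the config file.
BACKENDS: dict[str, Callable[[], Model]] = {
    "ppo": ToyPPO,
    "dreamer": ToyDreamer,
}


def run_training(config: dict) -> str:
    """Training code depends only on the interface, never on a concrete backend."""
    try:
        factory = BACKENDS[config["framework"]]
    except KeyError:
        raise ValueError(f"unknown framework: {config.get('framework')!r}") from None
    return factory().train(config["total_timesteps"])


# Only the config value changes between the two runs.
print(run_training({"framework": "ppo", "total_timesteps": 1000}))
print(run_training({"framework": "dreamer", "total_timesteps": 1000}))
```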
The test suite covers config loading, path resolution, model init, and the GetCommand service:
cd rosnav_rl
python3 -m pytest tests/ -v
No simulator needed - tests are fully offline. If you want a real smoke-test of the full inference path:
# Creates agents/test_agent/ with random-weight best_model.zip
python3 scripts/create_test_agent.py --agent-name test_agent
# Then spin up the action server and call it
ros2 run rosnav_rl action_server.py --ros-args -p agent_name:=test_agent
| Path | Description |
|---|---|
| rosnav_rl/ | Core Python package - models, observations, rewards, spaces, deployment |
| rosnav_rl_msgs/ | ROS 2 message & service definitions (GetCommand, etc.) |
Go deeper:
| Document | What's in it |
|---|---|
| rosnav_rl/README.md | Architecture overview, data-flow diagram, module table |
| rosnav_rl/GUIDE.md | Full developer guide - design patterns, core concepts, code organization |
| rosnav_rl/TUTORIALS.md | Step-by-step: add an algorithm, build a reward unit, create an observation space |
MIT - see LICENSE.md.