TL;DR
This paper combines goal-conditioned safe reinforcement learning (RL) with graph-based planning so that multiple agents can navigate hazardous environments safely and efficiently over long horizons. The authors report that it outperforms existing methods.
Contribution
The paper introduces a unified framework that integrates safe RL with planning for goal-conditioned multi-agent navigation, with the aim of improving safety and scalability in complex environments.
Findings
Achieves safer navigation over longer distances.
Effectively coordinates multiple agents in hazardous settings.
Outperforms state-of-the-art baselines in benchmarks.
Abstract
Safe navigation is essential for autonomous systems operating in hazardous environments. Traditional planning methods excel at long-horizon tasks but rely on a predefined graph with fixed distance metrics. In contrast, safe Reinforcement Learning (RL) can learn complex behaviors without relying on manual heuristics but fails to solve long-horizon tasks, particularly in goal-conditioned and multi-agent scenarios. In this paper, we introduce a novel method that integrates the strengths of both planning and safe RL. Our method leverages goal-conditioned RL and safe RL to learn a goal-conditioned policy for navigation while concurrently estimating cumulative distance and safety levels using learned value functions via an automated self-training algorithm. By constructing a graph with states from the replay buffer, our method prunes unsafe edges and generates a waypoint-based plan that the…
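The graph step described in the abstract can be illustrated with a minimal sketch. Everything named here is an assumption for illustration only: `distance_fn` and `cost_fn` are hand-written stand-ins for the learned distance and safety value functions, `states` stands in for states sampled from a replay buffer, and `max_dist`, `cost_limit`, and the hazard location are arbitrary. Dijkstra's algorithm is used as a generic shortest-path search; the summary does not say which search the authors use.

```python
import heapq
import math

def build_safe_graph(states, distance_fn, cost_fn, max_dist, cost_limit):
    """Connect state pairs that are nearby AND whose estimated cost is within the limit.

    distance_fn and cost_fn are hypothetical stand-ins for learned value functions.
    """
    graph = {i: [] for i in range(len(states))}
    for i, s in enumerate(states):
        for j, t in enumerate(states):
            if i == j:
                continue
            d = distance_fn(s, t)
            # Prune edges that are too long or estimated to be unsafe.
            if d <= max_dist and cost_fn(s, t) <= cost_limit:
                graph[i].append((j, d))
    return graph

def shortest_waypoints(graph, start, goal):
    """Dijkstra search; returns the node indices from start to goal, or None."""
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == goal:
            break
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if goal not in dist:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

# Toy "replay buffer": six 2D states on a grid, with an assumed hazard at (1, 0).
states = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
HAZARD, RADIUS = (1, 0), 0.6

def distance_fn(s, t):
    # Stand-in for a learned distance estimate: plain Euclidean distance.
    return math.dist(s, t)

def cost_fn(s, t):
    # Stand-in for a learned safety estimate: cost 1.0 if either endpoint
    # or the midpoint of the edge lies inside the hazard radius, else 0.0.
    mid = ((s[0] + t[0]) / 2, (s[1] + t[1]) / 2)
    return 1.0 if any(math.dist(p, HAZARD) < RADIUS for p in (s, t, mid)) else 0.0

safe_graph = build_safe_graph(states, distance_fn, cost_fn, max_dist=1.0, cost_limit=0.5)
plan = shortest_waypoints(safe_graph, start=0, goal=2)

# Same search with pruning disabled, for comparison.
unpruned = build_safe_graph(states, distance_fn, cost_fn, max_dist=1.0, cost_limit=math.inf)
naive_plan = shortest_waypoints(unpruned, start=0, goal=2)
```

In this toy setup the pruned graph forces the plan to detour around the hazardous state, while the unpruned graph routes straight through it. In the method the abstract describes, a goal-conditioned policy would then be asked to reach each waypoint in turn; that part is not sketched here.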
