Safety-Constrained Reinforcement Learning with Post-Training Reachability Verification for Robot Navigation
Qisong He, Xinmiao Huang, Jinwei Hu, Zhuoyun Li, Yi Dong, Changshun Wu, Xiaowei Huang

TL;DR
This paper introduces a risk-sensitive reinforcement learning framework for robot navigation that enhances safety margins and enables formal reachability verification, improving reliability in cluttered environments.
Contribution
It proposes a CVaR-constrained training method combined with post-training neural network reachability analysis for safer robot navigation policies.
Findings
Policies trained with CVaR constraints have larger safety margins.
The method achieves a 98.3% success rate in navigation scenarios.
Reachability verification reveals risks not captured by average cost metrics.
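The gap between average cost and tail risk noted in these findings can be illustrated with a small empirical CVaR computation. This is a minimal sketch and not the paper's implementation: the function name `empirical_cvar`, the confidence level `alpha`, and the two cost samples are hypothetical and chosen only for illustration.

```python
import math


def empirical_cvar(costs, alpha=0.9):
    """Average of the worst (1 - alpha) fraction of sampled episode costs.

    Hypothetical helper for illustration; `alpha` is an assumed confidence level.
    """
    if not costs:
        raise ValueError("costs must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(costs, reverse=True)  # highest costs first
    # Number of tail samples; rounding guards against float noise in (1 - alpha) * n.
    k = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    tail = ordered[:k]
    return sum(tail) / k


# Two made-up policies with identical average cost but different tails.
steady_costs = [1.0] * 10          # always a small cost
risky_costs = [0.0] * 9 + [10.0]   # usually free, occasionally very costly

mean_steady = sum(steady_costs) / len(steady_costs)
mean_risky = sum(risky_costs) / len(risky_costs)
cvar_steady = empirical_cvar(steady_costs, alpha=0.9)
cvar_risky = empirical_cvar(risky_costs, alpha=0.9)
```

Both samples have the same mean, so an average-cost metric cannot tell them apart, while the CVaR of the second sample is far higher because it averages only the worst outcomes.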
Abstract
Safe navigation for mobile robots demands policies that remain reliable under the high-consequence perception uncertainty of cluttered environments. Yet most existing safe reinforcement learning (RL) methods assess safety through average cumulative cost. Such metrics can mask dangerous tail-risk behaviors. To address this, we propose a framework that trains risk-sensitive policies through Conditional Value-at-Risk (CVaR) constrained optimization on an off-policy TD3 backbone and evaluates their safety margins post-training through neural network reachability verification. During training, the policy is optimized under CVaR constraints on cumulative costs, promoting sensitivity to high-cost tail outcomes rather than average behavior alone. After training, we compute action reachable sets under bounded observation uncertainty using Taylor Model analysis, yielding a safety rate metric that…
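The idea of an action reachable set under bounded observation uncertainty can be sketched with interval bound propagation. Note that this swaps in a simpler and coarser technique than the Taylor Model analysis named in the abstract, and it omits any output squashing such as tanh. The network weights, the perturbation radius `eps`, and all function names below are hypothetical.

```python
def affine_bounds(lo, hi, weights, bias):
    """Propagate an axis-aligned box through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        l = h = b
        for w, x_lo, x_hi in zip(row, lo, hi):
            if w >= 0:
                l += w * x_lo
                h += w * x_hi
            else:
                # A negative weight flips which input end gives the extreme.
                l += w * x_hi
                h += w * x_lo
        out_lo.append(l)
        out_hi.append(h)
    return out_lo, out_hi


def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def action_bounds(obs, eps, layers):
    """Over-approximate the actions reachable when each observation
    component may deviate from `obs` by at most `eps`."""
    lo = [o - eps for o in obs]
    hi = [o + eps for o in obs]
    for i, (weights, bias) in enumerate(layers):
        lo, hi = affine_bounds(lo, hi, weights, bias)
        if i < len(layers) - 1:  # ReLU on hidden layers only
            lo, hi = relu_bounds(lo, hi)
    return lo, hi


# Toy 2-2-1 policy network with made-up weights (not from the paper).
toy_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, 1.0]], [0.0]),
]
act_lo, act_hi = action_bounds([1.0, 0.0], eps=0.1, layers=toy_layers)
```

Every action the toy network can output for an observation inside the box is guaranteed to lie between `act_lo` and `act_hi`. A check of that interval against an allowed action range is one way such bounds could feed a safety assessment, although the bounds are generally looser than those a Taylor Model would give.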
