Safety-Constrained Reinforcement Learning with Post-Training Reachability Verification for Robot Navigation

Qisong He; Xinmiao Huang; Jinwei Hu; Zhuoyun Li; Yi Dong; Changshun Wu; Xiaowei Huang

arXiv:2605.14174 · cs.RO · May 15, 2026


TL;DR

This paper introduces a risk-sensitive reinforcement learning framework for robot navigation that enhances safety margins and enables formal reachability verification, improving reliability in cluttered environments.

Contribution

It proposes a CVaR-constrained training method combined with post-training neural network reachability analysis for safer robot navigation policies.
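For readers unfamiliar with CVaR, the following is a minimal sketch of how an empirical CVaR of episode costs differs from the average cost. It is not the paper's implementation: the function name `cvar`, the confidence level `alpha = 0.9`, the cost budget, and the two sample cost lists are all hypothetical and chosen only for illustration.

```python
def cvar(costs, alpha):
    """Empirical CVaR: mean of the worst (1 - alpha) fraction of costs.

    Hypothetical helper for illustration; at least one sample is
    always kept in the tail.
    """
    ordered = sorted(costs)
    tail_size = max(1, round((1 - alpha) * len(ordered)))
    tail = ordered[-tail_size:]
    return sum(tail) / len(tail)


def mean(costs):
    return sum(costs) / len(costs)


# Two made-up policies with the same average cumulative cost per episode.
policy_a_costs = [1.0] * 10          # consistently small cost
policy_b_costs = [0.0] * 9 + [10.0]  # usually zero, one severe episode

alpha = 0.9        # assumed confidence level
cost_budget = 2.0  # assumed constraint threshold

for name, costs in [("A", policy_a_costs), ("B", policy_b_costs)]:
    tail_risk = cvar(costs, alpha)
    satisfied = tail_risk <= cost_budget
    print(name, "mean:", mean(costs), "CVaR:", tail_risk, "within budget:", satisfied)
```

Both made-up policies have the same mean cost, but only policy A keeps its tail cost within the assumed budget. This is the kind of difference an average-cost constraint cannot see.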

Findings

01

Policies trained with CVaR constraints have larger safety margins.

02

The method achieves a 98.3% success rate in navigation scenarios.

03

Reachability verification reveals risks not captured by average cost metrics.

Abstract

Safe navigation for mobile robots demands policies that remain reliable under the high-consequence perception uncertainty of cluttered environments. Yet most existing safe reinforcement learning (RL) methods assess safety through average cumulative cost. Such metrics can mask dangerous tail-risk behaviors. To address this, we propose a framework that trains risk-sensitive policies through Conditional Value-at-Risk (CVaR) constrained optimization on an off-policy TD3 backbone and evaluates their safety margins post-training through neural network reachability verification. During training, the policy is optimized under CVaR constraints on cumulative costs, promoting sensitivity to high-cost tail outcomes rather than average behavior alone. After training, we compute action reachable sets under bounded observation uncertainty using Taylor Model analysis, yielding a safety rate metric that…
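To illustrate what an action reachable set under bounded observation uncertainty means, here is a rough sketch that uses simple interval bound propagation in place of the Taylor Model analysis named in the abstract. Interval propagation is a coarser technique and is used here only because it fits in a few lines. The two-layer ReLU network, its weights, the observation, and the uncertainty radius `eps` are all made up for this example.

```python
def affine_bounds(lower, upper, weights, bias):
    """Propagate an axis-aligned box through y = W x + b."""
    out_lo, out_hi = [], []
    for row, b in zip(weights, bias):
        lo = hi = b
        for w, l, u in zip(row, lower, upper):
            # A positive weight maps lower to lower; a negative one swaps them.
            if w >= 0:
                lo += w * l
                hi += w * u
            else:
                lo += w * u
                hi += w * l
        out_lo.append(lo)
        out_hi.append(hi)
    return out_lo, out_hi


def action_reachable_interval(obs, eps, layers):
    """Over-approximate the actions reachable when each observation
    component is perturbed by at most eps (ReLU on hidden layers only)."""
    lower = [o - eps for o in obs]
    upper = [o + eps for o in obs]
    for i, (weights, bias) in enumerate(layers):
        lower, upper = affine_bounds(lower, upper, weights, bias)
        if i < len(layers) - 1:
            lower = [max(0.0, v) for v in lower]
            upper = [max(0.0, v) for v in upper]
    return lower, upper


def policy(obs, layers):
    """Evaluate the toy network at a single observation."""
    lo, _ = action_reachable_interval(obs, 0.0, layers)
    return lo


# Hypothetical toy policy: 2 observations -> 2 hidden units -> 1 action.
toy_layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]),
    ([[1.0, -2.0]], [0.1]),
]
obs, eps = [1.0, 0.5], 0.1

act_lo, act_hi = action_reachable_interval(obs, eps, toy_layers)
nominal = policy(obs, toy_layers)
print("nominal action:", nominal, "reachable interval:", act_lo, act_hi)
```

The resulting interval contains every action the toy network can output for observations within `eps` of the nominal one. Checking that interval against a set of actions considered safe is one way such bounds could be used, although the paper's own metric may be defined differently.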
