Conservative Safety Critics for Exploration
Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg

TL;DR
This paper introduces a conservative safety critic for reinforcement learning that bounds the risk of catastrophic failures during exploration, ensuring safer training while maintaining competitive task performance.
Contribution
The paper proposes a conservative safety critic that provably upper-bounds the likelihood of catastrophic failure at every training iteration, and it characterizes the tradeoff between safety and policy improvement in RL.
Findings
Achieves lower failure rates during training compared to prior methods.
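Shows that safety constraints are satisfied with high probability during training, with convergence guarantees that are asymptotically no worse than standard RL.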
Demonstrates effectiveness on navigation, manipulation, and locomotion tasks.
Abstract
Safe exploration presents a major challenge in reinforcement learning (RL): when active data collection requires deploying partially trained policies, we must ensure that these policies avoid catastrophically unsafe regions, while still enabling trial and error learning. In this paper, we target the problem of safe exploration in RL by learning a conservative safety estimate of environment states through a critic, and provably upper bound the likelihood of catastrophic failures at every training iteration. We theoretically characterize the tradeoff between safety and policy improvement, show that the safety constraints are likely to be satisfied with high probability during training, derive provable convergence guarantees for our approach, which is no worse asymptotically than standard RL, and demonstrate the efficacy of the proposed approach on a suite of challenging navigation,…
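The idea of acting on a conservative safety estimate can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how the critic is trained or how it is used when choosing actions. The sketch assumes a critic has already produced estimated failure probabilities for a handful of candidate actions, inflates each estimate by a fixed margin so that it errs on the side of overestimating risk, and accepts the first candidate whose inflated estimate falls under a threshold. The function names, the `margin` and `threshold` parameters, and the fallback rule are all hypothetical.

```python
def conservative_failure_estimate(q_fail, margin):
    """Inflate estimated failure probabilities by a fixed margin (hypothetical rule).

    Values are clipped to [0, 1] so they remain valid probabilities.
    """
    return [min(1.0, max(0.0, q + margin)) for q in q_fail]


def select_safe_action(actions, q_fail, threshold, margin=0.05):
    """Pick the first candidate whose conservative failure estimate is within budget.

    actions:   candidate actions, e.g. sampled from the current policy
    q_fail:    the critic's estimated failure probability for each candidate
    threshold: maximum tolerated failure probability (assumed safety budget)
    margin:    amount added to each estimate to make it conservative

    If no candidate passes, fall back to the least risky one.
    """
    estimates = conservative_failure_estimate(q_fail, margin)
    for action, risk in zip(actions, estimates):
        if risk <= threshold:
            return action, risk
    # No candidate is within budget: return the one with the lowest estimate.
    best = min(range(len(actions)), key=lambda i: estimates[i])
    return actions[best], estimates[best]


if __name__ == "__main__":
    candidates = ["left", "stay", "right"]
    critic_output = [0.30, 0.02, 0.10]
    action, risk = select_safe_action(candidates, critic_output, threshold=0.1)
    print(action, round(risk, 2))
```

With the values above, "left" is rejected (0.35 after the margin), "stay" is accepted (about 0.07), and "right" is never examined. Overestimating risk trades some exploration for fewer unsafe actions, which is the tradeoff between safety and policy improvement that the abstract refers to.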
Taxonomy
Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Adversarial Robustness in Machine Learning
