Provable Safe Reinforcement Learning with Binary Feedback
Andrew Bennett, Dipendra Misra, Nathan Kallus

TL;DR
This paper introduces SABRE, a meta-algorithm for safe reinforcement learning from binary safety feedback. With provable guarantees, it keeps training safe with high probability and returns a near-optimal safe policy.
Contribution
The paper proposes SABRE, a novel meta-algorithm that applies active learning concepts to enable provably safe RL with binary feedback. It can be applied to any MDP setting for which a blackbox PAC RL algorithm is available.
Findings
SABRE guarantees safety during training with high probability.
SABRE finds near-optimal safe policies under technical assumptions.
The approach reduces safety query complexity through active exploration.
Abstract
Safety is a crucial necessity in many applications of reinforcement learning (RL), whether robotic, automotive, or medical. Many existing approaches to safe RL rely on receiving numeric safety feedback, but in many cases this feedback can only take binary values; that is, whether an action in a given state is safe or unsafe. This is particularly true when feedback comes from human experts. We therefore consider the problem of provable safe RL when given access to an offline oracle providing binary feedback on the safety of state-action pairs. We provide a novel meta-algorithm, SABRE, which can be applied to any MDP setting given access to a blackbox PAC RL algorithm for that setting. SABRE applies concepts from active learning to reinforcement learning to provably control the number of queries to the safety oracle. SABRE works by iteratively exploring the state space to find regions…
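The active-learning idea the abstract refers to can be illustrated with a toy sketch. This is not SABRE itself: SABRE operates on MDPs and relies on a blackbox PAC RL algorithm, neither of which appears here. The sketch assumes a hypothetical one-dimensional setting in which each state-action pair is summarized by a risk score and is safe exactly when that score is at or below an unknown threshold. The names make_oracle and label_with_disagreement_queries are invented for illustration. The learner queries the binary oracle only for scores whose safety label is not already implied by earlier answers, which is the standard disagreement-based way of limiting query counts.

```python
import random


def make_oracle(true_threshold):
    """Hypothetical binary safety oracle (an assumption for this sketch).

    Returns True (safe) iff the risk score is at or below the hidden threshold.
    """
    def oracle(score):
        return score <= true_threshold
    return oracle


def label_with_disagreement_queries(scores, oracle):
    """Label every score as safe/unsafe, querying the oracle only when needed.

    The set of thresholds consistent with past answers is lo <= t < hi.
    A score at or below lo is certainly safe, a score at or above hi is
    certainly unsafe, and only scores strictly between them need a query.
    """
    lo = float("-inf")  # largest score the oracle has confirmed safe
    hi = float("inf")   # smallest score the oracle has confirmed unsafe
    labels = []
    queries = 0
    for score in scores:
        if score <= lo:
            labels.append(True)    # implied safe, no query spent
        elif score >= hi:
            labels.append(False)   # implied unsafe, no query spent
        else:
            queries += 1           # consistent thresholds disagree: ask
            is_safe = oracle(score)
            if is_safe:
                lo = score
            else:
                hi = score
            labels.append(is_safe)
    return labels, queries


if __name__ == "__main__":
    rng = random.Random(0)
    scores = [rng.random() for _ in range(1000)]
    oracle = make_oracle(0.37)  # hidden threshold, chosen arbitrarily
    labels, queries = label_with_disagreement_queries(scores, oracle)
    print(f"labeled {len(labels)} scores using {queries} oracle queries")
```

In this toy setting every label is correct by construction, and the number of queries is typically far smaller than the number of scores because each answer shrinks the region where the label is still uncertain.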
Taxonomy
Topics: Formal Methods in Verification · Safety Systems Engineering in Autonomy · Software Reliability and Analysis Research
