Q-learning-based Model-free Safety Filter
Guo Ning Sue, Yogita Choudhary, Richard Desatnik, Carmel Majidi, John Dolan, Guanya Shi

TL;DR
This paper introduces a simple, model-free safety filter framework using Q-learning that can be integrated with various RL algorithms to ensure safety in complex robotic systems, validated through simulations and real-world experiments.
Contribution
It presents a novel, versatile safety filtering method that requires no system model and can be easily combined with existing RL algorithms.
Findings
Effective safety filtering demonstrated in simulations and real-world robotic experiments.
Theoretical analysis supports the safety threshold mechanism.
Framework seamlessly integrates with various RL algorithms.
Abstract
Ensuring safety via safety filters in real-world robotics presents significant challenges, particularly when the system dynamics are complex or unavailable. To handle this issue, learning-based safety filters have recently gained popularity; they can be classified as model-based or model-free methods. Existing model-based approaches require various assumptions on the system model (e.g., control-affine), which limits their application in complex systems, and existing model-free approaches need substantial modifications to standard RL algorithms and lack versatility. This paper proposes a simple, plug-and-play, and effective model-free safety filter learning framework. We introduce a novel reward formulation and use Q-learning to learn Q-value functions to safeguard arbitrary task-specific nominal policies by filtering out their potentially unsafe actions. The threshold used in the filtering…
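The filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the tabular Q-table, the threshold value of 0.5, the fallback to the action with the highest safety Q-value, and the name `filter_action` are all assumptions made for illustration, since the abstract does not specify the reward formulation or how the threshold is chosen.

```python
import numpy as np

def filter_action(q_safe, state, nominal_action, threshold):
    """Pass the nominal action through if its safety Q-value clears the
    threshold; otherwise fall back to the action with the highest safety
    Q-value in this state (an assumed fallback rule)."""
    if q_safe[state, nominal_action] >= threshold:
        return int(nominal_action)
    return int(np.argmax(q_safe[state]))

# Hypothetical safety Q-table: 2 states x 3 actions (values are made up).
q_safe = np.array([
    [0.9, 0.2, 0.6],
    [0.1, 0.8, 0.3],
])
THRESHOLD = 0.5  # assumed value, not taken from the paper

# The nominal policy proposes action 1 in state 0; its Q-value (0.2) is
# below the threshold, so the filter substitutes the safest action.
safe_action = filter_action(q_safe, state=0, nominal_action=1, threshold=THRESHOLD)
```

Because the filter only inspects the proposed action and a learned Q-table, the nominal policy can be any task policy, which is the plug-and-play property the abstract emphasizes.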
Taxonomy
Topics: Risk and Safety Analysis · Autonomous Vehicle Technology and Safety
Methods: Q-Learning
