Safety Assessment in Reinforcement Learning via Model Predictive Control
Jeff Pflueger, Michael Everett

TL;DR
This paper introduces a safety assessment method for reinforcement learning that uses model predictive control to abort unsafe actions during training, without requiring explicit safety specifications.
Contribution
It proposes combining reversibility with model-predictive path integral control to provide safety guarantees in model-free reinforcement learning without detailed knowledge of the safety specifications.
Findings
Aborts before all unsafe actions during training
Achieves comparable training progress to baseline methods
Provides safety guarantees without explicit safety constraints
Abstract
Model-free reinforcement learning approaches are promising for control but typically lack formal safety guarantees. Existing methods to shield or otherwise provide these guarantees often rely on detailed knowledge of the safety specifications. Instead, this work's insight is that many difficult-to-specify safety issues are best characterized by invariance. Accordingly, we propose to leverage reversibility as a method for preventing these safety issues throughout the training process. Our method uses model-predictive path integral control to check the safety of an action proposed by a learned policy throughout training. A key advantage of this approach is that it only requires the ability to query the black-box dynamics, not explicit knowledge of the dynamics or safety constraints. Experimental results demonstrate that the proposed algorithm successfully aborts before all unsafe actions,…
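The reversibility check described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that "reversible" means the system can be steered back to within a tolerance of the pre-action state, and it uses a simplified MPPI-style sampler that only queries a black-box `step` function. The toy dynamics (a 1-D point with an absorbing region past x = 1) and every name and parameter here (`step`, `rollout`, `mppi_return_cost`, `is_reversible`, `tol`, `lam`, `sigma`) are hypothetical choices made for illustration.

```python
import numpy as np

def step(x, a, dt=0.2):
    """Hypothetical black-box dynamics: a 1-D point that gets stuck once x > 1."""
    if x > 1.0:  # absorbing region: no action can move the state any more
        return x
    return x + dt * float(np.clip(a, -1.0, 1.0))

def rollout(x, actions):
    """Apply a sequence of actions by querying the black box only."""
    for a in actions:
        x = step(x, a)
    return x

def mppi_return_cost(x_start, x_target, horizon=10, samples=64,
                     iters=5, lam=0.1, sigma=0.5, seed=0):
    """Simplified MPPI-style search for a plan from x_start back to x_target.

    Returns the lowest final distance to x_target found by any sampled
    or nominal rollout.
    """
    rng = np.random.default_rng(seed)
    nominal = np.zeros(horizon)
    best = abs(rollout(x_start, nominal) - x_target)
    for _ in range(iters):
        # Perturb the nominal action sequence with Gaussian noise.
        candidates = nominal + rng.normal(0.0, sigma, size=(samples, horizon))
        costs = np.array([abs(rollout(x_start, c) - x_target) for c in candidates])
        # Path-integral update: exponentially weight low-cost samples.
        weights = np.exp(-(costs - costs.min()) / lam)
        weights /= weights.sum()
        nominal = weights @ candidates
        best = min(best, costs.min(), abs(rollout(x_start, nominal) - x_target))
    return best

def is_reversible(x, action, tol=0.05):
    """Assumed criterion: after taking `action`, can we get back near x?"""
    x_next = step(x, action)
    return mppi_return_cost(x_next, x) <= tol

print(is_reversible(0.9, -1.0))  # moving away from the absorbing region
print(is_reversible(0.9, 1.0))   # stepping into the absorbing region
```

In a training loop, a check of this kind would run before each action proposed by the policy is executed, and the episode would abort when the check fails. How the paper defines the return target and the abort rule may differ from this sketch.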
