Verifiable Model-Free Safety Filters via Reinforcement Learning
Bihui Yin, Yiwen Lu, Yuchen Jiang, Yilin Mo

TL;DR
This paper introduces a model-free safety filter trained with reinforcement learning. Its QP-based structure admits a formal safety certificate without relying on a system model, and in numerical experiments it outperforms both conventional model-based PSFs and RL-trained MLP baselines.
Contribution
The paper proposes a DRL-based safety filter, structured as an unrolled QP solver network, that comes with a formal safety certificate and requires no system identification.
Findings
Outperforms conventional model-based PSFs and RL-trained MLP baselines on safety in numerical experiments
Reduces intervention and computational load compared to baselines
Provides formal safety certification for the learned filter
Abstract
This paper presents a reinforcement learning approach to a model-free safety filter, drawing inspiration from the framework of model-based Predictive Safety Filters (PSFs). Like conventional PSFs, our method adopts a Quadratic Programming (QP) formulation, representing the filter as an unrolled QP solver network. However, unlike existing PSFs, which derive the QP parameters explicitly from system models, we learn these parameters directly through Deep Reinforcement Learning (DRL), thereby eliminating the dependency on accurate system identification. Furthermore, in contrast to traditional neural network-based methods, the QP structure allows us to furnish a formal certificate for the persistent safety of the learned filter. Numerical results demonstrate that our method outperforms both conventional model-based PSFs and RL-trained Multi-Layer Perceptron (MLP) baselines in terms of safety…
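To make the idea of a QP-based filter expressed as an unrolled solver more concrete, here is a minimal sketch in Python. It is not the authors' architecture: the function name `unrolled_qp_filter`, the constraint form `G u <= h`, the quadratic cost, and the choice of projected dual ascent as the solver are all assumptions made for illustration. In the paper the QP parameters are learned through DRL; here `G` and `h` are simply fixed by hand.

```python
import numpy as np

def unrolled_qp_filter(u_nom, G, h, num_iters=50):
    """Approximately solve  min_u 0.5 * ||u - u_nom||^2  s.t.  G u <= h
    using a fixed number of projected dual-ascent iterations.

    Running a fixed iteration count (instead of iterating to convergence)
    is what makes the solver "unrolled": it behaves like a network with
    num_iters identical layers. G and h are hypothetical stand-ins for
    parameters that a learning procedure would provide.
    """
    u_nom = np.asarray(u_nom, dtype=float)
    lam = np.zeros(G.shape[0])                # dual variables, one per constraint
    step = 1.0 / np.linalg.norm(G, 2) ** 2    # 1 / (largest singular value)^2
    for _ in range(num_iters):
        u = u_nom - G.T @ lam                 # primal minimizer for the current duals
        lam = np.maximum(0.0, lam + step * (G @ u - h))  # ascent step, project onto lam >= 0
    return u_nom - G.T @ lam

# Hand-picked constraint set: keep a scalar input inside [-1, 1].
G = np.array([[1.0], [-1.0]])
h = np.array([1.0, 1.0])

print(unrolled_qp_filter([2.0], G, h))   # nominal input violates the bound and is pulled back
print(unrolled_qp_filter([0.3], G, h))   # nominal input already satisfies the bound
```

With these hand-picked constraints, an input that already satisfies `G u <= h` passes through unchanged, while a violating input is moved to the nearest feasible point. This is the minimal-intervention behavior usually expected of a safety filter.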
