Multi-Constraint Safe Reinforcement Learning via Closed-form Solution for Log-Sum-Exp Approximation of Control Barrier Functions
Chenggang Wang, Xinyi Wang, Yutong Dong, Lei Song, Xinping Guan

TL;DR
This paper introduces a safe reinforcement learning method that approximates multiple control barrier function constraints with a log-sum-exp function and solves the resulting safety problem in closed form, reducing computational cost while keeping the policy safe during training and deployment.
Contribution
It proposes a closed-form solution for multi-constraint control barrier functions, enabling efficient and safe RL training without differentiable optimization.
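As a rough illustration of the idea, and not the authors' implementation, the sketch below combines several barrier values h_i(x) into one smooth barrier with log-sum-exp, then applies the standard closed-form projection for a single affine constraint a·u + b >= 0. The function names, the sharpness parameter `kappa`, and the coefficients `a` and `b` (standing in for the Lie-derivative terms of a control-affine system) are assumptions made for this example.

```python
import numpy as np

def lse_barrier(h_vals, kappa=10.0):
    """Smooth under-approximation of min_i h_i:
    h = -(1/kappa) * log(sum_i exp(-kappa * h_i))."""
    h_vals = np.asarray(h_vals, dtype=float)
    m = h_vals.min()
    # Shift by the minimum so the exponentials cannot overflow.
    return m - np.log(np.sum(np.exp(-kappa * (h_vals - m)))) / kappa

def lse_weights(h_vals, kappa=10.0):
    """Softmax weights w_i = dh/dh_i, so grad h = sum_i w_i * grad h_i."""
    h_vals = np.asarray(h_vals, dtype=float)
    z = np.exp(-kappa * (h_vals - h_vals.min()))
    return z / z.sum()

def safe_action(u_rl, a, b):
    """Closed-form minimizer of ||u - u_rl||^2 subject to a @ u + b >= 0
    (hypothetical helper: projection of the RL action onto a half-space)."""
    u_rl = np.asarray(u_rl, dtype=float)
    a = np.asarray(a, dtype=float)
    slack = a @ u_rl + b
    denom = a @ a
    # Already safe, or a == 0 (no control authority): return the action unchanged.
    if slack >= 0.0 or denom == 0.0:
        return u_rl
    # Otherwise move the minimum distance along a to reach the boundary.
    return u_rl - (slack / denom) * a
```

Because the sum of exponentials is at least 1 after the shift, `lse_barrier` never exceeds the smallest h_i, so keeping the combined barrier non-negative keeps every individual constraint satisfied. The projection uses only arithmetic operations, so it can be differentiated directly without a separate optimization solver.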
Findings
Reduces the computational cost of training by avoiding differentiable optimization.
Maintains provable safety guarantees during learning.
Outperforms existing methods relying on differentiable optimization.
Abstract
The safety of training task policies and their subsequent application using reinforcement learning (RL) methods has become a focal point in the field of safe RL. A central challenge in this area remains the establishment of theoretical guarantees for safety during both the learning and deployment processes. Given the successful implementation of Control Barrier Function (CBF)-based safety strategies in a range of control-affine robotic systems, CBF-based safe RL demonstrates significant promise for practical applications in real-world scenarios. However, integrating these two approaches presents several challenges. First, embedding safety optimization within the RL training pipeline requires that the optimization outputs be differentiable with respect to the input parameters, a condition commonly referred to as differentiable optimization, which is non-trivial to solve. Second, the…
