Model-based Chance-Constrained Reinforcement Learning via Separated Proportional-Integral Lagrangian
Baiyu Peng, Jingliang Duan, Jianyu Chen, Shengbo Eben Li, Genjin Xie, Congsheng Zhang, Yang Guan, Yao Mu, Enxin Sun

TL;DR
This paper introduces the separated proportional-integral Lagrangian (SPIL) algorithm for chance-constrained reinforcement learning, effectively reducing oscillations and conservatism in safety-critical applications like autonomous navigation.
Contribution
The paper proposes a novel SPIL algorithm that unifies proportional and integral control for safer, less conservative RL policies under uncertainty, with an integral separation technique and model-based gradient computation.
Findings
Reduces oscillations in RL safety policies.
Decreases conservatism in policy learning.
Successfully applied to real-world robot navigation.
Abstract
Safety is essential for reinforcement learning (RL) applied in the real world. Adding chance constraints (or probabilistic constraints) is a suitable way to enhance RL safety under uncertainty. Existing chance-constrained RL methods like the penalty methods and the Lagrangian methods either exhibit periodic oscillations or learn an over-conservative or unsafe policy. In this paper, we address these shortcomings by proposing a separated proportional-integral Lagrangian (SPIL) algorithm. We first review the constrained policy optimization process from a feedback control perspective, which regards the penalty weight as the control input and the safe probability as the control output. Based on this, the penalty method is formulated as a proportional controller, and the Lagrangian method is formulated as an integral controller. We then unify them and present a proportional-integral…
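The feedback-control view described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SeparatedPIWeight`, the gains `kp` and `ki`, the threshold `sep_threshold`, the cap `weight_max`, and the specific separation rule (freeze the integral whenever the error is large) are all assumptions chosen for illustration. The penalty weight is treated as the control input and the estimated safe probability as the control output; the proportional term reacts to the current constraint violation, as in a penalty method, and the integral term accumulates past violations, as in a Lagrangian multiplier update.

```python
from dataclasses import dataclass


@dataclass
class SeparatedPIWeight:
    """Illustrative PI update for a penalty weight (all names hypothetical)."""

    kp: float = 1.0              # proportional gain (assumed value)
    ki: float = 0.1              # integral gain (assumed value)
    sep_threshold: float = 0.2   # assumed: integrate only when |error| is below this
    weight_max: float = 100.0    # assumed upper bound on the penalty weight
    integral: float = 0.0        # accumulated error, kept non-negative

    def update(self, safe_prob: float, required_prob: float) -> float:
        # Positive error means the policy is less safe than required.
        error = required_prob - safe_prob

        # Assumed separation rule: skip accumulation while the error is large,
        # so the integral term does not wind up and cause overshoot.
        if abs(error) <= self.sep_threshold:
            self.integral = max(0.0, self.integral + error)

        # Proportional term plus integral term, clipped to [0, weight_max].
        raw = self.kp * error + self.ki * self.integral
        return min(max(raw, 0.0), self.weight_max)
```

In a training loop, `update` would be called once per policy iteration with the current estimate of the safe probability, and the returned weight would scale the constraint term in the policy objective. Setting `ki = 0` reduces the sketch to a pure proportional (penalty-style) update, and setting `kp = 0` with a very large `sep_threshold` reduces it to a pure integral (Lagrangian-style) update.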
Taxonomy
Topics: Model Reduction and Neural Networks · Advanced Multi-Objective Optimization Algorithms · Probabilistic and Robust Engineering Design
