Model-based Chance-Constrained Reinforcement Learning via Separated Proportional-Integral Lagrangian

Baiyu Peng; Jingliang Duan; Jianyu Chen; Shengbo Eben Li; Genjin Xie; Congsheng Zhang; Yang Guan; Yao Mu; Enxin Sun

arXiv:2108.11623·cs.LG·August 27, 2021

TL;DR

This paper introduces the separated proportional-integral Lagrangian (SPIL) algorithm for chance-constrained reinforcement learning, effectively reducing oscillations and conservatism in safety-critical applications like autonomous navigation.

Contribution

The paper proposes a novel SPIL algorithm that unifies proportional and integral control for safer, less conservative RL policies under uncertainty, with an integral separation technique and model-based gradient computation.

Findings

01. Reduces oscillations in RL safety policies.

02. Decreases conservatism in policy learning.

03. Successfully applied to real-world robot navigation.

Abstract

Safety is essential for reinforcement learning (RL) applied in the real world. Adding chance constraints (or probabilistic constraints) is a suitable way to enhance RL safety under uncertainty. Existing chance-constrained RL methods, such as penalty methods and Lagrangian methods, either exhibit periodic oscillations or learn an over-conservative or unsafe policy. In this paper, we address these shortcomings by proposing a separated proportional-integral Lagrangian (SPIL) algorithm. We first review the constrained policy optimization process from a feedback control perspective, which regards the penalty weight as the control input and the safe probability as the control output. Based on this, the penalty method is formulated as a proportional controller, and the Lagrangian method is formulated as an integral controller. We then unify them and present a proportional-integral…
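The feedback-control view described in the abstract can be illustrated with a minimal sketch. This is not the paper's update rule: the class name `PIPenalty`, the gains `kp` and `ki`, the threshold `sep_threshold`, and the specific form of integral separation (freeze the integral while the error is large, a classic control-engineering variant) are all assumptions made for illustration. The sketch treats the gap between a target safe probability and the measured safe probability as the error, and returns a non-negative penalty weight.

```python
class PIPenalty:
    """Hypothetical PI-style penalty-weight update with integral separation."""

    def __init__(self, kp, ki, sep_threshold, target_safe_prob):
        self.kp = kp                          # proportional gain (penalty-like term)
        self.ki = ki                          # integral gain (Lagrangian-like term)
        self.sep_threshold = sep_threshold    # assumed separation threshold on |error|
        self.target_safe_prob = target_safe_prob
        self.integral = 0.0                   # accumulated constraint violation

    def update(self, safe_prob):
        # Positive error means the policy is less safe than required.
        error = self.target_safe_prob - safe_prob
        # Integral separation (assumed form): accumulate only when the error
        # is small, so a large early violation cannot wind up the integral.
        if abs(error) <= self.sep_threshold:
            self.integral = max(0.0, self.integral + error)
        # The penalty weight is the PI control signal, clipped at zero.
        return max(0.0, self.kp * error + self.ki * self.integral)


controller = PIPenalty(kp=10.0, ki=1.0, sep_threshold=0.1, target_safe_prob=0.99)
for measured in (0.5, 0.95, 1.0):
    print(measured, controller.update(measured))
```

Setting `ki = 0` reduces this sketch to a purely proportional (penalty-style) update, and setting `kp = 0` with a very large `sep_threshold` reduces it to a purely integral (Lagrangian-style) update, which mirrors the correspondence the abstract draws.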

Taxonomy

Topics: Model Reduction and Neural Networks · Advanced Multi-Objective Optimization Algorithms · Probabilistic and Robust Engineering Design