Predictive Lagrangian Optimization for Constrained Reinforcement Learning

Tianqi Zhang; Puzhen Yuan; Guojian Zhan; Ziyu Lin; Yao Lyu; Zhenzhi Qin; Jingliang Duan; Liping Zhang; Shengbo Eben Li

arXiv:2501.15217·cs.LG·January 28, 2025

TL;DR

This paper introduces a novel framework connecting constrained reinforcement learning with feedback control systems, leading to the development of the predictive Lagrangian optimization algorithm that outperforms traditional PID-based methods.

Contribution

The paper establishes a general equivalence framework between constrained RL and feedback control, and proposes the predictive Lagrangian optimization (PLO) algorithm, which uses model predictive control to improve performance.

Findings

01. PLO achieves a feasible region up to 7.2% larger than PID-based methods.

02. PLO maintains an average reward comparable to existing methods.

03. The framework unifies various feedback controllers for constrained RL.

Abstract

Constrained optimization is widely used in reinforcement learning to address complex control tasks. From the perspective of dynamic systems, iteratively solving a constrained optimization problem can be framed as the temporal evolution of a feedback control system. Classical constrained optimization methods, such as penalty and Lagrangian approaches, inherently use proportional and integral feedback controllers. In this paper, we propose a more generic equivalence framework that connects constrained optimization with feedback control systems, with the aim of developing more effective constrained RL algorithms. First, we define each step of the system evolution as determining the Lagrange multiplier by solving a multiplier feedback optimal control problem (MFOCP). In this problem, the control input is the multiplier, the state is the policy parameters, the dynamics is…
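The abstract's remark that penalty and Lagrangian methods act as proportional and integral controllers can be illustrated on a toy problem. The sketch below is not the paper's PLO algorithm or its code. It assumes a hypothetical one-dimensional problem (minimize (x - 2)^2 subject to x - 1 <= 0), and the function name `solve` and the gains `kp`, `ki`, and `lr` are illustrative choices. The constraint violation plays the role of the error signal, and the multiplier is the control input.

```python
def solve(kp=0.0, ki=0.0, steps=2000, lr=0.05):
    """Toy primal-dual loop (illustrative assumption, not the paper's method).

    Problem: minimize f(x) = (x - 2)^2  subject to  g(x) = x - 1 <= 0.
    The multiplier is produced by a feedback law on the violation g(x):
      kp > 0 only -> quadratic-penalty-style (proportional) feedback
      ki > 0 only -> classical Lagrangian dual ascent (integral) feedback
    """
    x = 0.0          # "state": the decision variable (policy parameters in RL)
    integral = 0.0   # accumulated violation, kept non-negative
    lam = 0.0        # "control input": the Lagrange multiplier
    for _ in range(steps):
        violation = x - 1.0  # g(x); positive means the constraint is violated
        # Integral term: projected dual ascent on the multiplier.
        integral = max(0.0, integral + ki * violation)
        # Proportional term reacts only to the current violation.
        lam = max(0.0, kp * violation + integral)
        # Gradient step on the Lagrangian f(x) + lam * g(x) with respect to x.
        grad = 2.0 * (x - 2.0) + lam
        x -= lr * grad
    return x, lam


if __name__ == "__main__":
    print("proportional only:", solve(kp=10.0))
    print("integral only:    ", solve(ki=0.1))
```

With these assumed gains, the proportional-only run settles near x = 7/6, which is slightly infeasible: a proportional controller leaves a steady-state error. The integral-only run settles at the constrained optimum x = 1 with a multiplier near 2, because the accumulated violation keeps adjusting the multiplier until the error vanishes.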

Taxonomy

Topics: Reinforcement Learning in Robotics · Scheduling and Optimization Algorithms · Traffic control and management