Policy Gradient Methods for the Cost-Constrained LQR: Strong Duality and Global Convergence

Feiran Zhao; Keyou You

arXiv:2406.03734·math.OC·June 7, 2024·1 cites

Open Access

TL;DR

This paper develops a policy gradient primal-dual method for cost-constrained LQR problems, a case study for safety-constrained reinforcement learning in continuous control. It proves strong duality, gives convergence guarantees, and validates the theory in simulations.

Contribution

The paper introduces a primal-dual policy gradient method for the cost-constrained LQR and proves strong duality and convergence despite the problem's non-convexity, with simulations supporting the theory.

Findings

1. Strong duality holds for the cost-constrained LQR problem.

2. The proposed method converges to the optimal solution.

3. Simulations confirm theoretical convergence and effectiveness.

Abstract

In safety-critical applications, reinforcement learning (RL) needs to consider safety constraints. However, a theoretical understanding of constrained RL for continuous control is largely absent. As a case study, this paper presents a cost-constrained LQR formulation, where a number of LQR costs with user-defined penalty matrices are subject to constraints. To solve it, we propose a policy gradient (PG) primal-dual method to find an optimal state feedback gain. Despite the non-convexity of the cost-constrained LQR problem, we provide a constructive proof for strong duality and a geometric interpretation of an optimal multiplier set. By proving that the concave dual function is Lipschitz smooth, we further provide convergence guarantees for the PG primal-dual method. Finally, we perform simulations to validate our theoretical findings.
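The primal-dual idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm, and every modeling choice below is an assumption made for illustration: a scalar discrete-time system x_next = a*x + b*u with feedback u = -k*x, unit initial state variance, exact model-based gradients, a single constraint J1(k) <= c, and hypothetical function names and numbers. The sketch relies on the fact that the Lagrangian J0(k) + lam*(J1(k) - c) is, up to a constant, an LQR cost with weights q0 + lam*q1 and r0 + lam*r1, so the primal step is plain policy gradient descent on one LQR cost and the dual step is projected gradient ascent on the multiplier.

```python
def lqr_cost(k, a, b, q, r):
    """Cost of u = -k*x on x_next = a*x + b*u, summed over time, E[x0^2] = 1."""
    cl = a - b * k  # closed-loop pole
    if abs(cl) >= 1.0:
        return float("inf")  # gain is not stabilizing
    return (q + r * k * k) / (1.0 - cl * cl)


def lqr_grad(k, a, b, q, r):
    """Exact gradient dJ/dk = 2*((r + b^2*p)*k - a*b*p)*sigma (stabilizing k)."""
    cl = a - b * k
    p = lqr_cost(k, a, b, q, r)      # value-function coefficient
    sigma = 1.0 / (1.0 - cl * cl)    # accumulated state variance
    return 2.0 * ((r + b * b * p) * k - a * b * p) * sigma


def primal_step(k, a, b, q, r, iters=500, tol=1e-6):
    """Policy gradient descent with Armijo backtracking on one LQR cost."""
    for _ in range(iters):
        g = lqr_grad(k, a, b, q, r)
        if abs(g) < tol:
            break
        t, j = 1.0, lqr_cost(k, a, b, q, r)
        for _ in range(60):
            # Destabilizing trial gains have infinite cost and are rejected.
            if lqr_cost(k - t * g, a, b, q, r) <= j - 0.5 * t * g * g:
                break
            t *= 0.5
        else:
            return k  # no acceptable step found; keep the current gain
        k -= t * g
    return k


def primal_dual(a, b, q0, r0, q1, r1, c, k=0.0, lam=0.0, eta=1.0, iters=200):
    """Minimize J0(k) subject to J1(k) <= c; k must start stabilizing."""
    for _ in range(iters):
        # Primal: minimize the Lagrangian, itself an LQR cost in k.
        k = primal_step(k, a, b, q0 + lam * q1, r0 + lam * r1)
        # Dual: projected gradient ascent on the constraint violation.
        lam = max(0.0, lam + eta * (lqr_cost(k, a, b, q1, r1) - c))
    return k, lam


# Illustrative numbers: regulate the state (J0) under a control-energy budget (J1).
k, lam = primal_dual(a=0.9, b=1.0, q0=1.0, r0=0.1, q1=0.0, r1=1.0, c=0.3)
print("gain:", k, "multiplier:", lam)
print("constrained cost:", lqr_cost(k, 0.9, 1.0, 0.0, 1.0))
```

In this example the unconstrained optimal gain violates the budget, so the multiplier settles at a positive value and the constraint is active at the returned gain. The dual step size `eta` and the iteration counts are hand-picked for this instance rather than derived from a smoothness constant.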

Taxonomy

Topics: Advanced Control Systems Optimization · Stochastic processes and financial applications · Stability and Control of Uncertain Systems