Policy Gradient Adaptive Control for the LQR: Indirect and Direct Approaches

Feiran Zhao; Alessandro Chiuso; Florian Dörfler

arXiv:2505.03706·math.OC·June 16, 2025

TL;DR

This paper introduces policy gradient adaptive control (PGAC) for the LQR: methods that use online closed-loop data to improve the control policy while maintaining stability. It covers both indirect and direct approaches, along with natural gradient and Gauss-Newton variants of the policy update.

Contribution

The paper develops a unified framework for indirect and direct PGAC for the LQR, incorporates natural gradient and Gauss-Newton methods, and provides stability, convergence, and robustness guarantees.

Findings

01 Proves stability and convergence of PGAC methods.

02 Demonstrates robustness and efficiency through simulations.

03 Introduces regularization to handle noise uncertainty.

Abstract

Motivated by recent advances in reinforcement learning and direct data-driven control, we propose policy gradient adaptive control (PGAC) for the linear quadratic regulator (LQR), which uses online closed-loop data to improve the control policy while maintaining stability. Our method adaptively updates the policy in feedback by descending the gradient of the LQR cost. It is categorized as indirect when gradients are computed via an estimated model, and as direct when gradients are derived from data using a sample covariance parameterization. Beyond the vanilla gradient, we also showcase the merits of the natural gradient and Gauss-Newton methods for the policy update. Notably, natural gradient descent bridges the indirect and direct PGAC, and the Gauss-Newton method of the indirect PGAC leads to an adaptive version of the celebrated Hewer's algorithm. To account for the uncertainty from…
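
As a rough illustration of the indirect route described in the abstract, the Python sketch below estimates (A, B) by least squares, evaluates the model-based LQR policy gradient through two Lyapunov equations, and applies a Gauss-Newton update in the form of Hewer's iteration. It is not the authors' implementation. The convention u = Kx, the cost C(K) = Tr(P_K), the example system, the step size, and every function name (dlyap, estimate_model, lqr_cost_and_gradient, hewer_step) are assumptions made for illustration, and the data here are a single noise-free batch rather than an online stream.

```python
import numpy as np

def dlyap(M, W):
    """Solve X = M X M^T + W by vectorization (adequate for small systems)."""
    n = M.shape[0]
    x = np.linalg.solve(np.eye(n * n) - np.kron(M, M), W.reshape(-1))
    return x.reshape(n, n)

def estimate_model(X0, U0, X1):
    """Least-squares fit of X1 ~ A X0 + B U0; returns (A_hat, B_hat)."""
    m = U0.shape[0]
    BA = X1 @ np.linalg.pinv(np.vstack([U0, X0]))  # stacked as [B, A]
    return BA[:, m:], BA[:, :m]

def lqr_cost_and_gradient(K, A, B, Q, R):
    """Cost Tr(P_K) and its gradient for u = K x (assumed convention)."""
    Acl = A + B @ K
    P = dlyap(Acl.T, Q + K.T @ R @ K)        # P = Acl^T P Acl + Q + K^T R K
    Sigma = dlyap(Acl, np.eye(A.shape[0]))   # Sigma = Acl Sigma Acl^T + I
    E = (R + B.T @ P @ B) @ K + B.T @ P @ A
    return np.trace(P), 2.0 * E @ Sigma

def hewer_step(K, A, B, Q, R):
    """One policy-iteration (Hewer) update from a stabilizing gain K."""
    Acl = A + B @ K
    P = dlyap(Acl.T, Q + K.T @ R @ K)
    return -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Hypothetical example system (open-loop stable, so K = 0 is stabilizing).
rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

# Noise-free batch of data, so the least-squares estimate is exact.
T = 20
X0 = rng.standard_normal((2, T))
U0 = rng.standard_normal((1, T))
X1 = A_true @ X0 + B_true @ U0
A_hat, B_hat = estimate_model(X0, U0, X1)

K0 = np.zeros((1, 2))
cost0, _ = lqr_cost_and_gradient(K0, A_hat, B_hat, Q, R)

# Vanilla policy gradient descent on the estimated model (small fixed step).
K_gd = K0.copy()
for _ in range(200):
    _, grad = lqr_cost_and_gradient(K_gd, A_hat, B_hat, Q, R)
    K_gd = K_gd - 1e-4 * grad
cost_gd, _ = lqr_cost_and_gradient(K_gd, A_hat, B_hat, Q, R)

# Gauss-Newton / Hewer iteration on the same estimated model.
K_gn = K0.copy()
for _ in range(20):
    K_gn = hewer_step(K_gn, A_hat, B_hat, Q, R)
cost_gn, grad_gn = lqr_cost_and_gradient(K_gn, A_hat, B_hat, Q, R)

print("cost at K0:", cost0)
print("cost after gradient descent:", cost_gd)
print("cost after Hewer iteration:", cost_gn)
```

An adaptive variant would presumably re-run the estimate as new closed-loop samples arrive and take one policy update per estimate; that loop, the direct (sample covariance) parameterization, and any handling of noise are left out of this sketch.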

Code & Models

Repositories

feiran-zhao-eth/policy-gradient-adaptive-control (Official)

Taxonomy

Topics: Advanced Control Systems Optimization