LQR through the Lens of First Order Methods: Discrete-time Case
Jingjing Bu, Afshin Mesbahi, Maryam Fazel, and Mehran Mesbahi

TL;DR
This paper analyzes the LQR problem using first order optimization methods, showing that the LQR cost is smooth and coercive over the feedback gains, that the associated flows are exponentially stable, and that the discretized algorithms come with convergence guarantees.
Contribution
It introduces a first order optimization perspective to LQR, analyzing gradient, natural gradient, and quasi-Newton flows, with convergence guarantees and stepsize criteria.
Findings
The gradient, natural gradient, and quasi-Newton flows admit unique solutions and are exponentially stable.
Gradient descent and natural gradient descent converge linearly with proper stepsizes.
Quasi-Newton iteration achieves quadratic convergence, recovering the Hewer algorithm.
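The three iterations above can be sketched numerically. What follows is a minimal illustration, not the authors' implementation. It assumes the convention u = -K x with closed loop A - B K and the cost f(K) = trace(X_K Sigma); the paper may use a different sign convention. The helper names (`dlyap`, `lqr_terms`, `quasi_newton_step`), the 2-state example system, and the stepsize 1e-4 are all hypothetical choices made for this sketch.

```python
import numpy as np

def dlyap(M, W):
    """Solve X = M X M^T + W by vectorization (adequate for small systems)."""
    n = M.shape[0]
    vec_x = np.linalg.solve(np.eye(n * n) - np.kron(M, M), W.reshape(-1))
    return vec_x.reshape(n, n)

def lqr_terms(K, A, B, Q, R, Sigma):
    """Return X (cost matrix) and Y (state correlation) for a stabilizing K."""
    A_K = A - B @ K                      # assumed convention: u = -K x
    X = dlyap(A_K.T, Q + K.T @ R @ K)    # X = A_K^T X A_K + Q + K^T R K
    Y = dlyap(A_K, Sigma)                # Y = A_K Y A_K^T + Sigma
    return X, Y

def lqr_cost(K, A, B, Q, R, Sigma):
    X, _ = lqr_terms(K, A, B, Q, R, Sigma)
    return np.trace(X @ Sigma)

def lqr_gradient(K, A, B, Q, R, Sigma):
    X, Y = lqr_terms(K, A, B, Q, R, Sigma)
    E = (R + B.T @ X @ B) @ K - B.T @ X @ A
    return 2 * E @ Y

def quasi_newton_step(K, A, B, Q, R, Sigma, eta=0.5):
    """K - eta * (R + B^T X B)^{-1} grad Y^{-1}; eta = 1/2 is the Hewer update."""
    X, _ = lqr_terms(K, A, B, Q, R, Sigma)
    H = R + B.T @ X @ B
    E = H @ K - B.T @ X @ A              # grad Y^{-1} equals 2 E
    return K - eta * np.linalg.solve(H, 2 * E)

# Hypothetical example: A is already stable, so K0 = 0 is a stabilizing gain.
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[0.0], [1.0]])
Q, R, Sigma = np.eye(2), np.eye(1), np.eye(2)
K0 = np.zeros((1, 2))

# One plain gradient descent step with a small, conservatively chosen stepsize.
K_gd = K0 - 1e-4 * lqr_gradient(K0, A, B, Q, R, Sigma)

# Quasi-Newton iteration with stepsize 1/2.
K_star = K0.copy()
for _ in range(20):
    K_star = quasi_newton_step(K_star, A, B, Q, R, Sigma)
```

With stepsize 1/2 the quasi-Newton update reduces algebraically to K = (R + B^T X B)^{-1} B^T X A, which is the Hewer iteration named in the findings. Both Lyapunov solves require the current gain to be stabilizing, which is why stepsize criteria matter for the gradient-based variants.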
Abstract
We consider the Linear-Quadratic-Regulator (LQR) problem in terms of optimizing a real-valued matrix function over the set of feedback gains. Such a setup facilitates examining the implications of a natural initial-state independent formulation of LQR in designing first order algorithms. We show that this cost function is smooth and coercive, and provide an alternate means of noting its gradient dominated property. In the process, we provide a number of analytic observations on the LQR cost when directly analyzed in terms of the feedback gain. We then examine three types of well-posed flows for LQR: gradient flow, natural gradient flow and the quasi-Newton flow. The coercive property suggests that these flows admit unique solutions, while the gradient dominated property indicates that the corresponding Lyapunov functionals decay at an exponential rate; we also prove that these flows are…
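The step from gradient domination to exponential decay can be made explicit for the gradient flow. The derivation below is a sketch under assumptions: K_* denotes the optimal gain, and c > 0 is an unspecified gradient-domination constant valid on the sublevel set containing the trajectory. Neither symbol is taken from the abstract.

```latex
% Gradient flow and Lyapunov functional (K_* and c are assumed symbols)
\dot K(t) = -\nabla f(K(t)), \qquad V(K) = f(K) - f(K_*).

% Gradient domination (assumed form):
f(K) - f(K_*) \le c \,\|\nabla f(K)\|_F^2 .

% Along the flow:
\dot V = \langle \nabla f(K), \dot K \rangle
       = -\|\nabla f(K)\|_F^2
       \le -\tfrac{1}{c}\, V
\quad\Longrightarrow\quad
V(K(t)) \le e^{-t/c}\, V(K(0)).
```

Coercivity keeps the trajectory inside the sublevel set of its initial cost, and hence inside the set of stabilizing gains, so a single constant c can be used for all t ≥ 0.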
Taxonomy
Topics: Adaptive Dynamic Programming Control · Control Systems and Identification · Stability and Control of Uncertain Systems
