On the Optimization Landscape of Dynamic Output Feedback Linear Quadratic Control

Jingliang Duan; Wenhan Cao; Yang Zheng; Lin Zhao

arXiv:2201.09598 · math.OC · November 2, 2023

TL;DR

This paper analyzes the optimization landscape of dynamic output-feedback linear quadratic regulation (dLQR), offering theoretical insight into policy gradient methods and establishing conditions for the optimality of those methods and for the equivalence of dLQR with LQG control.

Contribution

It characterizes the optimization landscape of dLQR, proves the uniqueness of stationary points under observability, and links dLQR with LQG control for stochastic systems.

Findings

01 Uniqueness of stationary points for observable dLQR.

02 Conditions under which dLQR and LQG are equivalent.

03 Insights into policy gradient algorithm design for partially observed systems.

Abstract

The convergence of policy gradient algorithms hinges on the optimization landscape of the underlying optimal control problem. Theoretical insights into these algorithms can often be acquired from analyzing those of linear quadratic control. However, most of the existing literature only considers the optimization landscape for static full-state or output feedback policies (controllers). We investigate the more challenging case of dynamic output-feedback policies for linear quadratic regulation (abbreviated as dLQR), which is prevalent in practice but has a rather complicated optimization landscape. We first show how the dLQR cost varies with the coordinate transformation of the dynamic controller and then derive the optimal transformation for a given observable stabilizing controller. One of our core results is the uniqueness of the stationary point of dLQR when it is observable, which…
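The dependence of the dLQR cost on the controller's coordinates can be illustrated with a small numerical sketch. The formulation below is an assumption for illustration, not the paper's exact setup: a discrete-time plant x_{t+1} = A x_t + B u_t, y_t = C x_t, a controller ξ_{t+1} = A_K ξ_t + B_K y_t, u_t = C_K ξ_t, and a cost equal to the expected infinite-horizon sum of x'Qx + u'Ru for a fixed second moment X0 of the joint initial state (x_0, ξ_0). All numbers, function names, and the transformation T are hypothetical.

```python
import numpy as np

def transform(AK, BK, CK, T):
    """Similarity transformation of the controller state (xi -> T xi)."""
    Tinv = np.linalg.inv(T)
    return T @ AK @ Tinv, T @ BK, CK @ Tinv

def dlqr_cost(A, B, C, Q, R, AK, BK, CK, X0):
    """Cost trace(P X0), where P solves P = Acl' P Acl + Qcl (assumed setup)."""
    nx, nk = A.shape[0], AK.shape[0]
    # Closed-loop dynamics of the stacked state z = (x, xi).
    Acl = np.block([[A, B @ CK], [BK @ C, AK]])
    if np.max(np.abs(np.linalg.eigvals(Acl))) >= 1.0:
        raise ValueError("controller is not stabilizing")
    # Stage cost x'Qx + u'Ru with u = CK xi, written in terms of z.
    Qcl = np.block([[Q, np.zeros((nx, nk))],
                    [np.zeros((nk, nx)), CK.T @ R @ CK]])
    # Solve the discrete Lyapunov equation by vectorization.
    n = nx + nk
    lhs = np.eye(n * n) - np.kron(Acl.T, Acl.T)
    P = np.linalg.solve(lhs, Qcl.reshape(-1)).reshape(n, n)
    return float(np.trace(P @ X0))

# Hypothetical scalar plant (open-loop unstable) and stabilizing controller.
A, B, C = np.array([[1.1]]), np.array([[1.0]]), np.array([[1.0]])
Q, R = np.array([[1.0]]), np.array([[1.0]])
AK, BK, CK = np.array([[-0.5]]), np.array([[1.0]]), np.array([[-0.8]])

# Same input-output behavior, different controller coordinates.
T = np.array([[2.0]])
AK_T, BK_T, CK_T = transform(AK, BK, CK, T)

# Case 1: controller state also starts randomly (X0 = identity).
X0_full = np.eye(2)
J_full = dlqr_cost(A, B, C, Q, R, AK, BK, CK, X0_full)
J_full_T = dlqr_cost(A, B, C, Q, R, AK_T, BK_T, CK_T, X0_full)

# Case 2: controller state starts at zero, only x_0 is random.
X0_zero = np.diag([1.0, 0.0])
J_zero = dlqr_cost(A, B, C, Q, R, AK, BK, CK, X0_zero)
J_zero_T = dlqr_cost(A, B, C, Q, R, AK_T, BK_T, CK_T, X0_zero)

print("random xi_0:", J_full, J_full_T)
print("zero xi_0:  ", J_zero, J_zero_T)
```

Under these assumptions the two controllers realize the same input-output map, yet the cost differs when the controller's initial state has a fixed nonzero covariance, and coincides when the controller starts at zero. This is only a toy instance of the coordinate dependence the abstract describes.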
