Policy Optimization in the Linear Quadratic Gaussian Problem: A Frequency Domain Perspective
Haoran Li, Xun Li, Yuan-Hua Ni, Xuebo Zhang

TL;DR
This paper introduces a frequency-domain approach to certifying global optimality in the non-convex LQG control problem. The approach extends to infinite-dimensional controller spaces, includes a data-driven variant, and is supported by theoretical guarantees and numerical validation.
Contribution
The paper derives a necessary and sufficient condition for the global optimality of stationary points in parameterized LQG problems, and develops a gradient-based algorithm with global convergence in an infinite-dimensional space.
Findings
A tractable optimality certificate based on controllability and observability.
A gradient algorithm with proven global convergence.
Numerical experiments validating theoretical results.
Abstract
The Linear Quadratic Gaussian (LQG) problem is a classic and widely studied model in optimal control, providing a fundamental framework for designing controllers for linear systems subject to process and observation noise. In recent years, researchers have increasingly focused on directly parameterizing dynamic controllers and optimizing the LQG cost over the resulting parameterized set. However, this parameterization typically gives rise to a highly non-convex optimization landscape. To our knowledge, there is currently no general method for certifying the global optimality of candidate controller parameters in this setting. In this work, we address these gaps with the following contributions. First, we derive a necessary and sufficient condition for the global optimality of stationary points in parameterized LQG problems. This condition…
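To make the non-convexity mentioned in the abstract concrete, here is a minimal state-space sketch in Python. It is not the paper's frequency-domain method. It assumes a scalar continuous-time plant dx = (a x + b u) dt + dw, y = c x + v, and a first-order controller dξ = (a_K ξ + b_K y) dt, u = c_K ξ. The plant values and the helper names `lqg_cost` and `riccati_controller` are hypothetical choices made for illustration only.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def _replace_col(m, col, rhs):
    """Copy of m with column `col` replaced by rhs (for Cramer's rule)."""
    return [[rhs[i] if j == col else m[i][j] for j in range(3)]
            for i in range(3)]


def lqg_cost(a_k, b_k, c_k, a=1.0, b=1.0, c=1.0, q=1.0, r=1.0, w=1.0, v=1.0):
    """Stationary cost E[q x^2 + r u^2] for controller (a_k, b_k, c_k).

    Returns math.inf when the closed loop is not stable.
    All plant and weight defaults are illustrative assumptions.
    """
    # Closed-loop matrix for the stacked state (x, xi).
    a11, a12, a21, a22 = a, b * c_k, b_k * c, a_k
    # A 2x2 matrix is Hurwitz iff its trace is negative and determinant positive.
    if not (a11 + a22 < 0 and a11 * a22 - a12 * a21 > 0):
        return math.inf
    # Lyapunov equation A S + S A^T + W = 0 with W = diag(w, b_k^2 v),
    # written as a 3x3 linear system in the unknowns (s11, s12, s22).
    m = [[2 * a11, 2 * a12, 0.0],
         [a21, a11 + a22, a12],
         [0.0, 2 * a21, 2 * a22]]
    rhs = [-w, 0.0, -(b_k ** 2) * v]
    d = det3(m)
    s11 = det3(_replace_col(m, 0, rhs)) / d  # Var(x)
    s22 = det3(_replace_col(m, 2, rhs)) / d  # Var(xi)
    return q * s11 + r * c_k ** 2 * s22


def riccati_controller(a=1.0, b=1.0, c=1.0, q=1.0, r=1.0, w=1.0, v=1.0):
    """Classical LQG solution for the scalar plant via two scalar Riccati equations."""
    p = r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)  # control Riccati
    s = v * (a + math.sqrt(a * a + c * c * w / v)) / (c * c)  # filter Riccati
    k = b * p / r   # state-feedback gain
    l = s * c / v   # Kalman gain
    j_star = p * w + s * k * k * r  # optimal cost
    return (a - b * k - l * c, l, -k), j_star


if __name__ == "__main__":
    (a_k, b_k, c_k), j_star = riccati_controller()
    print("optimal cost from Riccati equations:", j_star)
    print("cost at Riccati controller:         ", lqg_cost(a_k, b_k, c_k))
    print("cost at sign-flipped controller:    ", lqg_cost(a_k, -b_k, -c_k))
    print("cost at their midpoint:             ", lqg_cost(a_k, 0.0, 0.0))
```

In this sketch the controllers (a_K, b_K, c_K) and (a_K, -b_K, -c_K) realize the same transfer function and so attain the same cost, yet their midpoint (a_K, 0, 0) applies no feedback to the unstable plant and has infinite cost. Two global minimizers separated by an infeasible point is one simple way to see that the parameterized cost cannot be convex.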
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics
