Model-Free Output Feedback Stabilization via Policy Gradient Methods
Ankang Zhang, Ming Chi, Xiaoling Wang, Lintao Ye

TL;DR
This paper develops a model-free policy gradient approach for stabilizing partially observable linear systems using output feedback, extending RL methods to more realistic control scenarios.
Contribution
It introduces a zeroth-order policy gradient algorithm for output feedback stabilization in unknown linear systems, with explicit sample complexity analysis.
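To make the term "zeroth-order policy gradient" concrete, here is a minimal sketch of a generic two-point zeroth-order gradient estimate for a static output feedback gain K (control law u = -K y). It is not the paper's algorithm: the toy matrices `A`, `B`, `C`, the initial state `x0`, the finite-horizon quadratic cost, the smoothing radius, and the step size are all illustrative assumptions. The estimator only queries rollout costs and never reads the system matrices, which is the sense in which such methods are model-free; the simulator below merely stands in for the unknown system.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def rollout_cost(K, A, B, C, x0, horizon):
    """Finite-horizon cost sum_t (|y_t|^2 + |u_t|^2) under u_t = -K y_t."""
    x, total = list(x0), 0.0
    for _ in range(horizon):
        y = matvec(C, x)                  # measured output (not the full state)
        u = [-v for v in matvec(K, y)]    # static output feedback
        total += sum(v * v for v in y) + sum(v * v for v in u)
        x = [a + b for a, b in zip(matvec(A, x), matvec(B, u))]
    return total


def zeroth_order_gradient(cost, K, radius, num_samples, rng):
    """Two-point estimate of the gradient of cost at K from cost queries only."""
    rows, cols = len(K), len(K[0])
    dim = rows * cols
    grad = [[0.0] * cols for _ in range(rows)]
    for _ in range(num_samples):
        # Draw a direction uniformly from the unit sphere (Frobenius norm 1).
        U = [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]
        norm = math.sqrt(sum(u * u for row in U for u in row))
        U = [[u / norm for u in row] for row in U]
        K_plus = [[k + radius * u for k, u in zip(kr, ur)] for kr, ur in zip(K, U)]
        K_minus = [[k - radius * u for k, u in zip(kr, ur)] for kr, ur in zip(K, U)]
        scale = dim * (cost(K_plus) - cost(K_minus)) / (2.0 * radius * num_samples)
        for i in range(rows):
            for j in range(cols):
                grad[i][j] += scale * U[i][j]
    return grad


# Hypothetical toy system (assumption): open-loop unstable, only x[0] is measured.
A = [[1.1, 0.3], [0.0, 0.5]]
B = [[1.0], [0.0]]
C = [[1.0, 0.0]]
x0 = [1.0, 1.0]


def cost(K):
    return rollout_cost(K, A, B, C, x0, horizon=20)


rng = random.Random(0)
K0 = [[0.0]]                                   # start from zero gain
grad = zeroth_order_gradient(cost, K0, radius=0.01, num_samples=8, rng=rng)
step = 1e-5                                    # illustrative step size
K1 = [[K0[0][0] - step * grad[0][0]]]          # one gradient descent step
print(cost(K0), cost(K1))
```

Because the gain in this toy example is a single scalar, the estimate reduces to a central finite difference; for larger gain matrices it is a noisy estimate averaged over random directions, and the number of samples and rollouts needed is the kind of quantity a sample complexity analysis bounds.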
Findings
The algorithm converges to stabilizing output feedback policies.
Explicit sample complexity bounds are derived.
Numerical experiments validate the approach.
Abstract
Stabilizing a dynamical system is a fundamental problem that serves as a cornerstone for many complex tasks in the field of control systems. The problem becomes challenging when the system model is unknown. Among the reinforcement learning (RL) algorithms that have been successfully applied to problems involving unknown linear dynamical systems, the policy gradient (PG) method stands out for its ease of implementation and its ability to solve the problem in a model-free manner. However, most existing works on PG methods for unknown linear dynamical systems assume full-state feedback. In this paper, we take a step towards model-free learning for partially observable linear dynamical systems with output feedback and focus on the fundamental stabilization problem of the system. We propose an algorithmic framework that stretches the boundary of PG methods to the problem without…
Taxonomy
Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Model Reduction and Neural Networks
