Output-Feedback Stabilizing Policy Iteration for Convergence Assurance of Unknown Discrete-Time Systems with Unmeasurable States
Dongdong Li, Jiuxiang Dong

TL;DR
This paper introduces a data-driven output-feedback policy iteration method that guarantees stability and convergence for unknown linear discrete-time systems with unmeasurable states, using only input-output data.
Contribution
It develops a novel output-feedback stabilizing policy iteration framework that ensures convergence without requiring measured states or an initial stabilizing policy.
Findings
The method guarantees stability during learning.
It successfully learns stabilizing policies solely from input-output data.
Simulation results validate the effectiveness of the approach.
Abstract
This note proposes a data-driven output-feedback stabilizing policy iteration for unknown linear discrete-time systems with unmeasurable states. Existing policy iteration methods for optimal control must start from a stabilizing control policy, which is particularly challenging to obtain for unknown systems, especially when states are unavailable. In such cases, it is more difficult to guarantee stability and convergence performance. To address this problem, an output-feedback stabilizing policy iteration framework is developed to learn closed-loop stabilizing control policies while ensuring convergence performance. Specifically, cumulative scalar parameters are introduced to compress the original system to a stable scale. Then, by integrating modified policy iteration with parameter update rules, the system is gradually amplified/restored to the original system while preserving…
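The compress-then-restore idea in the abstract can be illustrated with a deliberately simplified sketch. This is not the authors' algorithm: it assumes the model (A, B) is known and the state is measured, whereas the paper works from input-output data only. The scaling factor `a`, its update rule, the example matrices, and all function names are assumptions made for illustration. The sketch scales the system by `a` so that the zero policy is stabilizing, runs standard model-based policy iteration on the scaled system, and raises `a` toward 1 whenever the current gain permits.

```python
import numpy as np

def spectral_radius(M):
    # Largest eigenvalue magnitude of M.
    return max(abs(np.linalg.eigvals(M)))

def evaluate_policy(A, B, K, Q, R, a):
    # Solve P = Q + K'RK + M'PM with M = a(A - BK) by vectorization.
    # Requires the scaled closed loop M to have spectral radius < 1.
    n = A.shape[0]
    M = a * (A - B @ K)
    W = Q + K.T @ R @ K
    vec_p = np.linalg.solve(np.eye(n * n) - np.kron(M.T, M.T), W.reshape(-1))
    P = vec_p.reshape(n, n)
    return 0.5 * (P + P.T)  # symmetrize against round-off

def improve_policy(A, B, P, R, a):
    # Greedy gain for the scaled system (aA, aB).
    return np.linalg.solve(R + a**2 * B.T @ P @ B, a**2 * B.T @ P @ A)

def scaled_policy_iteration(A, B, Q, R, iters=200):
    n, m = B.shape
    # Hypothetical initialization: shrink the system until K = 0 is stabilizing.
    a = min(1.0, 0.9 / spectral_radius(A))
    K = np.zeros((m, n))
    for _ in range(iters):
        P = evaluate_policy(A, B, K, Q, R, a)
        K = improve_policy(A, B, P, R, a)
        rho = spectral_radius(A - B @ K)
        # Assumed update rule: restore the original scale as soon as K
        # stabilizes it; otherwise move halfway toward the largest scale
        # that K still stabilizes.
        a = 1.0 if rho < 1.0 else 0.5 * (a + 1.0 / rho)
    return K, a

# Illustrative unstable, controllable system (not taken from the paper).
A = np.array([[1.2, 0.5], [0.0, 1.1]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)

K, a = scaled_policy_iteration(A, B, Q, R)
print("final scale:", a)
print("closed-loop spectral radius:", spectral_radius(A - B @ K))
```

The halfway update keeps the current gain stabilizing for the newly scaled system, so each policy evaluation step stays well posed. No initial stabilizing gain for the original system is ever supplied.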
Taxonomy
Topics: Adaptive Dynamic Programming Control · Stability and Control of Uncertain Systems · Advanced Control Systems Optimization
