A Q-learning algorithm for discrete-time linear-quadratic control with random parameters of unknown distribution: convergence and stabilization
Kai Du, Qingxin Meng, and Fu Zhang

TL;DR
This paper introduces a Q-learning-based algorithm for discrete-time linear-quadratic control problems with random parameters, and proves convergence and system stabilization without prior statistical knowledge of the parameters.
Contribution
It develops an online iterative Q-learning algorithm for systems with unknown parameter distributions, establishing convergence and stabilization results.
Findings
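The learning sequence converges exactly when the control problem is well-posed, which is in turn equivalent to solvability of the algebraic Riccati equation.
The adaptive feedback control built from the learning sequence stabilizes the system when the problem is well-posed.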
Numerical examples validate theoretical results.
Abstract
This paper studies an infinite horizon optimal control problem for discrete-time linear systems and quadratic criteria, both with random parameters which are independent and identically distributed with respect to time. A classical approach is to solve an algebraic Riccati equation that involves mathematical expectations and requires certain statistical information of the parameters. In this paper, we propose an online iterative algorithm in the spirit of Q-learning for the situation where only one random sample of parameters emerges at each time step. The first theorem proves the equivalence of three properties: the convergence of the learning sequence, the well-posedness of the control problem, and the solvability of the algebraic Riccati equation. The second theorem shows that the adaptive feedback control in terms of the learning sequence stabilizes the system as long as the control…
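The online iteration described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact update rule: it assumes dynamics x_{k+1} = A_k x_k + B_k u_k, a stage cost [x; u]' N_k [x; u], a quadratic Q-function [x; u]' H [x; u], a step size of 1/(k+1), and an identity initial guess. The names `q_learning_lq`, `schur_value`, `greedy_gain`, and `sample`, as well as the scalar example values, are hypothetical and chosen only for illustration.

```python
import numpy as np

def schur_value(H, n):
    """Value matrix P = H_xx - H_xu H_uu^{-1} H_ux, i.e. min over u of [x;u]' H [x;u]."""
    Hxx, Hxu, Huu = H[:n, :n], H[:n, n:], H[n:, n:]
    return Hxx - Hxu @ np.linalg.solve(Huu, Hxu.T)

def greedy_gain(H, n):
    """Gain K of the feedback u = -K x that is greedy for the current H."""
    return np.linalg.solve(H[n:, n:], H[:n, n:].T)

def q_learning_lq(sample, n, m, steps, H0=None):
    """Running-average update of H using one parameter sample per time step.

    `sample()` returns (A, B, N): one draw of the system matrices and of the
    (n+m) x (n+m) cost weight. No means or covariances are supplied.
    Assumed update (a sketch): H <- H + (N + [A B]' P(H) [A B] - H) / (k + 1).
    """
    H = np.eye(n + m) if H0 is None else np.array(H0, dtype=float)
    for k in range(steps):
        A, B, N = sample()
        M = np.hstack([A, B])                      # [A_k  B_k]
        target = N + M.T @ schur_value(H, n) @ M   # sampled Bellman target
        H += (target - H) / (k + 1)                # averaging step
    return H

# Hypothetical scalar example: A_k = 1 + 0.1 * noise, B_k = 1, N_k = I.
rng = np.random.default_rng(0)

def sample():
    A = np.array([[1.0 + 0.1 * rng.standard_normal()]])
    B = np.array([[1.0]])
    return A, B, np.eye(2)

H = q_learning_lq(sample, n=1, m=1, steps=5000)
K = greedy_gain(H, 1)  # adaptive feedback u_k = -K x_k
```

In this sketch the learner only ever sees the sampled triples (A_k, B_k, N_k), which mirrors the setting where one random sample of parameters emerges at each time step. The 1/(k+1) step makes H a running average of the sampled targets; other step-size schedules are possible.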
Taxonomy
Topics: Adaptive Dynamic Programming Control · Advanced Control Systems Optimization · Control Systems and Identification
