Distributed Q-Learning for Stochastic LQ Control with Unknown Uncertainty

Zhaorong Zhang (1); Juanjuan Xu (1); Xun Li (2) ((1) Shandong University, (2) The Hong Kong Polytechnic University)

arXiv:2201.05342·math.OC·January 17, 2022

Open Access

TL;DR

This paper introduces a distributed Q-learning algorithm for discrete-time stochastic linear quadratic control over an infinite horizon, where the system matrices depend on random parameters with unknown statistics. The algorithm is proven to converge to the optimal stabilizing controller.

Contribution

It presents a novel distributed stochastic approximation method to solve Riccati equations in uncertain stochastic control systems, with proven convergence guarantees.

Findings

01 The algorithm converges asymptotically to the optimal solution.

02 The method effectively handles unknown system uncertainties.

03 Numerical simulations confirm the convergence and stability.

Abstract

This paper studies a discrete-time stochastic control problem with linear quadratic criteria over an infinite-time horizon. We focus on a class of control systems whose system matrices are associated with random parameters involving unknown statistical properties. In particular, we design a distributed Q-learning algorithm to tackle the Riccati equation and derive the optimal controller stabilizing the system. The key technique is that we convert the problem of solving the Riccati equation into finding the zero point of a matrix equation and devise a distributed stochastic approximation method to compute the estimates of the zero point. The convergence analysis proves that the distributed Q-learning algorithm eventually converges to the correct value. A numerical example illustrates that the distributed Q-learning algorithm converges asymptotically.
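
The zero-point idea in the abstract can be illustrated with a minimal single-node sketch. This is not the paper's distributed algorithm. It assumes a system x_{k+1} = A_k x_k + B_k u_k with i.i.d. random matrices (A_k, B_k) that can be sampled but whose statistics are not used by the learner, a stage cost x'Qx + u'Ru, and a Robbins-Monro step size 1/(k+1). The names `value_matrix`, `sa_q_learning`, and `sample_AB`, as well as the scalar example values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def value_matrix(H, n):
    """Schur complement of the input block: P = H_xx - H_xu H_uu^{-1} H_ux."""
    H_xx, H_xu, H_uu = H[:n, :n], H[:n, n:], H[n:, n:]
    return H_xx - H_xu @ np.linalg.solve(H_uu, H_xu.T)


def sa_q_learning(sample_AB, Q, R, num_steps):
    """Robbins-Monro iteration for the zero point of the matrix map
    f(H) = diag(Q, R) + E[G' P(H) G] - H,  where G = [A_k  B_k].
    Assumption: single node, i.i.d. samples, step size 1/(k+1)."""
    n, m = Q.shape[0], R.shape[0]
    cost = np.block([[Q, np.zeros((n, m))], [np.zeros((m, n)), R]])
    H = cost.copy()  # positive definite start, so H_uu is invertible
    for k in range(num_steps):
        A_k, B_k = sample_AB()          # one noisy observation of the system
        G = np.hstack([A_k, B_k])
        P = value_matrix(H, n)
        step = 1.0 / (k + 1)
        H = H + step * (cost + G.T @ P @ G - H)
    K = np.linalg.solve(H[n:, n:], H[n:, :n])  # feedback gain, u = -K x
    return value_matrix(H, n), K


# Hypothetical scalar example: A_k ~ N(0.9, 0.1^2), B_k ~ N(1.0, 0.1^2).
rng = np.random.default_rng(0)


def sample_AB():
    A_k = np.array([[0.9 + 0.1 * rng.standard_normal()]])
    B_k = np.array([[1.0 + 0.1 * rng.standard_normal()]])
    return A_k, B_k


P_hat, K_hat = sa_q_learning(sample_AB, np.eye(1), np.eye(1), num_steps=20000)

# Reference value from the true second moments (unknown to the learner):
# E[A^2] = 0.82, E[AB] = 0.9, E[B^2] = 1.01, iterated to the Riccati fixed point.
p_ref = 1.0
for _ in range(200):
    p_ref = 1.0 + 0.82 * p_ref - (0.9 * p_ref) ** 2 / (1.0 + 1.01 * p_ref)

print("estimated P:", P_hat[0, 0], "reference P:", p_ref, "gain K:", K_hat[0, 0])
```

In this sketch the learner only ever sees sampled matrices, while the reference value uses the true moments purely as a check. A distributed variant would additionally need some way for nodes to exchange or combine estimates, which is not attempted here.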

Taxonomy

Topics: Age of Information Optimization · Stability and Control of Uncertain Systems