Comparative Study of Q-Learning for State-Feedback LQG Control with an Unknown Model
Mingxiang Liu, Damián Marelli, Minyue Fu, Qianqian Cai

TL;DR
This paper compares traditional system identification methods with a novel Q-learning approach for designing LQG controllers when system parameters are unknown, concluding that classic methods are more efficient and accurate.
Contribution
It introduces a Q-learning-based method for SF-LQG control with unknown parameters and compares it rigorously to the classic identification approach.
Findings
Classic approach is asymptotically efficient, leaving virtually no room for improvement in accuracy.
Proposed Q-learning method achieves asymptotic optimality.
Classic method is more computationally efficient.
Abstract
We study the problem of designing a state feedback linear quadratic Gaussian (LQG) controller for a system in which the system matrices as well as the process noise covariance are unknown. We do a rigorous comparison between two approaches. The first is the classic one in which a system identification stage is used to estimate the unknown parameters, which are then used in a state-feedback LQG (SF-LQG) controller design. The second approach is a recently proposed one using a reinforcement learning paradigm called Q-learning. We do the comparison in terms of complexity and accuracy of the resulting controller. We show that the classic approach is asymptotically efficient, giving virtually no room for improvement in terms of accuracy. We also propose a novel Q-learning-based method which we show asymptotically achieves the optimal controller design. We complement our proposed method with a…
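To make the classic identify-then-design pipeline described in the abstract concrete, here is a minimal sketch for a scalar system x[k+1] = a*x[k] + b*u[k] + w[k]. It is not the authors' implementation: the scalar setting, the function names (`simulate`, `identify`, `lqr_gain`), the white-noise exploratory input, and the cost weights `q` and `r` are all assumptions made for illustration. The sketch estimates (a, b) by ordinary least squares and then plugs the estimates into a Riccati iteration to obtain the state-feedback gain.

```python
import random


def simulate(a, b, noise_std, n_steps, seed=0):
    """Simulate x[k+1] = a*x[k] + b*u[k] + w[k] with a random exploratory input (assumed)."""
    rng = random.Random(seed)
    xs, us, xnext = [], [], []
    x = 0.0
    for _ in range(n_steps):
        u = rng.gauss(0.0, 1.0)                      # exploratory input
        x_new = a * x + b * u + rng.gauss(0.0, noise_std)
        xs.append(x)
        us.append(u)
        xnext.append(x_new)
        x = x_new
    return xs, us, xnext


def identify(xs, us, xnext):
    """Least-squares estimate of (a, b) via the 2x2 normal equations."""
    sxx = sum(x * x for x in xs)
    sxu = sum(x * u for x, u in zip(xs, us))
    suu = sum(u * u for u in us)
    sxy = sum(x * y for x, y in zip(xs, xnext))
    suy = sum(u * y for u, y in zip(us, xnext))
    det = sxx * suu - sxu * sxu
    a_hat = (sxy * suu - suy * sxu) / det
    b_hat = (suy * sxx - sxy * sxu) / det
    return a_hat, b_hat


def lqr_gain(a, b, q, r, iters=500):
    """Iterate the scalar discrete-time Riccati recursion; returns k for u = -k*x."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return a * b * p / (r + b * b * p)


if __name__ == "__main__":
    # Hypothetical true parameters, unknown to the designer.
    xs, us, xnext = simulate(a=0.9, b=0.5, noise_std=0.1, n_steps=5000, seed=1)
    a_hat, b_hat = identify(xs, us, xnext)
    k_hat = lqr_gain(a_hat, b_hat, q=1.0, r=1.0)
    k_true = lqr_gain(0.9, 0.5, q=1.0, r=1.0)
    print(f"a_hat={a_hat:.4f}, b_hat={b_hat:.4f}, k_hat={k_hat:.4f}, k_true={k_true:.4f}")
```

In this sketch the gain depends only on the estimated (a, b); the process noise variance does not enter the gain computation, although it could be estimated from the least-squares residuals if the achieved cost is also of interest.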
Taxonomy
Topics: Adaptive Dynamic Programming Control · Control Systems and Identification · Advanced Bandit Algorithms Research
