Minimax Q-learning Control for Linear Systems Using the Wasserstein Metric
Feiran Zhao, Keyou You

TL;DR
This paper introduces a Q-learning approach for linear quadratic regulation of unknown linear systems, using Wasserstein penalties to handle distribution uncertainty, with proven convergence to an optimal minimax controller.
Contribution
It develops a novel Q-learning method for LQR problems with distribution uncertainty, incorporating Wasserstein penalties and providing convergence guarantees.
Findings
The proposed Q-learning method converges to an optimal minimax controller.
Handles distribution uncertainty in stochastic disturbances.
Provides a new framework for model-free control of unknown linear systems.
Abstract
Stochastic optimal control usually requires an explicit dynamical model together with the probability distributions of its disturbances, both of which are difficult to obtain in practice. In this work, we consider the linear quadratic regulator (LQR) problem for unknown linear systems and adopt a Wasserstein penalty to address the distribution uncertainty of additive stochastic disturbances. By constructing a deterministic game equivalent to the penalized LQR problem, we propose a Q-learning method with convergence guarantees to learn an optimal minimax controller.
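The abstract does not spell out the deterministic game, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes the game takes the standard zero-sum linear quadratic form with dynamics x_next = A x + B u + w and stage cost x'Qx + u'Ru - gamma * ||w||^2, where gamma plays the role of the penalty weight. It is also model-based: it uses A and B directly to build the quadratic Q-function matrix H, whereas the paper learns from data for an unknown system. The function names, the example matrices, and the value gamma = 100 are all hypothetical.

```python
import numpy as np


def riccati_step(A, B, Q, R, gamma, P):
    """One value-iteration step for the assumed zero-sum LQ game.

    Builds H such that Q(x, u, w) = [x; u; w]' H [x; u; w], then solves the
    saddle-point condition for z = [u; w]. Returns (P_next, gains) where
    the saddle point is z = -gains @ x.
    """
    n, m = B.shape
    F = np.hstack([A, B, np.eye(n)])          # x_next = F @ [x; u; w]
    H = F.T @ P @ F
    H[:n, :n] += Q
    H[n:n + m, n:n + m] += R
    H[n + m:, n + m:] -= gamma * np.eye(n)    # penalty on the disturbance
    # The maximization over w is well posed only if P - gamma*I is negative definite.
    if np.max(np.linalg.eigvalsh(H[n + m:, n + m:])) >= 0:
        raise ValueError("gamma is too small: the game has no finite value")
    gains = np.linalg.solve(H[n:, n:], H[n:, :n])
    P_next = H[:n, :n] - H[:n, n:] @ gains    # Schur complement of H
    return 0.5 * (P_next + P_next.T), gains


def minimax_lq_value_iteration(A, B, Q, R, gamma, iters=2000, tol=1e-10):
    """Iterate riccati_step from P = 0; return (P, K, L).

    K is the controller gain (u = -K x) and L the worst-case
    disturbance gain (w = L x) of the assumed game.
    """
    n, m = B.shape
    P = np.zeros((n, n))
    for _ in range(iters):
        P_next, _ = riccati_step(A, B, Q, R, gamma, P)
        done = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if done:
            break
    _, gains = riccati_step(A, B, Q, R, gamma, P)
    return P, gains[:m], -gains[m:]


# Hypothetical two-state example, chosen only for illustration.
A = np.array([[0.7, 0.2], [0.0, 0.6]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.eye(1)
P, K, L = minimax_lq_value_iteration(A, B, Q, R, gamma=100.0)
```

In this sketch a larger gamma makes disturbances more expensive for the adversary, so the value matrix P shrinks toward the ordinary LQR solution as gamma grows. A model-free variant would estimate H from observed transitions instead of forming it from A and B.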
Taxonomy
Topics: Acute Ischemic Stroke Management · Risk and Portfolio Optimization · Statistical Methods and Inference
