Regret Lower Bounds for Learning Linear Quadratic Gaussian Systems
Ingvar Ziemann, Henrik Sandberg

TL;DR
This paper derives fundamental regret lower bounds for learning control in linear Gaussian systems, revealing how system properties influence the difficulty of control and learning simultaneously.
Contribution
It introduces regret lower bounds that depend explicitly on control-theoretic parameters such as system costs and Gramians, and extends them to a class of partially observed systems.
Findings
Regret scales as √T with time horizon T.
Hard-to-control systems are also hard to learn to control.
Results extend to a class of partially observed systems, where poor observability structure also makes learning to control hard.
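For reference, the quantities behind these findings can be written out as follows. This is a sketch in a standard LQG formulation; the symbols A, B, Q, R, Σ_w, J* and the name Regret_T are notational assumptions made here and are not taken from the summary above.

```latex
% Unknown linear Gaussian dynamics (state-feedback case); A and B are unknown to the learner.
x_{t+1} = A x_t + B u_t + w_t, \qquad w_t \sim \mathcal{N}(0, \Sigma_w)

% Regret of an adaptive policy over horizon T: accumulated quadratic cost
% minus T times the optimal average cost J^* attainable with known (A, B).
\mathrm{Regret}_T = \sum_{t=0}^{T-1} \left( x_t^\top Q x_t + u_t^\top R u_t \right) - T J^*

% Shape of the lower bound reported in the findings; the precise sense
% (constants, class of policies and systems) is as stated in the paper.
\mathbb{E}\left[ \mathrm{Regret}_T \right] = \Omega\!\left( \sqrt{T} \right)
```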
Abstract
We establish regret lower bounds for adaptively controlling an unknown linear Gaussian system with quadratic costs. We combine ideas from experiment design, estimation theory and a perturbation bound of certain information matrices to derive regret lower bounds exhibiting scaling on the order of magnitude √T in the time horizon T. Our bounds accurately capture the role of control-theoretic parameters and we are able to show that systems that are hard to control are also hard to learn to control; when instantiated to state feedback systems we recover the dimensional dependency of earlier work but with improved scaling with system-theoretic constants such as system costs and Gramians. Furthermore, we extend our results to a class of partially observed systems and demonstrate that systems with poor observability structure also are hard to learn to control.
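The following is a small numeric illustration, not taken from the paper, of why imperfect knowledge of a system costs something in the quadratic-cost setting. It assumes a scalar system with hypothetical values a = 0.9, b = 1, q = r = 1 and unit noise variance, solves the scalar Riccati equation by fixed-point iteration, and compares the stationary average cost of the optimal gain with that of slightly mis-tuned gains. All function and variable names are illustrative.

```python
def solve_scalar_riccati(a, b, q, r, iters=1000):
    """Fixed-point iteration for the scalar discrete-time Riccati equation."""
    p = q
    for _ in range(iters):
        # p = q + a^2 p - (a b p)^2 / (r + b^2 p), written in simplified form
        p = q + a * a * p * r / (r + b * b * p)
    return p


def optimal_gain(a, b, r, p):
    """Optimal state-feedback gain k for the control law u = -k x."""
    return a * b * p / (r + b * b * p)


def average_cost(a, b, q, r, k):
    """Stationary average cost of u = -k x under unit-variance process noise."""
    c = a - b * k  # closed-loop pole
    if abs(c) >= 1.0:
        raise ValueError("gain does not stabilize the system")
    # Stationary state variance is 1 / (1 - c^2); per-step cost is (q + r k^2) x^2.
    return (q + r * k * k) / (1.0 - c * c)


# Hypothetical system and cost parameters (assumptions for illustration only).
a, b, q, r = 0.9, 1.0, 1.0, 1.0
p_star = solve_scalar_riccati(a, b, q, r)
k_star = optimal_gain(a, b, r, p_star)

for err in (0.0, 0.1, 0.2):
    excess = average_cost(a, b, q, r, k_star + err) - p_star
    print(f"gain error {err:.1f}: excess cost per step {excess:.5f}")
```

In this toy case the excess cost per step grows roughly quadratically in the gain error. That is a commonly cited intuition for why estimation error, and hence the need to explore, translates into regret; the paper's actual argument relies on experiment design and information-matrix perturbation bounds rather than on this calculation.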
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Advanced Bandit Algorithms Research · Machine Learning and Algorithms
