Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret
Alon Cohen, Tomer Koren, Yishay Mansour

TL;DR
This paper introduces a computationally-efficient algorithm for Linear Quadratic Control systems that achieves near-optimal regret bounds, addressing a longstanding open problem in the field.
Contribution
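It provides the first computationally efficient algorithm with $\tilde{O}(\sqrt{T})$ regret for linear quadratic control with unknown dynamics, resolving an open question posed by Abbasi-Yadkori and Szepesvári (2011) and Dean, Mania, Matni, Recht, and Tu (2018).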
Findings
Achieves $\tilde{O}(\sqrt{T})$ regret in linear quadratic control
First computationally-efficient algorithm with this regret bound
Resolves open problems from prior foundational works
Abstract
We present the first computationally-efficient algorithm with $\tilde{O}(\sqrt{T})$ regret for learning in Linear Quadratic Control systems with unknown dynamics. By that, we resolve an open question of Abbasi-Yadkori and Szepesvári (2011) and Dean, Mania, Matni, Recht, and Tu (2018).
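To make the notion of regret in the abstract concrete, the following is a minimal sketch for a scalar system, not the paper's algorithm. It assumes dynamics x_{t+1} = a*x_t + b*u_t + w_t with Gaussian noise, per-step cost q*x^2 + r*u^2, and a fixed linear controller u = -k*x; regret is the accumulated cost minus T times the optimal average cost. The function names and parameter values are hypothetical and chosen only for illustration.

```python
import random


def scalar_riccati(a, b, q, r, iters=1000):
    """Iterate the scalar discrete-time Riccati recursion to a fixed point."""
    p = q
    for _ in range(iters):
        p = q + a * a * p - (a * b * p) ** 2 / (r + b * b * p)
    return p


def optimal_gain(a, b, q, r):
    """Return (k, p): the optimal feedback gain for u = -k * x and the Riccati solution."""
    p = scalar_riccati(a, b, q, r)
    k = a * b * p / (r + b * b * p)
    return k, p


def regret(a, b, q, r, gain, T, sigma=1.0, seed=0):
    """Cost accumulated by u = -gain * x over T steps, minus T times the optimal average cost."""
    rng = random.Random(seed)
    _, p = optimal_gain(a, b, q, r)
    j_star = p * sigma ** 2  # optimal steady-state average cost for noise variance sigma^2
    x, total = 0.0, 0.0
    for _ in range(T):
        u = -gain * x
        total += q * x * x + r * u * u
        x = a * x + b * u + rng.gauss(0.0, sigma)
    return total - T * j_star


if __name__ == "__main__":
    a, b, q, r = 1.0, 1.0, 1.0, 1.0  # illustrative values, assumed known here
    k_opt, _ = optimal_gain(a, b, q, r)
    print("regret with the optimal gain:", regret(a, b, q, r, k_opt, T=20000))
    print("regret with a suboptimal gain:", regret(a, b, q, r, 0.2, T=20000))
```

A fixed suboptimal gain pays a constant excess cost per step, so its regret grows linearly in T. A learning algorithm that does not know a and b has to drive that excess down over time, and the result stated in the abstract is that this can be done efficiently with regret growing only as $\tilde{O}(\sqrt{T})$.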
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control
