Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret

Alon Cohen; Tomer Koren; Yishay Mansour

arXiv:1902.06223·cs.LG·February 26, 2019·20 cites

TL;DR

This paper introduces a computationally efficient algorithm for learning in Linear Quadratic Control systems with unknown dynamics. It achieves regret that grows only as $\sqrt{T}$, up to logarithmic factors, which resolves an open question from earlier work.

Contribution

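It provides the first efficient algorithm with $\tilde{O}(\sqrt{T})$ regret for linear quadratic control with unknown dynamics, resolving an open question of Abbasi-Yadkori and Szepesvári (2011) and Dean, Mania, Matni, Recht, and Tu (2018).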

Findings

01

Achieves $\tilde{O}(\sqrt{T})$ regret in linear quadratic control

02

First computationally-efficient algorithm with this regret bound

03

Resolves open problems from prior foundational works

Abstract

We present the first computationally-efficient algorithm with $\tilde{O}(\sqrt{T})$ regret for learning in Linear Quadratic Control systems with unknown dynamics. By that, we resolve an open question of Abbasi-Yadkori and Szepesvári (2011) and Dean, Mania, Matni, Recht, and Tu (2018).
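
To make the quantity in the abstract concrete, the sketch below computes regret for a toy linear-quadratic system: the total cost a controller incurs over $T$ steps minus $T$ times the optimal steady-state cost $J^\star$. It is not the paper's algorithm. It assumes the dynamics are known, whereas the paper's learner must estimate them, and it only compares two fixed linear controllers. The double-integrator matrices, the unit-variance Gaussian noise, the suboptimal gain `K_bad`, and all function names are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def solve_riccati(A, B, Q, R, iters=10_000, tol=1e-12):
    """Fixed-point iteration on the discrete-time Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ A - A.T @ P @ B @ G
        done = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if done:
            break
    return P

def steady_state_cost(A, B, Q, R, K, W, iters=10_000, tol=1e-12):
    """Average cost of the stabilizing policy u = K x under noise covariance W."""
    M = A + B @ K                 # closed-loop dynamics
    C = Q + K.T @ R @ K           # per-step cost as a quadratic form in x
    P = C.copy()
    for _ in range(iters):        # Lyapunov fixed point: P = C + M' P M
        P_next = C + M.T @ P @ M
        done = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if done:
            break
    return float(np.trace(P @ W))

def simulate_regret(A, B, Q, R, K, J_star, T, rng):
    """Total cost of u = K x over T steps, minus T * J_star."""
    x = np.zeros(A.shape[0])
    total = 0.0
    for _ in range(T):
        u = K @ x
        total += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + rng.standard_normal(A.shape[0])
    return total - T * J_star

# Hypothetical system: a noisy double integrator with identity costs.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
W = np.eye(2)                     # noise covariance (assumed)

P = solve_riccati(A, B, Q, R)
K_star = -np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)   # optimal gain
J_star = float(np.trace(P @ W))                           # optimal average cost

K_bad = np.array([[-0.2, -0.8]])  # stabilizing but suboptimal gain (assumed)
J_bad = steady_state_cost(A, B, Q, R, K_bad, W)

T = 20_000
# Same seed for both runs so the two controllers face identical noise.
regret_opt = simulate_regret(A, B, Q, R, K_star, J_star, T, np.random.default_rng(0))
regret_bad = simulate_regret(A, B, Q, R, K_bad, J_star, T, np.random.default_rng(0))
print(f"J* = {J_star:.4f}, J(K_bad) = {J_bad:.4f}")
print(f"regret of optimal gain: {regret_opt:.1f}, of K_bad: {regret_bad:.1f}")
```

A fixed suboptimal gain pays a constant excess cost per step, so its regret grows linearly in $T$. A $\tilde{O}(\sqrt{T})$ bound means the learner's average excess cost per step shrinks toward zero as $T$ grows, even though it starts without knowing `A` and `B`.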

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control