Online Linear Quadratic Control

Alon Cohen; Avinatan Hassidim; Tomer Koren; Nevena Lazic; Yishay Mansour; Kunal Talwar

arXiv:1806.07104·cs.LG·June 20, 2018·19 cites

Open Access

TL;DR

This paper introduces efficient online algorithms for controlling linear systems with noisy dynamics and adversarial costs, achieving sublinear regret by leveraging a novel SDP relaxation that ensures stability and rapid mixing.

Contribution

The paper presents the first efficient online control algorithms with $O(\sqrt{T})$ regret guarantees, using a new SDP relaxation that ensures stability and fast mixing.

Findings

01

Achieves $O(\sqrt{T})$ regret in online linear control.

02

Introduces a novel SDP relaxation for the steady-state distribution of the system.

03

Ensures policies are strongly stable with exponential mixing.
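
To make "exponential mixing" concrete, here is a minimal sketch, not taken from the paper. It assumes a hypothetical two-state system with matrices `A` and `B`, a fixed linear policy `u_t = K x_t`, and i.i.d. zero-mean noise with covariance `W`; all numerical values are illustrative. When the closed-loop matrix `A + B K` has spectral radius below one, the state covariance converges geometrically to the solution of a discrete Lyapunov equation. Spectral radius is used here only as a simple proxy for stability; the paper's "strongly stable" condition is a quantitative notion that this sketch does not implement.

```python
import numpy as np

# Hypothetical system and policy; every value below is an illustrative assumption.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
K = np.array([[-5.0, -6.0]])   # fixed linear policy u_t = K x_t
W = 0.01 * np.eye(2)           # covariance of the zero-mean noise w_t

M = A + B @ K                  # closed-loop dynamics x_{t+1} = M x_t + w_t
rho = max(abs(np.linalg.eigvals(M)))   # spectral radius (0.9 for these values)

# Steady-state covariance solves Sigma = M Sigma M^T + W.
# Vectorizing gives (I - M kron M) vec(Sigma) = vec(W).
n = M.shape[0]
sigma_ss = np.linalg.solve(
    np.eye(n * n) - np.kron(M, M), W.reshape(-1)
).reshape(n, n)

# Iterate the covariance recursion from a deterministic start x_0 = 0
# and record the distance to the steady state at each step.
sigma = np.zeros((n, n))
gaps = []
for t in range(200):
    gaps.append(np.linalg.norm(sigma - sigma_ss))
    sigma = M @ sigma @ M.T + W

print(f"spectral radius: {rho:.2f}")
print("gap at t = 0, 50, 199:", gaps[0], gaps[50], gaps[199])
```

The gap to the steady state contracts roughly like `rho ** (2 * t)`, which is the kind of geometric convergence the finding refers to.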

Abstract

We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to "strongly stable" policies that mix exponentially fast to a steady state.
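
The setting in the abstract can be sketched in a few lines of simulation. This is an illustration under stated assumptions, not the paper's algorithm: the dynamics are taken to be `x_{t+1} = A x_t + B u_t + w_t` with known `A` and `B`, the per-round loss is `x_t^T Q_t x_t + u_t^T R_t u_t`, and the cost matrices `Q_t`, `R_t` are drawn at random here as a stand-in for an adversary's choices. The helper `total_cost` and all numerical values are hypothetical. The final difference only compares two fixed linear policies on the same cost sequence; regret for an online learner would replace the first term with the learner's own cumulative cost.

```python
import numpy as np

def total_cost(A, B, K, Qs, Rs, noise):
    """Cumulative quadratic cost of the fixed linear policy u_t = K x_t."""
    x = np.zeros(A.shape[0])
    cost = 0.0
    for Q, R, w in zip(Qs, Rs, noise):
        u = K @ x
        cost += x @ Q @ x + u @ R @ u   # loss revealed for this round
        x = A @ x + B @ u + w           # known dynamics plus noise
    return cost

rng = np.random.default_rng(0)
T = 500

# Hypothetical known dynamics (illustrative values).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])

# Time-varying positive definite cost matrices; random here, arbitrary in the paper's setting.
Qs = [np.diag(rng.uniform(0.1, 1.0, size=2)) for _ in range(T)]
Rs = [np.array([[rng.uniform(0.1, 1.0)]]) for _ in range(T)]
noise = rng.normal(scale=0.1, size=(T, 2))

# Two fixed stabilizing policies evaluated on the same noise and cost sequence.
K1 = np.array([[-5.0, -6.0]])
K2 = np.array([[-2.0, -3.0]])
c1 = total_cost(A, B, K1, Qs, Rs, noise)
c2 = total_cost(A, B, K2, Qs, Rs, noise)

print("cumulative cost of K1:", c1)
print("cumulative cost of K2:", c2)
print("cost difference K1 - K2:", c1 - c2)
```

Sharing the noise sequence between the two runs keeps the comparison about the policies rather than about sampling variation.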

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Machine Learning and Algorithms