Online Linear Quadratic Control
Alon Cohen, Avinatan Hassidim, Tomer Koren, Nevena Lazic, Yishay Mansour, Kunal Talwar

TL;DR
This paper introduces efficient online algorithms for controlling linear systems with noisy dynamics and adversarial costs, achieving sublinear regret by leveraging a novel SDP relaxation that ensures stability and rapid mixing.
Contribution
It presents the first efficient online control algorithms with $O(\sqrt{T})$ regret guarantees, using a new SDP relaxation that ensures stability and fast mixing.
Findings
Achieves $O(\sqrt{T})$ regret in online linear control.
Introduces a novel SDP relaxation for steady-state distribution.
Ensures policies are strongly stable with exponential mixing.
Abstract
We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses. We present the first efficient online learning algorithms in this setting that guarantee $O(\sqrt{T})$ regret under mild assumptions, where $T$ is the time horizon. Our algorithms rely on a novel SDP relaxation for the steady-state distribution of the system. Crucially, and in contrast to previously proposed relaxations, the feasible solutions of our SDP all correspond to "strongly stable" policies that mix exponentially fast to a steady state.
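To make the notions of "steady-state distribution" and "mixing" concrete, the following is a minimal sketch, not the paper's algorithm. It assumes dynamics of the form x_{t+1} = A x_t + B u_t + w_t with zero-mean noise of covariance W, a fixed linear policy u_t = K x_t, and a quadratic cost x'Qx + u'Ru. The matrices A, B, K, W, Q, R and the helper names are hypothetical values chosen for illustration only. The sketch computes the steady-state state covariance by fixed-point iteration and the resulting steady-state cost; it does not solve the SDP or run the online learning procedure.

```python
import numpy as np

def steady_state_covariance(A, B, K, W, iters=500):
    """Iterate X <- A_K X A_K^T + W, where A_K = A + B K is the closed loop.

    Converges when the spectral radius of A_K is below 1; the distance to the
    limit shrinks geometrically, which is the "mixing" behaviour in this sketch.
    """
    A_K = A + B @ K
    X = np.zeros_like(W)
    for _ in range(iters):
        X = A_K @ X @ A_K.T + W
    return X

def steady_state_cost(A, B, K, W, Q, R):
    """Expected per-step cost E[x'Qx + u'Ru] at steady state with u = K x."""
    X = steady_state_covariance(A, B, K, W)
    return float(np.trace((Q + K.T @ R @ K) @ X))

# Hypothetical example system (illustrative values, not from the paper).
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.0],
              [0.1]])
K = np.array([[-5.0, -6.0]])   # assumed stabilizing feedback gain
W = 0.01 * np.eye(2)           # assumed noise covariance
Q = np.eye(2)                  # assumed state cost
R = np.eye(1)                  # assumed control cost

A_K = A + B @ K
rho = float(np.max(np.abs(np.linalg.eigvals(A_K))))  # closed-loop spectral radius
X = steady_state_covariance(A, B, K, W)
J = steady_state_cost(A, B, K, W, Q, R)

print("spectral radius of A + BK:", rho)
print("steady-state covariance:\n", X)
print("steady-state cost:", J)
```

A spectral radius below 1 is only a necessary ingredient here; the "strongly stable" condition named in the abstract is a quantitative strengthening that this sketch does not check.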
Taxonomy
Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Machine Learning and Algorithms
