Online estimation and control with optimal pathlength regret

Gautam Goel; Babak Hassibi

arXiv:2110.12544·cs.LG·December 8, 2021


Open Access

TL;DR

This paper obtains the first pathlength regret bounds for online control and estimation in linear dynamical systems by reducing both problems to variational problems in robust estimation and control. It reports numerical simulations in which the resulting algorithms outperform traditional methods in environments that vary over time.

Contribution

The paper gives the first pathlength regret bounds for online control and estimation, derives them by reducing these problems to variational problems in robust estimation and control, and proposes algorithms designed to adapt to changing environments.

Findings

01. Pathlength-optimal algorithms outperform traditional methods in variable environments.

02. Reductions to robust estimation problems are effective for deriving regret bounds.

03. Numerical simulations validate the improved performance of the proposed algorithms.

Abstract

A natural goal when designing online learning algorithms for non-stationary environments is to bound the regret of the algorithm in terms of the temporal variation of the input sequence. Intuitively, when the variation is small, it should be easier for the algorithm to achieve low regret, since past observations are predictive of future inputs. Such data-dependent "pathlength" regret bounds have recently been obtained for a wide variety of online learning problems, including online convex optimization (OCO) and bandits. We obtain the first pathlength regret bounds for online control and estimation (e.g., Kalman filtering) in linear dynamical systems. The key idea in our derivation is to reduce pathlength-optimal filtering and control to certain variational problems in robust estimation and control; these reductions may be of independent interest. Numerical simulations confirm that our pathlength-optimal algorithms…
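To make the notion of "temporal variation of the input sequence" concrete, here is a minimal sketch of how a pathlength could be computed for a disturbance sequence. It assumes the pathlength is the sum of squared Euclidean norms of consecutive differences; the abstract does not state the exact definition, so the function name `pathlength`, the `squared` flag, and the example sequences are all illustrative assumptions rather than the authors' code.

```python
import numpy as np


def pathlength(w, squared=True):
    """Temporal variation of a sequence w_0, ..., w_{T-1}.

    Assumed definition (not taken from the abstract):
    sum over t of ||w_{t+1} - w_t||^2, or the unsquared norm
    when squared=False.
    """
    w = np.asarray(w, dtype=float)
    if w.ndim == 1:
        # Treat a 1-D input as a sequence of scalars.
        w = w[:, None]
    diffs = np.diff(w, axis=0)             # w_{t+1} - w_t for each t
    norms = np.linalg.norm(diffs, axis=1)  # Euclidean norm of each step
    return float(np.sum(norms**2 if squared else norms))


# Three hypothetical disturbance sequences of length 100 in R^2.
t = np.arange(100)
constant = np.ones((100, 2))                                    # no variation
slow = np.stack([np.sin(0.01 * t), np.cos(0.01 * t)], axis=1)   # slow drift
noisy = np.random.default_rng(0).standard_normal((100, 2))      # i.i.d. noise

for name, seq in [("constant", constant), ("slow", slow), ("noisy", noisy)]:
    print(name, pathlength(seq))
```

Under this assumed definition, a constant sequence has zero pathlength and a slowly drifting one has a small pathlength, which is the regime where a pathlength regret bound promises low regret; an i.i.d. noise sequence has a pathlength that grows with the horizon.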

Taxonomy

Topics: Advanced Bandit Algorithms Research · Smart Grid Energy Management · Distributed Sensor Networks and Detection Algorithms