Regret Analysis: a control perspective
Travis E. Gibson, Sawal Acharya

TL;DR
This paper explores the connections and differences between online learning and adaptive control, focusing on regret analysis and proposing a new paradigm called online adaptive control.
Contribution
It provides a detailed comparison of the analysis methods in online learning and adaptive control, and introduces the concept of online adaptive control.
Findings
Regret analysis applied to gradient descent for convex functions.
Control-based analysis of streaming regression problems.
Discussion of a new paradigm: online adaptive control.
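To make the first two findings concrete, here is a minimal sketch of regret measured for online gradient descent on a streaming least-squares problem. It is not taken from the paper: the function name `ogd_regret`, the squared loss, the step-size schedule `step / sqrt(t + 1)`, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def ogd_regret(X, y, step=0.1):
    """Run online gradient descent on a squared loss; return (regret, final parameter).

    Hypothetical helper for illustration only; not the paper's algorithm.
    """
    T, d = X.shape
    theta = np.zeros(d)
    online_loss = 0.0
    for t in range(T):
        err = X[t] @ theta - y[t]
        online_loss += 0.5 * err ** 2            # loss incurred before the update
        grad = err * X[t]                        # gradient of 0.5 * err^2 w.r.t. theta
        theta = theta - step / np.sqrt(t + 1) * grad
    # Best single fixed parameter in hindsight: the batch least-squares solution.
    theta_star, *_ = np.linalg.lstsq(X, y, rcond=None)
    best_loss = 0.5 * np.sum((X @ theta_star - y) ** 2)
    return online_loss - best_loss, theta

# Synthetic stream (assumed data, fixed seed for repeatability).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
true_theta = np.array([1.0, -2.0, 0.5])
y = X @ true_theta + 0.1 * rng.normal(size=500)

regret, theta = ogd_regret(X, y)
print(f"regret = {regret:.3f}, average regret = {regret / len(y):.4f}")
```

The comparator here is the least-squares fit over the whole stream, which matches the "single optimal fixed parameter choice in hindsight" described in the abstract. A decaying step size is one common choice for convex losses; other schedules would change the numbers but not the way regret is computed.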
Abstract
Online learning and model reference adaptive control have many interesting intersections. One area where they differ, however, is in how the algorithms are analyzed and what objective or metric is used to discriminate "good" algorithms from "bad" algorithms. In adaptive control there are usually two objectives: 1) prove that all time-varying parameters/states of the system are bounded, and 2) prove that the instantaneous error between the adaptively controlled system and a reference system converges to zero over time (or at least to a compact set). For online learning, the performance of algorithms is often characterized by the regret the algorithm incurs. Regret is defined as the cumulative loss (cost) over time from the online algorithm minus the cumulative loss (cost) of the single optimal fixed parameter choice in hindsight. Another significant difference between the two areas of research is…
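The verbal definition of regret in the abstract can be written out as follows. The notation is an assumption made for illustration and may differ from the paper's: $f_t$ is the loss at time $t$, $\theta_t$ is the parameter chosen by the online algorithm, $\Theta$ is the feasible parameter set, and $T$ is the horizon.

```latex
R_T \;=\; \sum_{t=1}^{T} f_t(\theta_t) \;-\; \min_{\theta \in \Theta} \sum_{t=1}^{T} f_t(\theta)
```

The first sum is the cumulative loss of the online algorithm, and the second is the cumulative loss of the best single fixed parameter chosen in hindsight.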
Taxonomy
Topics: Advanced Bandit Algorithms Research · Advanced Adaptive Filtering Techniques · Extremum Seeking Control Systems
Methods: Network On Network
