The Many Faces of Exponential Weights in Online Learning
Dirk van der Hoeven, Tim van Erven, Wojciech Kotłowski

TL;DR
This paper presents an alternative perspective on online learning by emphasizing Exponential Weights (EW), unifying many existing methods and bounds, and exploring the benefits of sampling from the EW posterior.
Contribution
It demonstrates that many standard online learning algorithms and their regret bounds follow as special cases of the EW framework by plugging in suitable surrogate losses and playing the EW posterior mean.
Findings
Online Gradient Descent is recovered by running EW with a Gaussian prior on linearized losses.
All instances of Online Mirror Descent based on regular Bregman divergences correspond to EW with a prior that depends on the mirror map.
Sampling from the EW posterior, instead of playing its mean, recovers the best-known rate in online bandit linear optimization.
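The first finding can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's own construction: it works in one dimension without a constraint set, takes a Gaussian prior with mean w1 and variance sigma^2, and uses linearized losses of the form w * g_t. Under those assumptions the EW posterior with learning rate eta is again Gaussian, with mean w1 - eta * sigma^2 * sum(g_t), which is the Online Gradient Descent iterate with step size eta * sigma^2. The function names and the grid-based integration are hypothetical choices made for this example.

```python
import math

def ew_posterior_mean(grads, eta, sigma, w1=0.0):
    """Approximate the EW posterior mean by a Riemann sum on a fine 1-D grid.

    Prior: Gaussian with mean w1 and variance sigma**2 (assumed).
    Losses: linearized, loss_t(w) = w * g_t (assumed).
    """
    total_grad = sum(grads)
    lo, hi, n = -20.0, 20.0, 40001
    step = (hi - lo) / (n - 1)
    num = 0.0
    den = 0.0
    for i in range(n):
        w = lo + i * step
        # log prior minus eta times the cumulative linearized loss
        log_weight = -((w - w1) ** 2) / (2.0 * sigma ** 2) - eta * total_grad * w
        weight = math.exp(log_weight)
        num += w * weight
        den += weight
    return num / den

def ogd(grads, step_size, w1=0.0):
    """Plain unconstrained Online Gradient Descent on the same gradients."""
    w = w1
    for g in grads:
        w -= step_size * g
    return w

grads = [0.5, -1.0, 0.25, 0.75]   # arbitrary illustrative gradients
eta, sigma = 0.3, 1.5
ew_mean = ew_posterior_mean(grads, eta, sigma)
ogd_iterate = ogd(grads, step_size=eta * sigma ** 2)
print(ew_mean, ogd_iterate)
```

The two printed values should agree up to the discretization error of the grid, which is how the correspondence between the posterior mean and the gradient step shows up in this toy setting.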
Abstract
A standard introduction to online learning might place Online Gradient Descent at its center and then proceed to develop generalizations and extensions like Online Mirror Descent and second-order methods. Here we explore the alternative approach of putting Exponential Weights (EW) first. We show that many standard methods and their regret bounds then follow as a special case by plugging in suitable surrogate losses and playing the EW posterior mean. For instance, we easily recover Online Gradient Descent by using EW with a Gaussian prior on linearized losses, and, more generally, all instances of Online Mirror Descent based on regular Bregman divergences also correspond to EW with a prior that depends on the mirror map. Furthermore, appropriate quadratic surrogate losses naturally give rise to Online Gradient Descent for strongly convex losses and to Online Newton Step. We further…
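The abstract's remark that quadratic surrogate losses give rise to Online Newton Step can be illustrated in a similar way. The sketch below rests on assumptions that are not taken from the paper: a zero-mean Gaussian prior with variance sigma^2 in two dimensions, no constraint set, and a surrogate of the form g_t . (w - w_t) + (beta / 2) * (g_t . (w - w_t))^2 centered at the played point w_t. With these choices the EW posterior stays Gaussian, so its mean can be computed in closed form and compared with a Newton-style update w_{t+1} = w_t - eta * A_t^{-1} g_t, where A_t accumulates eta * beta * g_t g_t^T. All helper names are hypothetical.

```python
def solve2(A, b):
    """Solve the 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]

def add_outer(A, g, scale):
    """Return A + scale * g g^T for a 2x2 matrix A."""
    return [[A[i][j] + scale * g[i] * g[j] for j in range(2)] for i in range(2)]

def ons_iterates(grads, eta, beta, sigma):
    """Newton-style iterates; A starts at the prior precision I / sigma**2."""
    A = [[1.0 / sigma ** 2, 0.0], [0.0, 1.0 / sigma ** 2]]
    w = [0.0, 0.0]
    iterates = [w]
    for g in grads:
        A = add_outer(A, g, eta * beta)
        direction = solve2(A, g)
        w = [w[i] - eta * direction[i] for i in range(2)]
        iterates.append(w)
    return iterates

def ew_gaussian_mean(grads, centers, eta, beta, sigma):
    """Closed-form mean of the Gaussian EW posterior for the assumed surrogates."""
    precision = [[1.0 / sigma ** 2, 0.0], [0.0, 1.0 / sigma ** 2]]
    linear = [0.0, 0.0]
    for g, c in zip(grads, centers):
        precision = add_outer(precision, g, eta * beta)
        inner = g[0] * c[0] + g[1] * c[1]
        # linear term of the exponent contributed by this surrogate
        linear = [linear[i] + eta * (beta * g[i] * inner - g[i]) for i in range(2)]
    return solve2(precision, linear)

grads = [[0.5, -1.0], [1.5, 0.25], [-0.75, 0.5], [0.2, 0.9]]  # illustrative
eta, beta, sigma = 0.5, 0.8, 2.0
iterates = ons_iterates(grads, eta, beta, sigma)
posterior_mean = ew_gaussian_mean(grads, iterates[:-1], eta, beta, sigma)
print(posterior_mean, iterates[-1])
```

The gradients here are fixed numbers rather than gradients of real losses evaluated at the iterates, which keeps the example self-contained; the point is only that the posterior mean and the last Newton-style iterate coincide up to floating-point error.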
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms
