Follow the Leader If You Can, Hedge If You Must

Steven de Rooij; Tim van Erven; Peter D. Grünwald; Wouter M. Koolen

arXiv:1301.0534·cs.LG·August 31, 2021·98 cites

TL;DR

The paper introduces FlipFlop, a novel algorithm that combines the strengths of Follow-the-Leader and hedging strategies, achieving low regret in stochastic settings and strong worst-case guarantees.

Contribution

It presents FlipFlop, the first method that provably combines FTL's low regret on non-adversarial data with the worst-case robustness of hedging strategies. It also develops AdaHedge, which tunes the learning rate in Hedge dynamically without the doubling trick.

Findings

01

FlipFlop achieves regret within a constant factor of the FTL regret without sacrificing AdaHedge's worst-case guarantees.

02

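AdaHedge tunes the learning rate in Hedge dynamically without the doubling trick, refining a method by Cesa-Bianchi, Mansour and Stoltz (2007) and yielding slightly improved worst-case guarantees.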

03

Both algorithms are invariant under loss rescaling and can handle negative losses.

Abstract

Follow-the-Leader (FTL) is an intuitive sequential prediction strategy that guarantees constant regret in the stochastic setting, but has terrible performance for worst-case data. Other hedging strategies have better worst-case guarantees but may perform much worse than FTL if the data are not maximally adversarial. We introduce the FlipFlop algorithm, which is the first method that provably combines the best of both worlds. As part of our construction, we develop AdaHedge, which is a new way of dynamically tuning the learning rate in Hedge without using the doubling trick. AdaHedge refines a method by Cesa-Bianchi, Mansour and Stoltz (2007), yielding slightly improved worst-case guarantees. By interleaving AdaHedge and FTL, the FlipFlop algorithm achieves regret within a constant factor of the FTL regret, without sacrificing AdaHedge's worst-case guarantees. AdaHedge and FlipFlop…
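
The trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's AdaHedge or FlipFlop: it compares plain FTL against Hedge with a fixed learning rate on two hand-made, two-expert loss sequences. The function names, the sequences, and the fixed-horizon tuning eta = sqrt(8 ln K / T) are assumptions made for illustration and do not come from the source.

```python
import math


def best_expert_loss(losses):
    """Cumulative loss of the best single expert in hindsight."""
    k = len(losses[0])
    return min(sum(row[i] for row in losses) for i in range(k))


def ftl_total_loss(losses):
    """Follow-the-Leader: play the expert with the lowest cumulative loss so far.

    Ties are split uniformly among the current leaders (an assumption).
    """
    k = len(losses[0])
    cum = [0.0] * k
    total = 0.0
    for row in losses:
        best = min(cum)
        leaders = [i for i in range(k) if cum[i] == best]
        total += sum(row[i] for i in leaders) / len(leaders)
        for i in range(k):
            cum[i] += row[i]
    return total


def hedge_total_loss(losses, eta):
    """Hedge with a FIXED learning rate eta (not the paper's AdaHedge)."""
    k = len(losses[0])
    cum = [0.0] * k
    total = 0.0
    for row in losses:
        shift = min(cum)  # subtract the minimum for numerical stability
        weights = [math.exp(-eta * (c - shift)) for c in cum]
        norm = sum(weights)
        total += sum(w * l for w, l in zip(weights, row)) / norm
        for i in range(k):
            cum[i] += row[i]
    return total


def alternating_losses(rounds):
    """Hypothetical hard sequence: the leader changes in every round."""
    rows = [(0.5, 0.0)]
    for t in range(rounds - 1):
        rows.append((0.0, 1.0) if t % 2 == 0 else (1.0, 0.0))
    return rows


T = 100
eta = math.sqrt(8 * math.log(2) / T)  # standard fixed-horizon tuning, K = 2

hard = alternating_losses(T)   # leader is wrong in every round
easy = [(0.0, 1.0)] * T        # one expert is always better

for name, data in [("hard", hard), ("easy", easy)]:
    best = best_expert_loss(data)
    print(name,
          "FTL regret:", ftl_total_loss(data) - best,
          "Hedge regret:", hedge_total_loss(data, eta) - best)
```

On the alternating sequence FTL's regret is 49.75 after 100 rounds and grows linearly with the number of rounds, while fixed-rate Hedge stays within its sqrt(T ln(K) / 2) bound. On the easy sequence the ordering flips: FTL's regret is 0.5 and Hedge's is larger. That gap on easy data is what the abstract says FlipFlop is designed to close.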

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning