Mirror Descent Meets Fixed Share (and feels no regret)

Nicolò Cesa-Bianchi; Pierre Gaillard (INRIA Paris - Rocquencourt; DMA); Gabor Lugosi (ICREA); Gilles Stoltz (INRIA Paris - Rocquencourt; DMA; GREGH)

arXiv:1202.3323 · cs.LG · September 28, 2012 · 25 cites

Open Access

TL;DR

This paper unifies and extends the analysis of mirror descent with an entropic regularizer, showing that the projection and weight sharing approaches deliver essentially equivalent bounds on a notion of regret that generalizes shifting, adaptive, and discounted regret.

Contribution

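A unified analysis showing that projection and weight sharing yield essentially equivalent regret bounds for mirror descent with an entropic regularizer, which also captures and extends the generalized weight sharing technique of Bousquet and Warmuth.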

Findings

01 Unified analysis of projection and weight sharing techniques

02 Extended regret bounds covering shifting, adaptive, and discounted regrets

03 Potential improvements for small losses and adaptive parameter tuning

Abstract

Mirror descent with an entropic regularizer is known to achieve shifting regret bounds that are logarithmic in the dimension. This is done using either a carefully designed projection or by a weight sharing technique. Via a novel unified analysis, we show that these two approaches deliver essentially equivalent bounds on a notion of regret generalizing shifting, adaptive, discounted, and other related regrets. Our analysis also captures and extends the generalized weight sharing technique of Bousquet and Warmuth, and can be refined in several ways, including improvements for small losses and adaptive tuning of parameters.
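
To make the weight sharing idea concrete, here is a minimal sketch of one round of the classic Fixed Share update over d experts: an exponential-weights loss update (the entropic mirror descent step), followed by mixing a fraction of the mass back toward the uniform distribution. The function names, the learning rate `eta`, and the sharing rate `alpha` are assumptions made for illustration. This is the textbook form of the update, not the generalized scheme analyzed in the paper.

```python
import math


def fixed_share_update(weights, losses, eta, alpha):
    """One round of a Fixed Share style update (illustrative sketch).

    weights: current probability vector over d experts
    losses:  per-expert losses observed this round
    eta:     learning rate (assumed parameter name)
    alpha:   sharing rate in [0, 1] (assumed parameter name)
    """
    d = len(weights)
    # Loss update: multiplicative (exponential) reweighting, then normalize.
    unnormalized = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    total = sum(unnormalized)
    v = [u / total for u in unnormalized]
    # Shared update: mix with the uniform distribution so that every
    # expert keeps at least alpha / d of the mass.
    return [alpha / d + (1.0 - alpha) * vi for vi in v]


def run_fixed_share(loss_rows, eta, alpha):
    """Run the update over a sequence of loss vectors.

    Returns the final weights and the forecaster's cumulative expected loss.
    """
    d = len(loss_rows[0])
    weights = [1.0 / d] * d
    cumulative = 0.0
    for losses in loss_rows:
        cumulative += sum(w * l for w, l in zip(weights, losses))
        weights = fixed_share_update(weights, losses, eta, alpha)
    return weights, cumulative
```

Setting `alpha = 0` recovers plain exponential weights. A positive `alpha` keeps every weight bounded away from zero, which is what lets the forecaster recover quickly when the best expert changes.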

Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms