Online learning over a finite action set with limited switching

Jason Altschuler; Kunal Talwar

arXiv:1803.01548·cs.LG·November 16, 2021


TL;DR

This paper studies switching costs and switching budgets in online learning and adversarial multi-armed bandits. It gives the first high-probability guarantees for the switching-cost setting and a complete characterization of the complexity of the switching-budget setting.

Contribution

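The paper introduces the first algorithms that achieve, with high probability, the optimal order for both regret and number of switches in Prediction From Experts (PFE) with switching costs. It also fully characterizes the complexity of switching budgets in online learning and bandits.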

Findings

01. First high-probability algorithms achieving the optimal order for both regret and number of switches.

02. Complete characterization of the complexity of switching budgets for Prediction From Experts (PFE) and Adversarial Multi-Armed Bandits (MAB).

03. In bandits, the minimax rate decays steadily as the switching budget grows.

Abstract

This paper studies the value of switching actions in the Prediction From Experts (PFE) problem and Adversarial Multi-Armed Bandits (MAB) problem. First, we revisit the well-studied and practically motivated setting of PFE with switching costs. Many algorithms are known to achieve the minimax optimal order of $O(\sqrt{T \log n})$ in expectation for both regret and number of switches, where $T$ is the number of iterations and $n$ the number of actions. However, no high probability (h.p.) guarantees are known. Our main technical contribution is the first algorithms which with h.p. achieve this optimal order for both regret and switches. This settles an open problem of [Devroye et al., 2015], and directly implies the first h.p. guarantees for several problems of interest. Next, to investigate the value of switching actions at a more granular level, we introduce the setting of switching…
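
To make the switching-cost setting concrete, here is a minimal sketch of a lazy multiplicative-weights learner in the style known in the literature as Shrinking Dartboard. It is an example of an algorithm with in-expectation guarantees of the kind the abstract mentions, not the high-probability algorithm contributed by this paper. The function names, the learning-rate helper `default_eta`, and the loss format (a list of rounds, each a list of per-action losses in [0, 1]) are assumptions made for illustration.

```python
import math
import random


def default_eta(T, n):
    # Assumed tuning: a common choice of order sqrt(log n / T), capped at 1/2.
    return min(0.5, math.sqrt(math.log(n) / T))


def shrinking_dartboard(losses, eta, rng):
    """Lazy multiplicative weights (illustrative sketch).

    losses: list of T rounds, each a list of n losses in [0, 1].
    Returns (total_loss, switches, regret) for one sampled run.
    """
    T = len(losses)
    n = len(losses[0])
    weights = [1.0] * n
    current = rng.randrange(n)  # uniform draw matches the uniform initial weights
    total_loss = 0.0
    switches = 0
    for t, loss_t in enumerate(losses):
        total_loss += loss_t[current]
        # Multiplicative update: an action's weight shrinks with its loss.
        new_weights = [w * (1.0 - eta) ** l for w, l in zip(weights, loss_t)]
        if t < T - 1:
            # Keep the current action with probability w_new / w_old;
            # otherwise resample from the updated weight distribution.
            keep_prob = new_weights[current] / weights[current]
            if rng.random() >= keep_prob:
                new_action = rng.choices(range(n), weights=new_weights)[0]
                if new_action != current:
                    switches += 1
                current = new_action
        weights = new_weights
    best = min(sum(round_losses[i] for round_losses in losses) for i in range(n))
    return total_loss, switches, total_loss - best
```

A call such as `shrinking_dartboard(losses, default_eta(len(losses), len(losses[0])), random.Random(0))` returns the loss, switch count, and regret of a single run. The lazy resampling step is what keeps the number of switches small: the learner changes action only when the current action's weight has shrunk, rather than redrawing from the weight distribution every round.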
