Better Best of Both Worlds Bounds for Bandits with Switching Costs

Idan Amir; Guy Azov; Tomer Koren; Roi Livni

arXiv:2206.03098·cs.LG·November 3, 2022·5 cites

TL;DR

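This paper introduces a simple algorithm for bandits with switching costs that achieves minimax optimal $O(T^{2/3})$ regret in the oblivious adversarial regime and $O(\min\{\log(T)/\Delta^{2}, T^{2/3}\})$ regret in the stochastically-constrained regime, improving on the previous $O(T^{1/3}/\Delta)$ bound of Rouyer et al.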

Contribution

The paper presents a novel algorithm that attains minimax optimal regret in adversarial settings and improved bounds in stochastic regimes for bandits with switching costs.
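To make the quantity being bounded concrete, here is a minimal sketch of how regret with unit switching costs is typically computed for a fixed sequence of plays. It is not taken from the paper: the function name `switching_regret`, the loss-table layout, and the `switch_cost` parameter are illustrative assumptions, and the definition used is the standard one (loss of the played arms, plus one unit per arm change, minus the loss of the best fixed arm).

```python
from typing import Sequence


def switching_regret(
    losses: Sequence[Sequence[float]],
    actions: Sequence[int],
    switch_cost: float = 1.0,
) -> float:
    """Regret against the best fixed arm, with a cost charged per arm change.

    losses[t][a] is the loss of arm a at round t (hypothetical layout).
    actions[t] is the arm the learner played at round t.
    """
    T = len(actions)
    n_arms = len(losses[0])

    # Loss actually suffered by the learner on the arms it played.
    learner_loss = sum(losses[t][actions[t]] for t in range(T))

    # One switch is counted whenever consecutive rounds use different arms.
    n_switches = sum(1 for t in range(1, T) if actions[t] != actions[t - 1])

    # Cumulative loss of the best single arm in hindsight (it never switches).
    best_fixed = min(sum(losses[t][a] for t in range(T)) for a in range(n_arms))

    return learner_loss + switch_cost * n_switches - best_fixed


if __name__ == "__main__":
    toy_losses = [[0.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    # Chasing the per-round best arm avoids loss but pays for two switches.
    print(switching_regret(toy_losses, [0, 1, 0, 0]))
    # Staying on arm 0 pays one unit of loss and no switching cost.
    print(switching_regret(toy_losses, [0, 0, 0, 0]))
```

The toy example shows the tension that switching costs introduce: following the best arm round by round can cost more than committing to one arm, which is why algorithms in this setting limit how often they change arms.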

Findings

01

Achieves $\tilde{O}(T^{2/3})$ regret in the adversarial setting.

02

Attains $\tilde{O}(\frac{\log(T)}{\Delta^{2}})$ regret in the stochastic setting.

03

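Provides a lower bound showing that $\tilde{\Omega}(\min\{1/\Delta^{2}, T^{2/3}\})$ regret is unavoidable in the stochastically-constrained case for any algorithm with $O(T^{2/3})$ worst-case regret.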

Abstract

We study best-of-both-worlds algorithms for bandits with switching cost, recently addressed by Rouyer, Seldin and Cesa-Bianchi (2021). We introduce a surprisingly simple and effective algorithm that simultaneously achieves a minimax optimal regret bound of $O(T^{2/3})$ in the oblivious adversarial setting and a bound of $O(\min\{\log(T)/\Delta^{2}, T^{2/3}\})$ in the stochastically-constrained regime, both with (unit) switching costs, where $\Delta$ is the gap between the arms. In the stochastically-constrained case, our bound improves over previous results due to Rouyer et al., who achieved regret of $O(T^{1/3}/\Delta)$. We accompany our results with a lower bound showing that, in general, $\tilde{\Omega}(\min\{1/\Delta^{2}, T^{2/3}\})$ regret is unavoidable in the stochastically-constrained case for algorithms with $O(T^{2/3})$ worst-case regret.
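As a rough numerical illustration of how the three rates in the abstract relate, the sketch below evaluates each expression with all hidden constants set to 1 and the logarithmic factors inside $\tilde{\Omega}$ dropped. The function names are hypothetical and the outputs indicate growth rates only, not actual regret values.

```python
import math


def new_upper_bound(T: int, gap: float) -> float:
    """min{log(T)/gap^2, T^(2/3)}: the stochastically-constrained bound stated above."""
    return min(math.log(T) / gap**2, T ** (2 / 3))


def previous_upper_bound(T: int, gap: float) -> float:
    """T^(1/3)/gap: the earlier rate attributed to Rouyer et al."""
    return T ** (1 / 3) / gap


def lower_bound(T: int, gap: float) -> float:
    """min{1/gap^2, T^(2/3)}: the lower-bound rate, log factors ignored."""
    return min(1.0 / gap**2, T ** (2 / 3))


if __name__ == "__main__":
    T = 10**6
    for gap in (0.5, 0.1, 0.01):
        print(
            f"gap={gap}: new={new_upper_bound(T, gap):.1f}, "
            f"previous={previous_upper_bound(T, gap):.1f}, "
            f"lower={lower_bound(T, gap):.1f}"
        )
```

With constants suppressed, the new rate no longer grows with $T$ beyond the $\log(T)$ factor once the gap is large enough, whereas the previous rate still scales as $T^{1/3}$. For very small gaps the $T^{2/3}$ term takes over and the comparison depends on constants and logarithmic factors that this sketch does not model.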

Videos

Better Best of Both Worlds Bounds for Bandits with Switching Costs · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Optimization and Search Problems