An Algorithm for Stochastic and Adversarial Bandits with Switching Costs

Chloé Rouyer; Yevgeny Seldin; Nicolò Cesa-Bianchi

arXiv:2102.09864 · cs.LG · February 22, 2021 · 1 citation

TL;DR

This paper introduces a versatile algorithm for multiarmed bandits with switching costs, achieving optimal regret bounds in both stochastic and adversarial settings without prior knowledge of the environment.

Contribution

The paper adapts the Tsallis-INF algorithm to handle switching costs, providing minimax optimal regret bounds across different regimes and extending to time-varying switching costs.

Findings

01. Achieves minimax optimal regret bounds in adversarial and stochastic regimes.

02. Performs competitively with baseline algorithms in various settings.

03. Handles environments with changing switching costs over time.

Abstract

We propose an algorithm for stochastic and adversarial multiarmed bandits with switching costs, where the algorithm pays a price $λ$ every time it switches the arm being played. Our algorithm is based on an adaptation of the Tsallis-INF algorithm of Zimmert and Seldin (2021) and requires no prior knowledge of the regime or time horizon. In the oblivious adversarial setting it achieves the minimax optimal regret bound of $O((λ K)^{1/3} T^{2/3} + \sqrt{K T})$, where $T$ is the time horizon and $K$ is the number of arms. In the stochastically constrained adversarial regime, which includes the stochastic regime as a special case, it achieves a regret bound of $O(((λ K)^{2/3} T^{1/3} + \ln T) \sum_{i \neq i^{*}} Δ_{i}^{-1})$, where $Δ_{i}$ are the suboptimality gaps and $i^{*}$ is a unique optimal arm. In the special case of $λ = 0$ (no…
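To get a feel for how the two regret rates in the abstract scale, the following is a minimal sketch that evaluates their orders numerically. It is an illustration only, not the authors' code: the hidden constants of the $O(\cdot)$ notation are dropped, the second term of the adversarial bound is taken to be $\sqrt{K T}$ (the standard minimax rate for bandits without switching costs), and the function names and the example gaps are hypothetical.

```python
import math


def adversarial_bound(lam, K, T):
    """Order of the adversarial bound: (lam*K)^(1/3) * T^(2/3) + sqrt(K*T).

    Constants hidden by the O-notation are dropped (assumption).
    """
    return (lam * K) ** (1 / 3) * T ** (2 / 3) + math.sqrt(K * T)


def stochastic_bound(lam, K, T, gaps):
    """Order of the stochastically constrained bound:
    ((lam*K)^(2/3) * T^(1/3) + ln T) * sum over suboptimal arms of 1/gap.

    `gaps` holds the positive suboptimality gaps of the K - 1 suboptimal
    arms; the unique optimal arm is excluded from the sum.
    """
    if len(gaps) != K - 1 or any(g <= 0 for g in gaps):
        raise ValueError("expected K - 1 strictly positive gaps")
    inverse_gap_sum = sum(1.0 / g for g in gaps)
    switching_term = (lam * K) ** (2 / 3) * T ** (1 / 3)
    return (switching_term + math.log(T)) * inverse_gap_sum


if __name__ == "__main__":
    K, T = 8, 10_000
    gaps = [0.1] * (K - 1)  # hypothetical gaps, for illustration only
    for lam in (0.0, 0.1, 1.0):
        print(lam, adversarial_bound(lam, K, T), stochastic_bound(lam, K, T, gaps))
```

With $λ = 0$ the switching terms vanish, leaving $\sqrt{K T}$ in the adversarial case and $\ln T \sum_{i \neq i^{*}} Δ_{i}^{-1}$ in the stochastically constrained case.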

Videos

An Algorithm for Stochastic and Adversarial Bandits with Switching Costs · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Adversarial Robustness in Machine Learning · Gaussian Processes and Bayesian Inference