Random Shuffling and Resets for the Non-stationary Stochastic Bandit Problem

Robin Allesiardo; Raphaël Féraud; Odalric-Ambrym Maillard

arXiv:1609.02139·cs.AI·September 9, 2016·1 cites

TL;DR

This paper adds random shuffling of the arms to Successive Elimination for non-stationary stochastic bandits. The shuffled variant keeps the original guarantees without requiring stationary reward distributions, and the algorithm is further adapted to best-arm identification with switching arms.

Contribution

The paper proposes a version of Successive Elimination based on random shuffling of the arms, enabling best-arm identification and regret control when rewards are not identically distributed.

Findings

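01 Under a mild assumption on the mean gap, the shuffled variant achieves the same sample complexity and cumulative regret guarantees as the original Successive Elimination, without requiring stationary distributions.

02 The original Successive Elimination, without shuffling, fails to control regret in the non-stationary setting.

03 With the gap assumption removed, the algorithm is adapted to best-arm identification with switching arms, and bounds are provided for this setting.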

Abstract

We consider a non-stationary formulation of the stochastic multi-armed bandit where the rewards are no longer assumed to be identically distributed. For the best-arm identification task, we introduce a version of Successive Elimination based on random shuffling of the $K$ arms. We prove that under a novel and mild assumption on the mean gap $Δ$, this simple but powerful modification achieves the same guarantees in terms of sample complexity and cumulative regret as its original version, but in a much wider class of problems, as it is no longer constrained to stationary distributions. We also show that the original Successive Elimination fails to have controlled regret in this more general scenario, thus showing the benefit of shuffling. We then remove our mild assumption and adapt the algorithm to the best-arm identification task with switching arms. We adapt the…
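To make the idea in the abstract concrete, here is a minimal Python sketch of Successive Elimination with the order of the remaining arms shuffled at the start of every round. It is an illustration under stated assumptions, not the paper's exact algorithm: the function name, the `pull(arm, t)` reward callback, the Hoeffding-style confidence radius and its constants, and the elimination threshold are all hypothetical choices made for this example. Rewards are assumed to lie in [0, 1].

```python
import math
import random


def shuffled_successive_elimination(pull, n_arms, delta=0.05,
                                    max_rounds=10_000, rng=None):
    """Sketch of Successive Elimination with per-round random shuffling.

    pull(arm, t) is a hypothetical callback returning a reward in [0, 1]
    for playing `arm` at global time step `t`; the reward distribution
    may depend on t (non-stationary).
    Returns (index of the retained arm, number of pulls used).
    """
    rng = rng or random.Random()
    active = list(range(n_arms))
    sums = [0.0] * n_arms
    t = 0  # global time step across all pulls

    for round_idx in range(1, max_rounds + 1):
        if len(active) == 1:
            break

        # Shuffle the play order so that no arm is systematically
        # sampled at the same position within a round.
        rng.shuffle(active)
        for arm in active:
            sums[arm] += pull(arm, t)
            t += 1

        # Every active arm has been played exactly round_idx times.
        means = {arm: sums[arm] / round_idx for arm in active}
        best = max(means.values())

        # Assumed Hoeffding-style radius; the constants are illustrative.
        radius = math.sqrt(
            math.log(4 * n_arms * round_idx ** 2 / delta) / (2 * round_idx)
        )

        # Keep only arms that are not confidently worse than the leader.
        active = [arm for arm in active if best - means[arm] < 2 * radius]

    # All surviving arms share the same pull count, so sums are comparable.
    return max(active, key=lambda a: sums[a]), t


if __name__ == "__main__":
    base = [0.1, 0.1, 0.9]
    # Deterministic, time-varying rewards used purely as a toy example.
    drifting = lambda arm, t: base[arm] + 0.05 * math.sin(t)
    print(shuffled_successive_elimination(drifting, 3, rng=random.Random(0)))
```

The only change relative to a plain round-robin Successive Elimination loop is the `rng.shuffle(active)` call; removing it gives the fixed play order that the abstract says can fail to control regret when rewards are not identically distributed.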

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Auction Theory and Applications