The Batch Complexity of Bandit Pure Exploration
Adrienne Tuynman, Rémy Degenne

TL;DR
This paper investigates the minimal number of batches needed in fixed-confidence pure exploration for stochastic bandits, providing lower bounds, a new batched algorithm, and bounds on sample and batch complexities.
Contribution
It introduces an instance-dependent lower bound on batch complexity and proposes a general batched algorithm with proven upper bounds, advancing understanding of batched pure exploration.
Findings
Lower bounds on batch complexity for pure exploration
A new batched algorithm with provable guarantees
Application to best-arm identification and thresholding bandits
Abstract
In a fixed-confidence pure exploration problem in stochastic multi-armed bandits, an algorithm iteratively samples arms and should stop as early as possible and return the correct answer to a query about the arms' distributions. We are interested in batched methods, which change their sampling behaviour only a few times, between batches of observations. We give an instance-dependent lower bound on the number of batches used by any sample-efficient algorithm for any pure exploration task. We then give a general batched algorithm and prove upper bounds on its expected sample complexity and batch complexity. We illustrate both lower and upper bounds on best-arm identification and thresholding bandits.
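To make the batched setting concrete, here is a minimal sketch of a generic batched successive-elimination routine for best-arm identification. It is not the algorithm proposed in the paper. The function name `batched_elimination`, the Gaussian reward model with known `sigma`, the doubling batch schedule, and the particular confidence radius are all assumptions chosen for illustration. The sampling rule (which arms are still active) changes only between batches, and the routine reports how many batches and samples it used.

```python
import math
import random


def batched_elimination(means, delta=0.05, sigma=1.0, first_batch=16, seed=0):
    """Illustrative batched successive elimination (hypothetical helper).

    means: true arm means, used only to simulate Gaussian rewards.
    Returns (recommended_arm, number_of_batches, total_samples).
    """
    rng = random.Random(seed)
    k = len(means)
    active = list(range(k))
    sums = [0.0] * k
    pulls = 0          # samples drawn so far from each still-active arm
    batches = 0
    total = 0
    batch_size = first_batch

    while len(active) > 1:
        batches += 1
        # Within a batch the sampling rule is fixed: pull every active arm
        # the same number of times, with no adaptation.
        for a in active:
            for _ in range(batch_size):
                sums[a] += rng.gauss(means[a], sigma)
        pulls += batch_size
        total += batch_size * len(active)

        # Assumed confidence radius: sub-Gaussian bound with a union bound
        # over arms and batches.
        radius = sigma * math.sqrt(
            2.0 * math.log(4.0 * k * batches ** 2 / delta) / pulls
        )
        best = max(sums[a] / pulls for a in active)
        # Adapt only between batches: drop arms whose upper confidence bound
        # falls below the lower confidence bound of the empirical leader.
        active = [a for a in active if sums[a] / pulls + 2.0 * radius >= best]

        # Doubling schedule: the next batch doubles the per-arm sample count.
        batch_size = pulls

    return active[0], batches, total


if __name__ == "__main__":
    arm, n_batches, n_samples = batched_elimination([0.2, 0.5, 0.9])
    print(f"recommended arm {arm} after {n_batches} batches, {n_samples} samples")
```

With a doubling schedule the number of batches grows only logarithmically in the per-arm sample count, which is the kind of trade-off between sample complexity and batch complexity that the abstract refers to.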
Taxonomy
Topics: Advanced Bandit Algorithms Research · Data Stream Mining Techniques · Auction Theory and Applications
