An optimal algorithm for the Thresholding Bandit Problem
Andrea Locatelli, Maurilio Gutzeit, and Alexandra Carpentier

TL;DR
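This paper introduces a parameter-free algorithm for the thresholding bandit problem, a combinatorial pure exploration stochastic bandit problem with a fixed budget in which the learner must identify the arms whose means lie above a given threshold. The algorithm is proven optimal through matching upper and lower bounds.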
Contribution
It constructs provably optimal strategies for a non-trivial pure exploration setting with a fixed budget, which is, to the authors' knowledge, the first result of this kind.
Findings
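The proposed algorithm is proven optimal by deriving matching upper and lower bounds.
To the authors' knowledge, this is the first non-trivial fixed-budget pure exploration setting for which optimal strategies are constructed.
The algorithm is parameter-free and identifies the arms above the threshold up to a given precision within a fixed time horizon.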
Abstract
We study a specific \textit{combinatorial pure exploration stochastic bandit problem} where the learner aims at finding the set of arms whose means are above a given threshold, up to a given precision, and \textit{for a fixed time horizon}. We propose a parameter-free algorithm based on an original heuristic, and prove that it is optimal for this problem by deriving matching upper and lower bounds. To the best of our knowledge, this is the first non-trivial pure exploration setting with \textit{fixed budget} for which optimal strategies are constructed.
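To make the problem setting in the abstract concrete, here is a minimal Python sketch of a fixed-budget thresholding strategy. The abstract does not spell out the algorithm, so the allocation rule below is an assumption made for illustration: at each round it pulls the arm minimising `sqrt(pulls) * (|empirical mean - tau| + eps)`, which concentrates the budget on arms whose estimates sit close to the threshold. The names `thresholding_bandit`, `pull`, `tau`, and `eps` are hypothetical and should not be read as the authors' notation or implementation.

```python
import math
import random


def thresholding_bandit(pull, n_arms, budget, tau, eps=0.0):
    """Illustrative fixed-budget thresholding strategy (assumed rule).

    pull(i) returns one stochastic reward from arm i.
    Returns the set of arms estimated to be above tau, and the pull counts.
    """
    if budget < n_arms:
        raise ValueError("budget must allow at least one pull per arm")

    counts = [0] * n_arms
    sums = [0.0] * n_arms

    def observe(i):
        sums[i] += pull(i)
        counts[i] += 1

    def index(i):
        # Assumed index: small when the arm is under-sampled or when its
        # empirical mean is close to the threshold.
        mean = sums[i] / counts[i]
        return math.sqrt(counts[i]) * (abs(mean - tau) + eps)

    # Initialisation: one pull per arm so every empirical mean is defined.
    for i in range(n_arms):
        observe(i)

    # Spend the remaining budget on the arm with the smallest index.
    for _ in range(budget - n_arms):
        observe(min(range(n_arms), key=index))

    means = [s / c for s, c in zip(sums, counts)]
    accepted = {i for i, m in enumerate(means) if m >= tau}
    return accepted, counts
```

A possible usage, assuming Gaussian rewards with hypothetical means: `rng = random.Random(0)` followed by `thresholding_bandit(lambda i: rng.gauss([0.1, 0.3, 0.7, 0.9][i], 0.2), 4, 1000, tau=0.5)`. The sketch takes no problem-dependent tuning parameters beyond the threshold and precision, which mirrors the parameter-free property claimed in the abstract, but it carries none of the paper's guarantees.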
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
