Problem Dependent View on Structured Thresholding Bandit Problems
James Cheshire, Pierre Ménard, Alexandra Carpentier

TL;DR
This paper explores the stochastic Thresholding Bandit problem under shape constraints, providing bounds and algorithms for monotonic and concave mean sequences in a fixed budget setting.
Contribution
It introduces problem-dependent bounds and algorithms for structured TBP with monotonic and concave mean sequences, extending the unstructured case.
Findings
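The upper and lower bounds on the probability of error match in the problem-dependent regime up to constants.
The paper provides algorithms with theoretical guarantees for both the monotone and concave settings.
It analyzes the probability of error, i.e. the probability of misclassifying at least one arm, in structured TBP settings.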
Abstract
We investigate the problem dependent regime in the stochastic Thresholding Bandit problem (TBP) under several shape constraints. In the TBP, the objective of the learner is to output, at the end of a sequential game, the set of arms whose means are above a given threshold. The vanilla, unstructured, case is already well studied in the literature. Taking $K$ as the number of arms, we consider the case where (i) the sequence of arm's means $(\mu_k)_{k=1}^K$ is monotonically increasing (MTBP) and (ii) the case where $(\mu_k)_{k=1}^K$ is concave (CTBP). We consider both cases in the problem dependent regime and study the probability of error, i.e. the probability to misclassify at least one arm. In the fixed budget setting, we provide upper and lower bounds for the probability of error in both the concave and monotone settings, as well as associated algorithms. In both settings the bounds…
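To make the problem setup in the abstract concrete, here is a minimal sketch of the fixed-budget thresholding task using a naive uniform-allocation baseline. This is not the paper's algorithm: the function names `uniform_tbp` and `is_error`, the unit-variance Gaussian rewards, and the equal split of the budget across arms are all assumptions chosen for illustration. It only shows what "output the set of arms above the threshold" and "probability of error" refer to.

```python
import random


def uniform_tbp(means, threshold, budget, seed=0):
    """Naive uniform-allocation baseline (hypothetical, NOT the paper's method).

    Splits the budget equally across arms, then returns the set of arm
    indices whose empirical mean is at or above the threshold.
    """
    k = len(means)
    pulls = budget // k  # equal share of the budget per arm
    if pulls < 1:
        raise ValueError("budget must allow at least one pull per arm")
    rng = random.Random(seed)
    estimates = []
    for mu in means:
        # Assumption: unit-variance Gaussian rewards around the true mean.
        samples = [rng.gauss(mu, 1.0) for _ in range(pulls)]
        estimates.append(sum(samples) / pulls)
    return {i for i, est in enumerate(estimates) if est >= threshold}


def is_error(predicted, means, threshold):
    """True if at least one arm is misclassified relative to the true means."""
    truth = {i for i, mu in enumerate(means) if mu >= threshold}
    return predicted != truth


# A monotonically increasing sequence of means, as in the MTBP setting.
means = [0.0, 0.2, 0.8, 1.0]
predicted = uniform_tbp(means, threshold=0.5, budget=4000)
print(sorted(predicted), is_error(predicted, means, threshold=0.5))
```

Averaging `is_error` over many seeds gives a Monte Carlo estimate of the probability of error for this baseline. A structure-aware method would presumably exploit monotonicity or concavity rather than sampling every arm equally, but that is beyond this sketch.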
Taxonomy
Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems
