Minimax and Bayes Optimal Best-Arm Identification
Masahiro Kato

TL;DR
This paper develops an adaptive sampling strategy for fixed-budget best-arm identification and proves that it is both asymptotically minimax optimal and Bayes optimal with respect to simple regret.
Contribution
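It introduces a two-stage adaptive procedure: a uniform pilot stage, followed by a stage that samples according to the ratio obtained by solving a Gaussian minimax game. The same game supplies the decision rule used to recommend an arm, and the resulting strategy is shown to be optimal for simple regret.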
Findings
The proposed strategy is asymptotically minimax optimal.
The strategy is also Bayes optimal for simple regret.
Upper bounds match the lower bounds exactly, including constants.
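For reference, the simple-regret criterion that these findings refer to can be written as below. This is the standard textbook definition, not a formula quoted from the paper. The notation ($K$ arms, budget $T$, mean outcome $\mu_a(P)$ under distribution $P$, recommended arm $\hat{a}_T$, prior $\Pi$) is assumed here for illustration and may differ from the paper's own.

```latex
% Expected simple regret of a strategy under distribution P
\[
r_T(P) \;=\; \mathbb{E}_P\!\left[\mu_{a^*}(P) - \mu_{\hat{a}_T}(P)\right],
\qquad a^* \in \arg\max_{a \in \{1,\dots,K\}} \mu_a(P).
\]
% Minimax criterion: worst case over a class of distributions \mathcal{P}
\[
\sup_{P \in \mathcal{P}} r_T(P)
\]
% Bayes criterion: average over a prior \Pi on \mathcal{P}
\[
\int r_T(P)\, \mathrm{d}\Pi(P)
\]
```

A strategy is minimax optimal if it attains the smallest achievable worst-case regret, and Bayes optimal if it attains the smallest achievable prior-averaged regret, in both cases asymptotically as $T \to \infty$.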
Abstract
This study investigates minimax and Bayes optimal strategies for fixed-budget best-arm identification. We consider an adaptive procedure consisting of a sampling phase followed by a recommendation phase, and we design an adaptive experiment within this framework to efficiently identify the best arm, defined as the one with the highest expected outcome. In our proposed strategy, the sampling phase consists of two stages. The first stage is a pilot phase, in which we allocate samples uniformly across arms to eliminate clearly suboptimal arms and to estimate outcome variances. Before entering the second stage, we solve a Gaussian minimax game, which yields a sampling ratio and a decision rule. In the second stage, samples are allocated according to this sampling ratio. After the sampling phase, the procedure enters the recommendation phase, where we select an arm using the decision rule.…
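The two-stage structure described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's algorithm. The abstract is truncated and does not give the solution of the Gaussian minimax game, so the sketch substitutes a simple stand-in: allocation proportional to the estimated standard deviation among surviving arms, with the empirical best arm as the decision rule. The function name `two_stage_bai`, the parameters `pilot_frac` and `elim_z`, and the elimination threshold are all hypothetical.

```python
import numpy as np

def two_stage_bai(means, sds, budget, pilot_frac=0.2, elim_z=3.0, seed=0):
    """Simulate a two-stage fixed-budget experiment on Gaussian arms.

    Assumes budget is comfortably larger than 2 * number of arms.
    Returns (recommended arm, stage-2 sampling ratio, simple regret).
    """
    rng = np.random.default_rng(seed)
    means = np.asarray(means, dtype=float)
    sds = np.asarray(sds, dtype=float)
    k = len(means)

    # Stage 1 (pilot): allocate samples uniformly across arms.
    n_pilot = max(2, int(pilot_frac * budget) // k)
    samples = [list(rng.normal(means[a], sds[a], n_pilot)) for a in range(k)]
    mu_hat = np.array([np.mean(s) for s in samples])
    sd_hat = np.array([np.std(s, ddof=1) for s in samples])

    # Eliminate clearly suboptimal arms: those trailing the empirical
    # leader by more than elim_z standard errors of the difference.
    # (Hypothetical rule; the paper's criterion is not in the abstract.)
    se = sd_hat / np.sqrt(n_pilot)
    leader = int(np.argmax(mu_hat))
    gap_se = np.sqrt(se**2 + se[leader]**2)
    keep = mu_hat >= mu_hat[leader] - elim_z * gap_se

    # Stand-in for the Gaussian minimax game (ASSUMPTION): sample
    # surviving arms in proportion to their estimated std deviation.
    ratio = np.where(keep, sd_hat, 0.0)
    ratio = ratio / ratio.sum()

    # Stage 2: allocate the remaining budget according to the ratio.
    remaining = budget - n_pilot * k
    counts = np.floor(ratio * remaining).astype(int)
    for a in range(k):
        if counts[a] > 0:
            samples[a].extend(rng.normal(means[a], sds[a], counts[a]))

    # Recommendation (ASSUMPTION): empirical best among surviving arms.
    final_mu = np.array([np.mean(s) for s in samples])
    final_mu[~keep] = -np.inf
    rec = int(np.argmax(final_mu))
    return rec, ratio, float(means.max() - means[rec])

if __name__ == "__main__":
    rec, ratio, regret = two_stage_bai([0.0, 0.2, 1.0], [1.0, 1.0, 1.0], budget=3000)
    print("recommended arm:", rec)
    print("stage-2 sampling ratio:", ratio)
    print("simple regret:", regret)
```

Replacing the stand-in allocation and recommendation steps with the actual solution of the Gaussian minimax game is what would turn this skeleton into the strategy the paper analyzes.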
Taxonomy
Topics: Advanced Statistical Process Monitoring
