Information-Directed Selection for Top-Two Algorithms
Wei You, Chao Qin, Zihao Wang, Shuoguang Yang

TL;DR
This paper introduces an information-directed selection (IDS) rule for top-two algorithms in multi-armed bandits, extends the top-two design principle to best-k-arm identification, proves that top-two Thompson sampling with IDS is asymptotically optimal for Gaussian best-arm identification, and reports improved empirical performance.
Contribution
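It extends the top-two algorithm framework to best-k-arm identification, introduces IDS for choosing which of the two candidate arms to measure, and proves that top-two Thompson sampling combined with IDS is asymptotically optimal for Gaussian best-arm identification.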
Findings
IDS-based top-two algorithms outperform non-adaptive methods.
Top-two Thompson sampling with IDS is proven asymptotically optimal for Gaussian best-arm identification.
Numerical results show significant performance improvements.
Abstract
We consider the best-k-arm identification problem for multi-armed bandits, where the objective is to select the exact set of k arms with the highest mean rewards by sequentially allocating measurement effort. We characterize the necessary and sufficient conditions for the optimal allocation using dual variables. Remarkably, these optimality conditions lead to an extension of the top-two algorithm design principle (Russo, 2020), initially proposed for best-arm identification. Furthermore, our optimality conditions induce a simple and effective selection rule, dubbed information-directed selection (IDS), that selects one of the top-two candidates based on a measure of information gain. As a theoretical guarantee, we prove that, integrated with IDS, top-two Thompson sampling is (asymptotically) optimal for Gaussian best-arm identification, solving a glaring open problem in the pure exploration…
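The loop described in the abstract (propose two candidate arms, then pick one of them by information gain) can be sketched roughly as follows for the simplest case of Gaussian best-arm identification (k = 1). This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the flat-prior posterior, the known common variance `sigma`, the resampling cap with its fallback, and the specific information-gain formula (sample count times the Gaussian KL divergence to the pooled mean of the two candidates) are all assumptions introduced here, since the abstract does not spell out the rule.

```python
import random


def ids_leader_probability(n_i, n_j, mean_i, mean_j, sigma=1.0):
    """Probability of measuring the leader i rather than the challenger j.

    Assumed rule: weight each candidate by (sample count) x (Gaussian KL
    divergence from its empirical mean to the pooled mean of the pair).
    """
    pooled = (n_i * mean_i + n_j * mean_j) / (n_i + n_j)
    gain_i = n_i * (mean_i - pooled) ** 2 / (2.0 * sigma ** 2)
    gain_j = n_j * (mean_j - pooled) ** 2 / (2.0 * sigma ** 2)
    total = gain_i + gain_j
    if total == 0.0:  # identical empirical means: no preference
        return 0.5
    return gain_i / total


def top_two_ts_ids(true_means, sigma=1.0, budget=500, seed=0, max_resample=100):
    """Top-two Thompson sampling with an IDS-style selection step (sketch)."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k

    def pull(arm):
        counts[arm] += 1
        sums[arm] += rng.gauss(true_means[arm], sigma)

    def mean(arm):
        return sums[arm] / counts[arm]

    def sampled_best():
        # Flat-prior posterior for arm a: N(empirical mean, sigma^2 / count).
        draws = [rng.gauss(mean(a), sigma / counts[a] ** 0.5) for a in range(k)]
        return max(range(k), key=draws.__getitem__)

    for arm in range(k):  # one initial measurement per arm
        pull(arm)

    for _ in range(budget - k):
        leader = sampled_best()
        challenger = leader
        for _ in range(max_resample):  # resample until a different arm wins
            challenger = sampled_best()
            if challenger != leader:
                break
        else:
            # Practical shortcut (assumption): fall back to the best
            # empirical arm other than the leader when resampling stalls.
            challenger = max((a for a in range(k) if a != leader), key=mean)
        h = ids_leader_probability(
            counts[leader], counts[challenger],
            mean(leader), mean(challenger), sigma,
        )
        pull(leader if rng.random() < h else challenger)

    best = max(range(k), key=mean)
    return best, counts
```

Under these assumptions the selection probability simplifies to n_j / (n_i + n_j), so the less-sampled candidate is favored; for example, `ids_leader_probability(10, 30, 1.0, 0.0)` evaluates to 0.75. A call such as `top_two_ts_ids([0.0, 0.2, 1.0])` returns the index of the empirically best arm together with the per-arm measurement counts.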
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Distributed Sensor Networks and Detection Algorithms
