On Finding the Largest Mean Among Many
Kevin Jamieson, Matthew Malloy, Robert Nowak, Sebastien Bubeck

TL;DR
This paper studies the sample complexity of identifying the distribution with the largest mean (the best arm) in multi-armed bandit problems, and introduces a new adaptive algorithm, PRISM, whose sample complexity is linear in the number of arms in many cases.
Contribution
The paper presents a single-parameter multi-armed bandit model whose sample complexity ranges from linear to superlinear, and introduces PRISM, an adaptive best-arm identification algorithm with linear sample complexity for a wide range of mean distributions.
Findings
PRISM achieves linear sample complexity in many scenarios.
Adaptive procedures outperform non-adaptive ones in sample efficiency.
Non-adaptive methods can require polynomially more samples than adaptive methods.
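The contrast between adaptive and non-adaptive sampling in the findings above can be illustrated with a small simulation. The sketch below is not PRISM, whose details are not given on this page. It swaps in successive elimination, a standard adaptive procedure, and compares it with uniform (non-adaptive) sampling. The unit-variance Gaussian rewards, the confidence radius, and all function names (`pull`, `uniform_sampling`, `successive_elimination`) are assumptions made for illustration.

```python
import math
import random


def pull(mean, rng):
    """Draw one sample from an arm (assumed unit-variance Gaussian)."""
    return rng.gauss(mean, 1.0)


def uniform_sampling(means, pulls_per_arm, rng):
    """Non-adaptive baseline: sample every arm equally often,
    then return the arm with the largest empirical mean."""
    averages = [
        sum(pull(m, rng) for _ in range(pulls_per_arm)) / pulls_per_arm
        for m in means
    ]
    best = max(range(len(means)), key=lambda i: averages[i])
    return best, pulls_per_arm * len(means)


def successive_elimination(means, delta, rng, max_rounds=100_000):
    """Adaptive baseline (not PRISM): each round, pull every surviving arm
    once and drop arms that are confidently worse than the current leader."""
    n = len(means)
    active = list(range(n))
    sums = [0.0] * n
    total = 0
    t = 0
    for t in range(1, max_rounds + 1):
        for i in active:
            sums[i] += pull(means[i], rng)
            total += 1
        # Confidence radius from a union bound over arms and rounds
        # (an assumed, conservative choice for unit-variance Gaussians).
        radius = math.sqrt(2.0 * math.log(4.0 * n * t * t / delta) / t)
        leader = max(sums[i] / t for i in active)
        # Keep arm i only if its upper bound reaches the leader's lower bound.
        active = [i for i in active if sums[i] / t + 2.0 * radius >= leader]
        if len(active) == 1:
            break
    # Every surviving arm has been pulled exactly t times.
    best = max(active, key=lambda i: sums[i] / t)
    return best, total


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical instance: one best arm, one close competitor, eight easy arms.
    arm_means = [1.0, 0.5] + [0.0] * 8
    print("adaptive:", successive_elimination(arm_means, 0.05, rng))
    print("uniform: ", uniform_sampling(arm_means, 200, rng))
```

In this sketch the adaptive procedure stops sampling the easy arms once they are eliminated and spends its remaining pulls on the close competitor, whereas uniform sampling must give every arm the budget that the hardest comparison requires.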
Abstract
Sampling from distributions to find the one with the largest mean arises in a broad range of applications, and it can be mathematically modeled as a multi-armed bandit problem in which each distribution is associated with an arm. This paper studies the sample complexity of identifying the best arm (largest mean) in a multi-armed bandit problem. Motivated by large-scale applications, we are especially interested in identifying situations where the total number of samples that is necessary and sufficient to find the best arm scales linearly with the number of arms. We present a single-parameter multi-armed bandit model that spans the range from linear to superlinear sample complexity. We also give a new algorithm for best arm identification, called PRISM, with linear sample complexity for a wide range of mean distributions. The algorithm, like most exploration procedures for multi-armed…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Cognitive Radio Networks and Spectrum Sensing
