Exploration in the Limit
Brian M. Cho, Nathan Kallus

TL;DR
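This paper introduces an asymptotic framework for fixed-confidence best arm identification that relaxes exact error control to error control that holds asymptotically in a minimum sample size. The relaxation permits tighter optimality and better handling of nonparametric outcome distributions and individual-level covariates, using a novel confidence sequence.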
Contribution
It proposes a relaxed asymptotic error control formulation, develops a novel confidence sequence, and designs a flexible BAI algorithm that improves practical sample efficiency.
Findings
Reduces average sample complexities in experiments.
Matches Gaussian BAI worst-case sample complexity.
Ensures approximate error control in nonparametric settings.
Abstract
In fixed-confidence best arm identification (BAI), the objective is to quickly identify the optimal option while controlling the probability of error below a desired threshold. Despite the plethora of BAI algorithms, existing methods typically fall short in practical settings, as stringent exact error control requires using loose tail inequalities and/or parametric restrictions. To overcome these limitations, we introduce a relaxed formulation that requires valid error control asymptotically with respect to a minimum sample size. This aligns with many real-world settings that often involve weak signals, high desired significance, and post-experiment inference requirements, all of which necessitate long horizons. This allows us to achieve tighter optimality, while better handling flexible nonparametric outcome distributions and fully leveraging individual-level contexts. We develop a…
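The relaxed requirement described in the abstract can be written out roughly as follows. This is only a sketch: the symbols ($t_0$ for the minimum sample size, $\tau$ for the stopping time, $\hat{a}_\tau$ for the recommended arm, $a^\star$ for the best arm, $\alpha$ for the error threshold) are notation assumed here for illustration, and the paper's own definition may differ in its details.

```latex
% Exact fixed-confidence error control (the standard requirement):
\[
  \mathbb{P}\left( \hat{a}_\tau \neq a^\star \right) \le \alpha .
\]
% Relaxed, asymptotic version (assumed reading of the abstract):
% the algorithm may not stop before t_0 samples, and the error
% bound is only required to hold in the limit of large t_0.
\[
  \tau \ge t_0
  \quad\text{and}\quad
  \limsup_{t_0 \to \infty} \,
  \mathbb{P}\left( \hat{a}_\tau \neq a^\star \right) \le \alpha .
\]
```

Under this reading, the guarantee is approximate at any finite $t_0$ and becomes exact only as the minimum sample size grows, which matches the long-horizon settings the abstract describes.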
Taxonomy
Topics: Gaussian Processes and Bayesian Inference · Statistical Methods and Inference · Machine Learning and Algorithms
