Best Arm Identification in Stochastic Bandits: Beyond $\beta$-optimality

Arpan Mukherjee; Ali Tajer

arXiv:2301.03785·stat.ML·June 26, 2023

TL;DR

This paper presents a framework and an algorithm for fixed-confidence best arm identification (BAI) in stochastic bandits that achieve optimal performance with computationally efficient decision rules and that apply to a broad class of distributions beyond exponential families.

Contribution

The paper introduces an approach that reconciles optimality and computational efficiency by sequentially estimating the allocations with sufficient fidelity. The approach applies to general distribution families.

Findings

01. Achieves optimal sample complexity in BAI.

02. Maintains computational efficiency through its set of decision rules.

03. Performs well across various distribution families.

Abstract

This paper investigates a hitherto unaddressed aspect of best arm identification (BAI) in stochastic multi-armed bandits in the fixed-confidence setting. Two key metrics for assessing bandit algorithms are computational efficiency and performance optimality (e.g., in sample complexity). In the stochastic BAI literature, there have been advances in designing algorithms that achieve optimal performance, but they are generally computationally expensive to implement (e.g., optimization-based methods). There also exist approaches with high computational efficiency, but they have provable gaps to the optimal performance (e.g., the $\beta$-optimal approaches in top-two methods). This paper introduces a framework and an algorithm for BAI that achieve optimal performance with a computationally efficient set of decision rules. The central process that facilitates this is a routine for sequentially…
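For readers unfamiliar with the top-two methods that the abstract contrasts against, the following is a minimal sketch of a generic $\beta$ top-two sampling loop. It is not the algorithm proposed in the paper. It assumes unit-variance Gaussian arms, a fixed `beta`, and a heuristic stopping threshold; the names `top_two_bai` and `pull` are hypothetical and chosen only for this illustration.

```python
import math
import random


def top_two_bai(pull, n_arms, delta=0.05, beta=0.5, rng=None, max_pulls=100_000):
    """Illustrative beta top-two loop for unit-variance Gaussian arms.

    pull(arm) must return one reward sample; n_arms must be at least 2.
    Returns (recommended_arm, number_of_pulls).
    """
    rng = rng or random.Random()
    counts = [0] * n_arms
    sums = [0.0] * n_arms

    def sample(arm):
        sums[arm] += pull(arm)
        counts[arm] += 1

    # Pull every arm once so that all empirical means are defined.
    for arm in range(n_arms):
        sample(arm)

    t = n_arms
    while True:
        means = [s / c for s, c in zip(sums, counts)]
        leader = max(range(n_arms), key=means.__getitem__)

        def cost(j):
            # Gaussian transportation cost between the leader and arm j.
            gap = means[leader] - means[j]
            weight = counts[leader] * counts[j] / (counts[leader] + counts[j])
            return weight * gap * gap / 2.0

        # The challenger is the arm that is hardest to separate from the leader.
        challenger = min((j for j in range(n_arms) if j != leader), key=cost)

        # Heuristic threshold (an assumption of this sketch, not a calibrated one).
        threshold = math.log((1.0 + math.log(t)) / delta)
        if cost(challenger) > threshold or t >= max_pulls:
            return leader, t

        # Sample the leader with probability beta, otherwise the challenger.
        sample(leader if rng.random() < beta else challenger)
        t += 1
```

Because `beta` is fixed in advance rather than tuned to the instance, a rule of this kind can at best match the optimum among allocations that give the best arm a `beta` fraction of the samples, which is the gap to full optimality that the abstract refers to.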

Taxonomy

Topics: Advanced Bandit Algorithms Research · Forecasting Techniques and Applications · Machine Learning and Algorithms