Fixed-Confidence Guarantees for Bayesian Best-Arm Identification

Xuedong Shang; Rianne de Heide; Emilie Kaufmann; Pierre Ménard; Michal Valko

arXiv:1910.10945 · cs.LG · October 29, 2019 · 5 cites

TL;DR

This paper analyzes the Top-Two Thompson Sampling (TTTS) sampling rule for fixed-confidence best-arm identification in bandit problems. It introduces a variant, Top-Two Transportation Cost (T3C), that removes the computational burden of TTTS, and it gives the first sample complexity analysis of both rules, for Gaussian bandits with a Bayesian stopping rule.

Contribution

The paper gives the first sample complexity analysis of TTTS and T3C when coupled with a Bayesian stopping rule, for bandits with Gaussian rewards. This solves one of the open questions raised by Russo (2016).

Findings

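01 The use of TTTS for fixed-confidence best-arm identification is justified, and sample complexity guarantees are given for both TTTS and T3C.

02 T3C removes the computational burden of TTTS.

03 New posterior convergence results are given for TTTS in bandits with Gaussian and Bernoulli rewards and conjugate priors.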

Abstract

We investigate and provide new insights on the sampling rule called Top-Two Thompson Sampling (TTTS). In particular, we justify its use for fixed-confidence best-arm identification. We further propose a variant of TTTS called Top-Two Transportation Cost (T3C), which disposes of the computational burden of TTTS. As our main contribution, we provide the first sample complexity analysis of TTTS and T3C when coupled with a very natural Bayesian stopping rule, for bandits with Gaussian rewards, solving one of the open questions raised by Russo (2016). We also provide new posterior convergence results for TTTS under two models that are commonly used in practice: bandits with Gaussian and Bernoulli rewards and conjugate priors.
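As a rough illustration of the two sampling rules named in the abstract, the Python sketch below draws a leader by Thompson sampling and then picks a challenger, either by resampling the posterior (TTTS) or by minimizing a transportation cost (T3C). It is a sketch under stated assumptions, not the authors' implementation. Rewards are assumed Gaussian with known standard deviation `sigma`, each arm's posterior is assumed to be N(empirical mean, sigma²/count) (a flat prior), and the cost uses the usual Gaussian closed form. The function names, the default `beta = 0.5`, the `max_resample` cap, and its fallback are hypothetical choices made here. The Bayesian stopping rule is not shown.

```python
import numpy as np


def _sample_leader(means, counts, sigma, rng):
    # Thompson draw: one posterior sample per arm, leader is the argmax.
    # Assumed posterior: N(mean_i, sigma^2 / count_i), i.e. a flat prior.
    post_std = sigma / np.sqrt(counts)
    return int(np.argmax(rng.normal(means, post_std)))


def ttts_select(means, counts, rng, beta=0.5, sigma=1.0, max_resample=10_000):
    """One round of a top-two Thompson sampling rule (illustrative sketch)."""
    means = np.asarray(means, dtype=float)
    counts = np.asarray(counts, dtype=float)
    leader = _sample_leader(means, counts, sigma, rng)
    if rng.random() < beta:
        return leader
    # Challenger: resample the posterior until a different arm is the argmax.
    # This loop is the costly part once the posterior has concentrated.
    for _ in range(max_resample):
        challenger = _sample_leader(means, counts, sigma, rng)
        if challenger != leader:
            return challenger
    # Fallback after the cap is an assumption of this sketch, not from the source.
    return leader


def t3c_select(means, counts, rng, beta=0.5, sigma=1.0):
    """One round of a top-two transportation cost rule (illustrative sketch)."""
    means = np.asarray(means, dtype=float)
    counts = np.asarray(counts, dtype=float)
    leader = _sample_leader(means, counts, sigma, rng)
    if rng.random() < beta:
        return leader
    # Assumed Gaussian transportation cost between the leader and arm j:
    # max(mu_leader - mu_j, 0)^2 / (2 sigma^2 (1/N_leader + 1/N_j)).
    gap = np.maximum(means[leader] - means, 0.0)
    cost = gap**2 / (2.0 * sigma**2 * (1.0 / counts[leader] + 1.0 / counts))
    cost[leader] = np.inf  # the challenger must differ from the leader
    return int(np.argmin(cost))
```

In this sketch the T3C challenger needs a single pass over the arms, whereas the TTTS challenger may need many posterior draws when one arm clearly dominates, which is the computational burden the abstract refers to.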

Code & Models

Datasets

misovalko/my-research-papers
dataset · 21 dl

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Auction Theory and Applications