Gamification of Pure Exploration for Linear Bandits

Rémy Degenne; Pierre Ménard; Xuedong Shang; Michal Valko

arXiv:2007.00953 · stat.ML · July 3, 2020 · 23 cites

TL;DR

This paper introduces the first asymptotically optimal algorithm for fixed-confidence pure exploration in linear bandits, a setting that includes best-arm identification.

Contribution

The paper compares several notions of optimality for the linear case (G-optimality, transductive optimality, and asymptotic optimality) and develops an efficient, asymptotically optimal algorithm for fixed-confidence pure exploration in linear bandits.

Findings

01 The algorithm achieves asymptotic optimality in linear bandit pure exploration.

02 It bypasses previous computational difficulties in optimal design.

03 The approach is efficiently implementable in practice.

Abstract

We investigate an active pure-exploration setting, which includes best-arm identification, in the context of linear stochastic bandits. While asymptotically optimal algorithms exist for standard multi-armed bandits, the existence of such algorithms for best-arm identification in linear bandits has been elusive despite several attempts to address it. First, we provide a thorough comparison of, and new insight into, different notions of optimality in the linear case, including G-optimality, transductive optimality from optimal experimental design, and asymptotic optimality. Second, we design the first asymptotically optimal algorithm for fixed-confidence pure exploration in linear bandits. As a consequence, our algorithm naturally bypasses the pitfall caused by a simple but difficult instance that most prior algorithms had to be engineered to deal with explicitly. Finally, we avoid the need…
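As a rough illustration of the G-optimality notion named in the abstract (a sketch that is not taken from the paper): a design w over arms a_1, ..., a_K in R^d induces the matrix V_w = sum_k w_k a_k a_k^T, and the G-criterion of w is max_k a_k^T V_w^{-1} a_k. The three arms, the angle 0.1, and the helper names below are hypothetical choices made only for this example.

```python
import math

def design_matrix(arms, w):
    """V_w = sum_k w[k] * a_k a_k^T, with arms given as lists of floats."""
    d = len(arms[0])
    V = [[0.0] * d for _ in range(d)]
    for a, wk in zip(arms, w):
        for i in range(d):
            for j in range(d):
                V[i][j] += wk * a[i] * a[j]
    return V

def solve(V, b):
    """Solve V x = b by Gauss-Jordan elimination with partial pivoting.
    Assumes V is invertible (the weighted arms span R^d)."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(V, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                for k in range(c, n + 1):
                    M[r][k] -= f * M[c][k]
    return [M[i][n] / M[i][i] for i in range(n)]

def g_value(arms, w):
    """G-criterion of design w: max over arms of a^T V_w^{-1} a."""
    V = design_matrix(arms, w)
    return max(sum(ai * xi for ai, xi in zip(a, solve(V, a))) for a in arms)

# Hypothetical instance: two basis arms plus one arm close to the first.
arms = [[1.0, 0.0], [0.0, 1.0], [math.cos(0.1), math.sin(0.1)]]

uniform = [1.0 / 3, 1.0 / 3, 1.0 / 3]
basis_only = [0.5, 0.5, 0.0]

print("uniform design:   ", g_value(arms, uniform))
print("basis-only design:", g_value(arms, basis_only))
```

By the Kiefer-Wolfowitz equivalence theorem, the smallest achievable G-criterion equals the dimension d when the arms span R^d. In this toy instance the basis-only design therefore attains the optimal value 2 (up to floating-point error), while the uniform design gives a strictly larger value.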

Code & Models

Datasets

misovalko/my-research-papers
dataset · 21 dl

Videos

Gamification of Pure Exploration for Linear Bandits · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research