Choosing Answers in $\varepsilon$-Best-Answer Identification for Linear Bandits

Marc Jourdan, Rémy Degenne

arXiv:2206.04456 · stat.ML · June 10, 2022

TL;DR

This paper studies how to identify an answer within $\varepsilon$ of the best one in linear bandits. It shows that an algorithm must target a furthest answer, rather than the answer with highest mean, to reach asymptotically optimal expected sample complexity.

Contribution

It develops a simple procedure that adapts best-arm identification algorithms to $\varepsilon$-best-answer identification in linear bandits. The procedure chooses a furthest answer as its candidate, rather than the answer with highest mean, which is needed for asymptotic optimality.

Findings

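01. The proposed algorithm is asymptotically optimal in expected sample complexity.

02. It is compared against existing best-arm identification algorithms modified for the $\varepsilon$-best-answer setting.

03. Its empirical performance is competitive with those baselines.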

Abstract

In pure-exploration problems, information is gathered sequentially to answer a question on the stochastic environment. While best-arm identification for linear bandits has been extensively studied in recent years, few works have been dedicated to identifying one arm that is $\varepsilon$-close to the best one (and not exactly the best one). In this problem with several correct answers, an identification algorithm should focus on one candidate among those answers and verify that it is correct. We demonstrate that picking the answer with highest mean does not allow an algorithm to reach asymptotic optimality in terms of expected sample complexity. Instead, a \textit{furthest answer} should be identified. Using that insight to choose the candidate answer carefully, we develop a simple procedure to adapt best-arm identification algorithms to tackle $\varepsilon$-best-answer identification…
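As a minimal sketch of why this problem has several correct answers, the snippet below lists every answer whose mean lies within $\varepsilon$ of the best mean. It makes assumptions that are not taken from the abstract: the gap is additive, the linear parameter `theta` is known (in a real bandit it is unknown and must be estimated from samples), and the names `epsilon_best_answers`, `theta`, and `answers` are hypothetical. It does not compute the furthest answer, which the truncated abstract does not define.

```python
import numpy as np

def epsilon_best_answers(theta, answers, eps):
    """Indices of answers whose mean is within eps of the best mean.

    Assumes an additive gap: answer z is correct when
    <theta, z> >= max_z' <theta, z'> - eps.
    """
    means = answers @ theta
    return np.flatnonzero(means >= means.max() - eps)

# Hypothetical 2-dimensional instance with three candidate answers.
theta = np.array([1.0, 0.0])
answers = np.array([
    [1.0, 0.0],   # mean 1.0, the best answer
    [0.95, 0.3],  # mean 0.95, within 0.1 of the best
    [0.0, 1.0],   # mean 0.0, clearly suboptimal
])

correct = epsilon_best_answers(theta, answers, eps=0.1)
highest_mean = int(np.argmax(answers @ theta))

print("correct answers:", correct.tolist())
print("answer with highest mean:", highest_mean)
```

Here two answers are correct, so the algorithm is free to pick which one it tries to verify. Per the abstract, taking the one with highest mean is not the choice that leads to asymptotic optimality.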

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems