Best-Arm Identification in Linear Bandits
Marta Soare, Alessandro Lazaric, Rémi Munos

TL;DR
This paper addresses the challenge of efficiently identifying the best arm in linear bandit problems by developing strategies that leverage the linear structure to minimize sampling while ensuring high confidence in the result.
Contribution
It introduces novel sample allocation strategies for best-arm identification in linear bandits that exploit the problem's global linear structure to improve efficiency.
Findings
The proposed strategies identify the best arm at a fixed confidence level while minimizing the sample budget.
The analysis shows that exploiting the global linear structure improves the reward estimates of near-optimal arms.
Empirical results demonstrate improved performance over baseline methods.
Abstract
We study the best-arm identification problem in linear bandits, where the rewards of the arms depend linearly on an unknown parameter and the objective is to return the arm with the largest reward. We characterize the complexity of the problem and introduce sample allocation strategies that pull arms to identify the best arm with a fixed confidence, while minimizing the sample budget. In particular, we show the importance of exploiting the global linear structure to improve the estimate of the reward of near-optimal arms. We analyze the proposed strategies and compare their empirical performance. Finally, as a by-product of our analysis, we point out the connection to the G-optimality criterion used in optimal experimental design.
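The optimal experimental design connection mentioned in the abstract can be illustrated with the G-optimality criterion, which scores an allocation of pulls by the largest value of x^T A^{-1} x over the arms x, where A is the sum of x x^T over all pulls. The following is a minimal sketch and not the paper's algorithm: it assumes two-dimensional arms, Gaussian noise, and a fixed allocation chosen in advance, and every name and number in it (ARMS, THETA, g_criterion, estimate_best_arm, the pull counts) is hypothetical.

```python
import random


def design_matrix(arms, counts):
    """Entries (a11, a12, a22) of A = sum_i n_i * x_i x_i^T for 2-D arms."""
    a11 = a12 = a22 = 0.0
    for (x1, x2), n in zip(arms, counts):
        a11 += n * x1 * x1
        a12 += n * x1 * x2
        a22 += n * x2 * x2
    return a11, a12, a22


def inverse_sym_2x2(a11, a12, a22):
    """Inverse of a symmetric 2x2 matrix, returned as (i11, i12, i22)."""
    det = a11 * a22 - a12 * a12
    if det <= 1e-12:
        raise ValueError("singular design: pulled arms must span both dimensions")
    return a22 / det, -a12 / det, a11 / det


def g_criterion(arms, counts):
    """G-optimality value: max over arms of x^T A^{-1} x (lower is better)."""
    i11, i12, i22 = inverse_sym_2x2(*design_matrix(arms, counts))
    return max(i11 * x1 * x1 + 2 * i12 * x1 * x2 + i22 * x2 * x2
               for x1, x2 in arms)


def estimate_best_arm(arms, counts, theta, noise_std=0.1, seed=0):
    """Simulate the pulls, fit theta by least squares, return the best arm index."""
    rng = random.Random(seed)
    b1 = b2 = 0.0
    for (x1, x2), n in zip(arms, counts):
        for _ in range(n):
            # Linear reward model: x^T theta plus Gaussian noise (an assumption).
            reward = x1 * theta[0] + x2 * theta[1] + rng.gauss(0.0, noise_std)
            b1 += reward * x1
            b2 += reward * x2
    i11, i12, i22 = inverse_sym_2x2(*design_matrix(arms, counts))
    theta_hat = (i11 * b1 + i12 * b2, i12 * b1 + i22 * b2)
    values = [x1 * theta_hat[0] + x2 * theta_hat[1] for x1, x2 in arms]
    return max(range(len(arms)), key=values.__getitem__)


# Hypothetical instance: the third arm is near-optimal and close to the first.
ARMS = [(1.0, 0.0), (0.0, 1.0), (0.9, 0.1)]
THETA = (1.0, 0.0)

if __name__ == "__main__":
    for counts in [(10, 10, 10), (15, 15, 0)]:
        print(counts, g_criterion(ARMS, counts),
              estimate_best_arm(ARMS, counts, THETA))
```

In this toy instance, the allocation (15, 15, 0) never pulls the third arm, yet its reward is still estimated through the shared parameter, and the allocation has a lower G-optimality value than uniform pulling. This illustrates only the role of the global linear structure; it does not reproduce the paper's strategies or its stopping rule.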
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Reinforcement Learning in Robotics
