Optimal Best-Arm Identification in Bandits with Access to Offline Data
Shubhada Agrawal, Sandeep Juneja, Karthikeyan Shanmugam, Arun Sai Suggala

TL;DR
This paper studies the problem of identifying the best arm in a stochastic bandit setting by combining offline data with online learning, providing lower bounds and efficient algorithms with near-optimal sample complexity.
Contribution
It introduces a framework for best-arm identification that integrates offline data with online exploration, deriving lower bounds and proposing algorithms that achieve these bounds.
Findings
Algorithms match the lower bound on sample complexity for small δ.
Proposed methods are computationally efficient, with an average per-sample cost that is near-linear in the number of arms.
Theoretical analysis characterizes the optimality conditions of the problem.
Abstract
Learning paradigms based purely on offline data as well as those based solely on sequential online learning have been well studied in the literature. In this paper, we consider combining offline data with online learning, an area less studied but of obvious practical importance. We consider the stochastic K-armed bandit problem, where our goal is to identify the arm with the highest mean in the presence of relevant offline data, with confidence 1 − δ. We conduct a lower bound analysis on policies that provide such probabilistic correctness guarantees. We develop algorithms that match the lower bound on sample complexity when δ is small. Our algorithms are computationally efficient with an average per-sample acquisition cost of Õ(K), and rely on a careful characterization of the optimality conditions of the lower bound problem.
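To make the problem setting concrete, here is a minimal sketch of best-arm identification that warm-starts from offline samples. It is not the paper's algorithm: it swaps in plain successive elimination with Hoeffding-style confidence radii, which gives a 1 − δ correctness guarantee under the stated assumptions but is not sample-optimal. The assumptions are that rewards are unit-variance Gaussian and that offline samples come from the same distributions as the online arms. The names `radius` and `identify_best_arm` are hypothetical.

```python
import math
import random


def radius(n, num_arms, delta):
    """Anytime Hoeffding-style radius, assuming unit-variance rewards."""
    if n == 0:
        return math.inf
    # Union bound over arms and sample counts keeps total error below delta.
    return math.sqrt(2.0 * math.log(4.0 * num_arms * n * n / delta) / n)


def identify_best_arm(offline, sample, delta, max_pulls=200_000):
    """Return (arm, online_pulls).

    offline[a] is the list of logged rewards for arm a (may be empty);
    sample(a) draws one fresh online reward from arm a.
    """
    k = len(offline)
    counts = [len(obs) for obs in offline]  # offline data warm-starts the counts
    sums = [sum(obs) for obs in offline]    # and the running reward sums
    active = set(range(k))
    online_pulls = 0

    def mean(a):
        return sums[a] / counts[a] if counts[a] else 0.0

    while len(active) > 1 and online_pulls < max_pulls:
        # Pull the active arm with the fewest total (offline + online) samples.
        a = min(active, key=lambda i: counts[i])
        sums[a] += sample(a)
        counts[a] += 1
        online_pulls += 1

        # Drop every arm whose upper bound falls below the best lower bound.
        best_lcb = max(mean(i) - radius(counts[i], k, delta) for i in active)
        active = {i for i in active
                  if mean(i) + radius(counts[i], k, delta) >= best_lcb}

    return max(active, key=mean), online_pulls


if __name__ == "__main__":
    rng = random.Random(0)
    means = [0.0, 0.5, 1.0]  # hypothetical arm means, for illustration only
    offline = [[rng.gauss(m, 1.0) for _ in range(500)] for m in means]
    arm, pulls = identify_best_arm(
        offline, lambda a: rng.gauss(means[a], 1.0), delta=0.05)
    print(arm, pulls)
```

Passing empty lists for `offline` recovers the purely online case, so the two runs can be compared to see how many online pulls the logged data saves. The sketch balances sample counts across surviving arms rather than solving for an optimal allocation, which is the part the paper's lower-bound analysis addresses.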
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Auction Theory and Applications
