Skyline Identification in Multi-Armed Bandits
Albert Cheu, Ravi Sundaram, Jonathan Ullman

TL;DR
This paper studies the problem of identifying the skyline of an ordered set of arms in a multi-armed bandit setting. It gives tight bounds on the sample complexity of approximate skyline identification; these bounds improve on the naive algorithm and on earlier algorithms for Pareto-optimal arm identification.
Contribution
It introduces the concept of an $\epsilon$-approximate skyline in multi-armed bandits and establishes matching upper and lower bounds on the number of samples needed to identify one.
Findings
Sample complexity is $\tilde{\Theta}\left(\frac{n}{\epsilon^2}\right)$ for $\epsilon$-skyline identification.
The bounds improve on the naive algorithm and on earlier methods for Pareto-optimal arm identification.
The sample complexity of skyline identification lies between that of best arm identification and that of estimating the expected reward of every arm.
Abstract
We introduce a variant of the classical PAC multi-armed bandit problem. There is an ordered set of $n$ arms, each with some stochastic reward drawn from some unknown bounded distribution. The goal is to identify the skyline of the set, consisting of all arms $i$ such that arm $i$ has larger expected reward than all lower-numbered arms. We define a natural notion of an $\epsilon$-approximate skyline and prove matching upper and lower bounds for identifying an $\epsilon$-skyline. Specifically, we show that in order to identify an $\epsilon$-skyline from among $n$ arms with probability $1-\delta$, $\tilde{\Theta}(n/\epsilon^2)$ samples are necessary and sufficient. When , our results improve…
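To make the definition of the skyline concrete, here is a minimal Python sketch. It is not the paper's algorithm. The function `skyline` applies the definition directly to a list of known means, and `naive_skyline` illustrates the naive baseline: estimate every arm's mean to within $\epsilon/2$ and take the skyline of the estimates. The names `skyline`, `hoeffding_samples`, `naive_skyline`, and `pull` are hypothetical, rewards are assumed to lie in $[0, 1]$, and the per-arm sample count comes from a standard Hoeffding bound with a union bound over the $n$ arms. Whether the output meets the paper's formal notion of an $\epsilon$-approximate skyline depends on that definition, which is not reproduced here.

```python
import math
import random


def skyline(means):
    """Return indices i whose mean strictly exceeds every lower-indexed mean."""
    result = []
    best_so_far = -math.inf
    for i, mu in enumerate(means):
        if mu > best_so_far:  # the first arm is always in the skyline
            result.append(i)
            best_so_far = mu
    return result


def hoeffding_samples(n, eps, delta):
    """Pulls per arm so that every estimate is within eps/2 of its mean
    with probability at least 1 - delta (assumes rewards in [0, 1])."""
    return math.ceil((2.0 / eps ** 2) * math.log(2.0 * n / delta))


def naive_skyline(pull, n, eps, delta):
    """Naive baseline: estimate all n means, then take the skyline of the
    estimates. `pull(i)` is an assumed callback returning one reward of arm i."""
    m = hoeffding_samples(n, eps, delta)
    estimates = [sum(pull(i) for _ in range(m)) / m for i in range(n)]
    return skyline(estimates)


# Usage with simulated Bernoulli arms (illustrative means, fixed seed).
rng = random.Random(0)
true_means = [0.1, 0.5, 0.3, 0.9]
found = naive_skyline(
    lambda i: float(rng.random() < true_means[i]),
    n=len(true_means),
    eps=0.1,
    delta=0.05,
)
print("exact skyline:    ", skyline(true_means))
print("estimated skyline:", found)
```

The baseline spends $O\left(\frac{n}{\epsilon^2}\log\frac{n}{\delta}\right)$ pulls in total, which is the cost of estimating every arm's expected reward.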
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
