Quantile Multi-Armed Bandits: Optimal Best-Arm Identification and a Differentially Private Scheme
Konstantinos E. Nikolakakis, Dionysios S. Kalogerias, Or Sheffet, and Anand D. Sarwate

TL;DR
This paper introduces optimal algorithms for identifying the best arm based on quantiles in stochastic multi-armed bandits, including a differentially private method suitable for private reward settings, with proven optimality and finite sample complexity.
Contribution
It presents a novel successive elimination algorithm for quantile-based best-arm identification and a differentially private variant with finite sample complexity, both with theoretical guarantees.
Findings
The non-private algorithm is $\delta$-PAC and nearly optimal in sample complexity.
The differentially private algorithm maintains finite sample complexity even with infinite support distributions.
Neither algorithm requires prior knowledge of the suboptimality gap or of other statistical parameters.
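The successive-elimination idea behind the non-private finding can be illustrated with a minimal sketch. This is not the authors' algorithm: it swaps in a generic confidence interval for the quantile based on the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality, and the function names, the batch size, the `max_rounds` cap, and the per-round confidence split `delta / (2 k t^2)` are all assumptions made for illustration. An arm is dropped once its upper confidence bound on the quantile falls below the best lower bound among the remaining arms.

```python
import math
import random


def quantile_confidence_interval(samples, q, delta):
    """DKW-based interval that contains the true q-quantile w.p. >= 1 - delta."""
    n = len(samples)
    # DKW: sup |F_n - F| <= eps with probability at least 1 - delta.
    eps = math.sqrt(math.log(2.0 / delta) / (2.0 * n))
    xs = sorted(samples)
    lo_level, hi_level = q - eps, q + eps
    # Empirical quantiles at the shifted levels; infinite when the level
    # leaves (0, 1), i.e. when there are too few samples to say anything.
    lower = -math.inf if lo_level <= 0.0 else xs[math.ceil(lo_level * n) - 1]
    upper = math.inf if hi_level >= 1.0 else xs[math.ceil(hi_level * n) - 1]
    return lower, upper


def successive_elimination_quantile(arms, q, delta, rng, batch=20, max_rounds=2000):
    """Return (index of the surviving arm, total number of pulls).

    `arms` is a list of callables; arms[i](rng) draws one reward from arm i.
    `batch` and `max_rounds` are illustrative choices, not from the paper.
    """
    k = len(arms)
    samples = [[] for _ in range(k)]
    active = list(range(k))
    pulls = 0
    for t in range(1, max_rounds + 1):
        # Pull every active arm `batch` times.
        for i in active:
            samples[i].extend(arms[i](rng) for _ in range(batch))
            pulls += batch
        # Union bound over arms and rounds: sum_t 1 / (2 t^2) < 1.
        delta_t = delta / (2.0 * k * t * t)
        bounds = {i: quantile_confidence_interval(samples[i], q, delta_t)
                  for i in active}
        best_lower = max(lo for lo, _ in bounds.values())
        # Keep only arms whose upper bound still reaches the best lower bound.
        active = [i for i in active if bounds[i][1] >= best_lower]
        if len(active) == 1:
            return active[0], pulls

    # Budget exhausted: fall back to the highest empirical q-quantile.
    def empirical_quantile(i):
        xs = sorted(samples[i])
        return xs[max(math.ceil(q * len(xs)) - 1, 0)]

    return max(active, key=empirical_quantile), pulls


if __name__ == "__main__":
    rng = random.Random(0)
    # Three hypothetical Gaussian arms; the last has the highest median.
    arms = [lambda r, m=m: r.gauss(m, 1.0) for m in (0.0, 0.5, 2.0)]
    best, pulls = successive_elimination_quantile(arms, q=0.5, delta=0.05, rng=rng)
    print(best, pulls)
```

Like the algorithms described above, the sketch needs no prior knowledge of the gap: the confidence intervals shrink as samples accumulate, and elimination happens whenever they separate. The DKW interval is a convenient stand-in and says nothing about the paper's optimality guarantees or its differentially private variant.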
Abstract
We study the best-arm identification problem in multi-armed bandits with stochastic, potentially private rewards, when the goal is to identify the arm with the highest quantile at a fixed, prescribed level. First, we propose a (non-private) successive elimination algorithm for strictly optimal best-arm identification; we show that our algorithm is $\delta$-PAC and we characterize its sample complexity. Further, we provide a lower bound on the expected number of pulls, showing that the proposed algorithm is essentially optimal up to logarithmic factors. Both upper and lower complexity bounds depend on a special definition of the associated suboptimality gap, designed in particular for the quantile bandit problem; as we show, when the gap approaches zero, best-arm identification is impossible. Second, motivated by applications where the rewards are private, we provide a differentially…
