Differentially Private Multi-Armed Bandits in the Shuffle Model
Jay Tenenbaum, Haim Kaplan, Yishay Mansour, Uri Stemmer

TL;DR
This paper introduces a differentially private algorithm for multi-armed bandits in the shuffle model, achieving regret bounds close to centralized algorithms and outperforming local model algorithms.
Contribution
It provides the first $(\varepsilon, \delta)$-differentially private MAB algorithm in the shuffle model with near-optimal regret bounds.
Findings
Regret bounds nearly match centralized algorithms.
Outperforms existing local model algorithms.
Achieves $(\varepsilon, \delta)$-differential privacy in the shuffle model.
Abstract
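We give an $(\varepsilon, \delta)$-differentially private algorithm for the multi-armed bandit (MAB) problem in the shuffle model with a distribution-dependent regret of $O\left(\left(\sum_{a \in [k] : \Delta_a > 0} \frac{\log T}{\Delta_a}\right) + \frac{k \sqrt{\log \frac{1}{\delta}} \log T}{\varepsilon}\right)$, and a distribution-independent regret of $O\left(\sqrt{k T \log T} + \frac{k \sqrt{\log \frac{1}{\delta}} \log T}{\varepsilon}\right)$, where $T$ is the number of rounds, $\Delta_a$ is the suboptimality gap of the arm $a$, and $k$ is the total number of arms. Our upper bound almost matches the regret of the best known algorithms for the centralized model, and significantly outperforms the best known algorithm in the local model.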
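To get a feel for how the two regret bounds scale, the following minimal Python sketch evaluates their leading terms. It assumes the bounds take the form given in the paper's published abstract, namely a non-private term plus an additive privacy cost of $k \sqrt{\log(1/\delta)} \log T / \varepsilon$, and it drops the constants hidden by the $O(\cdot)$ notation, so the numbers are useful only for comparing growth rates, not as actual regret values. The function names are hypothetical and do not come from the paper.

```python
import math


def privacy_cost(k, T, eps, delta):
    """Additive privacy term k * sqrt(log(1/delta)) * log(T) / eps (constants dropped)."""
    return k * math.sqrt(math.log(1.0 / delta)) * math.log(T) / eps


def distribution_dependent_bound(gaps, T, eps, delta):
    """Leading terms of the distribution-dependent bound.

    gaps: suboptimality gaps Delta_a, one per arm (the optimal arm has gap 0).
    Only arms with a strictly positive gap contribute to the sum.
    """
    k = len(gaps)
    non_private = sum(math.log(T) / gap for gap in gaps if gap > 0)
    return non_private + privacy_cost(k, T, eps, delta)


def distribution_independent_bound(k, T, eps, delta):
    """Leading terms of the distribution-independent bound."""
    return math.sqrt(k * T * math.log(T)) + privacy_cost(k, T, eps, delta)


if __name__ == "__main__":
    # Illustrative parameters only; they are not taken from the paper.
    gaps = [0.0, 0.1, 0.2, 0.5]
    T, delta = 100_000, 1e-6
    for eps in (0.1, 1.0, 10.0):
        dep = distribution_dependent_bound(gaps, T, eps, delta)
        ind = distribution_independent_bound(len(gaps), T, eps, delta)
        print(f"eps={eps}: dependent ~ {dep:.1f}, independent ~ {ind:.1f}")
```

Under this form, the privacy cost enters additively and grows only logarithmically in $T$, so for a fixed $\varepsilon$ the non-private term dominates the distribution-independent bound once $T$ is large.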
Taxonomy
Topics: Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data · Machine Learning and Algorithms
