Differentially Private Multi-Armed Bandits in the Shuffle Model

Jay Tenenbaum; Haim Kaplan; Yishay Mansour; Uri Stemmer

arXiv:2106.02900 · cs.LG · October 29, 2021 · 5 cites

TL;DR

This paper introduces a differentially private algorithm for multi-armed bandits in the shuffle model. Its regret bounds are close to those of the best known centralized-model algorithms and significantly better than those of the best known local-model algorithm.

Contribution

It provides the first $(ε, δ)$-differentially private MAB algorithm in the shuffle model with near-optimal regret bounds.

Findings

01

Regret bounds nearly match those of the best known centralized-model algorithms.

02

Significantly outperforms the best known local-model algorithm.

03

Achieves $(ε, δ)$-differential privacy in the shuffle model.

Abstract

We give an $(ε, δ)$-differentially private algorithm for the multi-armed bandit (MAB) problem in the shuffle model with a distribution-dependent regret of $O\left(\left(\sum_{a \in [k] : Δ_{a} > 0} \frac{\log T}{Δ_{a}}\right) + \frac{k \sqrt{\log \frac{1}{δ}} \log T}{ε}\right)$, and a distribution-independent regret of $O\left(\sqrt{k T \log T} + \frac{k \sqrt{\log \frac{1}{δ}} \log T}{ε}\right)$, where $T$ is the number of rounds, $Δ_{a}$ is the suboptimality gap of arm $a$, and $k$ is the total number of arms. Our upper bound almost matches the regret of the best known algorithms for the centralized model, and significantly outperforms the best known algorithm in the local model.
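To get a feel for how the two terms in each bound compare, the expressions can be evaluated numerically. The following is a minimal sketch, not part of the paper: it reads the bounds as $\sum_{a : Δ_a > 0} \frac{\log T}{Δ_a} + \frac{k \sqrt{\log(1/δ)} \log T}{ε}$ and $\sqrt{k T \log T} + \frac{k \sqrt{\log(1/δ)} \log T}{ε}$, sets the constant hidden by the $O(\cdot)$ to 1 (an assumption made purely for illustration), and uses hypothetical function names and parameter values.

```python
import math


def privacy_term(k, T, eps, delta):
    """Additive privacy cost k * sqrt(log(1/delta)) * log(T) / eps (constant assumed to be 1)."""
    return k * math.sqrt(math.log(1.0 / delta)) * math.log(T) / eps


def distribution_dependent_bound(gaps, T, eps, delta):
    """Return (non-private term, privacy term) for given suboptimality gaps.

    `gaps` holds one gap per arm; arms with gap 0 (optimal arms) are skipped,
    matching the sum over arms with positive gap.
    """
    k = len(gaps)
    non_private = sum(math.log(T) / g for g in gaps if g > 0)
    return non_private, privacy_term(k, T, eps, delta)


def distribution_independent_bound(k, T, eps, delta):
    """Return (non-private term, privacy term) for the gap-free bound."""
    non_private = math.sqrt(k * T * math.log(T))
    return non_private, privacy_term(k, T, eps, delta)


if __name__ == "__main__":
    # Illustrative (hypothetical) parameters: 10 arms, one optimal arm, gaps of 0.1.
    gaps = [0.0] + [0.1] * 9
    T, eps, delta = 10**6, 1.0, 1e-6
    print(distribution_dependent_bound(gaps, T, eps, delta))
    print(distribution_independent_bound(len(gaps), T, eps, delta))
```

Under this reading, the privacy term is the same in both bounds and grows only logarithmically in $T$, so for large $T$ the distribution-independent bound is dominated by its non-private $\sqrt{k T \log T}$ term. Because the hidden constants are unknown here, the printed numbers indicate scaling only, not actual regret.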

Videos

Differentially Private Multi-Armed Bandits in the Shuffle Model · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data · Machine Learning and Algorithms