Multi-Armed Bandits with Local Differential Privacy

Wenbo Ren; Xingyu Zhou; Jia Liu; Ness B. Shroff

arXiv:2007.03121 · cs.LG · July 8, 2020 · 28 cites

Open Access

TL;DR

This paper studies multi-armed bandit algorithms under local differential privacy constraints, establishing regret bounds and proposing algorithms that match these bounds, with experimental validation.

Contribution

It provides the first regret bounds for MAB algorithms with LDP guarantees and introduces algorithms that achieve these bounds.

Findings

1. Proved a lower bound on regret for LDP-MAB algorithms.

2. Designed algorithms whose regret upper bounds match the lower bound up to constant factors.

3. Confirmed the theoretical results with numerical experiments.

Abstract

This paper investigates regret minimization for multi-armed bandit (MAB) problems with a local differential privacy (LDP) guarantee. In stochastic bandit systems, the rewards may reflect users' activities, which can involve private information that the users do not want the agent to know. However, in many cases, the agent needs to know these activities to provide better services such as recommendations and news feeds. To handle this dilemma, we adopt differential privacy and study the regret upper and lower bounds for MAB algorithms with a given LDP guarantee. We prove a lower bound and propose algorithms whose regret upper bounds match the lower bound up to constant factors. Numerical experiments also confirm our conclusions.
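To make the setting in the abstract concrete, here is a minimal sketch of a bandit loop in which the agent only ever sees privatized rewards. It is an illustration under stated assumptions, not necessarily the algorithm proposed in the paper: rewards are assumed to be Bernoulli in [0, 1], the user-side mechanism is assumed to be the standard Laplace mechanism with scale 1/epsilon, and the widened confidence bonus (including its constants) is a hypothetical choice. The names `privatize` and `run_private_ucb` are invented for this example.

```python
import math
import random


def privatize(reward, epsilon, rng):
    """User-side Laplace mechanism for a reward in [0, 1] (sensitivity 1)."""
    # The difference of two Exp(rate=epsilon) draws is Laplace with scale 1/epsilon.
    noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
    return reward + noise


def run_private_ucb(means, epsilon, horizon, seed=0):
    """UCB-style loop that only observes privatized rewards; returns pull counts."""
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # sum of noisy rewards per arm

    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialize
        else:
            def index(a):
                n = counts[a]
                # Assumed bonus: the usual UCB term plus an extra term that
                # grows as epsilon shrinks, to cover the Laplace noise.
                bonus = math.sqrt(2 * math.log(t) / n)
                bonus += math.sqrt(32 * math.log(t) / n) / epsilon
                return sums[a] / n + bonus

            arm = max(range(k), key=index)

        # The true reward stays on the user's side (simulated Bernoulli draw).
        reward = 1.0 if rng.random() < means[arm] else 0.0
        # Only the privatized value is sent to the agent.
        noisy = privatize(reward, epsilon, rng)

        counts[arm] += 1
        sums[arm] += noisy

    return counts
```

For example, `run_private_ucb([0.2, 0.5, 0.8], epsilon=1.0, horizon=10000)` returns the number of times each arm was pulled. In this sketch a smaller `epsilon` means more noise per reward and a larger exploration bonus, which is the privacy versus regret trade-off the abstract describes.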

Taxonomy

Topics: Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data · Age of Information Optimization