Optimal Rates of (Locally) Differentially Private Heavy-tailed Multi-Armed Bandits
Youming Tao, Yulian Wu, Peng Zhao, Di Wang

TL;DR
This paper develops and analyzes differentially private algorithms for heavy-tailed multi-armed bandit problems, achieving near-optimal regret rates in both central and local privacy models, and introduces new estimators and hard instances.
Contribution
It introduces the first near-optimal private algorithms for heavy-tailed MABs under both central and local differential privacy models, with new estimators and lower bounds.
Findings
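The proposed algorithms achieve near-optimal regret rates in both the central and local privacy models.
Heavy-tailed rewards call for privacy techniques different from those used for bounded or sub-Gaussian rewards.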
Experimental results support theoretical guarantees.
Abstract
In this paper we investigate the problem of stochastic multi-armed bandits (MAB) in the (local) differential privacy (DP/LDP) model. Unlike previous results that assume bounded/sub-Gaussian reward distributions, we focus on the setting where each arm's reward distribution only has a finite (1+v)-th moment for some v ∈ (0, 1]. In the first part, we study the problem in the central ε-DP model. We first provide a near-optimal result by developing a private and robust Upper Confidence Bound (UCB) algorithm. Then, we improve the result via a private and robust version of the Successive Elimination (SE) algorithm. Finally, we establish the lower bound to show that the instance-dependent regret of our improved algorithm is optimal. In the second part, we study the problem in the ε-LDP model. We propose an algorithm that can be seen as a locally private and robust version of SE…
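To make the "private and robust" idea in the abstract concrete, here is a minimal sketch of one generic way to privatize a mean estimate over heavy-tailed rewards: clip each reward to a bounded range, then add Laplace noise calibrated to the clipped mean's sensitivity. This is not the paper's estimator. The function name `private_clipped_mean`, the fixed `clip` threshold, and the Pareto test data are all assumptions chosen for illustration.

```python
import random

def private_clipped_mean(rewards, clip, epsilon, rng):
    """Hypothetical helper: epsilon-DP mean via clipping plus the Laplace mechanism."""
    n = len(rewards)
    # Clipping bounds every reward in [-clip, clip], which tames heavy tails.
    clipped = [max(-clip, min(clip, r)) for r in rewards]
    # Replacing one reward changes the clipped mean by at most 2 * clip / n.
    sensitivity = 2.0 * clip / n
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponentials with mean `scale` is Laplace(0, scale).
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return sum(clipped) / n + noise

rng = random.Random(0)
# Pareto rewards with shape 1.5 have a finite (1+v)-th moment only for v < 0.5.
samples = [rng.paretovariate(1.5) for _ in range(10_000)]
estimate = private_clipped_mean(samples, clip=50.0, epsilon=1.0, rng=rng)
print(estimate)
```

Clipping introduces bias, while a larger threshold increases the noise needed for privacy. Balancing the two requires choosing the threshold from the moment bound, the sample size, and ε, which this sketch leaves as a fixed input.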
Taxonomy
Topics: Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data · Age of Information Optimization
