Shuffle Private Linear Contextual Bandits
Sayak Ray Chowdhury, Xingyu Zhou

TL;DR
This paper introduces a shuffle privacy model for linear contextual bandits, achieving better utility than local models and approaching central model performance, with new algorithms and regret guarantees.
Contribution
It proposes a novel shuffle privacy framework for linear bandits, improving regret bounds and bridging the gap between local and central differential privacy models.
Findings
Regret can be reduced to approximately $\tilde{O}(T^{3/5})$ with shuffle privacy.
In non-unique user scenarios, regret scales as $\tilde{O}(T^{2/3})$, matching the central model.
Algorithms are validated through simulations on synthetic data.
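To get a feel for the rates listed above, the following sketch evaluates the polynomial part of each bound at a few horizons. It drops the constants, logarithmic factors, and privacy parameters hidden by the $\tilde{O}$ notation, so the numbers are illustrative only. The function name `regret_rate` and the chosen horizons are assumptions made for this example, not taken from the paper.

```python
from fractions import Fraction

# Exponents quoted in the Findings; constants and log factors are ignored.
EXPONENTS = {
    "shuffle, T^(3/5)": Fraction(3, 5),
    "non-unique users, T^(2/3)": Fraction(2, 3),
}


def regret_rate(horizon: int, exponent: Fraction) -> float:
    """Return horizon**exponent, the polynomial part of a regret bound."""
    return float(horizon) ** float(exponent)


if __name__ == "__main__":
    # Hypothetical horizons, chosen only to show how the two rates diverge.
    for horizon in (10**3, 10**4, 10**5):
        for name, exponent in EXPONENTS.items():
            value = regret_rate(horizon, exponent)
            print(f"T={horizon:>7}  {name:<28} {value:12.1f}")
```

Both exponents are below 1, so either rate is sublinear in $T$; the gap between them widens as the horizon grows.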
Abstract
Differential privacy (DP) has recently been introduced to linear contextual bandits to formally address the privacy concerns in its associated personalized services to participating users (e.g., recommendations). Prior work largely focuses on two trust models of DP: the central model, where a central server is responsible for protecting users' sensitive data, and the (stronger) local model, where information needs to be protected directly on the user side. However, there remains a fundamental gap in the utility achieved by learning algorithms under these two privacy models, e.g., $\tilde{O}(\sqrt{T})$ regret in the central model as compared to $\tilde{O}(T^{3/4})$ regret in the local model, if all users are unique within a learning horizon $T$. In this work, we aim to achieve a stronger model of trust than the central model, while suffering a smaller regret than the local model by considering…
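The trust models differ in where noise is added and who gets to see which message. As a rough illustration, here is a minimal sketch of one round of a generic shuffle-model pipeline as it is commonly described: each user perturbs their own data, a shuffler permutes the batch, and the server aggregates. This is not the paper's protocol. The function names, the Gaussian noise, and the `noise_std` parameter are assumptions for this example, and the noise is not calibrated to any privacy guarantee.

```python
import random
from typing import List

Vector = List[float]


def local_randomizer(x: Vector, noise_std: float, rng: random.Random) -> Vector:
    """Perturb one user's vector with Gaussian noise before it leaves the device."""
    return [xi + rng.gauss(0.0, noise_std) for xi in x]


def shuffler(batch: List[Vector], rng: random.Random) -> List[Vector]:
    """Return the batch in a uniformly random order, hiding who sent what."""
    shuffled = list(batch)
    rng.shuffle(shuffled)
    return shuffled


def analyzer(batch: List[Vector]) -> Vector:
    """Server-side aggregation: coordinate-wise sum of the shuffled messages."""
    return [sum(column) for column in zip(*batch)]


def shuffle_round(user_vectors: List[Vector], noise_std: float, seed: int = 0) -> Vector:
    """Run one batch through randomizer -> shuffler -> analyzer."""
    rng = random.Random(seed)
    messages = [local_randomizer(x, noise_std, rng) for x in user_vectors]
    return analyzer(shuffler(messages, rng))
```

Because the server only computes a sum, the permutation does not change the aggregate. In this sketch its role is purely to break the link between a message and the user who sent it.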
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Bandit Algorithms Research · Age of Information Optimization
