Thompson Sampling Itself is Differentially Private
Tingting Ou, Marco Avella Medina, Rachel Cummings

TL;DR
This paper demonstrates that the classical Thompson sampling algorithm for multi-armed bandits inherently provides differential privacy guarantees without modification, and explores simple adjustments to enhance privacy while maintaining low regret.
Contribution
It proves the inherent differential privacy of Thompson sampling and introduces simple modifications to improve privacy-utility tradeoffs with theoretical and empirical validation.
Findings
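Classical Thompson sampling is differentially private as-is, so existing regret bounds still hold and privacy costs nothing in performance.
Simple modifications, such as pre-pulling all arms a fixed number of times or increasing the sampling variance, can tighten the privacy guarantees.
The parameters introduced by these modifications let the analyst tune the privacy guarantee, and they also affect expected regret.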
Abstract
In this work we first show that the classical Thompson sampling algorithm for multi-armed bandits is differentially private as-is, without any modification. We provide per-round privacy guarantees as a function of problem parameters and show composition over rounds; since the algorithm is unchanged, existing regret bounds still hold and there is no loss in performance due to privacy. We then show that simple modifications -- such as pre-pulling all arms a fixed number of times or increasing the sampling variance -- can provide tighter privacy guarantees. We again provide privacy guarantees that now depend on the new parameters introduced in the modification, which allows the analyst to tune the privacy guarantee as desired. We also provide a novel regret analysis for this new algorithm, and show how the new parameters also impact expected regret. Finally, we…
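To make the two modifications named in the abstract concrete, here is a minimal sketch of Gaussian Thompson sampling with a pre-pull phase and an inflated sampling variance. It is an illustration under stated assumptions, not the paper's algorithm: the posterior form N(sum/(n+1), var_scale/(n+1)), the function name `thompson_sampling`, the helper `bernoulli`, and the parameters `pre_pulls` and `var_scale` are all hypothetical choices made for this example, and the sketch computes no privacy guarantee.

```python
import math
import random


def bernoulli(p):
    """Hypothetical helper: returns a reward function for an arm with mean p."""
    return lambda rng: 1.0 if rng.random() < p else 0.0


def thompson_sampling(reward_fns, horizon, pre_pulls=0, var_scale=1.0, seed=0):
    """Gaussian Thompson sampling sketch with two illustrative knobs.

    pre_pulls: number of forced pulls of every arm before sampling starts.
    var_scale: multiplier on the sampling variance (1.0 leaves it unchanged).
    Both names are assumptions made for this example.
    """
    rng = random.Random(seed)
    k = len(reward_fns)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # cumulative reward per arm
    history = []       # sequence of arms pulled

    def pull(arm):
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        sums[arm] += reward
        history.append(arm)

    # Modification 1: pull every arm a fixed number of times up front.
    for arm in range(k):
        for _ in range(pre_pulls):
            if len(history) < horizon:
                pull(arm)

    # Thompson sampling rounds: draw one sample per arm, play the largest.
    while len(history) < horizon:
        samples = []
        for arm in range(k):
            mean = sums[arm] / (counts[arm] + 1)
            # Modification 2: var_scale > 1 widens the sampling distribution.
            std = math.sqrt(var_scale / (counts[arm] + 1))
            samples.append(rng.gauss(mean, std))
        pull(max(range(k), key=samples.__getitem__))

    return history, counts


if __name__ == "__main__":
    arms = [bernoulli(0.3), bernoulli(0.7)]
    _, counts = thompson_sampling(arms, horizon=2000, pre_pulls=10, var_scale=2.0)
    print("pulls per arm:", counts)
```

Setting `pre_pulls=0` and `var_scale=1.0` recovers the unmodified sampler in this sketch. Raising either value adds exploration, which is the intuition behind trading some regret for a tighter privacy guarantee.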
Taxonomy
Topics: Advanced Topology and Set Theory
