The Price of Differential Privacy For Online Learning

Naman Agarwal; Karan Singh

arXiv:1701.07953·cs.LG·June 15, 2017·24 cites

TL;DR

This paper develops differentially private algorithms for online linear optimization, achieving near-optimal regret bounds and demonstrating that privacy can be incorporated with minimal additional cost in certain settings.

Contribution

The paper introduces differentially private algorithms for online linear optimization with optimal Õ(√T) regret bounds. In the bandit setting, this improves on the previously best known bound of Õ((1/ε)T^(2/3)).

Findings

01

Full-information setting achieves regret of O(√T) + Õ(1/ε).

02

Bandit setting achieves regret of Õ((1/ε)√T), improving over the previously best known bound of Õ((1/ε)T^(2/3)).

03

In the full-information setting, ε-differential privacy adds only an additive Õ(1/ε) term to the regret, so privacy comes essentially for free.

Abstract

We design differentially private algorithms for the problem of online linear optimization in the full information and bandit settings with optimal $\tilde{O}(\sqrt{T})$ regret bounds. In the full-information setting, our results demonstrate that $\epsilon$-differential privacy may be ensured for free -- in particular, the regret bounds scale as $O(\sqrt{T}) + \tilde{O}(\frac{1}{\epsilon})$. For bandit linear optimization, and as a special case, for non-stochastic multi-armed bandits, the proposed algorithm achieves a regret of $\tilde{O}(\frac{1}{\epsilon}\sqrt{T})$, while the previously known best regret bound was $\tilde{O}(\frac{1}{\epsilon}T^{\frac{2}{3}})$.
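
To make the rates in the abstract concrete, the following Python sketch evaluates the leading terms of the three bounds for a few horizons. It is only an illustration: all constants and logarithmic factors hidden by the O and Õ notation are assumed to be 1, and the function names are hypothetical, not taken from the paper.

```python
import math


def full_info_bound(T: int, eps: float) -> float:
    """Leading terms of O(sqrt(T)) + O~(1/eps); constants and logs dropped (assumption)."""
    return math.sqrt(T) + 1.0 / eps


def bandit_bound(T: int, eps: float) -> float:
    """Leading term of O~(sqrt(T)/eps); constants and logs dropped (assumption)."""
    return math.sqrt(T) / eps


def previous_bandit_bound(T: int, eps: float) -> float:
    """Leading term of the earlier O~(T^(2/3)/eps) bound; constants and logs dropped."""
    return T ** (2.0 / 3.0) / eps


def improvement_factor(T: int) -> float:
    """Ratio previous/new for the bandit setting: T^(2/3) / T^(1/2) = T^(1/6)."""
    return T ** (1.0 / 6.0)


if __name__ == "__main__":
    eps = 0.5  # illustrative privacy parameter
    for T in (10**2, 10**4, 10**6):
        print(
            f"T={T:>8}  full-info={full_info_bound(T, eps):10.1f}  "
            f"bandit={bandit_bound(T, eps):10.1f}  "
            f"previous bandit={previous_bandit_bound(T, eps):10.1f}  "
            f"improvement x{improvement_factor(T):.2f}"
        )
```

Under these simplifications, the privacy cost in the full-information bound is an additive term that does not grow with T, whereas in the bandit bound 1/ε multiplies √T. The gap between the new and previous bandit bounds grows as T^(1/6), independent of ε.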

Taxonomy

Topics: Advanced Bandit Algorithms Research · Privacy-Preserving Technologies in Data · Mobile Crowdsensing and Crowdsourcing