Between Pure and Approximate Differential Privacy

Thomas Steinke; Jonathan Ullman

arXiv:1501.06095·cs.DS·January 27, 2015

TL;DR

This paper establishes a new optimal lower bound on the sample complexity for answering statistical queries under differential privacy, smoothly bridging pure and approximate privacy regimes, and introduces improved private algorithms.

Contribution

It provides the first lower bound that depends optimally on $δ$, interpolating between pure and approximate differential privacy, and offers improved private algorithms for statistical queries.

Findings

01

Lower bound on sample complexity: $n \geq \Omega\left(\sqrt{d \log(1/δ)} / (αε)\right)$

02

Optimal dependence on $δ$, $ε$, and $d$ for high-dimensional data

03

Enhanced private algorithms with reduced sample complexity

Abstract

We show a new lower bound on the sample complexity of $(ε, δ)$-differentially private algorithms that accurately answer statistical queries on high-dimensional databases. The novelty of our bound is that it depends optimally on the parameter $δ$, which loosely corresponds to the probability that the algorithm fails to be private, and is the first to smoothly interpolate between approximate differential privacy ($δ > 0$) and pure differential privacy ($δ = 0$). Specifically, we consider a database $D \in \{\pm 1\}^{n \times d}$ and its one-way marginals, which are the $d$ queries of the form "What fraction of individual records have the $i$-th bit set to $+1$?" We show that in order to answer all of these queries to within error $\pm α$ (on average) while satisfying $(ε, δ)$-differential privacy, it is necessary that $$ n \geq…
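
To make the one-way marginal queries concrete, the following is a minimal sketch in Python with NumPy. The function names `one_way_marginals` and `laplace_marginals` are hypothetical and do not come from the paper. The second function uses the textbook Laplace mechanism for the pure ($δ = 0$) case purely as an illustration; it is an assumption of this sketch, not the algorithm the abstract refers to.

```python
import numpy as np


def one_way_marginals(db):
    """Return the d one-way marginals of an n x d database over {-1, +1}.

    Entry i is the fraction of rows whose i-th bit equals +1.
    """
    db = np.asarray(db)
    return (db == 1).mean(axis=0)


def laplace_marginals(db, epsilon, rng):
    """Illustrative baseline (an assumption, not the paper's method).

    Replacing one row moves each marginal by at most 1/n, so the L1
    sensitivity of the whole answer vector is at most d/n. Adding
    independent Laplace noise of scale d / (n * epsilon) to every
    coordinate is the standard way to get epsilon-differential privacy.
    """
    db = np.asarray(db)
    n, d = db.shape
    scale = d / (n * epsilon)
    return one_way_marginals(db) + rng.laplace(0.0, scale, size=d)


# Tiny example: n = 4 records, d = 2 bits.
db = np.array([[1, -1],
               [1, 1],
               [-1, -1],
               [1, -1]])
exact = one_way_marginals(db)          # array([0.75, 0.25])
noisy = laplace_marginals(db, epsilon=1.0, rng=np.random.default_rng(0))
avg_error = np.abs(noisy - exact).mean()  # the "on average" error over queries
```

With so few records the noise dominates the answers; the lower bound in the abstract concerns how large $n$ must be before the average error can fall below $α$.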
