A Learning Theory Approach to Non-Interactive Database Privacy
Avrim Blum, Katrina Ligett, Aaron Roth

TL;DR
This paper explores the limits of non-interactive data privacy, proposing mechanisms for synthetic data release with utility guarantees based on learning theory, and introduces a new privacy notion called distributional privacy.
Contribution
It introduces a learning-theoretic framework for non-interactive privacy mechanisms, gives new algorithms for private synthetic data release, and defines distributional privacy, a privacy notion strictly stronger than differential privacy.
Findings
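Ignoring computational constraints, private synthetic data release is feasible for large classes of counting queries, with error that grows only with the VC-dimension of the query class.
Privately releasing even simple query classes (such as intervals and their generalizations) over continuous domains is impossible.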
A polynomial-time algorithm for halfspace queries with relaxed utility guarantees is provided.
Abstract
In this paper we demonstrate that, ignoring computational constraints, it is possible to privately release synthetic databases that are useful for large classes of queries -- much larger in size than the database itself. Specifically, we give a mechanism that privately releases synthetic data for a class of queries over a discrete domain with error that grows as a function of the size of the smallest net approximately representing the answers to that class of queries. We show that this in particular implies a mechanism for counting queries that gives error guarantees that grow only with the VC-dimension of the class of queries, which itself grows only logarithmically with the size of the query class. We also show that it is not possible to privately release even simple classes of queries (such as intervals and their generalizations) over continuous domains. Despite this, we give a…
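The abstract does not spell out how the synthetic database is chosen. As a rough illustration only, the sketch below assumes an exponential-mechanism style sampler: every small database over a tiny discrete domain is a candidate, and a candidate is sampled with probability that decays exponentially in its worst-case error over the query class. The function names, the toy domain, the threshold queries, and the parameter values are all hypothetical choices made for this example and are not taken from the paper. The brute-force enumeration of candidates is also why this is only viable when computational constraints are ignored.

```python
import itertools
import math
import random


def counting_query(predicate, database):
    """Fraction of rows in `database` that satisfy `predicate`."""
    return sum(1 for row in database if predicate(row)) / len(database)


def max_query_error(queries, real_db, synthetic_db):
    """Worst-case gap between real and synthetic answers over all queries."""
    return max(
        abs(counting_query(q, real_db) - counting_query(q, synthetic_db))
        for q in queries
    )


def release_synthetic_db(real_db, domain, queries, synthetic_size, epsilon, rng):
    """Sample a small synthetic database (assumed exponential-mechanism sketch).

    The score of a candidate is minus its worst-case query error. Changing one
    row of `real_db` moves every counting query by at most 1/n, so the score
    has sensitivity 1/n and the exponent is epsilon * score / (2 * (1/n)).
    """
    n = len(real_db)
    # Brute force: every multiset of `synthetic_size` rows over the domain.
    candidates = list(
        itertools.combinations_with_replacement(domain, synthetic_size)
    )
    log_weights = [
        -epsilon * n * max_query_error(queries, real_db, c) / 2
        for c in candidates
    ]
    top = max(log_weights)  # shift before exponentiating for stability
    weights = [math.exp(w - top) for w in log_weights]
    return list(rng.choices(candidates, weights=weights, k=1)[0])


# Hypothetical toy setup: domain {0..7}, threshold queries "row <= t".
domain = range(8)
queries = [lambda x, t=t: x <= t for t in domain]
rng = random.Random(0)
real_db = [rng.choice([1, 2, 2, 5, 6]) for _ in range(200)]

synthetic = release_synthetic_db(
    real_db, domain, queries, synthetic_size=4, epsilon=1.0, rng=rng
)
print("synthetic rows:", synthetic)
print("worst-case query error:", max_query_error(queries, real_db, synthetic))
```

The released rows are drawn from the domain rather than copied from the real data, and any counting query in the class can be answered directly on them. In this toy setup the number of candidates is already 330 for 4 rows over 8 values, and it grows quickly with the domain and the synthetic database size.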
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Complexity and Algorithms in Graphs
