Differentially Private Data Release over Multiple Tables
Badih Ghazi, Xiao Hu, Ravi Kumar, Pasin Manurangsi

TL;DR
This paper develops a differentially private method for releasing synthetic data that answers multiple linear queries over multiple database tables. It addresses the sensitivity amplification caused by join operations and gives an algorithm that is optimal up to logarithmic factors on some simple queries.
Contribution
It introduces a general algorithm for differentially private data release over multiple tables with joins, proves a lower bound showing the algorithm is parameterized-optimal (up to logarithmic factors) on some simple queries, and improves utility for hierarchical joins.
Findings
The general algorithm achieves parameterized optimality, up to logarithmic factors, on some simple queries such as two-table joins.
A data partition method improves utility for hierarchical joins.
The approach addresses the sensitivity amplification caused by complex join relationships.
Abstract
We study synthetic data release for answering multiple linear queries over a set of database tables in a differentially private way. Two special cases have been considered in the literature: how to release a synthetic dataset for answering multiple linear queries over a single table, and how to release the answer for a single counting (join size) query over a set of database tables. Compared to the single-table case, the join operator makes query answering challenging, since the sensitivity (i.e., by how much an individual data record can affect the answer) could be heavily amplified by complex join relationships. We present an algorithm for the general problem, and prove a lower bound illustrating that our general algorithm achieves parameterized optimality (up to logarithmic factors) on some simple queries (e.g., two-table join queries) in the most commonly-used privacy parameter…
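The sensitivity amplification described in the abstract can be illustrated with a small sketch. This is not the paper's algorithm; it only shows, on made-up toy tables, how removing one record can change a two-table join count by far more than one. The names `join_size`, `removal_effect`, `orders`, and `visits` are hypothetical and chosen for illustration.

```python
from collections import Counter


def join_size(r1, r2):
    """Size of the natural join of two tables on a shared key.

    Each table is a list of (key, payload) tuples (an assumed toy layout).
    """
    c1 = Counter(k for k, _ in r1)
    c2 = Counter(k for k, _ in r2)
    # Each key contributes (matching rows in r1) * (matching rows in r2).
    return sum(c1[k] * c2[k] for k in c1)


def removal_effect(r1, r2, idx):
    """Change in join size when row `idx` is removed from r1."""
    reduced = r1[:idx] + r1[idx + 1:]
    return join_size(r1, r2) - join_size(reduced, r2)


# Toy data: "alice" matches 50 rows in the second table, "bob" matches 1.
orders = [("alice", "o1"), ("bob", "o2")]
visits = [("alice", f"v{i}") for i in range(50)] + [("bob", "v50")]

# Single-table counting query: removing any one row changes the count by 1.
single_table_effect = len(orders) - len(orders[1:])

# Join counting query: the effect equals the row's match count in the other table.
alice_effect = removal_effect(orders, visits, 0)
bob_effect = removal_effect(orders, visits, 1)

print(single_table_effect, alice_effect, bob_effect)
```

In this toy instance, one record's influence on the join count is as large as the number of rows it joins with, which is why noise calibrated to the worst case can be far larger for join queries than for single-table queries.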
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Complexity and Algorithms in Graphs
