DP-PQD: Privately Detecting Per-Query Gaps In Synthetic Data Generated By Black-Box Mechanisms
Shweta Patwa, Danyu Sun, Amir Gilad, Ashwin Machanavajjhala, and Sudeepa Roy

TL;DR
DP-PQD is a framework that provides differential privacy guarantees for assessing the accuracy of specific aggregated queries on synthetic data, helping users determine if the data is suitable for their analysis needs.
Contribution
The paper introduces DP-PQD, a novel differentially-private framework for per-query quality assessment of synthetic data, addressing a gap in existing data generation systems.
Findings
Effective private algorithms for count, sum, and median queries.
Experimental validation shows reliable per-query accuracy detection.
Framework enables trust in synthetic data for specific queries.
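To make the idea of a private per-query check concrete, the following is a minimal sketch for a count query. It is not the paper's algorithm: it assumes a standard Laplace mechanism, treats the synthetic data as public, and uses hypothetical names (`count_gap_exceeds`, `predicate`, `tau`, `epsilon`) chosen only for illustration. The check asks whether the count on the synthetic data differs from the count on the private data by more than a threshold `tau`, answering with noise so that the private count is not revealed exactly.

```python
import numpy as np


def count_gap_exceeds(private_rows, synthetic_rows, predicate, tau, epsilon, rng=None):
    """Noisy test of whether |count(private) - count(synthetic)| > tau.

    Illustrative sketch only (an assumption, not the DP-PQD algorithm):
    a Laplace mechanism applied to the private count.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng if rng is not None else np.random.default_rng()

    # Counts of rows satisfying the query condition.
    private_count = sum(1 for row in private_rows if predicate(row))
    synthetic_count = sum(1 for row in synthetic_rows if predicate(row))

    # Adding or removing one private row changes the count by at most 1,
    # so Laplace noise with scale 1/epsilon is used for the private count.
    # Assumption: the synthetic data is treated as public and needs no noise.
    noisy_private_count = private_count + rng.laplace(loc=0.0, scale=1.0 / epsilon)

    return bool(abs(noisy_private_count - synthetic_count) > tau)
```

For example, calling `count_gap_exceeds(private, synthetic, lambda r: r["age"] >= 30, tau=100, epsilon=1.0)` returns a boolean indicating whether the noisy gap exceeds 100. Because of the noise, answers near the threshold can go either way; only gaps well above or well below `tau` are reported reliably.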
Abstract
Synthetic data generation methods, and in particular, private synthetic data generation methods, are gaining popularity as a means to make copies of sensitive databases that can be shared widely for research and data analysis. Some of the fundamental operations in data analysis include analyzing aggregated statistics, e.g., count, sum, or median, on a subset of data satisfying some conditions. When synthetic data is generated, users may be interested in knowing if their aggregated queries generating such statistics can be reliably answered on the synthetic data, for instance, to decide if the synthetic data is suitable for specific tasks. However, the standard data generation systems do not provide "per-query" quality guarantees on the synthetic data, and the users have no way of knowing how much the aggregated statistics on the synthetic data can be trusted. To address this problem, we…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Blockchain Technology Applications and Security
