Differentially Private Synthetic Data with Private Density Estimation
Nikolija Bojkovic, Po-Ling Loh

TL;DR
This paper introduces a new differentially private synthetic data generation method that improves upon previous algorithms by using private density estimation, enabling accurate data synthesis for both discrete and continuous data.
Contribution
The paper adapts the optimization-based algorithm of Boedihardjo et al. by replacing its uniform sampling step with a private distribution estimator, which yields better computational guarantees for discrete distributions and extends the method to continuous distributions.
Findings
Improved privacy-preserving synthetic data generation for discrete distributions.
Development of a novel algorithm for continuous data synthesis.
Demonstrated applications to various statistical tasks.
Abstract
The need to analyze sensitive data, such as medical records or financial data, has created a critical research challenge in recent years. In this paper, we adopt the framework of differential privacy, and explore mechanisms for generating an entire dataset which accurately captures characteristics of the original data. We build upon the work of Boedihardjo et al., which laid the foundations for a new optimization-based algorithm for generating private synthetic data. Importantly, we adapt their algorithm by replacing a uniform sampling step with a private distribution estimator; this allows us to obtain better computational guarantees for discrete distributions, and develop a novel algorithm suitable for continuous distributions. We also explore applications of our work to several statistical tasks.
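To make the idea of a "private distribution estimator" concrete, here is a minimal sketch of one standard choice: a histogram with Laplace noise, followed by sampling from the noisy estimate. This is not the algorithm from the paper or from Boedihardjo et al.; it only illustrates what sampling from a private density estimate, rather than uniformly, can look like. The function names (`private_histogram`, `sample_synthetic`), the domain [0, 1], the bin count, and the add/remove-one notion of neighboring datasets are all assumptions made for this illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_histogram(data, num_bins, epsilon, rng):
    """Hypothetical helper: epsilon-DP histogram estimate on [0, 1].

    Assumes neighboring datasets differ by adding or removing one record,
    so the vector of bin counts has L1 sensitivity 1.
    """
    counts = [0] * num_bins
    for x in data:
        # Clamp the bin index so out-of-range values fall in an edge bin.
        idx = min(max(int(x * num_bins), 0), num_bins - 1)
        counts[idx] += 1
    # Laplace mechanism: noise with scale sensitivity / epsilon = 1 / epsilon.
    noisy = [c + laplace_noise(1.0 / epsilon, rng) for c in counts]
    # Post-processing (clipping, normalizing) does not affect the privacy guarantee.
    clipped = [max(c, 0.0) for c in noisy]
    total = sum(clipped)
    if total == 0.0:
        return [1.0 / num_bins] * num_bins  # fall back to uniform
    return [c / total for c in clipped]


def sample_synthetic(probs, m, rng):
    """Draw m synthetic points: pick a bin by its private mass, then a uniform point inside it."""
    bins = rng.choices(range(len(probs)), weights=probs, k=m)
    width = 1.0 / len(probs)
    return [(b + rng.random()) * width for b in bins]


if __name__ == "__main__":
    rng = random.Random(0)
    sensitive = [rng.betavariate(2, 5) for _ in range(1000)]  # stand-in for real data
    probs = private_histogram(sensitive, num_bins=10, epsilon=1.0, rng=rng)
    synthetic = sample_synthetic(probs, m=20, rng=rng)
    print(len(synthetic), "synthetic points drawn from the private estimate")
```

Because the synthetic points depend on the sensitive data only through the noisy histogram, releasing them costs no additional privacy beyond the chosen epsilon. The bin count trades off discretization error against noise per bin; the value 10 above is arbitrary.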
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Pharmacological Effects and Toxicity Studies
