Differentially Private Low-dimensional Synthetic Data from High-dimensional Datasets
Yiyun He, Thomas Strohmer, Roman Vershynin, Yizhe Zhu

TL;DR
This paper introduces a differentially private algorithm that efficiently generates low-dimensional synthetic data from high-dimensional datasets, overcoming the curse of dimensionality with a utility guarantee based on Wasserstein distance.
Contribution
The paper presents a novel private PCA method whose near-optimal accuracy bound does not require a spectral gap in the covariance matrix; this procedure is the key step that makes low-dimensional synthetic data generation from high-dimensional data feasible.
Findings
Achieves utility guarantees for synthetic data with respect to Wasserstein distance.
Provides a private PCA algorithm with near-optimal accuracy without spectral gap assumptions.
Effectively mitigates the curse of dimensionality in differentially private data synthesis.
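To make the utility metric concrete, here is a minimal sketch of the 1-Wasserstein distance in the simplest possible setting: two one-dimensional samples of equal size, where the distance reduces to the mean absolute difference between sorted values. The function name `wasserstein1_1d` and the toy data are illustrative assumptions; the paper's guarantee concerns higher-dimensional data, and this is not the authors' evaluation code.

```python
def wasserstein1_1d(a, b):
    """1-Wasserstein distance between two equal-size 1D empirical samples.

    For equal-size samples on the real line, the optimal transport plan
    matches the i-th smallest point of `a` with the i-th smallest of `b`.
    """
    if len(a) != len(b):
        raise ValueError("this sketch only handles samples of equal size")
    a_sorted = sorted(a)
    b_sorted = sorted(b)
    return sum(abs(x - y) for x, y in zip(a_sorted, b_sorted)) / len(a)


# Hypothetical toy data: the "synthetic" sample is the real one shifted by 0.5.
real = [0.0, 1.0, 2.0, 3.0]
synthetic = [0.5, 1.5, 2.5, 3.5]
distance = wasserstein1_1d(real, synthetic)
```

A smaller value means the synthetic sample is closer in distribution to the real one; identical samples give a distance of zero.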
Abstract
Differentially private synthetic data provide a powerful mechanism to enable data analysis while protecting sensitive information about individuals. However, when the data lie in a high-dimensional space, the accuracy of the synthetic data suffers from the curse of dimensionality. In this paper, we propose a differentially private algorithm to generate low-dimensional synthetic data efficiently from a high-dimensional dataset with a utility guarantee with respect to the Wasserstein distance. A key step of our algorithm is a private principal component analysis (PCA) procedure with a near-optimal accuracy bound that circumvents the curse of dimensionality. Unlike the standard perturbation analysis, our analysis of private PCA works without assuming the spectral gap for the covariance matrix.
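The abstract does not spell out the private PCA procedure, so the following is only a rough sketch of one standard approach: perturb the second-moment matrix with symmetric Gaussian noise and take its top eigenvectors. The function name `private_pca_projection`, the row-clipping step, the sensitivity bound, and the noise calibration (the classical Gaussian mechanism, which assumes epsilon < 1) are assumptions for illustration and may differ from the authors' algorithm and analysis.

```python
import numpy as np


def private_pca_projection(X, k, epsilon, delta, rng):
    """Return a (d, k) orthonormal basis computed from a noisy second-moment matrix."""
    n, d = X.shape
    # Assumption: clip each row to Euclidean norm <= 1 so that a single
    # record has bounded influence on the second-moment matrix.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.maximum(norms, 1.0)

    # Uncentered second-moment matrix (no mean subtraction in this sketch).
    M = X.T @ X / n

    # Replacing one clipped row changes X^T X by at most sqrt(2) in
    # Frobenius norm, so M has L2 sensitivity at most sqrt(2) / n.
    sensitivity = np.sqrt(2.0) / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

    # Draw noise for the upper triangle and mirror it to keep M symmetric.
    noise = rng.normal(0.0, sigma, size=(d, d))
    sym_noise = np.triu(noise) + np.triu(noise, 1).T
    M_noisy = M + sym_noise

    # Top-k eigenvectors of the noisy matrix span the private subspace.
    eigvals, eigvecs = np.linalg.eigh(M_noisy)
    top = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, top]


# Hypothetical usage on random data: 500 records in 20 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))
V = private_pca_projection(X, k=3, epsilon=0.5, delta=1e-5, rng=rng)
```

Only the basis `V` is privatized here. Projecting the raw records onto `V` does not by itself produce private output, so a synthetic-data step in the low-dimensional subspace would need its own privacy mechanism and budget.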
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Chaos-based Image/Signal Encryption
Methods: Principal Components Analysis
