Differentially Private Synthetic Data via APIs 3: Using Simulators Instead of Foundation Model
Zinan Lin, Tadas Baltrusaitis, Wenyu Wang, Sergey Yekhanin

TL;DR
This paper introduces Sim-PE, a method that uses simulators instead of foundation models within the Private Evolution framework to generate differentially private synthetic data, broadening applicability and improving performance.
Contribution
It demonstrates that simulators, including non-neural network data synthesizers, can be integrated into PE, expanding its use beyond foundation models for DP data synthesis.
Findings
Sim-PE improves classification accuracy by up to 3x.
Sim-PE reduces the Fréchet Inception Distance (FID) by up to 80%.
Sim-PE offers greater efficiency in DP synthetic data generation.
Abstract
Differentially private (DP) synthetic data, which closely resembles the original private data while maintaining strong privacy guarantees, has become a key tool for unlocking the value of private data without compromising privacy. Recently, Private Evolution (PE) has emerged as a promising method for generating DP synthetic data. Unlike other training-based approaches, PE only requires access to inference APIs from foundation models, enabling it to harness the power of state-of-the-art (SoTA) models. However, a suitable foundation model for a specific private data domain is not always available. In this paper, we discover that the PE framework is sufficiently general to allow APIs beyond foundation models. In particular, we demonstrate that many SoTA data synthesizers that do not rely on neural networks--such as computer graphics-based image generators, which we refer to as…
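The idea of running Private Evolution against a simulator rather than a foundation model can be illustrated with a small sketch. The code below is not the authors' implementation; it is a toy under stated assumptions. The names `random_api`, `variation_api`, `render`, `dp_nn_histogram`, and `sim_pe` are hypothetical, the "simulator" is a stand-in that maps two parameters to a 2D point, and the noise scale `sigma` is illustrative only (no privacy accounting is performed, so the sketch makes no formal DP claim).

```python
import math
import random


def random_api(rng):
    """Hypothetical simulator call: draw random simulator parameters."""
    return (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))


def variation_api(params, rng, scale):
    """Hypothetical simulator call: slightly perturb existing parameters."""
    return tuple(p + rng.gauss(0.0, scale) for p in params)


def render(params):
    """Stand-in for the simulator's rendering step (identity in this toy)."""
    return params


def dp_nn_histogram(private, synthetic, sigma, rng):
    """Each private point votes for its nearest synthetic sample.

    Gaussian noise is added to the vote counts and negative values are
    clipped to zero. sigma is illustrative, not calibrated to a budget.
    """
    votes = [0.0] * len(synthetic)
    for x in private:
        nearest = min(
            range(len(synthetic)),
            key=lambda i: math.dist(x, render(synthetic[i])),
        )
        votes[nearest] += 1.0
    return [max(v + rng.gauss(0.0, sigma), 0.0) for v in votes]


def sim_pe(private, n_syn=50, iterations=5, sigma=1.0, scale=0.1, seed=0):
    """Toy evolution loop: vote, resample by noisy votes, then vary."""
    rng = random.Random(seed)
    population = [random_api(rng) for _ in range(n_syn)]
    for _ in range(iterations):
        hist = dp_nn_histogram(private, population, sigma, rng)
        if sum(hist) == 0:
            # Fall back to uniform weights if noise wiped out every vote.
            hist = [1.0] * n_syn
        parents = rng.choices(population, weights=hist, k=n_syn)
        population = [variation_api(p, rng, scale) for p in parents]
    return [render(p) for p in population]
```

For example, `sim_pe([(0.5, 0.5)] * 20, n_syn=10, iterations=3)` returns ten synthetic 2D points. The private data is touched only inside `dp_nn_histogram`, which is why a simulator exposing nothing more than "sample" and "perturb" operations can stand in for a foundation-model API in this kind of loop.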
Taxonomy
Topics: Distributed systems and fault tolerance
