PCEvolve: Private Contrastive Evolution for Synthetic Dataset Generation via Few-Shot Private Data and Generative APIs
Jianqing Zhang, Yang Liu, Jie Fu, Yang Hua, Tianyuan Zou, Jian Cao, Qiang Yang

TL;DR
PCEvolve is an API-assisted algorithm for privacy-preserving synthetic data generation that exploits inter-class contrastive relationships in the private data. It targets settings where only few-shot private data is available.
Contribution
It introduces PCEvolve, a method that improves differentially private (DP) synthetic image generation by mining inter-class contrastive relationships in few-shot private data and integrating them into an adapted Exponential Mechanism (EM).
Findings
PCEvolve outperforms Private Evolution (PE) and other baselines in experiments on four specialized datasets.
The method generates high-quality DP synthetic images from few-shot private data.
Access to generative APIs improves the utility of the privacy-preserving synthetic data.
Abstract
The rise of generative APIs has fueled interest in privacy-preserving synthetic data generation. While the Private Evolution (PE) algorithm generates Differential Privacy (DP) synthetic images using diffusion model APIs, it struggles with few-shot private data due to the limitations of its DP-protected similarity voting approach. In practice, the few-shot private data challenge is particularly prevalent in specialized domains like healthcare and industry. To address this challenge, we propose a novel API-assisted algorithm, Private Contrastive Evolution (PCEvolve), which iteratively mines inherent inter-class contrastive relationships in few-shot private data beyond individual data points and seamlessly integrates them into an adapted Exponential Mechanism (EM) to optimize DP's utility in an evolution loop. We conduct extensive experiments on four specialized datasets, demonstrating…
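The abstract refers to an adapted Exponential Mechanism (EM) but does not spell out the adaptation. As background only, the following is a minimal sketch of the standard, textbook EM, not PCEvolve's adapted version. It selects one candidate with probability proportional to exp(epsilon * utility / (2 * sensitivity)). The function names, the `utilities` list, and the `sensitivity` parameter are illustrative assumptions and do not come from the paper.

```python
import math
import random


def exponential_mechanism_probs(utilities, epsilon, sensitivity=1.0):
    """Selection probabilities of the standard Exponential Mechanism.

    Assumption: `utilities` holds one score per candidate, and changing one
    private record changes any score by at most `sensitivity`.
    """
    if not utilities:
        raise ValueError("utilities must be non-empty")
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = epsilon / (2.0 * sensitivity)
    # Subtract the maximum before exponentiating for numerical stability;
    # this leaves the normalized probabilities unchanged.
    max_u = max(utilities)
    weights = [math.exp(scale * (u - max_u)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_mechanism(utilities, epsilon, sensitivity=1.0, rng=None):
    """Sample the index of one candidate under the standard EM."""
    rng = rng or random.Random()
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    return rng.choices(range(len(utilities)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical utility scores for three synthetic candidates.
    scores = [0.2, 0.9, 0.5]
    print(exponential_mechanism_probs(scores, epsilon=1.0))
    print(exponential_mechanism(scores, epsilon=1.0, rng=random.Random(0)))
```

In this sketch, a larger epsilon concentrates probability on the highest-utility candidate, while a smaller epsilon pushes the selection toward uniform. How PCEvolve defines its utility from contrastive relationships is not stated in the text above, so the scores here are placeholders.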
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Generative Adversarial Networks and Image Synthesis · Cryptography and Data Security
