"What do you want from theory alone?" Experimenting with Tight Auditing of Differentially Private Synthetic Data Generation
Meenatchi Sundaram Muthu Selva Annamalai, Georgi Ganev, Emiliano De Cristofaro

TL;DR
This paper audits six differentially private synthetic data generation implementations, showing that some leak more than their theoretical bounds allow and that white-box attacks are needed to detect it.
Contribution
It provides a comprehensive empirical evaluation of DP-SDG implementations, demonstrating the limitations of black-box MIAs and proposing stronger white-box attacks for accurate privacy assessment.
Findings
Black-box MIAs are often too weak to detect actual privacy leaks.
White-box MIAs reveal previously undetected DP violations.
Automated auditing identified known and new privacy violations.
Abstract
Differentially private synthetic data generation (DP-SDG) algorithms are used to release datasets that are structurally and statistically similar to sensitive data while providing formal bounds on the information they leak. However, bugs in algorithms and implementations may cause the actual information leakage to be higher. This prompts the need to verify whether the theoretical guarantees of state-of-the-art DP-SDG implementations also hold in practice. We do so via a rigorous auditing process: we compute the information leakage via an adversary playing a distinguishing game and running membership inference attacks (MIAs). If the leakage observed empirically is higher than the theoretical bounds, we identify a DP violation; if it is non-negligibly lower, the audit is loose. We audit six DP-SDG implementations using different datasets and threat models and find that black-box MIAs…
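The comparison described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' code: it assumes the adversary's membership inference attack has already been run over many rounds of the distinguishing game and summarised as a false positive rate and a false negative rate. It uses the standard hypothesis-testing inequality for (ε, δ)-DP, under which any attack must satisfy FPR + e^ε · FNR ≥ 1 − δ and FNR + e^ε · FPR ≥ 1 − δ. The function names and the `tolerance` parameter are hypothetical, and the sketch uses point estimates, whereas a rigorous audit would replace them with confidence-interval bounds on the error rates.

```python
import math


def _log_ratio(numerator: float, denominator: float) -> float:
    """Return log(numerator / denominator), handling the degenerate cases."""
    if numerator <= 0:
        return 0.0          # the inequality gives no information
    if denominator <= 0:
        return math.inf     # the attack never errs on this side
    return math.log(numerator / denominator)


def empirical_epsilon(fpr: float, fnr: float, delta: float = 0.0) -> float:
    """Smallest epsilon consistent with the attack's observed error rates.

    Rearranges the two (epsilon, delta)-DP hypothesis-testing inequalities
    and keeps the larger of the two resulting lower bounds, floored at 0.
    """
    return max(
        0.0,
        _log_ratio(1.0 - delta - fpr, fnr),
        _log_ratio(1.0 - delta - fnr, fpr),
    )


def audit_verdict(eps_emp: float, eps_theory: float, tolerance: float = 0.1) -> str:
    """Classify an audit result (tolerance is an illustrative assumption)."""
    if eps_emp > eps_theory:
        return "violation"  # observed leakage exceeds the claimed bound
    if eps_theory - eps_emp > tolerance:
        return "loose"      # attack too weak to match the bound
    return "tight"


if __name__ == "__main__":
    eps = empirical_epsilon(fpr=0.05, fnr=0.05)
    print(f"empirical epsilon = {eps:.3f}")
    print(audit_verdict(eps, eps_theory=1.0))
```

With a 5% false positive rate and a 5% false negative rate at δ = 0, the estimate is ln 19 ≈ 2.94, so an implementation claiming ε = 1 would be flagged as a violation. A weak attack that does little better than guessing yields an estimate near 0, which is the "loose" case the abstract refers to.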
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Database Systems and Queries
