Tabular Data Synthesis with Differential Privacy: A Survey
Mengmeng Yang, Chi-Hung Chi, Kwok-Yan Lam, Jie Feng, Taolin Guo, Wei Ni

TL;DR
This survey reviews methods for generating synthetic tabular data with differential privacy, discussing their challenges, classifications, evaluation techniques, and future research directions to enable privacy-preserving data sharing.
Contribution
It provides a comprehensive classification and comparison of existing differentially private tabular data synthesis methods, highlighting their strengths, weaknesses, and research gaps.
Findings
Statistical and deep learning approaches each have unique advantages and challenges.
Evaluation methods vary in effectiveness for assessing data utility and privacy.
The survey identifies key research gaps and future directions in differentially private data synthesis.
Abstract
Data sharing is a prerequisite for collaborative innovation, enabling organizations to leverage diverse datasets for deeper insights. In real-world applications like FinTech and Smart Manufacturing, transactional data, often in tabular form, are generated and analyzed for insight generation. However, such datasets typically contain sensitive personal/business information, raising privacy concerns and regulatory risks. Data synthesis tackles this by generating artificial datasets that preserve the statistical characteristics of real data, removing direct links to individuals. However, attackers can still infer sensitive information using background knowledge. Differential privacy offers a solution by providing provable and quantifiable privacy protection. Consequently, differentially private data synthesis has emerged as a promising approach to privacy-aware data sharing. This paper…
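To make the idea of differentially private synthesis concrete, here is a minimal sketch of one simple statistical approach: perturb a full-domain histogram with Laplace noise, then sample synthetic rows from it. It is an illustration only, not a method taken from the survey. The function names (`laplace_noise`, `dp_histogram_synthesize`), the add/remove-one-row neighboring definition, and the assumption that each column's domain is public and small enough to enumerate are all assumptions made for this example. The floating-point noise sampling is also not hardened against known precision attacks.

```python
import random
from collections import Counter
from itertools import product


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_histogram_synthesize(rows, domains, epsilon, n_synth, seed=0):
    """Sample synthetic rows from a Laplace-noised histogram (hypothetical helper).

    rows:     list of tuples, one value per column
    domains:  list of per-column value lists, assumed public (not derived from data)
    epsilon:  privacy budget spent on the histogram
    n_synth:  number of synthetic rows, chosen by the caller independently of the data
    """
    rng = random.Random(seed)
    counts = Counter(rows)
    # Enumerate every cell of the public domain, including cells with zero count,
    # so the set of released cells does not depend on the private data.
    cells = list(product(*domains))
    # Adding or removing one row changes exactly one cell count by 1,
    # so the L1 sensitivity is 1 and the noise scale is 1 / epsilon.
    noisy = [counts.get(c, 0) + laplace_noise(1.0 / epsilon, rng) for c in cells]
    # Clamping and sampling are post-processing and spend no extra budget.
    weights = [max(v, 0.0) for v in noisy]
    if sum(weights) == 0:
        weights = [1.0] * len(cells)  # fall back to uniform if all counts clamp to zero
    return rng.choices(cells, weights=weights, k=n_synth)


# Usage: two categorical columns with small, known domains.
real = [("a", 0)] * 50 + [("b", 1)] * 50
synthetic = dp_histogram_synthesize(real, [["a", "b"], [0, 1]], epsilon=1.0, n_synth=200)
```

The number of histogram cells grows exponentially with the number of columns, so this sketch is only practical for very low-dimensional tables.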
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Advanced Steganography and Watermarking Techniques · Chaos-based Image/Signal Encryption
