Quantifying Membership Disclosure Risk for Tabular Synthetic Data Using Kernel Density Estimators
Rajdeep Pathak, Sayantee Jana

TL;DR
This paper introduces a KDE-based method for quantifying membership inference risk in synthetic tabular data, giving a practical privacy risk assessment tool that avoids the computational cost of training shadow models.
Contribution
The paper presents a KDE-based approach to quantifying membership inference risk that the authors report to be more effective and practical than previous methods, especially for tabular synthetic data.
Findings
Outperforms the prior baseline in F1 score and risk characterization
Works effectively across multiple real-world datasets and generators
Does not require shadow models for risk assessment
Abstract
The use of synthetic data has become increasingly popular as a privacy-preserving alternative to sharing real datasets, especially in sensitive domains such as healthcare, finance, and demography. However, the privacy assurances of synthetic data are not absolute: synthetic data remains susceptible to membership inference attacks (MIAs), in which adversaries aim to determine whether a specific individual was present in the dataset used to train the generator. In this work, we propose a practical and effective method to quantify membership disclosure risk in tabular synthetic datasets using kernel density estimators (KDEs). Our KDE-based approach models the distribution of nearest-neighbour distances between synthetic data and the training records, allowing probabilistic inference of membership and enabling robust evaluation via ROC curves. We propose two attack models: a 'True Distribution Attack',…
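As an illustration of the idea described in the abstract (a KDE over nearest-neighbour distances used for probabilistic membership inference), here is a minimal Python sketch. It is not the authors' procedure: the toy data, the equal prior, the rule-of-thumb bandwidth, and the function names `nn_distances`, `gaussian_kde_1d`, and `membership_probability` are all assumptions made for this example, and it does not reproduce either of the paper's attack models.

```python
import numpy as np

def nn_distances(records, synthetic):
    """Euclidean distance from each record to its nearest synthetic row."""
    diffs = records[:, None, :] - synthetic[None, :, :]
    return np.sqrt((diffs ** 2).sum(axis=2)).min(axis=1)

def gaussian_kde_1d(samples):
    """Return a 1-D Gaussian KDE using a normal-reference rule-of-thumb bandwidth."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    bandwidth = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)

    def density(x):
        z = (np.atleast_1d(x)[:, None] - samples[None, :]) / bandwidth
        kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
        return kernel.sum(axis=1) / (n * bandwidth)

    return density

def membership_probability(d, member_kde, nonmember_kde, prior=0.5):
    """Posterior probability of membership given distance d (Bayes' rule)."""
    f_in = prior * member_kde(d)
    f_out = (1 - prior) * nonmember_kde(d)
    return f_in / (f_in + f_out + 1e-300)  # tiny constant guards against 0/0

# Toy setup (assumption): a "generator" that overfits by jittering training rows.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 3))      # members
holdout = rng.normal(size=(200, 3))    # non-members from the same population
synthetic = train + rng.normal(scale=0.05, size=train.shape)

d_members = nn_distances(train, synthetic)
d_nonmembers = nn_distances(holdout, synthetic)

member_kde = gaussian_kde_1d(d_members)
nonmember_kde = gaussian_kde_1d(d_nonmembers)

p_members = membership_probability(d_members, member_kde, nonmember_kde)
p_nonmembers = membership_probability(d_nonmembers, member_kde, nonmember_kde)

# Area under the ROC curve via pairwise comparison of scores (ties count half).
greater = (p_members[:, None] > p_nonmembers[None, :]).mean()
ties = (p_members[:, None] == p_nonmembers[None, :]).mean()
auc = greater + 0.5 * ties
print(f"mean P(member): members={p_members.mean():.3f}, "
      f"non-members={p_nonmembers.mean():.3f}, AUC={auc:.3f}")
```

In this toy setup the densities are fitted and scored on the same records, which flatters the result; a more careful evaluation would score records that were not used to fit the two KDEs.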
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Scientific Computing and Data Management
