Private Tabular Survey Data Products through Synthetic Microdata Generation
Jingchen Hu, Terrance D. Savitsky, Matthew R. Williams

TL;DR
This paper introduces two methods for generating synthetic microdata from survey data that carry an asymptotic global probabilistic differential privacy guarantee while preserving utility for public release and analysis.
Contribution
It presents two synthetic microdata approaches that provide an asymptotic probabilistic differential privacy guarantee and improve utility over traditional additive-noise methods such as the Laplace Mechanism.
Findings
The methods achieve an asymptotic global probabilistic differential privacy guarantee.
The synthetic microdata retain higher utility than output from the Laplace Mechanism.
The approach enables public release of microdata with preserved privacy and analytical value.
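For context on the additive-noise baseline named in the findings, the following is a minimal sketch of the Laplace Mechanism applied to tabular cell values. It is not the authors' implementation: the function names, the `sensitivity` value, and the privacy budget `epsilon` are illustrative assumptions, and the noise scale `sensitivity / epsilon` is the standard textbook choice.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) variate.

    The difference of two independent exponential draws with rate 1/scale
    follows a Laplace distribution centred at 0 with the given scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(cell_values, sensitivity, epsilon, rng=None):
    """Add Laplace noise to each cell value (hypothetical helper).

    sensitivity: assumed maximum change in a cell value from one record.
    epsilon: assumed privacy budget; smaller values mean more noise.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    return [value + laplace_noise(scale, rng) for value in cell_values]
```

Calling `laplace_mechanism([120.0, 45.0, 8.0], sensitivity=1.0, epsilon=0.5)` perturbs each cell with noise of expected absolute size 2. Small cells are distorted proportionally more, which is the kind of utility loss the synthetic microdata approaches aim to reduce.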
Abstract
We propose two synthetic microdata approaches to generate private tabular survey data products for public release. To produce tabular products for survey data, we adapt a pseudo posterior mechanism that downweights each record's likelihood contribution with a weight based on its identification disclosure risk. Our method applied to an observed survey database achieves an asymptotic global probabilistic differential privacy guarantee. Our two approaches synthesize the observed sample distribution of the outcome and survey weights jointly, such that both quantities together possess a privacy guarantee. The privacy-protected outcome and survey weights are used to construct tabular cell estimates (where the cell inclusion indicators are treated as known and public) and associated standard errors that correct for survey sampling bias. Through a real data application to the Survey…
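The tabulation step described in the abstract can be sketched as follows: given synthetic outcomes and synthetic survey weights, plus public cell indicators, form a weighted total and a weighted mean per cell. This is an illustration under assumptions, not the paper's estimator: the function name `weighted_cell_estimates` is hypothetical, the weighted total and ratio-type mean are standard survey estimators chosen for the example, and standard errors are omitted.

```python
from collections import defaultdict


def weighted_cell_estimates(outcomes, weights, cells):
    """Return per-cell weighted totals and means (hypothetical helper).

    outcomes: synthetic outcome values, one per record.
    weights: synthetic survey weights, one per record.
    cells: public cell labels, one per record.
    """
    if not (len(outcomes) == len(weights) == len(cells)):
        raise ValueError("outcomes, weights, and cells must have equal length")

    totals = defaultdict(float)       # sum of w_i * y_i within each cell
    weight_sums = defaultdict(float)  # sum of w_i within each cell
    for y, w, cell in zip(outcomes, weights, cells):
        totals[cell] += w * y
        weight_sums[cell] += w

    estimates = {}
    for cell, total in totals.items():
        w_sum = weight_sums[cell]
        # The mean is undefined when the weights in a cell sum to zero.
        mean = total / w_sum if w_sum != 0 else None
        estimates[cell] = {"total": total, "mean": mean}
    return estimates
```

Because the privacy guarantee is attached to the synthetic outcome and weights themselves, any table computed from them this way needs no further noise. That is the design difference from perturbing each published cell separately.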
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Healthcare Policy and Management · Ethics in Clinical Research
