Bayesian Pseudo Posterior Synthesis for Data Privacy Protection
Jingchen Hu, Terrance D. Savitsky

TL;DR
This paper introduces a Bayesian pseudo posterior approach with record-specific weights that improves data utility while protecting privacy in synthetic data generation, demonstrated through simulations and a real data application.
Contribution
It proposes a novel inverse risk-weighted pseudo posterior method that balances data utility and privacy better than existing scalar weighting techniques.
Findings
Enhanced utility preservation with record-specific weighting.
Effective privacy protection demonstrated in simulations.
Theoretical analysis of frequentist properties and uncertainty quantification.
Abstract
Statistical agencies utilize models to synthesize respondent-level data for release to the general public as an alternative to the actual data records. A Bayesian model synthesizer encodes privacy protection by employing a hierarchical prior construction that induces smoothing of the real data distribution. Synthetic respondent-level data records are often preferred to summary data tables due to the many possible uses by researchers and data analysts. Agencies balance a trade-off between utility of the synthetic data versus disclosure risks and hold a specific target threshold for disclosure risk before releasing synthetic datasets. We introduce a pseudo posterior likelihood that exponentiates each contribution by an observation record-indexed weight in (0, 1), defined to be inversely proportional to the disclosure risk for that record in the synthetic data. Our use of a vector of…
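As a rough illustration of the weighting idea in the abstract, the sketch below raises each record's likelihood contribution to a record-specific weight in (0, 1) before combining it with the prior. Everything specific here is an assumption made for illustration and is not taken from the paper: the Poisson-Gamma model (chosen because the weighted posterior has a closed form), the toy counts and risk values, the mapping `1 - risk` from risk to weight, and the helper names `risk_to_weights`, `pseudo_posterior_gamma`, and `synthesize`.

```python
import math
import random


def risk_to_weights(risks, floor=0.01, ceil=0.99):
    """Hypothetical mapping (not the paper's): higher risk gives a lower weight.

    Weights are clipped so they stay strictly inside (0, 1).
    """
    return [min(ceil, max(floor, 1.0 - r)) for r in risks]


def pseudo_posterior_gamma(y, weights, a=1.0, b=1.0):
    """Weighted Poisson likelihood with a Gamma(a, b) prior (shape, rate).

    Raising record i's Poisson likelihood to the power w_i yields the
    pseudo posterior Gamma(a + sum(w_i * y_i), b + sum(w_i)).
    """
    shape = a + sum(w * yi for w, yi in zip(weights, y))
    rate = b + sum(weights)
    return shape, rate


def draw_poisson(lam, rng):
    """Knuth's multiplication method; adequate for small lam."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        k += 1
        p *= rng.random()
        if p <= limit:
            return k - 1


def synthesize(shape, rate, n, rng):
    """Draw one rate from the pseudo posterior, then n synthetic counts."""
    lam = rng.gammavariate(shape, 1.0 / rate)  # gammavariate expects a scale
    return [draw_poisson(lam, rng) for _ in range(n)]


# Toy data (assumed): the last record is an outlier with high disclosure risk.
y = [2, 3, 1, 40]
risks = [0.1, 0.2, 0.1, 0.95]

weights = risk_to_weights(risks)
shape, rate = pseudo_posterior_gamma(y, weights)
unweighted_shape, unweighted_rate = pseudo_posterior_gamma(y, [1.0] * len(y))

rng = random.Random(0)
synthetic = synthesize(shape, rate, n=len(y), rng=rng)
print("weighted posterior mean:", shape / rate)
print("unweighted posterior mean:", unweighted_shape / unweighted_rate)
print("synthetic counts:", synthetic)
```

In this toy setup the high-risk outlier is heavily downweighted, so the pseudo posterior mean sits far closer to the low-risk records than the unweighted posterior mean does. Synthetic draws are therefore less likely to reproduce the outlying value, while the other records keep most of their influence.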
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Bayesian Methods and Mixture Models · Data-Driven Disease Surveillance
