Risk-Efficient Bayesian Data Synthesis for Privacy Protection

Jingchen Hu; Terrance D. Savitsky; Matthew R. Williams

arXiv:1908.07639·stat.ME·June 2, 2022

TL;DR

This paper introduces a risk-adjusted Bayesian data synthesis method that downweights each record's likelihood contribution according to its identification risk, using a weighted pseudo likelihood, to strengthen privacy protection in public data releases while preserving utility.

Contribution

It develops a novel risk-adjusted pseudo likelihood approach for Bayesian data synthesis that mitigates re-identification risks while maintaining data utility.

Findings

01

Risk-adjusted synthesizer improves privacy protection overall.

02

Pairwise risk-based weighting reduces re-identification risk more effectively than marginal risk-based weighting.

03

Method enhances data utility while controlling privacy risks.

Abstract

Statistical agencies utilize models to synthesize respondent-level data for release to the public for privacy protection. In this work, we efficiently induce privacy protection into any Bayesian synthesis model by employing a pseudo likelihood that exponentiates each likelihood contribution by an observation record-indexed weight in [0, 1], defined to be inversely proportional to the identification risk for that record. We start with the marginal probability of identification risk for a record, which is composed as the probability that the identity of the record may be disclosed. Our application to the Consumer Expenditure Surveys (CE) of the U.S. Bureau of Labor Statistics demonstrates that the marginally risk-adjusted synthesizer provides an overall improved privacy protection; however, the identification risks actually increase for some moderate-risk records after risk-adjusted…
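
The weighting idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a toy normal model with known standard deviation and a conjugate normal prior on the mean, and the mapping from risk to weight (`1 - risk`, clipped to [0, 1]) is a hypothetical stand-in for whatever risk-to-weight function the paper uses. All function names are illustrative.

```python
import math


def risk_to_weight(risk):
    """Map an identification risk in [0, 1] to a weight in [0, 1].

    Assumption: a simple linear map (1 - risk), clipped to [0, 1].
    The paper's actual weight construction may differ.
    """
    return min(1.0, max(0.0, 1.0 - risk))


def weighted_log_pseudo_likelihood(y, weights, mu, sigma):
    """Log pseudo likelihood: sum_i weight_i * log N(y_i | mu, sigma^2)."""
    total = 0.0
    for y_i, a_i in zip(y, weights):
        if not 0.0 <= a_i <= 1.0:
            raise ValueError("weights must lie in [0, 1]")
        log_density = (
            -0.5 * math.log(2.0 * math.pi * sigma ** 2)
            - (y_i - mu) ** 2 / (2.0 * sigma ** 2)
        )
        # Exponentiating a likelihood term by a_i multiplies its log by a_i.
        total += a_i * log_density
    return total


def pseudo_posterior_for_mean(y, weights, sigma, prior_mean, prior_sd):
    """Pseudo posterior of mu under a N(prior_mean, prior_sd^2) prior.

    With known sigma, the weighted pseudo likelihood keeps conjugacy:
    each record contributes weight_i / sigma^2 to the posterior precision.
    Returns (posterior_mean, posterior_sd).
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = sum(weights) / sigma ** 2
    weighted_sum = sum(a_i * y_i for y_i, a_i in zip(y, weights)) / sigma ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + weighted_sum) / post_precision
    return post_mean, post_precision ** -0.5


# Toy data: the third record is an outlier with high assumed risk.
y = [1.0, 2.0, 100.0]
risks = [0.0, 0.0, 1.0]
weights = [risk_to_weight(r) for r in risks]
post_mean, post_sd = pseudo_posterior_for_mean(
    y, weights, sigma=1.0, prior_mean=0.0, prior_sd=1e6
)
```

In this toy setup a record with weight 0 contributes nothing to the pseudo posterior, so synthetic values drawn from the fitted model are pulled toward the lower-risk records; a weight of 1 recovers that record's ordinary likelihood contribution.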

Code & Models

Repositories

monika76five/Risk_Efficient_Bayesian_Synthesis
