Inference With Combining Rules From Multiple Differentially Private Synthetic Datasets

Leila Nombo; Anne-Sophie Charest

arXiv:2405.04769·stat.ME·May 9, 2024

TL;DR

This paper investigates how to perform statistical inference with differentially private synthetic datasets by adapting combining rules from missing-data imputation, and evaluates the accuracy of the resulting inference across several scenarios.

Contribution

It extends existing inference procedures based on combining rules to the context of differentially private synthetic datasets, providing empirical evaluation of their effectiveness.

Findings

01 Combining rules can yield accurate inference in some contexts.

02 Performance varies depending on the data generation method and analysis scenario.

03 Empirical results highlight limitations and potential of the proposed approach.

Abstract

Differential privacy (DP) has been accepted as a rigorous criterion for measuring the privacy protection offered by random mechanisms used to obtain statistics or, as we will study here, synthetic datasets from confidential data. Methods to generate such datasets are increasingly numerous, using varied tools including Bayesian models, deep neural networks and copulas. However, little is known about how to properly perform statistical inference with these differentially private synthetic (DIPS) datasets. The challenge is for the analyses to take into account the variability from the synthetic data generation in addition to the usual sampling variability. A similar challenge also occurs when missing data is imputed before analysis, and statisticians have developed appropriate inference procedures for this case, which we extend to the case of synthetic datasets for privacy. In…
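
To make the idea of a combining rule concrete, here is a minimal Python sketch of two rules that are standard in the multiple-imputation and synthetic-data literature: Rubin's rule for multiply imputed data, and the variant commonly used for partially synthetic data. This excerpt does not state which rules the paper evaluates, so the choice of rules, the function name `combine_estimates`, and the `rule` parameter are illustrative assumptions, not the authors' procedure. Each of the m synthetic datasets is assumed to yield a point estimate and an estimate of its sampling variance.

```python
from statistics import mean, variance


def combine_estimates(estimates, variances, rule="rubin"):
    """Pool m point estimates and their within-dataset variance estimates.

    Hypothetical helper for illustration only.
    rule="rubin":   T = u_bar + (1 + 1/m) * b   (multiple imputation)
    rule="partial": T = u_bar + b / m           (partially synthetic data)
    """
    m = len(estimates)
    if m < 2 or len(variances) != m:
        raise ValueError("need m >= 2 estimates and one variance per estimate")

    q_bar = mean(estimates)   # pooled point estimate
    b = variance(estimates)   # between-dataset variance (divisor m - 1)
    u_bar = mean(variances)   # average within-dataset variance

    if rule == "rubin":
        total = u_bar + (1 + 1 / m) * b
    elif rule == "partial":
        total = u_bar + b / m
    else:
        raise ValueError("rule must be 'rubin' or 'partial'")
    return q_bar, total


# Made-up numbers standing in for estimates from m = 3 synthetic datasets.
q_hat = [1.0, 2.0, 3.0]
u_hat = [0.5, 0.5, 0.5]
q_bar, total_var = combine_estimates(q_hat, u_hat)
q_bar_p, total_var_p = combine_estimates(q_hat, u_hat, rule="partial")
```

The between-dataset term `b` is what captures the extra variability introduced by generating the synthetic data; the within-dataset term `u_bar` captures ordinary sampling variability. Whether rules of this kind remain accurate when the synthesizer is differentially private is the question the paper examines empirically.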

Taxonomy

Topics: Privacy-Preserving Technologies in Data