Statistical Inference for Privatized Data with Unknown Sample Size
Jordan Awan, Andres Felipe Barrientos, Nianqiao Ju

TL;DR
This paper develops theoretical and algorithmic tools for analyzing privatized data under unbounded differential privacy, focusing on asymptotic behavior, Bayesian inference, and practical algorithms for finite samples.
Contribution
It introduces new asymptotic results, a reversible jump MCMC algorithm, and a Monte Carlo EM method for inference on data privatized under unbounded differential privacy (DP).
Findings
Sampling distributions under unbounded DP approach those under bounded DP as sample size grows.
The proposed algorithms enable valid Bayesian and frequentist inference from privatized data.
Empirical applications demonstrate the effectiveness of the methods in linear regression and survey data.
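To make the unbounded-DP setting concrete, here is a minimal sketch of a data release in which the sample size itself is privatized. It is an illustration only, not the paper's reversible jump MCMC or Monte Carlo EM method. The function names, the clamping range [0, 1], the Laplace mechanism, and the budget split (eps_n, eps_sum) are all assumptions made for this example.

```python
import random

def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def privatize_count_and_sum(data, eps_n, eps_sum, rng):
    """Release a noisy sample size and a noisy sum (hypothetical helper).

    Values are clamped to [0, 1], so adding or removing one record
    changes the count by exactly 1 and the sum by at most 1.
    """
    clamped = [min(1.0, max(0.0, x)) for x in data]
    noisy_n = len(clamped) + laplace_noise(1.0 / eps_n, rng)
    noisy_sum = sum(clamped) + laplace_noise(1.0 / eps_sum, rng)
    return noisy_n, noisy_sum

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 10_000):
        data = [rng.random() for _ in range(n)]
        noisy_n, noisy_sum = privatize_count_and_sum(
            data, eps_n=1.0, eps_sum=1.0, rng=rng
        )
        # Naive plug-in estimate of the mean; it ignores the noise in n.
        print(n, round(noisy_n, 1), round(noisy_sum / noisy_n, 3))
```

With the noise scale held fixed, the perturbation of the sample size becomes negligible relative to n as n grows. This is the intuition behind the first finding above. For small n, a naive plug-in estimate like this one can be noticeably distorted.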
Abstract
We develop both theory and algorithms to analyze privatized data in unbounded differential privacy (DP), where even the sample size is considered a sensitive quantity that requires privacy protection. We show that the distance between the sampling distributions under unbounded DP and bounded DP goes to zero as the sample size goes to infinity, provided that the noise used to privatize the sample size is at an appropriate rate; we also establish that Approximate Bayesian Computation (ABC)-type posterior distributions converge under similar assumptions. We further give asymptotic results in the regime where the privacy budget for the sample size goes to infinity, establishing similarity of sampling distributions as well as showing that the MLE in the unbounded setting converges to the bounded-DP MLE. To facilitate valid, finite-sample Bayesian inference on privatized data under unbounded DP, we propose a…
