Resampling methods for private statistical inference

Karan Chadha; John Duchi; Rohith Kuditipudi

arXiv:2402.07131 · stat.ML · June 5, 2024 · 1 citation

Open Access

TL;DR

This paper introduces two differentially private variants of the non-parametric bootstrap for constructing confidence intervals. They achieve coverage accuracy similar to existing methods and non-private baselines while producing intervals roughly 10 or more times shorter than previous approaches.

Contribution

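It proposes two private bootstrap variants that privately compute the median of multiple "little" bootstraps run on partitions of the data, match the error rates of the non-private bootstrap to within logarithmic factors in the sample size, and are validated empirically on mean estimation, median estimation, and logistic regression.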

Findings

01

Achieve coverage accuracy similar to existing methods and non-private baselines.

02

Produce confidence intervals roughly 10 or more times shorter than previous approaches.

03

Come with asymptotic bounds on the coverage error of the resulting intervals.

Abstract

We consider the task of constructing confidence intervals with differential privacy. We propose two private variants of the non-parametric bootstrap, which privately compute the median of the results of multiple "little" bootstraps run on partitions of the data and give asymptotic bounds on the coverage error of the resulting confidence intervals. For a fixed differential privacy parameter $\epsilon$, our methods enjoy the same error rates as that of the non-private bootstrap to within logarithmic factors in the sample size $n$. We empirically validate the performance of our methods for mean estimation, median estimation, and logistic regression with both real and synthetic data. Our methods achieve similar coverage accuracy to existing methods (and non-private baselines) while providing notably shorter ($\gtrsim 10$ times) confidence intervals than previous approaches.
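
The abstract's recipe (partition the data, run a "little" bootstrap on each part, then take a private median of the per-part results) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say which private median mechanism is used, so the exponential mechanism over a clipped range is an assumption here, as are the choice of the mean as the statistic and all function names and parameters (`little_bootstrap_quantile`, `private_median`, `private_upper_limit`, `lo`, `hi`, `n_boot`).

```python
import numpy as np

def little_bootstrap_quantile(block, n, q, n_boot, rng):
    """q-th quantile of the bootstrap distribution of the mean,
    drawing n points with replacement from one block."""
    b = len(block)
    # Multinomial counts simulate an n-point resample of the b block points.
    counts = rng.multinomial(n, np.full(b, 1.0 / b), size=n_boot)
    means = counts @ block / n
    return np.quantile(means, q)

def private_median(values, lo, hi, epsilon, rng):
    """Exponential-mechanism median on the public range [lo, hi] (assumed
    mechanism). Changing one value shifts any rank by at most 1."""
    v = np.sort(np.clip(values, lo, hi))
    k = len(v)
    edges = np.concatenate(([lo], v, [hi]))  # k + 1 candidate intervals
    widths = np.diff(edges)
    # Points in interval i have exactly i values below them.
    utility = -np.abs(np.arange(k + 1) - k / 2.0)
    with np.errstate(divide="ignore"):
        log_w = np.log(widths) + 0.5 * epsilon * utility
    log_w -= log_w.max()
    probs = np.exp(log_w)
    probs /= probs.sum()
    i = rng.choice(k + 1, p=probs)
    return rng.uniform(edges[i], edges[i + 1])

def private_upper_limit(data, k, q, lo, hi, epsilon, n_boot=200, seed=0):
    """Private upper confidence limit for the mean: each record lands in
    exactly one of k blocks, so it affects only one per-block value."""
    rng = np.random.default_rng(seed)
    data = rng.permutation(np.asarray(data, dtype=float))
    n = len(data)
    blocks = np.array_split(data, k)
    vals = np.array(
        [little_bootstrap_quantile(blk, n, q, n_boot, rng) for blk in blocks]
    )
    return private_median(vals, lo, hi, epsilon, rng)
```

For instance, `private_upper_limit(data, k=20, q=0.975, lo=-5.0, hi=5.0, epsilon=1.0)` returns a single noisy upper limit inside `[lo, hi]`. The bounds `lo` and `hi` must be chosen without looking at the data for the privacy argument to hold.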

Taxonomy

Topics: Data-Driven Disease Surveillance · Statistical Methods and Bayesian Inference

Methods: Logistic Regression