TL;DR
This paper introduces the first differentially private algorithm for analysis of variance (ANOVA), together with a p-value computation that accounts for the privacy noise and experiments that measure the statistical power of the private test.
Contribution
It presents a novel differentially private ANOVA algorithm with a method for p-value calculation, filling a gap in privacy-preserving statistical testing.
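As a rough illustration of the idea, and not the paper's actual algorithm, the sketch below computes the classical one-way ANOVA F statistic from the between-group sum of squares (SSA) and within-group sum of squares (SSE), then perturbs both sums with Laplace noise. The function names, the even split of `epsilon` between the two sums, and the sensitivity arguments `sens_ssa` and `sens_sse` are assumptions made for this example. Valid sensitivity bounds must be derived for the data domain (for instance, values bounded in [0, 1]), and the total size N and group count k are treated as public here.

```python
import numpy as np

def anova_sums(groups):
    """Return (SSA, SSE, N, k) for a list of 1-D NumPy arrays, one per group."""
    values = np.concatenate(groups)
    grand_mean = values.mean()
    # Between-group variation: group means around the grand mean.
    ssa = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # Within-group variation: observations around their own group mean.
    sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return ssa, sse, len(values), len(groups)

def f_statistic(ssa, sse, n, k):
    """Classical F statistic: (SSA / (k - 1)) / (SSE / (N - k))."""
    return (ssa / (k - 1)) / (sse / (n - k))

def noisy_f_statistic(groups, epsilon, sens_ssa, sens_sse, rng):
    """Hypothetical private F statistic via the Laplace mechanism.

    Assumption: sens_ssa and sens_sse are valid L1 sensitivity bounds
    supplied by the caller; epsilon is split evenly between SSA and SSE.
    """
    ssa, sse, n, k = anova_sums(groups)
    noisy_ssa = ssa + rng.laplace(0.0, sens_ssa / (epsilon / 2))
    noisy_sse = sse + rng.laplace(0.0, sens_sse / (epsilon / 2))
    # The noisy SSE can be negative, so the result need not be a valid F value.
    return f_statistic(noisy_ssa, noisy_sse, n, k)
```

Because the noisy statistic no longer follows the usual F distribution, comparing it against standard F tables would give misleading p-values, which is why a noise-aware p-value computation is needed.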
Findings
With a sample of several thousand observations, the private ANOVA is frequently able to detect variation between groups.
P-values are computed in a way that accounts for the noise added to preserve privacy.
Experiments quantify the statistical power of the differentially private test.
Abstract
Modern society generates an incredible amount of data about individuals, and releasing summary statistics about this data in a manner that provably protects individual privacy would offer a valuable resource for researchers in many fields. We present the first algorithm for analysis of variance (ANOVA) that preserves differential privacy, allowing this important statistical test to be conducted (and the results released) on databases of sensitive information. In addition to our private algorithm for the F test statistic, we show a rigorous way to compute p-values that accounts for the added noise needed to preserve privacy. Finally, we present experimental results quantifying the statistical power of this differentially private version of the test, finding that a sample of several thousand observations is frequently enough to detect variation between groups. The differentially private…
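One common way to obtain a p-value that accounts for added noise is Monte Carlo simulation of the null distribution: generate data in which all groups share one distribution, apply the same noisy statistic, and count how often the simulated value reaches the observed one. The sketch below illustrates that general technique only and is not claimed to be the paper's procedure. The function names, the clipped-normal null model centered at 0.5, and the parameters `sigma`, `sens_ssa`, and `sens_sse` are assumptions; in a real analysis `sigma` would itself have to be public or privately estimated, and group sizes are treated as public.

```python
import numpy as np

def noisy_f(groups, epsilon, sens_ssa, sens_sse, rng):
    """Hypothetical noisy F statistic: Laplace noise on SSA and SSE."""
    values = np.concatenate(groups)
    n, k = len(values), len(groups)
    grand_mean = values.mean()
    ssa = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    sse = sum(((g - g.mean()) ** 2).sum() for g in groups)
    # Assumption: epsilon is split evenly, so each scale is sensitivity / (epsilon / 2).
    ssa += rng.laplace(0.0, 2 * sens_ssa / epsilon)
    sse += rng.laplace(0.0, 2 * sens_sse / epsilon)
    return (ssa / (k - 1)) / (sse / (n - k))

def simulated_p_value(observed_f, group_sizes, sigma, epsilon,
                      sens_ssa, sens_sse, rng, n_sim=1000):
    """Estimate P(noisy F >= observed_f) under a simulated null hypothesis."""
    count = 0
    for _ in range(n_sim):
        # Null model (an assumption): every group drawn from the same
        # normal distribution, clipped to the data range [0, 1].
        null_groups = [
            np.clip(rng.normal(0.5, sigma, size=m), 0.0, 1.0)
            for m in group_sizes
        ]
        if noisy_f(null_groups, epsilon, sens_ssa, sens_sse, rng) >= observed_f:
            count += 1
    # Add-one correction keeps the estimate strictly positive.
    return (count + 1) / (n_sim + 1)
```

The simulation adds noise to each null data set in the same way as to the real data, so the reference distribution reflects both sampling variability and privacy noise.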
