DP-PCA: Statistically Optimal and Differentially Private PCA
Xiyang Liu, Weihao Kong, Prateek Jain, Sewoong Oh

TL;DR
This paper introduces DP-PCA, a differentially private PCA algorithm that, for sub-Gaussian data, achieves nearly optimal statistical error rates while needing fewer samples than previous private methods.
Contribution
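The paper presents DP-PCA, a single-pass private PCA algorithm built on private minibatch gradient ascent. Its private mean estimation step adds only the noise that privacy requires, which lets the algorithm overcome two limitations of earlier methods: a sample requirement that grows super-linearly with the dimension, and an error that does not vanish even when the randomness in each data point is arbitrarily small.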
Findings
Achieves nearly optimal error rates for sub-Gaussian data.
Needs a number of samples that grows only nearly linearly with the dimension d, whereas earlier private algorithms need super-linearly many.
Establishes a lower bound showing that a sub-Gaussian-style assumption is necessary to obtain the optimal error rate.
Abstract
We study the canonical statistical task of computing the principal component from $n$ i.i.d. data in $d$ dimensions under $(\varepsilon, \delta)$-differential privacy. Although extensively studied in literature, existing solutions fall short on two key aspects: (i) even for Gaussian data, existing private algorithms require the number of samples $n$ to scale super-linearly with $d$, i.e., $n = \Omega(d^{3/2})$, to obtain non-trivial results while non-private PCA requires only $n = O(d)$, and (ii) existing techniques suffer from a non-vanishing error even when the randomness in each data point is arbitrarily small. We propose DP-PCA, which is a single-pass algorithm that overcomes both limitations. It is based on a private minibatch gradient ascent method that relies on private mean estimation, which adds minimal noise required to ensure privacy by adapting to the variance of a…
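The single-pass, noisy minibatch idea in the abstract can be illustrated with a rough sketch. This is not the paper's algorithm: it swaps the adaptive private mean estimation for plain fixed-threshold clipping plus Gaussian noise, so it does not adapt to the variance of the data. The function name `private_oja_sketch` and the parameters `batch_size`, `lr`, and `clip` are hypothetical names chosen for illustration, and the noise calibration shown should be read as illustrative rather than as a verified privacy proof.

```python
import numpy as np


def private_oja_sketch(X, batch_size, lr, clip, eps, delta, seed=0):
    """Single-pass noisy minibatch Oja iteration (illustrative sketch only).

    Assumption: fixed-threshold clipping stands in for the adaptive private
    mean estimation described in the abstract.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)

    # Classical Gaussian-mechanism noise scale (stated for eps in (0, 1)) for
    # the mean of `batch_size` vectors of norm <= clip; replacing one vector
    # changes the mean by at most 2 * clip / batch_size in L2 norm.
    sensitivity = 2.0 * clip / batch_size
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps

    # Disjoint minibatches: every sample is touched exactly once (single pass).
    # Any leftover samples that do not fill a batch are skipped.
    for start in range(0, n - batch_size + 1, batch_size):
        batch = X[start:start + batch_size]
        grads = batch * (batch @ w)[:, None]  # row i is x_i * (x_i . w)
        norms = np.linalg.norm(grads, axis=1, keepdims=True)
        grads = grads * np.minimum(1.0, clip / np.maximum(norms, 1e-12))
        noisy_mean = grads.mean(axis=0) + sigma * rng.standard_normal(d)
        w = w + lr * noisy_mean  # gradient ascent step
        w /= np.linalg.norm(w)   # project back to the unit sphere
    return w
```

Calling `private_oja_sketch(X, batch_size=1000, lr=1.0, clip=10.0, eps=0.5, delta=1e-5)` on an `n × d` array returns a unit vector estimating the top principal direction. With a fixed `clip`, the noise scale does not shrink when the per-sample gradients are tightly concentrated, which is the non-vanishing-error limitation the abstract attributes to existing techniques.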
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Statistical Methods and Bayesian Inference
Methods: Principal Components Analysis
