Robust and Differentially Private PCA for non-Gaussian data
Minwoo Kim, Sungkyu Jung

TL;DR
This paper introduces a differentially private PCA method that is robust to heavy-tailed and contaminated data, overcoming limitations of previous approaches by leveraging elliptical distribution properties and bounded transformations.
Contribution
The paper presents a novel differentially private PCA technique applicable to heavy-tailed and contaminated data, with theoretical guarantees and improved empirical performance.
Findings
Outperforms existing methods in non-Gaussian data scenarios
Provides robustness against data contamination
Achieves effective private subspace recovery
Abstract
Recent advances have sparked significant interest in the development of privacy-preserving Principal Component Analysis (PCA). However, many existing approaches either rely on restrictive assumptions, such as sub-Gaussian data, or are vulnerable to data contamination. Additionally, some methods are computationally expensive or depend on unknown model parameters that must be estimated, limiting their accessibility for data analysts seeking privacy-preserving PCA. In this paper, we propose a differentially private PCA method applicable to heavy-tailed and potentially contaminated data. Our approach leverages the property that the covariance matrix of properly rescaled data preserves eigenvectors and their order under elliptical distributions, which include Gaussian and heavy-tailed distributions. By applying a bounded transformation, we enable straightforward computation of principal…
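The idea in the abstract (rescale the data with a bounded transformation, then privatize the resulting covariance matrix) can be illustrated with a minimal sketch. The truncated abstract does not say which transformation or noise mechanism the paper uses, so everything specific here is an assumption: the sketch uses the spatial sign map x -> x / ||x|| as the bounded transformation, assumes the data are already centered, and adds symmetric Gaussian noise calibrated with the classical Gaussian mechanism (which presumes epsilon < 1). The function names `spatial_sign` and `private_sign_pca` are hypothetical and are not taken from the paper.

```python
import numpy as np

def spatial_sign(X, eps=1e-12):
    """Map each row x to x / ||x|| so every transformed row has norm at most 1."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    return X / np.maximum(norms, eps)

def private_sign_pca(X, k, epsilon, delta, rng=None):
    """Illustrative (epsilon, delta)-DP PCA on the spatial sign covariance.

    Assumptions (not from the paper): data are centered, the bounded
    transformation is the spatial sign, and noise follows the classical
    Gaussian mechanism, which requires epsilon < 1.
    """
    rng = np.random.default_rng(rng)
    n, d = X.shape

    # Bounded transformation: rows now lie on (or inside) the unit sphere.
    U = spatial_sign(X)
    S = U.T @ U / n

    # Replacing one record changes S by (u u^T - v v^T) / n, whose
    # Frobenius norm is at most sqrt(2) / n when ||u||, ||v|| <= 1.
    sensitivity = np.sqrt(2.0) / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

    # Symmetric noise: draw the upper triangle and mirror it.
    noise = rng.normal(0.0, sigma, size=(d, d))
    noise = np.triu(noise) + np.triu(noise, 1).T

    # Top-k eigenvectors of the noisy matrix estimate the principal subspace.
    eigvals, eigvecs = np.linalg.eigh(S + noise)
    order = np.argsort(eigvals)[::-1][:k]
    return eigvecs[:, order]
```

Because every transformed row has norm at most 1, the sensitivity bound holds no matter how heavy the tails of the raw data are, so no data-dependent clipping threshold has to be chosen in this sketch. A call such as `private_sign_pca(X, k=2, epsilon=0.5, delta=1e-5)` returns a `d x 2` matrix whose columns are orthonormal.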
