TL;DR
This paper provides new finite-sample guarantees for PCA that hold even when the noise is non-isotropic and data-dependent. They improve on the authors' earlier results and yield corollaries for PCA in sparse data-dependent noise and for PCA with missing data.
Contribution
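It introduces novel finite-sample bounds for PCA when the noise is non-isotropic and partly or wholly data-dependent, so that the noise and the true data are correlated. The bounds significantly improve on the authors' earlier work, where this "correlated-PCA" problem was first studied.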
Findings
Sample complexity is near-optimal in certain regimes for reaching a subspace recovery error that is a constant fraction of the noise level
Guarantees for PCA with sparse data-dependent noise
Applicability to PCA with missing data
Abstract
This work obtains novel finite sample guarantees for Principal Component Analysis (PCA). These hold even when the corrupting noise is non-isotropic, and a part (or all of it) is data-dependent. Because of the latter, in general, the noise and the true data are correlated. The results in this work are a significant improvement over those given in our earlier work where this "correlated-PCA" problem was first studied. In fact, in certain regimes, our results imply that the sample complexity required to achieve subspace recovery error that is a constant fraction of the noise level is near-optimal. Useful corollaries of our result include guarantees for PCA in sparse data-dependent noise and for PCA with missing data. An important application of the former is in proving correctness of the subspace update step of a popular online algorithm for dynamic robust PCA.
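To make the missing-data setting concrete, the sketch below simulates low-rank data, sets missing entries to zero, and runs plain SVD-based PCA on the result. Zero-filling means the observed matrix equals the true data plus a noise term that is sparse and equal to the negated true data on the missing entries, so the noise is data-dependent. The data model (uniform coefficients, independent Bernoulli missing entries), the function names, and all parameter values are illustrative assumptions and are not taken from the paper. The error measure is the standard sin-theta distance between subspaces.

```python
import numpy as np


def subspace_error(P_hat, P):
    """Sin-theta distance between the column spans of P_hat and P.

    Both inputs are assumed to have orthonormal columns.
    """
    n = P.shape[0]
    return np.linalg.norm((np.eye(n) - P_hat @ P_hat.T) @ P, 2)


def pca_missing_data_demo(n=50, r=3, num_samples=2000, miss_prob=0.05, seed=0):
    """Simulate PCA on zero-filled data (hypothetical setup, for illustration)."""
    rng = np.random.default_rng(seed)

    # True r-dimensional subspace with orthonormal basis P (assumed model).
    P, _ = np.linalg.qr(rng.standard_normal((n, r)))

    # True data: bounded random coefficients in the subspace.
    A = rng.uniform(-1.0, 1.0, size=(r, num_samples))
    L = P @ A

    # Each entry is missing independently with probability miss_prob.
    mask = rng.random((n, num_samples)) < miss_prob

    # Observed data with missing entries set to zero.
    Y = np.where(mask, 0.0, L)

    # Implied noise: -L on missing entries, 0 elsewhere (sparse, data-dependent).
    W = Y - L

    # PCA estimate: top-r left singular vectors of the observed data.
    U, _, _ = np.linalg.svd(Y, full_matrices=False)
    P_hat = U[:, :r]

    return subspace_error(P_hat, P), W, L, mask


if __name__ == "__main__":
    err, W, L, mask = pca_missing_data_demo()
    print("subspace recovery error:", err)
    print("fraction of nonzero noise entries:", np.mean(W != 0))
```

Varying `num_samples` and `miss_prob` in this sketch gives an informal feel for how the recovery error depends on the sample count and on the amount of data-dependent noise. It does not reproduce the paper's bounds.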
Taxonomy
Methods: Principal Components Analysis
