Streaming Heteroscedastic Probabilistic PCA with Missing Data
Kyle Gilman, David Hong, Jeffrey A. Fessler, Laura Balzano

TL;DR
This paper introduces SHASTA-PCA, a streaming algorithm that estimates low-dimensional subspaces from high-dimensional, heteroscedastic, and incomplete data, and that outperforms state-of-the-art streaming PCA methods in heteroscedastic settings.
Contribution
The paper develops a streaming PCA method that handles heteroscedastic noise and missing data via stochastic EM, with low computational and memory requirements.
Findings
SHASTA-PCA outperforms state-of-the-art streaming PCA in heteroscedastic settings.
The method effectively estimates subspaces from incomplete, high-dimensional streaming data.
Application to astronomy data demonstrates practical utility.
Abstract
Streaming principal component analysis (PCA) is an integral tool in large-scale machine learning for rapidly estimating low-dimensional subspaces from very high-dimensional data arriving at a high rate. However, modern datasets increasingly combine data from a variety of sources, and thus may exhibit heterogeneous quality across samples. Standard streaming PCA algorithms do not account for non-uniform noise, so their subspace estimates can quickly degrade. While the recently proposed Heteroscedastic Probabilistic PCA Technique (HePPCAT) addresses this heterogeneity, it was not designed to handle streaming data, which may exhibit non-stationary behavior. Moreover, HePPCAT does not allow for missing entries in the data, which can be common in streaming data. This paper proposes the Streaming HeteroscedASTic Algorithm for PCA (SHASTA-PCA) to bridge this divide. SHASTA-PCA employs a…
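To make the setting in the abstract concrete, the following is a minimal Python sketch of the kind of data it describes: low-rank samples whose noise variance depends on the source they came from, with some entries missing. It is not SHASTA-PCA or HePPCAT and does not come from the paper's code. The function name `simulate_heteroscedastic_stream`, all parameter names and default values, and the specific model (Gaussian factors, per-group isotropic Gaussian noise, entries observed independently at random) are assumptions chosen for illustration.

```python
import numpy as np


def simulate_heteroscedastic_stream(d=50, k=3, n=2000,
                                    noise_vars=(0.01, 4.0),
                                    group_probs=(0.2, 0.8),
                                    observe_prob=0.7, seed=0):
    """Draw n samples in R^d from an assumed rank-k model with group-dependent noise.

    Hypothetical illustration only: every name and default here is an assumption.
    """
    rng = np.random.default_rng(seed)

    # Assumed factor matrix spanning the k-dimensional subspace.
    F = rng.standard_normal((d, k))

    # Each sample belongs to one noise group (for example, one data source).
    groups = rng.choice(len(noise_vars), size=n, p=group_probs)

    # Latent coefficients and per-sample noise standard deviations.
    Z = rng.standard_normal((n, k))
    noise_std = np.sqrt(np.asarray(noise_vars))[groups]

    # Sample i is F @ z_i plus isotropic noise whose variance depends on its group.
    Y = Z @ F.T + noise_std[:, None] * rng.standard_normal((n, d))

    # Entries are observed independently with probability observe_prob;
    # unobserved entries are marked with NaN.
    mask = rng.random((n, d)) < observe_prob
    Y_obs = np.where(mask, Y, np.nan)
    return F, groups, mask, Y_obs


F, groups, mask, Y_obs = simulate_heteroscedastic_stream()

# Observed entries from the noisier group have a visibly larger spread.
for g in np.unique(groups):
    print(f"group {g}: variance of observed entries = {np.nanvar(Y_obs[groups == g]):.2f}")
```

In a streaming setting the rows of `Y_obs` would be processed one at a time rather than stored. A method that weights every row equally treats clean and noisy samples alike, which is the degradation the abstract attributes to standard streaming PCA.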
Taxonomy
Topics: Blind Source Separation Techniques · Sparse and Compressive Sensing Techniques · Functional Brain Connectivity Studies
Methods: Principal Components Analysis
