Lazy stochastic principal component analysis
Michael Wojnowicz, Dinh Nguyen, Li Li, and Xuan Zhao

TL;DR
Lazy SPCA simplifies stochastic PCA for large datasets: it lowers computational cost and is better suited to distributed computation, while matching standard SPCA in approximation quality and downstream predictive performance.
Contribution
We introduce Lazy SPCA, a computationally efficient variant of SPCA that preserves approximation quality and is well-suited for large-scale distributed computation.
Findings
In the largest experiment (4.6 million samples), Lazy SPCA cut computation time from 43.7 hours to 9.9 hours.
Downstream predictive performance was identical for Lazy SPCA and standard SPCA.
Both methods outperformed random projections across linear regression, logistic lasso, and random forests.
Abstract
Stochastic principal component analysis (SPCA) has become a popular dimensionality reduction strategy for large, high-dimensional datasets. We derive a simplified algorithm, called Lazy SPCA, which has reduced computational complexity and is better suited for large-scale distributed computation. We prove that SPCA and Lazy SPCA find the same approximations to the principal subspace, and that the pairwise distances between samples in the lower-dimensional space are invariant to whether SPCA is executed lazily or not. Empirical studies find downstream predictive performance to be identical for both methods, and superior to random projections, across a range of predictive models (linear regression, logistic lasso, and random forests). In our largest experiment with 4.6 million samples, Lazy SPCA reduced 43.7 hours of computation to 9.9 hours. Overall, Lazy SPCA relies exclusively on matrix…
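The abstract's distance-invariance claim can be illustrated with a small numerical sketch. The code below is not the authors' Lazy SPCA algorithm, whose steps are not given on this page. It is a generic randomized PCA (random sketch, QR, small SVD), followed by a check of a standard linear algebra fact: two orthonormal bases spanning the same subspace give projections with identical pairwise distances. The function name `randomized_pca_basis`, the `oversample` parameter, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np

def randomized_pca_basis(X, k, oversample=10, seed=0):
    """Approximate top-k principal directions of X (n samples x d features).

    Generic randomized PCA sketch; hypothetical helper, not Lazy SPCA itself.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                      # center the features
    omega = rng.standard_normal((Xc.shape[1], k + oversample))
    Q, _ = np.linalg.qr(Xc @ omega)              # orthonormal basis for the sketch
    B = Q.T @ Xc                                 # small (k + oversample) x d matrix
    _, _, Vt = np.linalg.svd(B, full_matrices=False)
    return Vt[:k].T                              # d x k, orthonormal columns

def pairwise_distances(Z):
    """Euclidean distance between every pair of rows of Z."""
    return np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)

# Synthetic data (an assumption, chosen only to make the example runnable).
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 50)) @ rng.standard_normal((50, 50))
Xc = X - X.mean(axis=0)

V = randomized_pca_basis(X, k=5)

# A second orthonormal basis for the same 5-dimensional subspace:
# rotate V by a random 5 x 5 orthogonal matrix R.
R, _ = np.linalg.qr(rng.standard_normal((5, 5)))
W = V @ R

Z1 = Xc @ V                                      # projection onto first basis
Z2 = Xc @ W                                      # projection onto rotated basis
max_gap = np.max(np.abs(pairwise_distances(Z1) - pairwise_distances(Z2)))
print(f"largest difference in pairwise distances: {max_gap:.2e}")
```

The coordinates in `Z1` and `Z2` differ, but the distances between samples agree up to floating-point error, which is why distance-based downstream models would be unaffected by the choice of basis within the same subspace.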
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Face and Expression Recognition · Sparse and Compressive Sensing Techniques
