Combinatorial Sparse PCA Beyond the Spiked Identity Model
Syamantak Kumar, Purnamrita Sarkar, Kevin Tian, Peiyuan Zhang

TL;DR
This paper introduces a new combinatorial algorithm for sparse PCA that works beyond the traditional spiked identity model, providing theoretical guarantees and practical evaluation on real data.
Contribution
The paper presents the first combinatorial sparse PCA method with provable guarantees for general covariance matrices, extending beyond the spiked identity model.
Findings
Counterexamples show limitations of existing combinatorial algorithms
New combinatorial method with global convergence guarantees
Method performs well on synthetic and real-world datasets
Abstract
Sparse PCA is one of the most well-studied problems in high-dimensional statistics. In this problem, we are given samples from a distribution with covariance $\Sigma$, whose top eigenvector $v$ is $s$-sparse. Existing sparse PCA algorithms can be broadly categorized into (1) combinatorial algorithms (e.g., diagonal or elementwise covariance thresholding) and (2) SDP-based algorithms. While combinatorial algorithms are much simpler, they are typically only analyzed under the spiked identity model (where $\Sigma = I_d + \gamma v v^\top$ for some $\gamma > 0$), whereas SDP-based algorithms require no additional assumptions on $\Sigma$. We demonstrate explicit counterexample covariances against the success of standard combinatorial algorithms for sparse PCA, when moving beyond the spiked identity model. In light of this discrepancy, we give the first combinatorial method…
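For readers unfamiliar with the combinatorial baselines the abstract mentions, the following is a minimal sketch of diagonal thresholding in the spiked identity setting. It is the standard textbook baseline, not the new method proposed in the paper. The function names, the sparsity symbol `s`, the spike strength `gamma`, and the assumption of zero-mean Gaussian samples are all illustrative choices rather than details taken from the paper.

```python
import numpy as np


def sample_spiked_identity(n, d, s, gamma, rng):
    """Draw n zero-mean samples with covariance I_d + gamma * v v^T (hypothetical helper)."""
    v = np.zeros(d)
    support = rng.choice(d, size=s, replace=False)
    v[support] = 1.0 / np.sqrt(s)  # unit-norm, s-sparse spike direction
    z = rng.standard_normal((n, d))   # isotropic noise, covariance I_d
    g = rng.standard_normal((n, 1))   # scalar latent factor per sample
    X = z + np.sqrt(gamma) * g * v    # covariance is I_d + gamma * v v^T
    return X, v


def diagonal_thresholding(X, s):
    """Estimate an s-sparse top eigenvector by keeping the s largest-variance coordinates."""
    n, d = X.shape
    S = X.T @ X / n                        # sample covariance (data assumed zero-mean)
    idx = np.argsort(np.diag(S))[-s:]      # s coordinates with largest diagonal entries
    sub = S[np.ix_(idx, idx)]              # restrict covariance to the selected coordinates
    _, U = np.linalg.eigh(sub)             # eigenvalues ascending, so last column is the top one
    v_hat = np.zeros(d)
    v_hat[idx] = U[:, -1]                  # embed the s-dimensional eigenvector back into R^d
    return v_hat


rng = np.random.default_rng(0)
X, v = sample_spiked_identity(n=2000, d=50, s=5, gamma=4.0, rng=rng)
v_hat = diagonal_thresholding(X, s=5)
print("alignment |<v_hat, v>|:", abs(v_hat @ v))
```

The selection step relies on the support coordinates having visibly larger variance than the rest, which holds when the covariance is the identity plus a spike. For a general covariance the diagonal entries need not reveal the support, which is the kind of failure the abstract's counterexamples target.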
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Machine Learning and Algorithms
