Do semidefinite relaxations solve sparse PCA up to the information limit?

Robert Krauthgamer; Boaz Nadler; Dan Vilenchik

arXiv:1306.3690·math.ST·June 4, 2015

Abstract

Estimating the leading principal components of data, assuming they are sparse, is a central task in modern high-dimensional statistics. Many algorithms were developed for this sparse PCA problem, from simple diagonal thresholding to sophisticated semidefinite programming (SDP) methods. A key theoretical question is under what conditions such algorithms can recover the sparse principal components. We study this question for a single-spike model with an $\ell_0$-sparse eigenvector, in the asymptotic regime as dimension $p$ and sample size $n$ both tend to infinity. Amini and Wainwright [Ann. Statist. 37 (2009) 2877-2921] proved that for sparsity levels $k \geq \Omega(n / \log p)$, no algorithm, efficient or not, can reliably recover the sparse eigenvector. In contrast, for $k \leq O(\sqrt{n / \log p})$, diagonal thresholding is consistent. It was further conjectured that an SDP approach may…
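The diagonal thresholding baseline mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is an illustration under stated assumptions, not the authors' code: the function names `sample_single_spike` and `diagonal_thresholding` are hypothetical, the sparsity $k$ is assumed known, the spike has equal-magnitude entries $\pm 1/\sqrt{k}$ on its support, and the parameter values are chosen to sit in an easy regime.

```python
import numpy as np


def sample_single_spike(n, p, k, beta, rng):
    """Draw n samples with covariance I + beta * z z^T, where z is k-sparse.

    Assumption for illustration: z has entries +-1/sqrt(k) on a random support.
    """
    z = np.zeros(p)
    support = rng.choice(p, size=k, replace=False)
    z[support] = rng.choice([-1.0, 1.0], size=k) / np.sqrt(k)
    u = rng.standard_normal(n)  # one latent factor per sample
    X = np.sqrt(beta) * np.outer(u, z) + rng.standard_normal((n, p))
    return X, z


def diagonal_thresholding(X, k):
    """Keep the k coordinates with largest sample variance, then run PCA on them."""
    n, p = X.shape
    S = X.T @ X / n  # sample covariance (the model has mean zero)
    top = np.sort(np.argsort(np.diag(S))[-k:])  # indices of the k largest variances
    sub = S[np.ix_(top, top)]  # k x k principal submatrix
    _, vecs = np.linalg.eigh(sub)  # eigenvalues returned in ascending order
    z_hat = np.zeros(p)
    z_hat[top] = vecs[:, -1]  # leading eigenvector, embedded back in R^p
    return top, z_hat


rng = np.random.default_rng(0)
X, z = sample_single_spike(n=2000, p=200, k=5, beta=4.0, rng=rng)
support_hat, z_hat = diagonal_thresholding(X, k=5)
print("true support:     ", np.flatnonzero(z))
print("estimated support:", support_hat)
print("|<z_hat, z>|:     ", abs(z_hat @ z))
```

Here the support coordinates have population variance $1 + \beta/k$ versus $1$ elsewhere, which is what the variance screening step exploits. Shrinking $n$ or growing $k$ narrows that margin relative to the sampling noise, and the selected support then starts to miss true coordinates.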

Taxonomy

Methods: Principal Components Analysis