Minimax sparse principal subspace estimation in high dimensions

Vincent Q. Vu; Jing Lei

arXiv:1211.0373·math.ST·January 6, 2014

TL;DR

This paper investigates minimax rates for sparse principal subspace estimation in high-dimensional settings, introducing new notions of sparsity and establishing optimal bounds with a novel proof technique.

Contribution

It introduces two notions of $\ell_q$ sparsity for subspaces (row sparsity and column sparsity), derives minimax bounds that are optimal for the former and nearly optimal for the latter, and develops a new variational $\sin\Theta$ theorem for spectral estimation.
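As an illustration of the kind of quantity a $\sin\Theta$ theorem controls, the sketch below computes a squared Frobenius $\sin\Theta$ distance between two $d$-dimensional subspaces using the standard identity $\|\sin\Theta\|_F^2 = d - \|V^\top W\|_F^2$ for orthonormal bases $V$ and $W$. The function name `sin_theta_sq` and the choice of this particular distance are assumptions made for illustration; the summary above does not state which loss the paper uses.

```python
import math

def sin_theta_sq(V, W):
    """Squared Frobenius norm of sin(Theta) between span(V) and span(W).

    V and W are p x d matrices given as lists of rows. Their columns are
    ASSUMED to be orthonormal; this is not checked here.
    """
    p, d = len(V), len(V[0])
    gram_sq = 0.0
    # Entries of the d x d matrix V^T W; its singular values are the
    # cosines of the principal angles, so ||V^T W||_F^2 = sum of cos^2.
    for j in range(d):
        for k in range(d):
            g = sum(V[i][j] * W[i][k] for i in range(p))
            gram_sq += g * g
    return d - gram_sq

# Two lines in the plane separated by an angle of 30 degrees.
angle = math.pi / 6
V = [[1.0], [0.0]]
W = [[math.cos(angle)], [math.sin(angle)]]
print(sin_theta_sq(V, W))  # approximately sin(30 degrees)^2 = 0.25
```

The distance depends only on the two subspaces, not on the particular orthonormal bases chosen, which is why it suits subspace (rather than eigenvector) estimation.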

Findings

01

Bounds are optimal for row sparse subspaces.

02

Bounds are nearly optimal for column sparse subspaces.

03

The form of the rates matches known results for sparse regression.

Abstract

We study sparse principal components analysis in high dimensions, where $p$ (the number of variables) can be much larger than $n$ (the number of observations), and analyze the problem of estimating the subspace spanned by the principal eigenvectors of the population covariance matrix. We introduce two complementary notions of $\ell_q$ subspace sparsity: row sparsity and column sparsity. We prove nonasymptotic lower and upper bounds on the minimax subspace estimation error for $0 \leq q \leq 1$. The bounds are optimal for row sparse subspaces and nearly optimal for column sparse subspaces, they apply to general classes of covariance matrices, and they show that $\ell_q$ constrained estimates can achieve optimal minimax rates without restrictive spiked covariance conditions. Interestingly, the form of the rates matches known results for sparse regression when the effective noise variance is…
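To make the distinction between row sparsity and column sparsity concrete, here is a minimal sketch of one natural way to measure each for a $p \times d$ orthonormal basis matrix $V$. It assumes that row sparsity sums the $q$-th powers of the row $\ell_2$ norms (counting nonzero rows when $q = 0$) and that column sparsity takes the largest $\ell_q$ measure over individual columns. These formulas and the function names `row_sparsity` and `column_sparsity` are assumptions for illustration and are not quoted from the abstract.

```python
import math

def row_sparsity(V, q):
    """Sum over rows of (row l2 norm)^q; for q == 0, count nonzero rows.

    ASSUMED definition for illustration, not quoted from the paper.
    """
    total = 0.0
    for row in V:
        norm = math.sqrt(sum(x * x for x in row))
        if q == 0:
            total += 1.0 if norm > 0 else 0.0
        else:
            total += norm ** q
    return total

def column_sparsity(V, q):
    """Largest per-column sum of |entry|^q; for q == 0, count nonzeros.

    ASSUMED definition for illustration, not quoted from the paper.
    """
    d = len(V[0])
    worst = 0.0
    for j in range(d):
        col = [row[j] for row in V]
        if q == 0:
            value = float(sum(1 for x in col if x != 0))
        else:
            value = sum(abs(x) ** q for x in col)
        worst = max(worst, value)
    return worst

# A 2-dimensional subspace of R^4 supported on the first two coordinates.
V = [[1.0, 0.0],
     [0.0, 1.0],
     [0.0, 0.0],
     [0.0, 0.0]]
print(row_sparsity(V, 0), column_sparsity(V, 0))
```

Under these assumed definitions, the row measure counts the variables that the subspace uses at all, while the column measure looks at each basis vector separately, so the column measure depends on the choice of basis and the row measure does not.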
