Optimal detection of sparse principal components in high dimension

Quentin Berthet; Philippe Rigollet

arXiv:1202.5070·math.ST·January 30, 2014

TL;DR

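This paper gives a finite-sample analysis of detecting sparse principal components in a high-dimensional covariance matrix. It proposes a minimax optimal test based on a sparse eigenvalue statistic, which is NP-complete to compute in general, and a computationally efficient convex relaxation that attains near-optimal detection levels. Together these point to a fundamental trade-off between statistical power and computational feasibility.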

Contribution

The paper introduces a computationally efficient convex relaxation that detects sparse principal components at near-optimal levels, and it uses polynomial-time reductions to give evidence of an inherent statistical-computational trade-off.

Findings

01

The proposed relaxation detects sparse PCs at near-optimal levels.

02

The relaxation performs well on simulated datasets.

03

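Polynomial-time reductions from theoretical computer science give evidence that the detection levels of the efficient test cannot be significantly improved, which suggests an inherent trade-off between statistical and computational performance.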

Abstract

We perform a finite-sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general, and we describe a computationally efficient alternative test using convex relaxations. Our relaxation is also proved to detect sparse principal components at near-optimal detection levels, and it performs well on simulated datasets. Moreover, using polynomial-time reductions from theoretical computer science, we bring significant evidence that our results cannot be improved, thus revealing an inherent trade-off between statistical and computational performance.
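To make the sparse eigenvalue statistic concrete, the following is a minimal sketch and not the authors' implementation. It assumes the statistic is the largest value of x' S x over unit vectors x with at most k nonzero entries, where S is the empirical covariance, and it computes this by brute force over all supports of size k. The exhaustive search is what makes the exact test impractical in high dimension. The function names, the spiked covariance used for the alternative, and all parameter values (n, p, k, theta) are illustrative assumptions.

```python
from itertools import combinations

import numpy as np


def sparse_eigenvalue(sigma, k):
    """Largest k-sparse eigenvalue of a symmetric matrix, by brute force.

    Maximizing x' sigma x over unit vectors with at most k nonzeros equals
    the largest eigenvalue over all k-by-k principal submatrices.
    The loop visits C(p, k) supports, so this only works for small p.
    """
    p = sigma.shape[0]
    best = -np.inf
    for support in combinations(range(p), k):
        idx = np.array(support)
        sub = sigma[np.ix_(idx, idx)]
        # eigvalsh returns eigenvalues in ascending order
        best = max(best, np.linalg.eigvalsh(sub)[-1])
    return float(best)


def empirical_covariance(X):
    """Empirical covariance of centered data with one observation per row."""
    return X.T @ X / X.shape[0]


# Illustrative parameters (assumptions, not taken from the paper).
rng = np.random.default_rng(0)
n, p, k, theta = 200, 12, 3, 2.0

# Null hypothesis: isotropic Gaussian data.
X0 = rng.standard_normal((n, p))

# Alternative (assumed spiked model): identity plus a k-sparse rank-one spike.
v = np.zeros(p)
v[:k] = 1.0 / np.sqrt(k)
cov1 = np.eye(p) + theta * np.outer(v, v)
X1 = rng.multivariate_normal(np.zeros(p), cov1, size=n)

stat_null = sparse_eigenvalue(empirical_covariance(X0), k)
stat_alt = sparse_eigenvalue(empirical_covariance(X1), k)
print(f"k-sparse eigenvalue under the null:        {stat_null:.3f}")
print(f"k-sparse eigenvalue under the alternative: {stat_alt:.3f}")
```

A test would reject the null when the statistic exceeds a threshold calibrated to the null distribution. The convex relaxation described in the abstract replaces the combinatorial search with a tractable optimization problem and is not shown here.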
