A randomized algorithm for principal component analysis

Vladimir Rokhlin, Arthur Szlam, and Mark Tygert

arXiv:0809.2274 · stat.CO · June 4, 2010 · SIAM J. Matrix Anal. Appl. · 21 cites

Open Access

TL;DR

This paper describes an efficient randomized algorithm for principal component analysis that computes low-rank matrix approximations with accuracy very close to the best possible in the spectral norm, for matrices of arbitrary sizes, and illustrates the theoretical results with several numerical examples.

Contribution

The paper presents a randomized algorithm for PCA whose accuracy is close to the best possible for matrices of arbitrary sizes. Earlier efficient algorithms came with guarantees of good accuracy only when one or both dimensions of the matrix were small.

Findings

01 Achieves near-optimal spectral-norm accuracy in low-rank approximations

02 Works efficiently for matrices of arbitrary sizes

03 Validated through numerical experiments

Abstract

Principal component analysis (PCA) requires the computation of a low-rank approximation to a matrix containing the data being analyzed. In many applications of PCA, the best possible accuracy of any rank-deficient approximation is at most a few digits (measured in the spectral norm, relative to the spectral norm of the matrix being approximated). In such circumstances, efficient algorithms have not come with guarantees of good accuracy, unless one or both dimensions of the matrix being approximated are small. We describe an efficient algorithm for the low-rank approximation of matrices that produces accuracy very close to the best possible, for matrices of arbitrary sizes. We illustrate our theoretical results via several numerical examples.
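
The abstract does not spell out the algorithm's steps, so the following is only a minimal sketch of one common way to build a randomized low-rank approximation: a Gaussian test matrix followed by a few power iterations. It is not taken from the paper and may differ from the authors' procedure. The function names `randomized_low_rank` and `relative_spectral_error`, and the parameters `oversample` and `n_power_iter`, are assumptions made for illustration. The error function matches the accuracy measure named in the abstract: the spectral norm of the residual, relative to the spectral norm of the matrix being approximated.

```python
import numpy as np


def randomized_low_rank(A, k, oversample=10, n_power_iter=2, seed=0):
    """Sketch of a rank-k approximation A ~ U @ diag(s) @ Vt.

    Illustrative only: a generic randomized range finder with power
    iterations, not necessarily the procedure described in the paper.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    l = min(k + oversample, m, n)  # sketch width (hypothetical default)

    # Sample the range of A with a Gaussian test matrix.
    G = rng.standard_normal((n, l))
    Q, _ = np.linalg.qr(A @ G)

    # Power iterations sharpen the separation between leading and
    # trailing singular values; QR at each step keeps this stable.
    for _ in range(n_power_iter):
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)

    # Project A onto the sampled subspace and take a small dense SVD.
    B = Q.T @ A
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :k], s[:k], Vt[:k, :]


def relative_spectral_error(A, U, s, Vt):
    """Spectral norm of the residual, relative to the spectral norm of A."""
    return np.linalg.norm(A - (U * s) @ Vt, 2) / np.linalg.norm(A, 2)
```

For any rank-k approximation, the relative spectral error is bounded below by the ratio of the (k+1)-th singular value of the matrix to its largest singular value, so comparing the value returned by `relative_spectral_error` against that ratio is one way to check how close a given run comes to the best possible accuracy.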

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Tensor decomposition and applications