Near-Optimal Stochastic Approximation for Online Principal Component Estimation

Chris Junchi Li; Mengdi Wang; Han Liu; Tong Zhang

arXiv:1603.05305·math.OC·October 9, 2017·Math. Program.

TL;DR

This paper gives a nearly optimal finite-sample error analysis of an online PCA algorithm by treating it as a stochastic approximation iteration. Under a subgaussian assumption, the resulting bound closely matches the minimax information lower bound.

Contribution

It introduces a novel stochastic approximation framework for online PCA and establishes the first nearly optimal finite-sample error bounds under subgaussian data assumptions.

Findings

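01 Under a subgaussian assumption, the finite-sample error bound closely matches the minimax information lower bound

02 Online PCA is cast as a stochastic nonconvex optimization problem and analyzed as a stochastic approximation iteration

03 First nearly optimal finite-sample error bound for the online PCA algorithm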
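As a rough sketch of what "online PCA as stochastic approximation" can look like, the display below writes the problem as variance maximization over the unit sphere, followed by a projected stochastic update of the kind usually called Oja's iteration. The symbols $u$, $\beta$, and $\Pi$, and the zero-mean assumption on $X$, are notation chosen here for illustration and may differ from the paper's own.

```latex
\begin{align*}
\max_{u \in \mathbb{R}^d} \quad & u^\top \mathbb{E}\!\left[X X^\top\right] u
  \quad \text{subject to} \quad \|u\|_2 = 1, \\
u^{(n)} &= \Pi\!\left( u^{(n-1)} + \beta\, X^{(n)} \big(X^{(n)}\big)^{\top} u^{(n-1)} \right),
  \qquad \Pi(v) = \frac{v}{\|v\|_2}.
\end{align*}
```

The constraint set is a sphere, which is what makes the problem nonconvex. Each update touches a single sample $X^{(n)}$, so only the current estimate $u^{(n)}$ needs to be stored.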

Abstract

Principal component analysis (PCA) has been a prominent tool for high-dimensional data analysis. Online algorithms that estimate the principal component by processing streaming data are of tremendous practical and theoretical interest. Despite their rich applications, the theoretical convergence analysis of these algorithms remains largely open. In this paper, we cast online PCA as a stochastic nonconvex optimization problem, and we analyze the online PCA algorithm as a stochastic approximation iteration. The stochastic approximation iteration processes data points incrementally and maintains a running estimate of the principal component. We prove for the first time a nearly optimal finite-sample error bound for the online PCA algorithm. Under the subgaussian assumption, we show that the finite-sample error bound closely matches the minimax information lower bound.
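To make the idea of an iteration that "processes data points incrementally and maintains a running estimate" concrete, here is a minimal pure-Python sketch of a projected stochastic update in the style of Oja's iteration. It is an illustration under stated assumptions, not the authors' code: the function names, the constant step size, the random initialization, and the synthetic Gaussian stream are all hypothetical choices made for this example.

```python
import math
import random


def normalize(v):
    """Project a nonzero vector onto the unit sphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def oja_step(u, x, step_size):
    """One projected update: u <- normalize(u + step_size * x * (x . u))."""
    xu = sum(xi * ui for xi, ui in zip(x, u))
    return normalize([ui + step_size * xu * xi for ui, xi in zip(u, x)])


def online_pca(stream, dim, step_size, rng):
    """Consume the stream one sample at a time, storing only the current estimate."""
    u = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])  # assumed random start
    for x in stream:
        u = oja_step(u, x, step_size)
    return u


def synthetic_stream(n, stds, rng):
    """Hypothetical test data: independent zero-mean Gaussian coordinates."""
    for _ in range(n):
        yield [rng.gauss(0.0, s) for s in stds]


rng = random.Random(0)
# Covariance is diag(4, 1, 0.25, 0.25, 0.25), so the true principal
# component is the first coordinate axis.
stds = [2.0, 1.0, 0.5, 0.5, 0.5]
u_hat = online_pca(synthetic_stream(20000, stds, rng),
                   dim=len(stds), step_size=0.005, rng=rng)

# |cosine| of the angle between the estimate and the true component.
alignment = abs(u_hat[0])
print(f"alignment with true component: {alignment:.3f}")
```

Memory use is a single vector of length `dim`, independent of the number of samples. The constant step size here is a simplification; how the step size should scale with the sample size and eigengap is the kind of question a finite-sample analysis addresses.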

Taxonomy

Methods: Principal Components Analysis