Beyond Sin-Squared Error: Linear-Time Entrywise Uncertainty Quantification for Streaming PCA
Syamantak Kumar, Shourya Pandey, Purnamrita Sarkar

TL;DR
This paper introduces a new statistical inference framework for streaming PCA using Oja's algorithm, enabling entrywise uncertainty quantification with sharp error bounds and efficient variance estimation methods.
Contribution
It provides the first sharp entrywise uncertainty quantification for streaming PCA, including a Bernstein-type concentration bound, a CLT, and a computationally efficient variance estimation algorithm.
Findings
Derived a sharp Bernstein-type concentration bound for eigenvector entries.
Established a Central Limit Theorem (CLT) for a suitably centered and scaled subset of the eigenvector entries.
Proposed a median-of-means-based estimator of the coordinate-wise variance, reported to be accurate empirically.
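As a rough illustration of the median-of-means idea named in the findings, the sketch below shows the generic textbook estimator: split the data into blocks, average each block, and take the median of the block averages. This is not the paper's variance estimation algorithm; the function name `median_of_means` and the parameter `num_blocks` are hypothetical names chosen for this example.

```python
import numpy as np

def median_of_means(values, num_blocks):
    """Generic median-of-means estimate of a mean (illustrative only).

    Assumption: this is the standard textbook construction, not the
    variance estimator proposed in the paper.
    """
    values = np.asarray(values, dtype=float)
    # Split the samples into roughly equal consecutive blocks.
    blocks = np.array_split(values, num_blocks)
    # Average within each block, then take the median across blocks.
    block_means = np.array([block.mean() for block in blocks])
    return float(np.median(block_means))

# A single large outlier barely moves the estimate.
samples = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]
estimate = median_of_means(samples, num_blocks=3)
```

The median step is what gives robustness: an outlier can corrupt only the block it falls in, so it shifts the plain average far more than it shifts the median of block averages.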
Abstract
We propose a novel statistical inference framework for streaming principal component analysis (PCA) using Oja's algorithm, enabling the construction of confidence intervals for individual entries of the estimated eigenvector. Most existing works on streaming PCA focus on providing sharp sin-squared error guarantees. Recently, there has been some interest in uncertainty quantification for the sin-squared error. However, uncertainty quantification or sharp error guarantees for entries of the estimated eigenvector in the streaming setting remain largely unexplored. We derive a sharp Bernstein-type concentration bound for elements of the estimated vector matching the optimal error rate up to logarithmic factors. We also establish a Central Limit Theorem for a suitably centered and scaled subset of the entries. To efficiently estimate the coordinate-wise variance, we introduce a provably…
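For readers unfamiliar with Oja's algorithm, which the abstract builds on, here is a minimal sketch of the classical update for the top eigenvector: one rank-one step per sample, followed by normalization, costing time linear in the dimension per sample. The constant step size, the synthetic diagonal-covariance data, and the function name `oja_top_eigenvector` are assumptions made for this illustration; they are not taken from the paper, and the sketch does not include the paper's inference procedure.

```python
import numpy as np

def oja_top_eigenvector(stream, dim, step_size, seed=0):
    """Classical Oja update for the leading eigenvector (illustrative).

    Assumptions: constant step size and per-step normalization; the
    paper's exact step-size schedule is not reproduced here.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(dim)
    w /= np.linalg.norm(w)
    for x in stream:
        # w <- w + eta * x * (x^T w), an O(dim) update per sample.
        w = w + step_size * x * (x @ w)
        w /= np.linalg.norm(w)
    return w

# Synthetic stream (assumed for the example): covariance diag(9, 1, 1, 1, 1),
# so the true top eigenvector is the first coordinate axis.
rng = np.random.default_rng(1)
d, n = 5, 20000
scales = np.array([3.0, 1.0, 1.0, 1.0, 1.0])
X = rng.standard_normal((n, d)) * scales

w_hat = oja_top_eigenvector(X, dim=d, step_size=0.001)
alignment = abs(w_hat[0])  # |<w_hat, e_1>|, close to 1 when the estimate is good
```

The sin-squared error mentioned in the abstract corresponds to `1 - alignment**2` in this setup, while entrywise guarantees concern each coordinate of `w_hat` separately.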
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Gaussian Processes and Bayesian Inference · Statistical Methods and Inference
Methods: Principal Components Analysis · Focus
