Communication-efficient Algorithms for Distributed Stochastic Principal Component Analysis
Dan Garber, Ohad Shamir, Nathan Srebro

TL;DR
This paper develops communication-efficient distributed algorithms for Principal Component Analysis that achieve near-centralized accuracy, introducing correction and iterative methods to improve efficiency and consistency across different data regimes.
Contribution
It introduces a simple correction step and an iterative algorithm for distributed PCA that improve communication efficiency and estimation accuracy compared to existing methods.
Findings
Correction step ensures consistency with centralized ERM for large n.
Iterative algorithm accelerates convergence over previous methods.
Algorithms perform well across various data regimes.
Abstract
We study the fundamental problem of Principal Component Analysis in a statistical distributed setting in which each machine out of m stores a sample of n points sampled i.i.d. from a single unknown distribution. We study algorithms for estimating the leading principal component of the population covariance matrix that are both communication-efficient and achieve estimation error of the order of the centralized ERM solution that uses all mn samples. On the negative side, we show that in contrast to results obtained for distributed estimation under convexity assumptions, for the PCA objective, simply averaging the local ERM solutions cannot guarantee error that is consistent with the centralized ERM. We show that this unfortunate phenomenon can be remedied by performing a simple correction step which correlates between the individual solutions, and provides an estimator that is…
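The contrast between plain averaging and a corrected average can be sketched in a few lines. The sketch below is an illustration, not the paper's algorithm: it assumes the correction amounts to flipping the sign of each local eigenvector so that it agrees with a reference vector (here, the first machine's) before averaging. The function names, the synthetic Gaussian data, and the parameters `m`, `n`, `d` in `demo` are all hypothetical.

```python
import numpy as np

def local_top_eigvec(X):
    """Leading eigenvector of the local empirical covariance X^T X / n."""
    cov = X.T @ X / X.shape[0]
    _, vecs = np.linalg.eigh(cov)  # eigh returns eigenvalues in ascending order
    return vecs[:, -1]

def naive_average(vs):
    """Average the local eigenvectors as-is, then normalize (if nonzero)."""
    avg = np.mean(vs, axis=0)
    norm = np.linalg.norm(avg)
    return avg / norm if norm > 0 else avg

def sign_corrected_average(vs):
    """Assumed correction: align every vector's sign with vs[0], then average."""
    ref = vs[0]
    aligned = [v if v @ ref >= 0 else -v for v in vs]
    avg = np.mean(aligned, axis=0)
    return avg / np.linalg.norm(avg)

def demo(m=50, n=200, d=10, seed=0):
    """Simulate m machines with n points each; the true leading direction is e_1."""
    rng = np.random.default_rng(seed)
    scales = np.ones(d)
    scales[0] = 2.0  # standard deviations: variance 4 along e_1, 1 elsewhere
    vs = []
    for _ in range(m):
        X = rng.standard_normal((n, d)) * scales
        v = local_top_eigvec(X)
        # An eigenvector is only defined up to sign; randomize it to model that.
        if rng.random() < 0.5:
            v = -v
        vs.append(v)
    e1 = np.zeros(d)
    e1[0] = 1.0
    # Return |<estimate, e_1>| for both estimators (1.0 means perfect alignment).
    return abs(naive_average(vs) @ e1), abs(sign_corrected_average(vs) @ e1)

if __name__ == "__main__":
    naive, corrected = demo()
    print(f"naive alignment:     {naive:.4f}")
    print(f"corrected alignment: {corrected:.4f}")
```

The two-vector case shows the failure mode most directly: averaging `v` and `-v` gives the zero vector, while the sign-aligned average recovers `v`. Only the m local vectors need to be communicated, which is what keeps a one-shot scheme of this kind communication-efficient.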
Taxonomy
Topics: Sparse and Compressive Sensing Techniques · Random Matrices and Applications · Distributed Sensor Networks and Detection Algorithms
