TL;DR
This paper introduces a communication-efficient distributed algorithm for eigenspace estimation, with PCA as the main application. It aligns the local solutions computed on each machine using minimal communication and matches the accuracy of a centralized estimator.
Contribution
It develops a novel alignment scheme for distributed eigenspace estimation that requires only one round of communication and resolves the non-uniqueness of local solutions.
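To make the idea of aligning local solutions concrete, here is a minimal NumPy sketch. It assumes the alignment is done by solving an orthogonal Procrustes problem against a reference solution (here, the first machine's basis), then averaging and re-orthonormalizing. This summary does not spell out the paper's exact procedure, so the function names, the choice of reference, and the synthetic covariance are all illustrative assumptions.

```python
import numpy as np

def local_eigenspace(X, r):
    """Top-r eigenvectors of the sample covariance of X (n_samples x d)."""
    cov = X.T @ X / X.shape[0]
    _, vecs = np.linalg.eigh(cov)      # eigenvalues come back in ascending order
    return vecs[:, -r:]                # columns span the leading subspace

def procrustes_align(V, V_ref):
    """Rotate V by the orthogonal Z that minimizes ||V @ Z - V_ref||_F."""
    U, _, Wt = np.linalg.svd(V.T @ V_ref)
    return V @ (U @ Wt)

def aligned_average(local_bases):
    """Align every local basis to the first one, average, re-orthonormalize."""
    V_ref = local_bases[0]             # assumption: machine 0 is the reference
    mean = sum(procrustes_align(V, V_ref) for V in local_bases) / len(local_bases)
    Q, _ = np.linalg.qr(mean)
    return Q

def subspace_distance(A, B):
    """Spectral norm of the difference between the two orthogonal projectors."""
    return np.linalg.norm(A @ A.T - B @ B.T, 2)

rng = np.random.default_rng(0)
d, r, m, n = 20, 3, 10, 500            # dimension, rank, machines, samples each

# Hypothetical population covariance with a clear gap after the r-th eigenvalue.
basis, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigvals = np.concatenate([np.full(r, 5.0), np.full(d - r, 1.0)])
sqrt_cov = basis * np.sqrt(eigvals)    # sqrt_cov @ sqrt_cov.T is the covariance
V_true = basis[:, :r]

machines = [rng.standard_normal((n, d)) @ sqrt_cov.T for _ in range(m)]
local_bases = [local_eigenspace(X, r) for X in machines]

V_dist = aligned_average(local_bases)                  # one round: send bases once
V_central = local_eigenspace(np.vstack(machines), r)   # pooled-data baseline

print("distributed error:", subspace_distance(V_dist, V_true))
print("centralized error:", subspace_distance(V_central, V_true))
```

In this sketch each machine sends its d x r basis to a coordinator exactly once, which is where the single communication round comes from. The two printed errors give a rough way to compare the aligned average with the pooled estimate on synthetic data.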
Findings
Achieves the centralized PCA error rate in the distributed setting
Requires only a single round of communication
Applies to spectral problems whose solutions are unique only up to rotation and reflection
Abstract
Distributed computing is a standard way to scale up machine learning and data science algorithms to process large amounts of data. In such settings, avoiding communication amongst machines is paramount for achieving high performance. Rather than distribute the computation of existing algorithms, a common practice for avoiding communication is to compute local solutions or parameter estimates on each machine and then combine the results; in many convex optimization problems, even simple averaging of local solutions can work well. However, these schemes do not work when the local solutions are not unique. Spectral methods are a collection of such problems, where solutions are orthonormal bases of the leading invariant subspace of an associated data matrix, which are only unique up to rotation and reflections. Here, we develop a communication-efficient distributed algorithm for computing…
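The non-uniqueness problem described in the abstract can be seen in the simplest rank-one case, where an eigenvector is determined only up to sign. The toy example below is an illustration with made-up data, not taken from the paper: two hypothetical machines return the same direction with opposite signs, so plain averaging cancels the signal, while fixing the sign against a reference before averaging recovers it.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 5
v = rng.standard_normal(d)
v /= np.linalg.norm(v)               # a unit-norm leading eigenvector

# Both v and -v are valid answers; suppose the two machines disagree on sign.
local = [v, -v]
naive = np.mean(local, axis=0)       # cancels to the zero vector

# Flip each solution to agree in sign with the first one, then average.
# This is the rank-one special case of aligning to a reference solution.
aligned = [u * np.sign(u @ local[0]) for u in local]
fixed = np.mean(aligned, axis=0)

naive_norm = np.linalg.norm(naive)
fixed_norm = np.linalg.norm(fixed)
print("norm of naive average:  ", naive_norm)
print("norm of aligned average:", fixed_norm)
```

For subspaces of dimension greater than one, the sign flip generalizes to an orthogonal transformation of each local basis.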
Taxonomy
Methods: Principal Components Analysis
