Analysis of PCA Algorithms in Distributed Environments
Tarek Elgamal, Mohamed Hefeeda

TL;DR
This paper analyzes various PCA algorithms in distributed environments, focusing on their scalability limitations, computational and communication complexities, and software implementations to guide better algorithm selection and development.
Contribution
It provides a comprehensive comparison of distributed PCA methods, highlighting their scalability bottlenecks and offering insights for selecting suitable algorithms and designing new scalable solutions.
Findings
Identifies key bottlenecks in distributed PCA scalability
Compares time and communication complexities of methods
Recommends software libraries for different scenarios
Abstract
Classical machine learning algorithms often face scalability bottlenecks when they are applied to large-scale data. Such algorithms were designed to work with small data that is assumed to fit in the memory of one machine. In this report, we analyze different methods for computing an important machine learning algorithm, namely Principal Component Analysis (PCA), and we comment on its limitations in supporting large datasets. The methods are analyzed and compared across two important metrics: time complexity and communication complexity. We consider the worst-case scenarios for both metrics, and we identify the software libraries that implement each method. The analysis in this report helps researchers and engineers in (i) understanding the main bottlenecks for scalability in different PCA algorithms, (ii) choosing the most appropriate method and software library for a given application…
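To make the two metrics concrete, here is a minimal sketch of one common way to compute PCA over partitioned data: each partition sends a small summary, and the summaries are combined centrally. It assumes a covariance-based approach and is not taken from the report, whose specific methods are not listed in this abstract. The function names `partition_stats` and `pca_from_partitions` are hypothetical, and the in-process loop stands in for a cluster reduce step.

```python
import numpy as np


def partition_stats(block):
    """Per-partition summary: row count, column sums, and Gram matrix X^T X."""
    return block.shape[0], block.sum(axis=0), block.T @ block


def pca_from_partitions(blocks, k):
    """Combine per-partition summaries and return the top-k eigenpairs."""
    d = blocks[0].shape[1]
    n, col_sum, gram = 0, np.zeros(d), np.zeros((d, d))
    # Assumption: on a real cluster this loop would be a reduce step,
    # with each worker sending O(d^2) numbers regardless of its row count.
    for block in blocks:
        n_i, s_i, g_i = partition_stats(block)
        n += n_i
        col_sum += s_i
        gram += g_i
    mean = col_sum / n
    # Sum of (x - mean)(x - mean)^T equals X^T X - n * mean mean^T.
    cov = (gram - n * np.outer(mean, mean)) / (n - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]   # indices of the k largest
    return eigvals[order], eigvecs[:, order]


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
blocks = np.array_split(X, 4)  # stand-in for 4 workers
variances, components = pca_from_partitions(blocks, k=2)
```

In this sketch the data rows never leave their partition, but the d-by-d covariance matrix must fit on one machine and its eigendecomposition costs on the order of d^3 operations, so the approach is likely to become a bottleneck when the number of columns is large.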
Taxonomy
Topics: Neural Networks and Applications · Face and Expression Recognition · Blind Source Separation Techniques
Methods: Principal Components Analysis
