A geometric approach to archetypal analysis and non-negative matrix factorization
Anil Damle, Yuekai Sun

TL;DR
This paper introduces a geometric method for archetypal analysis and non-negative matrix factorization that efficiently finds extreme points in high-dimensional data, suitable for large distributed datasets with minimal data passes.
Contribution
It presents a novel geometric approach to NMF and archetypal analysis, enabling efficient computation in high-dimensional, large-scale, distributed data environments.
Findings
Efficiently finds extreme points in high dimensions.
Requires only two passes over large datasets.
Applicable to distributed data storage systems.
Abstract
Archetypal analysis and non-negative matrix factorization (NMF) are staples in a statistician's toolbox for dimension reduction and exploratory data analysis. We describe a geometric approach to both NMF and archetypal analysis by interpreting both problems as finding extreme points of the data cloud. We also develop and analyze an efficient approach to finding extreme points in high dimensions. For modern massive datasets that are too large to fit on a single machine and must be stored in a distributed setting, our approach makes only a small number of passes over the data. In fact, it is possible to obtain the NMF or perform archetypal analysis with just two passes over the data.
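To make the "extreme points of the data cloud" idea concrete, here is a minimal sketch of one standard way to locate extreme points: project the data onto random directions and keep the points that maximize or minimize each projection. It relies only on the fact that a linear functional over a finite point set attains its maximum at a vertex of the convex hull. This is an illustration under stated assumptions, not the authors' exact algorithm; the function name `extreme_points_by_random_projection`, the number of directions, and the toy data are all hypothetical choices.

```python
import numpy as np

def extreme_points_by_random_projection(X, n_directions=50, seed=0):
    """Return row indices of X that are extreme along some random direction.

    Hypothetical helper for illustration: each column of G is a random
    Gaussian direction, and the rows of X that attain the largest or
    smallest projection along a direction are vertices of the convex hull
    (ties have probability zero for generic directions).
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    G = rng.standard_normal((d, n_directions))  # random directions
    P = X @ G                                   # a single pass over the rows of X
    candidates = np.concatenate([P.argmax(axis=0), P.argmin(axis=0)])
    return np.unique(candidates)

# Toy data: the three vertices of a triangle plus 200 strict convex combinations.
rng = np.random.default_rng(1)
vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
weights = rng.dirichlet(np.ones(3), size=200)   # rows are positive and sum to 1
interior = weights @ vertices
X = np.vstack([vertices, interior])             # rows 0, 1, 2 are the vertices

found = extreme_points_by_random_projection(X)
```

On this toy example the returned indices can only refer to the three vertex rows, since every other row lies strictly inside the triangle. In a distributed setting the projection `X @ G` can be computed block by block, with each machine reporting its local maximizers and minimizers, which is why this style of computation needs few passes over the data.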
Taxonomy
Topics: Neural Networks and Applications · Computational Physics and Python Applications · Graph Theory and Algorithms
