Large Enforced Sparse Non-Negative Matrix Factorization
Brendan Gavin, Vijay Gadepally, Jeremy Kepner

TL;DR
This paper introduces a modified NMF algorithm that enforces sparsity in intermediate matrices, enabling efficient large-scale topic modeling while maintaining or improving accuracy and convergence.
Contribution
The paper proposes a simple modification to NMF that enforces sparsity, improving scalability and performance on large datasets without sacrificing accuracy.
Findings
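Enforcing sparsity in the intermediate and output matrices reduces memory usage and computation time on large datasets.
In the paper's experiments, the modified NMF preserves or improves topic model accuracy.
Sparsity enforcement also preserves or improves the convergence rate of the underlying algorithm.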
Abstract
Non-negative matrix factorization (NMF) is a common method for generating topic models from text data. NMF is widely accepted for producing good results despite its relative simplicity of implementation and ease of computation. One challenge with applying NMF to large datasets is that intermediate matrix products often become dense, stressing the memory and compute elements of a system. In this article, we investigate a simple but powerful modification of a common NMF algorithm that enforces the generation of sparse intermediate and output matrices. This method enables the application of NMF to large datasets through improved memory and compute performance. Further, we demonstrate empirically that this method of enforcing sparsity in the NMF either preserves or improves both the accuracy of the resulting topic model and the convergence rate of the underlying algorithm.
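The idea in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the exact sparsity rule, so the code below makes two loudly stated assumptions: the base algorithm is a projected alternating least squares NMF, and sparsity is enforced by keeping only the largest entries of each factor after every update. The names `keep_top`, `sparse_nmf`, `t_w`, and `t_h` are hypothetical and are not taken from the paper. Dense NumPy arrays are used for readability; a large-scale version would presumably store the factors in a sparse format.

```python
import numpy as np


def keep_top(M, t):
    """Zero every entry of M except the t largest (assumed sparsity rule)."""
    if t <= 0:
        return np.zeros_like(M)
    if t >= M.size:
        return M
    flat = M.ravel()
    # argpartition places the indices of the t largest values in the last t slots
    idx = np.argpartition(flat, -t)[-t:]
    out = np.zeros_like(flat)
    out[idx] = flat[idx]
    return out.reshape(M.shape)


def sparse_nmf(A, k, t_w, t_h, iters=50, seed=0):
    """Approximate A (m x n) by W (m x k) @ H (k x n), both non-negative.

    After each least-squares update the factor is clipped at zero and then
    truncated to at most t_w (for W) or t_h (for H) non-zero entries.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))
    H = np.zeros((k, n))
    for _ in range(iters):
        # Fix W, solve min ||W H - A|| for H, project, then sparsify
        H = np.linalg.lstsq(W, A, rcond=None)[0]
        H = keep_top(np.maximum(H, 0.0), t_h)
        # Fix H, solve min ||H^T W^T - A^T|| for W^T, project, then sparsify
        Wt = np.linalg.lstsq(H.T, A.T, rcond=None)[0]
        W = keep_top(np.maximum(Wt.T, 0.0), t_w)
    return W, H


if __name__ == "__main__":
    # Toy stand-in for a term-document matrix: 20 terms, 15 documents
    rng = np.random.default_rng(1)
    A = rng.random((20, 15))
    W, H = sparse_nmf(A, k=3, t_w=30, t_h=25)
    print("non-zeros in W:", np.count_nonzero(W))
    print("non-zeros in H:", np.count_nonzero(H))
    print("relative error:", np.linalg.norm(A - W @ H) / np.linalg.norm(A))
```

Capping the number of non-zeros bounds the storage of both factors regardless of how dense the unconstrained least-squares solutions would be, which is the memory benefit the abstract describes. Whether accuracy and convergence hold up under this particular truncation rule is an empirical question that this sketch does not answer.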
