Learning a Latent Simplex in Input-Sparsity Time
Ainesh Bakshi, Chiranjib Bhattacharyya, Ravi Kannan, David P. Woodruff, and Samson Zhou

TL;DR
This paper introduces a faster algorithm for learning a latent simplex from a data matrix, removing the factor-of-k dependence from the running time so that the cost scales with the number of non-zero entries, which improves efficiency in applications such as clustering and topic modeling.
Contribution
It presents a novel input-sparsity time algorithm that approximates the top-k singular space, enabling efficient latent simplex learning without the previous linear dependence on k.
Findings
Achieves input-sparsity time for latent simplex learning
Provides a low-rank approximation with small angular distance to top-k singular space
Circumvents the need for multiple full matrix-vector products
Abstract
We consider the problem of learning a latent $k$-vertex simplex $K \subset \mathbb{R}^d$, given access to $A \in \mathbb{R}^{d \times n}$, which can be viewed as a data matrix with $n$ points that are obtained by randomly perturbing latent points in the simplex $K$ (potentially beyond $K$). A large class of latent variable models, such as adversarial clustering, mixed membership stochastic block models, and topic models can be cast as learning a latent simplex. Bhattacharyya and Kannan (SODA, 2020) give an algorithm for learning such a latent simplex in time roughly $O(k \cdot \mathrm{nnz}(A))$, where $\mathrm{nnz}(A)$ is the number of non-zeros in $A$. We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$, which holds in many of these applications. Further, we show this assumption is necessary, as…
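To make the setting concrete, the following is a minimal sketch and not the paper's algorithm. It simulates data from a latent simplex (hypothetical vertices `M`, convex weights drawn from a Dirichlet distribution, Gaussian perturbations), compresses the columns with a CountSketch, and measures the angular distance between the top-k left singular space of the sketch and that of the original matrix. All names, sizes, and the noise level are assumptions chosen for illustration. The matrix here is dense for simplicity; with sparse storage, applying a CountSketch takes time proportional to the number of non-zeros.

```python
import numpy as np

def simulate_latent_simplex(d, n, k, noise, rng):
    """Return (M, A): assumed simplex vertices and perturbed data points."""
    M = rng.normal(size=(d, k))                # columns are the k vertices
    W = rng.dirichlet(np.ones(k), size=n).T    # k x n, each column sums to 1
    P = M @ W                                  # latent points inside the simplex
    A = P + noise * rng.normal(size=(d, n))    # random perturbation of each point
    return M, A

def countsketch_columns(A, m, rng):
    """Hash each column of A into one of m buckets with a random sign."""
    d, n = A.shape
    buckets = rng.integers(0, m, size=n)
    signs = rng.choice([-1.0, 1.0], size=n)
    Bt = np.zeros((m, d))
    # Row buckets[i] of Bt accumulates signs[i] * A[:, i].
    np.add.at(Bt, buckets, (A * signs).T)
    return Bt.T                                # d x m sketch

def top_k_left_space(X, k):
    """Orthonormal basis for the top-k left singular space of X."""
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]

def max_principal_sine(U, V):
    """Sine of the largest principal angle between span(U) and span(V)."""
    cosines = np.linalg.svd(U.T @ V, compute_uv=False)
    smallest = min(float(cosines.min()), 1.0)
    return float(np.sqrt(max(0.0, 1.0 - smallest ** 2)))

rng = np.random.default_rng(0)
d, n, k = 20, 2000, 3
M, A = simulate_latent_simplex(d, n, k, noise=0.01, rng=rng)
B = countsketch_columns(A, m=200, rng=rng)
sine = max_principal_sine(top_k_left_space(A, k), top_k_left_space(B, k))
print(f"sketch shape: {B.shape}, max principal sine: {sine:.4f}")
```

In this toy setup the top k singular values carry almost all of the mass of the matrix, which is the kind of regime the abstract's assumption describes. When the noise is comparable to the signal, the sketched subspace can drift far from the true one.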
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms
