Learning a Latent Simplex in Input-Sparsity Time

Ainesh Bakshi, Chiranjib Bhattacharyya, Ravi Kannan, David P. Woodruff, and Samson Zhou

arXiv:2105.08005 · cs.LG · May 18, 2021 · 1 citation

TL;DR

This paper introduces a faster algorithm for learning a latent simplex from a data matrix A. It removes the factor of k from the earlier running time of roughly O(k · nnz(A)), so the cost scales with the number of non-zero entries of A, which improves efficiency in applications such as clustering and topic modeling.

Contribution

It presents a novel input-sparsity time algorithm that approximates the top-k singular space of the data matrix, so that a latent simplex can be learned without the linear dependence on k in the running time of prior work.
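To make "input-sparsity time" concrete, here is a minimal Python sketch of a CountSketch-style range finder, a standard sketching technique whose cost is proportional to nnz(A). It is not the paper's algorithm; the function name `countsketch_range`, the sketch size `m`, and the random test matrix are assumptions chosen for illustration.

```python
import numpy as np
import scipy.sparse as sp


def countsketch_range(A, m, seed=0):
    """Orthonormal basis for the span of A @ S^T, with S an m x n CountSketch.

    Each column of A is hashed to one of m buckets with a random sign, so
    forming A @ S^T touches every non-zero of A exactly once.
    """
    rng = np.random.default_rng(seed)
    d, n = A.shape
    buckets = rng.integers(0, m, size=n)      # bucket index for each column
    signs = rng.choice([-1.0, 1.0], size=n)   # random sign for each column
    St = sp.csr_matrix((signs, (np.arange(n), buckets)), shape=(n, m))
    Y = (A @ St).toarray()                    # d x m sketch of the column space
    Q, _ = np.linalg.qr(Y)                    # orthonormalize the sketch
    return Q


# Hypothetical sparse data matrix: d = 200 rows, n = 1000 points.
A = sp.random(200, 1000, density=0.01, format="csr", random_state=0)
Q = countsketch_range(A, m=20)
print(A.nnz, Q.shape)
```

The sketch only produces a candidate subspace; how large `m` must be, and how the subspace is refined to be close to the top-k singular space, are exactly the questions the paper's analysis addresses and are not reproduced here.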

Findings

01. Achieves input-sparsity time for latent simplex learning

02. Provides a low-rank approximation with small angular distance to the top-k singular space

03. Circumvents the need for multiple full matrix-vector products
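The "angular distance" between two subspaces in the second finding can be illustrated with the spectral-norm sin-theta distance. The sketch below is one common definition and is an assumption here, since this page does not state which distance the paper uses; the helper name `sin_theta_distance` and the toy subspaces are hypothetical.

```python
import numpy as np


def sin_theta_distance(U, V):
    """Sine of the largest principal angle between span(U) and span(V).

    U and V are assumed to have orthonormal columns and the same shape.
    """
    residual = V - U @ (U.T @ V)   # part of V lying outside span(U)
    return np.linalg.norm(residual, 2)


t = 0.1
U = np.eye(4)[:, :2]                           # span(e1, e2)
V_rot = np.array([[np.cos(t), 0.0],
                  [0.0,       1.0],
                  [np.sin(t), 0.0],
                  [0.0,       0.0]])           # e1 tilted by angle t toward e3
V_orth = np.eye(4)[:, 2:]                      # span(e3, e4)

print(sin_theta_distance(U, U))       # identical subspaces
print(sin_theta_distance(U, V_rot))   # equals sin(t)
print(sin_theta_distance(U, V_orth))  # orthogonal subspaces
```

A distance near 0 means the two subspaces nearly coincide, and a distance of 1 means some direction in one subspace is orthogonal to the other.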

Abstract

We consider the problem of learning a latent $k$-vertex simplex $K \subset \mathbb{R}^{d}$, given access to $A \in \mathbb{R}^{d \times n}$, which can be viewed as a data matrix with $n$ points that are obtained by randomly perturbing latent points in the simplex $K$ (potentially beyond $K$). A large class of latent variable models, such as adversarial clustering, mixed membership stochastic block models, and topic models, can be cast as learning a latent simplex. Bhattacharyya and Kannan (SODA, 2020) give an algorithm for learning such a latent simplex in time roughly $O(k \cdot \mathrm{nnz}(A))$, where $\mathrm{nnz}(A)$ is the number of non-zeros in $A$. We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$, which holds in many of these applications. Further, we show this assumption is necessary, as…
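The data model in the abstract can be simulated with a short Python sketch. Everything specific below is an assumption made for illustration: the vertex matrix `M`, the Dirichlet-distributed convex weights, the Gaussian perturbation, and the dimensions. The abstract as shown here does not spell out the exact assumption on the top $k$ singular values, so the energy ratio `top_k_mass` is only one illustrative way to look at that mass.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 50, 2000, 4

M = rng.random((d, k))                        # hypothetical simplex vertices (columns)
W = rng.dirichlet(np.ones(k), size=n).T       # k x n convex weights, one column per point
P = M @ W                                     # latent points inside the simplex K
A = P + 0.01 * rng.standard_normal((d, n))    # perturbed data, possibly outside K

nnz = np.count_nonzero(A)                     # nnz(A) from the abstract
s = np.linalg.svd(A, compute_uv=False)        # singular values, largest first
top_k_mass = np.sum(s[:k] ** 2) / np.sum(s ** 2)

print(nnz, round(float(top_k_mass), 4))
```

With small perturbations almost all of the squared singular-value mass sits in the top $k$ directions, because the latent points $P$ have rank at most $k$.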

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms