Sketching Transformed Matrices with Applications to Natural Language Processing

Yingyu Liang; Zhao Song; Mengdi Wang; Lin F. Yang; Xin Yang

arXiv:2002.09812 · cs.DS · February 25, 2020 · 1 citation

Open Access

TL;DR

This paper introduces a space-efficient sketching method for computing products involving large, transformed matrices, enabling scalable matrix decompositions in NLP applications like word embeddings.

Contribution

The paper presents a novel sketching algorithm for transformed matrices that is space-efficient, generalizable to various functions, and applicable to low-rank approximation tasks.

Findings

01 The sketching method achieves small approximation error.

02 It is efficient in both space and computational time.

03 Experimental results validate the theoretical guarantees.

Abstract

Suppose we are given a large matrix $A = (a_{i,j})$ that cannot be stored in memory but resides on disk or is presented as a data stream. However, we need to compute a matrix decomposition of the entrywise-transformed matrix $f(A) := (f(a_{i,j}))$ for some function $f$. Is it possible to do so in a space-efficient way? Many machine learning applications do need to deal with such large transformed matrices; for example, word embedding methods in NLP need to work with the pointwise mutual information (PMI) matrix, and the entrywise transformation makes it difficult to apply known linear algebraic tools. Existing approaches to this problem either store the whole matrix and perform the entrywise transformation afterwards, which is space-consuming or infeasible, or redesign the learning method, which is application-specific and requires substantial remodeling. In this…
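To make the difficulty described in the abstract concrete, the following minimal sketch (not the paper's algorithm) shows why a standard linear sketch does not directly handle an entrywise transform. It assumes a toy setting: the matrix arrives as two additive stream updates `A1` and `A2`, the sketch matrix `S` is a small Gaussian matrix, and `f` is `log(1 + x)`, chosen here only as an illustrative transform. All names and sizes are hypothetical.

```python
import math
import random


def matmul(X, Y):
    """Plain-Python product of X (p x q) and Y (q x r), both lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def add(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def apply_entrywise(f, M):
    """Return f(M) := (f(m_ij)), the entrywise-transformed matrix."""
    return [[f(v) for v in row] for row in M]


def frobenius_diff(X, Y):
    """Frobenius norm of X - Y."""
    return math.sqrt(sum((x - y) ** 2 for rx, ry in zip(X, Y) for x, y in zip(rx, ry)))


rng = random.Random(0)
n, d, k = 8, 6, 3  # toy sizes (assumed): A is n x d, the sketch keeps k rows

# Two stream updates with strictly positive counts; the full matrix is A = A1 + A2.
A1 = [[float(rng.randint(1, 10)) for _ in range(d)] for _ in range(n)]
A2 = [[float(rng.randint(1, 10)) for _ in range(d)] for _ in range(n)]
A = add(A1, A2)

# Gaussian sketch matrix S (k x n), a common choice of linear sketch.
S = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(n)] for _ in range(k)]
f = math.log1p  # illustrative transform f(x) = log(1 + x)

# Linearity: the sketch of A can be maintained update by update.
streamed_sketch = add(matmul(S, A1), matmul(S, A2))
err_linear = frobenius_diff(streamed_sketch, matmul(S, A))

# The same trick fails for f(A), because f(A1 + A2) != f(A1) + f(A2).
target = matmul(S, apply_entrywise(f, A))  # S f(A): requires all of A
naive = add(matmul(S, apply_entrywise(f, A1)), matmul(S, apply_entrywise(f, A2)))
err_transformed = frobenius_diff(naive, target)

print(f"sketch of A from updates, error:    {err_linear:.2e}")
print(f"sketch of f(A) from updates, error: {err_transformed:.2e}")
```

The first error is zero up to floating-point rounding, while the second is not, which is the gap the abstract points to: computing `S f(A)` this naive way requires the fully aggregated `A` before `f` can be applied.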

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms · Tensor decomposition and applications