A Generative Word Embedding Model and its Low Rank Positive Semidefinite Solution

Shaohua Li; Jun Zhu; Chunyan Miao

arXiv:1508.03826 · cs.CL · August 18, 2015

TL;DR

This paper introduces a generative word embedding model that is easy to interpret and can serve as a basis for more sophisticated latent factor models. On benchmark datasets it is reported to outperform other matrix factorization methods and to be competitive with word2vec.

Contribution

The paper proposes a generative word embedding model whose inference reduces to a low rank weighted positive semidefinite approximation problem, which makes the model interpretable and its optimization scalable.

Findings

01. Competitive with word2vec on benchmarks
02. Outperforms other matrix factorization methods
03. Scalable optimization via eigendecomposition

Abstract

Most existing word embedding methods can be categorized into Neural Embedding Models and Matrix Factorization (MF)-based methods. However, some models are opaque to probabilistic interpretation, and MF-based methods, typically solved using Singular Value Decomposition (SVD), may incur loss of corpus information. In addition, it is desirable to incorporate global latent factors, such as topics, sentiments or writing styles, into the word embedding model. Since generative models provide a principled way to incorporate latent factors, we propose a generative word embedding model, which is easy to interpret, and can serve as a basis of more sophisticated latent factor models. The model inference reduces to a low rank weighted positive semidefinite approximation problem. Its optimization is approached by eigendecomposition on a submatrix, followed by online blockwise regression, which is…
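
To make the phrase "low rank positive semidefinite approximation" concrete, the following is a minimal sketch of the unweighted case only: given a symmetric matrix G, keep its largest non-negative eigenvalues and form a factor V so that V V^T approximates G. It is not the paper's algorithm. The weighting, the submatrix selection, and the online blockwise regression mentioned in the abstract are not reproduced, and the function name `low_rank_psd_factor` and the toy matrix are assumptions made purely for illustration.

```python
import numpy as np


def low_rank_psd_factor(G, rank):
    """Return V (n x rank) such that V @ V.T is a rank-limited PSD
    approximation of the symmetric part of G (unweighted sketch)."""
    # Work on the symmetric part so that eigh applies.
    S = (G + G.T) / 2.0
    # eigh returns eigenvalues in ascending order for symmetric input.
    eigvals, eigvecs = np.linalg.eigh(S)
    # Indices of the `rank` largest eigenvalues.
    top = np.argsort(eigvals)[::-1][:rank]
    # Negative eigenvalues are clipped to zero to keep the result PSD.
    lam = np.clip(eigvals[top], 0.0, None)
    # Scale each eigenvector column by the square root of its eigenvalue.
    return eigvecs[:, top] * np.sqrt(lam)


# Toy example (hypothetical data): a 6 x 6 matrix that is exactly
# rank 2 and PSD, so a rank-2 factor should reproduce it.
rng = np.random.default_rng(0)
true_V = rng.normal(size=(6, 2))
G = true_V @ true_V.T

V = low_rank_psd_factor(G, rank=2)
reconstruction_error = np.linalg.norm(V @ V.T - G)
```

In an embedding setting, the rows of V would play the role of word vectors and G the role of a word-by-word association matrix. How the paper builds and weights that matrix is outside what this sketch covers.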

Code & Models

Repositories

askerlee/topicvec (official)

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text and Document Classification Technologies