Learning Word Representations with Hierarchical Sparse Coding

Dani Yogatama; Manaal Faruqui; Chris Dyer; Noah A. Smith

arXiv:1406.2035·cs.CL·November 7, 2014

TL;DR

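This paper introduces a hierarchical sparse coding method for learning word representations, motivated by the linguistic study of word meanings. An efficient stochastic proximal algorithm scales the method to corpora of billions of word tokens, and the resulting representations outperform or are competitive with state-of-the-art methods on word similarity ranking, analogies, sentence completion, and sentiment analysis.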

Contribution

The paper presents a novel hierarchical regularization approach for sparse coding in word representation learning, along with a fast stochastic proximal algorithm enabling large-scale training.

Findings

01 Outperforms or is competitive with state-of-the-art methods on the benchmark tasks

02 Efficient learning algorithm that scales to corpora of billions of word tokens

03 Provides publicly available word representations

Abstract

We propose a new method for learning word representations using hierarchical regularization in sparse coding inspired by the linguistic study of word meanings. We show an efficient learning algorithm based on stochastic proximal methods that is significantly faster than previous approaches, making it possible to perform hierarchical sparse coding on a corpus of billions of word tokens. Experiments on various benchmark tasks (word similarity ranking, analogies, sentence completion, and sentiment analysis) demonstrate that the method outperforms or is competitive with state-of-the-art methods. Our word representations are available at http://www.ark.cs.cmu.edu/dyogatam/wordvecs/.
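
The abstract names two ingredients, hierarchical regularization and proximal methods, without giving details. As a rough illustration only, the sketch below shows one common form of hierarchical regularization, a tree-structured group lasso penalty, and its proximal operator, which can be computed by applying group soft-thresholding to each group in order from the leaves to the root. Whether this matches the paper's exact regularizer is an assumption; the function names, the toy tree, and the value of `tau` are all hypothetical.

```python
import numpy as np

def group_soft_threshold(v, idx, tau):
    """Shrink the sub-vector v[idx] toward zero by tau in Euclidean norm (in place)."""
    norm = np.linalg.norm(v[idx])
    scale = max(0.0, 1.0 - tau / norm) if norm > 0 else 0.0
    v[idx] *= scale

def tree_prox(a, groups, tau):
    """Proximal operator of tau * sum_g ||a_g||_2 for tree-structured groups.

    Assumption: `groups` is ordered from leaves to root. Each group holds a
    node plus all of its descendants, and child groups precede their parents.
    """
    out = np.asarray(a, dtype=float).copy()
    for idx in groups:
        group_soft_threshold(out, idx, tau)
    return out

# Hypothetical 3-dimensional code: dimension 0 is the root, 1 and 2 are its children.
groups = [[1], [2], [0, 1, 2]]
a = np.array([3.0, 0.5, 4.0])
print(tree_prox(a, groups, tau=1.0))
```

In this toy example the weak child (dimension 1) is zeroed out while the root and the strong child survive in shrunken form. Because every group contains a node together with its descendants, a zeroed node forces its whole subtree to zero, which is what makes the sparsity pattern hierarchical. In a stochastic proximal scheme, a step like this would follow each gradient step on the reconstruction loss.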
