LightRNN: Memory and Computation-Efficient Recurrent Neural Networks

Xiang Li; Tao Qin; Jian Yang; Tie-Yan Liu

arXiv:1610.09893 · cs.CL · November 1, 2016 · 42 cites

TL;DR

LightRNN is a memory-efficient, fast-to-train recurrent neural network that uses a novel 2-Component (2C) shared embedding to maintain accuracy while significantly reducing model size and training time.

Contribution

The paper proposes a new RNN architecture with shared embeddings that drastically reduces model size and training time without sacrificing performance.

Findings

01. Achieves perplexity comparable to state-of-the-art models.

02. Reduces model size by a factor of 40-100.

03. Speeds up training by a factor of 2.

Abstract

Recurrent neural networks (RNNs) have achieved state-of-the-art performance in many natural language processing tasks, such as language modeling and machine translation. However, when the vocabulary is large, the RNN model becomes very large (e.g., possibly beyond the memory capacity of a GPU device) and its training becomes very inefficient. In this work, we propose a novel technique to tackle this challenge. The key idea is to use a 2-Component (2C) shared embedding for word representations. We allocate every word in the vocabulary into a table, each row of which is associated with a vector, and each column associated with another vector. Depending on its position in the table, a word is jointly represented by two components: a row vector and a column vector. Since the words in the same row share the row vector and the words in the same column share the column vector, we only…
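The table layout described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the square table, the row-major placement of word ids, and the names `table_side`, `word_position`, `make_vectors`, and `lookup` are all assumptions made for illustration, and the sketch counts embedding parameters only, not the size of a full model.

```python
import math
import random


def table_side(vocab_size):
    """Side length of the smallest square table that holds vocab_size words (assumed layout)."""
    return math.isqrt(vocab_size - 1) + 1


def word_position(word_id, side):
    """Map a word id to its (row, column) cell; row-major placement is an assumption."""
    return divmod(word_id, side)


def make_vectors(count, dim, rng):
    """Create `count` small random vectors of length `dim` as plain lists."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(count)]


def lookup(word_id, side, row_vectors, col_vectors):
    """Return the two shared components (row vector, column vector) for a word."""
    row, col = word_position(word_id, side)
    return row_vectors[row], col_vectors[col]


vocab_size = 10_000
dim = 8
side = table_side(vocab_size)

rng = random.Random(0)
row_vectors = make_vectors(side, dim, rng)  # one vector per table row
col_vectors = make_vectors(side, dim, rng)  # one vector per table column

# Embedding parameters: one vector per row plus one per column,
# versus one vector per word in a conventional embedding table.
shared_params = (len(row_vectors) + len(col_vectors)) * dim
standard_params = vocab_size * dim

print(side, shared_params, standard_params)
```

In this layout, words 200 and 205 fall in the same row and therefore reuse the same row vector, while words 5 and 205 fall in the same column and reuse the same column vector. The number of stored vectors grows with the number of rows plus columns rather than with the number of words.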

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications