Learning Compact Recurrent Neural Networks

Zhiyun Lu; Vikas Sindhwani; Tara N. Sainath

arXiv:1604.02594 · cs.LG · April 12, 2016 · 21 cites

TL;DR

This paper studies how to compress recurrent neural networks, including LSTMs, using low-rank factorizations and parameter sharing. Its most effective configuration cuts the parameters of a standard LSTM by 75% at a cost of a 0.3% increase in word error rate (WER).

Contribution

The paper identifies a hybrid compression strategy, with structured matrices in the bottom layers and shared low-rank factors in the top layers, that reduces RNN and LSTM model size.

Findings

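01 The hybrid strategy reduces the parameters of a standard LSTM by 75% at a cost of a 0.3% increase in WER on a 2,000-hr English Voice Search task.

02 Among the schemes studied, combining structured matrices in the bottom layers with shared low-rank factors in the top layers is found to be particularly effective.

03 Compression comes at a small accuracy cost rather than none: the compact model stays within 0.3% WER of the standard LSTM baseline.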

Abstract

Recurrent neural networks (RNNs), including long short-term memory (LSTM) RNNs, have produced state-of-the-art results on a variety of speech recognition tasks. However, these models are often too large in size for deployment on mobile devices with memory and latency constraints. In this work, we study mechanisms for learning compact RNNs and LSTMs via low-rank factorizations and parameter sharing schemes. Our goal is to investigate redundancies in recurrent architectures where compression can be admitted without losing performance. A hybrid strategy of using structured matrices in the bottom layers and shared low-rank factors on the top layers is found to be particularly effective, reducing the parameters of a standard LSTM by 75%, at a small cost of 0.3% increase in WER, on a 2,000-hr English Voice Search task.
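To make the parameter savings from low-rank factorization and factor sharing concrete, the following is a minimal Python sketch of the arithmetic. It assumes a weight matrix W of shape (n_out, n_in) is replaced by a product Z P with P of shape (rank, n_in), and that two matrices reading the same hidden state share one P. The layer width, the rank, and the function names are hypothetical choices for illustration, not the configuration reported in the paper.

```python
# Illustrative parameter counting only. All sizes are hypothetical and are
# not the configuration used in the paper.

def dense_params(n_out: int, n_in: int) -> int:
    """Parameters in an unfactorized n_out x n_in weight matrix."""
    return n_out * n_in


def low_rank_params(n_out: int, n_in: int, rank: int) -> int:
    """Parameters when W is replaced by Z @ P, with Z of shape
    (n_out, rank) and P of shape (rank, n_in)."""
    return rank * (n_out + n_in)


def shared_low_rank_params(n_outs: list, n_in: int, rank: int) -> int:
    """Parameters when several matrices W_i = Z_i @ P share one
    projection P of shape (rank, n_in)."""
    return rank * n_in + rank * sum(n_outs)


n_hidden = 512             # hypothetical hidden size
rank = 64                  # hypothetical factorization rank
gate_rows = 4 * n_hidden   # an LSTM stacks four gate blocks per weight matrix

# Assumption: two matrices consume the same hidden state, e.g. a layer's
# recurrent weights and the next layer's input weights.
dense_total = 2 * dense_params(gate_rows, n_hidden)
separate_total = 2 * low_rank_params(gate_rows, n_hidden, rank)
shared_total = shared_low_rank_params([gate_rows, gate_rows], n_hidden, rank)

for name, count in [
    ("dense", dense_total),
    ("low-rank, separate P", separate_total),
    ("low-rank, shared P", shared_total),
]:
    saved = 1 - count / dense_total
    print(f"{name:>22}: {count:>9,d} params ({saved:.1%} fewer than dense)")
```

Sharing P removes one copy of the projection and also lets the projected hidden state be computed once and reused by both matrices. The rank is the knob that trades model size against accuracy, so in practice it would be tuned against WER.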

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing

Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory