On the Compression of Recurrent Neural Networks with an Application to LVCSR acoustic modeling for Embedded Speech Recognition

Rohit Prabhavalkar; Ouais Alsharif; Antoine Bruguier; Ian McGraw

arXiv:1603.08042 · cs.CL · May 3, 2016 · 43 cites

Open Access

TL;DR

This paper presents a method for compressing recurrent neural networks, specifically LSTM acoustic models, to enable efficient speech recognition on mobile devices without significant accuracy loss.

Contribution

It introduces a technique that jointly compresses the recurrent and non-recurrent inter-layer weight matrices, reducing the LSTM acoustic model to a third of its original size with negligible loss in accuracy.

Findings

01. LSTM model size reduced to one-third of original

02. Negligible accuracy loss after compression

03. Applicable to embedded speech recognition systems

Abstract

We study the problem of compressing recurrent neural networks (RNNs). In particular, we focus on the compression of RNN acoustic models, which are motivated by the goal of building compact and accurate speech recognition systems which can be run efficiently on mobile devices. In this work, we present a technique for general recurrent model compression that jointly compresses both recurrent and non-recurrent inter-layer weight matrices. We find that the proposed technique allows us to reduce the size of our Long Short-Term Memory (LSTM) acoustic model to a third of its original size with negligible loss in accuracy.
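To make the size arithmetic concrete, here is a minimal sketch of how sharing one low-rank factor between a recurrent matrix and the next layer's inter-layer matrix could shrink the parameter count. It assumes the joint compression takes the form of a low-rank factorization with a shared projection, which this page does not spell out. The function name, the matrix names (W_h, W_x, Z_h, Z_x, P), and the dimensions are all hypothetical, chosen only for illustration, and the numbers are not taken from the paper.

```python
def joint_factor_param_counts(n_hidden, n_next, rank):
    """Count parameters before and after a shared low-rank factorization.

    Hypothetical setup (an assumption, not confirmed by the abstract):
      W_h: recurrent matrix,    shape (n_hidden, n_hidden)
      W_x: inter-layer matrix,  shape (n_next,   n_hidden)
    are replaced by
      Z_h: shape (n_hidden, rank)
      Z_x: shape (n_next,   rank)
      P:   shape (rank, n_hidden), shared by both factorizations
    so that W_h is approximated by Z_h @ P and W_x by Z_x @ P.
    """
    if not 0 < rank <= n_hidden:
        raise ValueError("rank must satisfy 0 < rank <= n_hidden")

    # Full matrices: every entry is a free parameter.
    original = n_hidden * n_hidden + n_next * n_hidden

    # Factored form: two thin matrices plus one shared projection.
    compressed = n_hidden * rank + n_next * rank + rank * n_hidden
    return original, compressed


if __name__ == "__main__":
    # Illustrative dimensions only.
    original, compressed = joint_factor_param_counts(500, 500, 100)
    print(original, compressed, compressed / original)
```

Because P is stored once and used by both factorizations, the saving is larger than factoring each matrix independently at the same rank. Under these assumptions, the rank is the knob that trades model size against approximation error.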

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing