Dynamically Hierarchy Revolution: DirNet for Compressing Recurrent Neural Network on Mobile Devices

Jie Zhang; Xiaolong Wang; Dawei Li; Yalin Wang

arXiv:1806.01248·cs.LG·June 12, 2018·5 cites

TL;DR

This paper introduces DirNet, a novel dynamic compression method for RNNs that significantly reduces model size and enables real-time inference on mobile devices with minimal accuracy loss.

Contribution

DirNet employs an optimized dictionary learning algorithm to dynamically adjust compression and sparsity across layers, outperforming prior methods in RNN model compression.

Findings

01. Achieves up to 8x model size reduction on mobile devices

02. Maintains real-time inference with negligible accuracy loss

03. Outperforms previous compression approaches in experiments

Abstract

Recurrent neural networks (RNNs) achieve cutting-edge performance on a variety of problems. However, because of their high computational and memory demands, deploying RNNs on resource-constrained mobile devices is challenging. To keep accuracy loss minimal at a higher compression rate, and driven by mobile resource requirements, we introduce DirNet, a novel model compression approach based on an optimized fast dictionary learning algorithm, which 1) dynamically mines the dictionary atoms of the projection dictionary matrix within each layer to adjust the compression rate and 2) adaptively changes the sparsity of the sparse codes across the hierarchical layers. Experimental results on a language model and an ASR model trained with a 1000h speech dataset demonstrate that our method significantly outperforms prior approaches. Evaluated on off-the-shelf mobile devices, we are able to reduce the…
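The general idea of dictionary-based weight compression mentioned in the abstract can be illustrated with a small sketch. This is not the DirNet algorithm: the dictionary here comes from a truncated SVD rather than a learned, dynamically adjusted dictionary, and the function names (`compress_weight`, `stored_values`), the matrix sizes, and the parameters `n_atoms` and `nnz_per_col` are assumptions chosen only for illustration. The sketch shows the two knobs the abstract refers to: the number of dictionary atoms and the sparsity of the codes.

```python
import numpy as np


def compress_weight(W, n_atoms, nnz_per_col):
    """Approximate W (m x n) as D @ C, with D (m x n_atoms) and column-sparse C.

    Illustrative stand-in only: D is taken from a truncated SVD,
    not from DirNet's dictionary learning procedure.
    """
    if not 1 <= nnz_per_col <= n_atoms:
        raise ValueError("nnz_per_col must be between 1 and n_atoms")
    # Dictionary atoms: the leading left singular vectors of W.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    D = U[:, :n_atoms]
    # D has orthonormal columns, so D.T @ W is the least-squares code matrix.
    C = D.T @ W
    # Sparsify: zero all but the nnz_per_col largest-magnitude entries per column.
    if nnz_per_col < n_atoms:
        drop = np.argsort(np.abs(C), axis=0)[:-nnz_per_col, :]
        np.put_along_axis(C, drop, 0.0, axis=0)
    return D, C


def stored_values(D, C):
    """Count stored numbers, ignoring the index overhead of a sparse format."""
    return D.size + int(np.count_nonzero(C))


# Hypothetical layer shape and settings, chosen only for the demo.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 1024))
D, C = compress_weight(W, n_atoms=64, nnz_per_col=8)

ratio = W.size / stored_values(D, C)
rel_err = np.linalg.norm(W - D @ C) / np.linalg.norm(W)
print(f"compression ratio: {ratio:.2f}, relative error: {rel_err:.3f}")
```

Using fewer atoms or fewer nonzeros per column raises the compression ratio at the cost of a larger reconstruction error. A random Gaussian matrix, as used here, is a poor candidate for compression, so the error it reports says nothing about trained RNN weights.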

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Topic Modeling