Pushing the limits of RNN Compression

Urmish Thakker; Igor Fedorov; Jesse Beu; Dibakar Gope; Chu Zhou; Ganesh Dasika; Matthew Mattina

arXiv:1910.02558·cs.LG·October 10, 2019


TL;DR

This paper presents an RNN compression method based on Kronecker products that compresses RNN layers by 16-38x with minimal accuracy loss, achieving better task accuracy than pruning and low-rank matrix factorization on 4 benchmarks.

Contribution

Introduces Kronecker product (KP) based compression for RNNs, which compresses RNN layers by 16-38x while reaching higher task accuracy than pruning and low-rank matrix factorization.

Findings

01. Compresses RNN layers by 16-38x with minimal accuracy loss.

02. Achieves better task accuracy than pruning and low-rank matrix factorization on 4 benchmarks spanning 3 applications.

03. Improves inference run-time in resource-constrained environments.
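One plausible source of a run-time benefit is that a Kronecker-structured weight matrix never has to be materialized. The sketch below uses the standard identity that (A ⊗ B) x equals the row-major flattening of A X Bᵀ, where X is x reshaped to match the factors. The function names, the list-of-rows matrix representation, and the 8x8 factor shapes are assumptions made for illustration; the summary above does not say how the authors implement inference.

```python
def kron_matvec(a, b, x):
    """Compute (a kron b) @ x without building the Kronecker product.

    a is m1 x n1, b is m2 x n2 (lists of rows), x has length n1 * n2.
    Hypothetical helper for illustration, not the paper's implementation.
    """
    n1, n2 = len(a[0]), len(b[0])
    if len(x) != n1 * n2:
        raise ValueError("x must have length n1 * n2")
    # Reshape x row-major into an n1 x n2 matrix X.
    xm = [x[j * n2:(j + 1) * n2] for j in range(n1)]
    # T = X @ b^T, shape n1 x m2.
    t = [[sum(b_row[l] * x_row[l] for l in range(n2)) for b_row in b]
         for x_row in xm]
    # Y = a @ T, shape m1 x m2.
    y = [[sum(a_row[j] * t[j][k] for j in range(n1)) for k in range(len(b))]
         for a_row in a]
    # Flatten Y row-major to obtain the output vector.
    return [v for row in y for v in row]


def multiply_counts(m1, n1, m2, n2):
    """Scalar multiplications: dense matvec vs. the factored form above."""
    dense = (m1 * m2) * (n1 * n2)
    factored = n1 * n2 * m2 + m1 * n1 * m2
    return dense, factored


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
result = kron_matvec(a, b, [1, 2, 3, 4])
dense_ops, factored_ops = multiply_counts(8, 8, 8, 8)
```

For two illustrative 8x8 factors, the dense product needs 4096 multiplications and the factored form 1024. Whether that arithmetic saving turns into a wall-clock speedup depends on the hardware and the kernels used.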

Abstract

Recurrent Neural Networks (RNNs) can be difficult to deploy on resource-constrained devices due to their size. As a result, there is a need for compression techniques that can significantly compress RNNs without negatively impacting task accuracy. This paper introduces a method to compress RNNs for resource-constrained environments using Kronecker products (KPs). KPs can compress RNN layers by 16-38x with minimal accuracy loss. We show that KP can beat the task accuracy achieved by other state-of-the-art compression techniques (pruning and low-rank matrix factorization) across 4 benchmarks spanning 3 different applications, while simultaneously improving inference run-time.
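To make the compression idea concrete, here is a minimal sketch of how a large weight matrix can be represented by two small Kronecker factors, and how the parameter saving is counted. The helper names and the 8x8 factor shapes are assumptions chosen for illustration; they are not taken from the paper, and the 16-38x figures in the abstract come from the authors' own configurations.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows.

    Entry [i * rows_b + k][j * cols_b + l] equals a[i][j] * b[k][l].
    """
    return [
        [a_val * b_val for a_val in a_row for b_val in b_row]
        for a_row in a
        for b_row in b
    ]


def compression_factor(a_shape, b_shape):
    """Dense parameter count divided by the parameters stored in the factors."""
    (m1, n1), (m2, n2) = a_shape, b_shape
    dense_params = (m1 * m2) * (n1 * n2)
    factor_params = m1 * n1 + m2 * n2
    return dense_params / factor_params


# Tiny worked example: two 2x2 factors expand to one 4x4 matrix.
w = kron([[1, 2], [3, 4]], [[0, 1], [1, 0]])

# Illustrative shapes only: a 64x64 matrix stored as two 8x8 factors.
ratio = compression_factor((8, 8), (8, 8))
```

With these illustrative shapes, 4096 dense parameters are replaced by 128 stored values. The trade-off is that the matrix is constrained to Kronecker structure, so the factors generally have to be trained rather than fitted exactly to an arbitrary pretrained matrix.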


Taxonomy

Methods: Kollen-Pollack Learning