Doping: A technique for efficient compression of LSTM models using sparse structured additive matrices

Urmish Thakker; Paul N. Whatmough; Zhigang Liu; Matthew Mattina; Jesse Beu

arXiv:2102.07071 · cs.LG · February 16, 2021 · 1 citation

TL;DR

This paper introduces doping, a novel method that adds sparse matrices to structured matrices for compressing LSTM models efficiently, achieving high compression ratios with minimal accuracy loss.

Contribution

The paper proposes doping and associated regularization techniques to improve structured matrix compression of neural networks, demonstrating state-of-the-art results in NLP tasks.

Findings

01 Achieves 10-25x compression with minor accuracy loss.

02 Outperforms pruning and low-rank methods significantly.

03 Enables hardware-efficient deployment with 2.5-5.5x speed-up.

Abstract

Structured matrices, such as those derived from Kronecker products (KP), are effective at compressing neural networks, but can lead to unacceptable accuracy loss when applied to large models. In this paper, we propose the notion of doping: the addition of an extremely sparse matrix to a structured matrix. Doping provides additional degrees of freedom for a small number of parameters, allowing them to diverge independently from the fixed structure. To train LSTMs with doped structured matrices, we introduce the additional parameter matrix while slowly annealing its sparsity level. However, we find that performance degrades as we slowly sparsify the doping matrix, due to co-matrix adaptation (CMA) between the structured and the sparse matrices. We address this overdependence on the sparse matrix using a co-matrix dropout regularization (CMR) scheme. We provide empirical evidence to show…
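
The core idea in the abstract, a Kronecker-product matrix plus an extremely sparse additive matrix, can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the coordinate (row, column, value) storage of the sparse matrix, the toy shapes, and the parameter count that ignores index storage are all assumptions made for illustration.

```python
import numpy as np


def doped_kp_matvec(B, C, s_vals, s_rows, s_cols, x):
    """Compute y = (kron(B, C) + S) @ x without building kron(B, C).

    S is given in coordinate form (hypothetical layout): s_vals[k] sits at
    position (s_rows[k], s_cols[k]) of the full matrix.
    """
    m1, n1 = B.shape
    m2, n2 = C.shape
    # Row-major identity: kron(B, C) @ x == (B @ X @ C.T).ravel(),
    # where X is x reshaped to (n1, n2).
    X = x.reshape(n1, n2)
    y = (B @ X @ C.T).reshape(m1 * m2)
    # Add the contribution of the sparse doping matrix.
    np.add.at(y, s_rows, s_vals * x[s_cols])
    return y


def compression_ratio(B, C, nnz):
    """Dense parameter count divided by doped-KP parameter count.

    Assumption: index storage for the sparse entries is not counted.
    """
    dense_params = (B.shape[0] * C.shape[0]) * (B.shape[1] * C.shape[1])
    return dense_params / (B.size + C.size + nnz)


def make_example(nnz=10, seed=0):
    """Build a toy 32x32 doped KP matrix (shapes chosen arbitrarily)."""
    rng = np.random.default_rng(seed)
    B = rng.standard_normal((4, 8))
    C = rng.standard_normal((8, 4))
    n_rows, n_cols = 32, 32
    # Pick nnz distinct positions for the sparse doping entries.
    flat = rng.choice(n_rows * n_cols, size=nnz, replace=False)
    s_rows, s_cols = np.divmod(flat, n_cols)
    s_vals = rng.standard_normal(nnz)
    x = rng.standard_normal(n_cols)
    return B, C, s_vals, s_rows, s_cols, x
```

In this toy setting the structured part stores 64 values and the doping matrix adds 10 more, in place of the 1024 entries of the dense 32x32 matrix. The sparsity annealing and co-matrix dropout used during training are not shown here.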

Taxonomy

Topics: Topic Modeling · Machine Learning and ELM · Advanced Neural Network Applications

Methods: Pruning · Dropout · Kollen-Pollack Learning