Structured Multi-Hashing for Model Compression

Elad Eban; Yair Movshovitz-Attias; Hao Wu; Mark Sandler; Andrew Poon; Yerlan Idelbayev; Miguel A. Carreira-Perpinan

arXiv:1911.11177 · cs.LG · November 27, 2019

TL;DR

This paper introduces a structured multi-hashing method for deep neural network compression that significantly reduces model size while maintaining or improving accuracy, enabling deployment on resource-constrained devices.

Contribution

The authors propose a novel end-to-end trainable multi-hashing technique based on matrix products, allowing direct control of model size across various architectures.
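One way to read "direct control of model size" is as simple budget arithmetic. The sketch below is illustrative only and rests on an assumption not stated in this summary: that all weights of the network are laid out on a square grid of side n, and that this grid is produced by multiplying an n×m matrix with an m×n matrix, so the trainable variable count is 2·n·m. The function name `plan_hash_budget` and its return fields are hypothetical.

```python
import math


def plan_hash_budget(total_weights: int, target_variables: int) -> dict:
    """Pick a grid side n and inner rank m so that 2*n*m fits the target.

    Assumption (illustrative, not taken from the paper text above): the
    network's weights sit on an n x n grid generated by an (n x m) times
    (m x n) matrix product, which has 2*n*m trainable variables.
    """
    if total_weights <= 0 or target_variables <= 0:
        raise ValueError("both counts must be positive")
    # Smallest n with n*n >= total_weights, i.e. ceil(sqrt(total_weights)).
    n = math.isqrt(total_weights - 1) + 1
    # Largest rank that stays within the budget; at least 1, so a very
    # small target can be exceeded when it is below 2*n.
    m = max(1, target_variables // (2 * n))
    trainable = 2 * n * m
    return {
        "side": n,
        "rank": m,
        "trainable": trainable,
        "compression": total_weights / trainable,
    }


plan = plan_hash_budget(total_weights=1_000_000, target_variables=100_000)
```

Under this assumption the rank m acts as the single knob for model size: raising it by one adds 2·n trainable variables, regardless of which architecture the weights belong to.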

Findings

01. Reduced EfficientNet-B4 to the size of B0 with a 3% accuracy gain

02. Compressed ResNet32 by 75% with no accuracy loss

03. Achieved 10x compression on CIFAR10 with over 90% accuracy

Abstract

Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or common server configurations in which multiple models are held in memory. Model compression methods address this limitation by reducing the memory footprint, latency, or energy consumption of a model with minimal impact on accuracy. We focus on the task of reducing the number of learnable variables in the model. In this work we combine ideas from weight hashing and dimensionality reduction, resulting in a simple and powerful structured multi-hashing method based on matrix products that allows direct control of the model size of any deep network and is trained end-to-end. We demonstrate the strength of our approach by compressing models from the ResNet, EfficientNet, and MobileNet architecture families. Our method allows us to drastically decrease the number of…
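The idea of generating many weights from a small set of variables via a matrix product can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract only says the method is "based on matrix products", so the flat-index-to-grid layout, the Gaussian initialization, and the helper names (`make_factors`, `virtual_weight`, `materialize_layer`) are all hypothetical. Training is omitted; in an end-to-end setup the two factor matrices would be the only trainable variables.

```python
import math
import random


def make_factors(total_weights, rank, seed=0):
    """Create the two thin factor matrices (the only stored variables)."""
    rng = random.Random(seed)
    n = math.isqrt(total_weights - 1) + 1  # ceil(sqrt(total_weights))
    left = [[rng.gauss(0.0, 1.0) for _ in range(rank)] for _ in range(n)]
    right = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(rank)]
    return left, right


def virtual_weight(left, right, flat_index):
    """Look up one weight: entry (row, col) of the product left @ right.

    Assumed layout: flat weight index i maps to row i // n, column i % n.
    """
    n = len(left)
    row, col = divmod(flat_index, n)
    return sum(left[row][k] * right[k][col] for k in range(len(right)))


def materialize_layer(left, right, offset, shape):
    """Return the flat weights of one layer starting at a global offset."""
    count = math.prod(shape)
    return [virtual_weight(left, right, offset + i) for i in range(count)]


# Hypothetical two-layer model: a 3x3 conv (16 -> 32 channels) and a dense layer.
layer_shapes = [(3, 3, 16, 32), (32, 10)]
total = sum(math.prod(s) for s in layer_shapes)  # 4608 + 320 = 4928 weights
left, right = make_factors(total, rank=4)
stored = len(left) * len(left[0]) + len(right) * len(right[0])

conv_weights = materialize_layer(left, right, 0, layer_shapes[0])
dense_weights = materialize_layer(left, right, math.prod(layer_shapes[0]), layer_shapes[1])
```

In this toy setup 4928 layer weights are derived from 568 stored values. Every layer reads from the same pair of factors, which is what lets a single rank parameter set the size of the whole model.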

Taxonomy

Methods: RMSProp · ReLU6 · Hard Swish · Global Average Pooling · Bottleneck Residual Block · Depthwise Convolution · Pointwise Convolution