TL;DR
This paper introduces a structured multi-hashing method for deep neural network compression that significantly reduces model size while maintaining or improving accuracy, enabling deployment on resource-constrained devices.
Contribution
The authors propose a novel end-to-end trainable multi-hashing technique based on matrix products, allowing direct control of model size across various architectures.
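As a rough illustration of how a matrix product can give direct control of model size, here is a minimal NumPy sketch. It is one plausible reading of the idea, not the authors' exact construction: the names `init_factors` and `materialize`, the single pair of factors `A` and `B`, the `rank` parameter, and the layer shapes are all assumptions made for this example. Every layer's weights are sliced from one shared pool `A @ B`, so the trainable variable count is set by the factor sizes rather than by the architecture.

```python
import math
import numpy as np

def init_factors(layer_shapes, rank, seed=0):
    """Create two small factors whose product covers all layer weights.

    Hypothetical helper: the pool is an n x n matrix with n = ceil(sqrt(total)).
    """
    total = sum(math.prod(shape) for shape in layer_shapes)
    n = math.isqrt(total)
    if n * n < total:
        n += 1  # round up so the pool holds at least `total` values
    rng = np.random.default_rng(seed)
    scale = 1.0 / math.sqrt(rank)  # keeps entries of A @ B at roughly unit variance
    A = rng.standard_normal((n, rank)) * scale
    B = rng.standard_normal((rank, n)) * scale
    return A, B

def materialize(A, B, layer_shapes):
    """Slice each layer's weight tensor out of the flattened product A @ B."""
    pool = (A @ B).reshape(-1)
    weights, offset = [], 0
    for shape in layer_shapes:
        size = math.prod(shape)
        weights.append(pool[offset:offset + size].reshape(shape))
        offset += size
    return weights

# Illustrative shapes only: two 3x3 conv kernels and a dense layer.
layer_shapes = [(3, 3, 16, 32), (3, 3, 32, 64), (64, 10)]
A, B = init_factors(layer_shapes, rank=8)
weights = materialize(A, B, layer_shapes)

virtual = sum(w.size for w in weights)  # weights the network sees
trainable = A.size + B.size             # variables actually stored and trained
print(virtual, trainable)
```

In this sketch the stored variable count is `2 * n * rank`, so changing `rank` changes the compressed size without touching the network definition. In a real training setup, `A` and `B` would be the learnable variables and gradients would flow through the product, which is what makes such a scheme trainable end-to-end.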
Findings
Compressed EfficientNet-B4 to the size of B0 while gaining over 3% accuracy relative to the B0 baseline
Compressed ResNet32 by 75% with no accuracy loss
Achieved 10x compression on CIFAR10 with over 90% accuracy
Abstract
Despite the success of deep neural networks (DNNs), state-of-the-art models are too large to deploy on low-resource devices or common server configurations in which multiple models are held in memory. Model compression methods address this limitation by reducing the memory footprint, latency, or energy consumption of a model with minimal impact on accuracy. We focus on the task of reducing the number of learnable variables in the model. In this work, we combine ideas from weight hashing and dimensionality reduction, resulting in a simple and powerful structured multi-hashing method based on matrix products that allows direct control of the model size of any deep network and is trained end-to-end. We demonstrate the strength of our approach by compressing models from the ResNet, EfficientNet, and MobileNet architecture families. Our method allows us to drastically decrease the number of…
Videos
Structured Multi-Hashing for Model Compression · YouTube
Taxonomy
Methods: RMSProp · ReLU6 · Hard Swish · Global Average Pooling · Bottleneck Residual Block · Depthwise Convolution · Pointwise Convolution
