SlimNets: An Exploration of Deep Model Compression and Acceleration
Ini Oguntola, Subby Olubeko, Christopher Sweeney

TL;DR
This paper evaluates and compares three methods for deep neural network compression and acceleration, demonstrating that combining pruning and knowledge distillation can significantly reduce model size while maintaining high accuracy.
Contribution
The paper provides a comparative analysis of weight pruning, low-rank factorization, and knowledge distillation, and highlights how effective their combination is for model compression.
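As a rough illustration of two of the techniques named above, the sketch below shows magnitude-based weight pruning on a flat list of weights and the parameter-count arithmetic behind low-rank factorization. It is a minimal sketch and not the paper's implementation: the function names, the 50% sparsity level, and the 512 x 512 layer with rank 32 are all assumptions chosen for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of a flat weight list.

    Hypothetical helper for illustration; real pruning pipelines usually
    operate per layer on tensors and fine-tune afterwards.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    k = int(len(weights) * sparsity)  # number of weights to zero out
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def low_rank_param_count(m, n, rank):
    """Parameters stored when an m x n matrix is replaced by (m x rank) @ (rank x n)."""
    return rank * (m + n)


# Illustrative values only (assumptions, not taken from the paper).
weights = [0.9, -0.05, 0.3, 0.01, -0.7, 0.02]
pruned = magnitude_prune(weights, sparsity=0.5)

dense_params = 512 * 512
factored_params = low_rank_param_count(512, 512, rank=32)
reduction = dense_params / factored_params
```

With these example numbers, the three smallest-magnitude weights are zeroed, and the factored layer stores 32 * (512 + 512) parameters instead of 512 * 512. Whether accuracy survives at a given sparsity or rank is an empirical question that the sketch does not address.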
Findings
Combining pruning and knowledge distillation yields 85x smaller models.
The combined approach retains 96% of original accuracy.
Individual methods are effective but less so than combined techniques.
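For readers unfamiliar with knowledge distillation, the sketch below shows one commonly used form of the training loss, in which a small student network is fit to both the true label and the softened output distribution of a larger teacher. The summary above does not specify the loss the authors used, so the temperature, the weighting factor `alpha`, and the function names are assumptions made for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperature gives a softer distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    `temperature` and `alpha` are illustrative defaults, not values
    taken from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # Cross-entropy between softened teacher and student distributions.
    soft = -sum(t * math.log(s) for t, s in zip(p_teacher, p_student))
    # Standard cross-entropy against the ground-truth class index.
    hard = -math.log(softmax(student_logits)[label])
    # The T^2 factor keeps the soft term's gradient scale comparable.
    return alpha * temperature ** 2 * soft + (1 - alpha) * hard


# Illustrative logits for a 3-class problem (assumed values).
teacher = [2.0, 0.5, -1.0]
matched = distillation_loss([2.0, 0.5, -1.0], teacher, label=0)
mismatched = distillation_loss([-1.0, 0.5, 2.0], teacher, label=0)
```

A student whose logits agree with the teacher gets a lower loss than one that ranks the classes in the opposite order. In a combined setup, a loss like this could be used to train a small student that is then pruned, though the exact ordering and settings would need to be taken from the paper itself.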
Abstract
Deep neural networks have achieved increasingly accurate results on a wide variety of complex tasks. However, much of this improvement is due to the growing use and availability of computational resources (e.g., use of GPUs, more layers, more parameters, etc.). Most state-of-the-art deep networks, despite performing well, over-parameterize approximate functions and take a significant amount of time to train. With increased focus on deploying deep neural networks on resource-constrained devices like smartphones, there has been a push to evaluate why these models are so resource hungry and how they can be made more efficient. This work evaluates and compares three distinct methods for deep model compression and acceleration: weight pruning, low-rank factorization, and knowledge distillation. Comparisons on VGG nets trained on CIFAR10 show that each of the models on its own is effective,…
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
Methods: Pruning · Dropout · Dense Connections · Max Pooling · Softmax · Convolution
