Learning both Weights and Connections for Efficient Neural Networks
Song Han, Jeff Pool, John Tran, William J. Dally

TL;DR
This paper introduces a method to significantly reduce neural network size and computation by learning which connections are important and pruning the rest, enabling efficient deployment without accuracy loss.
Contribution
It presents a three-step approach (train, prune, retrain) that learns which connections are important during training, removes the unimportant ones, and fine-tunes the weights that remain.
Findings
Reduced AlexNet parameters by 9x on ImageNet, from 61 million to 6.7 million
Decreased VGG-16 parameters by 13x without accuracy loss
Reduced the storage and computation required by an order of magnitude
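As a quick check on the AlexNet figure, the parameter counts quoted in the abstract (61 million before pruning, 6.7 million after) can be turned into a compression ratio and a remaining fraction. The variable names below are illustrative only.

```python
# Parameter counts for AlexNet as quoted in the abstract.
dense_params = 61_000_000
pruned_params = 6_700_000

# Compression ratio: how many times smaller the pruned network is.
compression = dense_params / pruned_params
# Fraction of the original parameters that survive pruning.
remaining = pruned_params / dense_params

print(f"compression: {compression:.1f}x")  # compression: 9.1x
print(f"remaining: {remaining:.1%}")       # remaining: 11.0%
```

The rounded 9x in the findings therefore corresponds to keeping roughly one parameter in nine.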
Abstract
Neural networks are both computationally intensive and memory intensive, making them difficult to deploy on embedded systems. Also, conventional networks fix the architecture before training starts; as a result, training cannot improve the architecture. To address these limitations, we describe a method to reduce the storage and computation required by neural networks by an order of magnitude without affecting their accuracy by learning only the important connections. Our method prunes redundant connections using a three-step method. First, we train the network to learn which connections are important. Next, we prune the unimportant connections. Finally, we retrain the network to fine tune the weights of the remaining connections. On the ImageNet dataset, our method reduced the number of parameters of AlexNet by a factor of 9x, from 61 million to 6.7 million, without incurring accuracy…
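The three steps described in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a single linear layer trained with plain gradient descent in NumPy, and it prunes by keeping a fixed fraction of the largest-magnitude weights. The helper names (`train`, `magnitude_mask`), the `keep_fraction` parameter, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def train(X, y, w, mask, lr=0.1, steps=500):
    """Gradient descent on mean squared error; masked-out weights stay at zero."""
    w = w * mask
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w = (w - lr * grad) * mask  # re-apply the mask after every update
    return w

def magnitude_mask(w, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude weights (assumed criterion)."""
    k = max(1, int(round(keep_fraction * w.size)))
    threshold = np.sort(np.abs(w))[-k]
    return (np.abs(w) >= threshold).astype(w.dtype)

def mse(X, y, w):
    return float(np.mean((X @ w - y) ** 2))

# Synthetic data: only 4 of the 20 input connections actually matter.
rng = np.random.default_rng(0)
n_samples, n_features = 200, 20
true_w = np.zeros(n_features)
true_w[:4] = [3.0, -2.0, 1.5, 4.0]
X = rng.normal(size=(n_samples, n_features))
y = X @ true_w + 0.01 * rng.normal(size=n_samples)

# Step 1: train the dense model to learn which connections are important.
w_dense = train(X, y, np.zeros(n_features), np.ones(n_features))

# Step 2: prune the unimportant (small-magnitude) connections.
mask = magnitude_mask(w_dense, keep_fraction=0.2)

# Step 3: retrain, updating only the surviving connections.
w_sparse = train(X, y, w_dense, mask)

print("kept connections:", np.flatnonzero(mask).tolist())
print("dense MSE:", mse(X, y, w_dense), "sparse MSE:", mse(X, y, w_sparse))
```

In this sketch the mask is re-applied after every update so that pruned connections cannot regrow during retraining. How the threshold is chosen for a real network is a separate design decision that this toy example does not attempt to reproduce.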
Taxonomy
Topics: Advanced Neural Network Applications · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning
Methods: 1x1 Convolution · Convolution · Local Response Normalization · Grouped Convolution · Dropout · Dense Connections · Max Pooling · Softmax
