Post-training deep neural network pruning via layer-wise calibration
Ivan Lazarevich, Alexander Kozlov, Nikita Malinin

TL;DR
This paper introduces a post-training weight pruning method that is fast enough to run on commodity hardware such as desktop CPUs or edge devices, and adds a data-free variant for computer vision models based on synthetic fractal images.
Contribution
The authors propose a layer-wise calibration approach to post-training weight pruning, plus a data-free extension for computer vision models that relies on automatically generated synthetic fractal images instead of real data.
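To make "layer-wise calibration" concrete, here is a minimal sketch of one simple form of it: bias correction, where each bias of a pruned layer is shifted so that the layer's mean output on a small calibration set matches the dense layer's. This is an illustration under stated assumptions, not the authors' procedure; the summary does not spell out their calibration steps. The helper names `linear` and `calibrate_bias` and the toy numbers are hypothetical.

```python
def linear(weights, bias, x):
    """Output of one linear layer: `weights` is a list of rows, `x` a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def calibrate_bias(dense_w, sparse_w, bias, calib_inputs):
    """Shift each bias so the sparse layer's mean output matches the dense one.

    Assumption: bias correction stands in for layer-wise calibration here;
    it is not confirmed to be the method used in the paper.
    """
    n = len(calib_inputs)
    new_bias = []
    for j in range(len(bias)):
        # Mean error that pruning introduced on output j over the calibration set.
        err = sum(
            linear(dense_w, bias, x)[j] - linear(sparse_w, bias, x)[j]
            for x in calib_inputs
        ) / n
        new_bias.append(bias[j] + err)
    return new_bias


# Toy layer: the two smallest-magnitude weights have been pruned to zero.
dense_w = [[1.0, 0.5], [0.25, -2.0]]
sparse_w = [[1.0, 0.0], [0.0, -2.0]]
bias = [0.0, 1.0]
calib_inputs = [[1.0, 2.0], [3.0, 4.0]]  # could be real or synthetic samples

new_bias = calibrate_bias(dense_w, sparse_w, bias, calib_inputs)
```

Because the correction needs only layer inputs and no labels, a scheme of this kind can in principle be driven by synthetic images, which is what makes a data-free variant plausible.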
Findings
Data-free pruning of ResNet50 on ImageNet: ~1.5% top-1 accuracy drop at 50% sparsity.
The data-free variant, which uses synthetic fractal images, is reported as state of the art for data-free pruning.
With real data, ResNet50 on ImageNet reaches 65% sparsity in 8-bit precision with a ~1% top-1 accuracy drop.
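For readers unfamiliar with the term, the "sparsity" figures above are the fraction of weights set to zero. The sketch below shows how a target sparsity could be reached with unstructured magnitude pruning. The magnitude criterion is an assumption made for illustration, since the summary does not state which criterion the authors use, and the function names and weights are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries of a flat weight list.

    `sparsity` is the fraction of weights to set to zero (0.5 means 50%).
    Assumption: magnitude is the pruning criterion; the paper may differ.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Indices sorted by ascending absolute value; the first n_prune are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def sparsity_rate(weights):
    """Fraction of weights that are exactly zero."""
    return sum(1 for w in weights if w == 0.0) / len(weights)


layer = [0.8, -0.05, 0.3, -1.2, 0.01, 0.6, -0.4, 0.02]
pruned = magnitude_prune(layer, 0.5)
# pruned is [0.8, 0.0, 0.0, -1.2, 0.0, 0.6, -0.4, 0.0]; sparsity_rate(pruned) is 0.5
```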
Abstract
We present a post-training weight pruning method for deep neural networks that achieves accuracy levels tolerable for the production setting and that is sufficiently fast to be run on commodity hardware such as desktop CPUs or edge devices. We propose a data-free extension of the approach for computer vision models based on automatically-generated synthetic fractal images. We obtain state-of-the-art results for data-free neural network pruning, with ~1.5% top@1 accuracy drop for a ResNet50 on ImageNet at 50% sparsity rate. When using real data, we are able to get a ResNet50 model on ImageNet with 65% sparsity rate in 8-bit precision in a post-training setting with a ~1% top@1 accuracy drop. We release the code as a part of the OpenVINO(TM) Post-Training Optimization tool.
Taxonomy
Methods: Pruning
