An Entropy-based Pruning Method for CNN Compression
Jian-Hao Luo, Jianxin Wu

TL;DR
This paper introduces an entropy-based filter pruning method for CNNs that accelerates and compresses models while reducing memory usage during training, demonstrated on the ILSVRC-12 benchmark.
Contribution
The paper proposes an entropy-based filter importance criterion for CNN pruning that obtains better performance than previous filter importance evaluation criteria.
Findings
Achieved 3.3x speed-up and 16.64x compression on VGG-16.
Achieved 1.54x acceleration and 1.47x compression on ResNet-50.
Both results come with about 1% top-5 accuracy loss.
Abstract
This paper aims to simultaneously accelerate and compress off-the-shelf CNN models via a filter pruning strategy. First, the importance of each filter is evaluated by the proposed entropy-based method. Then, several unimportant filters are discarded to obtain a smaller CNN model. Finally, fine-tuning is adopted to recover the generalization ability that is damaged during filter pruning. Our method can reduce the size of intermediate activations, which dominate the memory footprint during the model training stage but have received less attention in previous compression methods. Experiments on the ILSVRC-12 benchmark demonstrate the effectiveness of our method. Compared with previous filter importance evaluation criteria, our entropy-based method obtains better performance. We achieve 3.3x speed-up and 16.64x compression on VGG-16, 1.54x acceleration and 1.47x compression on ResNet-50, both with about…
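The three steps in the abstract (score filters, discard the unimportant ones, fine-tune) can be illustrated with a minimal NumPy sketch. The abstract does not say how the entropy is computed, so the details here are assumptions made for illustration: each channel's activation map is reduced to one scalar per image by global average pooling, the scalars are binned into a histogram, and the Shannon entropy of that histogram serves as the filter's score. The function names `filter_entropy` and `prune_layer`, the bin count, and the keep ratio are all hypothetical. Fine-tuning is not shown.

```python
import numpy as np


def filter_entropy(activations, num_bins=32):
    """Score each filter by the entropy of its pooled responses.

    activations: array of shape (N, C, H, W), the outputs of one conv
    layer for N images. Returns an array of shape (C,).
    ASSUMPTION: pooling + histogram binning is an illustrative choice,
    not taken from the text above.
    """
    # Global average pooling: one scalar per image per channel -> (N, C)
    pooled = activations.mean(axis=(2, 3))
    entropies = np.zeros(pooled.shape[1])
    for c in range(pooled.shape[1]):
        counts, _ = np.histogram(pooled[:, c], bins=num_bins)
        p = counts / counts.sum()
        p = p[p > 0]  # skip empty bins so log() is defined
        entropies[c] = -np.sum(p * np.log(p))
    return entropies


def prune_layer(weights, next_weights, entropies, keep_ratio=0.5):
    """Keep the filters with the highest entropy scores.

    weights:      (C_out, C_in, k, k) filters of the pruned layer
    next_weights: (C_next, C_out, k, k) filters of the following layer
    Dropping a filter removes one output channel here and the matching
    input channel in the next layer, so the intermediate activation
    between the two layers shrinks as well.
    """
    num_keep = max(1, int(round(keep_ratio * len(entropies))))
    keep = np.sort(np.argsort(entropies)[-num_keep:])
    return weights[keep], next_weights[:, keep], keep


# Toy data: channel 0 always outputs zero, so it carries no information.
rng = np.random.default_rng(0)
acts = rng.random((64, 4, 8, 8))
acts[:, 0] = 0.0

scores = filter_entropy(acts)
w, w_next, kept = prune_layer(
    rng.random((4, 3, 3, 3)), rng.random((8, 4, 3, 3)), scores
)
print("entropy per filter:", scores)
print("kept filters:", kept, "| new shapes:", w.shape, w_next.shape)
```

In this toy setup the constant channel gets zero entropy and is pruned first, which matches the intuition that a filter whose response barely varies across images contributes little. In practice the pruned network would then be fine-tuned to recover accuracy.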
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition
Methods: Pruning
