Structured Probabilistic Pruning for Convolutional Neural Network Acceleration
Huan Wang, Qiming Zhang, Yuehai Wang, Haoji Hu

TL;DR
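This paper introduces Structured Probabilistic Pruning (SPP), a CNN acceleration method that assigns each convolutional weight a pruning probability and prunes by sampling from those probabilities. On ImageNet it reaches 4x speedup on AlexNet and VGG-16 with under 1% top-5 accuracy loss, and it applies to multi-branch networks such as ResNet without specific adaptations.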
Contribution
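The paper presents a probabilistic pruning approach that raises or lowers each weight's pruning probability during training according to importance criteria, instead of permanently eliminating weights as deterministic methods do, and reports that it outperforms those deterministic methods.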
Findings
Achieves 4x speedup on AlexNet with only 0.3% top-5 accuracy loss on ImageNet.
Achieves 4x speedup on VGG-16 with 0.8% top-5 accuracy loss on ImageNet.
Successfully accelerates ResNet-50 with 2x speedup and minimal accuracy loss.
Abstract
In this paper, we propose a novel progressive parameter pruning method for Convolutional Neural Network acceleration, named Structured Probabilistic Pruning (SPP), which effectively prunes weights of convolutional layers in a probabilistic manner. Unlike existing deterministic pruning approaches, where unimportant weights are permanently eliminated, SPP introduces a pruning probability for each weight, and pruning is guided by sampling from the pruning probabilities. A mechanism is designed to increase and decrease pruning probabilities based on importance criteria in the training process. Experiments show that, with 4x speedup, SPP can accelerate AlexNet with only 0.3% loss of top-5 accuracy and VGG-16 with 0.8% loss of top-5 accuracy in ImageNet classification. Moreover, SPP can be directly applied to accelerate multi-branch CNN networks, such as ResNet, without specific adaptations.…
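The mechanism described in the abstract (a pruning probability per weight, masks sampled from those probabilities, and probabilities that rise or fall with importance) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names `update_probs` and `sample_mask`, the L1-norm importance score, the fixed `step`, and the `target_ratio` parameter are all assumptions made for illustration, and the weights are held fixed here, whereas in SPP they keep training between probability updates.

```python
import numpy as np

def update_probs(weights, probs, target_ratio=0.5, step=0.1):
    """Raise pruning probability for the least important groups, lower it for the rest.

    weights: (n_groups, group_size) array, one row per structured weight group.
    probs:   (n_groups,) array of current pruning probabilities.
    The L1-norm criterion and the fixed +/- step are illustrative assumptions.
    """
    importance = np.abs(weights).sum(axis=1)       # L1 norm per group
    ranks = importance.argsort().argsort()         # rank 0 = least important
    n_prune = int(round(target_ratio * len(probs)))
    delta = np.where(ranks < n_prune, step, -step)
    return np.clip(probs + delta, 0.0, 1.0)

def sample_mask(probs, rng):
    """Sample a keep-mask: each group is pruned with its own probability."""
    keep = rng.random(probs.shape) >= probs
    return keep.astype(float)

rng = np.random.default_rng(0)
# Toy layer: 8 weight groups of 4 weights each, with varied magnitudes.
weights = rng.normal(size=(8, 4)) * np.arange(1, 9)[:, None]
probs = np.zeros(8)

for _ in range(20):
    probs = update_probs(weights, probs)
    mask = sample_mask(probs, rng)
    masked = weights * mask[:, None]               # weights used in this iteration

importance = np.abs(weights).sum(axis=1)
```

Because a group is only masked for the current iteration, a group pruned early can return later if its probability drops, which is the contrast with deterministic pruning that the abstract draws. In this toy run the weights never change, so the probabilities simply settle at 0 or 1.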
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Pruning · Average Pooling · Global Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Kaiming Initialization · Residual Connection · Convolution · Residual Block
