Soft Masking for Cost-Constrained Channel Pruning
Ryan Humble, Maying Shen, Jorge Albericio Latorre, Eric Darve, Jose M. Alvarez

TL;DR
This paper introduces SMCP, a novel soft masking approach for channel pruning in CNNs that allows pruned channels to recover, improving accuracy under cost constraints on ImageNet and PASCAL VOC.
Contribution
We propose a soft mask re-parameterization method for channel pruning that enables channels to adaptively return, addressing accuracy loss in traditional pruning methods.
Findings
Outperforms prior channel pruning methods on ImageNet.
Achieves better accuracy with fewer channels.
Effectively balances model size and performance.
Abstract
Structured channel pruning has been shown to significantly accelerate inference time for convolutional neural networks (CNNs) on modern hardware, with a relatively minor loss of network accuracy. Recent works permanently zero these channels during training, which we observe to significantly hamper final accuracy, particularly as the fraction of the network being pruned increases. We propose Soft Masking for cost-constrained Channel Pruning (SMCP) to allow pruned channels to adaptively return to the network while simultaneously pruning towards a target cost constraint. By adding a soft mask re-parameterization of the weights and channel pruning from the perspective of removing input channels, we allow gradient updates to previously pruned channels and the opportunity for the channels to later return to the network. We then formulate input channel pruning as a global resource allocation…
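The soft-masking idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the squared-magnitude importance score, the top-k selection (standing in for the paper's cost-constrained resource allocation), and all function names are assumptions made for illustration. It only shows the core behavior, namely that masked input channels keep receiving updates and can re-enter the network when the mask is recomputed.

```python
# Hypothetical sketch of soft masking over input channels (pure Python).
# Assumptions: importance = squared weight magnitude, selection = top-k,
# and `grads` are dense gradients that also reach masked channels.

def channel_importance(weights):
    """Squared L2 norm of each input channel's weights (assumed score)."""
    return [sum(w * w for w in channel) for channel in weights]

def keep_top_k(scores, k):
    """Return a 0/1 mask that keeps the k highest-scoring channels."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    kept = set(ranked[:k])
    return [1.0 if i in kept else 0.0 for i in range(len(scores))]

def masked_weights(weights, mask):
    """Effective forward-pass weights: masked channels contribute zero."""
    return [[w * m for w in channel] for channel, m in zip(weights, mask)]

def soft_mask_step(weights, grads, k, lr=0.1):
    """Update all underlying weights, masked or not, then recompute the mask."""
    updated = [
        [w - lr * g for w, g in zip(channel, grad_channel)]
        for channel, grad_channel in zip(weights, grads)
    ]
    return updated, keep_top_k(channel_importance(updated), k)

# Three input channels with two weights each; keep two channels.
weights = [[1.0, 1.0], [0.5, 0.5], [0.1, 0.1]]
mask = keep_top_k(channel_importance(weights), 2)
print(mask)  # [1.0, 1.0, 0.0]  (channel 2 starts out pruned)

# The pruned channel still receives a gradient, so its weights can grow.
grads = [[0.0, 0.0], [0.0, 0.0], [-10.0, -10.0]]
weights, mask = soft_mask_step(weights, grads, 2)
print(mask)  # [1.0, 0.0, 1.0]  (channel 2 returns, channel 1 is pruned)
```

Under a permanent (hard) mask, the third channel's weights would stay at zero and it could never be selected again; keeping the underlying weights and re-deriving the mask is what makes recovery possible in this sketch.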
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning
Methods: Pruning · Convolution
