Softer Pruning, Incremental Regularization
Linhang Cai, Zhulin An, Chuanguang Yang, Yongjun Xu

TL;DR
This paper introduces SofteR Filter Pruning (SRFP) and its variant, Asymptotic SofteR Filter Pruning (ASRFP), which decay pruned filters with a monotonically decreasing parameter instead of zeroizing them, leading to better compression and accuracy in deep neural networks.
Contribution
The paper proposes SRFP and ASRFP methods that utilize trained pruned filters through decay, enhancing pruning effectiveness and transferability compared to existing methods.
Findings
On ILSVRC-2012, ASRFP prunes 40% of the parameters on ResNet-34 with a 1.63% top-1 and 0.68% top-5 accuracy improvement.
Methods outperform existing pruning techniques across various networks and datasets.
Theoretical analysis shows SRFP and ASRFP as incremental regularization of pruned filters.
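One way to read the incremental-regularization finding, offered as an interpretive sketch and not as the paper's own derivation: scaling a pruned weight vector by a factor between 0 and 1 is the same as taking one unit-step gradient descent update on an L2 penalty. The symbols \alpha_t (decay parameter at epoch t) and \lambda_t (implied penalty coefficient) are notation assumed here for illustration.

```latex
w \;\leftarrow\; \alpha_t\, w
  \;=\; w - (1-\alpha_t)\, w
  \;=\; w - \nabla_w \left( \frac{\lambda_t}{2}\, \lVert w \rVert_2^2 \right),
\qquad \lambda_t = 1 - \alpha_t, \quad \alpha_t \in [0, 1]
```

Under this reading, a monotonically decreasing \alpha_t gives a monotonically increasing \lambda_t, so the penalty on the pruned filters grows step by step. Setting \alpha_t = 0 recovers the hard zeroizing of SFP.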
Abstract
Network pruning is widely used to compress Deep Neural Networks (DNNs). The Soft Filter Pruning (SFP) method zeroizes the pruned filters during training while updating them in the next training epoch. Thus the trained information of the pruned filters is completely dropped. To utilize the trained pruned filters, we propose a SofteR Filter Pruning (SRFP) method and its variant, Asymptotic SofteR Filter Pruning (ASRFP), which simply decay the pruned weights with a monotonically decreasing parameter. Our methods perform well across various networks, datasets, and pruning rates, and are also transferable to weight pruning. On ILSVRC-2012, ASRFP prunes 40% of the parameters on ResNet-34 with 1.63% top-1 and 0.68% top-5 accuracy improvement. In theory, SRFP and ASRFP are an incremental regularization of the pruned filters. Besides, we note that SRFP and ASRFP pursue better results while slowing down the…
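The decay step described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: filters are selected by smallest L2 norm (an assumption about the selection criterion), the linear `alpha_schedule` is a hypothetical choice (the abstract only requires a monotonically decreasing parameter), and all function names are invented for this example. The asymptotic pruning rate of ASRFP is not modeled.

```python
import math


def filter_l2_norms(filters):
    """Return the L2 norm of each filter (each filter is a flat list of floats)."""
    return [math.sqrt(sum(w * w for w in f)) for f in filters]


def alpha_schedule(epoch, total_epochs, alpha0=1.0):
    """Hypothetical linear schedule: alpha0 at epoch 0, falling to 0 at the last epoch."""
    if total_epochs <= 1:
        return 0.0
    return alpha0 * max(0.0, 1.0 - epoch / (total_epochs - 1))


def softer_prune_step(filters, prune_rate, alpha):
    """Scale the lowest-norm filters by alpha instead of zeroing them.

    alpha = 0 corresponds to hard zeroizing (as in SFP);
    alpha = 1 leaves the selected filters unchanged.
    Returns the new filters and the sorted indices of the decayed ones.
    """
    n_pruned = int(len(filters) * prune_rate)
    norms = filter_l2_norms(filters)
    # Assumption: the filters with the smallest L2 norm are the ones pruned.
    order = sorted(range(len(filters)), key=lambda i: norms[i])
    pruned = set(order[:n_pruned])
    new_filters = [
        [w * alpha for w in f] if i in pruned else list(f)
        for i, f in enumerate(filters)
    ]
    return new_filters, sorted(pruned)


if __name__ == "__main__":
    layer = [[3.0, 4.0], [0.1, 0.0], [1.0, 1.0], [0.0, 0.2]]
    total_epochs = 5
    for epoch in range(total_epochs):
        alpha = alpha_schedule(epoch, total_epochs)
        # A real training loop would run one epoch of gradient updates here,
        # so the decayed filters can still recover before the next decay.
        layer, decayed = softer_prune_step(layer, prune_rate=0.5, alpha=alpha)
        print(epoch, alpha, decayed, layer)
```

In this sketch the decay is applied once per epoch, mirroring the epoch-level zeroizing that the abstract attributes to SFP. Because the selected filters are only scaled, their trained values still influence the next epoch until alpha reaches 0.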
Taxonomy
Topics: Advanced Neural Network Applications · Anomaly Detection Techniques and Applications · Domain Adaptation and Few-Shot Learning
Methods: Pruning
