A Greedy Hierarchical Approach to Whole-Network Filter-Pruning in CNNs
Kiran Purohit, Anurag Reddy Parvathgari, Sourangshu Bhattacharya

TL;DR
This paper introduces a fast, hierarchical filter pruning method for CNNs that efficiently reduces model size and computational cost while maintaining accuracy, using a greedy approach based on classification loss.
Contribution
It proposes a novel two-level greedy hierarchical pruning algorithm that is computationally efficient and outperforms existing methods on multiple CNN architectures.
Findings
Reduces RAM from 7.6 GB to 1.5 GB for ResNext101.
Achieves 94% FLOPS reduction on CIFAR-10.
Maintains accuracy after pruning.
Abstract
Deep convolutional neural networks (CNNs) have achieved impressive performance in many computer vision tasks. However, their large model sizes require heavy computational resources, making pruning redundant filters from existing pre-trained CNNs an essential task in developing efficient models for resource-constrained devices. Whole-network filter pruning algorithms prune varying fractions of filters from each layer, hence providing greater flexibility. Current whole-network pruning methods are either computationally expensive due to the need to calculate the loss for each pruned filter using a training dataset, or use various heuristic / learned criteria for determining the pruning fractions for each layer. This paper proposes a two-level hierarchical approach for whole-network filter pruning which is efficient and uses the classification loss as the final criterion. The lower-level…
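The two-level structure described in the abstract can be illustrated with a small toy sketch. The abstract is cut off before it describes the lower-level step, so the per-layer rule used here (pick the remaining filter with the smallest L1 norm) is a placeholder assumption, not the paper's criterion. The names `lower_level_candidate`, `greedy_hierarchical_prune`, `toy_loss`, and `sensitivity` are likewise hypothetical, and `toy_loss` stands in for a real classification loss measured on data.

```python
from typing import Callable, List, Optional, Set

Layer = List[List[float]]  # one layer = list of filters, each a flat weight list


def lower_level_candidate(layer: Layer, pruned: Set[int]) -> Optional[int]:
    """Lower level (placeholder rule): nominate one filter per layer.

    Assumption: smallest L1 norm among the remaining filters. The paper's
    actual lower-level criterion is not given in the truncated abstract.
    """
    remaining = [i for i in range(len(layer)) if i not in pruned]
    if len(remaining) <= 1:
        return None  # always keep at least one filter per layer
    return min(remaining, key=lambda i: sum(abs(w) for w in layer[i]))


def greedy_hierarchical_prune(
    layers: List[Layer],
    loss_fn: Callable[[List[Set[int]]], float],
    num_to_prune: int,
) -> List[Set[int]]:
    """Upper level: each round, prune the nominated filter whose removal
    gives the lowest loss across all layers."""
    pruned: List[Set[int]] = [set() for _ in layers]
    for _ in range(num_to_prune):
        best = None  # (loss, layer index, filter index)
        for li, layer in enumerate(layers):
            fi = lower_level_candidate(layer, pruned[li])
            if fi is None:
                continue
            pruned[li].add(fi)        # tentatively prune
            loss = loss_fn(pruned)    # one loss evaluation per layer
            pruned[li].remove(fi)     # undo
            if best is None or loss < best[0]:
                best = (loss, li, fi)
        if best is None:
            break  # nothing left that can be pruned
        pruned[best[1]].add(best[2])
    return pruned


# Toy data: two layers, with a made-up per-layer sensitivity.
layers = [
    [[0.1, 0.1], [1.0, 1.0], [0.5, 0.5]],
    [[0.05, 0.05], [2.0, 2.0]],
]
sensitivity = [1.0, 5.0]


def toy_loss(pruned: List[Set[int]]) -> float:
    # Stand-in for classification loss: weighted L1 mass of removed filters.
    return sum(
        sensitivity[li] * sum(abs(w) for w in layers[li][fi])
        for li, removed in enumerate(pruned)
        for fi in removed
    )


result = greedy_hierarchical_prune(layers, toy_loss, 2)
```

In this sketch each round costs one loss evaluation per layer rather than one per filter, which is the kind of saving a hierarchical scheme is aiming for. A different number of filters can be removed from each layer, which is what makes it a whole-network method.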
Taxonomy
Topics: Neural Networks and Applications
Methods: Pruning
