TL;DR
This paper introduces a filter pruning method that balances model performance against pruning speed: each layer is pruned at an approximately optimal rate for a preset loss variation, without iterative prune-retrain cycles, and the approach applies to common network architectures.
Contribution
The proposed method enables layer-wise pruning without iterative retraining, efficiently finds pruning thresholds, and incorporates layer group pruning for networks with short connections.
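As a rough illustration of pruning one layer up to a preset loss variation, the sketch below drops filters from least to most important and stops once the loss increase would exceed the budget. It is not the paper's algorithm: the greedy loop, the function name `prune_layer`, the `loss_with_mask` callback, and the toy loss are all assumptions made for the example, and the importance scores stand in for whatever criterion the paper actually uses.

```python
from typing import Callable, List, Sequence


def prune_layer(
    importance: Sequence[float],
    loss_with_mask: Callable[[List[bool]], float],
    max_loss_increase: float,
) -> List[bool]:
    """Return a keep-mask for one layer (hypothetical helper, not from the paper).

    Filters are removed in order of increasing importance until removing the
    next one would raise the loss by more than `max_loss_increase`.
    """
    n = len(importance)
    keep = [True] * n
    base_loss = loss_with_mask(keep)
    # Visit filters from least to most important.
    order = sorted(range(n), key=lambda i: importance[i])
    for idx in order[:-1]:  # always keep at least one filter
        keep[idx] = False
        if loss_with_mask(keep) - base_loss > max_loss_increase:
            keep[idx] = True  # undo the removal that broke the budget
            break
    return keep


# Toy stand-in for a real evaluation: the loss grows by the importance
# of every removed filter. A real setup would evaluate the network on data.
importance = [0.05, 0.9, 0.1, 0.4]


def toy_loss(keep: List[bool]) -> float:
    return 1.0 + sum(s for s, k in zip(importance, keep) if not k)


mask = prune_layer(importance, toy_loss, max_loss_increase=0.2)
print(mask)
```

With the toy loss, the two least important filters fit inside the 0.2 budget and the third does not, so the layer keeps two of its four filters.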
Findings
Outperforms many state-of-the-art pruning methods.
Achieves a good balance between model performance and pruning speed.
Applicable to common neural network architectures.
Abstract
Filter pruning has drawn growing attention because resource-constrained platforms require more compact models for deployment. However, current pruning methods suffer from either the inferior performance of one-shot methods or the expensive time cost of iterative training methods. In this paper, we propose a filter pruning method that balances performance and pruning speed. Based on a filter importance criterion, our method prunes a layer at an approximately layer-wise optimal pruning rate for a preset loss variation. The network is pruned layer by layer without the time-consuming prune-retrain iteration. If a pre-defined pruning rate for the entire network is given, we also introduce a fast-converging method to find the corresponding loss variation threshold. Moreover, we propose the layer group pruning and channel selection mechanism for channel alignment in…
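The abstract does not say how the loss variation threshold is searched for a given network-wide pruning rate. One plausible way to picture it, assuming the pruned fraction never decreases as the threshold grows, is a bisection search. The sketch below uses plain bisection as a stand-in technique; `find_threshold`, `pruned_fraction`, and the toy curve are hypothetical names and data, not taken from the paper.

```python
from typing import Callable


def find_threshold(
    pruned_fraction: Callable[[float], float],
    target: float,
    lo: float = 0.0,
    hi: float = 1.0,
    tol: float = 1e-3,
    max_iter: int = 50,
) -> float:
    """Bisection over the loss variation threshold (illustrative assumption).

    `pruned_fraction(t)` is assumed to return the fraction of filters removed
    when every layer is pruned with loss variation threshold `t`, and to be
    non-decreasing in `t`.
    """
    for _ in range(max_iter):
        mid = (lo + hi) / 2
        frac = pruned_fraction(mid)
        if abs(frac - target) <= tol:
            return mid
        if frac < target:
            lo = mid  # too little pruned: allow more loss variation
        else:
            hi = mid  # too much pruned: allow less loss variation
    return (lo + hi) / 2


# Toy monotone curve standing in for an actual pruning pass over the network.
def toy_pruned_fraction(threshold: float) -> float:
    return min(1.0, threshold ** 0.5)


threshold = find_threshold(toy_pruned_fraction, target=0.5)
print(threshold, toy_pruned_fraction(threshold))
```

In a real setting each call to `pruned_fraction` would run a full layer-wise pruning pass, so the number of bisection steps directly determines the search cost.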
Taxonomy
Methods: Pruning
