TL;DR
Gator is a channel pruning method that uses learned gating mechanisms and a new formulation of layer dependencies to reduce computational cost and deliver hardware speedup in neural networks, demonstrated on ResNet-50.
Contribution
Gator introduces a flexible pruning approach with learned gates and a new layer dependency model, enabling pruning of complex network structures and achieving state-of-the-art results.
Findings
50% FLOPs reduction with only 0.4% top-5 accuracy drop on ResNet-50.
Runs 1.4x faster on GPU than models pruned by previous methods.
Outperforms MobileNetV2 and SqueezeNet in accuracy at similar runtimes.
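To give a feel for what a FLOPs reduction of roughly 50% involves, the minimal sketch below counts multiply-accumulates for one convolution before and after channel removal. The layer shape and the 30% pruning ratio are illustrative assumptions, not figures from the paper, and `conv_flops` is a hypothetical helper.

```python
def conv_flops(c_in, c_out, kernel, out_h, out_w):
    """Multiply-accumulate count of a dense 2D convolution (bias ignored)."""
    return c_in * c_out * kernel * kernel * out_h * out_w

# Hypothetical 3x3 layer; the shape is chosen for illustration only.
dense = conv_flops(c_in=256, c_out=256, kernel=3, out_h=14, out_w=14)

# Assume about 30% of channels are removed on both the input and output side
# (256 -> 179). The cost scales with the product of the two channel counts.
pruned = conv_flops(c_in=179, c_out=179, kernel=3, out_h=14, out_w=14)

reduction = 1 - pruned / dense
print(f"FLOPs reduction: {reduction:.1%}")
```

Because the cost depends on the product of input and output channels, pruning about 30% of the channels in two consecutive layers already removes about half of the FLOPs of the layer between them.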
Abstract
The rise of neural network (NN) applications has prompted an increased interest in compression, with a particular focus on channel pruning, which does not require any additional hardware. Most pruning methods employ either single-layer operations or global schemes to determine which channels to remove, followed by fine-tuning of the network. In this paper, we present Gator, a channel-pruning method that temporarily adds learned gating mechanisms for pruning individual channels and is trained with an auxiliary loss aimed at reducing computational cost in terms of memory, theoretical speedup (FLOPs), and practical, hardware-specific speedup. Gator introduces a new formulation of dependencies between NN layers which, in contrast to most previous methods, enables pruning of non-sequential parts, such as layers on ResNet's highway, and even removing entire…
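The idea of learned gates trained with an auxiliary cost loss can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's formulation: the sigmoid gate parameterization, the function names, and the normalized cost term are all hypothetical, and the sketch covers only a sequential chain of convolutions, whereas Gator's dependency formulation also handles non-sequential structures such as residual connections.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def expected_channels(gate_logits):
    """Expected number of open gates: the sum of the gate probabilities."""
    return sum(sigmoid(g) for g in gate_logits)

def auxiliary_cost(layer_gate_logits, kernel_sizes, out_sizes, in_channels):
    """Expected FLOPs of a chain of convolutions with gated output channels.

    Assumption: each layer's cost couples its own gates with the gates of
    the preceding layer, since pruned outputs become pruned inputs.
    """
    cost = 0.0
    c_in = float(in_channels)
    for logits, k, (h, w) in zip(layer_gate_logits, kernel_sizes, out_sizes):
        c_out = expected_channels(logits)
        cost += c_in * c_out * k * k * h * w
        c_in = c_out  # the next layer only sees the channels that survive
    return cost

def total_loss(task_loss, cost, dense_cost, weight):
    """Task loss plus a weighted, normalized cost penalty (hypothetical form)."""
    return task_loss + weight * cost / dense_cost

def prune_mask(gate_logits, threshold=0.5):
    """After training, keep the channels whose gate probability is high enough."""
    return [sigmoid(g) >= threshold for g in gate_logits]

# Two hypothetical 3x3 layers with four gated output channels each.
gates = [[4.0, 4.0, -4.0, -4.0], [4.0, -4.0, -4.0, -4.0]]
open_gates = [[50.0] * 4, [50.0] * 4]  # effectively unpruned reference
shape = dict(kernel_sizes=[3, 3], out_sizes=[(8, 8), (8, 8)], in_channels=3)

dense_cost = auxiliary_cost(open_gates, **shape)
cost = auxiliary_cost(gates, **shape)
loss = total_loss(task_loss=1.0, cost=cost, dense_cost=dense_cost, weight=0.1)
masks = [prune_mask(g) for g in gates]
```

Since the gates are only needed to decide which channels to keep, they can be discarded once the masks are applied, which matches the abstract's description of the gating mechanisms as temporary.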
Taxonomy
Methods: Pruning · Pointwise Convolution · Fire Module · Max Pooling · Batch Normalization · Depthwise Convolution · Depthwise Separable Convolution · Xavier Initialization · Residual Block
