Group Fisher Pruning for Practical Network Compression
Liyang Liu, Shilong Zhang, Zhanghui Kuang, Aojun Zhou, Jing-Hao Xue, Xinjiang Wang, Yimin Chen, Wenming Yang, Qingmin Liao, Wayne Zhang

TL;DR
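This paper introduces a general channel pruning method, based on Fisher information, that handles complicated network structures such as residual connections, group/depth-wise convolutions, and feature pyramid networks. Channel importance is normalized by memory reduction, which the authors find correlates better with GPU inference speedup than FLOPs reduction does; the reported result is faster inference without accuracy loss.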
Contribution
It proposes a layer grouping algorithm and a unified Fisher-based importance metric, enabling pruning of coupled channels in complex network architectures.
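To make the idea of coupled channels concrete, here is a minimal sketch of grouping layers whose output channels must be pruned together. It is not the paper's layer grouping algorithm, which this summary does not describe; it uses a plain union-find over a hand-written list of couplings. The layer names, the `couplings` list, and the function `group_coupled_layers` are all hypothetical.

```python
# Hypothetical sketch: union-find grouping of layers with coupled output channels.
# Layer names and the coupling list are invented for illustration; in practice the
# couplings would be derived from the network graph (e.g. two convolutions whose
# outputs are summed by a residual connection).

def group_coupled_layers(layers, couplings):
    """Return sorted groups of layer names that share pruned channel indices."""
    parent = {name: name for name in layers}

    def find(x):
        # Walk up to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in couplings:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for name in layers:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in groups.values())


layers = ["conv1", "conv2", "conv3", "conv4"]
couplings = [("conv1", "conv3")]  # assumed: outputs added by a skip connection
groups = group_coupled_layers(layers, couplings)
print(groups)
```

Every layer in a group would then share a single pruning decision per channel index, which is why a single importance score for the whole group is needed.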
Findings
Effective pruning of residual and grouped channels
Boosted inference speed without accuracy loss
Validated on multiple network architectures and tasks
Abstract
Network compression has been widely studied since it is able to reduce the memory and computation cost during inference. However, previous methods seldom deal with complicated structures like residual connections, group/depth-wise convolution and feature pyramid network, where channels of multiple layers are coupled and need to be pruned simultaneously. In this paper, we present a general channel pruning approach that can be applied to various complicated structures. Particularly, we propose a layer grouping algorithm to find coupled channels automatically. Then we derive a unified metric based on Fisher information to evaluate the importance of a single channel and coupled channels. Moreover, we find that inference speedup on GPUs is more correlated with the reduction of memory rather than FLOPs, and thus we employ the memory reduction of each channel to normalize the importance. Our…
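The abstract does not give the formula for the importance metric, so the following is only a sketch under stated assumptions. It assumes that a channel's importance is the squared gradient of the loss with respect to a per-channel mask (fixed at 1), accumulated over samples, that coupled channels share one mask so their mask gradients add before squaring, and that normalization is a simple division by the memory saved. The function names, array shapes, memory figures, and the absence of any constant scaling factor are all assumptions.

```python
import numpy as np

# Hypothetical sketch of a Fisher-style channel importance with memory normalization.
# acts and grads have shape (N, C, H, W): activations and the loss gradients w.r.t.
# those activations for N samples. All numbers below are invented.

def fisher_channel_importance(acts, grads):
    # Gradient w.r.t. a per-channel mask equals sum over space of activation * gradient.
    mask_grad = (acts * grads).sum(axis=(2, 3))   # shape (N, C)
    return (mask_grad ** 2).sum(axis=0)           # shape (C,)

def group_importance(layer_acts, layer_grads):
    # Coupled layers share one mask, so their mask gradients are summed per sample
    # before squaring. Spatial sizes may differ; the channel count must match.
    mask_grad = sum((a * g).sum(axis=(2, 3)) for a, g in zip(layer_acts, layer_grads))
    return (mask_grad ** 2).sum(axis=0)

def normalize_by_memory(importance, memory_saved):
    # Assumed normalization: importance per unit of memory freed by pruning the channel.
    return importance / memory_saved

acts = np.ones((2, 3, 2, 2))
grads = np.zeros((2, 3, 2, 2))
grads[:, 0], grads[:, 1], grads[:, 2] = 1.0, 0.5, 0.1

imp = fisher_channel_importance(acts, grads)
norm = normalize_by_memory(imp, np.array([4.0, 4.0, 1.0]))
prune_first = int(np.argmin(norm))  # channel with the lowest normalized importance
print(imp, norm, prune_first)
```

Under these assumptions, pruning would proceed by repeatedly removing the channel (or coupled channel group) with the lowest normalized score.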
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
Methods: Pruning · Pointwise Convolution · Depthwise Convolution · Residual Connection · Average Pooling · Grouped Convolution · Depthwise Separable Convolution · Kaiming Initialization · 1x1 Convolution
