Layer-wise Model Pruning based on Mutual Information
Chun Fan, Jiwei Li, Xiang Ao, Fei Wu, Yuxian Meng, Xiaofei Sun

TL;DR
This paper introduces a layer-wise model pruning method based on mutual information that improves speed and accuracy over traditional weight-based pruning by leveraging global training signals and dense representations.
Contribution
The paper proposes a layer-wise pruning strategy that uses mutual information to propagate training signals from the top layer down through the network, so each layer is pruned with respect to a global signal and the pruned model stays dense.
Findings
Achieves greater speedup than weight-based pruning methods at the same sparsity level.
Maintains higher performance than weight-based pruning methods at the same sparsity level.
Prunes top-down, from a global perspective driven by training signals in the top layer.
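The idea of scoring units against a top-layer signal can be illustrated with a minimal sketch. It is not the authors' estimator, which this summary does not describe. It assumes that class labels stand in for the top-layer training signal, that a simple histogram (plug-in) estimate of mutual information is adequate, and that only a single layer is scored. The names mutual_information and select_units are hypothetical.

```python
import numpy as np

def mutual_information(a, y, bins=8):
    """Plug-in MI estimate (in nats) between 1-D activations a and integer labels y.

    Assumption: activations are discretized into equal-width bins.
    """
    edges = np.histogram_bin_edges(a, bins=bins)
    a_bin = np.digitize(a, edges[1:-1])          # bin index in 0..bins-1
    n_classes = int(y.max()) + 1
    joint = np.zeros((bins, n_classes))
    np.add.at(joint, (a_bin, y), 1.0)            # joint histogram counts
    joint /= joint.sum()
    pa = joint.sum(axis=1, keepdims=True)        # marginal over activation bins
    py = joint.sum(axis=0, keepdims=True)        # marginal over labels
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log(joint[nz] / (pa @ py)[nz])))

def select_units(h, y, k, bins=8):
    """Keep the k hidden units whose activations share the most MI with y."""
    scores = np.array([mutual_information(h[:, j], y, bins)
                       for j in range(h.shape[1])])
    keep = np.sort(np.argsort(scores)[-k:])
    return keep, scores

# Toy data: 6 hidden units, of which only units 1 and 4 depend on the label.
rng = np.random.default_rng(0)
n = 2000
y = rng.integers(0, 2, size=n)
h = rng.normal(size=(n, 6))
h[:, 1] = y + 0.3 * rng.normal(size=n)
h[:, 4] = -2.0 * y + 0.3 * rng.normal(size=n)

keep, scores = select_units(h, y, k=2)
```

In a layer-wise, top-down setting, the units kept in one layer would presumably play the role of the target signal when scoring the layer below; that propagation step is left out here because the summary does not specify it.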
Abstract
The proposed pruning strategy offers two advantages over weight-based pruning techniques: (1) it avoids irregular memory access, since representations and matrices can be squeezed into smaller but dense counterparts, leading to greater speedup; (2) as a top-down method, it operates from a more global perspective based on training signals in the top layer, and prunes each layer by propagating the effect of those global signals through the layers, leading to better performance at the same sparsity level. Extensive experiments show that, at the same sparsity level, the proposed strategy offers both greater speedup and higher performance than weight-based pruning methods (e.g., magnitude pruning, movement pruning).
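Point (1) can be made concrete with a small sketch of what "squeezing" might look like for a two-layer feed-forward block. This is an illustration under stated assumptions, not the authors' implementation: the helper names squeeze_mlp and mlp are hypothetical, and the set of kept hidden units is a placeholder (every second unit) rather than a selection made by mutual information.

```python
import numpy as np

def squeeze_mlp(w1, b1, w2, keep):
    """Drop pruned hidden units and return smaller, dense parameters.

    w1: (d_in, d_hidden), b1: (d_hidden,), w2: (d_hidden, d_out)
    keep: indices of hidden units to retain (placeholder selection).
    """
    keep = np.asarray(keep)
    return w1[:, keep], b1[keep], w2[keep, :]

def mlp(x, w1, b1, w2):
    """Two-layer feed-forward block with a ReLU in between."""
    return np.maximum(x @ w1 + b1, 0.0) @ w2

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 8, 16, 4
w1 = rng.normal(size=(d_in, d_hidden))
b1 = rng.normal(size=d_hidden)
w2 = rng.normal(size=(d_hidden, d_out))
x = rng.normal(size=(5, d_in))

keep = np.arange(0, d_hidden, 2)  # assumption: keep every second hidden unit

# Masked variant: pruned units are zeroed, but all shapes stay full-size.
mask = np.zeros(d_hidden)
mask[keep] = 1.0
masked_out = (np.maximum(x @ w1 + b1, 0.0) * mask) @ w2

# Squeezed variant: same function, computed with smaller dense matrices.
sw1, sb1, sw2 = squeeze_mlp(w1, b1, w2, keep)
squeezed_out = mlp(x, sw1, sb1, sw2)
```

Both variants compute the same outputs, but the squeezed one uses ordinary dense matrix products on 8 hidden units instead of 16. Weight-level sparsity, by contrast, leaves the matrix shapes unchanged and scatters zeros through them, which is where irregular memory access comes from.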
Taxonomy
Methods: Pruning
