Layer-wise Model Pruning based on Mutual Information

Chun Fan; Jiwei Li; Xiang Ao; Fei Wu; Yuxian Meng; Xiaofei Sun

arXiv:2108.12594·cs.CL·August 31, 2021

TL;DR

This paper introduces a layer-wise model pruning method based on mutual information that improves speed and accuracy over traditional weight-based pruning by leveraging global training signals and dense representations.

Contribution

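It proposes a top-down, layer-wise pruning strategy that uses mutual information to propagate the training signals of the top layer down through the network, so that pruned representations and weight matrices can be squeezed into smaller dense counterparts.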

Findings

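01 Achieves greater speedup than weight-based pruning methods (e.g., magnitude pruning, movement pruning) at the same sparsity level.

02 Maintains higher performance than those methods at the same sparsity level.

03 Prunes top-down from a global perspective, using training signals in the top layer to decide what to prune in each lower layer.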

Abstract

The proposed pruning strategy offers merits over weight-based pruning techniques: (1) it avoids irregular memory access since representations and matrices can be squeezed into their smaller but dense counterparts, leading to greater speedup; (2) as a top-down pruning method, it operates from a more global perspective based on training signals in the top layer, and prunes each layer by propagating the effect of global signals through layers, leading to better performance at the same sparsity level. Extensive experiments show that at the same sparsity level, the proposed strategy offers both greater speedup and higher performance than weight-based pruning methods (e.g., magnitude pruning, movement pruning).
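The first merit, squeezing representations and matrices into smaller dense counterparts, can be illustrated with a minimal sketch. This is not the authors' implementation: the two-layer network, the helper names (`matvec`, `relu`, `squeeze_layers`, `masked_forward`, `squeezed_forward`), and the hard-coded `keep` indices are all assumptions made for illustration. In the paper the kept dimensions would be chosen by the mutual-information criterion, which this page does not spell out, so the sketch only shows why dropping whole hidden dimensions gives a dense model whose output matches the masked one.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relu(v):
    return [max(0.0, a) for a in v]


def squeeze_layers(W1, W2, keep):
    """Drop pruned hidden dimensions: rows of W1 and the matching columns of W2.

    `keep` is a hypothetical list of hidden-unit indices to retain; how it is
    chosen (e.g. by a mutual-information score) is outside this sketch.
    """
    W1_small = [W1[i] for i in keep]
    W2_small = [[row[i] for i in keep] for row in W2]
    return W1_small, W2_small


def masked_forward(W1, W2, x, keep):
    """Full-size forward pass with pruned hidden units zeroed out."""
    kept = set(keep)
    h = relu(matvec(W1, x))
    h = [a if i in kept else 0.0 for i, a in enumerate(h)]
    return matvec(W2, h)


def squeezed_forward(W1_small, W2_small, x):
    """Forward pass through the smaller dense matrices."""
    return matvec(W2_small, relu(matvec(W1_small, x)))


# Toy network: 3 inputs -> 4 hidden units -> 2 outputs (values are arbitrary).
W1 = [[1.0, -2.0, 0.5], [0.0, 1.0, 1.0], [2.0, 0.0, -1.0], [-1.0, 1.0, 0.0]]
W2 = [[1.0, 0.0, 2.0, -1.0], [0.5, 1.0, 0.0, 2.0]]
x = [1.0, 2.0, 3.0]
keep = [1, 3]  # assumed choice of retained hidden units

W1_small, W2_small = squeeze_layers(W1, W2, keep)
masked_out = masked_forward(W1, W2, x, keep)
dense_out = squeezed_forward(W1_small, W2_small, x)
print(masked_out, dense_out)
```

Because entire hidden dimensions are removed, the squeezed matrices stay dense and can use ordinary matrix multiplication. Weight-based pruning instead zeroes individual entries, which leaves the matrix shapes unchanged and requires sparse indexing to gain any speed.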

Taxonomy

Methods: Pruning