MultiPruner: Balanced Structure Removal in Foundation Models
J. Pablo Muñoz, Jinjie Yuan, Nilesh Jain

TL;DR
MultiPruner introduces a multidimensional, iterative pruning method for large pre-trained models that balances structure removal across residual blocks, channels, and attention heads, leading to improved compression and accuracy.
Contribution
It extends BlockPruner (Zhong et al., 2024), a training-free block-pruning method, with a balanced, fine-grained, multidimensional pruning strategy that improves both zero-shot accuracy and model compression.
Findings
Outperforms recent training-free pruning methods in accuracy.
Achieves higher compression ratios with fewer resources.
Demonstrates effectiveness across various large models.
Abstract
Recently, state-of-the-art approaches for pruning large pre-trained models (LPMs) have demonstrated that the training-free removal of non-critical residual blocks in Transformers is viable for reducing model size, achieving results that outperform previous training-free pruning approaches. Motivated by these findings, we extend BlockPruner (Zhong et al., 2024) and propose MultiPruner, a pruning approach that surpasses recent training-free pruning methods by adopting a multidimensional, iterative, fine-grained pruning strategy. In MultiPruner, multidimensional pruning reinstates the structural balance in block-pruned models by sequentially compressing along three dimensions: i) residual blocks, ii) channels of multilayer perceptrons (MLP), and iii) attention heads. This solution enhances zero-shot accuracy on downstream tasks compared to other techniques while improving model compression…
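The sequential, three-dimensional strategy described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the unit names, parameter counts, importance scores, per-dimension budgets, and the additive `proxy_loss` are all hypothetical stand-ins, and a real pruner would score each candidate removal by evaluating the model on calibration data. The sketch only shows the control flow: greedy, one-unit-at-a-time removal, applied first to residual blocks, then to MLP channels, then to attention heads.

```python
from typing import Callable, Dict, List, Set, Tuple

Unit = Tuple[str, int]  # (hypothetical unit name, parameter count)


def greedy_prune(units: List[Unit], kept: Set[str], budget: int,
                 loss_fn: Callable[[Set[str]], float]) -> int:
    """Remove units one at a time, always picking the removal that yields the
    lowest proxy loss, until at least `budget` parameters are removed.
    Mutates `kept` and returns the number of parameters removed."""
    removed = 0
    while removed < budget:
        candidates = [u for u in units if u[0] in kept]
        if not candidates:
            break
        name, size = min(candidates, key=lambda u: loss_fn(kept - {u[0]}))
        kept.discard(name)
        removed += size
    return removed


def multi_dim_prune(dims: Dict[str, List[Unit]], budgets: Dict[str, int],
                    loss_fn: Callable[[Set[str]], float]) -> Set[str]:
    """Prune sequentially along three dimensions, in the order given in the abstract."""
    kept = {name for units in dims.values() for name, _ in units}
    for dim in ("blocks", "mlp_channels", "heads"):
        greedy_prune(dims[dim], kept, budgets[dim], loss_fn)
    return kept


# Toy data (assumed for illustration only).
DIMS: Dict[str, List[Unit]] = {
    "blocks": [("block0", 100), ("block1", 100), ("block2", 100)],
    "mlp_channels": [("mlp_ch0", 10), ("mlp_ch1", 10), ("mlp_ch2", 10), ("mlp_ch3", 10)],
    "heads": [("head0", 20), ("head1", 20)],
}
IMPORTANCE = {
    "block0": 5.0, "block1": 0.5, "block2": 3.0,
    "mlp_ch0": 0.1, "mlp_ch1": 0.9, "mlp_ch2": 0.2, "mlp_ch3": 0.8,
    "head0": 0.3, "head1": 1.0,
}
# Hypothetical split of the total pruning budget across the three dimensions.
BUDGETS = {"blocks": 100, "mlp_channels": 20, "heads": 20}


def proxy_loss(kept: Set[str]) -> float:
    """Stand-in for a calibration-set metric: total importance of removed units."""
    return sum(score for name, score in IMPORTANCE.items() if name not in kept)


kept = multi_dim_prune(DIMS, BUDGETS, proxy_loss)
print(sorted(kept))
```

With these toy scores, the least important block is removed first, and the remaining budget is then spent on the least important MLP channels and attention heads. Splitting the budget across dimensions, rather than spending all of it on blocks, is the "balance" the abstract refers to; the particular split used here is an arbitrary assumption.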
Taxonomy
Topics: Drilling and Well Engineering · Infrastructure Maintenance and Monitoring · Grouting, Rheology, and Soil Mechanics
Methods: Softmax · Attention Is All You Need · Pruning
