Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models
Chuan Sun, Han Yu, Lizhen Cui, Xiaoxiao Li

TL;DR
This paper introduces a Shapley Value-based non-uniform pruning method for large language models that assigns each layer its own pruning level based on its contribution to model performance, and reports lower perplexity than uniform pruning along with a cheaper Shapley Value approximation.
Contribution
The paper proposes SV-NUP, a novel non-uniform pruning approach utilizing Shapley Values to quantify layer importance and optimize pruning across large language models.
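For reference, the standard Shapley Value of a single layer can be written as below. This is the textbook definition, not a formula quoted from the paper: the symbols $N$ (the set of $n$ transformer layers), $S$ (a subset of layers kept active), and $v$ (a performance score such as negative perplexity) are assumptions made here for illustration.

```latex
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,\bigl(n - |S| - 1\bigr)!}{n!}
  \Bigl[\, v\bigl(S \cup \{i\}\bigr) - v(S) \,\Bigr]
```

The sum ranges over $2^{n-1}$ subsets per layer, which is why an exact computation becomes impractical as the number of layers grows and an approximation is needed.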
Findings
Significant perplexity reduction on LLaMA models.
Non-uniform pruning outperforms uniform methods.
Efficient approximation reduces computational overhead.
Abstract
Pruning large language models (LLMs) is a promising way to reduce model size and computational complexity while preserving performance. Traditional layer-wise pruning methods often apply the same sparsity to every layer, which leads to suboptimal performance because it ignores the varying significance of individual transformer layers. To this end, we propose the Shapley Value-based Non-Uniform Pruning (SV-NUP) method for LLMs. This approach quantifies the contribution of each transformer layer to overall model performance, so that each layer can be assigned a tailored pruning budget that retains its critical parameters. To further improve efficiency, we design a Sliding Window-based Shapley Value approximation method, which substantially reduces computational overhead compared to exact Shapley Value calculation. Extensive…
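The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify how the sliding window is built or how scores map to budgets, so the sketch assumes that each layer's Shapley Value is computed only among its neighbours within a fixed radius (all other layers stay active), and that sparsity is shifted linearly around the target ratio. All names (`shapley_in_group`, `sliding_window_shapley`, `allocate_sparsity`, `toy_value`) are hypothetical, and the additive toy value function stands in for a real evaluation such as negative perplexity.

```python
from itertools import combinations
from math import factorial


def shapley_in_group(value, group, layer, baseline):
    """Exact Shapley value of `layer` among the players in `group`.

    Layers in `baseline` are always active; only `group` members are toggled.
    """
    others = [p for p in group if p != layer]
    n = len(group)
    total = 0.0
    for k in range(len(others) + 1):
        weight = factorial(k) * factorial(n - k - 1) / factorial(n)
        for subset in combinations(others, k):
            active = baseline | frozenset(subset)
            total += weight * (value(active | {layer}) - value(active))
    return total


def sliding_window_shapley(value, num_layers, radius):
    """Assumed approximation: score each layer using only nearby layers as players."""
    scores = []
    for i in range(num_layers):
        lo, hi = max(0, i - radius), min(num_layers, i + radius + 1)
        window = list(range(lo, hi))
        baseline = frozenset(range(num_layers)) - frozenset(window)
        scores.append(shapley_in_group(value, window, i, baseline))
    return scores


def allocate_sparsity(scores, target, spread=0.1):
    """Map importance scores to per-layer sparsity whose mean equals `target`."""
    mean = sum(scores) / len(scores)
    span = max(abs(s - mean) for s in scores) or 1.0
    # Layers above mean importance get less sparsity; deviations sum to zero.
    return [target - spread * (s - mean) / span for s in scores]


# Toy stand-in for model quality: each active layer adds a fixed amount.
weights = [0.5, 0.1, 0.1, 0.3]


def toy_value(active):
    return sum(weights[i] for i in active)


scores = sliding_window_shapley(toy_value, num_layers=4, radius=1)
ratios = allocate_sparsity(scores, target=0.5)
print(scores, ratios)
```

Because the toy value function is additive, the windowed scores recover the per-layer weights, and the most important layer receives the lowest sparsity. With a real model, each call to the value function would be a forward evaluation, so the window size directly controls the cost: each layer needs on the order of 2^(window size) evaluations instead of 2^(number of layers).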
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems
Methods: ADaptive gradient method with the OPTimal convergence rate · Pruning · OPT
