Large Language Model Pruning

Hanjuan Huang (1)(2), Hao-Jia Song (1), Hsing-Kuo Pao (1) ((1) Dept. of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan; (2) College of Mechanical and Electrical Engineering, Wuyi University, Wuyishan, China)

arXiv:2406.00030 · cs.CL · June 4, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a theoretically grounded neuron pruning method for large language models that enhances explainability and reduces model complexity without sacrificing performance.

Contribution

It proposes a mutual information-based pruning technique with a theoretical foundation, specifically tailored for large-scale language models, and explores differences in pruning criteria between small and large models.
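
To make the idea of mutual information-based neuron selection concrete, here is a minimal sketch. It is not the authors' method: the histogram estimator, the choice to score each neuron against class labels, and the names `mutual_information`, `select_neurons`, `keep_ratio`, and `bins` are all assumptions made for illustration.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X; Y) in nats.

    x: 1-D array of continuous activations for one neuron.
    y: 1-D array of discrete labels.
    Assumption: equal-width binning; the paper may use a different estimator.
    """
    # Discretize activations into `bins` equal-width bins (indices 0..bins-1).
    edges = np.histogram_bin_edges(x, bins=bins)
    x_idx = np.digitize(x, edges[1:-1])

    # Map labels to contiguous indices 0..n_classes-1.
    classes, y_idx = np.unique(y, return_inverse=True)
    y_idx = y_idx.ravel()

    # Build the joint distribution p(x_bin, y) from co-occurrence counts.
    joint = np.zeros((bins, len(classes)))
    np.add.at(joint, (x_idx, y_idx), 1.0)
    joint /= joint.sum()

    # Marginals and their outer product p(x_bin) * p(y).
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    indep = px @ py

    # Sum p * log(p / (px * py)) over cells with nonzero probability.
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))


def select_neurons(activations, labels, keep_ratio=0.5, bins=8):
    """Score each neuron (column) by estimated MI with the labels and
    return the sorted indices of the highest-scoring neurons to keep."""
    n_neurons = activations.shape[1]
    scores = np.array(
        [mutual_information(activations[:, j], labels, bins) for j in range(n_neurons)]
    )
    n_keep = max(1, int(round(keep_ratio * n_neurons)))
    keep = np.sort(np.argsort(scores)[::-1][:n_keep])
    return keep, scores
```

In this sketch, `activations` would be a samples-by-neurons matrix collected from one layer, and the columns outside the returned index set are the candidates for removal. Ranking by per-neuron relevance alone ignores redundancy between neurons, which a fuller criterion would likely need to account for.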

Findings

01. Pruning criteria are less sensitive in large models.

02. The proposed method outperforms state-of-the-art pruning techniques.

03. The approach improves model explainability and efficiency.

Abstract

Over the last couple of years, "the larger the better" models have delivered superior performance, as both hardware and software have matured enough to support such extremely large models. Their application areas include text mining, among others. In particular, the success of LLMs in text understanding and text generation has drawn attention from researchers who have worked on NLP and related areas for years or even decades. On the other hand, LLMs may suffer from problems such as overfitting, hallucination, and device limitations, to name a few. In this work, we propose a model pruning technique specifically focused on LLMs. The proposed methodology emphasizes the explainability of deep learning models. With a theoretical foundation, we obtain a trustworthy deep model, so that huge models with a massive number of parameters become less necessary. A mutual information-based estimation…

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech and dialogue systems

Methods: Pruning