Chip-Tuning: Classify Before Language Models Say

Fangwei Zhu; Dian Li; Jiajun Huang; Gang Liu; Hui Wang; Zhifang Sui

arXiv:2410.06541 · cs.CL · October 14, 2024

TL;DR

This paper introduces chip-tuning, a structured pruning method for large language models on classification problems. Probing classifiers attached to intermediate layers allow the later layers to be removed, reaching a pruning ratio of up to 50% with minimal performance loss.

Contribution

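The paper proposes a pruning framework that attaches tiny probing classifiers, called chips, to model layers and trains them with the backbone frozen. Once a chip is selected, the layers after it can be removed while largely maintaining accuracy.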

Findings

01 Chip-tuning achieves a pruning ratio of up to 50%.

02 It outperforms previous methods in accuracy and efficiency.

03 It is applicable to multimodal models and compatible with finetuning.

Abstract

The rapid development in the performance of large language models (LLMs) is accompanied by the escalation of model size, leading to the increasing cost of model training and inference. Previous research has discovered that certain layers in LLMs exhibit redundancy, and removing these layers brings only marginal loss in model performance. In this paper, we adopt the probing technique to explain the layer redundancy in LLMs and demonstrate that language models can be effectively pruned with probing classifiers. We propose chip-tuning, a simple and effective structured pruning framework specialized for classification problems. Chip-tuning attaches tiny probing classifiers named chips to different layers of LLMs, and trains chips with the backbone model frozen. After selecting a chip for classification, all layers subsequent to the attached layer could be removed with marginal performance…
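The procedure described in the abstract (attach a chip to a layer, train it with the backbone frozen, then drop every later layer) can be illustrated with a toy sketch. Everything below is a hypothetical stand-in rather than the authors' implementation: the "backbone" is a stack of fixed random tanh layers instead of an LLM, the chip is assumed to be a plain logistic-regression probe, and the names `make_frozen_backbone`, `hidden_state`, `Chip`, and `CHIP_LAYER` are invented for this example.

```python
import math
import random


def make_frozen_backbone(num_layers, dim, seed=0):
    """Toy stand-in for a frozen LLM: a list of fixed random weight matrices."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(dim)
    return [[[rng.gauss(0.0, scale) for _ in range(dim)] for _ in range(dim)]
            for _ in range(num_layers)]


def hidden_state(layers, x, upto):
    """Run only the first `upto` layers and return the resulting hidden state."""
    h = list(x)
    for weights in layers[:upto]:
        h = [math.tanh(sum(w * v for w, v in zip(row, h))) for row in weights]
    return h


class Chip:
    """Assumed chip design: a tiny logistic-regression probe on one layer's output."""

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def prob(self, h):
        z = sum(w * v for w, v in zip(self.w, h)) + self.b
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def train(self, feats, labels, lr=0.1, epochs=30):
        # Plain SGD on the logistic loss; only the chip's parameters change.
        for _ in range(epochs):
            for h, y in zip(feats, labels):
                grad = self.prob(h) - y
                self.w = [w - lr * grad * v for w, v in zip(self.w, h)]
                self.b -= lr * grad


DIM, NUM_LAYERS, CHIP_LAYER = 8, 4, 2  # hypothetical sizes
rng = random.Random(1)
backbone = make_frozen_backbone(NUM_LAYERS, DIM)

# Synthetic binary classification data (not from the paper).
inputs = [[rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(200)]
labels = [1 if x[0] > 0 else 0 for x in inputs]

# 1. Extract hidden states at the chosen layer; the backbone is never updated.
feats = [hidden_state(backbone, x, CHIP_LAYER) for x in inputs]

# 2. Train the chip on those hidden states.
chip = Chip(DIM)
chip.train(feats, labels)

# 3. Prune: keep only the layers up to and including the chip's layer.
pruned = backbone[:CHIP_LAYER]


def classify(x):
    """Classify with the pruned model: truncated backbone followed by the chip."""
    return int(chip.prob(hidden_state(pruned, x, len(pruned))) >= 0.5)


acc = sum(classify(x) == y for x, y in zip(inputs, labels)) / len(inputs)
print(f"kept {len(pruned)}/{len(backbone)} layers, train accuracy {acc:.2f}")
```

In this sketch, attaching the chip halfway through a four-layer stack removes half of the layers. How the paper selects which chip to keep, and how its chips are actually parameterized, is not specified in the text above, so those choices here are assumptions.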

Code & Models

Repositories

qq-mm/chiptuning (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Pruning