Restoring Pruned Large Language Models via Lost Component Compensation

Zijian Feng; Hanzhang Zhou; Zixiao Zhu; Tianjiao Li; Jia Jim Deryl Chua; Lee Onn Mak; Gee Wah Ng; Kezhi Mao

arXiv:2510.21834 · cs.LG · October 28, 2025

TL;DR

This paper introduces RestoreLCC, a method that restores pruned large language models by selectively reintroducing information lost from attention heads, recovering performance while preserving the pruned model's sparsity and inference efficiency.

Contribution

The paper proposes a targeted, plug-and-play restoration technique that uses differences in attention activations to recover a pruned model's performance more effectively than existing restoration methods.

Findings

01. RestoreLCC outperforms state-of-the-art baselines in performance recovery.

02. The method is compatible with various pruning schemes.

03. RestoreLCC maintains model sparsity and inference efficiency.

Abstract

Pruning is a widely used technique to reduce the size and inference cost of large language models (LLMs), but it often causes performance degradation. To mitigate this, existing restoration methods typically employ parameter-efficient fine-tuning (PEFT), such as LoRA, to recover the pruned model's performance. However, most PEFT methods are designed for dense models and overlook the distinct properties of pruned models, often resulting in suboptimal recovery. In this work, we propose a targeted restoration strategy for pruned models that restores performance while preserving their low cost and high efficiency. We observe that pruning-induced information loss is reflected in attention activations, and selectively reintroducing components of this information can significantly recover model performance. Based on this insight, we introduce RestoreLCC (Restoring Pruned LLMs via Lost…
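The observation in the abstract, that pruning-induced loss shows up in attention activations and can be selectively reintroduced, can be illustrated with a toy sketch. This is not the paper's procedure, which the truncated abstract does not spell out. It assumes, purely for illustration, that the lost component of a head is the mean difference between dense and pruned activations over a few calibration samples, that heads are selected by the norm of that difference, and that compensation is a simple additive vector. All function names and the toy data are hypothetical.

```python
from math import sqrt

def lost_component(dense_acts, pruned_acts):
    """Mean of (dense - pruned) activations over calibration samples.

    Assumption for illustration: the 'lost' information of a head is
    summarized by this single mean difference vector.
    """
    diffs = [[d - p for d, p in zip(dv, pv)]
             for dv, pv in zip(dense_acts, pruned_acts)]
    n = len(diffs)
    return [sum(col) / n for col in zip(*diffs)]

def select_heads(components, k):
    """Pick the k heads whose lost component has the largest L2 norm."""
    norms = {h: sqrt(sum(x * x for x in v)) for h, v in components.items()}
    return sorted(norms, key=norms.get, reverse=True)[:k]

def compensate(pruned_act, component, scale=1.0):
    """Add the (scaled) lost component back to a pruned head activation."""
    return [p + scale * c for p, c in zip(pruned_act, component)]

# Toy data (hypothetical): two heads, two calibration samples, 3 dimensions.
dense = {
    "head0": [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0]],
    "head1": [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5]],
}
pruned = {
    "head0": [[0.0, 2.0, 1.0], [0.0, 2.0, 1.0]],  # consistent loss
    "head1": [[0.5, 0.4, 0.5], [0.5, 0.6, 0.5]],  # noise that averages out
}

components = {h: lost_component(dense[h], pruned[h]) for h in dense}
chosen = select_heads(components, k=1)            # only the most affected head
restored = compensate(pruned["head0"][0], components["head0"])
```

Because the compensation here is one added vector per selected head, the pruned weights stay untouched, which is the property the summary describes as maintaining sparsity. How RestoreLCC actually identifies and injects the components is left to the paper itself.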
