DipSVD: Dual-importance Protected SVD for Efficient LLM Compression
Xuan Ding, Rui Sun, Yunjian Zhang, Xiu Yan, Yueqi Zhou, Kaihao Huang, Suzhong Fu, Chuanlong Xie, Yao Zhu

TL;DR
DipSVD introduces a dual-importance protection mechanism for SVD-based LLM compression, focusing on preserving critical components and balancing layer importance to improve performance at high compression ratios.
Contribution
The paper proposes a novel dual-level importance protection mechanism for SVD-based LLM compression, enhancing preservation of critical components and layer importance balancing.
Findings
Outperforms existing SVD-based methods across multiple benchmarks.
Achieves superior performance at high compression ratios.
Effectively preserves critical singular vectors and layer importance.
Abstract
The ever-increasing computational demands and deployment costs of large language models (LLMs) have spurred numerous compression methods. Compared to quantization and unstructured pruning, SVD compression offers superior hardware compatibility and theoretical guarantees. However, existing SVD-based methods focus on the overall discrepancy between the original and compressed matrices while overlooking the protection of critical components within the matrix, which leads to inferior performance in the compressed models. This paper proposes a dual-level importance protection mechanism to enhance SVD-based compression methods: (1) local importance protection: preserving the most critical singular vectors within each weight matrix through channel-weighted data whitening; and (2) global importance protection: enabling less important layers to bear a greater portion of the compression burden…
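The two protection levels named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how channel weights or layer importance scores are computed, so `channel_weights` and `importance` are hypothetical inputs here, and the function names `whitened_truncated_svd` and `allocate_keep_ratios` are invented for illustration. The local step assumes a Cholesky-based whitening of (weighted) calibration activations before truncating the SVD; the global step assumes keep ratios simply scale with importance.

```python
import numpy as np


def whitened_truncated_svd(W, X, rank, channel_weights=None, eps=1e-6):
    """Low-rank factors (A, B) with A @ B approximating W on activations X.

    W: (d_out, d_in) weight matrix.
    X: (d_in, n_samples) calibration activations.
    channel_weights: hypothetical per-input-channel importance, shape (d_in,).
    """
    d_in = W.shape[1]
    if channel_weights is None:
        channel_weights = np.ones(d_in)
    # Assumption: "channel-weighted" means scaling activation channels
    # before whitening, so important channels dominate the error metric.
    Xw = channel_weights[:, None] * X
    gram = Xw @ Xw.T + eps * np.eye(d_in)
    S = np.linalg.cholesky(gram)  # lower triangular, S @ S.T == gram
    # Truncating the SVD of W @ S minimizes the error measured on Xw.
    U, sigma, Vt = np.linalg.svd(W @ S, full_matrices=False)
    A = U[:, :rank] * sigma[:rank]             # (d_out, rank)
    B = np.linalg.solve(S.T, Vt[:rank].T).T    # (rank, d_in) == Vt_k @ inv(S)
    return A, B


def allocate_keep_ratios(importance, target_keep, lo=0.05, hi=1.0):
    """Assumed global rule: more important layers keep a larger share.

    Ratios are proportional to importance and average to target_keep
    unless clipping to [lo, hi] takes effect.
    """
    importance = np.asarray(importance, dtype=float)
    ratios = target_keep * importance / importance.mean()
    return np.clip(ratios, lo, hi)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.standard_normal((8, 6))
    X = rng.standard_normal((6, 50))
    A, B = whitened_truncated_svd(W, X, rank=3)
    print(A.shape, B.shape)
    print(allocate_keep_ratios([1.0, 2.0, 3.0], target_keep=0.5))
```

Storing `A` and `B` in place of `W` saves parameters whenever `rank * (d_out + d_in) < d_out * d_in`, which is why less important layers can be given a smaller rank under a fixed overall budget.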
Taxonomy
Topics: Advanced Data Compression Techniques · Advanced Wireless Communication Techniques · Algorithms and Data Compression
