SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

Xin Wang; Yu Zheng; Zhongwei Wan; Mi Zhang

arXiv:2403.07378 · cs.CL · March 18, 2025 · 1 citation

TL;DR

SVD-LLM introduces a truncation-aware SVD-based compression method for large language models that maintains accuracy at high compression ratios by incorporating data whitening and parameter updates.

Contribution

The paper proposes a novel SVD-based post-training compression technique that combines truncation-aware data whitening with sequential low-rank approximation to improve LLM compression.

Findings

01 Outperforms existing methods at high compression ratios

02 Effective across multiple datasets and LLM models

03 Reduces accuracy loss through parameter updates

Abstract

The advancements in Large Language Models (LLMs) have been hindered by their substantial size, which necessitates LLM compression methods for practical deployment. Singular Value Decomposition (SVD) offers a promising solution for LLM compression. However, state-of-the-art SVD-based LLM compression methods have two key limitations: truncating smaller singular values may lead to higher compression loss, and the compressed weights are not updated after SVD truncation. In this work, we propose SVD-LLM, an SVD-based post-training LLM compression method that addresses the limitations of existing methods. SVD-LLM incorporates a truncation-aware data whitening technique to ensure a direct mapping between singular values and compression loss. Moreover, SVD-LLM adopts a parameter update with sequential low-rank approximation to compensate for the accuracy degradation after SVD compression.…
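
The "direct mapping between singular values and compression loss" can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes the whitening matrix S is the Cholesky factor of the calibration activations' Gram matrix (S S^T = X X^T), and the function name, matrix shapes, and random data are hypothetical. Under that assumption, truncating the SVD of W S yields an output error on X equal to the root sum of squares of the dropped singular values.

```python
import numpy as np

def whitened_truncated_svd(W, X, rank):
    """Illustrative sketch (hypothetical helper, not the paper's code).

    W: (out_dim, in_dim) weight matrix
    X: (in_dim, n_samples) calibration activations
    Returns low-rank factors A (out_dim, rank) and B (rank, in_dim),
    plus all singular values of the whitened weight W @ S.
    """
    # Gram matrix of the activations. Assumed well conditioned here;
    # an ill-conditioned matrix would need regularization first.
    G = X @ X.T
    # Assumption: whitening matrix S is the Cholesky factor, S @ S.T == G.
    S = np.linalg.cholesky(G)
    # SVD of the whitened weight rather than of W itself.
    U, sigma, Vt = np.linalg.svd(W @ S, full_matrices=False)
    # Keep the top-`rank` components and split sqrt(sigma) between factors.
    root = np.sqrt(sigma[:rank])
    A = U[:, :rank] * root
    # Undo the whitening on the right factor so that A @ B approximates W.
    B = (root[:, None] * Vt[:rank]) @ np.linalg.inv(S)
    return A, B, sigma

# Toy data (arbitrary sizes chosen for illustration only).
rng = np.random.default_rng(0)
W = rng.standard_normal((16, 12))
X = rng.standard_normal((12, 200))
rank = 4

A, B, sigma = whitened_truncated_svd(W, X, rank)

# Actual output error on the calibration data.
loss = np.linalg.norm(W @ X - A @ B @ X)
# Error predicted from the truncated singular values alone.
predicted = np.sqrt(np.sum(sigma[rank:] ** 2))
print(np.isclose(loss, predicted))
```

Storing A and B in place of W replaces out_dim × in_dim parameters with rank × (out_dim + in_dim), which is where the compression comes from in this sketch. The parameter update with sequential low-rank approximation mentioned in the abstract is not shown.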

Code & Models

Repositories

aiot-mlsys-lab/svd-llm (PyTorch, official)

Taxonomy

Topics: Natural Language Processing Techniques · Speech Recognition and Synthesis · Algorithms and Data Compression