AdaSVD: Adaptive Singular Value Decomposition for Large Language Models
Zhiteng Li, Mingyuan Xia, Jingyuan Zhang, Zheng Hui, Haotong Qin, Linghe Kong, Yulun Zhang, Xiaokang Yang

TL;DR
AdaSVD is an adaptive SVD-based compression method for large language models that compensates for SVD truncation errors and assigns layer-specific compression ratios, improving accuracy and memory efficiency over prior SVD-based methods.
Contribution
AdaSVD introduces adaptive error compensation and layer-wise compression ratios, improving upon existing SVD-based LLM compression techniques.
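To make the idea of layer-wise compression ratios concrete, here is a minimal sketch of a non-uniform allocation. It is not the paper's adaCR rule: the function name `allocate_ratios`, the `spread` parameter, and the linear mapping from importance to ratio are all assumptions made for illustration. The only property borrowed from the text is that more important layers should be compressed less than a uniform ratio would compress them.

```python
import numpy as np

def allocate_ratios(importance, target, spread=0.1):
    """Hypothetical allocation: shift each layer's compression ratio
    around `target` in proportion to its centered importance score.

    importance: one score per layer (higher = more important).
    target:     the average compression ratio to aim for (0..1).
    spread:     maximum deviation from `target` (an assumed knob).
    """
    imp = np.asarray(importance, dtype=float)
    centered = imp - imp.mean()
    scale = np.abs(centered).max()
    if scale == 0.0:
        # All layers equally important: fall back to a uniform ratio.
        return np.full(imp.shape, float(target))
    # Important layers get a lower ratio (less compression).
    ratios = target - spread * centered / scale
    # The mean stays equal to `target` as long as nothing is clipped here.
    return np.clip(ratios, 0.0, 1.0)

# Three layers with illustrative (made-up) importance scores.
ratios = allocate_ratios([0.9, 0.5, 0.1], target=0.5)
```

In this toy setup the most important layer is compressed least and the least important layer most, while the average ratio stays at the target.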
Findings
Outperforms state-of-the-art SVD-based methods across multiple models.
Achieves significant memory reduction with minimal performance loss.
Demonstrates effectiveness across various LLM and VLM benchmarks.
Abstract
Large language models (LLMs) have achieved remarkable success in natural language processing (NLP) tasks, yet their substantial memory requirements present significant challenges for deployment on resource-constrained devices. Singular Value Decomposition (SVD) has emerged as a promising compression technique for LLMs, offering considerable reductions in memory overhead. However, existing SVD-based methods often struggle to effectively mitigate the errors introduced by SVD truncation, leading to a noticeable performance gap when compared to the original models. Furthermore, applying a uniform compression ratio across all transformer layers fails to account for the varying importance of different layers. To address these challenges, we propose AdaSVD, an adaptive SVD-based LLM compression approach. Specifically, AdaSVD introduces adaComp, which adaptively compensates for SVD truncation…
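As a rough illustration of the SVD truncation the abstract refers to, the sketch below factors a weight matrix into two low-rank factors and measures the resulting error. It shows plain truncated SVD only, not adaComp or any other part of AdaSVD. The helper names `rank_for_ratio` and `truncated_svd` are hypothetical, and the definition of compression ratio as the fraction of parameters removed is an assumption.

```python
import numpy as np

def rank_for_ratio(m, n, ratio):
    """Largest rank k whose two factors (m x k and k x n) hold at most
    (1 - ratio) of the original m * n parameters.
    Assumes `ratio` means the fraction of parameters removed."""
    k = int((1.0 - ratio) * m * n / (m + n))
    return max(1, min(k, min(m, n)))

def truncated_svd(W, k):
    """Return factors A (m x k) and B (k x n) such that A @ B is the
    best rank-k approximation of W in the Frobenius norm."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :k] * S[:k]   # fold the singular values into the left factor
    B = Vt[:k, :]
    return A, B

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))      # stand-in for one weight matrix
k = rank_for_ratio(64, 48, ratio=0.5)
A, B = truncated_svd(W, k)

# Truncation error: equals the norm of the discarded singular values.
err = np.linalg.norm(W - A @ B)
discarded = np.linalg.svd(W, compute_uv=False)[k:]
```

Storing `A` and `B` instead of `W` is where the memory saving comes from; the error `err` is the quantity that compensation methods try to reduce after truncation.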
Peer Reviews
Decision: ICLR 2026 Conference Withdrawn Submission
1. The proposed method achieves state-of-the-art performance among SVD-based compression techniques across multiple model architectures and evaluation metrics. 2. It demonstrates practical engineering contributions, including the introduction of a stack-of-batch technique that improves memory manipulation efficiency during compression and inference.
1. The methodological novelty appears limited, offering relatively few new insights to the field. The centered compression proximal objective (Eq. 5) is already well-established in prior literature and has a known closed-form optimal solution. Empirically solving it through iterative optimization lacks theoretical justification, and it remains unclear why a suboptimal solution yields better empirical performance. Additionally, the layer-wise compression using importance scores has been explored
- Clearly presents two practical improvements to SVD: post-truncation adjustment (adaComp) and layer-wise ratio allocation (adaCR). The SoB description is straightforward and easy to understand. - Provides a thorough comparison with several SVD baselines (SVD/FWSVD/ASVD/SVD-LLM (v1)) across multiple ratios and LM/VLM benchmarks. - Implementation details including whitening and a 256-sample calibration set are provided.
- Limited novelty. Both components mostly build on existing ideas: (a) updating low-rank factors using calibration data is a standard post-truncation method, and (b) non-uniform, layer-wise rank allocation has been studied before (e.g., SVD-LLM, SVD-LLM (v2)). The “importance” metric is just a simple similarity measure without deeper theoretical justification. - Narrow scope of contribution. The paper focuses on improving rank allocation and post-truncation tuning within the SVD pipeline, rather
1. The proposed adaComp makes the optimization process more stable. 2. The proposed method is evaluated on both LLM and VLM, which shows its generality.
1. Lack of evaluation on modern LLMs. All experimental results are based on older models such as LLaMA2-7B and Mistral-7B. The study should include evaluations on more recent models, such as the LLaMA3 and Qwen series, to strengthen its relevance and generalizability. 2. The authors claim that their method outperforms previous approaches across compression ratios ranging from 40% to 80%. However, the presented results only cover the range from 40% to 60%, leaving the higher ratios unverified. 3.
1. This paper is well-written and easy to follow. The careful illustrations and text significantly help readers understand the idea of the paper. 2. The experiments are extensive and comprehensive. They cover multiple LLMs from different LLM families as well as different downstream tasks. 3. This paper presents in-depth analysis of the proposed method, and the overall assessment is good, providing convincing evidence to demonstrate the superiority of the proposed method
1. Lack of novelty and a seemingly incremental contribution. *AdaComp* is almost the same as the early version of SVD-LLM (https://arxiv.org/pdf/2403.07378v1), which also adopts a closed-form update to the decomposed matrix. Additionally, the compression ratio allocation strategy in *AdaCR* is not new. This naive compression ratio allocation strategy appears in many submissions to previous conferences. It originally comes from *Outlier Weighed Layerwise Sparsity (OWL): A Missing Secr
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
