Zero Sum SVD: Balancing Loss Sensitivity for Low Rank LLM Compression
Ali Abbasi, Chayne Thrash, Haoran Qin, Shansita Sharma, Sepehr Seifi, Soheil Kolouri

TL;DR
Zero Sum SVD (ZS-SVD) is a post-training low-rank compression method for large language models that automatically allocates heterogeneous ranks across matrices, improving performance without expensive iterative rank optimization.
Contribution
The paper presents ZS-SVD, a global singular component selection technique that uses activation whitening and first-order calibration loss estimates, eliminating the need for iterative rank optimization.
Findings
Consistent performance improvements across multiple LLMs and benchmarks.
Automatic heterogeneous rank allocation without optimization.
Effective low-rank compression with minimal loss in accuracy.
Abstract
Advances in large language models have driven strong performance across many tasks, but their memory and compute costs still hinder deployment. SVD-based compression reduces storage and can speed up inference via low-rank factors, yet performance depends on how rank is allocated under a global compression ratio. Prior methods often use homogeneous ranks for similarly sized matrices, despite large differences in loss sensitivity, or rely on expensive iterative pre-truncation optimization to determine per-matrix ranks. We propose Zero Sum SVD (ZS-SVD), a post-training method that performs global singular component selection using activation whitening and first-order calibration loss estimates in whitened coordinates. ZS-SVD prunes components across the whole model with a zero sum rule that keeps the cumulative predicted loss change near zero,…
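The zero sum rule described in the abstract can be illustrated with a small sketch. This is not the paper's procedure: it assumes each singular component already has a signed first-order estimate of the loss change caused by pruning it, and it uses a simple greedy rule (alternate between loss-increasing and loss-decreasing components so the running total stays close to zero). The function name `zero_sum_select`, the tuple layout, and the example numbers are all hypothetical.

```python
from collections import Counter
from typing import List, Tuple

# (matrix name, component index, predicted loss change if pruned); hypothetical layout
Component = Tuple[str, int, float]


def zero_sum_select(components: List[Component],
                    num_to_prune: int) -> Tuple[List[Component], float]:
    """Greedily pick components to prune so the running sum of predicted
    loss changes stays near zero. An illustrative assumption, not the
    authors' exact selection rule."""
    # Smallest-magnitude candidates first on each side.
    pos = sorted((c for c in components if c[2] >= 0), key=lambda c: c[2])
    neg = sorted((c for c in components if c[2] < 0), key=lambda c: -c[2])

    pruned: List[Component] = []
    running = 0.0
    i = j = 0
    while len(pruned) < num_to_prune and (i < len(pos) or j < len(neg)):
        # If the predicted loss has drifted upward, offset it with a
        # loss-decreasing component; otherwise take a loss-increasing one.
        take_neg = j < len(neg) and (running > 0 or i >= len(pos))
        if take_neg:
            chosen = neg[j]
            j += 1
        else:
            chosen = pos[i]
            i += 1
        pruned.append(chosen)
        running += chosen[2]
    return pruned, running


# Made-up per-component estimates pooled across three matrices.
candidates = [
    ("q_proj", 0, 0.5), ("q_proj", 1, -0.2),
    ("k_proj", 0, 0.1), ("k_proj", 1, -0.3),
    ("v_proj", 0, 0.05),
]
pruned, total = zero_sum_select(candidates, num_to_prune=3)
per_matrix = Counter(name for name, _, _ in pruned)
print(pruned, total, dict(per_matrix))
```

Because candidates from every matrix compete in one pool, the number of components removed from each matrix falls out of the selection instead of being fixed in advance, which is the sense in which rank allocation becomes heterogeneous.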
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Big Data and Digital Economy · Advanced Neural Network Applications
