Globally optimized SVD compression of LLMs via Fermi-function-based rank selection and gauge fixing
Roman Rausch, David Jansen, Sukhbinder Singh, Román Orús

TL;DR
This paper introduces a physics-inspired method for SVD-based compression of large language models that selects layer-wise ranks globally and fixes the gauge freedom of the low-rank factors to remove redundant parameters.
Contribution
It proposes FermiGrad, which selects layer-wise ranks by continuous relaxation of the singular-value truncation, and PivGa, which fixes the gauge of the low-rank factors losslessly.
Findings
FermiGrad achieves better rank optimization than traditional methods.
PivGa reduces parameter redundancy without loss of accuracy.
The combined approach improves compression efficiency.
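The "without loss of accuracy" claim for gauge fixing can be illustrated with a small NumPy sketch. A low-rank product A B is unchanged when A is replaced by A G⁻¹ and B by G B for any invertible r × r matrix G, so G can be chosen to turn an r × r block of A into the identity, and that block no longer needs to be stored. This is not the PivGa procedure itself, which this summary does not describe; the function names `fix_gauge` and `stored_params` are hypothetical, and the sketch simply assumes the first r rows of A form an invertible block.

```python
import numpy as np

def fix_gauge(A, B):
    """Rewrite the product A @ B so that the top r x r block of the left
    factor becomes the identity. Assumes A[:r] is invertible (an assumption
    of this sketch, not a statement about PivGa)."""
    r = A.shape[1]
    G = A[:r, :]                              # r x r gauge matrix
    # A @ inv(G), computed with a linear solve instead of an explicit inverse
    A_fixed = np.linalg.solve(G.T, A.T).T
    B_fixed = G @ B                           # compensating transform on B
    return A_fixed, B_fixed

def stored_params(m, n, r, gauge_fixed):
    """Parameters needed for an (m x r)(r x n) factorization."""
    full = r * (m + n)
    # After gauge fixing, the r x r identity block is implicit.
    return full - r * r if gauge_fixed else full

rng = np.random.default_rng(0)
m, n, r = 8, 6, 3
A = rng.standard_normal((m, r))
B = rng.standard_normal((r, n))
A_fixed, B_fixed = fix_gauge(A, B)

# The product is unchanged up to floating-point error.
print(np.allclose(A_fixed @ B_fixed, A @ B))
print(stored_params(m, n, r, False), stored_params(m, n, r, True))
```

The saving is r² parameters per factorized matrix, which matters most when r is not much smaller than m and n. In practice the choice of which rows to normalize affects numerical conditioning, which this sketch ignores.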
Abstract
Large Language Models (LLMs) demand substantial computational resources. Low-rank decomposition of LLM weights, e.g. via Singular Value Decomposition (SVD), is a promising approach to LLM compression, but it presents several practical hurdles, such as selecting appropriate layer-wise ranks and removing its parameter redundancy. In this work, we present two physics-inspired improvements to SVD LLM compression: (1) FermiGrad, a gradient-descent algorithm that determines globally optimal layer-wise ranks by relaxing the discrete singular-value truncation into a continuous optimization using the Fermi function; (2) PivGa, an additional lossless compression of the low-rank factors that exploits the intrinsic gauge freedom in their parametrization.
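The relaxation of a hard singular-value cutoff into a Fermi function can be sketched as follows. The Fermi function f(i) = 1 / (exp((i − μ)/T) + 1) weights the i-th singular value with a smooth step: μ plays the role of a continuous rank and T of a temperature that controls how sharp the step is. The abstract does not give the paper's loss, parametrization, or global budget constraint, so everything beyond the Fermi function itself is an assumption here, including the hypothetical names `fermi_mask` and `soft_truncate`.

```python
import numpy as np

def fermi_mask(num_singular_values, mu, temperature):
    """Soft truncation weights in (0, 1) for singular-value indices 0..n-1.

    Uses the identity 1 / (exp(x) + 1) = 0.5 * (1 - tanh(x / 2)),
    which avoids overflow for large |x|.
    """
    idx = np.arange(num_singular_values)
    x = (idx - mu) / temperature
    return 0.5 * (1.0 - np.tanh(0.5 * x))

def soft_truncate(W, mu, temperature):
    """Reconstruct W with its singular values damped by the Fermi mask."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    mask = fermi_mask(s.shape[0], mu, temperature)
    return (U * (s * mask)) @ Vt

rng = np.random.default_rng(0)
W = rng.standard_normal((6, 5))

# At low temperature the mask approaches a hard cutoff: indices below mu
# are kept, indices above mu are dropped.
sharp = fermi_mask(5, mu=2.5, temperature=1e-3)
# At higher temperature the cutoff is smooth, so the reconstruction varies
# continuously (and differentiably) with mu.
smooth = fermi_mask(5, mu=2.5, temperature=1.0)
W_soft = soft_truncate(W, mu=2.5, temperature=1.0)
```

Because the mask is differentiable in μ, a per-layer μ could in principle be updated by gradient descent alongside a global parameter budget; an automatic-differentiation framework would be the natural tool for that step, which this NumPy sketch leaves out.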
Taxonomy
Topics: Machine Learning in Materials Science · Topic Modeling · Gaussian Processes and Bayesian Inference
