Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications

Yang Li; Daniel Agyei Asante; Changsheng Zhao; Ernie Chang; Yangyang Shi; Vikas Chandra

arXiv:2405.15877 · cs.LG · December 22, 2025

TL;DR

This paper presents a low-rank decomposition method to compress large language models by removing redundant components, enabling efficient deployment on resource-limited devices while preserving performance for specific applications.

Contribution

The paper introduces a novel application-specific low-rank decomposition approach that prunes and enhances pretrained LLMs for targeted tasks.

Findings

01 Significant reduction in model size achieved

02 Maintains comparable accuracy to state-of-the-art methods

03 Effective on Llama 2 models for reasoning and code generation

Abstract

Large language models (LLMs) significantly enhance the performance of various applications, but they are computationally intensive and energy-demanding. This makes it challenging to deploy them on devices with limited resources, such as personal computers and mobile/wearable devices, and results in substantial inference costs in resource-rich environments like cloud servers. To extend the use of LLMs, we introduce a low-rank decomposition approach to effectively compress these models, tailored to the requirements of specific applications. We observe that LLMs pretrained on general datasets contain many redundant components not needed for particular applications. Our method focuses on identifying and removing these redundant parts, retaining only the necessary elements for the target applications. Specifically, we represent the weight matrices of LLMs as a linear combination of base…
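To make the general idea of low-rank decomposition concrete, the sketch below factors a single weight matrix into two thin matrices with a truncated SVD and compares parameter counts. It is a generic illustration only, not the paper's basis selection procedure: the matrix sizes, the rank of 8, the synthetic weights, and the helper name `low_rank_factors` are all assumptions made for this example.

```python
import numpy as np


def low_rank_factors(weight, rank):
    """Return (a, b) such that a @ b is the best rank-`rank`
    approximation of `weight` in the Frobenius norm (truncated SVD).

    Hypothetical helper for illustration; not taken from the paper.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]  # fold singular values into the left factor
    b = vt[:rank, :]
    return a, b


rng = np.random.default_rng(0)

# Synthetic stand-in for a layer's weights: exactly rank 8, plus small noise.
weight = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 48))
weight += 1e-3 * rng.standard_normal(weight.shape)

a, b = low_rank_factors(weight, rank=8)

# Relative reconstruction error of the rank-8 approximation.
rel_error = np.linalg.norm(weight - a @ b) / np.linalg.norm(weight)

# Storing the two factors needs fewer parameters than the dense matrix.
params_before = weight.size      # 64 * 48
params_after = a.size + b.size   # 64 * 8 + 8 * 48

print(f"relative error: {rel_error:.2e}")
print(f"parameters: {params_before} -> {params_after}")
```

The factorization saves parameters whenever `rank * (rows + cols)` is smaller than `rows * cols`. How much rank a real layer can give up for a particular application is the question the paper addresses, and this sketch does not attempt to answer it.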

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis

Methods: Balanced Selection · LLaMA