Basis Selection: Low-Rank Decomposition of Pretrained Large Language Models for Target Applications
Yang Li, Daniel Agyei Asante, Changsheng Zhao, Ernie Chang, Yangyang Shi, Vikas Chandra

TL;DR
This paper presents a low-rank decomposition method to compress large language models by removing redundant components, enabling efficient deployment on resource-limited devices while preserving performance for specific applications.
Contribution
It introduces an application-specific low-rank decomposition approach that prunes the components of a pretrained LLM that are not needed for the target task.
Findings
Significant reduction in model size achieved
Maintains accuracy comparable to state-of-the-art methods
Effective on Llama 2 models for reasoning and code generation
Abstract
Large language models (LLMs) significantly enhance the performance of various applications, but they are computationally intensive and energy-demanding. This makes it challenging to deploy them on devices with limited resources, such as personal computers and mobile/wearable devices, and results in substantial inference costs in resource-rich environments like cloud servers. To extend the use of LLMs, we introduce a low-rank decomposition approach to effectively compress these models, tailored to the requirements of specific applications. We observe that LLMs pretrained on general datasets contain many redundant components not needed for particular applications. Our method focuses on identifying and removing these redundant parts, retaining only the necessary elements for the target applications. Specifically, we represent the weight matrices of LLMs as a linear combination of base…
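The general idea of replacing a weight matrix with a small number of components can be illustrated with a plain truncated SVD. This is a minimal sketch under stated assumptions, not the paper's basis selection procedure: the helper names `low_rank_factors` and `param_count`, the matrix sizes, and the chosen rank are all hypothetical, and the matrix is synthetic rather than taken from an LLM.

```python
import numpy as np


def low_rank_factors(W, rank):
    """Split W (m x n) into A (m x rank) and B (rank x n) via truncated SVD.

    A @ B is the best rank-`rank` approximation of W in Frobenius norm.
    Generic truncated SVD, used here only as an illustration.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * s[:rank]  # scale the kept left singular vectors
    B = Vt[:rank, :]            # kept right singular vectors
    return A, B


def param_count(m, n, rank):
    """Number of parameters stored by the two factors."""
    return rank * (m + n)


# Synthetic 64 x 48 matrix whose true rank is 8 (hypothetical sizes).
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 8)) @ rng.standard_normal((8, 48))

A, B = low_rank_factors(W, rank=8)
rel_err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)

dense_params = W.size                      # 64 * 48
factored_params = param_count(64, 48, 8)   # 8 * (64 + 48)
```

Because the synthetic matrix has rank 8 by construction, keeping 8 components reproduces it up to floating-point error while storing fewer parameters than the dense matrix. Real pretrained weights are not exactly low rank, so which components to keep, and how many, is the question an application-specific method has to answer.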
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Balanced Selection · LLaMA
