AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning

Qingru Zhang; Minshuo Chen; Alexander Bukharin; Nikos Karampatziakis; Pengcheng He; Yu Cheng; Weizhu Chen; Tuo Zhao

arXiv:2303.10512 · cs.CL · December 22, 2023 · 32 cites

TL;DR

AdaLoRA adaptively allocates the fine-tuning parameter budget across the weight matrices of a large pre-trained language model according to their importance scores, instead of spreading it evenly. It parameterizes the incremental updates in a singular value decomposition (SVD) form, so that less important components can be pruned.

Contribution

The paper proposes AdaLoRA, an adaptive budget allocation method for parameter-efficient fine-tuning of large models. It parameterizes incremental updates in SVD form and uses importance scores to decide where the parameter budget is spent.

Findings

1. Significant performance improvements over baselines in low-budget scenarios
2. Effective parameter pruning based on importance scores
3. Validated across natural language processing, question answering, and natural language generation tasks

Abstract

Fine-tuning large pre-trained language models on downstream tasks has become an important paradigm in NLP. However, common practice fine-tunes all of the parameters in a pre-trained model, which becomes prohibitive when a large number of downstream tasks are present. Therefore, many fine-tuning methods are proposed to learn incremental updates of pre-trained weights in a parameter efficient way, e.g., low-rank increments. These methods often evenly distribute the budget of incremental updates across all pre-trained weight matrices, and overlook the varying importance of different weight parameters. As a consequence, the fine-tuning performance is suboptimal. To bridge this gap, we propose AdaLoRA, which adaptively allocates the parameter budget among weight matrices according to their importance score. In particular, AdaLoRA parameterizes the incremental updates in the form of singular…
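The two ideas named in the abstract, an SVD-form increment and importance-based budget allocation, can be illustrated with a small NumPy sketch. This is a toy illustration under stated assumptions, not the authors' implementation: the function names `svd_form_update` and `allocate_budget`, the layer names, and the sizes are hypothetical, and the absolute singular value is used only as a stand-in for the importance score, which the text above does not define.

```python
import numpy as np


def svd_form_update(w0, p, lam, q):
    """Return W0 + P diag(lam) Q, an increment written in SVD-like form.

    Scaling the columns of P by lam is equivalent to P @ diag(lam).
    """
    return w0 + (p * lam) @ q


def allocate_budget(scores, budget):
    """Keep the `budget` highest-scoring singular values across ALL matrices.

    scores: dict mapping a matrix name to a 1-D array of importance scores,
            one score per singular value (hypothetical layout).
    Returns a dict of boolean masks with the same shapes as the scores.
    """
    names = list(scores)
    flat = np.concatenate([scores[n] for n in names])
    keep = np.zeros(flat.size, dtype=bool)
    if budget > 0:
        # argsort is ascending, so the last `budget` indices score highest.
        keep[np.argsort(flat)[-budget:]] = True
    sizes = [scores[n].size for n in names]
    parts = np.split(keep, np.cumsum(sizes)[:-1])
    return dict(zip(names, parts))


rng = np.random.default_rng(0)
d, r = 6, 4  # toy dimensions, chosen only for illustration
w0 = rng.normal(size=(d, d))  # stands in for a frozen pre-trained weight
layers = {
    name: {
        "p": rng.normal(size=(d, r)),
        "lam": rng.normal(size=r),
        "q": rng.normal(size=(r, d)),
    }
    for name in ("query", "value")  # hypothetical matrix names
}

# Stand-in importance: |singular value|. This is an assumption of the sketch.
scores = {name: np.abs(layer["lam"]) for name, layer in layers.items()}
masks = allocate_budget(scores, budget=5)

for name, layer in layers.items():
    layer["lam"] = layer["lam"] * masks[name]  # prune by zeroing
    w = svd_form_update(w0, layer["p"], layer["lam"], layer["q"])
    print(name, "kept rank:", int(masks[name].sum()), "shape:", w.shape)
```

Because the top-scoring entries are selected from one pooled list, matrices with higher scores end up with a higher rank and the others with a lower one, which is the sense in which the budget is allocated unevenly. Zeroing a singular value removes its contribution while leaving the corresponding columns of `p` and rows of `q` in place.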

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications