QDyLoRA: Quantized Dynamic Low-Rank Adaptation for Efficient Large Language Model Tuning

Hossein Rajabzadeh; Mojtaba Valipour; Tianshu Zhu; Marzieh Tahaei; Hyock Ju Kwon; Ali Ghodsi; Boxing Chen; Mehdi Rezagholizadeh

arXiv:2402.10462 · cs.LG · August 11, 2025 · 2 cites

TL;DR

QDyLoRA is a quantized, dynamic low-rank adaptation method that fine-tunes a large language model for a whole set of LoRA ranks in one run on limited hardware. It is competitive with QLoRA, which is trained at a single fixed rank.

Contribution

The paper proposes a quantized dynamic low-rank adaptation technique that fine-tunes an LLM on a set of pre-defined LoRA ranks in a single round, so a lower rank can be selected afterwards without further fine-tuning.

Findings

01

QDyLoRA can fine-tune Falcon-40b for ranks 1 to 64 on a single 32 GB V100 GPU in one round of fine-tuning.

02

QDyLoRA is competitive with QLoRA and outperforms it when its optimal rank is used.

03

The method reduces hardware requirements for large model fine-tuning.

Abstract

Fine-tuning large language models requires huge amounts of GPU memory, which restricts the choice of larger models. While the quantized version of the Low-Rank Adaptation technique, named QLoRA, significantly alleviates this issue, finding an efficient LoRA rank is still challenging. Moreover, QLoRA is trained on a pre-defined rank and therefore cannot be reconfigured for lower ranks without further fine-tuning steps. This paper proposes QDyLoRA (Quantized Dynamic Low-Rank Adaptation), an efficient quantization approach for dynamic low-rank adaptation. Motivated by Dynamic LoRA, QDyLoRA is able to efficiently fine-tune LLMs on a set of pre-defined LoRA ranks. QDyLoRA enables fine-tuning Falcon-40b for ranks 1 to 64 on a single 32 GB V100 GPU through one round of fine-tuning. Experimental results show that QDyLoRA is competitive with QLoRA and outperforms it when employing its…
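
The idea of serving several ranks from one set of adapter weights can be illustrated with a minimal sketch. It is not the authors' implementation: it omits quantization of the base weights entirely, and the function names, the `alpha / b` scaling, and the uniform rank sampling are assumptions made for illustration. It only shows that truncating the adapter matrices to their first `b` components yields a valid rank-`b` update, which is what lets a dynamic-rank adapter be reconfigured without retraining.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def truncated_lora_delta(B, A, x, b, alpha=1.0):
    """Low-rank update that uses only the first b rank components.

    A has shape (r, d_in) and B has shape (d_out, r), with b <= r.
    The alpha / b scaling is an assumption of this sketch.
    """
    A_b = A[:b]                      # first b rows of A
    B_b = [row[:b] for row in B]     # first b columns of B
    h = matvec(A_b, x)               # project the input down to b dimensions
    scale = alpha / b
    return [scale * v for v in matvec(B_b, h)]

def forward(W0, B, A, x, b, alpha=1.0):
    """Frozen base projection plus the rank-b adapter update."""
    base = matvec(W0, x)             # W0 would be quantized in a QLoRA-style setup
    delta = truncated_lora_delta(B, A, x, b, alpha)
    return [u + v for u, v in zip(base, delta)]

def sample_rank(ranks, rng):
    """Pick the rank to train at for one step (uniform choice is an assumption)."""
    return rng.choice(ranks)

if __name__ == "__main__":
    W0 = [[1, 0], [0, 1]]            # toy frozen weights
    A = [[1, 0], [0, 1]]             # toy adapter with maximum rank r = 2
    B = [[1, 0], [0, 1]]
    x = [1, 2]
    rng = random.Random(0)
    b = sample_rank([1, 2], rng)     # during training, a rank is drawn per step
    print(b, forward(W0, B, A, x, b))
```

At inference time the same `forward` can be called with any `b` up to the maximum trained rank, which is the reconfiguration that a fixed-rank adapter does not offer.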

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling

Methods: Sparse Evolutionary Training