LoRA: Low-Rank Adaptation of Large Language Models
Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen

TL;DR
LoRA introduces a low-rank adaptation method that significantly reduces the number of trainable parameters in large language models, enabling efficient fine-tuning without sacrificing performance.
Contribution
The paper proposes LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer, sharply reducing the number of trainable parameters and the GPU memory needed for fine-tuning.
Findings
Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times.
LoRA performs on par with or better than full fine-tuning in model quality.
LoRA has no additional inference latency and improves training throughput.
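The 10,000 times figure can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the helper `lora_trainable_params` is hypothetical, and the GPT-3 shape (96 layers, hidden size 12288) and the LoRA setup (rank 4 applied to two attention projection matrices per layer) are assumptions that the summary above does not state.

```python
def lora_trainable_params(d_in, d_out, rank, matrices_per_layer, num_layers):
    """Count the parameters in the low-rank factors B (d_out x rank) and A (rank x d_in)."""
    per_matrix = rank * (d_in + d_out)
    return per_matrix * matrices_per_layer * num_layers


# Assumption: GPT-3 175B has 96 Transformer layers with hidden size 12288.
# Assumption: LoRA with rank 4 is applied to two projection matrices per layer.
full_params = 175_000_000_000
lora_params = lora_trainable_params(
    d_in=12288, d_out=12288, rank=4, matrices_per_layer=2, num_layers=96
)
reduction = full_params / lora_params

print(f"LoRA trainable parameters: {lora_params:,}")  # 18,874,368
print(f"Reduction factor: {reduction:,.0f}x")
```

Under these assumptions the adapter holds about 18.9 million trainable parameters, a reduction of roughly four orders of magnitude, which is consistent with the 10,000 times figure quoted above. Other ranks or target matrices would change the ratio.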
Abstract
An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example -- deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times. LoRA performs on-par or better than fine-tuning in model quality on…
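The mechanism described in the abstract can be sketched in a few lines. This is a minimal, dependency-free illustration, not the authors' implementation: the helper names, the tiny dimensions, the `scale` factor, and the choice to initialize `B` to zero are all assumptions made for the example. It shows a frozen weight matrix `W`, a trainable low-rank update `B @ A`, and the fact that the update can be folded into `W` after training so that inference uses a single matrix.

```python
import random


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def random_matrix(rows, cols):
    return [[random.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def lora_forward(W, A, B, x, scale):
    """Compute W x + scale * B (A x); only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]


def merge(W, A, B, scale):
    """Fold the low-rank update into the frozen weights: W + scale * B A."""
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]


random.seed(0)
d_out, d_in, rank = 6, 5, 2       # hypothetical toy sizes
scale = 0.5                       # hypothetical scaling hyperparameter

W = random_matrix(d_out, d_in)    # frozen pre-trained weights
A = random_matrix(rank, d_in)     # trainable, rank x d_in
B = [[0.0] * rank for _ in range(d_out)]  # trainable, zero at start (assumption)
x = [random.gauss(0.0, 1.0) for _ in range(d_in)]

# With B = 0 the adapted layer reproduces the frozen layer exactly.
y_init = lora_forward(W, A, B, x, scale)
y_frozen = matvec(W, x)

# Stand-in for training: give B some non-zero values.
B = random_matrix(d_out, rank)
y_adapter = lora_forward(W, A, B, x, scale)
y_merged = matvec(merge(W, A, B, scale), x)

max_gap = max(abs(a - b) for a, b in zip(y_adapter, y_merged))
print(f"max difference between adapter and merged outputs: {max_gap:.2e}")
```

Because the merged matrix has the same shape as `W`, serving the adapted model costs the same as serving the original, which is one way to read the "no additional inference latency" finding. The trainable factors hold `rank * (d_in + d_out)` values instead of `d_in * d_out`.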
Code & Models
- 🤗 addy88/gpt-j-8bit · model · 10 dl · ♡ 2
- 🤗 addy88/gptj8 · model · 7 dl · ♡ 1
- 🤗 hivemind/gpt-j-6B-8bit · model · 34 dl · ♡ 132
- 🤗 fractalego/samsumbot · model · 4 dl
- 🤗 mrm8488/bertin-gpt-j-6B-ES-8bit · model · 7 dl · ♡ 7
- 🤗 joaoalvarenga/bloom-8bit · model · 10 dl · ♡ 75
- 🤗 mrm8488/bloom-6b3-8bit · model · 8 dl · ♡ 4
- 🤗 mrm8488/bloom-1b3-8bit · model · 8 dl · ♡ 3
- 🤗 mrm8488/bertin-gpt-j-6B-ES-v1-8bit · model · 8 dl · ♡ 5
- 🤗 JosephusCheung/ACertainModel · model · 102 dl · ♡ 159
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · WordPiece · BERT · RoBERTa · DeBERTa · Absolute Position Encodings
