Exploring the impact of low-rank adaptation on the performance, efficiency, and regularization of RLHF
Simeng Sun, Dhawal Gupta, Mohit Iyyer

TL;DR
This paper shows that low-rank adaptation (LoRA) enables efficient RLHF training of large language models: LLaMA 7B is aligned on the Alpaca dataset with two A100 GPUs instead of the eight required for full model fine-tuning, while outperforming the publicly released AlpacaFarm checkpoint.
Contribution
It introduces a LoRA-based RLHF method that reduces resource requirements and analyzes the effects of different regularizers and training configurations.
Findings
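With only 0.2% of LLaMA 7B's parameters tuned, the LoRA implementation outperforms the publicly released AlpacaFarm checkpoint trained with full model fine-tuning.
Removing the KL regularization term does not harm performance on the AlpacaFarm evaluation set in the LoRA setup.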
LoRA mitigates the negative impact of PPO on factuality.
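To make the parameter-efficiency claim concrete, here is a minimal sketch of a LoRA-adapted linear layer in plain Python. It is not the authors' implementation: the class name `LoRALinear`, the helper `lora_param_counts`, the rank of 8, and the choice of a single 4096 x 4096 matrix are illustrative assumptions. The overall 0.2% figure depends on which matrices are adapted and at what rank, which this summary does not state.

```python
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Hypothetical sketch: frozen weight W plus a trainable low-rank update B @ A."""

    def __init__(self, W, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = W  # frozen pretrained weight, never updated
        # A starts small and random, B starts at zero, so the update is zero at init.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        base = matvec(self.W, x)
        delta = matvec(self.B, matvec(self.A, x))  # low-rank path: B (A x)
        return [b + self.scale * d for b, d in zip(base, delta)]

    def trainable_params(self):
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])


def lora_param_counts(d_out, d_in, rank):
    """Return (full, lora) trainable-parameter counts for one weight matrix."""
    return d_out * d_in, rank * (d_out + d_in)


# Assumed example: one 4096 x 4096 projection adapted at rank 8.
full, lora = lora_param_counts(4096, 4096, 8)
print(f"full={full}, lora={lora}, fraction={lora / full:.4%}")
```

Because `B` is initialized to zero, the adapted layer reproduces the frozen model's output exactly before training, and only `A` and `B` receive gradient updates afterwards.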
Abstract
During the last stage of RLHF, a large language model is aligned to human intents via PPO training, a process that generally requires large-scale computational resources. In this technical report, we empirically investigate an efficient implementation of RLHF using low-rank adaptation (LoRA), which allows us to align the LLaMA 7B checkpoint on the Alpaca dataset using only two A100 GPUs instead of the eight required for full model fine-tuning. Despite tuning only 0.2% of LLaMA 7B's parameters, our implementation achieves better performance than the publicly-released AlpacaFarm checkpoint with full model fine-tuning. Next, we analyze several configurations of our LoRA-based PPO implementation, varying the form of the KL regularization term in the training objective. We find that (1) removing this penalty term does not harm performance on the AlpacaFarm evaluation set under our LoRA…
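The KL regularization term mentioned in the abstract can be illustrated with the form commonly used in PPO-based RLHF, where the reward-model score is reduced in proportion to how far the policy's log-probability drifts from a frozen reference model. This is a hedged sketch under that assumption; the function name `kl_penalized_reward`, the coefficient `beta`, and the per-sample KL estimate are illustrative and not taken from the paper, whose exact variants are not shown in this excerpt.

```python
def kl_penalized_reward(reward, logp_policy, logp_ref, beta):
    """Hypothetical shaped reward: reward-model score minus a KL penalty.

    logp_policy and logp_ref are the log-probabilities of the sampled response
    under the current policy and the frozen reference model, respectively.
    """
    kl_estimate = logp_policy - logp_ref  # single-sample estimate of the KL term
    return reward - beta * kl_estimate


# With beta > 0 the policy is penalized for drifting from the reference model.
print(kl_penalized_reward(1.0, -2.0, -3.0, beta=0.5))
# Setting beta = 0 corresponds to removing the penalty term entirely.
print(kl_penalized_reward(1.0, -2.0, -3.0, beta=0.0))
```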
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Domain Adaptation and Few-Shot Learning
Methods: Entropy Regularization · Proximal Policy Optimization · ALIGN
