Parameter Efficient Reinforcement Learning from Human Feedback
Hakim Sidahmed, Samrat Phatale, Alex Hutcheson, Zhuonan Lin, Zhang Chen, Zac Yu, Jarvis Jin, Simral Chaudhary, Roman Komarytsia, Christiane Ahlheim, Yonghao Zhu, Bowen Li, Saravanan Ganesh, Bill Byrne, Jessica Hoffmann, Hassan Mansoor, Wei Li, and Abhinav Rastogi

TL;DR
This paper demonstrates that parameter-efficient RLHF using LoRA achieves alignment performance comparable to traditional RLHF with significantly reduced training time and memory usage, lowering a key barrier to wider adoption.
Contribution
It empirically evaluates PE-RLHF with LoRA, showing substantial efficiency gains while maintaining effectiveness across diverse tasks.
Findings
PE-RLHF achieves comparable performance to RLHF.
Training time reduced by up to 90%.
Memory footprint reduced by up to 50%.
Abstract
While Reinforcement Learning from Human Feedback (RLHF) effectively aligns pretrained Large Language and Vision-Language Models (LLMs and VLMs) with human preferences, its computational cost and complexity hamper its wider adoption. To alleviate some of the computational burden of fine-tuning, parameter-efficient methods such as LoRA were introduced. In this work, we empirically evaluate the setup of Parameter Efficient Reinforcement Learning from Human Feedback (PE-RLHF), which leverages LoRA fine-tuning for Reward Modeling and Reinforcement Learning. We benchmark the PE-RLHF setup on six diverse datasets spanning summarization, harmless/helpful response generation, UI automation, and visual question answering, in terms of the effectiveness of the trained models and the training resources required. Our findings show, for the first time, that PE-RLHF achieves comparable performance to RLHF,…
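To make the source of the savings concrete, here is a minimal sketch of the standard LoRA formulation: the pretrained weight matrix stays frozen and only two small low-rank matrices are trained. It uses only the Python standard library. The function names, matrix names (`down`, `up`), and the rank, `alpha`, and layer sizes are illustrative assumptions, not values or code from the paper.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_forward(x, w, down, up, alpha, rank):
    """Compute x @ W + (alpha / rank) * (x @ down @ up).

    w    : frozen pretrained weights, shape (d_in, d_out)
    down : trainable down-projection, shape (d_in, rank)
    up   : trainable up-projection, shape (rank, d_out), usually zero-initialized
    The names and the scaling convention are assumptions for this sketch.
    """
    base = matmul(x, w)                    # frozen path
    update = matmul(matmul(x, down), up)   # low-rank trainable path
    scale = alpha / rank
    return [[p + scale * q for p, q in zip(r1, r2)] for r1, r2 in zip(base, update)]


def trainable_fraction(d_in, d_out, rank):
    """Share of a layer's weights that LoRA trains, relative to full fine-tuning."""
    return rank * (d_in + d_out) / (d_in * d_out)


if __name__ == "__main__":
    # Hypothetical layer size and rank, chosen only for illustration.
    print(trainable_fraction(1024, 1024, 8))
```

With `up` initialized to zeros, the adapted layer reproduces the frozen layer's output exactly at the start of training, and only `down` and `up` need gradients and optimizer state.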
Taxonomy
Topics: Neural Networks and Applications
Methods: ALIGN
