ReaL: Efficient RLHF Training of Large Language Models with Parameter Reallocation

Zhiyu Mei; Wei Fu; Kaiwei Li; Guangju Wang; Huanchen Zhang; Yi Wu

arXiv:2406.14088 · cs.DC · April 25, 2025 · 2 cites

TL;DR

ReaL is an RLHF training system that dynamically reallocates LLM parameters across the training cluster, so that each workload can run under its own parallelization strategy instead of a single fixed one, improving training efficiency.

Contribution

The paper presents ReaL, a system that automatically discovers and deploys efficient execution plans for RLHF training by dynamically redistributing LLM parameters across the training cluster.

Findings

01. Achieves up to a 3.58x speedup over baseline methods.

02. Execution plans generated by ReaL show an 81% average performance improvement over heuristic approaches.

03. Evaluated on LLaMA models with up to 70 billion parameters.

Abstract

Reinforcement Learning from Human Feedback (RLHF) is a pivotal technique for empowering large language model (LLM) applications. Compared with the supervised training process of LLMs, the RLHF training process is much more sophisticated, requiring a diverse range of computation workloads with intricate dependencies between multiple LLM instances. Therefore, simply adopting the fixed parallelization strategies from supervised training for LLMs can be insufficient for RLHF and result in low training efficiency. To overcome this limitation, we propose a novel technique named parameter ReaLlocation, which dynamically adapts the parallelization strategies for different workloads during training by redistributing LLM parameters across the training cluster. Building upon this idea, we introduce ReaL, a pioneering system for efficient RLHF training. ReaL introduces the concept of an execution…
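The idea of redistributing parameters so that each workload gets its own parallelization layout can be illustrated with a toy sketch. This is not ReaL's implementation: the helper names `shard` and `reallocate`, the flat list standing in for model weights, and the two example layouts (degree 2 for one workload, degree 4 for another) are all assumptions made for illustration.

```python
from typing import List


def shard(params: List[float], degree: int) -> List[List[float]]:
    """Split a flat parameter list into `degree` contiguous, equal-sized shards.

    Hypothetical helper: each shard stands in for the slice of weights
    held by one device under a given parallel degree.
    """
    if degree <= 0 or len(params) % degree != 0:
        raise ValueError("parameter count must be divisible by the parallel degree")
    size = len(params) // degree
    return [params[i * size:(i + 1) * size] for i in range(degree)]


def reallocate(shards: List[List[float]], new_degree: int) -> List[List[float]]:
    """Re-partition existing shards for a different parallel degree.

    A real system would move only the slices that change owner; this toy
    version simply gathers everything and splits again.
    """
    flat = [p for s in shards for p in s]
    return shard(flat, new_degree)


# Eight toy "parameters" laid out for one workload on 2 devices...
params = [float(i) for i in range(8)]
layout_a = shard(params, 2)

# ...then redistributed across 4 devices for a different workload.
layout_b = reallocate(layout_a, 4)
```

Both layouts hold the same parameters; only the assignment of slices to devices changes, which is the property that lets a system pick a different parallelization strategy per workload.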

Code & Models

Repositories

openpsi-project/realhf
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Speech Recognition and Synthesis

Methods: LLaMA