DistFlow: A Fully Distributed RL Framework for Scalable and Efficient LLM Post-Training

Zhixin Wang; Tianyi Zhou; Liming Liu; Ao Li; Jiarui Hu; Dian Yang; Yinhui Lu; Jinlong Hou; Siyuan Feng; Yuan Cheng; Yuan Qi

arXiv:2507.13833·cs.DC·September 10, 2025

TL;DR

DistFlow is a fully distributed reinforcement learning framework for large language models that achieves near-linear scalability and significant efficiency improvements by eliminating centralized control and enabling independent worker operation.

Contribution

The paper introduces a multi-controller, fully distributed RL framework that improves scalability and flexibility for LLM post-training.

Findings

1. Achieves near-linear scalability up to 1024 GPUs.
2. Up to 7x throughput improvement over SOTA frameworks.
3. Decouples resource configuration from execution logic.

Abstract

Reinforcement learning (RL) has become the pivotal post-training technique for large language models (LLMs). Effectively scaling RL is now the key to unlocking advanced reasoning capabilities and ensuring safe, goal-aligned behavior in the most powerful LLMs. Mainstream frameworks usually employ a hybrid-controller architecture, in which a single controller dispatches the overall execution logic and manages all data transfer while multiple controllers execute the distributed computation. In large-scale RL, even minor load imbalances can introduce significant bottlenecks, ultimately constraining the scalability of the system. To address this limitation, we introduce DistFlow, a novel, fully distributed RL framework designed to break the scaling barrier. We adopt a multi-controller paradigm that dispatches data transfer and execution tasks to all workers, which…
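The contrast the abstract draws between a central controller that manages all data transfer and a design that dispatches transfer to every worker can be illustrated with a toy traffic model. This is only a sketch under stated assumptions, not DistFlow's implementation: the function names, the equal-sized shards, and the gather-then-scatter pattern are all hypothetical simplifications introduced here.

```python
# Toy model (hypothetical, not taken from the paper): compare the data volume
# that the busiest node must handle when moving one batch between two RL stages.

def busiest_node_traffic_single_controller(num_workers: int, shard_mb: float) -> float:
    """Assumed pattern: one controller gathers every worker's shard,
    then scatters a shard back to every worker for the next stage."""
    gathered = num_workers * shard_mb   # all shards flow into the controller
    scattered = num_workers * shard_mb  # all shards flow back out of it
    return gathered + scattered


def busiest_node_traffic_multi_controller(num_workers: int, shard_mb: float) -> float:
    """Assumed pattern: each worker sends its own shard to a peer and
    receives one shard, so no node sees more than two shards."""
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    return 2 * shard_mb


if __name__ == "__main__":
    for n in (8, 64, 1024):
        central = busiest_node_traffic_single_controller(n, shard_mb=10.0)
        distributed = busiest_node_traffic_multi_controller(n, shard_mb=10.0)
        print(f"{n:>5} workers: central={central:.0f} MB, distributed={distributed:.0f} MB")
```

In this simplified model the controller's traffic grows linearly with the worker count, while the per-worker traffic in the distributed case stays constant. Real systems add load imbalance, network topology, and overlap between communication and compute, none of which this sketch captures.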

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Multimodal Machine Learning Applications