AReaL-Hex: Accommodating Asynchronous RL Training over Heterogeneous GPUs

Ran Yan; Youhe Jiang; Tianyuan Wu; Jiaxuan Gao; Zhiyu Mei; Wei Fu; Haohui Mai; Wei Wang; Yi Wu; Binhang Yuan

arXiv:2511.00796·cs.DC·November 4, 2025

TL;DR

AReaL-Hex is a heterogeneity-aware asynchronous RL training system for large language models. It schedules the stages of RL training across heterogeneous GPUs, reaching up to 1.50x higher training throughput than homogeneous systems, or up to 1.46x lower training cost at the same throughput.

Contribution

AReaL-Hex introduces a scheduling framework that combines mixed-integer linear programming (MILP) with graph partitioning to use heterogeneous GPUs efficiently in RL training.

Findings

01. Up to 1.50x higher training throughput compared to homogeneous systems.

02. Up to 1.46x reduction in training cost at the same throughput.

03. Effective mapping of I/O-bound and compute-bound tasks to cost-efficient resources.
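
To make the third finding concrete, here is a toy sketch of what mapping I/O-bound and compute-bound tasks to cost-efficient resources could mean. It is not the paper's MILP and graph-partitioning scheduler. The GPU names, the relative compute, bandwidth, and cost figures, and the helpers `cost_per_unit_work` and `best_assignment` are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical GPU types with made-up relative figures (not from the paper):
# compute throughput, memory bandwidth, and hourly cost.
GPUS = {
    "gpu_a": {"compute": 3.0, "bandwidth": 1.2, "cost": 2.0},
    "gpu_b": {"compute": 1.0, "bandwidth": 1.0, "cost": 1.0},
}

# Hypothetical tasks, each assumed to be limited by a single resource.
TASKS = {
    "io_bound_task": "bandwidth",
    "compute_bound_task": "compute",
}


def cost_per_unit_work(gpu: str, bottleneck: str) -> float:
    """Cost of one unit of work when the task is limited by `bottleneck`."""
    spec = GPUS[gpu]
    return spec["cost"] / spec[bottleneck]


def best_assignment() -> tuple[dict[str, str], float]:
    """Exhaustively try every task-to-GPU mapping and keep the cheapest one."""
    best = None
    for gpus in product(GPUS, repeat=len(TASKS)):
        assignment = dict(zip(TASKS, gpus))
        total = sum(cost_per_unit_work(g, TASKS[t]) for t, g in assignment.items())
        if best is None or total < best[1]:
            best = (assignment, total)
    return best


assignment, total = best_assignment()
```

Under these assumed figures, the I/O-bound task lands on the cheaper `gpu_b` because the extra compute of `gpu_a` does not help it, while the compute-bound task lands on `gpu_a`. A real scheduler would also have to handle memory limits, communication, and parallelism choices, which this sketch ignores.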

Abstract

Maximizing training throughput and cost-efficiency of RL for LLMs is essential to democratize this advanced technique. One promising but challenging approach is to deploy such a computational workflow over heterogeneous GPUs. Unlike conventional large-scale LLM pretraining, RL training generally decomposes into three coupled stages, i.e., rollout generation, reward computation, and policy/value updates, which exhibit markedly different compute intensities, memory footprints, and communication patterns. Recent research shows that fully asynchronous RL training can disaggregate these stages across disjoint hardware pools without sacrificing training stability, creating a great opportunity for real-world heterogeneous deployment. To this end, we present AReaL-Hex, a heterogeneity-aware asynchronous RL training system that effectively schedules how to execute rollout generation and policy…

Taxonomy

Topics: Parallel Computing and Optimization Techniques · Graph Theory and Algorithms · Cloud Computing and Resource Management