Relax: An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale

Liujie Zhang; Benzhe Ning; Rui Yang; Xiaoyan Yu; Jiaxing Li; Lumeng Wu; Jia Liu; Minghao Li; Weihang Chen; Weiqi Hu; Lei Zhang

arXiv:2604.11554·cs.CL·April 15, 2026

TL;DR

Relax is an open-source asynchronous reinforcement learning engine for omni-modal large language models. It targets heterogeneous data flows, robustness at scale, and the staleness-throughput tradeoff, and reports a 1.20× speedup over veRL on Qwen3-4B and a 2.00× speedup in fully async mode on Qwen3-Omni-30B.

Contribution

The paper introduces Relax, a scalable, fault-isolated, asynchronous RL engine whose omni-native architecture supports multimodal data and efficient training at scale.

Findings

01 Relax achieves a 1.20× speedup over veRL on Qwen3-4B.

02 Fully async mode delivers a 2.00× speedup on Qwen3-Omni-30B.

03 Relax supports R3 with only 1.9% overhead, enabling stable omni-modal RL convergence.

Abstract

Reinforcement learning (RL) post-training has proven effective at unlocking reasoning, self-reflection, and tool-use capabilities in large language models. As models extend to omni-modal inputs and agentic multi-turn workflows, RL training systems face three interdependent challenges: heterogeneous data flows, operational robustness at scale, and the staleness-throughput tradeoff. We present Relax (Reinforcement Engine Leveraging Agentic X-modality), an open-source RL training engine that addresses these challenges through three co-designed architectural layers. First, an omni-native architecture builds multimodal support into the full stack (from data preprocessing and modality-aware parallelism to inference generation) rather than retrofitting it onto a text-centric pipeline. Second, each RL role runs as an independent, fault-isolated service that can be scaled,…
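
The staleness-throughput tradeoff named in the abstract can be illustrated with a small sketch. This is not Relax's implementation; it assumes a generic asynchronous setup in which a rollout service tags each sample with the policy version that produced it, and a trainer only consumes samples whose version lag stays within a bound. The names `Rollout`, `StalenessBoundedBuffer`, and `max_staleness` are hypothetical and chosen for illustration only.

```python
from collections import deque
from dataclasses import dataclass
from typing import List


@dataclass
class Rollout:
    policy_version: int  # version of the weights that generated this sample
    payload: str         # stand-in for tokens, images, audio, rewards, ...


class StalenessBoundedBuffer:
    """Hypothetical queue between a rollout service and a trainer service."""

    def __init__(self, max_staleness: int):
        self.max_staleness = max_staleness
        self._queue = deque()
        self.dropped = 0  # samples discarded for being too stale

    def put(self, rollout: Rollout) -> None:
        # The rollout side never blocks on the trainer in this sketch.
        self._queue.append(rollout)

    def get_batch(self, trainer_version: int, batch_size: int) -> List[Rollout]:
        # Pop samples in arrival order; keep those within the staleness bound.
        batch = []
        while self._queue and len(batch) < batch_size:
            rollout = self._queue.popleft()
            lag = trainer_version - rollout.policy_version
            if lag <= self.max_staleness:
                batch.append(rollout)
            else:
                self.dropped += 1
        return batch


if __name__ == "__main__":
    buf = StalenessBoundedBuffer(max_staleness=1)
    for version in range(4):
        buf.put(Rollout(version, f"sample-{version}"))
    batch = buf.get_batch(trainer_version=3, batch_size=4)
    print([r.policy_version for r in batch], "dropped:", buf.dropped)
```

Under these assumptions, `max_staleness=0` accepts only samples from the trainer's current policy version, while larger values let generation run further ahead of training at the cost of more off-policy data.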

Code & Models

Repositories

rednote-ai/Relax (GitHub)
