WeChat-YATT: A Scalable, Simple, Efficient, and Production Ready Training Library

Junyu Wu; Weiming Chang; Xiaotao Liu; Guanyou He; Tingfeng Xian; Haoqiang Hong; Boqi Chen; Hongtao Tian; Tao Yang; Yunsheng Shi; Feng Lin; Ting Yao; Jiatao Xu

arXiv:2508.07970 · cs.LG · August 19, 2025

TL;DR

WeChat-YATT is a scalable, efficient RLHF training framework that improves orchestration and resource utilization for large multimodal models, demonstrated through extensive experiments and real-world deployment.

Contribution

Introduces a novel parallel controller programming model and dynamic resource placement schema to enhance scalability and efficiency in RLHF training workflows.

Findings

01. Significant throughput improvements over existing frameworks.

02. Effective reduction of hardware idle time and better GPU utilization.

03. Successful deployment for large-scale real-world applications.

Abstract

Reinforcement Learning from Human Feedback (RLHF) has emerged as a prominent paradigm for training large language models and multimodal systems. Despite the notable advances enabled by existing RLHF training frameworks, significant challenges remain in scaling to complex multimodal workflows and adapting to dynamic workloads. In particular, current systems often encounter limitations in controller scalability when managing large models, as well as inefficiencies in orchestrating intricate RLHF pipelines, especially in scenarios that require dynamic sampling and resource allocation. In this paper, we introduce WeChat-YATT (Yet Another Transformer Trainer in WeChat), a simple, scalable, and balanced RLHF training framework specifically designed to address these challenges. WeChat-YATT features a parallel controller programming model that enables flexible and efficient orchestration of…
