Highly Parallelized Reinforcement Learning Training with Relaxed Assignment Dependencies
Zhouyu He, Peng Qiao, Rongchun Li, Yong Dou, Yusong Tan

TL;DR
This paper introduces TianJi, a high-throughput distributed reinforcement learning system that relaxes assignment dependencies, enabling asynchronous communication and achieving significant speedups in training convergence and data transmission efficiency.
Contribution
According to the authors, TianJi is the first system to relax assignment dependencies in DRL training, improving parallelization, scalability, and convergence speed while maintaining convergence guarantees.
Findings
Converges up to 4.37x faster.
Scales to 8 nodes with a 1.6x convergence speedup and a 7.13x throughput speedup.
Outperforms existing systems in data transmission efficiency.
Abstract
As the demand for stronger agents grows, training with Deep Reinforcement Learning (DRL) becomes more complex, so accelerating DRL training has become a major research focus. Dividing the DRL training process into subtasks and running them in parallel can effectively reduce training costs. However, current DRL training systems are insufficiently parallelized because of data assignment dependencies between subtask components. This assignment issue has been overlooked, yet addressing it can further boost training efficiency. We therefore propose TianJi, a high-throughput distributed RL training system. It relaxes assignment dependencies between subtask components and enables event-driven asynchronous communication, while maintaining clear boundaries between subtask components. To address the convergence uncertainty that relaxed assignment dependencies introduce, TianJi proposes a…
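The general idea of event-driven asynchronous communication between subtask components can be illustrated with a small sketch. This is not TianJi's implementation, which the abstract does not describe in detail; it is a minimal producer/consumer example using Python's standard `queue` and `threading` modules. The names `sampler`, `learner`, and `run_demo` are hypothetical, and the tuples stand in for real transitions. Samplers push data without waiting for the learner, and the learner wakes only when data arrives.

```python
import queue
import threading


def sampler(worker_id, steps, out_queue):
    """Hypothetical sampler: emits transitions without waiting for the learner."""
    for step in range(steps):
        out_queue.put((worker_id, step))  # placeholder for a real transition
    out_queue.put(None)  # sentinel: this sampler has finished


def learner(in_queue, num_samplers, batch_size, batches):
    """Hypothetical learner: driven by data arrival, collects fixed-size batches."""
    finished = 0
    batch = []
    while finished < num_samplers:
        item = in_queue.get()  # blocks until some sampler delivers an item
        if item is None:
            finished += 1
            continue
        batch.append(item)
        if len(batch) == batch_size:
            batches.append(batch)  # a real learner would run a gradient step here
            batch = []
    if batch:
        batches.append(batch)  # keep any trailing partial batch


def run_demo(num_samplers=4, steps=25, batch_size=10):
    """Run samplers and one learner concurrently; return the collected batches."""
    q = queue.Queue()
    batches = []
    consumer = threading.Thread(
        target=learner, args=(q, num_samplers, batch_size, batches)
    )
    producers = [
        threading.Thread(target=sampler, args=(i, steps, q))
        for i in range(num_samplers)
    ]
    consumer.start()
    for t in producers:
        t.start()
    for t in producers:
        t.join()
    consumer.join()
    return batches
```

In this sketch the order in which transitions reach the learner depends on thread scheduling, which is the kind of nondeterminism that relaxed dependencies introduce. The queue is the only coupling between the two component types, so their boundaries stay explicit.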
Taxonomy
Topics: Reinforcement Learning in Robotics · Software-Defined Networks and 5G · Domain Adaptation and Few-Shot Learning
