RL over Commodity Networks: Overcoming the Bandwidth Barrier with Lossless Sparse Deltas
Chaoyi Ruan, Geng Luo, Xinyi Wan, Long Zhao, Qinghe Wang, Jiaan Zhu, Duling Xu, Guanbin Xu, Dehui Wei, Xiang Liu, Cheng Li, Haifeng Sun, Congcong Miao, Jialin Li

TL;DR
This paper introduces SparrowRL, a system that enables efficient reinforcement learning fine-tuning of large language models over commodity networks by transmitting only sparse, lossless parameter deltas, significantly reducing bandwidth and increasing throughput.
Contribution
SparrowRL is a novel system that leverages the sparsity of RL updates to enable high-performance training over commodity network links without information loss.
Findings
Reduces per-step data transfer by 79× for 8B models.
Improves throughput by 2.4–9.5× over full-weight broadcast.
Achieves throughput close to an RDMA baseline, with higher tokens per dollar.
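To put the 79× figure in context, here is a rough back-of-envelope sketch of per-step transfer time. The 2-byte (bf16) weight format and the 1 Gbps link speed are illustrative assumptions and do not come from the paper; only the 8B model size and the 79× reduction are taken from the findings above.

```python
def transfer_seconds(num_bytes: float, link_gbps: float) -> float:
    """Ideal time to move num_bytes over a link, ignoring protocol overhead."""
    return num_bytes * 8 / (link_gbps * 1e9)

PARAMS = 8e9           # 8B-parameter model (from the findings)
BYTES_PER_PARAM = 2    # assumption: bf16 weights
LINK_GBPS = 1.0        # assumption: a 1 Gbps commodity link
REDUCTION = 79         # per-step transfer reduction reported above

full_bytes = PARAMS * BYTES_PER_PARAM
full_time = transfer_seconds(full_bytes, LINK_GBPS)
delta_time = transfer_seconds(full_bytes / REDUCTION, LINK_GBPS)

print(f"full-weight broadcast: {full_time:.1f} s")
print(f"sparse delta:          {delta_time:.2f} s")
```

Under these assumptions a full-weight broadcast takes roughly two minutes, in line with the "over 100 seconds" quoted in the abstract, while the reduced payload takes under two seconds.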
Abstract
LLM post-training with reinforcement learning (RL) requires frequent synchronization of large model parameters between the trainer and distributed rollout actors. High-throughput RL post-training therefore relies on dedicated RDMA HPC clusters, an infrastructure cost most organizations cannot absorb. A natural alternative is to aggregate loosely-coupled GPUs over standard Ethernet and WAN links, but this commodity connectivity cannot sustain full-weight broadcasts: synchronizing an 8B model can take over 100 seconds on bandwidth-limited links, while rollout generation typically takes tens of seconds. Toward making RL practical in this regime, we observe that RL fine-tuning yields highly sparse per-step updates, with only around 1% of parameter elements changing. Atop this insight, we present SparrowRL, a novel high-performance RL training system that preserves bit-exact updates…
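The idea of a lossless sparse delta can be illustrated with a minimal sketch. This is not SparrowRL's actual encoding, which the truncated abstract does not describe. The function names `encode_delta` and `apply_delta`, the float32 dtype, and the plain index-plus-value layout are all assumptions made for illustration. The sender ships only the positions whose bit patterns changed, together with their new bits, and the receiver patches its copy to get a bit-exact result.

```python
import numpy as np

def encode_delta(old: np.ndarray, new: np.ndarray):
    """Return (indices, new_bits) for elements whose bit pattern changed.

    Assumes contiguous float32 arrays of equal shape. Comparing uint32 views
    keeps the comparison bitwise, so -0.0 vs 0.0 and NaN payloads are handled.
    """
    old_bits = old.view(np.uint32).ravel()
    new_bits = new.view(np.uint32).ravel()
    idx = np.flatnonzero(old_bits != new_bits).astype(np.uint32)
    return idx, new_bits[idx].copy()

def apply_delta(old: np.ndarray, idx: np.ndarray, values: np.ndarray) -> np.ndarray:
    """Rebuild the new weights from the old weights plus a sparse delta."""
    bits = old.view(np.uint32).ravel().copy()
    bits[idx] = values
    return bits.view(np.float32).reshape(old.shape)

# Hypothetical example: about 1% of 100,000 weights change in one step.
rng = np.random.default_rng(0)
old = rng.standard_normal(100_000).astype(np.float32)
new = old.copy()
changed = rng.choice(old.size, size=old.size // 100, replace=False)
new[changed] += np.float32(1e-3)

idx, values = encode_delta(old, new)
rebuilt = apply_delta(old, idx, values)

payload_bytes = idx.nbytes + values.nbytes
print(f"changed elements: {idx.size}, payload: {payload_bytes} B, full: {new.nbytes} B")
```

Because the receiver overwrites raw bits rather than adding a floating-point difference, the reconstruction involves no rounding. A real system would likely also compress the index stream, which this sketch leaves out.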
Taxonomy
Topics: Software-Defined Networks and 5G · Advanced Optical Network Technologies · Network Time Synchronization Technologies
