Seer: Online Context Learning for Fast Synchronous LLM Reinforcement Learning

Ruoyu Qin; Weiran He; Weixiao Huang; Yangkun Zhang; Yikai Zhao; Bo Pang; Xinran Xu; Yingdi Shan; Yongwei Wu; Mingxing Zhang

arXiv:2511.14617 · cs.DC · April 6, 2026

TL;DR

Seer is a synchronous RL system for LLMs that reduces rollout long-tail latency and improves throughput by exploiting similarities among requests that share a prompt, using coordinated load balancing, scheduling, and speculative decoding.

Contribution

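Seer introduces a context-aware RL system with three coordinated techniques that address performance bottlenecks in synchronous LLM reinforcement learning: divided rollout, context-aware scheduling, and adaptive grouped speculative decoding.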

Findings

01. Achieves up to 2.04× throughput improvement over state-of-the-art systems.

02. Reduces long-tail latency by 72-94%.

03. Balances workload through divided rollout and accelerates generation through adaptive grouped speculative decoding.

Abstract

Reinforcement Learning (RL) has emerged as a critical technique for advancing modern Large Language Models (LLMs), yet existing synchronous RL systems face severe performance bottlenecks. The rollout phase, which dominates end-to-end iteration time, suffers from substantial long-tail latency and poor resource utilization due to inherent workload imbalance. We present Seer, a novel context learning RL system that addresses these challenges through a key observation: requests sharing the same prompt exhibit strong similarities in output lengths and response patterns. Leveraging this insight, Seer introduces three coordinated techniques: (1) divided rollout for dynamic load balancing, (2) context-aware scheduling to mitigate long-tail request delays, and (3) adaptive grouped speculative decoding to accelerate generation. These mechanisms work in concert to markedly reduce long-tail latency…
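
The key observation above (requests sharing a prompt tend to produce outputs of similar length) can be illustrated with a small scheduling sketch. This is not Seer's implementation: the abstract only names the technique, so the class `GroupLengthEstimator`, the function `schedule_longest_first`, the use of the longest finished sibling as the estimate, and the `default_length` fallback are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Request:
    request_id: str
    prompt_id: str  # requests with the same prompt_id form one group


class GroupLengthEstimator:
    """Tracks finished output lengths per prompt group (hypothetical helper)."""

    def __init__(self, default_length: int) -> None:
        # Assumed fallback for groups with no finished request yet.
        self.default_length = default_length
        self._finished = defaultdict(list)

    def record(self, prompt_id: str, output_length: int) -> None:
        """Store the output length of a finished request in its group."""
        self._finished[prompt_id].append(output_length)

    def estimate(self, prompt_id: str) -> int:
        """Estimate the output length of an unfinished request in this group."""
        lengths = self._finished.get(prompt_id)
        if not lengths:
            # No sibling has finished yet: fall back to the default.
            return self.default_length
        # Assumption: the longest finished sibling is the estimate.
        return max(lengths)


def schedule_longest_first(pending, estimator):
    """Order pending requests by estimated output length, longest first.

    Starting likely-long requests early is one plausible way to keep them
    from becoming the tail of a rollout iteration.
    """
    return sorted(
        pending,
        key=lambda r: estimator.estimate(r.prompt_id),
        reverse=True,
    )


if __name__ == "__main__":
    est = GroupLengthEstimator(default_length=1000)
    est.record("p1", 200)
    est.record("p1", 350)
    est.record("p2", 4000)
    pending = [Request("a", "p1"), Request("b", "p2"), Request("c", "p3")]
    for req in schedule_longest_first(pending, est):
        print(req.request_id, est.estimate(req.prompt_id))
```

In this sketch, a group with no finished request falls back to `default_length`; how such groups are actually handled is not stated in the abstract, so the fallback value should be read as a placeholder.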
