Prefix Grouper: Efficient GRPO Training through Shared-Prefix Forward

Zikang Liu; Tongtian Yue; Yepeng Tang; Longteng Guo; Junxian Cai; Qingbin Liu; Xi Chen; Jing Liu

arXiv:2506.05433 · cs.LG · June 9, 2025

TL;DR

Prefix Grouper reduces the computational overhead of Group Relative Policy Optimization (GRPO) by encoding the shared prefix once instead of once per group member, which makes long-context training more scalable without altering the original optimization dynamics.

Contribution

The paper introduces a Shared-Prefix Forward strategy that restructures self-attention to eliminate redundant prefix encoding in GRPO while maintaining training equivalence and improving scalability.
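The idea of encoding the prefix once can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes a single causal attention head with hypothetical projection matrices `wq`, `wk`, `wv`, and the function names are invented for illustration. Prefix tokens attend only to the prefix, while each group member's suffix tokens attend to the shared prefix keys and values plus their own.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def causal_attention(q, k, v, offset=0):
    """Single-head attention; query i sits at position offset + i
    and may only see keys at positions <= offset + i."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    q_pos = offset + np.arange(q.shape[0])[:, None]
    k_pos = np.arange(k.shape[0])[None, :]
    scores = np.where(k_pos <= q_pos, scores, -np.inf)
    return softmax(scores) @ v

def repeated_prefix_forward(x_prefix, x_suffixes, wq, wk, wv):
    """Baseline: re-encode the prefix for every group member."""
    outs = []
    for x_suf in x_suffixes:
        x = np.concatenate([x_prefix, x_suf], axis=0)
        outs.append(causal_attention(x @ wq, x @ wk, x @ wv))
    return np.stack(outs)  # shape (G, P + R, d)

def shared_prefix_forward(x_prefix, x_suffixes, wq, wk, wv):
    """Sketch: prefix keys/values are computed once and reused."""
    k_p, v_p = x_prefix @ wk, x_prefix @ wv
    prefix_out = causal_attention(x_prefix @ wq, k_p, v_p)
    suffix_outs = []
    for x_suf in x_suffixes:
        k = np.concatenate([k_p, x_suf @ wk], axis=0)
        v = np.concatenate([v_p, x_suf @ wv], axis=0)
        suffix_outs.append(
            causal_attention(x_suf @ wq, k, v, offset=len(x_prefix)))
    return prefix_out, np.stack(suffix_outs)  # (P, d), (G, R, d)

# Toy sizes (assumed): prefix length P, suffix length R, group size G, width d.
rng = np.random.default_rng(0)
P, R, G, d = 6, 3, 4, 8
x_prefix = rng.normal(size=(P, d))
x_suffixes = rng.normal(size=(G, R, d))
wq, wk, wv = (rng.normal(size=(d, d)) for _ in range(3))

full = repeated_prefix_forward(x_prefix, x_suffixes, wq, wk, wv)
prefix_out, suffix_outs = shared_prefix_forward(x_prefix, x_suffixes, wq, wk, wv)
print(np.allclose(full[:, :P], prefix_out), np.allclose(full[:, P:], suffix_outs))
```

The check compares forward outputs only. The gradient equivalence the paper claims would need an autodiff framework to verify and is not demonstrated by this sketch.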

Findings

01. Achieves forward outputs and gradients identical to those of standard GRPO.

02. Reduces training computational cost in long-prefix scenarios.

03. Enables larger group sizes within the same computational budget.
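A back-of-envelope count can make the second and third findings concrete. The sketch below is not taken from the paper and reports no measurements. It assumes causal attention, a prefix of `prefix_len` tokens, `group_size` responses of `suffix_len` tokens each, and it only counts tokens passed through the network and query-key pairs scored in attention. The function names are hypothetical.

```python
def causal_pairs(n):
    """Query-key pairs scored by causal attention over n tokens."""
    return n * (n + 1) // 2

def repeated_prefix_cost(prefix_len, suffix_len, group_size):
    """Baseline: every group member re-encodes prefix + suffix."""
    seq = prefix_len + suffix_len
    return {
        "tokens": group_size * seq,
        "attn_pairs": group_size * causal_pairs(seq),
    }

def shared_prefix_cost(prefix_len, suffix_len, group_size):
    """Shared prefix: the prefix is encoded once; each suffix token
    attends to the whole prefix plus the earlier tokens of its own suffix."""
    suffix_pairs = suffix_len * prefix_len + causal_pairs(suffix_len)
    return {
        "tokens": prefix_len + group_size * suffix_len,
        "attn_pairs": causal_pairs(prefix_len) + group_size * suffix_pairs,
    }

# Assumed example sizes: a long prefix with short responses.
base = repeated_prefix_cost(prefix_len=4096, suffix_len=256, group_size=8)
shared = shared_prefix_cost(prefix_len=4096, suffix_len=256, group_size=8)
print(base["tokens"], shared["tokens"])
print(base["attn_pairs"] / shared["attn_pairs"])
```

Under these assumptions the two counts coincide for a group of one, and the gap widens as the prefix grows relative to the responses or as the group gets larger. Real speedups depend on kernels, memory traffic, and implementation details that this count ignores.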

Abstract

Group Relative Policy Optimization (GRPO) enhances policy learning by computing gradients from relative comparisons among candidate outputs that share a common input prefix. Despite its effectiveness, GRPO introduces substantial computational overhead when processing long shared prefixes, which must be redundantly encoded for each group member. This inefficiency becomes a major scalability bottleneck in long-context learning scenarios. We propose Prefix Grouper, an efficient GRPO training algorithm that eliminates redundant prefix computation via a Shared-Prefix Forward strategy. In particular, by restructuring self-attention into two parts, our method enables the shared prefix to be encoded only once, while preserving full differentiability and compatibility with end-to-end training. We provide both theoretical and empirical evidence that Prefix Grouper is training-equivalent to…

Taxonomy

Topics: Reinforcement Learning in Robotics · Machine Learning in Materials Science · Big Data and Digital Economy