Advances in GRPO for Generation Models: A Survey
Zexiang Liu, Xianglong He, Yangguang Li

TL;DR
This survey reviews Flow-GRPO, a reinforcement learning framework for aligning large-scale generative models with human preferences across multiple modalities, highlighting methodological advances and diverse applications.
Contribution
The survey gives a comprehensive overview of Flow-GRPO's development, its methodological improvements, and its applications across generative paradigms and modalities.
Findings
Flow-GRPO enables stable reinforcement learning alignment for generative models.
Methodological advances include reward signal design, diversity preservation, and reward hacking mitigation.
Applications span text-to-image, video, speech, 3D, and multimodal systems.
Abstract
Large-scale flow matching models have achieved strong performance across generative tasks such as text-to-image, video, 3D, and speech synthesis. However, aligning their outputs with human preferences and task-specific objectives remains challenging. Flow-GRPO extends Group Relative Policy Optimization (GRPO) to generation models, enabling stable reinforcement learning alignment for generative systems. Since its introduction, Flow-GRPO has triggered rapid research growth, spanning methodological refinements and diverse application domains. This survey provides a comprehensive review of Flow-GRPO and its subsequent developments. We organize existing work along two primary dimensions. First, we analyze methodological advances beyond the original framework, including reward signal design, credit assignment, sampling efficiency, diversity preservation, reward hacking mitigation, and reward…
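As a rough illustration of the "group relative" idea that GRPO is named for, the sketch below normalizes the rewards of several samples drawn for the same prompt against that group's own mean and standard deviation. This is a minimal sketch under stated assumptions: the function name `group_relative_advantages`, the `eps` constant, and the example reward values are hypothetical, and the abstract does not describe how Flow-GRPO applies these advantages during flow matching sampling.

```python
import math


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples for the same prompt.

    Hypothetical helper for illustration only: each sample's advantage is
    its reward minus the group mean, divided by the group standard
    deviation, so no separate learned value function is needed.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    # Population variance over the group.
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = math.sqrt(var)
    # eps avoids division by zero when every sample gets the same reward.
    return [(r - mean) / (std + eps) for r in rewards]


# Made-up rewards for four generations from a single prompt.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
```

Samples that score above the group average receive positive advantages and those below receive negative ones, so the update signal depends only on how a sample ranks relative to its siblings.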
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Music Technology and Sound Studies
