Advances in GRPO for Generation Models: A Survey

Zexiang Liu; Xianglong He; Yangguang Li

arXiv:2603.06623 · cs.LG · March 10, 2026

TL;DR

This survey reviews Flow-GRPO, a reinforcement learning framework for aligning large-scale generative models with human preferences across multiple modalities, highlighting methodological advances and diverse applications.

Contribution

It provides a comprehensive overview of Flow-GRPO's development, methodological improvements, and its application across various generative paradigms and modalities.

Findings

01. Flow-GRPO enables stable reinforcement learning alignment for generative models.

02. Methodological advances include reward signal design, diversity preservation, and mitigation of reward hacking.

03. Applications span text-to-image, video, speech, 3D, and multimodal systems.

Abstract

Large-scale flow matching models have achieved strong performance across generative tasks such as text-to-image, video, 3D, and speech synthesis. However, aligning their outputs with human preferences and task-specific objectives remains challenging. Flow-GRPO extends Group Relative Policy Optimization (GRPO) to generation models, enabling stable reinforcement learning alignment for generative systems. Since its introduction, Flow-GRPO has triggered rapid research growth, spanning methodological refinements and diverse application domains. This survey provides a comprehensive review of Flow-GRPO and its subsequent developments. We organize existing work along two primary dimensions. First, we analyze methodological advances beyond the original framework, including reward signal design, credit assignment, sampling efficiency, diversity preservation, reward hacking mitigation, and reward…
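
For readers unfamiliar with the "group relative" part of GRPO, the following is a minimal sketch of the advantage computation as GRPO is commonly described: several samples are drawn for the same prompt, each is scored by a reward function, and each reward is standardized against the mean and standard deviation of its own group. The abstract does not give Flow-GRPO's exact formulation, so this illustrates the general GRPO idea only; the function name `group_relative_advantages`, the `eps` stabilizer, and the example reward values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of samples for the same prompt.

    Hypothetical helper: a sample's advantage is its reward minus the group
    mean, divided by the group standard deviation (plus eps to avoid
    division by zero when all rewards are equal).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    return [(r - mu) / (sigma + eps) for r in rewards]


# Illustrative rewards for four generations from one prompt (made-up values).
group_rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(group_rewards)
```

Samples that score above their group's mean receive positive advantages and those below receive negative ones, so no separately learned value function is needed as a baseline.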

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Music Technology and Sound Studies