DanceGRPO: Unleashing GRPO on Visual Generation

Zeyue Xue; Jie Wu; Yu Gao; Fangyuan Kong; Lingting Zhu; Mengzhao Chen; Zhiheng Liu; Wei Liu; Qiushan Guo; Weilin Huang; Ping Luo

arXiv:2505.07818·cs.CV·August 29, 2025

Open Access · 1 Repo

TL;DR

DanceGRPO is a reinforcement learning framework for visual content generation that remains stable across diverse tasks and models while aligning outputs with human preferences, outperforming baseline methods by up to 181% on benchmarks.

Contribution

The paper adapts Group Relative Policy Optimization (GRPO) for visual generation, overcoming stability issues of prior RL methods and demonstrating broad applicability and superior performance.

Findings

01 Outperforms baseline methods by up to 181% on benchmarks
02 Maintains stability across diffusion models and rectified flows
03 Effectively optimizes for diverse human preferences

Abstract

Recent advances in generative AI have revolutionized visual content creation, yet aligning model outputs with human preferences remains a critical challenge. While Reinforcement Learning (RL) has emerged as a promising approach for fine-tuning generative models, existing methods like DDPO and DPOK face fundamental limitations, particularly their inability to maintain stable optimization when scaling to large and diverse prompt sets, severely restricting their practical utility. This paper presents DanceGRPO, a framework that addresses these limitations through an innovative adaptation of Group Relative Policy Optimization (GRPO) for visual generation tasks. Our key insight is that GRPO's inherent stability mechanisms uniquely position it to overcome the optimization challenges that plague prior RL-based approaches on visual generation. DanceGRPO establishes several significant…
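To make the "group relative" idea in GRPO concrete, the sketch below normalizes reward scores within a group of samples generated from the same prompt, so each sample is judged against its siblings rather than against a learned value baseline. This is a minimal illustration, not the paper's implementation: the function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the reward values are all assumptions made for the example.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize rewards within one group of samples for the same prompt.

    Hypothetical helper: subtracts the group mean and divides by the group
    standard deviation (population form, an assumption), so samples that
    score above the group average get positive advantages.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps guards against division by zero when all rewards are identical.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up reward-model scores for four images generated from one prompt.
rewards = [0.2, 0.5, 0.9, 0.4]
advantages = group_relative_advantages(rewards)
```

In this sketch the highest-scoring sample receives the largest positive advantage and the advantages within a group sum to approximately zero, which is what removes the need for a separate value network.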

Code & Models

Repositories

xuezeyue/dancegrpo
pytorch

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications · Music Technology and Sound Studies

Methods: Diffusion · Contrastive Language-Image Pre-training