DiverseGRPO: Mitigating Mode Collapse in Image Generation via Diversity-Aware GRPO
Henglin Liu, Huijuan Huang, Jing Wang, Chang Liu, Xiu Li, Xiangyang Ji

TL;DR
DiverseGRPO mitigates mode collapse in GRPO-based image generation by combining diversity-aware regularization with distributional rewards, significantly enhancing visual diversity without sacrificing quality.
Contribution
The paper proposes a distributional creativity bonus and structure-aware regularization to improve diversity in GRPO-based image generation, addressing mode collapse.
Findings
Achieves 13-18% improvement in semantic diversity at matched quality levels.
Establishes a new Pareto frontier between image quality and diversity.
Effectively mitigates mode collapse in GRPO through novel regularization techniques.
Abstract
Reinforcement learning (RL), particularly GRPO, improves image generation quality significantly by comparing the relative performance of images generated within the same group. However, in the later stages of training, the model tends to produce homogenized outputs, lacking creativity and visual diversity, which restricts its application scenarios. This issue can be analyzed from both reward modeling and generation dynamics perspectives. First, traditional GRPO relies on single-sample quality as the reward signal, driving the model to converge toward a few high-reward generation modes while neglecting distribution-level diversity. Second, conventional GRPO regularization neglects the dominant role of early-stage denoising in preserving diversity, causing a misaligned regularization budget that limits the achievable quality-diversity trade-off. Motivated by these insights, we revisit…
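The group-relative comparison that the abstract attributes to GRPO can be illustrated with a minimal sketch. It shows only the commonly used group normalization of per-sample rewards (subtract the group mean, divide by the group standard deviation); it is not the paper's diversity-aware reward, whose details are not given here. The function name `group_relative_advantages`, the `eps` parameter, and the reward values are assumptions made for illustration.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Score each sample relative to the other samples in its group.

    Hypothetical helper: standard mean/std normalization within one group,
    not the DiverseGRPO reward itself.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    # eps avoids division by zero when all rewards in the group are equal
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up quality scores for four images sampled from the same prompt.
advantages = group_relative_advantages([0.9, 0.7, 0.5, 0.3])

# A group whose samples all receive the same reward yields zero advantages.
collapsed = group_relative_advantages([0.8, 0.8, 0.8, 0.8])
```

Because each advantage depends only on a sample's own quality score relative to its group, nothing in this signal rewards the group for covering different modes, which is consistent with the first issue the abstract raises.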
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Domain Adaptation and Few-Shot Learning
