Towards Self-Improvement of Diffusion Models via Group Preference Optimization

Renjie Chen; Wenfeng Lin; Yichen Zhang; Jiangchuan Wei; Boyuan Liu; Chao Feng; Jiao Ran; Mingyu Guo

arXiv:2505.11070 · cs.CV · May 19, 2025

Open Access

TL;DR

This paper introduces Group Preference Optimization (GPO), a self-improvement technique for diffusion models that enhances performance by leveraging model capabilities without external data, addressing limitations of existing preference-based methods.

Contribution

The paper proposes GPO, a novel self-improvement method that extends DPO to group preferences and uses reward standardization, improving diffusion models without additional data or inference overhead.

Findings

1. GPO improves diffusion model performance across various tasks.

2. Extending DPO to group preferences enhances robustness.

3. GPO achieves significant accuracy gains in computer vision applications.

Abstract

Aligning text-to-image (T2I) diffusion models with Direct Preference Optimization (DPO) has shown notable improvements in generation quality. However, applying DPO to T2I faces two challenges: the sensitivity of DPO to preference pairs and the labor-intensive process of collecting and annotating high-quality data. In this work, we demonstrate that preference pairs with marginal differences can degrade DPO performance. Because DPO relies exclusively on the relative ranking of a pair and disregards the absolute difference between its samples, it may misclassify losing samples as wins, or vice versa. We empirically show that extending DPO from pairwise to groupwise comparison and incorporating reward standardization for reweighting leads to performance gains without explicit data selection. Furthermore, we propose Group Preference Optimization (GPO), an effective self-improvement method that enhances performance by…
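The contrast the abstract draws between pairwise ranking and groupwise reweighting can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the `beta` parameter, and the exact form of `group_weighted_objective` are assumptions made for illustration, and the log-ratios stand in for log(pi_theta / pi_ref) of each generated sample. The pairwise loss is the standard DPO form; the groupwise loss simply weights each sample's log-ratio by its standardized reward.

```python
import math
from statistics import mean, pstdev


def _softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def standardize_rewards(rewards, eps=1e-8):
    """Standardize rewards within one group: (r - mean) / (std + eps)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


def pairwise_dpo_loss(log_ratio_win, log_ratio_lose, beta=1.0):
    """Standard DPO loss for one pair: -log(sigmoid(beta * (win - lose))).

    Only the ranking (which sample is the winner) enters the loss;
    the size of the reward gap between the two samples does not.
    """
    return _softplus(-beta * (log_ratio_win - log_ratio_lose))


def group_weighted_objective(log_ratios, rewards, beta=1.0):
    """Hypothetical groupwise loss (an assumption, not the paper's formula).

    Each sample's log-ratio is weighted by its standardized reward, so
    samples near the group mean contribute almost nothing.
    """
    weights = standardize_rewards(rewards)
    score = beta * sum(w * lr for w, lr in zip(weights, log_ratios))
    return _softplus(-score)


if __name__ == "__main__":
    # Hypothetical reward scores for four images generated from one prompt.
    rewards = [0.90, 0.52, 0.50, 0.10]
    for r, w in zip(rewards, standardize_rewards(rewards)):
        print(f"reward={r:.2f}  weight={w:+.3f}")
```

In this sketch the two samples with nearly identical rewards (0.52 and 0.50) receive weights close to zero, so a marginal difference between them barely influences the update, whereas a pairwise loss on that same pair would treat one as a clear winner.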

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image and Video Quality Assessment · Advanced Multi-Objective Optimization Algorithms

Methods: Diffusion · Direct Preference Optimization