ImageReFL: Balancing Quality and Diversity in Human-Aligned Diffusion Models

Dmitrii Sorokin; Maksim Nakhodnov; Andrey Kuznetsov; Aibek Alanov

arXiv:2505.22569 · cs.CV · May 29, 2025

TL;DR

This paper introduces combined generation, a new sampling strategy, and ImageReFL, a fine-tuning method, to address the trade-off between alignment and diversity in human-aligned diffusion models.

Contribution

The paper presents combined generation and ImageReFL, two techniques that improve the diversity of diffusion models aligned with human preferences with minimal loss in quality, while helping to retain global structure.

Findings

01. Outperforms conventional reward tuning on quality and diversity metrics

02. User study confirms better balance of human preference and visual diversity

03. Mitigates early-stage overfitting to preserve global structure

Abstract

Recent advances in diffusion models have led to impressive image generation capabilities, but aligning these models with human preferences remains challenging. Reward-based fine-tuning using models trained on human feedback improves alignment but often harms diversity, producing less varied outputs. In this work, we address this trade-off with two contributions. First, we introduce combined generation, a novel sampling strategy that applies a reward-tuned diffusion model only in the later stages of the generation process, while preserving the base model for earlier steps. This approach mitigates early-stage overfitting and helps retain global structure and diversity. Second, we propose ImageReFL, a fine-tuning method that improves image diversity with minimal loss in quality by training on real images and incorporating multiple regularizers, including diffusion and…
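
The combined generation idea described in the abstract (base model for the early denoising steps, reward-tuned model for the later ones) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `combined_generation`, the `switch_fraction` parameter, and the list-based sample representation are assumptions made for illustration, and the abstract does not state where the switch point should lie.

```python
from typing import Callable, List, Sequence, Tuple

# A denoising step maps (current sample, timestep) -> next sample.
# Real diffusion models operate on tensors; plain lists keep this
# sketch dependency-free.
Step = Callable[[List[float], int], List[float]]


def combined_generation(
    x_init: Sequence[float],
    timesteps: Sequence[int],
    base_step: Step,
    tuned_step: Step,
    switch_fraction: float = 0.5,  # hypothetical knob, not a value from the paper
) -> Tuple[List[float], List[str]]:
    """Run base_step on the first part of the schedule, tuned_step on the rest.

    Returns the final sample and a log of which model handled each step.
    """
    if not 0.0 <= switch_fraction <= 1.0:
        raise ValueError("switch_fraction must lie in [0, 1]")

    # Number of early steps handled by the base (non-reward-tuned) model.
    n_base = int(len(timesteps) * switch_fraction)

    x = list(x_init)
    schedule: List[str] = []
    for i, t in enumerate(timesteps):
        if i < n_base:
            x = base_step(x, t)   # early steps: base model
            schedule.append("base")
        else:
            x = tuned_step(x, t)  # later steps: reward-tuned model
            schedule.append("tuned")
    return x, schedule
```

In this sketch, `switch_fraction=1.0` reduces to sampling with the base model only and `switch_fraction=0.0` to sampling with the reward-tuned model only; intermediate values hand over from one to the other partway through the schedule.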

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Enhancement Techniques · Visual Attention and Saliency Detection