Direct Diffusion Score Preference Optimization via Stepwise Contrastive Policy-Pair Supervision

Dohyun Kim; Seungwoo Lyu; Seung Wook Kim; Paul Hongsuck Seo

arXiv:2512.23426 · cs.CV · December 30, 2025

TL;DR

This paper introduces DDSPO, a method that aligns diffusion model outputs with user preferences through dense, per-timestep supervision. Preference signals are generated automatically from a pretrained reference model, which reduces reliance on manual labels.

Contribution

DDSPO provides a new approach for preference optimization in diffusion models by automatically generating dense supervision signals from a pretrained model, improving alignment without manual annotations.

Findings

01. Outperforms existing preference-based methods in text-image alignment.

02. Requires less supervision than prior approaches.

03. Enhances visual quality of generated images.

Abstract

Diffusion models have achieved impressive results in generative tasks such as text-to-image synthesis, yet they often struggle to fully align outputs with nuanced user intent and maintain consistent aesthetic quality. Existing preference-based training methods like Diffusion Direct Preference Optimization help address these issues but rely on costly and potentially noisy human-labeled datasets. In this work, we introduce Direct Diffusion Score Preference Optimization (DDSPO), which directly derives per-timestep supervision from winning and losing policies when such policies are available. Unlike prior methods that operate solely on final samples, DDSPO provides dense, transition-level signals across the denoising trajectory. In practice, we avoid reliance on labeled data by automatically generating preference signals using a pretrained reference model: we contrast its outputs when…
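To make the idea of dense, per-timestep supervision concrete, here is a minimal sketch of a stepwise contrastive loss. It is an illustration under stated assumptions, not the objective from the paper: the logistic form, the squared-distance margin, the `beta` scale, and every function name below are hypothetical. The winning and losing predictions are treated as given inputs, since the truncated abstract does not say how they are produced.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def stepwise_preference_loss(
    pred: Vector, win: Vector, lose: Vector, beta: float = 1.0
) -> float:
    """Hypothetical per-timestep contrastive loss (an assumption, not the paper's).

    pred: the trained model's prediction at one denoising step
    win:  the winning policy's prediction at the same step
    lose: the losing policy's prediction at the same step
    """
    # Positive margin means pred is closer to the winning prediction.
    margin = squared_distance(pred, lose) - squared_distance(pred, win)
    # -log(sigmoid(beta * margin)) equals softplus(-beta * margin).
    return softplus(-beta * margin)


def trajectory_loss(
    preds: Sequence[Vector],
    wins: Sequence[Vector],
    loses: Sequence[Vector],
    beta: float = 1.0,
) -> float:
    """Average the per-step loss over all timesteps of one trajectory."""
    step_losses = [
        stepwise_preference_loss(p, w, l, beta)
        for p, w, l in zip(preds, wins, loses)
    ]
    return sum(step_losses) / len(step_losses)
```

The point of the sketch is the shape of the supervision: every denoising step contributes its own loss term, in contrast to methods that score only the final sample. A prediction that matches the winning policy yields a loss below log 2, and one that matches the losing policy yields a loss above it.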

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Machine Learning in Materials Science