When Preference Labels Fall Short: Aligning Diffusion Models from Real Data
Weiyan Chen, Weijian Deng, Yao Xiao, Weijie Tu, ZiYi Dong, Ibrahim Radwan, Liang Lin, Pengxu Wei

TL;DR
This paper investigates real data as a source of supervision for aligning diffusion models. It reports that this can be as effective as preference pairs built from generated images, while avoiding manual preference annotation.
Contribution
The paper introduces a data-centric curation strategy that treats real images as reference points for preference alignment, reducing reliance on manually annotated preference pairs.
Findings
Real-data supervision achieves comparable performance to preference-based methods.
Constructing preference signals from real images is effective without manual annotations.
The approach offers a practical, label-efficient alternative for model alignment.
Abstract
Preference alignment aims to guide generative models by learning from comparisons between preferred and non-preferred samples. In practice, most existing approaches rely on preference pairs constructed from model-generated images. Such supervision is inherently relative and can be ambiguous when both samples exhibit artifacts or limited visual quality, making it difficult to infer what constitutes a truly desirable output. In this work, we investigate whether real data can serve as an alternative source of supervision for preference alignment. We adopt a data-centric perspective and study a curation strategy that treats real images as reference points and constructs preference signals by contrasting them with generated or perturbed samples, without requiring manually annotated preference pairs. Through empirical analysis, we show that real-data-based supervision provides effective…
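The curation idea in the abstract (real images as reference points, contrasted with generated or perturbed samples) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' pipeline: the `PreferencePair` container, the `perturb` and `build_pairs` helpers, the use of additive Gaussian noise as the perturbation, and the representation of an image as a flat list of intensities in [0, 1] are all hypothetical.

```python
import random
from dataclasses import dataclass
from typing import List

Image = List[float]  # assumption: a flattened image with intensities in [0, 1]


@dataclass
class PreferencePair:
    prompt: str
    preferred: Image      # the real image, treated as the reference point
    dispreferred: Image   # a perturbed (or, in practice, generated) counterpart


def perturb(image: Image, noise_std: float, rng: random.Random) -> Image:
    """Hypothetical degradation: additive Gaussian noise, clipped to [0, 1]."""
    return [min(1.0, max(0.0, p + rng.gauss(0.0, noise_std))) for p in image]


def build_pairs(real_images: List[Image], prompts: List[str],
                noise_std: float = 0.2, seed: int = 0) -> List[PreferencePair]:
    """Build one preference pair per real image without any human labels.

    The real image is always marked as preferred; its perturbed copy is
    marked as dispreferred.
    """
    rng = random.Random(seed)
    return [
        PreferencePair(prompt, image, perturb(image, noise_std, rng))
        for prompt, image in zip(prompts, real_images)
    ]


# Toy data standing in for real captioned images.
real = [[0.1, 0.5, 0.9], [0.3, 0.3, 0.3]]
pairs = build_pairs(real, ["a red barn", "a grey wall"])
```

Pairs of this shape could then be passed to any pairwise preference objective. Which objective and which perturbations the paper actually uses is not stated in the truncated abstract above.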
