Calibrated Multi-Preference Optimization for Aligning Diffusion Models

Kyungmin Lee; Xiaohang Li; Qifei Wang; Junfeng He; Junjie Ke; Ming-Hsuan Yang; Irfan Essa; Jinwoo Shin; Feng Yang; Yinxiao Li

arXiv:2502.02588·cs.CV·September 29, 2025

TL;DR

This paper introduces CaPO, a novel method for aligning diffusion models with multiple preferences using reward calibration and Pareto frontier pair selection, eliminating the need for human annotations.
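The Pareto frontier pair selection mentioned above can be illustrated with a small sketch. It assumes each generated sample has one score per reward model, that preferred candidates come from the non-dominated (upper) frontier, and that dispreferred candidates come from the frontier of the negated scores. The function names `dominates`, `pareto_front`, and `select_pair_pools` are hypothetical and used only for illustration; this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is at least as good as `b` on every reward and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores: List[Sequence[float]]) -> List[int]:
    """Indices of samples that no other sample dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def select_pair_pools(scores: List[Sequence[float]]) -> Tuple[List[int], List[int]]:
    """Return (upper, lower) frontier indices.

    Assumption: preferred samples are drawn from the upper frontier and
    dispreferred samples from the lower frontier, which is computed here
    by negating every score and reusing `pareto_front`.
    """
    upper = pareto_front(scores)
    negated = [tuple(-v for v in s) for s in scores]
    lower = pareto_front(negated)
    return upper, lower


# Five samples scored by two hypothetical reward models.
scores = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.1, 0.1), (0.4, 0.4)]
upper, lower = select_pair_pools(scores)
```

In this toy example the first three samples trade off the two rewards without any one dominating another, so all three sit on the upper frontier, while the fourth sample is worse than every other sample on both rewards and forms the lower frontier alone.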

Contribution

The paper proposes Calibrated Preference Optimization (CaPO), a new approach that leverages multiple reward models and calibration techniques to improve preference alignment without human data.

Findings

01 CaPO outperforms prior methods such as DPO on T2I benchmarks.

02 It effectively handles multi-preference scenarios.

03 The method improves alignment accuracy and generalization.

Abstract

Aligning text-to-image (T2I) diffusion models with preference optimization is valuable for human-annotated datasets, but the heavy cost of manual data collection limits scalability. Using reward models offers an alternative; however, current preference optimization methods fall short in exploiting the rich information, as they consider only the pairwise preference distribution. Furthermore, they do not generalize to multi-preference scenarios and struggle to handle inconsistencies between rewards. To address this, we present Calibrated Preference Optimization (CaPO), a novel method to align T2I diffusion models by incorporating the general preference from multiple reward models without human-annotated data. The core of our approach involves a reward calibration method to approximate the general preference by computing the expected win-rate against the samples generated by the pretrained…
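As a rough illustration of the reward calibration idea, the sketch below turns raw reward scores for a batch of samples into expected win-rates against the other samples in the batch. It assumes the batch was generated by the pretrained model for a single prompt and that the probability of one sample beating another follows a Bradley-Terry model, i.e. the sigmoid of the reward difference. Both the pairwise model and the function name `calibrated_rewards` are assumptions made for this example, since the abstract above is truncated and does not specify them.

```python
import math
from typing import List


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def calibrated_rewards(raw_rewards: List[float]) -> List[float]:
    """Expected win-rate of each sample against the other samples in the batch.

    Assumption: P(sample i beats sample j) = sigmoid(r_i - r_j) (Bradley-Terry).
    The result lies strictly between 0 and 1, so scores from reward models
    with different scales become comparable after calibration.
    """
    n = len(raw_rewards)
    if n < 2:
        raise ValueError("need at least two samples to estimate a win-rate")
    calibrated = []
    for i, r_i in enumerate(raw_rewards):
        wins = sum(
            sigmoid(r_i - r_j) for j, r_j in enumerate(raw_rewards) if j != i
        )
        calibrated.append(wins / (n - 1))
    return calibrated


# Raw scores from one hypothetical reward model for three samples.
win_rates = calibrated_rewards([2.0, 0.0, -1.0])
```

Calibration preserves the ranking of the raw rewards, and because each pairwise win probability and its complement sum to one, the calibrated values over a batch of n samples always sum to n / 2.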

Taxonomy

Topics: Vehicle emissions and performance · Urban and Freight Transport Logistics · Transportation Planning and Optimization

Methods: Diffusion · ALIGN