CPO: Condition Preference Optimization for Controllable Image Generation

Zonglin Lyu; Ming Li; Xinxin Liu; Chen Chen

arXiv:2511.04753 · cs.CV · November 10, 2025

TL;DR

This paper introduces Condition Preference Optimization (CPO), a method for improving controllability in text-to-image generation. Instead of learning preferences over generated images, CPO applies preference learning to the control signals themselves, which reduces variance and computational cost.

Contribution

CPO is a preference learning approach that trains models on preferences over control signals directly, rather than over generated images. It outperforms existing methods such as ControlNet++ in both controllability and efficiency.

Findings

01. CPO reduces error rates by over 10% in segmentation tasks.

02. CPO achieves 70-80% improvement in human pose control.

03. CPO consistently reduces errors in edge and depth map controls.

Abstract

To enhance controllability in text-to-image generation, ControlNet introduces image-based control signals, while ControlNet++ improves pixel-level cycle consistency between generated images and the input control signal. To avoid the prohibitive cost of back-propagating through the sampling process, ControlNet++ optimizes only low-noise timesteps (e.g., $t < 200$) using a single-step approximation, which not only ignores the contribution of high-noise timesteps but also introduces additional approximation errors. A straightforward alternative for optimizing controllability across all timesteps is Direct Preference Optimization (DPO), a fine-tuning method that increases model preference for more controllable images ($I^{w}$) over less controllable ones ($I^{l}$). However, due to uncertainty in generative models, it is difficult to ensure that win-lose image pairs differ only in…
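For readers unfamiliar with the DPO baseline the abstract refers to, the following is a minimal sketch of the generic DPO loss for a single win/lose pair. It is not the paper's CPO objective, and it is not the diffusion-specific variant; the function name `dpo_loss`, its arguments, and the default `beta` are illustrative assumptions. The inputs are log-likelihoods of the preferred ($I^{w}$) and dispreferred ($I^{l}$) samples under the model being fine-tuned and under a frozen reference model.

```python
import math


def dpo_loss(logp_win, logp_lose, ref_logp_win, ref_logp_lose, beta=0.1):
    """Generic DPO loss for one win/lose pair (illustrative; not the paper's CPO loss).

    All names and the default beta are assumptions made for this sketch.
    """
    # Margin: how much more the fine-tuned model favors the winner over the
    # loser, relative to the frozen reference model.
    margin = (logp_win - ref_logp_win) - (logp_lose - ref_logp_lose)
    # -log(sigmoid(beta * margin)) equals softplus(-beta * margin);
    # computed in a numerically stable form.
    x = -beta * margin
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# With no preference relative to the reference, the loss is log(2).
neutral = dpo_loss(0.0, 0.0, 0.0, 0.0)
# Favoring the winner more than the reference does lowers the loss.
favoring = dpo_loss(1.0, -1.0, 0.0, 0.0)
```

The loss falls as the model assigns relatively more likelihood to the winning sample than the reference does, which is the sense in which DPO "increases model preference" for $I^{w}$ over $I^{l}$.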

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Computer Graphics and Visualization Techniques · Image Enhancement Techniques