ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback
Ming Li, Taojiannan Yang, Huafeng Kuang, Jie Wu, Zhaoning Wang, Xuefeng Xiao, Chen Chen

TL;DR
ControlNet++ enhances the controllability of text-to-image diffusion models by explicitly optimizing pixel-level cycle consistency with an efficient reward strategy, leading to significant improvements in various conditional controls.
Contribution
It introduces a novel cycle consistency optimization method with an efficient reward strategy to improve controllable image generation in diffusion models.
Findings
Achieves 11.1% improvement in mIoU over ControlNet for segmentation masks
Achieves 13.4% improvement in SSIM over ControlNet for line-art edges
Achieves 7.6% reduction in RMSE relative to ControlNet for depth conditions
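For readers unfamiliar with the metrics named in the findings, the following is a minimal sketch of how mIoU and RMSE can be computed on flattened toy arrays. The function names and toy inputs are illustrative assumptions; the paper's actual evaluation protocol (datasets, class sets, averaging convention) is not described on this page and may differ.

```python
import math

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union, averaged over classes that appear
    in either mask (one common convention; others exist)."""
    ious = []
    for c in range(num_classes):
        inter = sum(p == c and t == c for p, t in zip(pred, target))
        union = sum(p == c or t == c for p, t in zip(pred, target))
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    return sum(ious) / len(ious)

def rmse(pred, target):
    """Root-mean-square error between two flattened depth maps."""
    squared = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.sqrt(squared / len(pred))

# Toy flattened segmentation masks with two classes (0 and 1).
target_mask = [0, 0, 1, 1]
pred_mask = [0, 1, 1, 1]
miou = mean_iou(pred_mask, target_mask, num_classes=2)

# Toy flattened depth maps differing in one pixel.
depth_err = rmse([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 2.0])
```

Higher mIoU and SSIM indicate closer agreement with the input condition, while lower RMSE does, which is why the depth result is reported as a reduction.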
Abstract
To enhance the controllability of text-to-image diffusion models, existing efforts like ControlNet incorporated image-based conditional controls. In this paper, we reveal that existing methods still face significant challenges in generating images that align with the image conditional controls. To this end, we propose ControlNet++, a novel approach that improves controllable generation by explicitly optimizing pixel-level cycle consistency between generated images and conditional controls. Specifically, for an input conditional control, we use a pre-trained discriminative reward model to extract the corresponding condition of the generated images, and then optimize the consistency loss between the input conditional control and extracted condition. A straightforward implementation would be generating images from random noises and then calculating the consistency loss, but such an…
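The cycle described in the abstract (input condition, generated image, extracted condition, consistency loss) can be sketched as below. This is a toy illustration only: `generate_image` and `reward_model` are hypothetical stand-ins for the diffusion model and the pre-trained discriminative reward model, and the per-pixel cross-entropy is an assumed loss for a binary segmentation condition, not a form confirmed by the text above. It covers only the straightforward variant the abstract mentions, not the efficient strategy that the truncated text goes on to describe.

```python
import math

def generate_image(condition, prompt):
    """Hypothetical stand-in for the controllable generator.
    Renders each mask label as a brightness value and corrupts one
    pixel to mimic imperfect controllability."""
    image = [[0.9 if label == 1 else 0.1 for label in row] for row in condition]
    image[0][0] = 0.9  # deliberately inconsistent with the condition
    return image

def reward_model(image):
    """Hypothetical stand-in for the discriminative reward model.
    Returns the per-pixel predicted probability of class 1."""
    return [list(row) for row in image]

def consistency_loss(condition, predicted):
    """Assumed loss: mean per-pixel binary cross-entropy between the
    input condition and the condition extracted from the image."""
    total, count = 0.0, 0
    for c_row, p_row in zip(condition, predicted):
        for c, p in zip(c_row, p_row):
            total += -(c * math.log(p) + (1 - c) * math.log(1 - p))
            count += 1
    return total / count

condition = [[0, 0], [1, 1]]                      # input segmentation mask
image = generate_image(condition, "a street")     # condition -> image
extracted = reward_model(image)                   # image -> condition
loss = consistency_loss(condition, extracted)     # compare the two ends

# A generation that fully respects the mask gives a lower loss.
perfect = consistency_loss(condition, [[0.1, 0.1], [0.9, 0.9]])
```

In an actual training setup the loss would be backpropagated through the reward model into the generator, so a differentiable framework would replace these plain-Python stubs.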
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Advanced Neuroimaging Techniques and Applications · Computer Graphics and Visualization Techniques
Methods: ALIGN · Diffusion
