IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation
Xinchen Zhang, Ling Yang, Guohao Li, Yaqi Cai, Jiake Xie, Yong Tang, Yujiu Yang, Mengdi Wang, Bin Cui

TL;DR
IterComp introduces an iterative feedback framework that leverages multiple diffusion models and reward learning to significantly improve compositional text-to-image generation, especially in complex scenarios.
Contribution
The paper proposes a novel iterative feedback learning approach that aggregates preferences from multiple models to enhance compositionality in diffusion-based image generation.
Findings
Outperforms previous SOTA methods like Omost and FLUX
Improves multi-category object composition accuracy
Enhances complex semantic alignment in generated images
Abstract
Advanced diffusion models like RPG, Stable Diffusion 3 and FLUX have made notable strides in compositional text-to-image generation. However, these methods typically exhibit distinct strengths for compositional generation, with some excelling in handling attribute binding and others in spatial relationships. This disparity highlights the need for an approach that can leverage the complementary strengths of various models to comprehensively improve the composition capability. To this end, we introduce IterComp, a novel framework that aggregates composition-aware model preferences from multiple models and employs an iterative feedback learning approach to enhance compositional generation. Specifically, we curate a gallery of six powerful open-source diffusion models and evaluate their three key compositional metrics: attribute binding, spatial relationships, and non-spatial relationships.…
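The abstract's idea of aggregating preferences across a model gallery can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper's released code: the placeholder model names, the idea that each prompt yields a best-first ranking of the six models per metric, the helper names `preference_pairs` and `pairwise_loss`, and the use of a Bradley-Terry style pairwise loss to train a reward model on the resulting pairs.

```python
import math
from itertools import combinations

# The three compositional metrics named in the abstract.
METRICS = ("attribute_binding", "spatial_relationships", "non_spatial_relationships")

# Hypothetical placeholders for the six gallery models (names are NOT from the source).
GALLERY = ["model_a", "model_b", "model_c", "model_d", "model_e", "model_f"]


def preference_pairs(ranking):
    """Turn a best-first ranking into (winner, loser) pairs.

    Assumption: for a given prompt and metric, the gallery models have
    already been ranked by how well their images satisfy the prompt.
    """
    return [(ranking[i], ranking[j])
            for i, j in combinations(range(len(ranking)), 2)]


def pairwise_loss(reward_winner, reward_loser):
    """Bradley-Terry style loss: -log(sigmoid(r_w - r_l)).

    Assumption: a reward model assigns a scalar score to each image;
    the loss is small when the preferred image scores higher.
    """
    margin = reward_winner - reward_loser
    return math.log1p(math.exp(-margin))


if __name__ == "__main__":
    pairs = preference_pairs(GALLERY)
    print(len(pairs), "pairs per prompt and metric")
    print("loss when scores agree with the preference:", pairwise_loss(2.0, 0.0))
    print("loss when scores contradict the preference:", pairwise_loss(0.0, 2.0))
```

With six ranked models, each prompt contributes one pair for every two-model combination per metric. An iterative scheme in the spirit of the abstract might retrain the reward model and the generator in alternation on such pairs, though the exact procedure is not specified in the text above.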
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques
Methods: Balanced Selection · Diffusion
