Multi-party Collaborative Attention Control for Image Customization
Han Yang, Chuanguang Yang, Qiuli Wang, Zhulin An, Weilun Feng, Libo Huang, Yongjun Xu

TL;DR
This paper presents MCA-Ctrl, a novel tuning-free method for high-quality image customization using both text and complex visual conditions, addressing limitations of existing methods in subject accuracy and background consistency.
Contribution
MCA-Ctrl introduces a new attention control mechanism and a subject localization module for improved zero-shot image customization in diffusion models.
Findings
Outperforms existing methods in zero-shot customization
Effectively reduces subject leakage and background inconsistency
Operates without additional training or fine-tuning
Abstract
The rapid advancement of diffusion models has increased the need for customized image generation. However, current customization methods face several limitations: 1) they typically accept either image or text conditions alone; 2) customization in complex visual scenarios often leads to subject leakage or confusion; 3) image-conditioned outputs tend to suffer from inconsistent backgrounds; and 4) they incur high computational costs. To address these issues, this paper introduces Multi-party Collaborative Attention Control (MCA-Ctrl), a tuning-free method that enables high-quality image customization using both text and complex visual conditions. Specifically, MCA-Ctrl leverages two key operations within the self-attention layer to coordinate multiple parallel diffusion processes and guide the target image generation. This approach allows MCA-Ctrl to capture the content and appearance of specific…
Taxonomy
Topics: Color perception and design · Image Retrieval and Classification Techniques · Visual Attention and Saliency Detection
