Improving Multi-Subject Consistency in Open-Domain Image Generation with Isolation and Reposition Attention
Huiguo He, Qiuyue Wang, Yuan Zhou, Yuxuan Cai, Hongyang Chao, Jian Yin, Huan Yang

TL;DR
This paper introduces IR-Diffusion, a training-free diffusion model with Isolation and Reposition Attention, significantly improving multi-subject consistency in open-domain image generation by addressing internal attraction and positional referencing issues.
Contribution
The paper proposes a novel IR-Diffusion model with Isolation and Reposition Attention to enhance multi-subject consistency without additional training.
Findings
IR-Diffusion outperforms existing methods in multi-subject consistency.
Isolation Attention prevents subjects from converging into a single entity.
Reposition Attention aligns subjects to improve referencing accuracy.
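The isolation idea in the findings above can be illustrated with a masked attention step. This is a minimal sketch under stated assumptions, not the paper's implementation: the per-token subject labels (0 for background, a positive integer per subject), the helper names `isolation_mask` and `masked_softmax`, and the choice to leave background tokens unrestricted are all hypothetical and not taken from the source.

```python
import math

def isolation_mask(labels):
    """Return allowed[i][j]: False when tokens i and j belong to different subjects.

    labels: one integer per token (assumed convention: 0 = background,
    k > 0 = subject k). Background tokens are left unrestricted here.
    """
    n = len(labels)
    return [
        [not (labels[i] > 0 and labels[j] > 0 and labels[i] != labels[j])
         for j in range(n)]
        for i in range(n)
    ]

def masked_softmax(scores, allowed):
    """Row-wise softmax over attention scores, giving zero weight to blocked pairs."""
    weights = []
    for row, ok in zip(scores, allowed):
        # Subtract the max over allowed entries for numerical stability.
        # Each token may always attend to itself, so this set is never empty.
        peak = max(s for s, a in zip(row, ok) if a)
        exps = [math.exp(s - peak) if a else 0.0 for s, a in zip(row, ok)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy example: two tokens of subject 1, one token of subject 2, one background token.
labels = [1, 1, 2, 0]
scores = [[0.0] * 4 for _ in range(4)]  # uniform scores, for illustration only
weights = masked_softmax(scores, isolation_mask(labels))
```

With uniform scores, the first token spreads its attention over the two subject-1 tokens and the background token, and assigns zero weight to the subject-2 token. That blocked cross-subject link is the kind of "internal attraction" the abstract describes.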
Abstract
Training-free diffusion models have achieved remarkable progress in generating multi-subject consistent images within open-domain scenarios. The key idea of these methods is to incorporate reference subject information within the attention layer. However, existing methods still obtain suboptimal performance when handling numerous subjects. This paper reveals two primary issues contributing to this deficiency. Firstly, the undesired internal attraction between different subjects within the target image can lead to the convergence of multiple subjects into a single entity. Secondly, tokens tend to reference nearby tokens, which reduces the effectiveness of the attention mechanism when there is a significant positional difference between subjects in reference and target images. To address these issues, we propose a training-free diffusion model with Isolation and Reposition Attention,…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Advanced Vision and Imaging
Methods: Softmax · Attention Is All You Need · Diffusion
