AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization
Junjie Shentu, Matthew Watson, Noura Al Moubayed

TL;DR
AttenCraft is an attention-based method that improves the disentanglement of multiple concepts in text-to-image models, enabling more accurate and balanced concept learning without manual mask annotations.
Contribution
It introduces an attention-guided approach with adaptive sampling and feature-retaining training to address feature fusion and asynchronous learning in multi-concept disentanglement.
Findings
Achieves state-of-the-art image fidelity in experiments.
Effectively disentangles multiple concepts simultaneously.
Maintains prompt fidelity comparable to baseline models.
Abstract
Text-to-image (T2I) customization empowers users to adapt the T2I diffusion model to new concepts absent in the pre-training dataset. On this basis, capturing multiple new concepts from a single image has emerged as a new task, allowing the model to learn multiple concepts simultaneously or discard unwanted concepts. However, multiple-concept disentanglement remains a key challenge. Existing disentanglement models often exhibit two main issues: feature fusion and asynchronous learning across different concepts. To address these issues, we propose AttenCraft, an attention-based method for multiple-concept disentanglement. Our method uses attention maps to generate accurate masks for each concept in a single initialization step, aiding in concept disentanglement without requiring mask preparation from humans or specialized models. Moreover, we introduce an adaptive algorithm based on…
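The idea of turning attention maps into per-concept masks can be illustrated with a small sketch. This is not the authors' algorithm: the min-max normalization, the per-position argmax, the 0.5 threshold, the function names, and the placeholder tokens `<dog>` and `<hat>` are all assumptions chosen for illustration. It shows only how one attention map per concept token could yield non-overlapping binary masks without manual annotation.

```python
from typing import Dict, List

Grid = List[List[float]]


def normalize(attn: Grid) -> Grid:
    """Min-max scale a single attention map to the range [0, 1]."""
    flat = [v for row in attn for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    if span == 0:
        # A constant map carries no spatial signal; treat it as all zeros.
        return [[0.0 for _ in row] for row in attn]
    return [[(v - lo) / span for v in row] for row in attn]


def masks_from_attention(
    attn_maps: Dict[str, Grid], threshold: float = 0.5
) -> Dict[str, List[List[int]]]:
    """Hypothetical helper: build one binary mask per concept token.

    Each position goes to the concept with the highest normalized
    attention, provided that value reaches `threshold`; otherwise the
    position is left as background in every mask.
    """
    names = list(attn_maps)
    normed = {n: normalize(attn_maps[n]) for n in names}
    height = len(normed[names[0]])
    width = len(normed[names[0]][0])
    masks = {n: [[0] * width for _ in range(height)] for n in names}
    for y in range(height):
        for x in range(width):
            best = max(names, key=lambda n: normed[n][y][x])
            if normed[best][y][x] >= threshold:
                masks[best][y][x] = 1
    return masks


# Toy 2x3 attention maps for two placeholder concept tokens (made-up values).
attn = {
    "<dog>": [[0.9, 0.8, 0.1], [0.7, 0.2, 0.1]],
    "<hat>": [[0.1, 0.1, 0.6], [0.1, 0.2, 0.9]],
}
masks = masks_from_attention(attn)
for name, mask in masks.items():
    print(name, mask)
```

Because each position is assigned to at most one concept, the resulting masks cannot overlap, which is one simple way to keep the features of different concepts from being mixed during training.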
Taxonomy
Topics: Data Visualization and Analytics · Image Retrieval and Classification Techniques · Advanced Text Analysis Techniques
Methods: Diffusion
