FlipConcept: Tuning-Free Multi-Concept Personalization for Text-to-Image Generation
Young Beom Woo, Sun Eung Kim, Seong-Whan Lee

TL;DR
FlipConcept integrates multiple personalized concepts into a single generated image without additional tuning. It uses guided appearance attention, mask-guided noise mixing, and background dilution to improve visual fidelity, protect non-personalized regions, and limit concept leakage.
Contribution
The paper introduces FlipConcept, a tuning-free method that integrates multiple personalized concepts into a single image while improving their visual fidelity and limiting concept leakage.
Findings
Outperforms existing models in multi-concept personalization
Requires no additional fine-tuning
Maintains high visual fidelity in complex scenes
Abstract
Integrating multiple personalized concepts into a single image has recently gained attention in text-to-image (T2I) generation. However, existing methods often suffer from performance degradation in complex scenes due to distortions in non-personalized regions and the need for additional fine-tuning, limiting their practicality. To address this issue, we propose FlipConcept, a novel approach that seamlessly integrates multiple personalized concepts into a single image without requiring additional tuning. We introduce guided appearance attention to enhance the visual fidelity of personalized concepts. Additionally, we introduce mask-guided noise mixing to protect non-personalized regions during concept integration. Lastly, we apply background dilution to minimize concept leakage, i.e., the undesired blending of personalized concepts with other objects in the image. In our experiments, we…
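The abstract describes mask-guided noise mixing only at a high level. As a rough illustration of the general idea, the sketch below blends two noise predictions with a spatial mask, so that the personalized prediction applies only inside the concept region and the background prediction is kept everywhere else. This is a generic masked blend written for illustration, not the paper's formulation: the function name `mix_noise`, its parameters, and the array shapes are all assumptions.

```python
import numpy as np


def mix_noise(background_noise, concept_noise, concept_mask):
    """Blend two noise predictions with a spatial mask (hypothetical helper).

    background_noise, concept_noise: arrays of shape (C, H, W).
    concept_mask: array of shape (H, W); 1 inside the personalized
    region, 0 elsewhere. Soft values in between are allowed.
    """
    background_noise = np.asarray(background_noise, dtype=np.float32)
    concept_noise = np.asarray(concept_noise, dtype=np.float32)
    # Clamp the mask so the blend stays a convex combination.
    mask = np.clip(np.asarray(concept_mask, dtype=np.float32), 0.0, 1.0)
    # The (H, W) mask broadcasts across the channel axis of (C, H, W).
    return mask * concept_noise + (1.0 - mask) * background_noise


# Toy inputs: a 4-channel 8x8 latent with one square concept region.
background = np.zeros((4, 8, 8), dtype=np.float32)
concept = np.ones((4, 8, 8), dtype=np.float32)
mask = np.zeros((8, 8), dtype=np.float32)
mask[2:5, 2:5] = 1.0

mixed = mix_noise(background, concept, mask)
```

With these toy inputs, `mixed` equals the concept prediction inside the masked square and the background prediction outside it, which is the sense in which a mask can protect non-personalized regions.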
Taxonomy
Methods: Softmax · Attention Is All You Need
