A Training-Free Approach for Multi-ID Customization via Attention Adjustment and Spatial Control
Jiawei Lin, Guanlong Jiao, Jianjin Xu

TL;DR
This paper introduces MultiID, a training-free method for multi-ID image customization that uses attention adjustment and spatial control to improve image quality and text controllability.
Contribution
The paper proposes a novel training-free approach that adapts existing single-ID models for multi-ID customization using ID-decoupled attention and spatial control strategies.
Findings
Achieves high-quality multi-ID customization without training.
Outperforms some training-based methods in quality and controllability.
Introduces IDBench, a new benchmark for evaluation.
Abstract
Multi-ID customization is an interesting topic in computer vision and has recently attracted considerable attention. Given the ID images of multiple individuals, the goal is to generate a customized image that seamlessly integrates them while preserving their respective identities. Compared to single-ID customization, multi-ID customization is much more difficult and poses two major challenges. First, since the multi-ID customization model is trained to reconstruct an image from the cropped person regions, it often encounters the copy-paste issue during inference, leading to lower quality. Second, the model also suffers from inferior text controllability: the generated result simply combines multiple persons into one image, regardless of whether it is aligned with the input text. In this work, we propose MultiID to tackle this challenging task in a training-free manner. Since the existing…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Advanced Image and Video Retrieval Techniques
