A Training-Free Approach for Multi-ID Customization via Attention Adjustment and Spatial Control

Jiawei Lin; Guanlong Jiao; Jianjin Xu

arXiv:2511.20401·cs.CV·November 26, 2025

Open Access

TL;DR

This paper introduces MultiID, a training-free method for multi-ID image customization that uses attention adjustment and spatial control to improve image quality and text controllability.

Contribution

The paper proposes a novel training-free approach that adapts existing single-ID models for multi-ID customization using ID-decoupled attention and spatial control strategies.

Findings

01

Achieves high-quality multi-ID customization without training.

02

Outperforms some training-based methods in quality and controllability.

03

Introduces IDBench, a new benchmark for evaluation.

Abstract

Multi-ID customization is an interesting topic in computer vision and has attracted considerable attention recently. Given the ID images of multiple individuals, the goal is to generate a customized image that seamlessly integrates them while preserving their respective identities. Compared to single-ID customization, multi-ID customization is much more difficult and poses two major challenges. First, since the multi-ID customization model is trained to reconstruct an image from the cropped person regions, it often encounters the copy-paste issue during inference, leading to lower quality. Second, the model also suffers from inferior text controllability: the generated result simply combines multiple persons into one image, regardless of whether it is aligned with the input text. In this work, we propose MultiID to tackle this challenging task in a training-free manner. Since the existing…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Face recognition and analysis · Advanced Image and Video Retrieval Techniques