Inv-Adapter: ID Customization Generation via Image Inversion and Lightweight Adapter
Peng Xing, Ning Wang, Jianbo Ouyang, Zechao Li

TL;DR
This paper introduces Inv-Adapter, a lightweight method for ID customization in text-to-image generation. It improves fidelity and efficiency by combining DDIM image inversion with a lightweight attention adapter, and it needs no additional image encoder.
Contribution
The paper proposes Inv-Adapter, a novel lightweight approach that extracts diffusion-domain representations via image inversion and embeds them into the model with an efficient attention adapter, avoiding additional encoders.
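To make the adapter idea concrete, here is a minimal NumPy sketch of one common way to inject extra features into an attention layer: a second attention branch over the ID features, added to the frozen base branch. This is an illustrative assumption, not the authors' exact adapter design. The class name AdapterAttention, the weights w_k_id and w_v_id, the scale parameter, and the zero initialization are all hypothetical choices made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(q, k, v):
    # Single-head scaled dot-product attention on 2-D arrays (tokens x dim).
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v


class AdapterAttention:
    """Hypothetical adapter: a frozen base branch plus a small ID branch."""

    def __init__(self, dim, rng, scale=1.0):
        std = dim ** -0.5
        # Stand-ins for the frozen base-model projections.
        self.w_q = rng.normal(0.0, std, (dim, dim))
        self.w_k = rng.normal(0.0, std, (dim, dim))
        self.w_v = rng.normal(0.0, std, (dim, dim))
        # The only new (trainable) weights in this sketch.
        self.w_k_id = rng.normal(0.0, std, (dim, dim))
        # Zero init (an assumption): the adapter starts as a no-op.
        self.w_v_id = np.zeros((dim, dim))
        self.scale = scale

    def __call__(self, hidden, context, id_feats):
        q = hidden @ self.w_q
        base = attention(q, context @ self.w_k, context @ self.w_v)
        # ID features are assumed to share the hidden width, since they
        # come from the same diffusion model rather than a separate encoder.
        id_out = attention(q, id_feats @ self.w_k_id, id_feats @ self.w_v_id)
        return base + self.scale * id_out
```

In this sketch only two projection matrices per layer are new, which is the sense in which such an adapter stays small compared with attaching a separate image encoder.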
Findings
High ID fidelity and generation loyalty achieved
Faster generation speed and reduced training parameters
Competitive performance in ID customization and model scaling
Abstract
The remarkable advancement in text-to-image generation models has significantly boosted research in ID customization generation. However, existing personalization methods cannot simultaneously satisfy high-fidelity and high-efficiency requirements. Their main bottleneck lies in the prompt image encoder, which produces weak alignment signals with the text-to-image model and significantly increases model size. To this end, we propose a lightweight Inv-Adapter, which first extracts diffusion-domain representations of ID images using a pre-trained text-to-image model via DDIM image inversion, without an additional image encoder. Benefiting from the high alignment between the extracted ID prompt features and the intermediate features of the text-to-image model, we then embed them efficiently into the base text-to-image model through a carefully designed lightweight attention adapter. We conduct…
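The DDIM inversion mentioned in the abstract can be sketched as running the deterministic DDIM update in the direction of increasing noise. The following is a minimal NumPy illustration under stated assumptions, not the paper's implementation: the linear beta schedule, the function names, and the toy noise predictor toy_eps are hypothetical stand-ins, and a real pipeline would call the pre-trained text-to-image model's noise predictor instead.

```python
import numpy as np


def make_alphas_cumprod(num_steps=50, beta_start=1e-4, beta_end=2e-2):
    # Assumed linear beta schedule; returns the cumulative alpha products.
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def ddim_step(x, eps, a_from, a_to):
    # Deterministic DDIM update between two noise levels (eta = 0).
    x0_pred = (x - np.sqrt(1.0 - a_from) * eps) / np.sqrt(a_from)
    return np.sqrt(a_to) * x0_pred + np.sqrt(1.0 - a_to) * eps


def ddim_invert(x0, eps_model, alphas_cumprod):
    # Walk from the clean input towards noise, keeping every intermediate
    # state. Index 0 of `a` is the clean level (alpha-bar = 1).
    a = np.concatenate(([1.0], alphas_cumprod))
    x = x0
    trajectory = [x0]
    for t in range(len(a) - 1):
        eps = eps_model(x, t)
        x = ddim_step(x, eps, a[t], a[t + 1])
        trajectory.append(x)
    return trajectory


def ddim_sample(x_last, eps_model, alphas_cumprod):
    # Reverse walk from the noisiest state back to the clean level.
    a = np.concatenate(([1.0], alphas_cumprod))
    x = x_last
    for t in range(len(a) - 1, 0, -1):
        eps = eps_model(x, t - 1)
        x = ddim_step(x, eps, a[t], a[t - 1])
    return x


def toy_eps(x, t):
    # Hypothetical noise predictor that depends only on the step index.
    return np.sin(t) * np.ones_like(x)
```

With a predictor that depends only on the step index, as toy_eps does, sampling undoes inversion up to floating-point error. With a learned predictor the round trip is only approximate, because inversion evaluates the noise at the less noisy state of each step. The intermediate states in the returned trajectory are the kind of diffusion-domain signal the abstract refers to, although which features the authors actually extract is not stated in this summary.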
Taxonomy
Topics: Image Processing and 3D Reconstruction · Advanced Image and Video Retrieval Techniques
Methods: Balanced Selection
