ID-EA: Identity-driven Text Enhancement and Adaptation with Textual Inversion for Personalized Text-to-Image Generation
Hyun-Jun Jin, Young-Eun Kim, and Seong-Whan Lee

TL;DR
ID-EA is a novel framework that enhances personalized text-to-image generation by aligning textual and visual identity embeddings, significantly improving identity preservation and efficiency over previous methods.
Contribution
The paper introduces ID-EA, a new approach that guides text embeddings to better match visual identity embeddings, improving identity consistency in personalized image generation.
Findings
Outperforms state-of-the-art methods on identity preservation metrics
Generates personalized portraits 15 times faster than previous methods
Achieves high-fidelity personalized images with improved identity consistency
Abstract
Recently, personalized portrait generation with text-to-image diffusion models has advanced significantly with Textual Inversion, which has emerged as a promising approach for creating high-fidelity personalized images. Despite its potential, current Textual Inversion methods struggle to maintain consistent facial identity due to semantic misalignments between textual and visual embedding spaces regarding identity. We introduce ID-EA, a novel framework that guides text embeddings to align with visual identity embeddings, thereby improving identity preservation in personalized generation. ID-EA comprises two key components: the ID-driven Enhancer (ID-Enhancer) and the ID-conditioned Adapter (ID-Adapter). First, the ID-Enhancer integrates identity embeddings with a textual ID anchor, refining visual identity embeddings derived from a face recognition model using representative text embeddings.…
Taxonomy
Topics: Computational and Text Analysis Methods · Topic Modeling · Video Analysis and Summarization
