Cross Initialization for Personalized Text-to-Image Generation
Lianyu Pang, Jian Yin, Haoran Xie, Qiping Wang, Qing Li, Xudong Mao

TL;DR
This paper introduces Cross Initialization, a method for personalized text-to-image generation that narrows the gap between the initial and learned embeddings. Doing so improves both reconstruction quality and editability, reduces the number of optimization steps, and enables facial expression editing.
Contribution
The paper proposes Cross Initialization, an improved embedding initialization for personalizing text-to-image models. It requires fewer training steps and supports better facial expression editing.
Findings
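Cross Initialization significantly narrows the gap between the initial and learned embeddings; with traditional initialization, the learned embedding can be up to 100 times larger in scale than the initial one.
It reduces the number of optimization steps from 5000 to 320.
It enables successful facial expression editing.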
Abstract
Recently, there has been a surge in face personalization techniques, benefiting from the advanced capabilities of pretrained text-to-image diffusion models. Among these, a notable method is Textual Inversion, which generates personalized images by inverting given images into textual embeddings. However, methods based on Textual Inversion still struggle with balancing the trade-off between reconstruction quality and editability. In this study, we examine this issue through the lens of initialization. Upon closely examining traditional initialization methods, we identified a significant disparity between the initial and learned embeddings in terms of both scale and orientation. The scale of the learned embedding can be up to 100 times greater than that of the initial embedding. Such a significant change in the embedding could increase the risk of overfitting, thereby compromising the…
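The abstract describes the gap between initial and learned embeddings in terms of scale and orientation. As a minimal sketch of how such a gap could be measured, the snippet below computes a norm ratio (scale) and a cosine similarity (orientation) for two vectors. The function name `embedding_gap` and the toy vectors are hypothetical and chosen for illustration; they are not taken from the paper, and the paper's own measurement procedure may differ.

```python
import math


def embedding_gap(initial, learned):
    """Return (scale_ratio, cosine_similarity) for two equal-length vectors.

    Hypothetical helper for illustration, not the authors' code.
    scale_ratio is ||learned|| / ||initial||; cosine_similarity compares
    the directions of the two vectors.
    """
    if len(initial) != len(learned):
        raise ValueError("embeddings must have the same dimension")
    init_norm = math.sqrt(sum(x * x for x in initial))
    learned_norm = math.sqrt(sum(x * x for x in learned))
    if init_norm == 0.0 or learned_norm == 0.0:
        raise ValueError("embeddings must be non-zero")
    dot = sum(a * b for a, b in zip(initial, learned))
    return learned_norm / init_norm, dot / (init_norm * learned_norm)


# Toy 2-D vectors (assumed values): the "learned" vector is much longer
# than the "initial" one and points in a perpendicular direction.
initial = [3.0, 4.0]
learned = [-400.0, 300.0]
scale_ratio, cosine = embedding_gap(initial, learned)
print(f"scale ratio: {scale_ratio:.1f}, cosine similarity: {cosine:.3f}")
```

A large scale ratio together with a low cosine similarity corresponds to the kind of disparity the abstract attributes to traditional initialization.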
Taxonomy
Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Fetal and Pediatric Neurological Disorders
Methods: Diffusion
