CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization
Feize Wu, Yun Pang, Junyi Zhang, Lianyu Pang, Jian Yin, Baoquan Zhao, Qing Li, Xudong Mao

TL;DR
CoRe introduces a context regularization technique for text embedding learning in text-to-image personalization, improving identity preservation and text alignment without needing image generation during training.
Contribution
The paper proposes a context regularization method for embedding a new concept into the input embedding space of the text encoder, so that the concept integrates with existing tokens and personalization in text-to-image synthesis improves.
Findings
Outperforms baselines in identity preservation
Enhances text alignment accuracy
Applicable as a test-time optimization
Abstract
Recent advances in text-to-image personalization have enabled high-quality and controllable image synthesis for user-provided concepts. However, existing methods still struggle to balance identity preservation with text alignment. Our approach is based on the fact that generating prompt-aligned images requires a precise semantic understanding of the prompt, which involves accurately processing the interactions between the new concept and its surrounding context tokens within the CLIP text encoder. To address this, we aim to embed the new concept properly into the input embedding space of the text encoder, allowing for seamless integration with existing tokens. We introduce Context Regularization (CoRe), which enhances the learning of the new concept's text embedding by regularizing its context tokens in the prompt. This is based on the insight that appropriate output vectors of the text…
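The idea of regularizing context tokens can be illustrated with a toy sketch. The abstract is truncated, so this is not the paper's loss. It assumes, purely for illustration, that the text encoder's per-token output vectors are available for two prompts of equal length: one containing the new concept token and one reference prompt. The function name `context_regularization_loss`, the choice of cosine distance, and the reference prompt are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def context_regularization_loss(concept_outputs, reference_outputs, concept_index):
    """Hypothetical regularizer: mean (1 - cosine) over context positions.

    concept_outputs:   per-token output vectors for the prompt with the new concept
    reference_outputs: per-token output vectors for a reference prompt (assumed)
    concept_index:     position of the concept token, which is skipped
    """
    if len(concept_outputs) != len(reference_outputs):
        raise ValueError("both prompts must have the same number of tokens")
    distances = [
        1.0 - cosine(u, v)
        for i, (u, v) in enumerate(zip(concept_outputs, reference_outputs))
        if i != concept_index  # only context tokens are regularized
    ]
    if not distances:
        raise ValueError("prompt has no context tokens")
    return sum(distances) / len(distances)


# Toy example: three tokens, the concept sits at position 1.
with_concept = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
reference = [[1.0, 0.0], [5.0, 5.0], [1.0, 1.0]]
loss = context_regularization_loss(with_concept, reference, concept_index=1)
```

In this toy example the context positions match exactly, so the loss is close to zero even though the concept position differs. In a real setup the vectors would come from the CLIP text encoder, and the loss would be minimized with respect to the new concept's embedding only.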
Taxonomy
Topics: Image Retrieval and Classification Techniques · Topic Modeling · Generative Adversarial Networks and Image Synthesis
Methods: Contrastive Language-Image Pre-training
