Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models
Moab Arar, Rinon Gal, Yuval Atzmon, Gal Chechik, Daniel Cohen-Or, Ariel Shamir, Amit H. Bermano

TL;DR
This paper introduces a domain-agnostic encoder-based method for fast personalization of text-to-image models, improving flexibility and semantic fidelity without requiring specialized datasets.
Contribution
The work presents a novel contrastive regularization technique enabling domain-agnostic T2I personalization, surpassing previous methods in flexibility and performance.
Findings
Achieves state-of-the-art personalization performance.
Produces more semantic and flexible token representations.
Does not require specialized datasets or prior concept information.
Abstract
Text-to-image (T2I) personalization allows users to guide the creative image generation process by combining their own visual concepts in natural language prompts. Recently, encoder-based techniques have emerged as a new effective approach for T2I personalization, reducing the need for multiple images and long training times. However, most existing encoders are limited to a single-class domain, which hinders their ability to handle diverse concepts. In this work, we propose a domain-agnostic method that does not require any specialized dataset or prior information about the personalized concepts. We introduce a novel contrastive-based regularization technique to maintain high fidelity to the target concept characteristics while keeping the predicted embeddings close to editable regions of the latent space, by pushing the predicted tokens toward their nearest existing CLIP tokens. Our…
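The regularization idea in the abstract, pulling a predicted embedding toward its nearest existing CLIP token, can be illustrated with a small sketch. The abstract does not give the loss itself, so the InfoNCE-style form below is an assumption, not the authors' formulation. The names `cosine`, `nearest_token`, `contrastive_regularizer`, and the `temperature` parameter are hypothetical, and the toy 2-D vectors stand in for real CLIP token embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest_token(pred, vocab):
    """Index of the vocabulary embedding most similar to the prediction."""
    sims = [cosine(pred, v) for v in vocab]
    return max(range(len(vocab)), key=lambda i: sims[i])

def contrastive_regularizer(pred, vocab, temperature=0.1):
    """Assumed InfoNCE-style penalty (not taken from the paper).

    The nearest vocabulary token is treated as the positive and every
    other token as a negative, so the loss shrinks as the prediction
    moves toward its nearest token and away from the rest.
    """
    logits = [cosine(pred, v) / temperature for v in vocab]
    pos = max(range(len(logits)), key=lambda i: logits[i])
    # Numerically stable log-sum-exp over all tokens.
    top = max(logits)
    log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
    return log_denom - logits[pos]

# Toy stand-in for a CLIP token embedding table (hypothetical values).
toy_vocab = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
on_token = contrastive_regularizer([1.0, 0.0], toy_vocab)
between_tokens = contrastive_regularizer([0.7, 0.7], toy_vocab)
```

In this toy setup, a prediction that sits on an existing token incurs a smaller penalty than one lying midway between two tokens, which is the behavior the abstract describes as keeping predictions close to editable regions of the latent space.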
Taxonomy
Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
Methods: Contrastive Language-Image Pre-training
