Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models

Moab Arar; Rinon Gal; Yuval Atzmon; Gal Chechik; Daniel Cohen-Or; Ariel Shamir; Amit H. Bermano

arXiv:2307.06925·cs.CV·July 14, 2023·2 cites

Open Access

TL;DR

This paper introduces a domain-agnostic encoder-based method for fast personalization of text-to-image models, improving flexibility and semantic fidelity without requiring specialized datasets.

Contribution

The work presents a novel contrastive regularization technique enabling domain-agnostic T2I personalization, surpassing previous methods in flexibility and performance.

Findings

01. Achieves state-of-the-art personalization performance.

02. Produces more semantic and flexible token representations.

03. Does not require specialized datasets or prior concept information.

Abstract

Text-to-image (T2I) personalization allows users to guide the creative image generation process by combining their own visual concepts in natural language prompts. Recently, encoder-based techniques have emerged as a new effective approach for T2I personalization, reducing the need for multiple images and long training times. However, most existing encoders are limited to a single-class domain, which hinders their ability to handle diverse concepts. In this work, we propose a domain-agnostic method that does not require any specialized dataset or prior information about the personalized concepts. We introduce a novel contrastive-based regularization technique to maintain high fidelity to the target concept characteristics while keeping the predicted embeddings close to editable regions of the latent space, by pushing the predicted tokens toward their nearest existing CLIP tokens. Our…
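The regularization idea in the abstract, pulling a predicted token toward its nearest existing CLIP tokens, can be illustrated with a small sketch. This is not the paper's formulation: the function name `nn_contrastive_loss`, the parameters `k` and `temperature`, the InfoNCE-style form of the loss, and the toy 2-D vocabulary are all assumptions made for illustration. The k nearest vocabulary embeddings are treated as positives and every other token as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nn_contrastive_loss(pred, vocab, k=2, temperature=0.1):
    """Hypothetical nearest-neighbor contrastive regularizer.

    pred:  predicted token embedding (list of floats)
    vocab: existing token embeddings (list of lists of floats)
    Returns (loss, indices of the k nearest tokens).
    """
    sims = [cosine(pred, tok) for tok in vocab]

    # The k most similar vocabulary tokens act as positives (assumption).
    order = sorted(range(len(vocab)), key=lambda i: sims[i], reverse=True)
    positives = order[:k]

    # Log-softmax over all tokens, computed stably via log-sum-exp.
    logits = [s / temperature for s in sims]
    peak = max(logits)
    log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits))

    # Average negative log-probability assigned to the positives.
    loss = -sum(logits[i] - log_denom for i in positives) / k
    return loss, positives


# Toy vocabulary of three 2-D "token embeddings" (made-up values).
vocab = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
loss_close, nearest = nn_contrastive_loss([0.9, 0.1], vocab, k=1)
loss_far, _ = nn_contrastive_loss([0.5, 0.5], vocab, k=1)
print(nearest, loss_close < loss_far)
```

In this sketch the loss shrinks as the predicted embedding moves closer to an existing token, which is the intuition behind keeping predictions near editable regions of the embedding space. A real system would operate on CLIP text-encoder embeddings and combine this term with the diffusion training objective.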

Taxonomy

Topics: Multimodal Machine Learning Applications · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques

Methods: Contrastive Language-Image Pre-training