Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models

Rinon Gal; Moab Arar; Yuval Atzmon; Amit H. Bermano; Gal Chechik; Daniel Cohen-Or

arXiv:2302.12228·cs.CV·March 7, 2023·5 cites

Open Access

TL;DR

This paper introduces an encoder-based domain tuning method that enables rapid personalization of text-to-image models with minimal training, significantly reducing time and storage while maintaining high quality.

Contribution

The authors propose a novel encoder-based approach that underfits on domain concepts to enable fast, efficient personalization of diffusion models using only a single image and a few training steps.

Findings

01. Personalization time reduced from minutes to seconds.

02. High-quality concept embedding with minimal data.

03. Effective generalization across diverse domains.

Abstract

Text-to-image personalization aims to teach a pre-trained diffusion model to reason about novel, user-provided concepts, embedding them into new scenes guided by natural language prompts. However, current personalization approaches struggle with lengthy training times, high storage requirements or loss of identity. To overcome these limitations, we propose an encoder-based domain-tuning approach. Our key insight is that by underfitting on a large set of concepts from a given domain, we can improve generalization and create a model that is more amenable to quickly adding novel concepts from the same domain. Specifically, we employ two components: First, an encoder that takes as input a single image of a target concept from a given domain, e.g., a specific face, and learns to map it into a word-embedding representing the concept. Second, a set of regularized weight-offsets for the…
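The two components named in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: the linear encoder, the dimensions, the L2 form of the regularizer, and every function name below are assumptions made purely for illustration, and the abstract as shown here is cut off before it says which weights the offsets apply to.

```python
import random

EMBED_DIM = 4    # hypothetical word-embedding size (real models use far more)
FEATURE_DIM = 6  # hypothetical size of an image feature vector


def make_matrix(rows, cols, rng, scale=0.1):
    """Random matrix as a list of rows; stands in for learned weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def encode_concept(encoder_weights, image_features):
    """Component 1 (assumed linear): map features of one image to one word embedding."""
    return matvec(encoder_weights, image_features)


def apply_offsets(base_weights, offsets):
    """Component 2: effective weights = frozen base weights + learned offsets."""
    return [[b + d for b, d in zip(b_row, d_row)]
            for b_row, d_row in zip(base_weights, offsets)]


def offset_penalty(offsets, strength=0.01):
    """L2 penalty that keeps offsets small (an assumed form of regularization)."""
    return strength * sum(d * d for row in offsets for d in row)


rng = random.Random(0)
encoder_weights = make_matrix(EMBED_DIM, FEATURE_DIM, rng)
image_features = [rng.uniform(-1, 1) for _ in range(FEATURE_DIM)]
concept_embedding = encode_concept(encoder_weights, image_features)

base_weights = make_matrix(EMBED_DIM, EMBED_DIM, rng)
zero_offsets = [[0.0] * EMBED_DIM for _ in range(EMBED_DIM)]
```

With all offsets at zero, `apply_offsets` returns the base weights unchanged and `offset_penalty` is zero; a penalty of this kind is one way to keep a tuned model close to its pre-trained starting point.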

Taxonomy

Topics: Topic Modeling · Image Retrieval and Classification Techniques · Generative Adversarial Networks and Image Synthesis

Methods: Diffusion