Text-Free Learning of a Natural Language Interface for Pretrained Face Generators

Xiaodan Du; Raymond A. Yeh; Nicholas Kolkin; Eli Shechtman; Greg Shakhnarovich

arXiv:2209.03953 · cs.CV · September 9, 2022 · 1 citation

Open Access · 1 Repo

TL;DR

This paper introduces Fast text2StyleGAN, a text-guided face synthesis method that combines CLIP with a conditional variational autoencoder (CVAE) to give pre-trained GANs a natural language interface, with no re-training of the GAN and no optimization at test time.

Contribution

The paper presents a fast approach to adapting pre-trained GANs for natural language face synthesis using CLIP and a CVAE. Training requires no text data, and neither the GAN nor CLIP needs fine-tuning or test-time optimization when new prompts are encountered.

Findings

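01 Image generation from text descriptions is orders of magnitude faster than prior work, because no test-time optimization is needed.

02 On the FFHQ dataset, faces generated from natural language descriptions of varying detail are more accurate than those of prior work.

03 Neither the GAN nor CLIP needs re-training or fine-tuning for new text prompts.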

Abstract

We propose Fast text2StyleGAN, a natural language interface that adapts pre-trained GANs for text-guided human face synthesis. Leveraging the recent advances in Contrastive Language-Image Pre-training (CLIP), no text data is required during training. Fast text2StyleGAN is formulated as a conditional variational autoencoder (CVAE) that provides extra control and diversity to the generated images at test time. Our model does not require re-training or fine-tuning of the GANs or CLIP when encountering new text prompts. In contrast to prior work, we do not rely on optimization at test time, making our method orders of magnitude faster than prior work. Empirically, on FFHQ dataset, our method offers faster and more accurate generation of images from natural language descriptions with varying levels of detail compared to prior work.
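The test-time behavior described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the dimensions, the linear decoder, and the names `make_decoder` and `sample_latents` are all hypothetical, and a hand-written vector stands in for a real CLIP text embedding. It shows only the general CVAE idea: a fixed condition plus a freshly sampled latent yields a different generator latent on each draw, using a single forward pass and no optimization loop.

```python
import random

# Toy sizes chosen for illustration only; real embeddings are far larger.
CLIP_DIM = 8   # size of the (stand-in) CLIP embedding used as the condition
Z_DIM = 4      # size of the CVAE latent sampled at test time
W_DIM = 6      # size of the latent code handed to the pre-trained generator


def make_decoder(seed=0):
    """Build a hypothetical linear decoder mapping [z, condition] to a generator latent."""
    rng = random.Random(seed)
    in_dim = Z_DIM + CLIP_DIM
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(W_DIM)]

    def decode(z, condition):
        # Concatenate the sampled latent with the condition, then apply the linear map.
        x = list(z) + list(condition)
        return [sum(w * v for w, v in zip(row, x)) for row in weights]

    return decode


def sample_latents(decode, condition, n_samples, seed=0):
    """Draw z from a standard normal prior and decode once per sample (no optimization)."""
    rng = random.Random(seed)
    latents = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(Z_DIM)]
        latents.append(decode(z, condition))
    return latents


decode = make_decoder()
# Stand-in for the CLIP text embedding of a prompt (assumption, not real CLIP output).
text_embedding = [0.1 * i for i in range(CLIP_DIM)]
latents = sample_latents(decode, text_embedding, n_samples=3)
```

In a real system each decoded latent would be passed to the frozen face generator to produce an image; that step is omitted here because it depends on the specific pre-trained model.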

Code & Models

Repositories

duxiaodan/fast_text2stylegan
PyTorch · Official

Taxonomy

Topics: Face recognition and analysis · Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning

Methods: Test · Contrastive Language-Image Pre-training