StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery

Or Patashnik; Zongze Wu; Eli Shechtman; Daniel Cohen-Or; Dani Lischinski

arXiv:2103.17249 · cs.CV · April 1, 2021

TL;DR

StyleCLIP uses CLIP models to provide a text-based interface for manipulating StyleGAN imagery, so that semantic edits can be driven by a text prompt instead of manual exploration of the latent space.

Contribution

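The paper develops three techniques for text-driven image editing: a CLIP-based optimization scheme that modifies an input latent vector to match a text prompt, a latent mapper for faster manipulation, and a method that maps text prompts to input-agnostic directions in StyleGAN's style space.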

Findings

01 Effective text-guided image manipulation demonstrated

02 Faster and more stable edits with the latent mapper

03 Comparable or superior results to manual methods

Abstract

Inspired by the ability of StyleGAN to generate highly realistic images in a variety of domains, much recent work has focused on understanding how to use the latent spaces of StyleGAN to manipulate generated and real images. However, discovering semantically meaningful latent manipulations typically involves painstaking human examination of the many degrees of freedom, or an annotated collection of images for each desired manipulation. In this work, we explore leveraging the power of recently introduced Contrastive Language-Image Pre-training (CLIP) models in order to develop a text-based interface for StyleGAN image manipulation that does not require such manual effort. We first introduce an optimization scheme that utilizes a CLIP-based loss to modify an input latent vector in response to a user-provided text prompt. Next, we describe a latent mapper that infers a text-guided latent…
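
The optimization scheme mentioned in the abstract (a CLIP-based loss that modifies an input latent vector in response to a text prompt) can be illustrated with a toy sketch. Everything concrete here is an assumption made for illustration: `PROJECTION` and `embed_generated_image` are hypothetical stand-ins for "run StyleGAN, then embed the image with CLIP", the text embedding is a made-up vector, and the L2 proximity term with its weight is one plausible way to keep the edit close to the source latent. It is not the authors' implementation, and it uses finite differences where a real pipeline would use automatic differentiation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity between two embedding vectors."""
    return 1.0 - dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


# Hypothetical stand-in for "generate an image from latent w, then embed it".
# A real pipeline would call a StyleGAN generator and a CLIP image encoder.
PROJECTION = [
    [0.9, 0.1, 0.0, 0.2],
    [0.0, 0.8, 0.3, 0.1],
    [0.2, 0.0, 0.7, 0.4],
]


def embed_generated_image(w):
    return [dot(row, w) for row in PROJECTION]


def loss(w, w_source, text_embedding, l2_weight):
    # Matching term: how far the "image" embedding is from the text embedding.
    clip_term = cosine_distance(embed_generated_image(w), text_embedding)
    # Proximity term (assumed): penalize drifting away from the source latent.
    proximity = sum((a - b) ** 2 for a, b in zip(w, w_source))
    return clip_term + l2_weight * proximity


def numerical_gradient(f, w, eps=1e-5):
    """Central finite differences; stands in for autograd in this toy."""
    grad = []
    for i in range(len(w)):
        up = w[:i] + [w[i] + eps] + w[i + 1:]
        down = w[:i] + [w[i] - eps] + w[i + 1:]
        grad.append((f(up) - f(down)) / (2 * eps))
    return grad


def optimize_latent(w_source, text_embedding, steps=200, lr=0.1, l2_weight=0.05):
    """Plain gradient descent on the latent vector, starting at w_source."""
    w = list(w_source)

    def objective(v):
        return loss(v, w_source, text_embedding, l2_weight)

    for _ in range(steps):
        grad = numerical_gradient(objective, w)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


w_source = [1.0, 0.2, -0.3, 0.5]      # made-up source latent
text_embedding = [0.1, 1.0, 0.2]      # made-up "text prompt" embedding
w_edited = optimize_latent(w_source, text_embedding)
```

Comparing `cosine_distance(embed_generated_image(w_source), text_embedding)` with the same quantity for `w_edited` shows how much closer the edited latent sits to the prompt, while the proximity weight controls how far the edit is allowed to move from the source.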

Code & Models

Videos

This AI Made Me Look Like Obi-Wan Kenobi! 🧔 · YouTube

Taxonomy

Methods: R1 Regularization · Adaptive Instance Normalization · Dense Connections · Convolution · Feedforward Network · StyleGAN