Designing an Encoder for StyleGAN Image Manipulation

Omer Tov; Yuval Alaluf; Yotam Nitzan; Or Patashnik; Daniel Cohen-Or

arXiv:2102.02766·cs.CV·February 5, 2021·6 cites

TL;DR

This paper studies the StyleGAN latent space to improve the inversion of real images for editing. It identifies a distortion-editability tradeoff and a distortion-perception tradeoff, and balances them so that inverted images can be edited more effectively at the cost of only a small drop in reconstruction accuracy.

Contribution

It introduces two principles for designing encoders that control how close the inversions lie to the regions of the latent space that StyleGAN was originally trained on, and presents an encoder built on these principles to improve the editing of real images.

Findings

01. Improved real image inversion quality on challenging domains.
02. Balancing the tradeoffs enables effective editing with only a small drop in reconstruction accuracy.
03. The proposed encoder outperforms existing methods in qualitative and quantitative evaluations.

Abstract

Recently, there has been a surge of diverse methods for performing image editing by employing pre-trained unconditional generators. Applying these methods on real images, however, remains a challenge, as it necessarily requires the inversion of the images into their latent space. To successfully invert a real image, one needs to find a latent code that reconstructs the input image accurately, and more importantly, allows for its meaningful manipulation. In this paper, we carefully study the latent space of StyleGAN, the state-of-the-art unconditional generator. We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space. We then suggest two principles for designing encoders in a manner that allows one to control the proximity of the inversions to regions that StyleGAN was originally trained on. We…

Code & Models

Models

🤗 public-data/e4e · model · ♡ 1

Taxonomy

Topics: Advanced Vision and Imaging · Generative Adversarial Networks and Image Synthesis · Advanced Image and Video Retrieval Techniques

Methods: Dense Connections · Convolution · Feedforward Network · Adaptive Instance Normalization · R1 Regularization · StyleGAN