User-Controllable Latent Transformer for StyleGAN Image Layout Editing

Yuki Endo

arXiv:2208.12408 · cs.CV · August 29, 2022 · 1 citation

Open Access 1 Repo 1 Models

TL;DR

This paper introduces an interactive framework that allows users to directly edit the layout of StyleGAN-generated images by annotating desired movements, using a transformer-based model trained with synthetic data.

Contribution

It presents a novel transformer-based latent editing method that enables intuitive, user-guided spatial layout adjustments in StyleGAN images without manual supervision.

Findings

01

Effective spatial layout editing demonstrated through quantitative metrics.

02

Outperforms existing latent editing methods in user-guided tasks.

03

Framework operates without manual data annotation, using synthetic training data.

Abstract

Latent space exploration is a technique that discovers interpretable latent directions and manipulates latent codes to edit various attributes in images generated by generative adversarial networks (GANs). However, in previous work, spatial control is limited to simple transformations (e.g., translation and rotation), and it is laborious to identify appropriate latent directions and adjust their parameters. In this paper, we tackle the problem of editing the StyleGAN image layout by annotating the image directly. To do so, we propose an interactive framework for manipulating latent codes in accordance with the user inputs. In our framework, the user annotates a StyleGAN image with locations they want to move or not and specifies a movement direction by mouse dragging. From these user inputs and initial latent codes, our latent transformer based on a transformer encoder-decoder…
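For context, the direction-based editing that the abstract describes as prior work can be sketched as below. This is a minimal illustration of moving a latent code along a direction, not the paper's latent transformer; the function name `edit_latent`, the 512-dimensional code size, and the random direction are all assumptions made for the example (in practice, directions are discovered rather than sampled).

```python
import numpy as np


def edit_latent(w, direction, alpha):
    """Shift latent code `w` by `alpha` along the unit-normalized `direction`."""
    d = direction / np.linalg.norm(direction)
    return w + alpha * d


rng = np.random.default_rng(0)
w = rng.standard_normal(512)          # hypothetical latent code (size is an assumption)
direction = rng.standard_normal(512)  # stand-in for a discovered latent direction
w_edited = edit_latent(w, direction, alpha=3.0)
```

Here `alpha` is the per-direction parameter that the abstract says is laborious to tune by hand, which is the step the proposed framework replaces with direct annotation and mouse dragging.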

Code & Models

Repositories

endo-yuki-t/UserControllableLT
PyTorch · Official

Models

🤗
radames/UserControllableLT
model · ♡ 3

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction · Computer Graphics and Visualization Techniques

Methods: StyleGAN · Dense Connections · Convolution · Adaptive Instance Normalization · R1 Regularization · Feedforward Network