User-Controllable Latent Transformer for StyleGAN Image Layout Editing
Yuki Endo

TL;DR
This paper introduces an interactive framework that allows users to directly edit the layout of StyleGAN-generated images by annotating desired movements, using a transformer-based model trained with synthetic data.
Contribution
It presents a novel transformer-based latent editing method that enables intuitive, user-guided spatial layout adjustments in StyleGAN images without manual supervision.
Findings
Effective spatial layout editing demonstrated through quantitative metrics.
Outperforms existing latent editing methods in user-guided tasks.
Framework operates without manual data annotation, using synthetic training data.
Abstract
Latent space exploration is a technique that discovers interpretable latent directions and manipulates latent codes to edit various attributes in images generated by generative adversarial networks (GANs). However, in previous work, spatial control is limited to simple transformations (e.g., translation and rotation), and it is laborious to identify appropriate latent directions and adjust their parameters. In this paper, we tackle the problem of editing the StyleGAN image layout by annotating the image directly. To do so, we propose an interactive framework for manipulating latent codes in accordance with the user inputs. In our framework, the user annotates a StyleGAN image with locations they want to move or not and specifies a movement direction by mouse dragging. From these user inputs and initial latent codes, our latent transformer based on a transformer encoder-decoder…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction · Computer Graphics and Visualization Techniques
Methods: StyleGAN · Dense Connections · Convolution · Adaptive Instance Normalization · R1 Regularization · Feedforward Network
