Spatial Latent Representations in Generative Adversarial Networks for Image Generation

Maciej Sypetkowski

arXiv:2303.14552·cs.CV·March 28, 2023·1 cites

Open Access

TL;DR

This paper introduces spatial latent spaces for StyleGAN2 that better capture spatial details and semantic information, enabling improved image manipulation, out-of-sample content representation, and high-quality image generation of arbitrary sizes.

Contribution

It proposes a novel family of spatial latent spaces, encoding methods, and training procedures that enhance image editing, quality, and scalability in GANs.

Findings

01. Encoding quality improved by up to 30% in LPIPS score.

02. Spatial latent spaces enable out-of-sample object arrangement representation.

03. Proposed training method improves FID score by 29% on SpaceNet.

Abstract

In the majority of GAN architectures, the latent space is defined as a set of vectors of given dimensionality. Such representations are not easily interpretable and do not capture spatial information of image content directly. In this work, we define a family of spatial latent spaces for StyleGAN2, capable of capturing more details and representing images that are out-of-sample in terms of the number and arrangement of object parts, such as an image of multiple faces or a face with more than two eyes. We propose a method for encoding images into our spaces, together with an attribute model capable of performing attribute editing in these spaces. We show that our spaces are effective for image manipulation and encode semantic information well. Our approach can be used on pre-trained generator models, and attribute editing can be done using pre-generated direction vectors making the…
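To make the contrast between a vector latent and a spatial latent concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the 4×4 grid, the 512-dimensional style size, the random `direction` vector, and the helper names `broadcast_to_spatial` and `edit_region` are all assumptions chosen for illustration. It only shows the general idea of storing one style vector per spatial position and shifting a chosen region along a pre-generated direction vector.

```python
import numpy as np

LATENT_DIM = 512  # assumed style dimensionality, not taken from the paper


def broadcast_to_spatial(w, height, width):
    """Tile one latent vector over a (height, width) grid of positions."""
    return np.broadcast_to(w, (height, width, w.shape[0])).copy()


def edit_region(spatial_w, direction, mask, strength=1.0):
    """Shift the latent along `direction` only where `mask` is True."""
    return spatial_w + strength * mask[..., None] * direction


rng = np.random.default_rng(0)
w = rng.standard_normal(LATENT_DIM)          # conventional vector latent
direction = rng.standard_normal(LATENT_DIM)  # hypothetical attribute direction

spatial_w = broadcast_to_spatial(w, 4, 4)    # spatial latent, shape (4, 4, 512)
mask = np.zeros((4, 4), dtype=bool)
mask[:2, :] = True                           # edit only the top half of the grid

edited = edit_region(spatial_w, direction, mask, strength=2.0)
```

In this toy setup, positions outside the mask keep the original vector `w`, while masked positions are moved by `2.0 * direction`. A real pipeline would feed such a grid into the generator's modulated layers, which this sketch does not attempt.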

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Image Processing and 3D Reconstruction · Multimodal Machine Learning Applications

Methods: Path Length Regularization · Weight Demodulation · Convolution · R1 Regularization