The Geometry of Deep Generative Image Models and its Applications
Binxu Wang, Carlos R. Ponce

TL;DR
This paper introduces a geometric framework to analyze the latent space of GANs, revealing its structure, interpretability, and applications in image manipulation and inversion.
Contribution
It develops a method to compute the Riemannian metric of GAN image manifolds, enabling better understanding and utilization of the latent space.
Findings
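Image variation around each latent position is concentrated along a few major axes (the space is highly anisotropic).
The directions of large variation are similar at different positions in the latent space (the space is homogeneous).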
Many eigenvectors correspond to interpretable image transformations.
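The findings above can be illustrated with a minimal sketch. It assumes a hypothetical toy generator (`toy_generator`, a single tanh layer, not one of the GANs studied in the paper) and uses squared Euclidean distance in pixel space as a stand-in for the image distance, so the metric at a latent code z reduces to H = J^T J, where J is the generator's Jacobian. The helper names, the finite-difference Jacobian, and the column scales are all illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def toy_generator(z, W):
    """Hypothetical stand-in for a GAN generator: latent code -> flat 'image'."""
    return np.tanh(W @ z)

def jacobian_fd(g, z, eps=1e-5):
    """Central finite-difference Jacobian of g at z, shape (n_pixels, n_latent)."""
    cols = []
    for i in range(z.size):
        dz = np.zeros_like(z)
        dz[i] = eps
        cols.append((g(z + dz) - g(z - dz)) / (2 * eps))
    return np.stack(cols, axis=1)

def pullback_metric(g, z):
    """Metric induced on latent space by squared L2 image distance: H = J^T J."""
    J = jacobian_fd(g, z)
    return J.T @ J

rng = np.random.default_rng(0)
n_latent, n_pixels = 8, 64

# Assumption for illustration: scale the columns of W so that a few latent
# directions change the image far more than the rest.
scales = np.array([5.0, 2.0, 1.0, 0.5, 0.2, 0.1, 0.05, 0.01])
W = rng.standard_normal((n_pixels, n_latent)) * scales / np.sqrt(n_pixels)

def g(z):
    return toy_generator(z, W)

z0 = 0.1 * rng.standard_normal(n_latent)
H = pullback_metric(g, z0)

# Eigen-decomposition of the symmetric metric, sorted from largest to smallest.
eigvals, eigvecs = np.linalg.eigh(H)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# Anisotropy: share of total image variation carried by the top two axes.
top2_share = eigvals[:2].sum() / eigvals.sum()

# Equal-sized latent steps along the top and bottom eigenvectors.
step = 1e-2
change_top = np.linalg.norm(g(z0 + step * eigvecs[:, 0]) - g(z0))
change_bottom = np.linalg.norm(g(z0 + step * eigvecs[:, -1]) - g(z0))

print("eigenvalue spectrum:", eigvals)
print("share of variation in top 2 axes:", top2_share)
print("image change (top vs bottom axis):", change_top, change_bottom)
```

In this toy setting the spectrum decays quickly, so a step along the top eigenvector changes the output much more than an equal step along the bottom one. Repeating the computation at several values of `z0` and comparing the top eigenvectors would be one way to probe homogeneity. For a real GAN, the Jacobian would more likely come from automatic differentiation than from finite differences.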
Abstract
Generative adversarial networks (GANs) have emerged as a powerful unsupervised method to model the statistical patterns of real-world data sets, such as natural images. These networks are trained to map random inputs in their latent space to new samples representative of the learned data. However, the structure of the latent space is hard to intuit due to its high dimensionality and the non-linearity of the generator, which limits the usefulness of the models. Understanding the latent space requires a way to identify input codes for existing real-world images (inversion), and a way to identify directions with known image transformations (interpretability). Here, we use a geometric framework to address both issues simultaneously. We develop an architecture-agnostic method to compute the Riemannian metric of the image manifold created by GANs. The eigen-decomposition of the metric…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Cell Image Analysis Techniques · Advanced Image Processing Techniques
