Toward a Visual Concept Vocabulary for GAN Latent Space

Sarah Schwettmann; Evan Hernandez; David Bau; Samuel Klein; Jacob Andreas; Antonio Torralba

arXiv:2110.04292 · cs.CV · October 11, 2021

TL;DR

This paper presents a new method to create an open-ended, human-interpretable vocabulary of visual concepts in GAN latent spaces, enabling more precise and meaningful image manipulations.

Contribution

It introduces a three-component approach combining automatic detection, human annotation, and decomposition to build a reliable, composable visual concept vocabulary for GANs.

Findings

01 Concepts are reliable and generalize across classes and observers.

02 The vocabulary enables fine-grained manipulation of image style and content.

03 Concepts are interpretable and composable.

Abstract

A large body of recent work has identified transformations in the latent spaces of generative adversarial networks (GANs) that consistently and interpretably transform generated images. But existing techniques for identifying these transformations rely on either a fixed vocabulary of pre-specified visual concepts, or on unsupervised disentanglement techniques whose alignment with human judgments about perceptual salience is unknown. This paper introduces a new method for building open-ended vocabularies of primitive visual concepts represented in a GAN's latent space. Our approach is built from three components: (1) automatic identification of perceptually salient directions based on their layer selectivity; (2) human annotation of these directions with free-form, compositional natural language descriptions; and (3) decomposition of these annotations into a visual concept vocabulary,…
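As a rough illustration of component (3), the sketch below treats each annotation as a bag of words and fits one latent direction per word by least squares, so that summing the word directions in an annotation approximates the annotated direction. This linear bag-of-words model is an assumption made for illustration, not the paper's stated procedure (the abstract is truncated before describing it). The function names `decompose_annotations` and `compose`, the whitespace tokenization, and the toy data are all hypothetical.

```python
import numpy as np


def decompose_annotations(directions, annotations):
    """Fit one latent direction per word (illustrative assumption only).

    directions:  (n, d) array, one annotated latent direction per row.
    annotations: list of n free-form text descriptions.
    Returns the sorted vocabulary and a (V, d) array of word directions.
    """
    tokens = [a.lower().split() for a in annotations]
    vocab = sorted({w for words in tokens for w in words})
    index = {w: i for i, w in enumerate(vocab)}

    # Binary bag-of-words matrix: bow[i, j] = 1 if word j appears in annotation i.
    bow = np.zeros((len(annotations), len(vocab)))
    for row, words in enumerate(tokens):
        for w in words:
            bow[row, index[w]] = 1.0

    # Least-squares solution of bow @ word_dirs ≈ directions.
    word_dirs, *_ = np.linalg.lstsq(bow, directions, rcond=None)
    return vocab, word_dirs


def compose(vocab, word_dirs, words):
    """Sum the directions of the given words to form a composite direction."""
    rows = [vocab.index(w) for w in words]
    return word_dirs[rows].sum(axis=0)


if __name__ == "__main__":
    # Toy data: random vectors standing in for annotated GAN latent directions.
    rng = np.random.default_rng(0)
    annotations = ["blue sky", "darker sky", "blue darker", "blue"]
    directions = rng.normal(size=(len(annotations), 8))
    vocab, word_dirs = decompose_annotations(directions, annotations)
    print(vocab, compose(vocab, word_dirs, ["darker", "sky"]).shape)
```

Under this assumption, a composite edit would be applied by adding the composed direction to a latent code before passing it to the generator; how well such sums behave in a real GAN is an empirical question this sketch does not answer.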


Code & Models

Repositories

schwettmann/visual-vocab (PyTorch, official)
