Generative Models of Visually Grounded Imagination

Ramakrishna Vedantam; Ian Fischer; Jonathan Huang; Kevin Murphy

arXiv:1705.10762 · cs.LG · November 12, 2018 · 50 cites

TL;DR

This paper adapts variational auto-encoders (VAEs) to visually grounded imagination: generating images of novel, possibly abstract (partially specified) concepts. It combines a new training objective with a product-of-experts inference network and evaluates the approach on the MNIST-with-attributes and CelebA datasets.

Contribution

The paper presents a new training objective and a product-of-experts inference network for VAEs, which together allow images to be generated from partially specified concepts.

Findings

01 Proposed method outperforms existing joint image-attribute VAEs.

02 Introduced MNIST-with-attributes dataset for evaluation.

03 Achieved better correctness, coverage, and compositionality metrics.

Abstract

It is easy for people to imagine what a man with pink hair looks like, even if they have never seen such a person before. We call the ability to create images of novel semantic concepts visually grounded imagination. In this paper, we show how we can modify variational auto-encoders to perform this task. Our method uses a novel training objective, and a novel product-of-experts inference network, which can handle partially specified (abstract) concepts in a principled and efficient way. We also propose a set of easy-to-compute evaluation metrics that capture our intuitive notions of what it means to have good visual imagination, namely correctness, coverage, and compositionality (the 3 C's). Finally, we perform a detailed comparison of our method with two existing joint image-attribute VAE methods (the JMVAE method of Suzuki et al. and the BiVCCA method of Wang et al.) by applying them…
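The product-of-experts idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes, as an illustration only, that each attribute contributes a diagonal Gaussian "expert" over the latent code and that a standard normal prior acts as one more expert; the function names, the attribute name `hair`, and the fixed expert parameters are hypothetical and do not come from the paper's code. In an actual model, each expert's mean and variance would presumably be produced by a learned network rather than hard-coded.

```python
from typing import Dict, List, Tuple

# A diagonal Gaussian, stored as (means, variances), one entry per latent dimension.
Gaussian = Tuple[List[float], List[float]]


def product_of_gaussian_experts(experts: List[Gaussian]) -> Gaussian:
    """Multiply diagonal Gaussian densities and return the resulting Gaussian.

    Precisions (1 / variance) add up, and the combined mean is the
    precision-weighted average of the expert means.
    """
    dim = len(experts[0][0])
    total_precision = [0.0] * dim
    weighted_mean = [0.0] * dim
    for mean, var in experts:
        for d in range(dim):
            precision = 1.0 / var[d]
            total_precision[d] += precision
            weighted_mean[d] += precision * mean[d]
    out_var = [1.0 / p for p in total_precision]
    out_mean = [m * v for m, v in zip(weighted_mean, out_var)]
    return out_mean, out_var


def infer_latent(attribute_experts: Dict[str, Gaussian],
                 observed: List[str], dim: int) -> Gaussian:
    """Combine the prior with the experts of the observed attributes only.

    Unobserved attributes are simply left out of the product, which is how a
    partially specified concept is handled in this sketch.
    """
    prior: Gaussian = ([0.0] * dim, [1.0] * dim)  # assumed N(0, I) prior expert
    experts = [prior] + [attribute_experts[name] for name in observed]
    return product_of_gaussian_experts(experts)


if __name__ == "__main__":
    # Hypothetical single-attribute example with a 1-D latent space.
    experts = {"hair": ([2.0], [1.0])}
    print(infer_latent(experts, ["hair"], dim=1))  # posterior given the attribute
    print(infer_latent(experts, [], dim=1))        # no attributes: prior only
```

With this formulation, specifying fewer attributes leaves the posterior broader, and specifying none falls back to the prior, which matches the intuition that abstract concepts cover more of the latent space.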

Taxonomy

Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Generative Adversarial Networks and Image Synthesis
