Generative Models of Visually Grounded Imagination
Ramakrishna Vedantam, Ian Fischer, Jonathan Huang, Kevin Murphy

TL;DR
This paper modifies variational auto-encoders to perform visually grounded imagination, that is, generating images of novel and possibly abstract concepts. It uses a new training objective and a product-of-experts inference network, and is evaluated on the MNIST-with-attributes and CelebA datasets.
Contribution
The paper presents a new training objective and a product-of-experts inference network that let a VAE generate images from partially specified (abstract) concepts.
Findings
The proposed method outperforms two existing joint image-attribute VAEs (JMVAE and BiVCCA).
The paper introduces the MNIST-with-attributes dataset for evaluation.
The method scores better on the correctness, coverage, and compositionality metrics (the 3 C's).
Abstract
It is easy for people to imagine what a man with pink hair looks like, even if they have never seen such a person before. We call the ability to create images of novel semantic concepts visually grounded imagination. In this paper, we show how we can modify variational auto-encoders to perform this task. Our method uses a novel training objective, and a novel product-of-experts inference network, which can handle partially specified (abstract) concepts in a principled and efficient way. We also propose a set of easy-to-compute evaluation metrics that capture our intuitive notions of what it means to have good visual imagination, namely correctness, coverage, and compositionality (the 3 C's). Finally, we perform a detailed comparison of our method with two existing joint image-attribute VAE methods (the JMVAE method of Suzuki et al. and the BiVCCA method of Wang et al.) by applying them…
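To illustrate how a product-of-experts inference network can handle partially specified concepts, here is a minimal sketch, not the authors' implementation. It assumes each specified attribute contributes a diagonal Gaussian expert over the latent code, that a standard normal prior acts as one more expert, and that unspecified attributes are simply left out of the product. The function names and the dictionary layout are hypothetical.

```python
import numpy as np

def product_of_gaussian_experts(means, variances):
    """Multiply diagonal Gaussian experts N(mean_k, var_k) into one Gaussian.

    The product of Gaussians is Gaussian: precisions add, and the mean is the
    precision-weighted average of the expert means.
    """
    means = np.asarray(means, dtype=float)          # shape (num_experts, latent_dim)
    variances = np.asarray(variances, dtype=float)  # shape (num_experts, latent_dim)
    precisions = 1.0 / variances
    combined_var = 1.0 / precisions.sum(axis=0)
    combined_mean = combined_var * (precisions * means).sum(axis=0)
    return combined_mean, combined_var

def infer_latent(observed_experts, latent_dim):
    """Combine a N(0, I) prior expert with one expert per *specified* attribute.

    observed_experts: hypothetical dict mapping attribute name -> (mean, var);
    attributes that the concept leaves unspecified are absent from the dict.
    """
    means = [np.zeros(latent_dim)]     # assumed standard normal prior expert
    variances = [np.ones(latent_dim)]
    for mean, var in observed_experts.values():
        means.append(np.asarray(mean, dtype=float))
        variances.append(np.asarray(var, dtype=float))
    return product_of_gaussian_experts(means, variances)

# Usage: a fully abstract concept (no attributes) falls back to the prior,
# and each added attribute narrows the posterior.
mean_none, var_none = infer_latent({}, latent_dim=2)
mean_one, var_one = infer_latent({"hair": ([2.0, 2.0], [1.0, 1.0])}, latent_dim=2)
mean_two, var_two = infer_latent(
    {"hair": ([2.0, 2.0], [1.0, 1.0]), "gender": ([2.0, 2.0], [1.0, 1.0])},
    latent_dim=2,
)
```

Under these assumptions, specifying fewer attributes yields a broader posterior over the latent code, which matches the intuition that an abstract concept should correspond to a wider set of images.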
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Generative Adversarial Networks and Image Synthesis
