Multimodal Conditional Image Synthesis with Product-of-Experts GANs
Xun Huang, Arun Mallya, Ting-Chun Wang, Ming-Yu Liu

TL;DR
PoE-GAN is a multimodal conditional image synthesis framework that combines several kinds of user input (such as text, segmentation, sketch, and style reference) in a single model. It generates high-quality, diverse images and outperforms existing methods in both multimodal and unimodal settings.
Contribution
PoE-GAN is presented as the first framework able to synthesize images conditioned on multiple modalities or on any subset of them, including the empty set. It pairs a product-of-experts generator with a multimodal multiscale projection discriminator.
Findings
Outperforms existing unimodal methods in quality and diversity
Successfully synthesizes images conditioned on multiple modalities
Maintains high performance even with missing or partial inputs
Abstract
Existing conditional image synthesis frameworks generate images based on user inputs in a single modality, such as text, segmentation, sketch, or style reference. They are often unable to leverage multimodal user inputs when available, which reduces their practicality. To address this limitation, we propose the Product-of-Experts Generative Adversarial Networks (PoE-GAN) framework, which can synthesize images conditioned on multiple input modalities or any subset of them, even the empty set. PoE-GAN consists of a product-of-experts generator and a multimodal multiscale projection discriminator. Through our carefully designed training scheme, PoE-GAN learns to synthesize images with high quality and diversity. Besides advancing the state of the art in multimodal conditional image synthesis, PoE-GAN also outperforms the best existing unimodal conditional image synthesis approaches when…
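The abstract does not say how the product of experts is computed. As an illustration only, the sketch below assumes each modality contributes a diagonal Gaussian over a shared latent vector and that a standard-normal prior expert is always present; the function names (product_of_gaussians, fuse) and the dictionary layout are hypothetical, not taken from the paper. Under those assumptions, the product of Gaussians is again a Gaussian whose precision is the sum of the experts' precisions. Conditioning on any subset of modalities, including the empty set, then reduces to multiplying whichever experts are available.

```python
import numpy as np


def product_of_gaussians(means, variances):
    """Combine diagonal Gaussian experts N(mu_i, var_i) into one Gaussian.

    The product of Gaussian densities is (up to normalization) a Gaussian
    whose precision is the sum of the individual precisions and whose mean
    is the precision-weighted average of the individual means.
    """
    means = np.asarray(means, dtype=float)          # shape: (n_experts, dim)
    variances = np.asarray(variances, dtype=float)  # shape: (n_experts, dim)
    precisions = 1.0 / variances
    var = 1.0 / precisions.sum(axis=0)
    mu = var * (precisions * means).sum(axis=0)
    return mu, var


def fuse(experts, dim):
    """Fuse whichever modality experts are present with a prior expert.

    `experts` maps a modality name to a (mean, variance) pair and may be
    empty. The standard-normal prior is an assumption of this sketch; it
    makes the empty set of modalities fall back to unconditional sampling.
    """
    prior = (np.zeros(dim), np.ones(dim))
    all_experts = [prior] + list(experts.values())
    means, variances = zip(*all_experts)
    return product_of_gaussians(means, variances)


# No modality given: the fused distribution is just the prior.
mu_none, var_none = fuse({}, dim=2)

# One modality given: the fused distribution sits between prior and expert.
mu_text, var_text = fuse({"text": (np.array([2.0, 2.0]), np.ones(2))}, dim=2)
```

In this sketch, adding an expert can only shrink the variance, which matches the intuition that each extra input narrows the set of images consistent with the user's request.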
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Image Retrieval and Classification Techniques · Advanced Image and Video Retrieval Techniques
