Variational methods for Conditional Multimodal Deep Learning
Gaurav Pandey, Ambedkar Dukkipati

TL;DR
This paper introduces the conditional multimodal autoencoder (CMMA), a variational deep learning model that generates one modality conditioned on another. It yields more accurate conditional generation than joint models, demonstrated by synthesizing faces from attributes.
Contribution
The paper proposes the CMMA model that effectively learns conditional distributions between modalities, improving conditional generation over joint models.
Findings
Faces generated from attributes are more accurate and attribute-representative.
The model effectively modifies existing faces based on attribute changes.
CMMA outperforms other deep generative models in conditional face generation.
Abstract
In this paper, we address the problem of conditional modality learning, whereby one is interested in generating one modality given the other. While it is straightforward to learn a joint distribution over multiple modalities using a deep multimodal architecture, we observe that such models aren't very effective at conditional generation. Hence, we address the problem by learning conditional distributions between the modalities. We use variational methods for maximizing the corresponding conditional log-likelihood. The resultant deep model, which we refer to as conditional multimodal autoencoder (CMMA), forces the latent representation obtained from a single modality alone to be 'close' to the joint representation obtained from multiple modalities. We use the proposed model to generate faces from attributes. We show that the faces generated from attributes using the proposed model, are…
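The variational objective mentioned in the abstract can be sketched as a standard conditional evidence lower bound. This is a minimal sketch, not necessarily the paper's exact formulation: the symbols x (the generated modality, e.g. a face), y (the conditioning modality, e.g. attributes), z (the latent variable), and the distributions q and p with parameters \phi and \theta are notation assumed here for illustration.

```latex
% Assumed notation: x = generated modality, y = conditioning modality, z = latent variable
\log p_\theta(x \mid y) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x, y)}\big[\log p_\theta(x \mid z, y)\big]
  \;-\; \mathrm{KL}\big(q_\phi(z \mid x, y)\,\big\|\,p_\theta(z \mid y)\big)
```

Under this reading, the KL term pulls the representation inferred from both modalities toward the one obtained from y alone, which would correspond to the 'closeness' the abstract describes.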
