Discriminative Multimodal Learning via Conditional Priors in Generative Models
Rogelio A. Mancisidor, Michael Kampffmeyer, Kjersti Aas, Robert Jenssen

TL;DR
This paper introduces a novel conditional multimodal discriminative model that maximizes mutual information between joint representations and missing modalities, improving performance in downstream classification, acoustic inversion, and image and annotation generation.
Contribution
The paper proposes a new generative model with an informative prior and likelihood-free training to better handle missing modalities and labels in multimodal learning.
Findings
Achieves state-of-the-art results in downstream classification
Improves acoustic inversion performance
Enhances image and annotation generation quality
Abstract
Deep generative models with latent variables have recently been used to learn joint representations and generative processes from multi-modal data. These two learning mechanisms can, however, conflict with each other, and the representations can fail to embed information on the data modalities. This research studies the realistic scenario in which all modalities and class labels are available for model training, but some modalities and labels required for downstream tasks are missing. We show that, in this scenario, the variational lower bound limits mutual information between joint representations and missing modalities. To counteract these problems, we introduce a novel conditional multi-modal discriminative model that uses an informative prior distribution and optimizes a likelihood-free objective function that maximizes mutual information between joint representations and missing…
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing · Speech and Audio Processing
