Mitigating Modality Collapse in Multimodal VAEs via Impartial Optimization

Adrián Javaloy; Maryam Meghdadi; Isabel Valera

arXiv:2206.04496·cs.LG·June 10, 2022·6 cites

TL;DR

This paper addresses the issue of modality collapse in multimodal VAEs by detecting conflicting gradients and applying impartial optimization techniques, leading to improved performance across multiple tasks and datasets.

Contribution

It introduces a novel framework for detecting and mitigating gradient conflicts in multimodal VAEs, enhancing their ability to model all modalities effectively.

Findings

01 Significant improvement in reconstruction quality

02 Enhanced conditional generation capabilities

03 More coherent latent space representations

Abstract

A number of variational autoencoders (VAEs) have recently emerged with the aim of modeling multimodal data, e.g., to jointly model images and their corresponding captions. Still, multimodal VAEs tend to focus solely on a subset of the modalities, e.g., by fitting the image while neglecting the caption. We refer to this limitation as modality collapse. In this work, we argue that this effect is a consequence of conflicting gradients during multimodal VAE training. We show how to detect the sub-graphs in the computational graphs where gradients conflict (impartiality blocks), as well as how to leverage existing gradient-conflict solutions from multitask learning to mitigate modality collapse, that is, to ensure impartial optimization across modalities. We apply our training framework to several multimodal VAE models, losses and datasets from the literature, and empirically show that our…
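
To make the idea of conflicting gradients concrete, the following is a minimal sketch in plain Python. It assumes two per-modality gradients with respect to a shared parameter vector, flags a conflict when their dot product is negative, and resolves it with a PCGrad-style projection, used here only as one example of a gradient-conflict solution from multitask learning. The function names, the toy gradient values, and the choice of projection are illustrative assumptions, not the authors' implementation.

```python
import math


def dot(u, v):
    """Dot product of two equally sized vectors."""
    return sum(a * b for a, b in zip(u, v))


def cosine(u, v):
    """Cosine similarity; negative values indicate conflicting directions."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def conflicts(u, v):
    """Two gradients conflict when they point in opposing directions."""
    return dot(u, v) < 0.0


def project_away(g, other):
    """If g conflicts with `other`, remove g's component along `other`."""
    d = dot(g, other)
    if d >= 0.0:
        return list(g)  # no conflict: leave g unchanged
    scale = d / dot(other, other)
    return [a - scale * b for a, b in zip(g, other)]


def impartial_update(grads):
    """Project each gradient against the others, then sum the results."""
    projected = []
    for i, g in enumerate(grads):
        g_i = list(g)
        for j, other in enumerate(grads):
            if i != j:
                g_i = project_away(g_i, other)
        projected.append(g_i)
    return [sum(column) for column in zip(*projected)]


# Hypothetical gradients of the image and caption losses (toy values).
g_image = [1.0, 0.0]
g_text = [-1.0, 1.0]

naive = [a + b for a, b in zip(g_image, g_text)]
update = impartial_update([g_image, g_text])

print("conflict:", conflicts(g_image, g_text), "cosine:", cosine(g_image, g_text))
print("naive sum:", naive, "progress on image:", dot(naive, g_image))
print("projected:", update, "progress on image:", dot(update, g_image))
```

In this toy case the naive sum has zero dot product with the image gradient, so the update makes no first-order progress on that modality, whereas the projected update has a positive dot product with both gradients.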

Code & Models

Repositories

adrianjav/impartial-vaes (PyTorch, official)

Taxonomy

Topics: AI in cancer detection · Generative Adversarial Networks and Image Synthesis · Multimodal Machine Learning Applications