Robust Multimodal Learning via Representation Decoupling

Shicai Wei; Yang Luo; Yuji Wang; Chunbo Luo

arXiv:2407.04458 · cs.CV · July 8, 2024

TL;DR

This paper introduces DMRNet, a multimodal learning model that represents the input from each modality combination as a probabilistic distribution in the latent space rather than a fixed point, so that it can better capture modality-specific information and stay robust when modalities are missing.

Contribution

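The paper proposes DMRNet, which models the input from different modality combinations as distributions and samples embeddings from them. Sampling relaxes the implicit intra-class constraint on representations, which improves modality-specific learning and robustness.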

Findings

01. DMRNet outperforms state-of-the-art methods on classification tasks.

02. The probabilistic approach improves robustness to missing modalities.

03. The hard combination regularizer balances training across modality combinations.

Abstract

Multimodal learning that is robust to missing modalities has attracted increasing attention due to its practicality. Existing methods tend to address it by learning a common subspace representation for different modality combinations. However, we reveal that they are sub-optimal due to their implicit constraint on intra-class representation. Specifically, samples with different modalities within the same class are forced to learn representations in the same direction. This hinders the model from capturing modality-specific information, resulting in insufficient learning. To this end, we propose a novel Decoupled Multimodal Representation Network (DMRNet) to assist robust multimodal learning. Specifically, DMRNet models the input from different modality combinations as a probabilistic distribution instead of a fixed point in the latent space, and samples embeddings from the distribution for…
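
The idea of treating an input as a distribution in the latent space and sampling embeddings from it can be illustrated with a minimal sketch. The abstract does not say which distribution family DMRNet uses, so the diagonal Gaussian here is an assumption, and `sample_embedding`, `mu`, and `log_var` are hypothetical names rather than the authors' implementation.

```python
import math
import random

def sample_embedding(mu, log_var, rng):
    """Draw one embedding from a diagonal Gaussian N(mu, diag(exp(log_var))).

    Assumption: a Gaussian parameterised by a mean and a log-variance per
    dimension. The source only says the input is modelled as a
    "probabilistic distribution"; it does not name the family.
    """
    # z = mu + sigma * eps, with eps ~ N(0, 1) drawn independently per dimension
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

# Hypothetical encoder outputs for one input (one modality combination).
rng = random.Random(0)
mu = [1.0, -2.0, 0.5]
log_var = [-2.0, -2.0, -2.0]

# Each draw is a different embedding of the same input: a cloud, not a point.
samples = [sample_embedding(mu, log_var, rng) for _ in range(1000)]
mean = [sum(s[i] for s in samples) / len(samples) for i in range(len(mu))]
```

In this sketch the sampled embeddings scatter around `mu` with a spread set by `log_var`. As the variance shrinks toward zero, the samples collapse back to the fixed-point representation that the abstract contrasts against.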

Taxonomy

Topics: Text and Document Classification Technologies

Methods: Softmax · Attention Is All You Need