Robust Multimodal Learning with Missing Modalities via Parameter-Efficient Adaptation

Md Kaykobad Reza; Ashley Prater-Bennette; M. Salman Asif

arXiv:2310.03986 · cs.CV · October 14, 2024

Open Access

TL;DR

This paper introduces a parameter-efficient adaptation method for pretrained multimodal networks that enhances robustness to missing modalities by modulating intermediate features, outperforming existing approaches across diverse tasks and datasets.

Contribution

The authors propose a simple, parameter-efficient adaptation technique that makes pretrained multimodal models more robust to missing modalities by modulating intermediate features, while adding fewer than 1% additional parameters.
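To make the idea of "modulating intermediate features" concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the summary above does not say which form of modulation is used, so the per-channel scale-and-shift form, the `AdaptedLayer` class, and the 256-dimensional layer size are all assumptions chosen for illustration. The sketch only shows how a frozen pretrained layer can be wrapped with a small set of extra parameters and how the added parameter count compares with the frozen one.

```python
import random


def linear(x, weight, bias):
    """Frozen pretrained layer: y = W x + b (weights are never updated)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weight, bias)]


class AdaptedLayer:
    """Hypothetical wrapper: frozen linear layer plus per-channel modulation.

    Assumption: modulation is a learned scale and shift applied to the
    layer's output features. The source does not confirm this exact form.
    """

    def __init__(self, weight, bias):
        self.weight = weight            # frozen
        self.bias = bias                # frozen
        dim = len(bias)
        self.scale = [1.0] * dim        # trainable, identity at initialization
        self.shift = [0.0] * dim        # trainable, identity at initialization

    def forward(self, x):
        features = linear(x, self.weight, self.bias)
        return [s * f + t
                for f, s, t in zip(features, self.scale, self.shift)]

    def frozen_param_count(self):
        return len(self.weight) * len(self.weight[0]) + len(self.bias)

    def adapter_param_count(self):
        return len(self.scale) + len(self.shift)


# Illustrative sizes only; not taken from the paper.
random.seed(0)
IN_DIM, OUT_DIM = 256, 256
weight = [[random.gauss(0.0, 0.05) for _ in range(IN_DIM)]
          for _ in range(OUT_DIM)]
bias = [0.0] * OUT_DIM

layer = AdaptedLayer(weight, bias)
x = [random.gauss(0.0, 1.0) for _ in range(IN_DIM)]

baseline = linear(x, weight, bias)   # output of the unmodified layer
adapted = layer.forward(x)           # identical until scale/shift are trained

ratio = layer.adapter_param_count() / layer.frozen_param_count()
print(f"added parameters: {ratio:.2%} of the frozen layer")
```

In this sketch only `scale` and `shift` would be updated when adapting to a given set of available modalities, while `weight` and `bias` stay fixed. The training loop is omitted, since the summary gives no details about it.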

Findings

1. Improves performance with missing modalities across five tasks.
2. Adaptation requires fewer than 1% of the total parameters.
3. Outperforms existing methods in robustness to missing modalities.

Abstract

Multimodal learning seeks to utilize data from multiple sources to improve the overall performance of downstream tasks. It is desirable for redundancies in the data to make multimodal systems robust to missing or corrupted observations in some correlated modalities. However, we observe that the performance of several existing multimodal networks significantly deteriorates if one or multiple modalities are absent at test time. To enable robustness to missing modalities, we propose a simple and parameter-efficient adaptation procedure for pretrained multimodal networks. In particular, we exploit modulation of intermediate features to compensate for the missing modalities. We demonstrate that such adaptation can partially bridge performance drop due to missing modalities and outperform independent, dedicated networks trained for the available modality combinations in some cases. The…

Taxonomy

Topics: Text and Document Classification Technologies · Music and Audio Processing · Speech Recognition and Synthesis