Enhancing multimodal cooperation via sample-level modality valuation

Yake Wei; Ruoxuan Feng; Zihe Wang; Di Hu

arXiv:2309.06255 · cs.CV · June 17, 2024

TL;DR

This paper introduces a sample-level modality valuation metric to better understand and improve the cooperation of different modalities in multimodal learning, leading to significant performance gains.

Contribution

It proposes a novel sample-level modality valuation method to analyze and enhance multimodal cooperation at the sample level with theoretical support.

Findings

01. Sample-level modality discrepancy varies across samples.
02. Enhancing low-contributing modalities improves overall performance.
03. The method achieves significant improvements in multimodal cooperation.

Abstract

One primary topic of multimodal learning is jointly incorporating heterogeneous information from different modalities. However, most models suffer from unsatisfactory multimodal cooperation and cannot jointly utilize all modalities well. Some methods have been proposed to identify and enhance the worse-learnt modality, but they often fail to provide fine-grained, sample-level observation of multimodal cooperation with theoretical support. Hence, it is essential to reasonably observe and improve the fine-grained cooperation between modalities, especially in realistic scenarios where the modality discrepancy can vary across samples. To this end, we introduce a sample-level modality valuation metric to evaluate the contribution of each modality for each sample. Via modality valuation, we observe that modality discrepancy can indeed differ at the sample level…
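The abstract excerpt does not spell out how the valuation metric is computed. As a rough illustration only, one standard way to assign a per-sample contribution to each modality is the Shapley value from cooperative game theory. The sketch below assumes that choice; the function name `shapley_contributions`, the modality names, and the 0/1 value function (1.0 if a hypothetical model predicts the sample correctly using only the given modalities) are all illustrative assumptions, not taken from the page.

```python
from itertools import combinations
from math import factorial


def shapley_contributions(modalities, value):
    """Return {modality: Shapley value} for a single sample.

    `value` maps a frozenset of modality names to a score. Here it is a
    hypothetical stand-in for "how well does the model do on this sample
    when only these modalities are available".
    """
    n = len(modalities)
    phi = {}
    for m in modalities:
        others = [x for x in modalities if x != m]
        total = 0.0
        for k in range(len(others) + 1):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain from adding modality m to coalition s.
                total += weight * (value(s | {m}) - value(s))
        phi[m] = total
    return phi


# Assumed outcomes for one sample: audio alone suffices, visual alone fails.
correct = {
    frozenset(): 0.0,
    frozenset({"audio"}): 1.0,
    frozenset({"visual"}): 0.0,
    frozenset({"audio", "visual"}): 1.0,
}

phi = shapley_contributions(["audio", "visual"], correct.__getitem__)
low_contributing = min(phi, key=phi.get)
print(phi, low_contributing)
```

For this made-up sample the audio modality receives the full contribution and the visual modality receives none, so `low_contributing` would flag visual as the candidate for enhancement on that sample. A different sample could give the opposite ranking, which is the kind of per-sample variation the abstract describes.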

Code & Models

Repositories

gewu-lab/valuate-and-enhance-multimodal-cooperation (PyTorch, official)

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning