SHAPE: An Unified Approach to Evaluate the Contribution and Cooperation of Individual Modalities

Pengbo Hu; Xingyu Li; Yi Zhou

arXiv:2205.00302·cs.LG·May 3, 2022

TL;DR

This paper introduces SHAPE scores, based on Shapley values, to quantify the contribution and cooperation of individual modalities in multi-modal deep learning models, aiding better understanding and fusion strategies.

Contribution

The paper proposes SHAPE, a scoring method that systematically evaluates how much each modality contributes to a multi-modal model and how strongly the modalities cooperate, a question that earlier work on fusion methods left largely unquantified.

Findings

01 On some tasks where the modalities are complementary, multi-modal models still tend to rely on the dominant modality.

02 Models learn to exploit cross-modal cooperation when the modalities are indispensable to the task.

03 Early-stage fusion is preferable when the modalities cooperate significantly.

Abstract

As deep learning advances, there is an ever-growing demand for models capable of synthesizing information from multi-modal resources to address the complex tasks raised from real-life applications. Recently, many large multi-modal datasets have been collected, on which researchers actively explore different methods of fusing multi-modal information. However, little attention has been paid to quantifying the contribution of different modalities within the proposed models. In this paper, we propose the SHapley vAlue-based PErceptual (SHAPE) scores that measure the marginal contribution of individual modalities and the degree of cooperation across modalities. Using these scores, we systematically evaluate different fusion methods on different multi-modal datasets for different tasks. Our experiments suggest that for some tasks where different modalities are complementary,…
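The abstract describes the scores as Shapley value-based measures of each modality's marginal contribution. As a rough illustration of that underlying idea only, the sketch below computes classical Shapley values over a set of modalities by exact subset enumeration. It is not the paper's SHAPE definition: the function name `shapley_values`, the modality names, and the accuracy table are all hypothetical, and how the paper evaluates a model on a subset of modalities or derives its cooperation score is not stated on this page.

```python
from itertools import combinations
from math import factorial


def shapley_values(modalities, value):
    """Classical Shapley value of each modality.

    `value` maps a frozenset of modality names to a scalar performance
    (for example, accuracy when only those modalities are available).
    """
    n = len(modalities)
    scores = {}
    for m in modalities:
        others = [x for x in modalities if x != m]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain m.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal gain from adding modality m to coalition s.
                total += weight * (value(s | {m}) - value(s))
        scores[m] = total
    return scores


# Hypothetical accuracies for every subset of two modalities (made-up numbers).
toy_accuracy = {
    frozenset(): 0.50,                    # no modality available: chance level
    frozenset({"image"}): 0.70,
    frozenset({"text"}): 0.60,
    frozenset({"image", "text"}): 0.90,
}

scores = shapley_values(["image", "text"], toy_accuracy.__getitem__)
print(scores)
```

In this toy table the per-modality scores sum to the gap between the full model and the empty baseline, which is the efficiency property of Shapley values. Exact enumeration visits every subset, so it is only practical for the small number of modalities typical of multi-modal datasets.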

Code & Models

Repositories

zhouyilab/shape (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Speech and dialogue systems