On Modality Bias Recognition and Reduction

Yangyang Guo; Liqiang Nie; Harry Cheng; Zhiyong Cheng; Mohan Kankanhalli; Alberto Del Bimbo

arXiv:2202.12690·cs.CV·September 28, 2022

TL;DR

This paper systematically studies modality bias in multi-modal classification, highlighting how spurious correlations cause dominance of certain modalities, and proposes a loss function to mitigate this bias, improving model performance.

Contribution

It introduces a comprehensive analysis of modality bias, creates new OoD datasets for evaluation, and proposes a plug-and-play loss to reduce bias and enhance multi-modal learning.

Findings

01. The proposed method improves performance across multiple datasets.
02. Existing methods suffer from modality bias in OoD settings.
03. The loss function effectively reduces modality dominance in models.

Abstract

Making each modality in multi-modal data contribute is of vital importance to learning a versatile multi-modal model. Existing methods, however, are often dominated by one or a few modalities during model training, resulting in sub-optimal performance. In this paper, we refer to this problem as modality bias and study it systematically and comprehensively in the context of multi-modal classification. Through several empirical analyses, we recognize that one modality affects the model prediction more simply because this modality has a spurious correlation with instance labels. To facilitate evaluation of the modality bias problem, we construct two datasets, for the colored digit recognition and video action recognition tasks respectively, in line with the Out-of-Distribution (OoD) protocol. Collaborating with the benchmarks in the visual…
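To make the idea of a spurious correlation under an OoD protocol concrete, here is a minimal sketch for a colored-digit setting. It is not the paper's dataset construction, which this page does not describe. The function names, the `bias_prob` parameter, and the 0.9 / 0.0 values are all assumptions chosen for illustration. The sketch only assigns a color id to each digit label so that color agrees with the label most of the time in training and only by chance at test time.

```python
import random

NUM_CLASSES = 10  # digits 0-9; color ids share the same range (illustrative assumption)


def assign_colors(labels, bias_prob, seed=0):
    """Assign one color id per label.

    With probability `bias_prob` the color id equals the label (a spurious
    shortcut); otherwise the color is drawn uniformly at random.
    """
    rng = random.Random(seed)
    colors = []
    for y in labels:
        if rng.random() < bias_prob:
            colors.append(y)
        else:
            colors.append(rng.randrange(NUM_CLASSES))
    return colors


def color_label_agreement(labels, colors):
    """Fraction of samples whose color id equals the label.

    This is also the accuracy of a predictor that looks only at color.
    """
    return sum(int(c == y) for y, c in zip(labels, colors)) / len(labels)


# Hypothetical labels standing in for a digit dataset.
label_rng = random.Random(42)
labels = [label_rng.randrange(NUM_CLASSES) for _ in range(10000)]

# Training split: color is strongly tied to the label (assumed 0.9).
train_colors = assign_colors(labels, bias_prob=0.9, seed=1)
# OoD test split: the tie is removed, so color carries no label information.
test_colors = assign_colors(labels, bias_prob=0.0, seed=2)

train_agreement = color_label_agreement(labels, train_colors)
test_agreement = color_label_agreement(labels, test_colors)
print(f"color-only accuracy, train: {train_agreement:.2f}")
print(f"color-only accuracy, OoD test: {test_agreement:.2f}")
```

Under these assumptions, a model that relies on color alone looks strong on the training split and falls to roughly chance level on the OoD split. That gap is the kind of modality dominance the abstract attributes to spurious correlation.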

Code & Models

Repositories

guoyang9/AdaVQA (PyTorch, official)

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition