Balanced Audiovisual Dataset for Imbalance Analysis
Wenke Xia, Xu Zhao, Xincheng Pang, Changqing Zhang, Di Hu

TL;DR
This paper investigates modality bias in audiovisual datasets, revealing that existing imbalance algorithms are limited and proposing a balanced dataset to better analyze and address modality imbalance in multimodal learning.
Contribution
It introduces a balanced audiovisual dataset with uniform modality discrepancy and re-evaluates existing imbalance algorithms on it, highlighting their limitations.
Findings
Existing algorithms only balance modalities roughly.
Models perform worse on samples with high modality discrepancy.
The balanced dataset reveals limitations of current imbalance methods.
Abstract
The imbalance problem is widespread in machine learning. It also arises in multimodal learning, where it is caused by the intrinsic discrepancy between the modalities of samples. Recent works have attempted to solve the modality imbalance problem from an algorithmic perspective; however, they do not fully analyze the influence of modality bias in datasets. Concretely, existing multimodal datasets are usually collected for specific tasks, where one modality tends to perform better than the others in most conditions. In this work, to comprehensively explore the influence of modality bias, we first split existing datasets into different subsets by estimating sample-wise modality discrepancy. Surprisingly, we find that multimodal models with existing imbalance algorithms consistently perform worse than the unimodal one on specific subsets, in accordance with the modality bias. To…
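The subset-splitting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how sample-wise modality discrepancy is estimated, so this example assumes it is the difference between the audio-only and visual-only model confidences on the ground-truth class, and that subsets are equal-width bins over that difference. The function names `modality_discrepancy` and `split_by_discrepancy`, and the `num_bins` parameter, are hypothetical and not taken from the paper.

```python
from typing import Dict, List, Sequence


def modality_discrepancy(audio_conf: float, visual_conf: float) -> float:
    """Signed gap between unimodal confidences on the true class.

    ASSUMPTION: this estimator is illustrative; the paper may use another.
    Positive values mean audio is more reliable for the sample,
    negative values mean visual is more reliable.
    """
    return audio_conf - visual_conf


def split_by_discrepancy(
    audio_confs: Sequence[float],
    visual_confs: Sequence[float],
    num_bins: int = 4,
) -> Dict[int, List[int]]:
    """Group sample indices into equal-width bins over the range [-1, 1]."""
    if len(audio_confs) != len(visual_confs):
        raise ValueError("both modalities need one confidence per sample")
    subsets: Dict[int, List[int]] = {b: [] for b in range(num_bins)}
    width = 2.0 / num_bins
    for idx, (a, v) in enumerate(zip(audio_confs, visual_confs)):
        d = modality_discrepancy(a, v)
        # Clamp so that a discrepancy of exactly 1.0 lands in the last bin.
        b = min(int((d + 1.0) / width), num_bins - 1)
        subsets[b].append(idx)
    return subsets


if __name__ == "__main__":
    # Toy confidences in [0, 1] for four samples (made-up values).
    audio = [0.9, 0.1, 0.5, 1.0]
    visual = [0.1, 0.9, 0.5, 0.0]
    print(split_by_discrepancy(audio, visual))
```

Under these assumptions, a dataset biased toward one modality would show most samples in the bins on one side of zero, while a dataset with uniform modality discrepancy would fill the bins roughly evenly. Per-subset accuracy of multimodal and unimodal models could then be compared bin by bin.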
Taxonomy
Topics: Imbalanced Data Classification Techniques · Music and Audio Processing
