Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion
QingYuan Jiang, Longfei Huang, Yang Yang

TL;DR
This paper introduces a novel multimodal learning method that dynamically balances the classification abilities of different modalities using boosting, effectively addressing modality imbalance and improving performance over existing methods.
Contribution
The paper proposes a sustained boosting algorithm with adaptive classifier assignment to balance the classification abilities of different modalities, backed by a theoretical convergence analysis.
Findings
Outperforms state-of-the-art multimodal learning baselines
Effectively mitigates modality imbalance in classification tasks
Demonstrates improved accuracy on widely used datasets
Abstract
Multimodal learning (MML) is significantly constrained by modality imbalance, leading to suboptimal performance in practice. While existing approaches primarily focus on balancing the learning of different modalities to address this issue, they fundamentally overlook the inherent disproportion in model classification ability, which serves as the primary cause of this phenomenon. In this paper, we propose a novel multimodal learning approach to dynamically balance the classification ability of weak and strong modalities by incorporating the principle of boosting. Concretely, we first propose a sustained boosting algorithm in multimodal learning by simultaneously optimizing the classification and residual errors. Subsequently, we introduce an adaptive classifier assignment strategy to dynamically facilitate the classification performance of the weak modality. Furthermore, we theoretically…
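The abstract names the two ingredients (boosting on residual errors, and adaptively giving the weak modality more classifiers) but does not spell out the algorithm. The sketch below is only an illustration of that general idea under stated assumptions, not the authors' method: it uses plain gradient boosting with linear heads, takes "one-hot minus softmax" as the residual, and picks the weak modality by training accuracy. All names here (`fit_head`, `boost`, `make_toy`) and the toy data are hypothetical.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def predict_logits(x, heads):
    """Boosted logits for one sample: the sum of every head's linear output."""
    n_classes = len(heads[0][0])
    out = [0.0] * n_classes
    for w in heads:  # w[j][c]: weight from feature j to class c
        for j, xj in enumerate(x):
            for c in range(n_classes):
                out[c] += xj * w[j][c]
    return out

def fit_head(xs, targets, lr=1.0):
    """Weak learner (assumption): one averaged gradient step of a linear map."""
    n, d, k = len(xs), len(xs[0]), len(targets[0])
    w = [[0.0] * k for _ in range(d)]
    for x, t in zip(xs, targets):
        for j in range(d):
            for c in range(k):
                w[j][c] += lr * x[j] * t[c] / n
    return w

def accuracy(xs, ys, heads):
    hits = 0
    for x, y in zip(xs, ys):
        z = predict_logits(x, heads)
        hits += z.index(max(z)) == y
    return hits / len(ys)

def residuals(xs, ys, heads, n_classes):
    """Negative cross-entropy gradient w.r.t. the logits: one-hot minus softmax."""
    res = []
    for x, y in zip(xs, ys):
        p = softmax(predict_logits(x, heads))
        res.append([(1.0 if c == y else 0.0) - p[c] for c in range(n_classes)])
    return res

def boost(modalities, ys, n_classes, rounds=5):
    """Each round, add one residual-fitting head to the currently weaker modality."""
    onehot = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in ys]
    heads = {m: [fit_head(xs, onehot)] for m, xs in modalities.items()}
    for _ in range(rounds):
        # Assumed assignment rule: lowest training accuracy gets the new head.
        weak = min(heads, key=lambda m: accuracy(modalities[m], ys, heads[m]))
        res = residuals(modalities[weak], ys, heads[weak], n_classes)
        heads[weak].append(fit_head(modalities[weak], res))
    return heads

def make_toy(n=120, n_classes=3, seed=0):
    """Hypothetical data: one noisy ("weak") and one clean ("strong") modality."""
    rng = random.Random(seed)
    ys = [rng.randrange(n_classes) for _ in range(n)]
    def feats(signal, noise):
        return [[signal * (c == y) + rng.gauss(0.0, noise) for c in range(n_classes)]
                for y in ys]
    return {"weak": feats(0.3, 1.0), "strong": feats(3.0, 0.1)}, ys

if __name__ == "__main__":
    modalities, ys = make_toy()
    heads = boost(modalities, ys, n_classes=3, rounds=4)
    for name, hs in heads.items():
        print(name, "heads:", len(hs), "train acc:", accuracy(modalities[name], ys, hs))
```

In this toy setup the extra heads go to the noisy modality, which is the behaviour the abstract describes at a high level. How the paper actually defines the residual objective, the assignment criterion, and the joint training with a shared encoder cannot be inferred from the truncated abstract.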
Taxonomy
Topics: Imbalanced Data Classification Techniques · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning
