Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion

QingYuan Jiang; Longfei Huang; Yang Yang

arXiv:2502.20120 · cs.CV · January 30, 2026

TL;DR

This paper introduces a novel multimodal learning method that dynamically balances the classification abilities of different modalities using boosting, effectively addressing modality imbalance and improving performance over existing methods.

Contribution

It proposes a sustained boosting algorithm with adaptive classifier assignment to balance modality classification abilities, backed by theoretical convergence analysis.

Findings

01 Outperforms state-of-the-art multimodal learning baselines

02 Effectively mitigates modality imbalance in classification tasks

03 Demonstrates improved accuracy on widely used datasets

Abstract

Multimodal learning (MML) is significantly constrained by modality imbalance, leading to suboptimal performance in practice. While existing approaches primarily focus on balancing the learning of different modalities to address this issue, they fundamentally overlook the inherent disproportion in model classification ability, which serves as the primary cause of this phenomenon. In this paper, we propose a novel multimodal learning approach to dynamically balance the classification ability of weak and strong modalities by incorporating the principle of boosting. Concretely, we first propose a sustained boosting algorithm in multimodal learning by simultaneously optimizing the classification and residual errors. Subsequently, we introduce an adaptive classifier assignment strategy to dynamically facilitate the classification performance of the weak modality. Furthermore, we theoretically…
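
The abstract names two ingredients: boosting applied to the weak modality, and adaptive assignment of classifiers to whichever modality currently lags. The sketch below illustrates that general idea only and is not the authors' algorithm. It swaps in plain gradient boosting with decision stumps on scalar features, where each new head fits the residual between the labels and the current prediction, and it assigns a new head to the modality with the lower training accuracy. The modality names (audio, video), the toy data, and every function name are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    best = None
    for t in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv = sum(left) / len(left)
        rv = sum(right) / len(right) if right else 0.0
        err = sum((r - (lv if x <= t else rv)) ** 2
                  for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    return best[1:]  # (threshold, left value, right value)


def score(heads, x):
    """Logit of one modality: the sum of all of its heads' outputs."""
    return sum(lv if x <= t else rv for t, lv, rv in heads)


def accuracy(heads, xs, ys):
    hits = sum((score(heads, x) >= 0.0) == (y == 1) for x, y in zip(xs, ys))
    return hits / len(ys)


def add_head(heads, xs, ys):
    """Boosting step: the new head fits label minus current probability."""
    residuals = [y - sigmoid(score(heads, x)) for x, y in zip(xs, ys)]
    heads.append(fit_stump(xs, residuals))


def balance_by_boosting(data, ys, max_rounds=5):
    """Assumed assignment rule: give each extra head to the weakest modality."""
    heads = {m: [] for m in data}
    for m in data:  # every modality starts with a single head
        add_head(heads[m], data[m], ys)
    for _ in range(max_rounds):
        accs = {m: accuracy(heads[m], data[m], ys) for m in data}
        weak = min(accs, key=accs.get)
        if accs[weak] == max(accs.values()):
            break  # no gap left between modalities
        add_head(heads[weak], data[weak], ys)
    return heads


# Toy data: "audio" is separable with one split, "video" needs two.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
data = {
    "audio": [0, 1, 2, 3, 4, 5, 6, 7],
    "video": [0, 1, 6, 7, 2, 3, 4, 5],
}
heads = balance_by_boosting(data, labels)
for m in data:
    print(m, len(heads[m]), accuracy(heads[m], data[m], labels))
```

In this toy setup the extra head goes to the harder modality, and the loop stops once the accuracy gap closes. The paper's own method optimizes classification and residual errors simultaneously inside a multimodal network, which this sketch does not attempt to reproduce.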

Videos

Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion · SlidesLive

Taxonomy

Topics: Imbalanced Data Classification Techniques · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning