PDMP: Rethinking Balanced Multimodal Learning via Performance-Dominant Modality Prioritization

Shicai Wei; Chunbo Luo; Qiang Zhu; Yang Luo

arXiv:2604.05773 · cs.CV · April 8, 2026

TL;DR

This paper introduces PDMP, a strategy that prioritizes the performance-dominant modality (the one with the best unimodal performance) during multimodal training. It does so through asymmetric gradient modulation guided by a ranking of unimodal performance.

Contribution

It proposes a modality prioritization method that improves multimodal learning by favoring the modality with the strongest unimodal performance. The method does not depend on the model structure.

Findings

01. PDMP outperforms existing methods on various datasets.

02. Prioritizing the performance-dominant modality improves multimodal optimization.

03. The method is flexible and applicable across different multimodal models.

Abstract

Multimodal learning has attracted increasing attention due to its practicality. However, it often suffers from insufficient optimization, where the multimodal model underperforms even compared to its unimodal counterparts. Existing methods attribute this problem to the imbalanced learning between modalities and solve it by gradient modulation. This paper argues that balanced learning is not the optimal setting for multimodal learning. On the contrary, imbalanced learning driven by the performance-dominant modality that has superior unimodal performance can contribute to better multimodal performance. And the under-optimization problem is caused by insufficient learning of the performance-dominant modality. To this end, we propose the Performance-Dominant Modality Prioritization (PDMP) strategy to assist multimodal learning. Specifically, PDMP firstly mines the performance-dominant…
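The abstract is cut off before it describes the mechanism, so the following is only a rough sketch of the general idea stated above: rank modalities by unimodal performance, then modulate gradients asymmetrically so the top-ranked modality learns at full strength. The function names, the `damp` factor, and the rule of scaling non-dominant gradients by a constant are all assumptions made for illustration; the paper's actual modulation scheme may differ.

```python
# Illustrative sketch only: NOT the paper's implementation.
# All names and the constant damping rule are hypothetical.

def rank_modalities(unimodal_acc):
    """Return modality names sorted from best to worst unimodal accuracy."""
    return sorted(unimodal_acc, key=unimodal_acc.get, reverse=True)


def asymmetric_coefficients(unimodal_acc, damp=0.5):
    """Give the performance-dominant modality coefficient 1.0 and every
    other modality the (assumed) damping factor `damp`."""
    dominant = rank_modalities(unimodal_acc)[0]
    return {m: (1.0 if m == dominant else damp) for m in unimodal_acc}


def modulate(grads, coeffs):
    """Scale each modality's encoder gradients by its coefficient."""
    return {m: [coeffs[m] * g for g in grads[m]] for m in grads}


# Hypothetical unimodal validation accuracies and toy encoder gradients.
acc = {"audio": 0.62, "video": 0.48}
grads = {"audio": [0.2, -0.4], "video": [1.0, 2.0]}

coeffs = asymmetric_coefficients(acc)
scaled = modulate(grads, coeffs)
```

In this toy setup `audio` is treated as performance-dominant, so its gradients pass through unchanged while the `video` gradients are halved. In a real training loop the same coefficients would typically be applied to each encoder's parameter gradients before the optimizer step.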
