Balancing Multimodal Training Through Game-Theoretic Regularization
Konstantinos Kontras, Thomas Strypsteen, Christos Chatzichristos, Paul Pu Liang, Matthew Blaschko, Maarten De Vos

TL;DR
This paper introduces the Multimodal Competition Regularizer (MCR), a game-theoretic approach that balances modality contributions in multimodal learning, leading to improved performance by addressing modality competition during training.
Contribution
The paper presents a novel game-theoretic regularizer that adaptively balances modalities and refines mutual information bounds to enhance multimodal training effectiveness.
Findings
MCR outperforms existing training strategies and baselines.
Training modalities jointly improves performance on synthetic and real datasets.
Latent space permutations significantly boost computational efficiency.
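To illustrate the last finding, here is a minimal sketch of why a latent-space permutation is cheap: one modality's already-computed latents are shuffled across the batch, so a perturbed copy of the batch is obtained without running the encoder again. This is not the authors' implementation. The names `permute_latents` and `fuse`, the toy latents, and the fixed linear fusion head are all hypothetical and serve only to make the example runnable.

```python
import random

def permute_latents(latents, rng):
    """Return a batch-shuffled copy of `latents` plus the index order used.

    The input list is left untouched; only references to rows are reordered.
    """
    order = list(range(len(latents)))
    rng.shuffle(order)
    return [latents[i] for i in order], order

def fuse(z_a, z_b):
    """Hypothetical fusion head: a fixed linear score per sample (assumption)."""
    return [sum(a) + 2.0 * sum(b) for a, b in zip(z_a, z_b)]

rng = random.Random(0)

# Toy latents for two modalities, batch size 4 (made-up values).
z_audio = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6], [0.7, 0.8]]
z_video = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]

# Break the pairing between modalities by shuffling one of them.
z_video_perm, order = permute_latents(z_video, rng)

clean = fuse(z_audio, z_video)
perturbed = fuse(z_audio, z_video_perm)

# Mean absolute change in the fused score when the video latents are mismatched.
gap = sum(abs(c - p) for c, p in zip(clean, perturbed)) / len(clean)
print(order, gap)
```

In this toy setting, a larger `gap` means the fused output depends more on the permuted modality. How the paper turns such permutations into a training signal is not stated in this summary.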
Abstract
Multimodal learning holds promise for richer information extraction by capturing dependencies across data sources. Yet current training methods often underperform due to modality competition, a phenomenon where modalities contend for training resources, leaving some underoptimized. This raises a pivotal question: how can we address training imbalances, ensure adequate optimization across all modalities, and achieve consistent performance improvements as we transition from unimodal to multimodal data? This paper proposes the Multimodal Competition Regularizer (MCR), inspired by a mutual information (MI) decomposition designed to prevent the adverse effects of competition in multimodal training. Our key contributions are: 1) A game-theoretic framework that adaptively balances modality contributions by encouraging each to maximize its informative role in the final prediction; 2) Refining…
Taxonomy
Topics: Sports Analytics and Performance
