MILES: Modality-Informed Learning Rate Scheduler for Balancing Multimodal Learning
Alejandro Guerra-Manzanares, Farah E. Shamout

TL;DR
MILES is a novel training strategy that dynamically adjusts learning rates based on modality utilization to improve the performance and balance of multimodal neural networks.
Contribution
We introduce MILES, a learning rate scheduler that balances modality contributions during training, enhancing multimodal and unimodal performance.
Findings
MILES outperforms seven state-of-the-art baselines across four tasks.
It effectively balances modality usage during training.
Results show improved multimodal and unimodal predictions.
Abstract
The aim of multimodal neural networks is to combine diverse data sources, referred to as modalities, to achieve enhanced performance compared to relying on a single modality. However, training of multimodal networks is typically hindered by modality overfitting, where the network relies excessively on one of the available modalities. This often yields sub-optimal performance, hindering the potential of multimodal learning and resulting in marginal improvements relative to unimodal models. In this work, we present the Modality-Informed Learning ratE Scheduler (MILES) for training multimodal joint fusion models in a balanced manner. MILES leverages the differences in modality-wise conditional utilization rates during training to effectively balance multimodal learning. The learning rate is dynamically adjusted during training to balance the speed of learning from each modality by the…
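The idea of adjusting learning rates from differences in modality-wise utilization can be illustrated with a small sketch. This is not the authors' implementation. The abstract does not define the conditional utilization rate or the update rule, so the code assumes a rate equal to the relative accuracy gain from adding one modality to the other, and a simple rule that slows the encoder of the more-utilized modality once the gap exceeds a threshold. The function names, `threshold`, and `reduction` are all hypothetical.

```python
from typing import Dict


def conditional_utilization(acc_multimodal: float, acc_other_unimodal: float) -> float:
    """Relative accuracy gain from adding a modality on top of the other one.

    ASSUMPTION: this definition is illustrative; the abstract does not give one.
    """
    if acc_multimodal <= 0.0:
        return 0.0
    return (acc_multimodal - acc_other_unimodal) / acc_multimodal


def adjust_learning_rates(
    base_lr: float,
    acc_multimodal: float,
    acc_a: float,
    acc_b: float,
    threshold: float = 0.1,   # hypothetical tolerance on the utilization gap
    reduction: float = 0.5,   # hypothetical factor applied to the dominant modality
) -> Dict[str, float]:
    """Return per-modality learning rates for a two-modality joint fusion model.

    ASSUMPTION: slowing the dominant modality's encoder is one plausible
    balancing rule, not necessarily the one used by MILES.
    """
    # Utilization of A given B: how much accuracy A adds beyond B alone.
    u_a = conditional_utilization(acc_multimodal, acc_b)
    # Utilization of B given A: how much accuracy B adds beyond A alone.
    u_b = conditional_utilization(acc_multimodal, acc_a)
    gap = u_a - u_b

    lr_a, lr_b = base_lr, base_lr
    if gap > threshold:
        lr_a = base_lr * reduction   # A dominates, so slow it down
    elif gap < -threshold:
        lr_b = base_lr * reduction   # B dominates, so slow it down
    return {"a": lr_a, "b": lr_b}
```

In a training loop, such a function would be called once per epoch with validation accuracies of the fused prediction and of each unimodal head, and the returned values would be written to the corresponding optimizer parameter groups.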
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Topic Modeling
