Balancing Multimodal Domain Generalization via Gradient Modulation and Projection
Hongzhao Li, Guohao Shen, Shupan Li, Mingliang Xu, Muhammad Haris Khan

TL;DR
This paper introduces Gradient Modulation Projection (GMP), a novel method to balance gradient contributions across modalities in Multimodal Domain Generalization, improving model generalization to unseen domains by dynamically adjusting gradients based on confidence levels.
Contribution
The paper proposes GMP, a unified gradient balancing strategy that decouples and modulates gradients for classification and domain-invariance, enhancing cross-domain generalization in MMDG.
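The two ingredients named above, modulation and projection, can be illustrated with a generic sketch. This is not the paper's GMP algorithm, whose details are not given in this summary. It assumes a simple confidence-based scaling rule and a standard "remove the conflicting component" projection. The function names (`modulation_coeffs`, `project_if_conflicting`), the `alpha` parameter, and the tanh-based rule are all hypothetical choices made for illustration.

```python
import math
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Inner product of two flat gradient vectors."""
    return sum(x * y for x, y in zip(a, b))


def modulation_coeffs(confidences: Sequence[float], alpha: float = 1.0) -> List[float]:
    """Assumed rule: shrink the gradient of any modality whose confidence
    is above the mean; leave the others at a coefficient of 1.0."""
    mean_conf = sum(confidences) / len(confidences)
    return [1.0 - math.tanh(alpha * max(0.0, c - mean_conf)) for c in confidences]


def project_if_conflicting(g_task: Sequence[float], g_ref: Sequence[float]) -> List[float]:
    """If g_task points against g_ref (negative inner product), remove the
    component of g_task along g_ref; otherwise return g_task unchanged."""
    d = dot(g_task, g_ref)
    n = dot(g_ref, g_ref)
    if d >= 0.0 or n == 0.0:
        return list(g_task)
    return [gt - (d / n) * gr for gt, gr in zip(g_task, g_ref)]


# Toy example: two modalities with hypothetical confidences.
coeffs = modulation_coeffs([0.9, 0.5])

# Toy classification and domain-invariance gradients that conflict.
g_cls = [1.0, 0.0]
g_inv = [-1.0, 1.0]
g_cls_projected = project_if_conflicting(g_cls, g_inv)

# Scale the projected gradient of the more confident modality.
g_final = [coeffs[0] * x for x in g_cls_projected]
```

In this toy setup the more confident modality receives a coefficient below 1.0, and the projected classification gradient no longer opposes the domain-invariance gradient. How GMP actually measures confidence and which gradients it projects would need to be taken from the full paper.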
Findings
GMP achieves state-of-the-art results on multiple benchmarks.
GMP effectively balances modality contributions during training.
GMP is compatible with various MMDG methods.
Abstract
Multimodal Domain Generalization (MMDG) leverages the complementary strengths of multiple modalities to enhance model generalization on unseen domains. A central challenge in multimodal learning is optimization imbalance, where modalities converge at different speeds during training. This imbalance leads to unequal gradient contributions, allowing some modalities to dominate the learning process while others lag behind. Existing balancing strategies typically regulate each modality's gradient contribution based on its classification performance on the source domain to alleviate this issue. However, relying solely on source-domain accuracy neglects a key insight in MMDG: modalities that excel on the source domain may generalize poorly to unseen domains, limiting cross-domain gains. To overcome this limitation, we propose Gradient Modulation Projection (GMP), a unified strategy that…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Topic Modeling · Multimodal Machine Learning Applications
