G$^{2}$D: Boosting Multimodal Learning with Gradient-Guided Distillation

Mohammed Rakib; Arunkumar Bagavathi

arXiv:2506.21514 · cs.CV · October 21, 2025

TL;DR

G$^{2}$D introduces a gradient-guided distillation framework with dynamic modality prioritization to improve multimodal learning by balancing contributions from all modalities, especially weaker ones.

Contribution

This paper proposes G$^{2}$D, a novel knowledge distillation method with dynamic modality prioritization to address modality imbalance in multimodal learning.

Findings

01

Outperforms state-of-the-art methods in classification tasks.

02

Enhances the contribution of weak modalities during training.

03

Validated on multiple real-world datasets.

Abstract

Multimodal learning aims to leverage information from diverse data modalities to achieve more comprehensive performance. However, conventional multimodal models often suffer from modality imbalance, where one or a few modalities dominate model optimization, leading to suboptimal feature representation and underutilization of weak modalities. To address this challenge, we introduce Gradient-Guided Distillation (G$^{2}$D), a knowledge distillation framework that optimizes the multimodal model with a custom-built loss function that fuses both unimodal and multimodal objectives. G$^{2}$D further incorporates a dynamic sequential modality prioritization (SMP) technique in the learning process to ensure each modality leads the learning process, avoiding the pitfall of stronger modalities overshadowing weaker ones. We validate G$^{2}$D on multiple real-world datasets and show that G$^{2}$D…
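
The abstract describes a loss that fuses unimodal and multimodal objectives inside a distillation framework. The following is a minimal sketch of what such a loss might look like, not the authors' implementation. Everything beyond the abstract's description is an assumption: the function names, the weights `alpha` and `beta`, the use of KL divergence against unimodal teacher outputs, and the heuristic of ordering modalities by teacher confidence (weakest first) as a stand-in for SMP. Plain Python is used so the arithmetic stays visible; the official repository is written in PyTorch.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one sample against an integer class label."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def fused_loss(fused_logits, student_logits, teacher_logits, label,
               alpha=1.0, beta=1.0, temperature=1.0):
    """Hypothetical fused objective (assumed form, not the paper's loss).

    fused_logits:   logits of the multimodal head
    student_logits: dict of modality name -> unimodal branch logits
    teacher_logits: dict of modality name -> unimodal teacher logits
    """
    # Multimodal objective on the fused prediction.
    multimodal = cross_entropy(fused_logits, label)
    # Unimodal objectives, one per modality branch.
    unimodal = sum(cross_entropy(z, label) for z in student_logits.values())
    # Distillation: pull each branch toward its unimodal teacher.
    distill = sum(
        kl_divergence(softmax(teacher_logits[m], temperature),
                      softmax(student_logits[m], temperature))
        for m in student_logits
    )
    return multimodal + alpha * unimodal + beta * distill


def prioritization_order(teacher_logits, label):
    """Assumed heuristic: rank modalities weakest first, using the
    teacher's probability on the true class as a confidence score."""
    confidence = {m: softmax(z)[label] for m, z in teacher_logits.items()}
    return sorted(confidence, key=confidence.get)
```

For example, `prioritization_order({"audio": [2.0, 0.0], "video": [0.5, 0.0]}, 0)` returns `["video", "audio"]`, because the video teacher is less confident on the true class. When a student branch matches its teacher exactly, the distillation term is zero and only the two cross-entropy terms remain.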

Code & Models

Repositories

raison-lab/g2d
PyTorch · Official

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications

Methods: Knowledge Distillation