Learnable Cross-modal Knowledge Distillation for Multi-modal Learning with Missing Modality

Hu Wang; Congbo Ma; Jianpeng Zhang; Yuan Zhang; Jodie Avery; Louise Hull; Gustavo Carneiro

arXiv:2310.01035·cs.CV·March 17, 2025

Open Access

TL;DR

This paper introduces a learnable cross-modal knowledge distillation framework that adaptively identifies key modalities and transfers knowledge from the best-performing ones to improve multi-modal learning with missing modalities, especially in medical image segmentation.

Contribution

The proposed LCKD method selects the most qualified teacher modalities and distills their knowledge to the other modalities, improving performance when some modalities are missing.

Findings

01. LCKD outperforms existing methods by a significant margin.

02. LCKD achieves up to a 5.99% improvement in segmentation Dice score.

03. LCKD is effective in handling missing-modality scenarios in medical imaging.

Abstract

Missing modalities pose a critical and non-trivial problem for multi-modal models. In multi-modal tasks, certain modalities commonly contribute more than others, and if those important modalities are missing, model performance drops significantly. This fact remains unexplored by current multi-modal approaches, which recover the representation of missing modalities by feature reconstruction or blind feature aggregation from other modalities instead of extracting useful information from the best-performing modalities. In this paper, we propose a Learnable Cross-modal Knowledge Distillation (LCKD) model that adaptively identifies important modalities and distils knowledge from them to help other modalities, addressing the missing modality issue from a cross-modal perspective. Our approach introduces a teacher election procedure to select the…
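The two ideas named in the abstract, electing a teacher modality and distilling its knowledge to the others, can be illustrated with a small sketch. This is not the paper's implementation: the election criterion (highest validation score), the mean-squared feature distance, the modality names `m1` to `m3`, and all numbers are assumptions chosen only to make the example concrete.

```python
from typing import Dict, List, Optional

Features = List[float]


def elect_teacher(val_scores: Dict[str, float]) -> str:
    """Return the modality with the highest validation score.

    Assumption: a single teacher chosen by argmax over per-modality scores.
    """
    return max(val_scores, key=lambda name: val_scores[name])


def distillation_loss(features: Dict[str, Optional[Features]], teacher: str) -> float:
    """Average mean-squared distance between each available student
    modality and the teacher modality. Missing modalities are None."""
    teacher_feat = features.get(teacher)
    if teacher_feat is None:
        return 0.0  # teacher missing for this sample: nothing to distil from
    total, count = 0.0, 0
    for name, feat in features.items():
        if name == teacher or feat is None:
            continue  # skip the teacher itself and any missing modality
        sq_err = sum((s - t) ** 2 for s, t in zip(feat, teacher_feat))
        total += sq_err / len(teacher_feat)
        count += 1
    return total / count if count else 0.0


# Hypothetical per-modality validation scores and feature vectors.
scores = {"m1": 0.61, "m2": 0.78, "m3": 0.70}
feats = {"m1": [1.0, 2.0], "m2": [1.0, 0.0], "m3": None}  # m3 is missing

teacher = elect_teacher(scores)
loss = distillation_loss(feats, teacher)
print(teacher, loss)
```

In a training loop, a term like `loss` would typically be added to the task loss so that student modalities are pulled toward the elected teacher's representation; how the paper weights or schedules this is not stated in the text above.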

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques

Methods: Knowledge Distillation