One-stage Modality Distillation for Incomplete Multimodal Learning
Shicai Wei, Yang Luo, Chunbo Luo

TL;DR
This paper introduces a one-stage modality distillation framework that unifies knowledge transfer and modality fusion, enabling effective learning from incomplete multimodal data in tasks like RGB-D classification and segmentation.
Contribution
The proposed framework combines privileged knowledge transfer and modality fusion into a single multi-task learning process, improving performance with incomplete modalities.
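As a rough illustration of what "a single multi-task learning process" can mean, the sketch below sums a task loss on the fused prediction and a transfer loss on the features into one scalar objective. It is a minimal sketch under stated assumptions, not the authors' implementation: the mean-squared-error transfer term stands in for the paper's joint adaptation network, and the names `one_stage_loss`, `hallucinated_feat`, `privileged_feat`, and `transfer_weight` are hypothetical.

```python
import math


def cross_entropy(logits, label):
    """Softmax cross-entropy for a single sample (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def mse(a, b):
    """Mean squared error between two equal-length feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def one_stage_loss(fused_logits, label, hallucinated_feat, privileged_feat,
                   transfer_weight=1.0):
    """Joint objective: task loss on the fused prediction plus a weighted
    transfer loss that pulls features produced from the available modality
    toward features of the privileged (training-only) modality.

    Assumption: MSE is a simple stand-in for the transfer term; the paper
    describes a joint adaptation network instead.
    """
    task = cross_entropy(fused_logits, label)
    transfer = mse(hallucinated_feat, privileged_feat)
    # Both terms contribute to one scalar, so a single optimization step
    # would update the transfer and fusion branches together.
    return task + transfer_weight * transfer


loss = one_stage_loss([2.0, 0.5], 0, [0.1, 0.4], [0.0, 0.5], transfer_weight=0.5)
```

A two-stage pipeline would instead minimize the transfer term first and the task term afterwards; summing them is what lets the task gradient influence which privileged information is kept.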
Findings
Achieves state-of-the-art results on RGB-D classification and segmentation.
Effectively handles incomplete-modality input across various scenarios.
Abstract
Learning based on multimodal data has attracted increasing interest recently. While a variety of sensory modalities can be collected for training, not all of them are always available in deployment scenarios, which raises the challenge of inferring with incomplete modalities. To address this issue, this paper presents a one-stage modality distillation framework that unifies privileged knowledge transfer and modality information fusion into a single optimization procedure via multi-task learning. Compared with conventional modality distillation, which performs the two steps independently, this helps capture the valuable representation that can directly assist the final model inference. Specifically, we propose the joint adaptation network for the modality transfer task to preserve the privileged information. This addresses the representation heterogeneity caused by input discrepancy via the…
Taxonomy
Topics: Advanced Chemical Sensor Technologies
