A Dimensional Structure based Knowledge Distillation Method for Cross-Modal Learning
Lingyu Si, Hongwei Dong, Wenwen Qiang, Junzhi Yu, Wenlong Zhai, Changwen Zheng, Fanjiang Xu, Fuchun Sun

TL;DR
This paper introduces a novel cross-modal knowledge distillation method based on dimensional structure analysis, improving learning performance in challenging visual tasks by enforcing feature independence and distribution uniformity.
Contribution
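The paper proposes a new cross-modal knowledge distillation (CMKD) approach based on dimensional structure (DS), which improves cross-modal learning by exploiting channel-wise feature correlation and spatial feature distribution, and it provides a real-world dataset for community use.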
Findings
Improved accuracy in cross-modal tasks with large modality gaps
Effective feature disentanglement through channel-wise independence
Validated on real-world and benchmark datasets
Abstract
Due to limitations in data quality, some essential visual tasks are difficult to perform independently. Introducing previously unavailable information to transfer informative dark knowledge has been a common way to solve such hard tasks. However, research on why transferred knowledge works has not been extensively explored. To address this issue, in this paper, we discover the correlation between feature discriminability and dimensional structure (DS) by analyzing and observing features extracted from simple and hard tasks. On this basis, we express DS using deep channel-wise correlation and intermediate spatial distribution, and propose a novel cross-modal knowledge distillation (CMKD) method for better supervised cross-modal learning (CML) performance. The proposed method enforces output features to be channel-wise independent and intermediate ones to be uniformly distributed, thereby…
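The two constraints named in the abstract (channel-wise independent output features, uniformly distributed intermediate features) can be illustrated with a minimal sketch. This is not the paper's loss: the abstract does not give the exact formulation, so the sketch assumes a squared off-diagonal correlation penalty as a proxy for channel independence and a Gaussian pairwise potential on L2-normalized features as a proxy for uniformity. The function names and the temperature `t` are hypothetical.

```python
import math


def channel_correlation_penalty(features):
    """Mean squared off-diagonal Pearson correlation between channels.

    Assumed proxy for "channel-wise independence" (not the paper's loss).
    features: list of N samples (N >= 2), each a list of C values (C >= 2).
    Returns 0.0 when every pair of channels is uncorrelated.
    """
    n = len(features)
    c = len(features[0])
    cols = []
    for j in range(c):
        col = [row[j] for row in features]
        mean = sum(col) / n
        centered = [v - mean for v in col]
        # Guard against constant channels (zero norm).
        norm = math.sqrt(sum(v * v for v in centered)) or 1.0
        cols.append([v / norm for v in centered])
    total = 0.0
    for a in range(c):
        for b in range(c):
            if a != b:
                corr = sum(x * y for x, y in zip(cols[a], cols[b]))
                total += corr * corr
    return total / (c * (c - 1))


def uniformity_penalty(features, t=2.0):
    """Log of the mean Gaussian potential between L2-normalized samples.

    Assumed proxy for "uniform distribution" (not the paper's loss).
    Lower values mean the samples are spread more evenly on the unit sphere.
    t is a hypothetical temperature chosen for illustration.
    """
    unit = []
    for row in features:
        norm = math.sqrt(sum(v * v for v in row)) or 1.0
        unit.append([v / norm for v in row])
    potentials = []
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            sq_dist = sum((x - y) ** 2 for x, y in zip(unit[i], unit[j]))
            potentials.append(math.exp(-t * sq_dist))
    return math.log(sum(potentials) / len(potentials))
```

In a training setup, terms like these would typically be added to the task loss with small weights; the weighting and the layers they apply to are design choices that the truncated abstract does not specify.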
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Processing Techniques and Applications · Remote-Sensing Image Classification
Methods: Knowledge Distillation
