Revisiting Cross-Modal Knowledge Distillation: A Disentanglement Approach for RGBD Semantic Segmentation

Roger Ferrod; Cássio F. Dantas; Luigi Di Caro; Dino Ienco

arXiv:2505.24361 · cs.CV · June 2, 2025

TL;DR

This paper introduces CroDiNo-KD, a novel framework for RGBD semantic segmentation that uses disentanglement and contrastive learning to improve single-modality models, addressing limitations of traditional cross-modal knowledge distillation.

Contribution

The paper proposes CroDiNo-KD, a new approach that learns single-modality models from multi-modal data using disentanglement, contrastive learning, and decoupled data augmentation.

Findings

01. CroDiNo-KD outperforms recent CMKD frameworks on three RGBD datasets.

02. It demonstrates the effectiveness of disentanglement in cross-modal knowledge transfer.

03. The approach suggests a new perspective on distilling multi-modal information into single-modality models.

Abstract

Multi-modal RGB and Depth (RGBD) data are predominant in many domains such as robotics, autonomous driving and remote sensing. The combination of these multi-modal data enhances environmental perception by providing 3D spatial context, which is absent in standard RGB images. Although RGBD multi-modal data can be available to train computer vision models, accessing all sensor modalities during the inference stage may be infeasible due to sensor failures or resource constraints, leading to a mismatch between data modalities available during training and inference. Traditional Cross-Modal Knowledge Distillation (CMKD) frameworks, developed to address this task, are typically based on a teacher/student paradigm, where a multi-modal teacher distills knowledge into a single-modality student model. However, these approaches face challenges in teacher architecture choices and distillation…
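
For readers unfamiliar with the teacher/student paradigm mentioned above, the following is a minimal sketch of a generic soft-label distillation loss (temperature-scaled KL divergence between teacher and student class distributions, averaged over pixels). It illustrates the conventional CMKD setup that the abstract contrasts against, not CroDiNo-KD itself. The function names (`softmax`, `kd_loss`, `pixelwise_kd_loss`), the temperature value, and the list-of-logits data layout are all assumptions made for illustration and are not taken from the paper or its repository.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of class logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions for one pixel.

    The T^2 factor is the usual rescaling that keeps gradient magnitudes
    comparable across temperatures (hypothetical default T = 2).
    """
    p = softmax(teacher_logits, temperature)  # multi-modal teacher
    q = softmax(student_logits, temperature)  # single-modality student
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def pixelwise_kd_loss(teacher_map, student_map, temperature=2.0):
    """Average the per-pixel distillation loss over a segmentation map.

    Each map is a flat list of per-pixel logit lists (assumed layout).
    """
    losses = [
        kd_loss(t, s, temperature) for t, s in zip(teacher_map, student_map)
    ]
    return sum(losses) / len(losses)


# Toy example: two pixels, three classes.
teacher = [[2.0, 0.5, -1.0], [0.1, 3.0, 0.2]]
student = [[1.5, 0.7, -0.5], [0.0, 2.0, 0.5]]
loss = pixelwise_kd_loss(teacher, student)
```

In this kind of setup the student sees only one modality (for example RGB) at inference time, while the teacher's logits come from a model trained on both RGB and depth.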

Code & Models

Repositories

rogerferrod/crodino-kd (PyTorch, official)

Taxonomy

Methods: Contrastive Learning · Knowledge Distillation