Cross Modal Distillation for Supervision Transfer
Saurabh Gupta, Judy Hoffman, Jitendra Malik

TL;DR
This paper introduces a method for transferring supervision from a labeled image modality to a paired, unlabeled modality, so that rich representations can be learned for the new modality without labels of its own.
Contribution
It presents a cross-modal distillation technique that uses representations learned on a large labeled modality as the supervisory signal for training a model on a paired, unlabeled modality.
Findings
Effective transfer of supervision from labeled RGB images to unlabeled depth and optical flow images.
Large improvements in representation quality for both of these unlabeled modalities.
Applicable as a pre-training step for new modalities with limited labels.
Abstract
In this work we propose a technique that transfers supervision between images from different modalities. We use learned representations from a large labeled modality as a supervisory signal for training representations for a new unlabeled paired modality. Our method enables learning of rich representations for unlabeled modalities and can be used as a pre-training procedure for new modalities with limited labeled data. We show experimental results where we transfer supervision from labeled RGB images to unlabeled depth and optical flow images and demonstrate large improvements for both these cross modal supervision transfers. Code, data and pre-trained models are available at https://github.com/s-gupta/fast-rcnn/tree/distillation
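The core idea in the abstract, using features from a model trained on the labeled modality as regression targets for a model on the paired unlabeled modality, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes an L2 feature-matching loss, a linear student, and a fixed linear map (`TEACHER_W`) standing in for a network pre-trained on labeled RGB images. All names, dimensions, and the synthetic RGB/depth pairing are hypothetical.

```python
import random

# Hypothetical stand-in for a network pre-trained on labeled RGB images.
# It is frozen: distillation never updates it.
TEACHER_W = [[0.5, -1.0, 0.25], [1.0, 0.5, -0.5]]


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def teacher_features(rgb):
    """Features of the frozen teacher on an RGB input."""
    return matvec(TEACHER_W, rgb)


def make_pairs(n, seed=0):
    """Synthetic paired data (assumption): depth is a fixed function of RGB."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        rgb = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        depth = [rgb[0] + rgb[1], rgb[1] - rgb[2], rgb[2]]
        pairs.append((rgb, depth))
    return pairs


def distillation_loss(W, pairs):
    """Mean squared distance between student(depth) and teacher(rgb)."""
    total = 0.0
    for rgb, depth in pairs:
        target = teacher_features(rgb)
        pred = matvec(W, depth)
        total += sum((p - t) ** 2 for p, t in zip(pred, target))
    return total / len(pairs)


def train_student(W, pairs, lr=0.05, epochs=200):
    """Per-sample gradient descent on the feature-matching loss.

    No labels are used: the only supervision is the teacher's features.
    """
    W = [row[:] for row in W]  # leave the caller's weights untouched
    for _ in range(epochs):
        for rgb, depth in pairs:
            target = teacher_features(rgb)
            pred = matvec(W, depth)
            for i, (p, t) in enumerate(zip(pred, target)):
                grad_scale = 2.0 * (p - t)  # d/dp of (p - t)^2
                for j, d in enumerate(depth):
                    W[i][j] -= lr * grad_scale * d
    return W


pairs = make_pairs(50)
rng = random.Random(1)
student_init = [[rng.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(2)]
loss_before = distillation_loss(student_init, pairs)
student = train_student(student_init, pairs)
loss_after = distillation_loss(student, pairs)
print(loss_before, loss_after)
```

In this sketch the trained student weights would then serve as an initialization for fine-tuning on whatever limited labeled data exists for the new modality, which is the pre-training use the abstract describes.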
Taxonomy
Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Optical measurement and interference techniques
