Promoting CNNs with Cross-Architecture Knowledge Distillation for Efficient Monocular Depth Estimation
Zhimeng Zheng, Tao Huang, Gongsheng Li, Zuyi Wang

TL;DR
This paper introduces DisDepth, a cross-architecture knowledge distillation method that trains lightweight CNN models for monocular depth estimation under the supervision of transformer teachers, improving their accuracy on KITTI and NYU Depth V2.
Contribution
The paper proposes a new distillation framework that effectively transfers knowledge from transformer models to CNNs for efficient depth estimation, including a ghost decoder and attentive loss.
Findings
Significant performance improvements on KITTI and NYU Depth V2 datasets.
Enhanced CNN models with global and local feature integration.
Effective knowledge transfer from transformers to lightweight CNNs.
Abstract
Recently, the performance of monocular depth estimation (MDE) has been significantly boosted with the integration of transformer models. However, transformer models are usually computationally expensive, and their effectiveness in lightweight models is limited compared to convolutions. This limitation hinders their deployment on resource-limited devices. In this paper, we propose a cross-architecture knowledge distillation method for MDE, dubbed DisDepth, to enhance efficient CNN models with the supervision of state-of-the-art transformer models. Concretely, we first build a simple framework of convolution-based MDE, which is then enhanced with a novel local-global convolution module to capture both local and global information in the image. To effectively distill valuable information from the transformer teacher and bridge the gap between convolution and transformer features, we…
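As a rough illustration of the distillation idea in the abstract, the Python sketch below combines a supervised depth loss with a feature-matching term that pulls student features toward teacher features. It is a generic feature-distillation sketch, not DisDepth's actual loss: the function names, the optional per-element `weights`, and the balancing factor `lam` are all hypothetical, and the paper's ghost decoder and attentive loss are not reproduced here.

```python
def feature_distillation_loss(student, teacher, weights=None):
    """Weighted mean squared error between student and teacher features.

    student, teacher: equal-length lists of floats (flattened feature maps).
    weights: optional non-negative per-element weights (hypothetical;
             uniform weighting is used when omitted).
    """
    if len(student) != len(teacher):
        raise ValueError("student and teacher features must have the same length")
    if weights is None:
        weights = [1.0] * len(student)
    if len(weights) != len(student):
        raise ValueError("weights must match the feature length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Squared difference per element, scaled by its weight.
    weighted_sq = sum(w * (s - t) ** 2 for w, s, t in zip(weights, student, teacher))
    return weighted_sq / total_weight


def total_loss(task_loss, distill_loss, lam=1.0):
    """Supervised depth loss plus the distillation term scaled by lam (assumed)."""
    return task_loss + lam * distill_loss


if __name__ == "__main__":
    student_feats = [1.0, 2.0]
    teacher_feats = [1.0, 4.0]
    kd = feature_distillation_loss(student_feats, teacher_feats)
    print(total_loss(task_loss=0.5, distill_loss=kd, lam=0.1))
```

In practice the student and teacher features usually differ in shape, so a projection layer on the student side is a common way to align them before computing a term like this; that step is omitted here for brevity.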
Taxonomy
Topics: Industrial Vision Systems and Defect Detection · Optical measurement and interference techniques · Image Processing Techniques and Applications
Methods: Knowledge Distillation · Focus · Convolution
