DistillGrasp: Integrating Features Correlation with Knowledge Distillation for Depth Completion of Transparent Objects
Yiheng Huang, Junhong Chen, Nick Michiels, Muhammad Asim, Luc Claesen, and Wenyin Liu

TL;DR
This paper introduces DistillGrasp, a depth completion network that uses knowledge distillation to improve depth estimation of transparent objects, achieving high accuracy and real-time performance.
Contribution
The paper presents a novel teacher-student framework with correlation modules and a specialized distillation loss for transparent object depth completion.
Findings
Teacher network outperforms state-of-the-art methods in accuracy.
Student network achieves 48 FPS with competitive accuracy.
System improves robotic grasping robustness.
Abstract
Due to the visual properties of reflection and refraction, RGB-D cameras cannot accurately capture the depth of transparent objects, leading to incomplete depth maps. To fill in the missing points, recent studies tend to explore new visual features and design complex networks to reconstruct the depth; however, these approaches tremendously increase computation, and the correlation of different visual features remains a problem. To this end, we propose an efficient depth completion network named DistillGrasp, which distills knowledge from the teacher branch to the student branch. Specifically, in the teacher branch, we design a position correlation block (PCB) that leverages RGB images as the query and key to search for the corresponding values, guiding the model to establish correct correspondence between two features and transfer it to the transparent areas. For the student branch,…
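The abstract's description of the position correlation block (RGB features acting as query and key, with another feature map supplying the values) resembles scaled dot-product attention. The following is a minimal sketch of that general idea only, not the authors' implementation: the function names, the use of depth features as values, the absence of learned projections, and the toy inputs are all assumptions made for illustration. It uses only the Python standard library so it runs without extra dependencies.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rgb_guided_attention(rgb_feats, depth_feats):
    """Hypothetical sketch: RGB features serve as both query and key,
    depth features serve as values (an assumption, not the paper's PCB).

    rgb_feats:   list of N vectors, one per spatial position
    depth_feats: list of N vectors, one per spatial position
    Returns N vectors, each a weighted mix of depth features, where the
    weights come from RGB-to-RGB similarity.
    """
    dim = len(rgb_feats[0])
    scale = 1.0 / math.sqrt(dim)
    channels = len(depth_feats[0])
    output = []
    for query in rgb_feats:
        # Similarity between this position's RGB feature and every other one.
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in rgb_feats]
        weights = softmax(scores)
        # Aggregate depth features using the RGB-derived weights.
        mixed = [sum(w * depth_feats[j][c] for j, w in enumerate(weights))
                 for c in range(channels)]
        output.append(mixed)
    return output


# Toy input: positions 0 and 2 look alike in RGB, but position 2 has no
# valid depth (0.0), as might happen on a transparent surface.
rgb = [[4.0, 0.0], [0.0, 4.0], [4.0, 0.0]]
depth = [[1.0], [5.0], [0.0]]
completed = rgb_guided_attention(rgb, depth)
```

In this toy case the position with missing depth receives a value blended from the visually similar position, which is the kind of correspondence transfer the abstract describes. A real network would apply learned projections to the features and operate on dense feature maps.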
Taxonomy
Topics: Image Processing and 3D Reconstruction · Manufacturing Process and Optimization · 3D Surveying and Cultural Heritage
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
