DistillGrasp: Integrating Features Correlation with Knowledge Distillation for Depth Completion of Transparent Objects

Yiheng Huang, Junhong Chen, Nick Michiels, Muhammad Asim, Luc Claesen, and Wenyin Liu

arXiv:2408.00337·cs.CV·August 2, 2024

Open Access

TL;DR

This paper introduces DistillGrasp, a depth completion network that uses knowledge distillation to improve depth estimation of transparent objects, achieving high accuracy and real-time performance.

Contribution

The paper presents a novel teacher-student framework with correlation modules and a specialized distillation loss for transparent object depth completion.

Findings

01 Teacher network outperforms state-of-the-art methods in accuracy.
02 Student network achieves 48 FPS with competitive accuracy.
03 System improves robotic grasping robustness.

Abstract

Due to the visual properties of reflection and refraction, RGB-D cameras cannot accurately capture the depth of transparent objects, leading to incomplete depth maps. To fill in the missing points, recent studies tend to explore new visual features and design complex networks to reconstruct the depth. However, these approaches tremendously increase computation, and the correlation of different visual features remains a problem. To this end, we propose an efficient depth completion network named DistillGrasp, which distills knowledge from the teacher branch to the student branch. Specifically, in the teacher branch, we design a position correlation block (PCB) that leverages RGB images as the query and key to search for the corresponding values, guiding the model to establish correct correspondence between two features and transfer it to the transparent areas. For the student branch,…
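The query/key/value wording used for the PCB matches the general pattern of cross-attention, so a small sketch may help make the idea concrete. The code below is an illustrative assumption, not the paper's implementation: it uses plain scaled dot-product attention with no learned projections, and the names `correlate`, `rgb_feat`, and `depth_feat` are hypothetical. RGB features at each position act as both query and key, and the resulting weights mix the depth features (the values), so a position with missing depth borrows from positions that look similar in RGB.

```python
import math

def softmax(row):
    # Numerically stable softmax over one list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def correlate(rgb_feat, depth_feat):
    """Generic cross-attention sketch (assumed, not the paper's PCB).

    rgb_feat:   list of N feature vectors, used as queries and keys.
    depth_feat: list of N feature vectors, used as values.
    Returns (attended depth features, attention weights).
    """
    dim = len(rgb_feat[0])
    channels = len(depth_feat[0])
    outputs, weights = [], []
    for q in rgb_feat:
        # Similarity of this position's RGB feature to every position's RGB feature.
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in rgb_feat]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of depth features using the RGB-derived weights.
        outputs.append([sum(wj * v[c] for wj, v in zip(w, depth_feat))
                        for c in range(channels)])
    return outputs, weights

# Toy data: three positions; position 1 has a missing depth reading (0.0)
# but looks similar in RGB to position 0.
rgb = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
depth = [[0.5], [0.0], [2.0]]
out, attn = correlate(rgb, depth)
```

In this toy setup, position 1 attends more strongly to position 0 than to position 2 because their RGB features are closer, which is the kind of correspondence the abstract describes being transferred to transparent areas.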

Taxonomy

Topics: Image Processing and 3D Reconstruction · Manufacturing Process and Optimization · 3D Surveying and Cultural Heritage

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings