Rethinking Transparent Object Grasping: Depth Completion with Monocular Depth Estimation and Instance Mask
Yaofeng Cheng, Xinkai Gao, Sen Zhang, Chao Zeng, Fusheng Zha, Lining Sun, Chenguang Yang

TL;DR
This paper introduces ReMake, a depth completion framework that uses instance masks and monocular depth estimation to improve the accuracy and generalization of transparent object grasping in robotic systems.
Contribution
The proposed ReMake framework explicitly distinguishes transparent regions to enhance depth completion and generalization, outperforming existing methods in real-world scenarios.
Findings
Outperforms existing approaches on benchmark datasets
Achieves higher accuracy in real-world transparent object grasping
Demonstrates improved generalization to complex lighting conditions
Abstract
Due to their optical properties, transparent objects often cause depth cameras to generate incomplete or invalid depth data, which in turn reduces the accuracy and reliability of robotic grasping. Existing approaches typically feed the RGB-D image directly into the network to output the complete depth, expecting the model to implicitly infer the reliability of depth values. However, while effective on training datasets, such methods often fail to generalize to real-world scenarios, where complex light interactions lead to highly variable distributions of valid and invalid depth data. To address this, we propose ReMake, a novel depth completion framework guided by an instance mask and monocular depth estimation. By explicitly distinguishing transparent regions from non-transparent ones, the mask enables the model to concentrate on learning accurate depth estimation in these areas from…
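As a rough illustration of the idea of guiding depth completion with an instance mask and a monocular depth estimate, the sketch below assembles a per-pixel input from those pieces. It is a minimal sketch under stated assumptions, not ReMake's actual pipeline: the function names, the least-squares scale-and-shift alignment of the monocular depth, and the six-channel layout are all hypothetical choices made for this example.

```python
import numpy as np


def align_mono_to_metric(mono, sensor, mask):
    """Fit s, t so that s * mono + t matches sensor depth on trusted pixels.

    Assumption: trusted pixels are non-transparent (mask False) with a
    positive sensor reading. This alignment step is illustrative only.
    """
    trusted = (~mask) & (sensor > 0)
    # Design matrix [mono, 1] for a linear least-squares fit.
    A = np.stack([mono[trusted], np.ones(trusted.sum())], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, sensor[trusted], rcond=None)
    return s * mono + t


def build_input(rgb, sensor, mono, mask):
    """Stack RGB, masked sensor depth, aligned mono depth, and the mask.

    rgb: (H, W, 3); sensor, mono: (H, W); mask: (H, W) bool, True where
    the instance mask marks a transparent object.
    """
    # Discard sensor depth inside transparent regions instead of asking
    # a model to infer implicitly which readings are reliable.
    masked_depth = np.where(mask, 0.0, sensor)
    aligned = align_mono_to_metric(mono, sensor, mask)
    channels = [
        rgb,
        masked_depth[..., None],
        aligned[..., None],
        mask[..., None].astype(np.float64),
    ]
    return np.concatenate(channels, axis=-1)  # (H, W, 6)
```

In this sketch the mask makes the transparent/non-transparent split explicit, and the aligned monocular estimate supplies a depth prior inside the masked regions where the sensor reading was discarded.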
