MVTrans: Multi-View Perception of Transparent Objects

Yi Ru Wang; Yuchi Zhao; Haoping Xu; Saggi Eppel; Alan Aspuru-Guzik; Florian Shkurti; Animesh Garg

arXiv:2302.11683·cs.RO·February 24, 2023·6 cites

TL;DR

MVTrans is an end-to-end multi-view architecture that performs depth estimation, segmentation, and pose estimation for transparent objects without relying on the unreliable depth maps from RGB-D sensors. It is supported by a new large-scale synthetic dataset, Syn-TODD.

Contribution

The paper presents a novel end-to-end multi-view method for transparent object perception and a new synthetic dataset for training and evaluation.

Findings

01

Effective multi-view perception of transparent objects.

02

Eliminates dependence on unreliable depth maps.

03

Provides a large-scale synthetic dataset for training.

Abstract

Transparent object perception is a crucial skill for applications such as robot manipulation in household and laboratory settings. Existing methods utilize RGB-D or stereo inputs to handle a subset of perception tasks, including depth and pose estimation. However, transparent object perception remains an open problem. In this paper, we forgo the unreliable depth map from RGB-D sensors and extend the stereo-based method. Our proposed method, MVTrans, is an end-to-end multi-view architecture with multiple perception capabilities, including depth estimation, segmentation, and pose estimation. Additionally, we establish a novel procedural photo-realistic dataset generation pipeline and create a large-scale transparent object detection dataset, Syn-TODD, which is suitable for training networks with all three modalities: RGB-D, stereo, and multi-view RGB. Project Site:…

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Neural Network Applications · Visual Attention and Saliency Detection