MVTrans: Multi-View Perception of Transparent Objects
Yi Ru Wang, Yuchi Zhao, Haoping Xu, Sagi Eppel, Alan Aspuru-Guzik, Florian Shkurti, Animesh Garg

TL;DR
MVTrans introduces a multi-view perception architecture that effectively handles transparent object detection, segmentation, and pose estimation without relying on unreliable depth maps, supported by a new large-scale dataset.
Contribution
The paper presents a novel end-to-end multi-view method for transparent object perception and a new synthetic dataset for training and evaluation.
Findings
Effective multi-view perception of transparent objects.
Eliminates dependence on unreliable depth maps.
Provides a large-scale synthetic dataset for training.
Abstract
Transparent object perception is a crucial skill for applications such as robot manipulation in household and laboratory settings. Existing methods utilize RGB-D or stereo inputs to handle a subset of perception tasks, including depth and pose estimation. However, transparent object perception remains an open problem. In this paper, we forgo the unreliable depth map from RGB-D sensors and extend the stereo-based method. Our proposed method, MVTrans, is an end-to-end multi-view architecture with multiple perception capabilities, including depth estimation, segmentation, and pose estimation. Additionally, we establish a novel procedural photo-realistic dataset generation pipeline and create a large-scale transparent object detection dataset, Syn-TODD, which is suitable for training networks with all three modalities: RGB-D, stereo, and multi-view RGB. Project Site:…
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Neural Network Applications · Visual Attention and Saliency Detection
