Multi-modal Transfer Learning for Grasping Transparent and Specular Objects
Thomas Weng, Amith Pallankize, Yimin Tang, Oliver Kroemer, David Held

TL;DR
This paper presents a transfer learning method that enhances robotic grasping of transparent and reflective objects by leveraging multi-modal perception data, overcoming limitations of depth sensors without requiring additional grasp success labels.
Contribution
The authors introduce a transfer learning approach that uses paired multi-modal image data and an existing uni-modal grasping model to improve grasping of transparent and specular objects, without ground-truth grasp success labels or real grasp attempts.
Findings
Reliably grasps transparent and reflective objects
Effective transfer from uni-modal to multi-modal perception
No need for ground-truth grasp success labels
Abstract
State-of-the-art object grasping methods rely on depth sensing to plan robust grasps, but commercially available depth sensors fail to detect transparent and specular objects. To improve grasping performance on such objects, we introduce a method for learning a multi-modal perception model by bootstrapping from an existing uni-modal model. This transfer learning approach requires only a pre-existing uni-modal grasping model and paired multi-modal image data for training, forgoing the need for ground-truth grasp success labels or real grasp attempts. Our experiments demonstrate that our approach is able to reliably grasp transparent and reflective objects. Video and supplementary material are available at https://sites.google.com/view/transparent-specular-grasping.
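The bootstrapping idea in the abstract can be illustrated with a small toy sketch. This is not the authors' implementation: it assumes a frozen "teacher" that scores grasps from depth features, and a "student" that sees only the paired features from another modality and is trained to match the teacher's scores. All names (`teacher_scores`, `train_student`, the synthetic features) are hypothetical, and simple logistic models stand in for whatever networks the paper actually uses.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def teacher_scores(depth_feats, w_teacher):
    """Frozen uni-modal (depth) grasp scorer. Hypothetical stand-in model."""
    return sigmoid(depth_feats @ w_teacher)

def train_student(rgb_feats, pseudo_labels, steps=500):
    """Fit a logistic student on the paired modality to match teacher scores.

    No grasp success labels are used: the only supervision is the
    teacher's output on the paired depth data.
    """
    n, d = rgb_feats.shape
    # Step size from a curvature bound on the logistic loss (guarantees descent).
    lipschitz = 0.25 * np.linalg.eigvalsh(rgb_feats.T @ rgb_feats / n).max()
    lr = 1.0 / lipschitz
    w = np.zeros(d)
    eps = 1e-9
    losses = []
    for _ in range(steps):
        pred = sigmoid(rgb_feats @ w)
        # Cross-entropy against the teacher's soft scores.
        loss = -np.mean(pseudo_labels * np.log(pred + eps)
                        + (1.0 - pseudo_labels) * np.log(1.0 - pred + eps))
        losses.append(loss)
        grad = rgb_feats.T @ (pred - pseudo_labels) / n
        w -= lr * grad
    return w, losses

# Synthetic paired data (assumption): both modalities observe the same scenes.
rng = np.random.default_rng(0)
scene = rng.normal(size=(512, 4))
depth_feats = scene @ rng.normal(size=(4, 6))
rgb_feats = scene @ rng.normal(size=(4, 8))

w_teacher = rng.normal(size=6)                      # pretend pre-trained weights
pseudo = teacher_scores(depth_feats, w_teacher)     # pseudo-labels from the teacher
w_student, losses = train_student(rgb_feats, pseudo)
student_pred = sigmoid(rgb_feats @ w_student)
```

In this toy setup the student never sees depth at training time beyond the teacher's scores, which mirrors the claim that only paired data and an existing model are required. How the paper combines modalities at test time is not specified in the abstract, so that step is left out here.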
