Learning to Detect Multi-Modal Grasps for Dexterous Grasping in Dense Clutter
Matt Corsaro, Stefanie Tellex, George Konidaris

TL;DR
This paper introduces a multi-modal grasp detection method that predicts success probabilities for various grasp types from partial point clouds, improving object retrieval in cluttered environments.
Contribution
The approach jointly predicts multiple grasp success probabilities from partial point clouds, enabling more versatile and effective grasping in cluttered scenes.
Findings
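Object retrieval rate in a complex cluttered environment is 8.5% higher than that of the highest-performing baseline.
The system is agnostic to the number and placement of depth sensors at execution time.
The system outperforms baselines that perform fewer types of grasps.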
Abstract
We propose an approach to multi-modal grasp detection that jointly predicts the probabilities that several types of grasps succeed at a given grasp pose. Given a partial point cloud of a scene, the algorithm proposes a set of feasible grasp candidates, then estimates the probabilities that a grasp of each type would succeed at each candidate pose. Predicting grasp success probabilities directly from point clouds makes our approach agnostic to the number and placement of depth sensors at execution time. We evaluate our system both in simulation and on a real robot with a Robotiq 3-Finger Adaptive Gripper and compare our network against several baselines that perform fewer types of grasps. Our experiments show that a system that explicitly models grasp type achieves an object retrieval rate 8.5% higher in a complex cluttered environment than our highest-performing baseline.
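The two-stage pipeline described in the abstract (propose candidate poses, then score each grasp type at each pose) yields a table of success probabilities with one row per candidate and one column per grasp type. A minimal sketch of how such a table could be consumed is shown here. It assumes the executed grasp is simply the (pose, type) pair with the highest predicted probability, which the abstract does not state. The function name `select_grasp` and the grasp type labels are hypothetical, and the probabilities are made-up inputs rather than outputs of the authors' network.

```python
from typing import Sequence, Tuple


def select_grasp(
    success_probs: Sequence[Sequence[float]],
    grasp_types: Sequence[str],
) -> Tuple[int, str, float]:
    """Pick the (candidate index, grasp type, probability) with the highest score.

    success_probs[i][j] is the predicted probability that a grasp of type
    grasp_types[j] succeeds at candidate pose i. Selecting the argmax is an
    assumption made for illustration, not the paper's confirmed policy.
    """
    best = None
    for pose_idx, row in enumerate(success_probs):
        if len(row) != len(grasp_types):
            raise ValueError("each row needs one probability per grasp type")
        for type_idx, prob in enumerate(row):
            # Keep the first strictly better entry; ties keep the earlier one.
            if best is None or prob > best[2]:
                best = (pose_idx, grasp_types[type_idx], prob)
    if best is None:
        raise ValueError("no grasp candidates were provided")
    return best


# Hypothetical labels and scores for two candidate poses and three grasp types.
types = ["type_a", "type_b", "type_c"]
probs = [
    [0.2, 0.7, 0.1],
    [0.9, 0.3, 0.4],
]
pose_idx, grasp_type, prob = select_grasp(probs, types)
print(pose_idx, grasp_type, prob)
```

Scoring every type at every pose in one table is what lets a single selection step trade off pose against grasp type, rather than committing to one type in advance as a baseline that performs fewer grasp types would.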
Taxonomy
Topics: Robot Manipulation and Learning · Hand Gesture Recognition Systems · Human Pose and Action Recognition
