ClutterDexGrasp: A Sim-to-Real System for General Dexterous Grasping in Cluttered Scenes
Zeyuan Chen, Qiyang Yan, Yuanpei Chen, Tianhao Wu, Jiyao Zhang, Zihan Ding, Jinzhou Li, Yaodong Yang, Hao Dong

TL;DR
This paper introduces ClutterDexGrasp, a sim-to-real system for general dexterous grasping in cluttered scenes. Its policies are trained entirely in simulation and deployed in the real world zero-shot, with no additional training.
Contribution
The paper presents a two-stage teacher-student framework that combines curriculum learning with a 3D diffusion policy to achieve robust zero-shot transfer to complex cluttered environments.
Findings
Achieves zero-shot sim-to-real transfer for dexterous grasping
Demonstrates robust performance across diverse objects and clutter layouts
Introduces a safety curriculum for collision-free grasping
Abstract
Dexterous grasping in cluttered scenes presents significant challenges due to diverse object geometries, occlusions, and potential collisions. Existing methods primarily focus on single-object grasping or grasp-pose prediction without interaction, which are insufficient for complex, cluttered scenes. Recent vision-language-action models offer a potential solution but require extensive real-world demonstrations, making them costly and difficult to scale. To address these limitations, we revisit the sim-to-real transfer pipeline and develop key techniques that enable zero-shot deployment in reality while maintaining robust generalization. We propose ClutterDexGrasp, a two-stage teacher-student framework for closed-loop target-oriented dexterous grasping in cluttered scenes. The framework features a teacher policy trained in simulation using clutter density curriculum learning,…
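The abstract mentions a teacher policy trained with clutter density curriculum learning but is cut off before giving details. As a rough illustration of the general idea only, the sketch below raises the number of objects in a simulated scene once the recent grasp success rate is high enough. The class name, stage sizes, promotion threshold, and window length are all hypothetical and are not taken from the paper.

```python
from collections import deque


class ClutterDensityCurriculum:
    """Hypothetical scheduler: raises the object count once recent success is high."""

    def __init__(self, levels=(1, 4, 8, 12), promote_at=0.8, window=50):
        self.levels = tuple(levels)    # objects per scene at each stage (assumed values)
        self.promote_at = promote_at   # success rate needed to advance (assumed value)
        self.window = window           # number of recent episodes to average over
        self.stage = 0
        self._recent = deque(maxlen=window)

    @property
    def num_objects(self):
        """Object count to spawn in the next training scene."""
        return self.levels[self.stage]

    def record(self, success):
        """Log one episode outcome; advance a stage if the full window is successful enough."""
        self._recent.append(bool(success))
        window_full = len(self._recent) == self.window
        at_last_stage = self.stage == len(self.levels) - 1
        if window_full and not at_last_stage:
            rate = sum(self._recent) / self.window
            if rate >= self.promote_at:
                self.stage += 1
                self._recent.clear()   # restart statistics for the harder stage
        return self.num_objects
```

In a training loop, one would call `record(success)` after each episode and use the returned count when resetting the scene. A success-rate trigger is only one possible design; a fixed step schedule would be an equally plausible reading of "curriculum".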
Taxonomy
Topics: Robot Manipulation and Learning · Motor Control and Adaptation · Human Pose and Action Recognition
