DiffuDepGrasp: Diffusion-based Depth Noise Modeling Empowers Sim2Real Robotic Grasping
Yingting Zhou, Wenbo Cui, Weiheng Liu, Guixing Chen, Haoran Li, and Dongbin Zhao

TL;DR
DiffuDepGrasp introduces a diffusion-based depth noise modeling framework that significantly improves zero-shot sim2real robotic grasping by synthesizing realistic sensor artifacts in simulation, leading to high success rates without additional deployment overhead.
Contribution
The paper presents a novel diffusion-based depth noise generator that enhances simulation realism and enables efficient zero-shot transfer for robotic grasping tasks.
Findings
Achieves 95.7% success rate on 12-object grasping
Enables zero-shot transfer with strong generalization
Reduces computational overhead during deployment
Abstract
Transferring a depth-based end-to-end policy trained in simulation to physical robots can yield an efficient and robust grasping policy, yet sensor artifacts in real depth maps, such as voids and noise, create a significant sim2real gap that critically impedes policy transfer. Training-time strategies such as procedural noise injection or learned mappings suffer from data inefficiency, caused either by unrealistic noise simulation, which is often ineffective for grasping tasks that require fine manipulation, or by heavy dependence on paired datasets. Furthermore, leveraging foundation models to reduce the sim2real gap via intermediate representations fails to fully mitigate the domain shift and adds computational overhead during deployment. This work confronts the dual challenges of data inefficiency and deployment complexity. We propose DiffuDepGrasp, a deploy-efficient sim2real framework enabling…
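For context, the procedural noise injection that the abstract names as an existing training-time strategy can be sketched roughly as below. This is a minimal illustration of that baseline idea, not the paper's diffusion-based generator. The function name, the parameter values (`sigma`, `hole_prob`), the choice of independent per-pixel Gaussian noise, and the convention of marking voids as zero are all assumptions made for the example.

```python
import numpy as np

def inject_procedural_depth_noise(depth, sigma=0.005, hole_prob=0.02, rng=None):
    """Hypothetical baseline: additive Gaussian noise plus random voids.

    depth     : 2D array of simulated depth values (assumed to be in meters)
    sigma     : standard deviation of the additive noise (assumed value)
    hole_prob : per-pixel probability of a missing reading (assumed value)
    """
    rng = np.random.default_rng() if rng is None else rng
    # Additive noise drawn independently for every pixel.
    noisy = depth + rng.normal(0.0, sigma, size=depth.shape)
    # Mimic voids by zeroing a random subset of pixels (assumed convention).
    holes = rng.random(depth.shape) < hole_prob
    noisy[holes] = 0.0
    return noisy, holes

# Usage: corrupt a flat synthetic depth map 0.8 units from the camera.
clean = np.full((64, 64), 0.8)
noisy, holes = inject_procedural_depth_noise(clean, rng=np.random.default_rng(0))
```

Independent per-pixel noise of this kind does not reproduce the spatially structured artifacts of a real sensor, which is the kind of unrealistic simulation the abstract describes as a limitation of such strategies.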
Taxonomy
Topics: Robot Manipulation and Learning · Generative Adversarial Networks and Image Synthesis · Reinforcement Learning in Robotics
