Variational Shape Inference for Grasp Diffusion on SE(3)
S. Talha Bukhari, Kaivalya Agrawal, Zachary Kingston, Aniket Bera

TL;DR
This paper combines variational shape inference with a diffusion model on SE(3) to improve multimodal grasp synthesis in robotics, and reports gains in performance and robustness in both simulation and real-world tests.
Contribution
The paper proposes an approach that uses a variational autoencoder for shape inference and a diffusion model for robust, multimodal grasp synthesis conditioned on object shape.
Findings
Outperforms state-of-the-art methods by 6.3% on the ACRONYM dataset.
Remains robust as point cloud density deteriorates.
Achieves 34% more successful grasps in real-world tests.
Abstract
Grasp synthesis is a fundamental task in robotic manipulation which usually has multiple feasible solutions. Multimodal grasp synthesis seeks to generate diverse sets of stable grasps conditioned on object geometry, making the robust learning of geometric features crucial for success. To address this challenge, we propose a framework for learning multimodal grasp distributions that leverages variational shape inference to enhance robustness against shape noise and measurement sparsity. Our approach first trains a variational autoencoder for shape inference using implicit neural representations, and then uses these learned geometric features to guide a diffusion model for grasp synthesis on the SE(3) manifold. Additionally, we introduce a test-time grasp optimization technique that can be integrated as a plugin to further enhance grasping performance. Experimental results demonstrate…
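To make the idea of diffusing grasps on the SE(3) manifold more concrete, here is a minimal sketch of how a single grasp pose might be perturbed with noise while staying a valid rigid transform. This is an illustration under stated assumptions, not the authors' implementation: the function names (`hat`, `so3_exp`, `perturb_pose`), the single shared noise scale `sigma`, and the choice to perturb rotation through the SO(3) exponential map while adding Gaussian noise to translation are all hypothetical and are not taken from the paper.

```python
import numpy as np

def hat(w):
    """Map a 3-vector to its skew-symmetric matrix in so(3)."""
    wx, wy, wz = w
    return np.array([[0.0, -wz, wy],
                     [wz, 0.0, -wx],
                     [-wy, wx, 0.0]])

def so3_exp(w):
    """Rodrigues' formula: exponential map from so(3) to SO(3)."""
    theta = np.linalg.norm(w)
    K = hat(w)
    if theta < 1e-8:
        # First-order approximation near the identity
        return np.eye(3) + K
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def perturb_pose(H, sigma, rng):
    """Return a noisy copy of a 4x4 homogeneous grasp pose H.

    Assumption: rotation and translation are perturbed separately,
    both with the same (hypothetical) noise scale sigma.
    """
    eps_rot = sigma * rng.standard_normal(3)    # tangent-space rotation noise
    eps_trans = sigma * rng.standard_normal(3)  # Euclidean translation noise
    H_noisy = H.copy()
    H_noisy[:3, :3] = so3_exp(eps_rot) @ H[:3, :3]
    H_noisy[:3, 3] = H[:3, 3] + eps_trans
    return H_noisy

rng = np.random.default_rng(0)
H_noisy = perturb_pose(np.eye(4), sigma=0.1, rng=rng)
```

Sampling the rotation noise in the tangent space and mapping it through the exponential keeps the rotation block orthogonal with determinant 1, which adding Gaussian noise directly to the matrix entries would not. A learned denoiser conditioned on shape features would then be trained to undo such perturbations; that part is omitted here.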
