3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for Robust 6D Pose Estimation
Guangyao Zhou, Nishad Gothoskar, Lirui Wang, Joshua B. Tenenbaum, Dan Gutfreund, Miguel Lázaro-Gredilla, Dileep George, Vikash K. Mansinghka

TL;DR
This paper introduces 3DNEL, a probabilistic inverse graphics model that improves 6D pose estimation robustness from RGB-D images by modeling uncertainty and handling occlusions effectively.
Contribution
The paper presents 3DNEL, a novel probabilistic framework that unifies neural embeddings and depth data for robust 6D pose estimation and scene understanding.
Findings
Performs on par with the state of the art on the YCB-Video dataset.
Demonstrates robustness under occlusion and challenging conditions.
Provides a principled way to incorporate prior scene knowledge.
Abstract
The ability to perceive and understand 3D scenes is crucial for many applications in computer vision and robotics. Inverse graphics is an appealing approach to 3D scene understanding that aims to infer the 3D scene structure from 2D images. In this paper, we introduce probabilistic modeling to the inverse graphics framework to quantify uncertainty and achieve robustness in 6D pose estimation tasks. Specifically, we propose 3D Neural Embedding Likelihood (3DNEL) as a unified probabilistic model over RGB-D images, and develop efficient inference procedures on 3D scene descriptions. 3DNEL effectively combines learned neural embeddings from RGB with depth information to improve robustness in sim-to-real 6D object pose estimation from RGB-D images. On the YCB-Video dataset, 3DNEL performs on par with the state of the art while being much more robust in challenging regimes. In contrast to…
Taxonomy
Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Robotics and Sensor-Based Localization
