3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for Robust 6D Pose Estimation

Guangyao Zhou; Nishad Gothoskar; Lirui Wang; Joshua B. Tenenbaum; Dan Gutfreund; Miguel Lázaro-Gredilla; Dileep George; Vikash K. Mansinghka

arXiv:2302.03744·cs.CV·September 8, 2023

TL;DR

This paper introduces 3DNEL, a probabilistic inverse graphics model over RGB-D images that makes 6D pose estimation more robust by modeling uncertainty and handling occlusion.

Contribution

The paper presents 3DNEL, a novel probabilistic framework that unifies neural embeddings and depth data for robust 6D pose estimation and scene understanding.

Findings

1. Performs on par with the state of the art on the YCB-Video dataset.
2. Remains robust under occlusion and other challenging conditions.
3. Provides a principled way to incorporate prior scene knowledge.

Abstract

The ability to perceive and understand 3D scenes is crucial for many applications in computer vision and robotics. Inverse graphics is an appealing approach to 3D scene understanding that aims to infer the 3D scene structure from 2D images. In this paper, we introduce probabilistic modeling to the inverse graphics framework to quantify uncertainty and achieve robustness in 6D pose estimation tasks. Specifically, we propose 3D Neural Embedding Likelihood (3DNEL) as a unified probabilistic model over RGB-D images, and develop efficient inference procedures on 3D scene descriptions. 3DNEL effectively combines learned neural embeddings from RGB with depth information to improve robustness in sim-to-real 6D object pose estimation from RGB-D images. Performance on the YCB-Video dataset is on par with state-of-the-art yet is much more robust in challenging regimes. In contrast to…

Code & Models

Repositories

deepmind/threednel (JAX, official)

Videos

3D Neural Embedding Likelihood: Probabilistic Inverse Graphics for Robust 6D Pose Estimation (YouTube)

Taxonomy

Topics: Human Pose and Action Recognition · Advanced Vision and Imaging · Robotics and Sensor-Based Localization