3DP3: 3D Scene Perception via Probabilistic Programming
Nishad Gothoskar, Marco Cusumano-Towner, Ben Zinberg, Matin Ghavamizadeh, Falk Pollok, Austin Garrett, Joshua B. Tenenbaum, Dan Gutfreund, Vikash K. Mansinghka

TL;DR
3DP3 is a probabilistic programming framework that infers 3D scene structure, including object poses and the contacts between objects, from RGB-D images. It estimates 6DoF object poses more accurately than deep learning baselines and generalizes better to challenging scenes.
Contribution
The paper introduces 3DP3, an inverse graphics framework that performs inference in a structured generative model combining voxel models of object shape, hierarchical scene graphs, and depth image likelihoods.
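To make the scene graph idea concrete, here is a minimal sketch of how a hierarchical scene graph can store each object's pose relative to its parent and recover world poses by composition. It is an illustration only, not the paper's implementation: the `Node` class, the `world_pose` function, and the restriction to translation plus rotation about the vertical axis are all simplifying assumptions (3DP3 works with full 6DoF poses).

```python
import math
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Node:
    """Hypothetical scene graph node; the pose is relative to the parent."""
    name: str
    parent: Optional[str]  # None means the pose is given in the world frame
    x: float
    y: float
    z: float
    yaw: float  # rotation about the vertical axis, in radians


def world_pose(graph: Dict[str, Node], name: str) -> Tuple[float, float, float, float]:
    """Compose relative poses from the root down to `name`."""
    node = graph[name]
    if node.parent is None:
        return (node.x, node.y, node.z, node.yaw)
    px, py, pz, pyaw = world_pose(graph, node.parent)
    c, s = math.cos(pyaw), math.sin(pyaw)
    # Rotate the child's offset into the world frame, then translate.
    return (px + c * node.x - s * node.y,
            py + s * node.x + c * node.y,
            pz + node.z,
            pyaw + node.yaw)


# A mug resting on a table: moving the table moves the mug with it.
graph = {
    "table": Node("table", None, 1.0, 0.0, 0.0, math.pi / 2),
    "mug": Node("mug", "table", 0.5, 0.0, 0.8, 0.0),
}
mug_x, mug_y, mug_z, mug_yaw = world_pose(graph, "mug")
```

Storing the mug's pose relative to the table is one way to get a joint parametrization of poses: a contact relationship is expressed once on the edge, rather than being re-derived from two independent world poses.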
Findings
More accurate 6DoF object pose estimation than deep learning baselines
Better generalization to challenging scenes with novel viewpoints and occlusion
Effective inference of scene structure including object contacts and poses
Abstract
We present 3DP3, a framework for inverse graphics that uses inference in a structured generative model of objects, scenes, and images. 3DP3 uses (i) voxel models to represent the 3D shape of objects, (ii) hierarchical scene graphs to decompose scenes into objects and the contacts between them, and (iii) depth image likelihoods based on real-time graphics. Given an observed RGB-D image, 3DP3's inference algorithm infers the underlying latent 3D scene, including the object poses and a parsimonious joint parametrization of these poses, using fast bottom-up pose proposals, novel involutive MCMC updates of the scene graph structure, and, optionally, neural object detectors and pose estimators. We show that 3DP3 enables scene understanding that is aware of 3D shape, occlusion, and contact structure. Our results demonstrate that 3DP3 is more accurate at 6DoF object pose estimation from real…
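The inference loop described in the abstract can be illustrated with a toy example: render a depth image from a hypothesized pose, score it against the observed depth image, and accept or reject the hypothesis. The sketch below is a deliberately simplified stand-in, not the paper's algorithm. It uses a one-dimensional "image", a single integer position instead of a 6DoF pose, a per-pixel Gaussian likelihood, and plain random-walk Metropolis-Hastings rather than the involutive MCMC updates and bottom-up proposals that 3DP3 uses. All names and constants (`render`, `log_likelihood`, `infer_position`, `SIGMA`, and so on) are hypothetical.

```python
import math
import random

# Toy scene constants (assumptions for illustration only).
WIDTH, BOX, NEAR, FAR, SIGMA = 20, 4, 1.0, 2.0, 0.1


def render(pos):
    """Render a 1D depth image: a box of width BOX at `pos` in front of a wall."""
    return [NEAR if pos <= i < pos + BOX else FAR for i in range(WIDTH)]


def log_likelihood(observed, pos):
    """Unnormalized per-pixel Gaussian log-likelihood of the observed depths."""
    rendered = render(pos)
    return -sum((o - r) ** 2 for o, r in zip(observed, rendered)) / (2 * SIGMA ** 2)


def infer_position(observed, steps=5000, seed=0):
    """Random-walk Metropolis-Hastings over the box position, uniform prior."""
    rng = random.Random(seed)
    pos = 0
    ll = log_likelihood(observed, pos)
    for _ in range(steps):
        proposal = pos + rng.choice((-1, 1))
        if not 0 <= proposal <= WIDTH - BOX:
            continue  # out of range: reject, which keeps the proposal symmetric
        ll_new = log_likelihood(observed, proposal)
        # Accept with probability min(1, exp(ll_new - ll)).
        if math.log(1.0 - rng.random()) < ll_new - ll:
            pos, ll = proposal, ll_new
    return pos


observed = render(11)  # synthetic observation with the box at position 11
estimate = infer_position(observed)
```

Because the likelihood compares the whole rendered image against the observation, a hypothesis is penalized both for depth it predicts that is not observed and for observed depth it fails to explain. This is the sense in which a rendering-based likelihood can account for shape and occlusion.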
Taxonomy
Topics: Robot Manipulation and Learning · Robotics and Sensor-Based Localization · 3D Shape Modeling and Analysis
Methods: Attentive Walk-Aggregating Graph Neural Network
