Bayes3D: fast learning and inference in structured generative models of 3D objects and scenes

Nishad Gothoskar; Matin Ghavami; Eric Li; Aidan Curtis; Michael Noseworthy; Karen Chung; Brian Patton; William T. Freeman; Joshua B. Tenenbaum; Mirko Klukas; Vikash K. Mansinghka

arXiv:2312.08715 · cs.RO · December 15, 2023 · 1 citation

TL;DR

Bayes3D is a perception system that rapidly learns and infers 3D object shapes and scene composition with high accuracy and efficiency, even in cluttered environments, using a hierarchical Bayesian model and GPU acceleration.

Contribution

It introduces a novel hierarchical Bayesian model and GPU-accelerated inference algorithm for fast, uncertainty-aware 3D scene understanding and object recognition.

Findings

01. Learns 3D object models from few views

02. Recognizes objects more robustly than neural baselines

03. Tracks 3D objects faster than real time on GPU

Abstract

Robots cannot yet match humans' ability to rapidly learn the shapes of novel 3D objects and recognize them robustly despite clutter and occlusion. We present Bayes3D, an uncertainty-aware perception system for structured 3D scenes that reports accurate posterior uncertainty over 3D object shape, pose, and scene composition in the presence of clutter and occlusion. Bayes3D delivers these capabilities via a novel hierarchical Bayesian model for 3D scenes and a GPU-accelerated coarse-to-fine sequential Monte Carlo algorithm. Quantitative experiments show that Bayes3D can learn 3D models of novel objects from just a handful of views, recognizing them more robustly and with orders of magnitude less training data than neural baselines, and tracking 3D objects faster than real time on a single GPU. We also demonstrate that Bayes3D learns complex 3D object models and accurately infers 3D scene…

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Advanced Vision and Imaging · Robot Manipulation and Learning