ObPose: Leveraging Pose for Object-Centric Scene Inference and Generation in 3D

Yizhe Wu; Oiwi Parker Jones; Ingmar Posner

arXiv:2206.03591·cs.CV·June 13, 2023

TL;DR

ObPose is an unsupervised 3D scene inference and generation model that learns object-centric representations by disentangling object location and appearance, leveraging pose as an inductive bias, and modeling scenes as compositions of NeRFs.
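As a rough illustration of what "a composition of NeRFs" can mean at a single 3D point, the sketch below sums per-object densities and mixes colours in proportion to density. This is a composition rule commonly used in compositional radiance-field models; whether ObPose uses exactly this rule is an assumption here, and the function name `compose_point` is hypothetical.

```python
def compose_point(densities, colors):
    """Combine per-object radiance-field outputs at one 3D point.

    densities: list of non-negative floats, one per object.
    colors:    list of (r, g, b) tuples, one per object.
    Assumption: densities add, colours are density-weighted averages.
    """
    total = sum(densities)
    if total == 0.0:
        # Empty space: no density, colour is irrelevant.
        return 0.0, (0.0, 0.0, 0.0)
    rgb = tuple(
        sum(d * c[k] for d, c in zip(densities, colors)) / total
        for k in range(3)
    )
    return total, rgb


# Two objects overlap at a point: a faint red one and a denser blue one.
sigma, rgb = compose_point([1.0, 3.0], [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)])
```

The combined density and colour would then be passed to an ordinary volume-rendering step, so objects can be added, removed, or moved by editing the list of per-object fields.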

Contribution

ObPose introduces a novel unsupervised approach that uses pose as an inductive bias and voxelised NeRF approximations for object-centric 3D scene inference and generation.

Findings

01 Outperforms state-of-the-art in 3D scene inference on multiple datasets

02 Enables flexible scene editing and novel scene generation

03 Validates key design choices through ablation studies

Abstract

We present ObPose, an unsupervised object-centric inference and generation model which learns 3D-structured latent representations from RGB-D scenes. Inspired by prior art in 2D representation learning, ObPose considers a factorised latent space, separately encoding object location (where) and appearance (what). ObPose further leverages an object's pose (i.e. location and orientation), defined via a minimum volume principle, as a novel inductive bias for learning the where component. To achieve this, we propose an efficient, voxelised approximation approach to recover the object shape directly from a neural radiance field (NeRF). As a consequence, ObPose models each scene as a composition of NeRFs, richly representing individual objects. To evaluate the quality of the learned representations, ObPose is evaluated quantitatively on the YCB, MultiShapeNet, and CLEVR datasets for…
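The minimum volume idea in the abstract can be made concrete with a small sketch: threshold a density field on a voxel grid to approximate the object's shape, then pick the orientation whose aligned bounding box has the smallest volume. Everything below is an illustrative assumption rather than the paper's procedure: the brute-force search over yaw only, the threshold, the grid bounds, and the hypothetical names `occupied_points`, `box_volume`, and `min_volume_yaw`.

```python
import math


def occupied_points(density_fn, grid_min, grid_max, resolution, threshold):
    """Query a density function at voxel centres and keep those above threshold."""
    step = (grid_max - grid_min) / resolution
    pts = []
    for i in range(resolution):
        for j in range(resolution):
            for k in range(resolution):
                p = tuple(grid_min + (n + 0.5) * step for n in (i, j, k))
                if density_fn(p) > threshold:
                    pts.append(p)
    return pts


def box_volume(points, yaw):
    """Volume of the bounding box aligned with a frame rotated by yaw about z."""
    c, s = math.cos(yaw), math.sin(yaw)
    xs = [c * x + s * y for x, y, z in points]   # coordinates in the rotated frame
    ys = [-s * x + c * y for x, y, z in points]
    zs = [z for x, y, z in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys)) * (max(zs) - min(zs))


def min_volume_yaw(points, n_angles=90):
    """Brute-force search over yaw in [0, pi/2) for the smallest bounding box."""
    candidates = [k * (math.pi / 2) / n_angles for k in range(n_angles)]
    return min(candidates, key=lambda a: box_volume(points, a))


# Usage: the eight corners of a 1.2 x 0.4 x 0.4 box rotated by 30 degrees about z.
c30, s30 = math.cos(math.pi / 6), math.sin(math.pi / 6)
corners = [
    (c30 * lx - s30 * ly, s30 * lx + c30 * ly, lz)
    for lx in (-0.6, 0.6) for ly in (-0.2, 0.2) for lz in (-0.2, 0.2)
]
yaw = min_volume_yaw(corners)  # recovers the 30 degree rotation
```

A yaw range of a quarter turn suffices in this sketch because a box's bounding volume repeats every 90 degrees. The object's location could then be taken as the centre of the minimising box, giving a pose estimate that depends only on the recovered shape.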

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Robotics and Sensor-Based Localization