Diffusion-based Generation, Optimization, and Planning in 3D Scenes

Siyuan Huang; Zan Wang; Puhao Li; Baoxiong Jia; Tengyu Liu; Yixin Zhu; Wei Liang; Song-Chun Zhu

arXiv:2301.06015 · cs.CV · January 18, 2023 · 6 cites

TL;DR

SceneDiffuser is a novel diffusion-based model that unifies 3D scene generation, optimization, and planning, offering scene-awareness, physics-based reasoning, and goal orientation for diverse 3D understanding tasks.

Contribution

The paper introduces SceneDiffuser, a fully differentiable, scene-aware diffusion model that integrates generation, optimization, and planning in 3D scenes, and reports improvements over prior methods on the evaluated tasks.

Findings

01. Significant improvements in human pose and motion generation.
02. Enhanced dexterous grasp generation.
03. Superior path and motion planning results.

Abstract

We introduce SceneDiffuser, a conditional generative model for 3D scene understanding. SceneDiffuser provides a unified model for solving scene-conditioned generation, optimization, and planning. In contrast to prior works, SceneDiffuser is intrinsically scene-aware, physics-based, and goal-oriented. With an iterative sampling strategy, SceneDiffuser jointly formulates the scene-aware generation, physics-based optimization, and goal-oriented planning via a diffusion-based denoising process in a fully differentiable fashion. Such a design alleviates the discrepancies among different modules and the posterior collapse of previous scene-conditioned generative models. We evaluate SceneDiffuser with various 3D scene understanding tasks, including human pose and motion generation, dexterous grasp generation, path planning for 3D navigation, and motion planning for robot arms. The results show…
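
The idea of folding optimization and planning into the denoising loop can be illustrated with a toy, classifier-guidance-style sampler. This is only a minimal sketch under stated assumptions, not the authors' implementation: `toy_eps_model` is a hypothetical closed-form stand-in for a learned scene-conditioned denoiser (it assumes the clean data is Gaussian around `scene["center"]`), `goal_grad` is a hypothetical differentiable objective, and the schedule values and `guidance_scale` are illustrative. Whether SceneDiffuser uses exactly this update rule is not stated in this summary.

```python
import numpy as np

def make_schedule(num_steps, beta_start=1e-4, beta_end=0.05):
    # Linear variance schedule (illustrative values, not taken from the paper).
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    return betas, alphas, np.cumprod(alphas)

def toy_eps_model(x_t, alpha_bar, scene):
    # Hypothetical stand-in for a learned, scene-conditioned noise predictor.
    # Assumes clean samples follow N(scene["center"], scene["sigma"]^2 I),
    # for which the optimal noise estimate has this closed form.
    var_t = alpha_bar * scene["sigma"] ** 2 + 1.0 - alpha_bar
    return np.sqrt(1.0 - alpha_bar) * (x_t - np.sqrt(alpha_bar) * scene["center"]) / var_t

def goal_grad(x, goal):
    # Gradient of the hypothetical objective -0.5 * ||x - goal||^2 w.r.t. x.
    return goal - x

def guided_sample(scene, goal, num_steps=200, guidance_scale=2.0, n=256, seed=0):
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule(num_steps)
    x = rng.standard_normal((n, goal.shape[0]))  # start from pure noise
    for t in range(num_steps - 1, -1, -1):
        eps = toy_eps_model(x, alpha_bars[t], scene)
        # Standard DDPM posterior mean computed from the predicted noise.
        mean = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps) / np.sqrt(alphas[t])
        # Guidance: shift the mean along the objective gradient, scaled by the step variance.
        mean = mean + guidance_scale * betas[t] * goal_grad(mean, goal)
        noise = rng.standard_normal(x.shape) if t > 0 else np.zeros_like(x)
        x = mean + np.sqrt(betas[t]) * noise
    return x

if __name__ == "__main__":
    scene = {"center": np.zeros(2), "sigma": 0.5}
    goal = np.array([2.0, 0.0])
    free = guided_sample(scene, goal, guidance_scale=0.0)
    pulled = guided_sample(scene, goal, guidance_scale=2.0)
    print("unguided mean:", free.mean(axis=0))
    print("guided mean:  ", pulled.mean(axis=0))
```

Setting `guidance_scale=0.0` recovers plain scene-conditioned sampling in this toy setup, while a positive scale pulls samples toward `goal` at every denoising step, which is the sense in which generation and goal-directed optimization share one differentiable loop.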

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Advanced Vision and Imaging