ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance

Yongwei Chen; Tengfei Wang; Tong Wu; Xingang Pan; Kui Jia; Ziwei Liu

arXiv:2403.12409·cs.CV·March 20, 2024·1 cites

TL;DR

ComboVerse is a framework that generates complex, multi-object 3D assets from a single image by learning to combine the 3D models of individual objects, using spatially-aware diffusion guidance to place them.

Contribution

The paper introduces ComboVerse, a new method for compositional 3D asset creation that effectively models multiple objects and spatial arrangements from images.

Findings

01 Outperforms existing methods in compositional 3D generation

02 Effectively models complex multi-object 3D assets

03 Uses spatially-aware guidance for better object placement

Abstract

Generating high-quality 3D assets from a given image is highly desirable in various applications such as AR/VR. Recent advances in single-image 3D generation explore feed-forward models that learn to infer the 3D model of an object without optimization. Though promising results have been achieved in single-object generation, these methods often struggle to model complex 3D assets that inherently contain multiple objects. In this work, we present ComboVerse, a 3D generation framework that produces high-quality 3D assets with complex compositions by learning to combine multiple models. 1) We first perform an in-depth analysis of this "multi-object gap" from both model and data perspectives. 2) Next, with reconstructed 3D models of different objects, we seek to adjust their sizes, rotation angles, and locations to create a 3D asset that matches the given image. 3) To automate this…
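
Step 2 of the abstract describes adjusting each reconstructed object's size, rotation angles, and location before combining the objects into one asset. The sketch below only illustrates what applying such parameters to an object's vertices could look like; it is not the paper's implementation and does not show how the parameters are found. The function names `place_object` and `combine`, the uniform scale, and the restriction to a single rotation about the vertical (y) axis are all assumptions made for illustration.

```python
import math

def place_object(vertices, scale, yaw_deg, translation):
    """Apply a uniform scale, then a rotation about the y axis, then a translation.

    Hypothetical helper: a single yaw angle stands in for the
    "rotation angles" mentioned in the abstract.
    """
    theta = math.radians(yaw_deg)
    c, s = math.cos(theta), math.sin(theta)
    tx, ty, tz = translation
    placed = []
    for x, y, z in vertices:
        # 1) size: uniform scaling about the object's own origin
        x, y, z = scale * x, scale * y, scale * z
        # 2) rotation about the y axis
        xr = c * x + s * z
        zr = -s * x + c * z
        # 3) location: shift into the shared scene frame
        placed.append((xr + tx, y + ty, zr + tz))
    return placed

def combine(objects):
    """Merge several placed objects into one list of scene vertices.

    `objects` is a list of (vertices, scale, yaw_deg, translation) tuples.
    """
    scene = []
    for vertices, scale, yaw_deg, translation in objects:
        scene.extend(place_object(vertices, scale, yaw_deg, translation))
    return scene
```

For example, `combine([(cup_vertices, 0.5, 0.0, (0.0, 1.0, 0.0)), (table_vertices, 1.0, 0.0, (0.0, 0.0, 0.0))])` would shrink a hypothetical cup and lift it one unit above a table in the same scene frame. In the paper these parameters are adjusted so that the result matches the input image; here they are simply supplied by hand.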


Taxonomy

Topics: Augmented Reality Applications · Robotics and Sensor-Based Localization · Surgical Simulation and Training

Methods: Diffusion