SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation

Yen-Chi Cheng; Hsin-Ying Lee; Sergey Tulyakov; Alexander Schwing; Liangyan Gui

arXiv:2212.04493 · cs.CV · March 23, 2023 · 5 cites

Open Access · 1 Repo

TL;DR

SDFusion is a versatile framework that enables multimodal 3D shape completion, reconstruction, and generation, allowing users to interactively generate and modify 3D assets using images, text, and partial shapes.

Contribution

It introduces a flexible, multi-modal diffusion-based model that unifies various 3D shape tasks into a single system with adjustable input influence.

Findings

01. Outperforms prior methods on shape completion, image-based 3D reconstruction, and text-to-3D tasks.

02. Supports combined multi-modal inputs for interactive shape generation.

03. Provides an efficient way to texture generated shapes using large-scale text-to-image models.

Abstract

In this work, we present a novel framework built to simplify 3D asset generation for amateur users. To enable interactive generation, our method supports a variety of input modalities that can be easily provided by a human, including images, text, partially observed shapes, and combinations of these, and further allows users to adjust the strength of each input. At the core of our approach is an encoder-decoder that compresses 3D shapes into a compact latent representation, upon which a diffusion model is learned. To enable a variety of multi-modal inputs, we employ task-specific encoders with dropout followed by a cross-attention mechanism. Due to its flexibility, our model naturally supports a variety of tasks, outperforming prior works on shape completion, image-based 3D reconstruction, and text-to-3D. Most interestingly, our model can combine all these tasks into one swiss-army-knife tool,…
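The abstract mentions two mechanisms without spelling them out: dropout on the per-modality encoders, and an adjustable strength for each input. The sketch below shows one common way such a scheme can be set up, assuming a classifier-free-guidance-style combination in which each modality's noise prediction is weighted by its own scale. It is an illustration under that assumption, not the authors' implementation; the function names `drop_conditions` and `combine_guidance`, and the use of `None` as the "null" condition, are hypothetical.

```python
import random


def drop_conditions(conditions, p_drop, rng):
    """Training-time conditioning dropout (assumed scheme).

    Each modality's embedding is independently replaced by None
    (standing in for a 'null' condition) with probability p_drop.
    """
    return {
        name: (None if rng.random() < p_drop else emb)
        for name, emb in conditions.items()
    }


def combine_guidance(eps_uncond, eps_cond, scales):
    """Sampling-time combination of per-modality predictions (assumed form).

    Computes, element-wise:
        eps = eps_uncond + sum_i s_i * (eps_cond_i - eps_uncond)
    where s_i is the user-chosen strength of modality i.
    Modalities missing from `scales` get strength 0.
    """
    out = list(eps_uncond)
    for name, eps_i in eps_cond.items():
        s = scales.get(name, 0.0)
        for k in range(len(out)):
            out[k] += s * (eps_i[k] - eps_uncond[k])
    return out


# Toy two-element "noise predictions" standing in for latent tensors.
eps_uncond = [0.0, 1.0]
eps_cond = {"image": [1.0, 1.0], "text": [0.0, 3.0]}

# Full weight on the image, half weight on the text.
mixed = combine_guidance(eps_uncond, eps_cond, {"image": 1.0, "text": 0.5})
print(mixed)
```

Setting every scale to zero recovers the unconditional prediction, and raising a single scale pushes the result toward that modality, which is the sense in which the strength of each input is adjustable in this sketch.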

Code & Models

Repositories

yccyenchicheng/SDFusion (PyTorch, official)

Taxonomy

Topics: Image Processing and 3D Reconstruction · Human Pose and Action Recognition · 3D Shape Modeling and Analysis

Methods: Diffusion · Dropout