Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy

Yuxuan Xue; Xianghui Xie; Riccardo Marin; Gerard Pons-Moll

arXiv:2412.06698 · cs.CV · November 27, 2025 · 2 cites

TL;DR

Gen-3Diffusion introduces a novel approach combining 2D and 3D diffusion models to generate realistic, multi-view consistent 3D objects and avatars from a single image, enhancing generalization and accuracy.

Contribution

The paper proposes a synchronized 2D and 3D diffusion framework that improves multi-view consistency and generalization in 3D object and avatar generation from single images.

Findings

01 Produces high-fidelity 3D objects and avatars

02 Enhances multi-view consistency in generated images

03 Demonstrates strong generalization to diverse shapes and clothing

Abstract

Creating realistic 3D objects and clothed avatars from a single RGB image is an attractive yet challenging problem. Due to its ill-posed nature, recent works leverage powerful priors from 2D diffusion models pretrained on large datasets. Although 2D diffusion models demonstrate strong generalization capability, they cannot guarantee that the generated multi-view images are 3D consistent. In this paper, we propose Gen-3Diffusion: Realistic Image-to-3D Generation via 2D & 3D Diffusion Synergy. We leverage a pre-trained 2D diffusion model and a 3D diffusion model via our elegantly designed process that synchronizes the two diffusion models at both training and sampling time. The synergy between the 2D and 3D diffusion models brings two major advantages: 1) 2D helps 3D in generalization: the pretrained 2D model has strong generalization ability to unseen images, providing strong shape priors for the…
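
The abstract states only that the 2D and 3D diffusion models are synchronized at sampling time. As a rough illustration of what such synchronization could look like, the toy sketch below assumes one possible reading: at every denoising step, a 2D model proposes a clean estimate per view, a 3D stage replaces those estimates with mutually consistent ones, and a deterministic DDIM-style update continues from the consistent estimate. Everything here is hypothetical and is not the paper's algorithm: `predict_x0_2d`, `render_consistent`, and `synchronized_sampling` are made-up names, the "2D model" is a random stand-in, and averaging over views is only a crude substitute for fitting and re-rendering a real 3D representation.

```python
import numpy as np


def predict_x0_2d(x_t, t, rng):
    """Hypothetical stand-in for a pretrained 2D multi-view diffusion model.

    Returns a clean-image estimate per view. The error is drawn independently
    for each view, so the estimates disagree across views. `t` is unused here;
    it is kept only because a real denoiser is conditioned on the timestep.
    """
    return 0.5 * x_t + 0.1 * rng.standard_normal(x_t.shape)


def render_consistent(x0_views):
    """Hypothetical stand-in for 'fit a 3D representation, then re-render'.

    Every view is replaced by the mean over views, so all views agree exactly.
    """
    shared = x0_views.mean(axis=0, keepdims=True)
    return np.repeat(shared, x0_views.shape[0], axis=0)


def synchronized_sampling(num_views=4, size=8, steps=10, seed=0, use_3d=True):
    """Toy deterministic DDIM-style sampler with an optional 3D consistency step."""
    rng = np.random.default_rng(seed)
    # alpha_bar[k] is the signal fraction at noise level k; level 0 is clean.
    alpha_bar = np.linspace(1.0, 0.01, steps + 1)
    x_t = rng.standard_normal((num_views, size, size))
    for k in range(steps, 0, -1):
        a_t, a_prev = alpha_bar[k], alpha_bar[k - 1]
        x0 = predict_x0_2d(x_t, k, rng)   # per-view estimate, inconsistent
        if use_3d:
            x0 = render_consistent(x0)    # assumed 3D consistency step
        # Noise implied by the (possibly corrected) clean estimate.
        eps = (x_t - np.sqrt(a_t) * x0) / np.sqrt(1.0 - a_t)
        # Move to the next, less noisy level.
        x_t = np.sqrt(a_prev) * x0 + np.sqrt(1.0 - a_prev) * eps
    return x_t


views = synchronized_sampling()
baseline = synchronized_sampling(use_3d=False)
```

In this toy setting the final views agree with each other when the consistency step is enabled and disagree when it is disabled, which mirrors the motivation given in the abstract: a 2D model alone cannot guarantee 3D-consistent multi-view images.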

Taxonomy

Topics: Image Processing Techniques and Applications · Advanced Vision and Imaging

Methods: Diffusion