MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model

Chenjie Cao; Chaohui Yu; Shang Liu; Fan Wang; Xiangyang Xue; Yanwei Fu

arXiv:2411.16157·cs.CV·March 7, 2025

TL;DR

MVGenMaster is a multi-view diffusion model that uses 3D priors, warped with metric depth and camera poses, together with a large-scale training dataset (MvD-1M) to generate 3D-consistent novel views from a variable number of reference views.

Contribution

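The paper introduces MVGenMaster, a multi-view diffusion model enhanced with warped 3D priors, and MvD-1M, a large-scale multi-view dataset with aligned metric depth used to train it, targeting better generalization and 3D consistency in novel view synthesis.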

Findings

01

Generates up to 100 novel views in a single forward pass.

02

Outperforms existing methods on multiple benchmarks.

03

Demonstrates strong generalization to out-of-domain data.

Abstract

We introduce MVGenMaster, a multi-view diffusion model enhanced with 3D priors to address versatile Novel View Synthesis (NVS) tasks. MVGenMaster leverages 3D priors that are warped using metric depth and camera poses, significantly enhancing both generalization and 3D consistency in NVS. Our model features a simple yet effective pipeline that can generate up to 100 novel views conditioned on variable reference views and camera poses with a single forward process. Additionally, we have developed a comprehensive large-scale multi-view image dataset called MvD-1M, comprising up to 1.6 million scenes, equipped with well-aligned metric depth to train MVGenMaster. Moreover, we present several training and model modifications to strengthen the model with scaled-up datasets. Extensive evaluations across in- and out-of-domain benchmarks demonstrate the effectiveness of our proposed method and…
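To make "3D priors that are warped using metric depth and camera poses" concrete, the following is a minimal per-pixel sketch of depth-based reprojection. It is not taken from the MVGenMaster code: the function name `warp_pixel`, the pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), and the reference-to-target pose convention (`R`, `t`) are all assumptions chosen for illustration, and the paper's actual warping pipeline may differ.

```python
from typing import Optional, Sequence, Tuple


def warp_pixel(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
    R: Sequence[Sequence[float]], t: Sequence[float],
) -> Optional[Tuple[float, float, float]]:
    """Reproject one reference pixel into a target view (hypothetical helper).

    Assumes a pinhole camera shared by both views and a rigid transform
    (R, t) mapping reference-camera coordinates to target-camera coordinates.
    Returns (u', v', depth') in the target view, or None if the point ends
    up behind the target camera.
    """
    # 1) Unproject the pixel to a 3D point using its metric depth.
    p = ((u - cx) / fx * depth, (v - cy) / fy * depth, depth)

    # 2) Move the point into the target camera frame: p' = R p + t.
    xt, yt, zt = (
        sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)
    )

    # 3) Points at or behind the target camera plane cannot be projected.
    if zt <= 0.0:
        return None

    # 4) Project back to pixel coordinates in the target view.
    return (fx * xt / zt + cx, fy * yt / zt + cy, zt)


IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# Same pose: the pixel should map onto itself.
same = warp_pixel(320.0, 240.0, 2.0, 500.0, 500.0, 320.0, 240.0,
                  IDENTITY, [0.0, 0.0, 0.0])

# Point shifted 0.1 m along x in the target frame: the pixel moves horizontally.
shifted = warp_pixel(320.0, 240.0, 2.0, 500.0, 500.0, 320.0, 240.0,
                     IDENTITY, [0.1, 0.0, 0.0])

# Point pushed behind the target camera: no valid projection.
behind = warp_pixel(320.0, 240.0, 2.0, 500.0, 500.0, 320.0, 240.0,
                    IDENTITY, [0.0, 0.0, -3.0])
```

For a pure x-translation `tx` with no rotation, the horizontal shift is `fx * tx / depth`, which is why metric (rather than relative) depth matters: a scale error in depth translates directly into a misplaced prior in the target view.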

Code & Models

Repositories

ewrfcas/mvgenmaster
jax · Official

Models

🤗 ewrfcas/MVGenMaster
model · ♡ 2

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Medical Image Segmentation Techniques

Methods: Diffusion