MV-SAM3D: Adaptive Multi-View Fusion for Layout-Aware 3D Generation

Baicheng Li; Dong Wu; Jun Li; Shunkai Zhou; Zecui Zeng; Lusong Li; Hongbin Zha

arXiv:2603.11633 · cs.CV · April 10, 2026

TL;DR

MV-SAM3D is a training-free framework that extends layout-aware 3D generation from single-view to multi-view input. It fuses observations across views with adaptive weighting and optimizes object layouts for physical plausibility, targeting artifacts such as interpenetration and floating objects.

Contribution

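MV-SAM3D formulates multi-view fusion as a Multi-Diffusion process in 3D latent space with two adaptive weighting strategies (attention-entropy weighting and visibility weighting) and adds physics-aware optimization of object layouts, producing physically plausible 3D scenes without additional training.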
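To make the two layout failure modes named in the abstract concrete (interpenetration and floating), here is a minimal sketch of how they could be measured. It is not the paper's optimization: the axis-aligned box representation, the single ground plane at `ground_z`, and the function names `overlap_volume`, `floating_gap`, and `layout_penalty` are all assumptions made for illustration.

```python
# Hypothetical sketch, not the authors' method.
# A box is a pair (min_xyz, max_xyz) of 3-tuples, with z pointing up.

def overlap_volume(a, b):
    """Volume of the intersection of two axis-aligned boxes (0.0 if disjoint)."""
    vol = 1.0
    for axis in range(3):
        lo = max(a[0][axis], b[0][axis])
        hi = min(a[1][axis], b[1][axis])
        if hi <= lo:
            return 0.0  # separated (or only touching) along this axis
        vol *= hi - lo
    return vol

def floating_gap(box, ground_z=0.0):
    """Vertical gap between the bottom of a box and the ground plane.

    Assumption: every object rests directly on the ground, so any
    positive gap counts as floating.
    """
    return max(0.0, box[0][2] - ground_z)

def layout_penalty(boxes, ground_z=0.0):
    """Sum of pairwise overlap volumes and per-object floating gaps.

    The two terms have different units; a real objective would weight them.
    """
    penalty = 0.0
    for i in range(len(boxes)):
        penalty += floating_gap(boxes[i], ground_z)
        for j in range(i + 1, len(boxes)):
            penalty += overlap_volume(boxes[i], boxes[j])
    return penalty
```

A layout with no interpenetration and no floating objects has a penalty of zero under these assumptions, which is the kind of quantity a physics-aware optimizer could push down by adjusting object poses.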

Findings

01 Improves reconstruction fidelity on standard benchmarks.

02 Enhances layout plausibility with physics-aware object arrangements.

03 Achieves significant quality improvements without extra training.

Abstract

Recent unified 3D generation models have made remarkable progress in producing high-quality 3D assets from a single image. Notably, layout-aware approaches such as SAM3D can reconstruct multiple objects while preserving their spatial arrangement, opening the door to practical scene-level 3D generation. However, current methods are limited to single-view input and cannot leverage complementary multi-view observations, while independently estimated object poses often lead to physically implausible layouts such as interpenetration and floating artifacts. We present MV-SAM3D, a training-free framework that extends layout-aware 3D generation with multi-view consistency and physical plausibility. We formulate multi-view fusion as a Multi-Diffusion process in 3D latent space and propose two adaptive weighting strategies -- attention-entropy weighting and visibility weighting -- that enable…
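The abstract is cut off before it describes how the two weighting strategies are computed, so the following is only a sketch of the general idea of confidence-weighted fusion across views. It assumes that a lower attention entropy signals a more confident view, that visibility is a score in [0, 1], and that the weight is proportional to `visibility * exp(-entropy / temperature)`. That rule, the `temperature` parameter, and all function names are hypothetical and not taken from the paper.

```python
import math

# Hypothetical sketch, not the authors' formulation.

def attention_entropy(probs):
    """Shannon entropy (in nats) of one attention distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def fusion_weights(entropies, visibilities, temperature=1.0):
    """Normalized per-view weights for a single latent element.

    Assumed rule: weight is proportional to visibility * exp(-entropy / temperature).
    """
    raw = [v * math.exp(-h / temperature)
           for h, v in zip(entropies, visibilities)]
    total = sum(raw)
    if total == 0.0:
        # No view sees this element: fall back to a uniform average.
        return [1.0 / len(raw)] * len(raw)
    return [r / total for r in raw]

def fuse(predictions, weights):
    """Weighted average of per-view predictions for one latent element."""
    dim = len(predictions[0])
    return [sum(w * p[d] for w, p in zip(weights, predictions))
            for d in range(dim)]

# Three views of the same latent element: one sharply attended,
# one diffuse, and one in which the element is occluded.
attn = [[0.97, 0.01, 0.01, 0.01],
        [0.25, 0.25, 0.25, 0.25],
        [0.70, 0.10, 0.10, 0.10]]
vis = [1.0, 1.0, 0.0]
preds = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]

weights = fusion_weights([attention_entropy(a) for a in attn], vis)
fused = fuse(preds, weights)
```

In this toy setup the occluded view receives zero weight and the sharply attended view dominates the fused value. In a Multi-Diffusion-style process, a fusion step of this kind would be applied to the per-view predictions at each denoising step.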

Code & Models

Repositories

devinli123/MV-SAM3D (GitHub)
