NVComposer: Boosting Generative Novel View Synthesis with Multiple Sparse and Unposed Images
Lingen Li, Zhaoyang Zhang, Yaowei Li, Jiale Xu, Wenbo Hu, Xiaoyu Li, Weihao Cheng, Jinwei Gu, Tianfan Xue, Ying Shan

TL;DR
NVComposer is a generative approach to novel view synthesis that needs no explicit external pose alignment: a dual-stream diffusion model and geometry-aware feature alignment let it infer the spatial relationships between input views implicitly.
Contribution
The paper proposes NVComposer, a method that removes the external alignment requirement in multi-view NVS by jointly generating target views and condition camera poses, and by distilling geometric priors during training.
Findings
Achieves state-of-the-art performance in multi-view NVS tasks.
Improves synthesis quality as the number of unposed input views increases.
Removes reliance on external pose estimation, enhancing accessibility.
Abstract
Recent advancements in generative models have significantly improved novel view synthesis (NVS) from multi-view data. However, existing methods depend on external multi-view alignment processes, such as explicit pose estimation or pre-reconstruction, which limits their flexibility and accessibility, especially when alignment is unstable due to insufficient overlap or occlusions between views. In this paper, we propose NVComposer, a novel approach that eliminates the need for explicit external alignment. NVComposer enables the generative model to implicitly infer spatial and geometric relationships between multiple conditional views by introducing two key components: 1) an image-pose dual-stream diffusion model that simultaneously generates target novel views and condition camera poses, and 2) a geometry-aware feature alignment module that distills geometric priors from dense stereo…
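The feature alignment idea in the abstract can be illustrated with a generic distillation loss. This is a minimal sketch, not the paper's implementation: the abstract is truncated here and does not state the loss, so the cosine-distance form, the function names `cosine_similarity` and `alignment_loss`, and the list-of-vectors feature layout are all assumptions made for illustration. It shows how features from a generative model (the "student") could be pulled toward features from a frozen geometry-aware model (the "teacher").

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as carrying no directional signal
    return dot / (norm_a * norm_b)


def alignment_loss(student_feats, teacher_feats):
    """Mean (1 - cosine similarity) over paired feature vectors.

    ASSUMPTION: cosine distance is a common choice for feature
    distillation; the paper's actual objective may differ.
    """
    if len(student_feats) != len(teacher_feats):
        raise ValueError("student and teacher must have the same number of vectors")
    if not student_feats:
        raise ValueError("at least one feature vector is required")
    total = sum(
        1.0 - cosine_similarity(s, t)
        for s, t in zip(student_feats, teacher_feats)
    )
    return total / len(student_feats)


# Hypothetical toy features: two tokens with two channels each.
student = [[1.0, 0.0], [0.0, 2.0]]
teacher = [[2.0, 0.0], [0.0, 1.0]]
loss = alignment_loss(student, teacher)  # directions match, so the loss is near zero
```

In a training setup such a term would typically be added to the diffusion objective with a weighting factor, with the teacher kept frozen so that only the generative model adapts. That weighting and the choice of which layers to align are likewise not specified in this summary.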
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Computer Graphics and Visualization Techniques
Methods: Diffusion
