No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
Botao Ye, Sifei Liu, Haofei Xu, Xueting Li, Marc Pollefeys, Ming-Hsuan Yang, Songyou Peng

TL;DR
NoPoSplat introduces a real-time, pose-free 3D scene reconstruction method from sparse unposed images, enabling high-quality novel view synthesis and accurate pose estimation without requiring pose or depth supervision.
Contribution
The paper presents a pose-free, feed-forward 3D Gaussian reconstruction model trained solely with photometric loss, eliminating the need for camera poses as input during inference.
Findings
Achieves better novel view synthesis quality than pose-required methods, particularly when the input views overlap little.
Trained without ground-truth depth or pose, significantly outperforms state-of-the-art pose estimation methods.
Operates in real time during inference.
Abstract
We introduce NoPoSplat, a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from unposed sparse multi-view images. Our model, trained exclusively with photometric loss, achieves real-time 3D Gaussian reconstruction during inference. To eliminate the need for accurate pose input during reconstruction, we anchor one input view's local camera coordinates as the canonical space and train the network to predict Gaussian primitives for all views within this space. This approach obviates the need to transform Gaussian primitives from local coordinates into a global coordinate system, thus avoiding errors associated with per-frame Gaussians and pose estimation. To resolve scale ambiguity, we design and compare various intrinsic embedding methods, ultimately opting to convert camera intrinsics into a token embedding and concatenate it with image tokens…
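The two ideas in the abstract, a canonical space anchored at one input view and an intrinsics token concatenated with the image tokens, can be illustrated with a minimal NumPy sketch. This is not the authors' code. The function names, the choice of (fx, fy, cx, cy) as the intrinsics vector, and the `proj` matrix (a stand-in for a learned linear layer) are all assumptions made for illustration. Camera poses appear here only to show what "canonical space" means geometrically; the model described above does not receive them as input.

```python
import numpy as np


def to_canonical(cam_to_world: np.ndarray) -> np.ndarray:
    """Re-express camera-to-world poses relative to view 0 (hypothetical helper).

    cam_to_world: (V, 4, 4) homogeneous camera-to-world matrices.
    Returns (V, 4, 4) camera-to-canonical matrices; entry 0 is the identity,
    because view 0's local camera frame is the canonical space.
    """
    world_to_canonical = np.linalg.inv(cam_to_world[0])
    # (4, 4) @ (V, 4, 4) broadcasts over the view axis.
    return world_to_canonical @ cam_to_world


def append_intrinsics_token(image_tokens: np.ndarray,
                            K: np.ndarray,
                            proj: np.ndarray) -> np.ndarray:
    """Concatenate one intrinsics token to a view's image tokens.

    image_tokens: (N, D) token sequence for one view.
    K: (3, 3) camera intrinsics (assumed normalized by image size).
    proj: (4, D) stand-in for a learned linear projection (assumption).
    Returns (N + 1, D).
    """
    k_vec = np.array([K[0, 0], K[1, 1], K[0, 2], K[1, 2]])  # fx, fy, cx, cy
    token = k_vec @ proj  # (D,)
    return np.concatenate([image_tokens, token[None, :]], axis=0)


def translation(t) -> np.ndarray:
    """Build a pure-translation pose for the demo."""
    T = np.eye(4)
    T[:3, 3] = t
    return T


# Demo: two cameras on the x-axis, expressed in the first camera's frame.
poses = np.stack([translation([1.0, 0.0, 0.0]), translation([3.0, 0.0, 0.0])])
canon = to_canonical(poses)

# Demo: append an intrinsics token to 5 image tokens of width 8.
K = np.array([[0.8, 0.0, 0.5],
              [0.0, 0.8, 0.5],
              [0.0, 0.0, 1.0]])
tokens = append_intrinsics_token(np.zeros((5, 8)), K, np.ones((4, 8)))
```

In the demo, `canon[0]` is the identity and `canon[1]` carries the relative offset between the two cameras, which is the frame in which Gaussians for every view would be predicted. Passing focal length to the network as a token is one way to give it the information it needs to resolve scale, as the abstract describes.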
Taxonomy
Topics: Image and Object Detection Techniques
