Benchmarking Efficient & Effective Camera Pose Estimation Strategies for Novel View Synthesis
Jhacson Meza, Martin R. Oswald, Torsten Sattler

TL;DR
This paper introduces a benchmark for camera pose estimation in novel view synthesis (NVS). It compares classical and neural network-based Structure-from-Motion (SfM) methods and demonstrates two simple strategies for balancing efficiency and accuracy.
Contribution
The paper develops a benchmark for evaluating SfM methods for NVS and identifies two effective strategies: simple feature reduction, and hybrid approaches that refine feed-forward network estimates with classical SfM.
Findings
Using fewer features speeds up classical SfM while maintaining high pose accuracy.
Refining feed-forward network estimates with classical SfM offers the best trade-off between efficiency and effectiveness.
On that trade-off, hybrid methods outperform both purely neural and purely classical approaches.
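The first finding can be illustrated with a minimal sketch. It assumes that "fewer features" means keeping only the highest-scoring keypoints per image, which is an illustrative choice and not necessarily the paper's procedure. The keypoint layout `(x, y, score)` and the names `keep_top_k_features` and `candidate_matches` are hypothetical and do not come from the paper or from any SfM library.

```python
from typing import List, Sequence, Tuple

# Hypothetical keypoint layout: (x, y, detector_score).
Keypoint = Tuple[float, float, float]


def keep_top_k_features(keypoints: Sequence[Keypoint], k: int) -> List[Keypoint]:
    """Return the k keypoints with the highest detector score."""
    if k < 0:
        raise ValueError("k must be non-negative")
    # Sort by score, strongest first; sorted() is stable, so ties keep input order.
    ranked = sorted(keypoints, key=lambda kp: kp[2], reverse=True)
    return ranked[:k]


def candidate_matches(n_a: int, n_b: int) -> int:
    """Descriptor comparisons needed to exhaustively match two images."""
    return n_a * n_b


if __name__ == "__main__":
    keypoints = [(10.0, 20.0, 0.9), (30.0, 5.0, 0.2), (7.0, 7.0, 0.6), (1.0, 2.0, 0.4)]
    kept = keep_top_k_features(keypoints, k=2)
    print(kept)
    print(candidate_matches(len(keypoints), len(keypoints)),
          candidate_matches(len(kept), len(kept)))
```

Under exhaustive matching, the number of descriptor comparisons per image pair is the product of the two feature counts, so halving the features per image cuts that cost to a quarter. Real pipelines often use approximate or learned matchers, so the actual speedup may differ.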
Abstract
Novel view synthesis (NVS) approaches such as NeRFs or 3DGS can produce photo-realistic 3D scene representations from a set of images with known extrinsic and intrinsic parameters. The necessary camera poses and calibrations are typically obtained from the images via Structure-from-Motion (SfM). Classical SfM approaches rely on local feature matches between the images to estimate both the poses and a sparse 3D model of the scene, using bundle adjustment to refine initial pose, intrinsics, and geometry estimates. In order to increase run-time efficiency, recent SfM systems forgo optimization via bundle adjustment. Instead, they train feed-forward (transformer-based) neural networks to directly regress camera parameters and the 3D structure. While orders of magnitude more efficient, such recent works produce significantly less accurate estimates. To stimulate research on developing SfM…
Taxonomy
Topics: Advanced Vision and Imaging · 3D Shape Modeling and Analysis · Robotics and Sensor-Based Localization
