CVD-SfM: A Cross-View Deep Front-end Structure-from-Motion System for Sparse Localization in Multi-Altitude Scenes
Yaxuan Li, Yewei Huang, Bijay Gaudel, Hamidreza Jafarnejadsani, and Brendan Englot

TL;DR
This paper introduces CVD-SfM, a cross-view deep front-end structure-from-motion system for sparse localization in multi-altitude scenes. It is validated on two newly collected datasets, where the authors report better accuracy and robustness than existing methods.
Contribution
The paper presents a multi-altitude camera pose estimation system that integrates a cross-view transformer, deep features, and structure-from-motion into a unified framework, along with two newly collected datasets for benchmarking.
Findings
Reports higher accuracy and robustness than existing solutions for multi-altitude sparse pose estimation on the two new datasets.
Handles diverse environmental conditions and viewpoint variations.
Provides two new datasets for benchmarking multi-altitude camera pose estimation, a type of dataset that remains rare in the literature.
Abstract
We present a novel multi-altitude camera pose estimation system, addressing the challenges of robust and accurate localization across varied altitudes when only considering sparse image input. The system effectively handles diverse environmental conditions and viewpoint variations by integrating the cross-view transformer, deep features, and structure-from-motion into a unified framework. To benchmark our method and foster further research, we introduce two newly collected datasets specifically tailored for multi-altitude camera pose estimation; datasets of this nature remain rare in the current literature. The proposed framework has been validated through extensive comparative analyses on these datasets, demonstrating that our system achieves superior performance in both accuracy and robustness for multi-altitude sparse pose estimation tasks compared to existing solutions, making it…
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Advanced Vision and Imaging
