$L^3$: Scene-agnostic Visual Localization in the Wild
Yu Zhang, Muhua Zhu, Yifei Xue, Tie Ji, Yizhen Lao

TL;DR
This paper introduces $L^3$, a novel visual localization method that localizes accurately in the wild without any offline scene pre-processing, relying instead on online 3D reconstruction and pose refinement.
Contribution
$L^3$ is the first framework to achieve scene-agnostic localization without offline pre-processing, leveraging online 3D reconstruction and two-stage scale and pose refinement.
Findings
Performance comparable to state-of-the-art methods on benchmarks
Significantly more robust in sparse scenes with fewer reference images
Eliminates the need for offline scene representation storage
Abstract
Standard visual localization methods typically require offline pre-processing of scenes to obtain 3D structural information for better performance. This inevitably introduces additional computational and time costs, as well as the overhead of storing scene representations. Can we visually localize in a wild scene without any offline pre-processing step? In this paper, we leverage the online inference capabilities of feed-forward 3D reconstruction networks to propose a novel map-free visual localization framework, $L^3$. Specifically, by performing direct online 3D reconstruction on RGB images, followed by two-stage metric scale recovery and pose refinement based on 2D-3D correspondences, $L^3$ achieves high accuracy without the need to pre-build or store any offline scene representations. Extensive experiments demonstrate not only that the performance is comparable to…
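The abstract mentions metric scale recovery but this page does not describe how it is done. As a rough illustration only, the sketch below shows one generic way to recover a metric scale for an up-to-scale reconstruction. It assumes the reference images come with known metric camera positions, and it uses a median of pairwise distance ratios; both are assumptions made here for illustration, not the paper's procedure. The function name `recover_metric_scale` and the toy data are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def recover_metric_scale(recon_centers, metric_centers):
    """Estimate s such that s * (reconstruction distance) ~= (metric distance).

    recon_centers:  camera centers of the reference images in the
                    up-to-scale reconstruction frame (assumed input).
    metric_centers: centers of the same cameras from their known metric
                    poses (assumed to be available).
    Only distances are compared, so the two frames may differ by any
    rotation and translation.
    """
    if len(recon_centers) != len(metric_centers):
        raise ValueError("both lists must describe the same cameras")

    ratios = []
    for i, j in combinations(range(len(recon_centers)), 2):
        d_recon = math.dist(recon_centers[i], recon_centers[j])
        d_metric = math.dist(metric_centers[i], metric_centers[j])
        if d_recon > 1e-9:  # skip (near-)coincident cameras
            ratios.append(d_metric / d_recon)

    if not ratios:
        raise ValueError("need at least two distinct reference cameras")

    # The median keeps a few badly reconstructed cameras from dominating.
    return median(ratios)


# Toy data: the reconstruction is half the size of the metric scene.
recon = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
metric = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
scale = recover_metric_scale(recon, metric)
```

Multiplying the reconstructed 3D points by `scale` would put them in metric units, after which a pose could be estimated from 2D-3D correspondences with a standard PnP solver. With very few reference images there are few camera pairs, so an estimate of this kind becomes more sensitive to individual reconstruction errors.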
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
