Self-Supervised Feature Learning for Long-Term Metric Visual Localization
Yuxuan Chen, Timothy D. Barfoot

TL;DR
This paper introduces a self-supervised learning framework that enables long-term visual localization without ground-truth pose data by leveraging sequence-based image matching to train neural network features adaptable to environmental changes.
Contribution
It proposes a novel self-supervised approach for feature learning in visual localization, eliminating the need for ground-truth pose supervision and improving robustness to appearance variations.
Findings
Effective feature learning without ground-truth labels
Successful integration with existing localization pipelines
Reliable localization over 22.4 km under varying conditions
Abstract
Visual localization is the task of estimating camera pose in a known scene, which is an essential problem in robotics and computer vision. However, long-term visual localization remains a challenge because the appearance of the environment changes with lighting and season. While techniques exist to address appearance changes using neural networks, these methods typically require ground-truth pose information, either to generate accurate image correspondences or to act as a supervisory signal during training. In this paper, we present a novel self-supervised feature learning framework for metric visual localization. We use a sequence-based image matching algorithm across different sequences of images (i.e., experiences) to generate image correspondences without ground-truth labels. We can then sample image pairs to train a deep neural network that learns sparse features with associated descriptors…
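To make the idea of sequence-based image matching concrete, here is a minimal sketch, not the authors' algorithm. It assumes each image in an experience has already been reduced to a global descriptor vector, and that the two experiences were driven at roughly the same speed, so true matches lie along a diagonal of the cost matrix. The function name `sequence_match` and the `window` parameter are hypothetical, introduced only for illustration.

```python
import numpy as np


def sequence_match(desc_a, desc_b, window=5):
    """Match frames of experience A to frames of experience B.

    Illustrative sketch only (assumed constant relative speed): a frame
    pair (i, j) is scored by the mean cosine distance over a short window
    of neighbouring frames, rather than by the single pair alone.
    Returns a list of (i, j, score) tuples, one per scored frame of A.
    """
    # Normalise descriptors so the dot product equals cosine similarity.
    a = desc_a / np.linalg.norm(desc_a, axis=1, keepdims=True)
    b = desc_b / np.linalg.norm(desc_b, axis=1, keepdims=True)
    cost = 1.0 - a @ b.T  # cosine distance, shape (len_a, len_b)

    n, m = cost.shape
    half = window // 2
    offsets = np.arange(-half, half + 1)

    pairs = []
    for i in range(half, n - half):
        best_j, best_score = -1, np.inf
        for j in range(half, m - half):
            # Average the cost along the diagonal centred on (i, j).
            score = cost[i + offsets, j + offsets].mean()
            if score < best_score:
                best_j, best_score = j, score
        pairs.append((i, best_j, float(best_score)))
    return pairs
```

Pairs with a low score could then be sampled as training pairs. Scoring a window of frames instead of a single frame is what makes this kind of matching tolerant to appearance change: one frame may look very different across seasons, but a run of consecutive frames is less likely to be mismatched by chance.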
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
