Condition-Invariant Multi-View Place Recognition
Jose M. Facil, Daniel Olid, Luis Montesano, Javier Civera

TL;DR
This paper introduces three novel deep network approaches that leverage temporal sequence information to improve visual place recognition under appearance changes, achieving more compact and accurate descriptors.
Contribution
It proposes Descriptor Grouping, Fusion, and Recurrent Descriptors methods to enhance deep networks for sequence-based place recognition, outperforming existing baselines.
Findings
More compact descriptors than baselines
Higher recognition accuracy than single- and multi-view baselines on two public databases
Effective handling of appearance variations
Abstract
Visual place recognition is particularly challenging when places undergo changes in their appearance. Such changes are common, e.g., due to weather, day/night cycles, or seasons. In this paper we build on recent research using deep networks and explore how they can be improved by exploiting temporal sequence information. Specifically, we propose three alternatives (Descriptor Grouping, Fusion, and Recurrent Descriptors) that let deep networks use several frames of a sequence. We show that our approaches produce more compact and better-performing descriptors than single- and multi-view baselines in the literature on two public databases.
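To make the multi-view idea concrete, the following is a minimal sketch of one plausible reading of descriptor grouping: per-frame descriptors from a short sequence are concatenated into a single vector, and places are matched by cosine similarity. This is an illustration under stated assumptions, not the authors' implementation. The function names (`group_descriptors`, `match_place`), the random stand-in descriptors, and the dimensions are all hypothetical; in practice the per-frame descriptors would come from a deep network.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors along `axis` to unit Euclidean length."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def group_descriptors(frame_descriptors):
    """Concatenate per-frame descriptors of shape (n_frames, dim) into one
    unit-length sequence descriptor (assumed form of 'grouping')."""
    frames = l2_normalize(np.asarray(frame_descriptors, dtype=np.float64))
    return l2_normalize(frames.reshape(-1))


def match_place(query, database):
    """Return the index of the most similar database descriptor and all
    cosine similarities (inputs are assumed to be unit length)."""
    similarities = database @ query
    return int(np.argmax(similarities)), similarities


# Hypothetical data: random vectors stand in for network outputs.
rng = np.random.default_rng(0)
n_places, n_frames, dim = 5, 3, 8
reference = rng.normal(size=(n_places, n_frames, dim))
database = np.stack([group_descriptors(seq) for seq in reference])

# Query: place 2 revisited, with a small perturbation standing in for
# an appearance change between visits.
query_seq = reference[2] + 0.05 * rng.normal(size=(n_frames, dim))
query = group_descriptors(query_seq)

best_index, similarities = match_place(query, database)
print("best match:", best_index)
```

Plain concatenation grows the descriptor linearly with the number of frames. The fusion and recurrent alternatives named in the abstract would instead combine the frames into a more compact representation, which this sketch does not attempt.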
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques · Indoor and Outdoor Localization Technologies
