TL;DR
This paper evaluates CNN backbones and data augmentation techniques for hierarchical visual localization of mobile robots using omnidirectional images, assessing both rough (room-level) localization and fine (image-retrieval) localization accuracy.
Contribution
It provides a comprehensive ablation study of CNN backbones and data augmentation effects specifically tailored for robot localization with omnidirectional imagery.
Findings
ConvNeXt improves localization accuracy.
Data augmentation enhances robustness under lighting changes.
Dual-step localization effectively combines rough and fine positioning.
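To make the lighting-related finding concrete, here is a minimal sketch of one possible lighting augmentation. This summary does not list the specific visual effects used in the paper, so the brightness scaling below, the function name `change_brightness`, and the `factor` parameter are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def change_brightness(image: np.ndarray, factor: float) -> np.ndarray:
    """Scale pixel intensities by `factor` and clip to the 8-bit range.

    Hypothetical helper: factor < 1 darkens the image, factor > 1 brightens it.
    """
    scaled = image.astype(np.float32) * factor
    return np.clip(scaled, 0, 255).astype(np.uint8)


# Toy 2x2 RGB image with every channel set to 100.
image = np.full((2, 2, 3), 100, dtype=np.uint8)
darker = change_brightness(image, 0.5)    # simulates a dimmer scene
brighter = change_brightness(image, 3.0)  # saturates at 255 after clipping
```

Training on such variants alongside the original images is one common way to expose a model to illumination conditions that are absent from the captured dataset.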
Abstract
This work presents an evaluation of CNN models and data augmentation for the hierarchical localization of a mobile robot using omnidirectional images. An ablation study of different state-of-the-art CNN models used as backbones is presented, and a variety of data augmentation visual effects are proposed for addressing the visual localization of the robot. The proposed method is based on the adaptation and re-training of a CNN with a dual purpose: (1) to perform a rough localization step, in which the model predicts the room from which an image was captured, and (2) to address the fine localization step, which consists of retrieving the most similar image in the visual map among those contained in the previously predicted room, by means of a pairwise comparison between descriptors obtained from an intermediate layer of the CNN. In this sense, we evaluate…
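The two-step procedure described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the room scores and descriptors are taken as precomputed arrays, the Euclidean distance is an assumed choice (the abstract only says "pairwise comparison"), and the function and variable names are hypothetical.

```python
import numpy as np


def hierarchical_localize(room_scores, query_desc, map_descs, map_rooms):
    """Rough step: pick the room; fine step: nearest map image within it.

    room_scores: (n_rooms,) classifier scores for the query image.
    query_desc:  (d,) descriptor of the query image.
    map_descs:   (n_map, d) descriptors of the visual-map images.
    map_rooms:   (n_map,) room label of each map image.
    """
    # Rough localization: the room with the highest predicted score.
    room = int(np.argmax(room_scores))

    # Fine localization: only compare against map images from that room.
    candidates = np.flatnonzero(map_rooms == room)
    if candidates.size == 0:
        raise ValueError(f"no map images stored for room {room}")

    # Assumed metric: Euclidean distance between descriptors.
    dists = np.linalg.norm(map_descs[candidates] - query_desc, axis=1)
    best = int(candidates[np.argmin(dists)])
    return room, best


# Toy visual map: two images in room 0, two in room 1.
map_descs = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [6.0, 5.0]])
map_rooms = np.array([0, 0, 1, 1])
room_scores = np.array([0.2, 0.8])
query_desc = np.array([5.9, 5.1])

room, idx = hierarchical_localize(room_scores, query_desc, map_descs, map_rooms)
```

Restricting the descriptor comparison to the predicted room reduces the number of comparisons per query, at the cost that a wrong room prediction cannot be corrected in the fine step.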
Taxonomy
Methods: ConvNeXt
