Exploring Self-Attention for Visual Odometry
Hamed Damirchi, Rooholla Khorrambakht, Hamid D. Taghirad

TL;DR
This paper investigates the use of self-attention mechanisms in visual odometry networks to improve feature extraction and odometry accuracy, especially in scenes with dynamic objects and texture-less regions.
Contribution
It introduces self-attention into visual odometry models and demonstrates its effectiveness through qualitative, quantitative, and saliency-based analyses.
Findings
Self-attention improves feature quality for odometry.
Enhanced odometry accuracy over state-of-the-art methods.
Saliency studies show better focus on relevant regions.
Abstract
Visual odometry networks commonly use pretrained optical flow networks to derive the ego-motion between consecutive frames. The features extracted by these networks represent the motion of all pixels between frames. However, because scenes contain dynamic objects and texture-less surfaces, the motion information from some image regions is not reliable for inferring odometry: the motion of dynamic objects does not reflect the camera's incremental change in position. Recent works in this area lack attention mechanisms that would allow dynamic reweighting of the feature maps to extract more refined ego-motion information. In this paper, we explore the effectiveness of self-attention in visual odometry. We report qualitative and quantitative results against the SOTA methods. Furthermore, saliency-based studies alongside specially…
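As an illustration of the dynamic reweighting of feature maps mentioned in the abstract, here is a minimal NumPy sketch of scaled dot-product self-attention applied over the spatial positions of a feature map. It is not the authors' architecture: the function name self_attention_2d, the projection matrices w_q, w_k and w_v, the residual connection, and all shapes are assumptions chosen for illustration, and the random input merely stands in for features from an optical flow network.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention_2d(features, w_q, w_k, w_v):
    """Reweight a (C, H, W) feature map with self-attention over positions.

    features: array of shape (C, H, W)
    w_q, w_k: hypothetical projection matrices of shape (C, D)
    w_v:      hypothetical projection matrix of shape (C, C)
    Returns the reweighted (C, H, W) map and the (H*W, H*W) attention matrix.
    """
    c, h, w = features.shape
    tokens = features.reshape(c, h * w).T          # (N, C), one token per position
    q = tokens @ w_q                               # queries, (N, D)
    k = tokens @ w_k                               # keys, (N, D)
    v = tokens @ w_v                               # values, (N, C)
    scores = q @ k.T / np.sqrt(q.shape[-1])        # pairwise similarity, (N, N)
    attn = softmax(scores, axis=-1)                # each row sums to 1
    out = tokens + attn @ v                        # residual connection (assumed)
    return out.T.reshape(c, h, w), attn


# Illustrative input: random values standing in for optical-flow features.
rng = np.random.default_rng(0)
features = rng.standard_normal((8, 4, 6))          # C=8, H=4, W=6
w_q = rng.standard_normal((8, 5))
w_k = rng.standard_normal((8, 5))
w_v = rng.standard_normal((8, 8))

out, attn = self_attention_2d(features, w_q, w_k, w_v)
```

Each row of attn holds the weights one spatial position assigns to every other position. In a trained network, these weights are what could allow regions with unreliable motion, such as dynamic objects, to contribute less to the pose estimate.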
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques
