An Attention-Based Deep Learning Architecture for Real-Time Monocular Visual Odometry: Applications to GPS-free Drone Navigation
Olivier Brochu Dufour, Abolfazl Mohebbi, Sofiane Achiche

TL;DR
This paper introduces a real-time deep learning model with attention mechanisms for monocular visual odometry, enabling GPS-free drone navigation with improved accuracy and efficiency.
Contribution
The paper proposes a novel deep neural architecture that combines a CNN, an LSTM, and self-attention for real-time monocular visual odometry on drones, and reports faster convergence, lower translational drift, and higher trajectory accuracy than previous models.
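To make the self-attention component concrete, here is a minimal single-head sketch in plain Python. It is not the authors' layer. It assumes identity projections (queries, keys, and values are the per-frame features themselves, with no learned weights), and the three two-dimensional `frame_features` vectors are hypothetical stand-ins for the features a CNN would produce for each frame.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features):
    """Single-head scaled dot-product self-attention.

    `features` is a list of equal-length vectors, one per frame.
    Assumption: identity projections stand in for the learned
    query/key/value matrices of a real attention layer.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)
    attended = []
    for query in features:
        # Similarity of this frame to every frame in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / scale
            for key in features
        ]
        weights = softmax(scores)
        # Each output is a weighted average of all frame features.
        attended.append([
            sum(w * value[i] for w, value in zip(weights, features))
            for i in range(dim)
        ])
    return attended


# Hypothetical per-frame feature vectors (illustration only).
frame_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(frame_features)
```

Each output vector is a convex combination of the input features, so every frame's representation can draw on every other frame in the window. A multi-head layer would run several such attention operations in parallel on learned projections and concatenate the results.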
Findings
Converged 48% faster than previous RNN models.
Reduced mean translational drift by 22%.
Improved trajectory accuracy by 12%.
Abstract
Drones are increasingly used in fields like industry, medicine, research, disaster relief, defense, and security. Technical challenges, such as navigation in GPS-denied environments, hinder further adoption. Research in visual odometry is advancing, potentially solving GPS-free navigation issues. Traditional visual odometry methods use geometry-based pipelines which, while popular, often suffer from error accumulation and high computational demands. Recent studies utilizing deep neural networks (DNNs) have shown improved performance, addressing these drawbacks. Deep visual odometry typically employs convolutional neural networks (CNNs) and sequence modeling networks like recurrent neural networks (RNNs) to interpret scenes and deduce visual odometry from video sequences. This paper presents a novel real-time monocular visual odometry model for drones, using a deep neural architecture…
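The error accumulation mentioned in the abstract follows from how odometry works in general: frame-to-frame motion estimates are chained into a trajectory, so a small per-frame error compounds. The sketch below illustrates this under simplifying assumptions that are not from the paper: motion is planar (x, y, heading), and the only error is a hypothetical constant heading bias of 0.01 rad per frame.

```python
import math


def integrate(relative_motions, start=(0.0, 0.0, 0.0)):
    """Chain body-frame motions (dx, dy, dtheta) into global poses."""
    x, y, theta = start
    trajectory = [start]
    for dx, dy, dtheta in relative_motions:
        # Rotate the body-frame step into the global frame, then apply it.
        x += dx * math.cos(theta) - dy * math.sin(theta)
        y += dx * math.sin(theta) + dy * math.cos(theta)
        theta += dtheta
        trajectory.append((x, y, theta))
    return trajectory


def position_error(pose_a, pose_b):
    """Euclidean distance between the positions of two poses."""
    return math.hypot(pose_a[0] - pose_b[0], pose_a[1] - pose_b[1])


# Ground truth: 50 straight steps of one unit each.
true_steps = [(1.0, 0.0, 0.0)] * 50
# Estimate: same steps with a hypothetical 0.01 rad heading bias per frame.
biased_steps = [(1.0, 0.0, 0.01)] * 50

truth = integrate(true_steps)
estimate = integrate(biased_steps)
errors = [position_error(t, e) for t, e in zip(truth, estimate)]
```

Here `errors` starts at zero and keeps growing along the trajectory even though the per-frame bias is constant, which is the drift behaviour that translational drift metrics are meant to capture.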
Taxonomy
Topics: Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms · Advanced Image and Video Retrieval Techniques
Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention
