An Attention-Based Deep Learning Architecture for Real-Time Monocular Visual Odometry: Applications to GPS-free Drone Navigation

Olivier Brochu Dufour; Abolfazl Mohebbi; Sofiane Achiche

arXiv:2404.17745 · cs.RO · April 30, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a real-time deep learning model with attention mechanisms for monocular visual odometry, enabling GPS-free drone navigation with improved accuracy and efficiency.

Contribution

It proposes a novel deep neural architecture combining CNN, LSTM, and self-attention for real-time monocular visual odometry in drones, outperforming previous models.
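
To illustrate the self-attention component named above, here is a minimal sketch of single-head scaled dot-product self-attention over a short sequence of per-frame feature vectors. It is not the paper's implementation: the layer sizes, learned query/key/value projections, and the CNN and LSTM stages are not described on this page, so the sketch assumes identity projections and uses made-up 2-D vectors (`frames`) as stand-ins for recurrent features. The function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Single-head scaled dot-product self-attention.

    Assumption: queries, keys, and values are the inputs themselves
    (identity projections); a trained model would use learned linear layers.
    seq is a list of T feature vectors, each a list of d floats.
    Returns (outputs, weights), where weights[i][t] is how much
    time step i attends to time step t.
    """
    d = len(seq[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for q in seq:
        # Similarity of this step's query to every step's key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in seq]
        w = softmax(scores)
        # Attention-weighted sum of the value vectors.
        out = [sum(w[t] * seq[t][j] for t in range(len(seq))) for j in range(d)]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


# Hypothetical per-frame features for three consecutive frames.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended, attn = self_attention(frames)
```

Each row of `attn` sums to 1, so every output vector is a convex combination of the input features. In a pose-regression setting, such re-weighted features would typically feed a final linear layer that predicts translation and rotation, though the exact head used in the paper is not stated here.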

Findings

01 Converged 48% faster than previous RNN models.

02 Reduced mean translational drift by 22%.

03 Improved trajectory accuracy by 12%.

Abstract

Drones are increasingly used in fields like industry, medicine, research, disaster relief, defense, and security. Technical challenges, such as navigation in GPS-denied environments, hinder further adoption. Research in visual odometry is advancing, potentially solving GPS-free navigation issues. Traditional visual odometry methods use geometry-based pipelines which, while popular, often suffer from error accumulation and high computational demands. Recent studies utilizing deep neural networks (DNNs) have shown improved performance, addressing these drawbacks. Deep visual odometry typically employs convolutional neural networks (CNNs) and sequence modeling networks like recurrent neural networks (RNNs) to interpret scenes and deduce visual odometry from video sequences. This paper presents a novel real-time monocular visual odometry model for drones, using a deep neural architecture…

Taxonomy

Topics: Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms · Advanced Image and Video Retrieval Techniques

Methods: Attention Is All You Need · Softmax · Linear Layer · Multi-Head Attention