BRIGHT-VO: Brightness-Guided Hybrid Transformer for Visual Odometry with Multi-modality Refinement Module

Dongzhihan Wang; Yang Yang; Liang Xu

arXiv:2501.08659·cs.CV·May 1, 2025

TL;DR

BrightVO is a Transformer-based visual odometry model that combines visual features with IMU data and introduces a synthetic low-light dataset to improve pose estimation accuracy in challenging lighting conditions.

Contribution

The paper introduces BrightVO, a novel Transformer-based VO framework with a multi-modality refinement module and a synthetic low-light dataset for improved robustness and accuracy.

Findings

01. Achieves 20% better pose estimation accuracy on the KITTI dataset.

02. Improves low-light pose estimation accuracy by 259%.

03. Outperforms existing VO methods across a range of lighting conditions.

Abstract

Visual odometry (VO) plays a crucial role in autonomous driving, robotic navigation, and other related tasks by estimating the position and orientation of a camera based on visual input. Significant progress has been made in data-driven VO methods, particularly those leveraging deep learning techniques to extract image features and estimate camera poses. However, these methods often struggle in low-light conditions because of the reduced visibility of features and the increased difficulty of matching keypoints. To address this limitation, we introduce BrightVO, a novel VO model based on Transformer architecture, which not only performs front-end visual feature extraction, but also incorporates a multi-modality refinement module in the back-end that integrates Inertial Measurement Unit (IMU) data. Using pose graph optimization, this module iteratively refines pose estimates to reduce…
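
The idea of refining pose estimates with a pose graph can be illustrated with a toy example. The sketch below is not the paper's refinement module: it assumes poses on a 1D line, treats visual and IMU measurements as relative-offset constraints between poses, and solves the resulting weighted linear least-squares problem in one step with NumPy. Real pose graphs use 6-DoF poses and are nonlinear, which is why they are solved iteratively. The function name `optimize_pose_graph_1d`, the edge format, and all numeric values are hypothetical.

```python
import numpy as np


def optimize_pose_graph_1d(num_poses, edges):
    """Solve a toy 1D pose graph by weighted linear least squares.

    Each edge is (i, j, offset, weight) and encodes x[j] - x[i] ~= offset.
    Pose 0 is fixed at 0.0 so the solution is unique.
    """
    rows, rhs = [], []
    for i, j, offset, weight in edges:
        row = np.zeros(num_poses)
        row[i] -= 1.0
        row[j] += 1.0
        scale = np.sqrt(weight)  # weighting a residual scales its row
        rows.append(scale * row)
        rhs.append(scale * offset)
    A = np.vstack(rows)[:, 1:]  # drop column 0: pose 0 is the fixed anchor
    b = np.asarray(rhs)
    free, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.concatenate(([0.0], free))


# Hypothetical measurements: two frame-to-frame visual offsets and one
# IMU-derived offset spanning both steps. They disagree slightly.
visual_edges = [(0, 1, 1.2, 1.0), (1, 2, 0.9, 1.0)]
imu_edges = [(0, 2, 2.0, 1.0)]
poses = optimize_pose_graph_1d(3, visual_edges + imu_edges)
print(poses)
```

With equal weights, the disagreement between the visual chain (1.2 + 0.9) and the IMU constraint (2.0) is spread across all three edges. Raising the weight of an edge pulls the solution toward that measurement, which is the basic mechanism by which an additional modality can correct drift in the visual estimates.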

Code & Models

Repositories

anastasiawd/brightvo (PyTorch)

Taxonomy

Topics: Optical measurement and interference techniques · Advanced Optical Sensing Technologies · Robotics and Sensor-Based Localization

Methods: Attention Is All You Need · Absolute Position Encodings · Adam · Residual Connection · Dropout · Softmax · Byte Pair Encoding · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer