BRIGHT-VO: Brightness-Guided Hybrid Transformer for Visual Odometry with Multi-modality Refinement Module
Dongzhihan Wang, Yang Yang, Liang Xu

TL;DR
BrightVO is a Transformer-based visual odometry model that pairs an IMU-aided multi-modality refinement module with a synthetic low-light dataset to improve pose estimation accuracy in challenging lighting conditions.
Contribution
The paper introduces BrightVO, a novel Transformer-based VO framework with a multi-modality refinement module, together with a synthetic low-light dataset, for improved robustness and accuracy.
Findings
Achieves 20% better pose estimation accuracy on the KITTI dataset.
Improves low-light pose estimation accuracy by 259%.
Outperforms existing VO methods in various lighting conditions.
Abstract
Visual odometry (VO) plays a crucial role in autonomous driving, robotic navigation, and other related tasks by estimating the position and orientation of a camera based on visual input. Significant progress has been made in data-driven VO methods, particularly those leveraging deep learning techniques to extract image features and estimate camera poses. However, these methods often struggle in low-light conditions because of the reduced visibility of features and the increased difficulty of matching keypoints. To address this limitation, we introduce BrightVO, a novel VO model based on the Transformer architecture, which not only performs front-end visual feature extraction but also incorporates a multi-modality refinement module in the back-end that integrates Inertial Measurement Unit (IMU) data. Using pose graph optimization, this module iteratively refines pose estimates to reduce…
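To make the idea of pose graph refinement concrete, here is a minimal sketch, not the paper's method. It assumes a toy 1D trajectory in which each edge constrains the offset between two poses, and the function name `optimize_pose_graph_1d`, the edge format, and all measurements and weights are hypothetical. Real pose graphs live on SE(3) and are solved iteratively; this linear toy has a closed-form least-squares solution.

```python
import numpy as np


def optimize_pose_graph_1d(num_poses, edges, prior_weight=1e3):
    """Solve a toy 1D pose graph by weighted linear least squares.

    edges: list of (i, j, measured_offset, weight), each meaning
    x[j] - x[i] should be close to measured_offset.
    All names and the edge format are illustrative assumptions.
    """
    rows, rhs = [], []

    # Prior that anchors pose 0 at the origin (removes the gauge freedom).
    row = np.zeros(num_poses)
    row[0] = np.sqrt(prior_weight)
    rows.append(row)
    rhs.append(0.0)

    # One weighted residual per relative-pose edge.
    for i, j, offset, weight in edges:
        scale = np.sqrt(weight)
        row = np.zeros(num_poses)
        row[i] = -scale
        row[j] = scale
        rows.append(row)
        rhs.append(scale * offset)

    A = np.vstack(rows)
    b = np.array(rhs)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x


# Hypothetical visual-odometry edges (noisier, lower weight).
vo_edges = [(0, 1, 1.2, 1.0), (1, 2, 0.9, 1.0), (2, 3, 1.1, 1.0)]
# Hypothetical IMU-derived edges (assumed more reliable, higher weight).
imu_edges = [(0, 1, 1.0, 4.0), (1, 2, 1.0, 4.0), (2, 3, 1.0, 4.0)]

refined = optimize_pose_graph_1d(4, vo_edges + imu_edges)
print(refined)  # approximately [0.0, 1.04, 2.02, 3.04]
```

In this toy, each refined step is the weighted average of the two measurements for that step, so the higher-weight constraint pulls the estimate toward itself. That is the basic mechanism by which a second modality can correct drift in visual estimates.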
Taxonomy
Topics: Optical measurement and interference techniques · Advanced Optical Sensing Technologies · Robotics and Sensor-Based Localization
Methods: Attention Is All You Need · Absolute Position Encodings · Adam · Residual Connection · Dropout · Softmax · Byte Pair Encoding · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer
