TransFusionOdom: Interpretable Transformer-based LiDAR-Inertial Fusion Odometry Estimation
Leyuan Sun, Guanqun Ding, Yue Qiu, Yusuke Yoshiyasu, Fumio Kanehiro

TL;DR
This paper introduces TransFusionOdom, an end-to-end Transformer-based framework for LiDAR-Inertial odometry that adaptively fuses multi-modal data, interprets modality interactions, and demonstrates superior performance on benchmark datasets.
Contribution
The work proposes a novel Transformer-based fusion framework with multi-attention modules for multi-modal odometry, including visualization and validation on synthetic and real datasets.
Findings
Achieves superior odometry accuracy on the KITTI dataset.
Effectively interprets multi-modal interactions through visualization.
Validates generalization with a synthetic multi-modal dataset.
Abstract
Multi-modal sensor fusion is a commonly used approach to enhance the performance of odometry estimation, which is a fundamental module for mobile robots. However, the question of how to perform fusion among different modalities in a supervised sensor fusion odometry estimation task remains a challenging open issue. Simple operations such as element-wise summation and concatenation cannot assign adaptive attentional weights to incorporate different modalities efficiently, which makes it difficult to achieve competitive odometry results. Recently, the Transformer architecture has shown potential for multi-modal fusion tasks, particularly in the domain of vision and language. In this work, we propose an end-to-end supervised Transformer-based LiDAR-Inertial fusion framework (namely TransFusionOdom) for odometry estimation. The…
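To illustrate the contrast the abstract draws between fixed fusion operations and adaptive attentional weighting, here is a minimal NumPy sketch. It is not the TransFusionOdom architecture: the feature vectors, the dimension `d`, the function names, and the random projection matrices `w_q`, `w_k`, `w_v` (standing in for learned weights) are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def fuse_sum(lidar, imu):
    # Fixed fusion: both modalities always contribute equally.
    return lidar + imu

def fuse_concat(lidar, imu):
    # Fixed fusion: features are placed side by side, with no weighting.
    return np.concatenate([lidar, imu], axis=-1)

def fuse_attention(lidar, imu, w_q, w_k, w_v):
    # Treat each modality feature as one token: shape (2, d).
    tokens = np.stack([lidar, imu], axis=0)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    # Scaled dot-product attention; the weights depend on the input itself.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores, axis=-1)      # shape (2, 2), rows sum to 1
    fused = (weights @ v).mean(axis=0)      # pool the two attended tokens
    return fused, weights

# Hypothetical inputs: random stand-ins for encoder outputs.
rng = np.random.default_rng(0)
d = 8
lidar_feat = rng.normal(size=d)
imu_feat = rng.normal(size=d)
w_q, w_k, w_v = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(3))

summed = fuse_sum(lidar_feat, imu_feat)
concatenated = fuse_concat(lidar_feat, imu_feat)
fused, attn = fuse_attention(lidar_feat, imu_feat, w_q, w_k, w_v)
```

In this sketch, `attn` changes with the input features, whereas summation and concatenation combine the modalities the same way for every sample. A matrix like `attn` is also the kind of quantity that can be visualized to inspect how modalities interact.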
Taxonomy
Topics: Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms · Robot Manipulation and Learning
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing · Dropout · Residual Connection · Softmax · Byte Pair Encoding
