TransFusionOdom: Interpretable Transformer-based LiDAR-Inertial Fusion Odometry Estimation

Leyuan Sun; Guanqun Ding; Yue Qiu; Yusuke Yoshiyasu; Fumio Kanehiro

arXiv:2304.07728·cs.RO·March 20, 2025·1 cites

TL;DR

This paper introduces TransFusionOdom, an end-to-end Transformer-based framework for LiDAR-Inertial odometry that adaptively fuses multi-modal data, interprets modality interactions, and demonstrates superior performance on benchmark datasets.

Contribution

The work proposes a novel Transformer-based fusion framework with multi-attention modules for multi-modal odometry, including visualization and validation on synthetic and real datasets.

Findings

1. Achieves superior odometry accuracy on the KITTI dataset.
2. Effectively interprets multi-modal interactions through visualization.
3. Validates generalization with a synthetic multi-modal dataset.

Abstract

Multi-modal sensor fusion is a commonly used approach to enhance the performance of odometry estimation, which is a fundamental module for mobile robots. However, the question of how to perform fusion among different modalities in a supervised sensor fusion odometry estimation task remains a challenging issue. Simple operations, such as element-wise summation and concatenation, cannot assign adaptive attentional weights to incorporate different modalities efficiently, which makes it difficult to achieve competitive odometry results. Recently, the Transformer architecture has shown potential for multi-modal fusion tasks, particularly in the vision-and-language domain. In this work, we propose an end-to-end supervised Transformer-based LiDAR-Inertial fusion framework (namely TransFusionOdom) for odometry estimation. The…
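The contrast the abstract draws between fixed fusion operators and adaptive attentional weighting can be illustrated with a small sketch. This is not the TransFusionOdom architecture: the feature size `d`, the random projection matrices `w_q`, `w_k`, `w_v`, and the function names are all hypothetical, and the attention shown is plain single-head scaled dot-product attention over two modality tokens. It only aims to show that summation and concatenation apply the same mixing rule to every input, while attention derives its weights from the inputs themselves.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def fuse_sum(lidar_feat, imu_feat):
    # Fixed fusion: every modality always contributes with weight 1.
    return lidar_feat + imu_feat

def fuse_concat(lidar_feat, imu_feat):
    # Fixed fusion: features are stacked side by side, no weighting at all.
    return np.concatenate([lidar_feat, imu_feat], axis=-1)

def fuse_attention(lidar_feat, imu_feat, w_q, w_k, w_v):
    # Treat each modality feature as one token: shape (2, d).
    tokens = np.stack([lidar_feat, imu_feat], axis=0)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    # Scaled dot-product scores between the two modality tokens: shape (2, 2).
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Input-dependent weights; each row sums to 1.
    weights = softmax(scores, axis=-1)
    # Mix the value vectors, then average the two attended tokens (an
    # assumption made here to obtain a single fused vector).
    fused = (weights @ v).mean(axis=0)
    return fused, weights

# Hypothetical toy inputs standing in for encoded LiDAR and IMU features.
rng = np.random.default_rng(0)
d = 8
lidar_feat = rng.normal(size=d)
imu_feat = rng.normal(size=d)
w_q, w_k, w_v = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))

fused, weights = fuse_attention(lidar_feat, imu_feat, w_q, w_k, w_v)
print(weights)  # 2x2 matrix of cross-modality attention weights
```

The `weights` matrix is the kind of quantity that can be visualized to inspect how strongly each modality attends to the other; in a trained model the projections would be learned rather than random.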

Code & Models

Repositories

rakugenson/multi-modal-dataset-for-odometry-estimation (Official)

Taxonomy

Topics: Robotics and Sensor-Based Localization · Robotic Path Planning Algorithms · Robot Manipulation and Learning

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Label Smoothing · Dropout · Residual Connection · Softmax · Byte Pair Encoding