Learning Multiplicative Interactions with Bayesian Neural Networks for Visual-Inertial Odometry
Kashmira Shinde, Jongseok Lee, Matthias Humt, Aydin Sezgin, Rudolph Triebel

TL;DR
This paper introduces a multi-modal learning approach for monocular Visual-Inertial Odometry using Bayesian Neural Networks with multiplicative interactions, enhancing robustness to sensor degradation and outperforming state-of-the-art methods.
Contribution
It proposes a novel end-to-end multi-modal VIO model utilizing self-attention for multiplicative interactions and Bayesian uncertainty estimation, improving robustness and performance.
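As a rough illustration of how self-attention yields multiplicative interactions between modalities, the sketch below treats one visual and one inertial feature vector as two tokens and passes them through multi-head self-attention. The attention weights depend on products of projected features, so each output mixes the streams multiplicatively rather than by plain concatenation. This is a minimal NumPy sketch under assumptions, not the authors' network: the function name `multi_head_self_attention`, the feature size, the head count, and the random weights are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(tokens, w_q, w_k, w_v, num_heads):
    """Hypothetical helper. tokens: (n_tokens, d_model); w_*: (d_model, d_model)."""
    n, d = tokens.shape
    d_head = d // num_heads
    # Project the tokens, then split into heads: (num_heads, n_tokens, d_head).
    q = (tokens @ w_q).reshape(n, num_heads, d_head).transpose(1, 0, 2)
    k = (tokens @ w_k).reshape(n, num_heads, d_head).transpose(1, 0, 2)
    v = (tokens @ w_v).reshape(n, num_heads, d_head).transpose(1, 0, 2)
    # Query-key dot products: this is where features of one stream
    # multiply features of the other.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)  # (heads, n, n)
    weights = softmax(scores, axis=-1)
    out = weights @ v  # (heads, n, d_head)
    # Merge the heads back into (n_tokens, d_model).
    return out.transpose(1, 0, 2).reshape(n, d), weights


rng = np.random.default_rng(0)
d_model, num_heads = 8, 2              # assumed sizes, for illustration only
visual = rng.normal(size=d_model)      # stand-in for a visual feature vector
inertial = rng.normal(size=d_model)    # stand-in for an inertial feature vector
tokens = np.stack([visual, inertial])  # two tokens, one per modality

w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(3))
fused, weights = multi_head_self_attention(tokens, w_q, w_k, w_v, num_heads)
```

Here `weights[h, i, j]` is how strongly token `i` attends to token `j` in head `h`. If one modality degrades, a trained model could in principle shift these weights toward the other stream.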
Findings
Outperforms end-to-end state-of-the-art methods on the KITTI dataset.
Demonstrates increased robustness to sensor failures.
Provides empirical evidence for the effectiveness of multiplicative interactions.
Abstract
This paper presents an end-to-end multi-modal learning approach for monocular Visual-Inertial Odometry (VIO), specifically designed to exploit sensor complementarity in light of sensor degradation scenarios. The proposed network uses a multi-head self-attention mechanism that learns multiplicative interactions between multiple streams of information. Another design feature of our approach is the incorporation of model uncertainty using a scalable Laplace Approximation. We evaluate the proposed approach by comparing it against end-to-end state-of-the-art methods on the KITTI dataset and show that it achieves superior performance. Importantly, our work thereby provides empirical evidence that learning multiplicative interactions can result in a powerful inductive bias for increased robustness to sensor failures.
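For readers unfamiliar with the Laplace Approximation, its generic textbook form is sketched below. The abstract does not say which scalable variant is used, so the symbols here ($\theta$ for the network weights, $\mathcal{D}$ for the training data, $H$ for the Hessian, $T$ for the number of samples) are assumptions for illustration, and any structured approximation of $H$ is not shown.

```latex
% Gaussian approximation of the weight posterior around the MAP estimate
p(\theta \mid \mathcal{D}) \approx \mathcal{N}\!\left(\theta;\ \theta_{\mathrm{MAP}},\ H^{-1}\right),
\qquad
H = -\nabla_{\theta}^{2} \log p(\theta \mid \mathcal{D}) \,\Big|_{\theta = \theta_{\mathrm{MAP}}}

% Predictive distribution estimated by Monte Carlo sampling of the weights
p(y \mid x, \mathcal{D}) \approx \frac{1}{T} \sum_{t=1}^{T} p(y \mid x, \theta_{t}),
\qquad
\theta_{t} \sim \mathcal{N}\!\left(\theta_{\mathrm{MAP}},\ H^{-1}\right)
```

The spread of the sampled predictions then serves as an estimate of model uncertainty. Scalable variants typically replace the full Hessian with a structured approximation, since $H$ is too large to store or invert for deep networks.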
Taxonomy
Topics: Robotics and Sensor-Based Localization · Advanced Vision and Imaging · Indoor and Outdoor Localization Technologies
