Learning Interpretable BEV Based VIO without Deep Neural Networks
Zexi Chen, Haozhe Du, Xuecheng Xu, Rong Xiong, Yiyi Liao, Yue Wang

TL;DR
This paper introduces a fully differentiable, interpretable bird-eye-view (BEV) based visual-inertial odometry (VIO) model that does not rely on deep neural networks, yet can still be trained end-to-end and generalizes to unseen scenes.
Contribution
The authors propose a novel BEV-based VIO model using differentiable Kalman filtering and camera projection, eliminating the need for deep neural networks and manual parameter tuning.
Findings
Achieves performance competitive with state-of-the-art methods.
Generalizes well to unseen scenes.
Can be trained end-to-end without deep neural networks.
Abstract
Monocular visual-inertial odometry (VIO) is a critical problem in robotics and autonomous driving. Traditional methods solve this problem based on filtering or optimization. While fully interpretable, they rely on manual intervention and empirical parameter tuning. Learning-based approaches, on the other hand, allow for end-to-end training but require a large amount of training data to learn millions of parameters, and their heavy, non-interpretable models hinder generalization. In this paper, we propose a fully differentiable and interpretable bird-eye-view (BEV) based VIO model for robots with local planar motion that can be trained without deep neural networks. Specifically, we first adopt an Unscented Kalman Filter as a differentiable layer to predict the pitch and roll, where the covariance matrices of noise are learned to filter out the noise of the IMU raw…
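The prediction step of an Unscented Kalman Filter over roll and pitch can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: the function names (`cholesky`, `unscented_transform`, `predict_attitude`), the `kappa` weighting, and the use of standard Euler-angle kinematics driven by gyro rates are all assumptions made for the example. The process-noise covariance `Q` stands in for the kind of noise covariance the abstract says is learned; here it is simply passed in as a fixed value.

```python
import math

def cholesky(P):
    """Lower-triangular L with L L^T = P (P symmetric positive definite)."""
    n = len(P)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(P[i][i] - s)
            else:
                L[i][j] = (P[i][j] - s) / L[j][j]
    return L

def unscented_transform(mean, cov, f, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through f using 2n+1 sigma points."""
    n = len(mean)
    L = cholesky(cov)
    scale = math.sqrt(n + kappa)
    # Sigma points: the mean, plus/minus each scaled column of L.
    points = [list(mean)]
    for i in range(n):
        col = [scale * L[r][i] for r in range(n)]
        points.append([m + c for m, c in zip(mean, col)])
        points.append([m - c for m, c in zip(mean, col)])
    weights = [kappa / (n + kappa)] + [0.5 / (n + kappa)] * (2 * n)
    ys = [f(p) for p in points]
    d = len(ys[0])
    y_mean = [sum(w * y[a] for w, y in zip(weights, ys)) for a in range(d)]
    y_cov = [[sum(w * (y[a] - y_mean[a]) * (y[b] - y_mean[b])
                  for w, y in zip(weights, ys))
              for b in range(d)] for a in range(d)]
    return y_mean, y_cov

def predict_attitude(mean, cov, gyro, dt, Q):
    """UKF prediction for state [roll, pitch] (assumed motion model).

    gyro = (p, q, r) are body angular rates in rad/s. Q is the
    process-noise covariance, a fixed stand-in for a learned parameter.
    """
    p, q, r = gyro

    def step(x):
        roll, pitch = x
        # Standard Euler-angle kinematics, integrated with one Euler step.
        roll_dot = p + (q * math.sin(roll) + r * math.cos(roll)) * math.tan(pitch)
        pitch_dot = q * math.cos(roll) - r * math.sin(roll)
        return [roll + dt * roll_dot, pitch + dt * pitch_dot]

    m, P = unscented_transform(mean, cov, step)
    P = [[P[a][b] + Q[a][b] for b in range(2)] for a in range(2)]
    return m, P
```

Every operation above (square roots, sums, products) is differentiable in `Q` and in the state, which is what would let such a layer sit inside an end-to-end training loop when written in an automatic-differentiation framework. The measurement update, which the truncated abstract does not describe here, is omitted.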
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
