SelfOdom: Self-supervised Egomotion and Depth Learning via Bi-directional Coarse-to-Fine Scale Recovery
Hao Qu, Lilian Zhang, Xiaoping Hu, Xiaofeng He, Xianfei Pan, Changhao, Chen

TL;DR
SelfOdom is a self-supervised framework that accurately estimates pose and depth with global scale from monocular images, using a novel coarse-to-fine training strategy and inertial data fusion, excelling in diverse lighting conditions.
Contribution
It introduces a dual-network, self-supervised approach with a coarse-to-fine scale recovery method and inertial data integration for robust monocular odometry.
Findings
Outperforms traditional and learning-based VO and VIO models.
Effective in challenging lighting conditions, including night scenes.
Achieves accurate global scale pose and depth estimation.
Abstract
Accurately perceiving location and scene is crucial for autonomous driving and mobile robots. Recent advances in deep learning have made it possible to learn egomotion and depth from monocular images in a self-supervised manner, without requiring highly precise labels to train the networks. However, monocular vision methods suffer from a limitation known as scale-ambiguity, which restricts their application when absolute-scale is necessary. To address this, we propose SelfOdom, a self-supervised dual-network framework that can robustly and consistently learn and generate pose and depth estimates in global scale from monocular images. In particular, we introduce a novel coarse-to-fine training strategy that enables the metric scale to be recovered in a two-stage process. Furthermore, SelfOdom is flexible and can incorporate inertial data with images, which improves its robustness in…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRobotics and Sensor-Based Localization · Advanced Neural Network Applications · Video Surveillance and Tracking Methods
MethodsSelf-Learning
