Self-Supervised Learning of Depth and Ego-Motion from Video by Alternative Training and Geometric Constraints from 3D to 2D
Jiaojiao Fang, Guizhong Liu

TL;DR
This paper introduces an alternating ("alternative") training approach for self-supervised depth and ego-motion estimation from video: rather than training the depth and pose networks jointly, it trains each task in turn and uses geometric constraints to improve performance without auxiliary tasks.
Contribution
It proposes a novel alternative training scheme with geometric constraints and a 3D structural consistency loss, enhancing depth and pose learning without auxiliary supervision.
Findings
Outperforms state-of-the-art self-supervised methods on benchmark datasets
Effectively utilizes epipolar geometry in pose estimation
Improves depth accuracy with a log-scale 3D consistency loss
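To illustrate why a log-scale comparison helps with scale ambiguity, here is a minimal sketch. It assumes one plausible form of such a loss (the mean absolute difference of log distances between matched 3D points); the function name `log_scale_consistency` and this exact form are hypothetical and are not taken from the paper.

```python
import math


def log_scale_consistency(points_a, points_b):
    """Mean |log(||p_a||) - log(||p_b||)| over matched 3D points.

    Hypothetical illustration only: the paper's actual loss is not
    given in this summary. Points are (x, y, z) tuples with non-zero norm.
    """
    if len(points_a) != len(points_b) or not points_a:
        raise ValueError("point lists must be non-empty and of equal length")
    total = 0.0
    for pa, pb in zip(points_a, points_b):
        # Distance of each point from the camera centre.
        norm_a = math.sqrt(sum(c * c for c in pa))
        norm_b = math.sqrt(sum(c * c for c in pb))
        total += abs(math.log(norm_a) - math.log(norm_b))
    return total / len(points_a)


def scale_points(points, s):
    """Multiply every coordinate of every point by the scalar s."""
    return [tuple(s * c for c in p) for p in points]


cloud_a = [(1.0, 2.0, 4.0), (0.5, -1.0, 8.0)]
cloud_b = [(1.1, 2.0, 4.2), (0.5, -1.0, 7.5)]

loss = log_scale_consistency(cloud_a, cloud_b)
loss_scaled = log_scale_consistency(scale_points(cloud_a, 10.0),
                                    scale_points(cloud_b, 10.0))
```

Because log(s·a) − log(s·b) = log(a) − log(b), `loss` and `loss_scaled` agree up to floating-point error: a global rescaling of both point clouds does not change a log-scale difference, whereas it would multiply a plain L1 distance by the same factor.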
Abstract
Self-supervised learning of depth and ego-motion from unlabeled monocular video has achieved promising results and drawn extensive attention. Most existing methods jointly train the depth and pose networks using the photometric consistency of adjacent frames, based on the principle of structure-from-motion (SFM). However, the coupling between the depth and pose networks seriously affects learning performance, and the re-projection relation is sensitive to scale ambiguity, especially for pose learning. In this paper, we aim to improve depth-pose learning performance without auxiliary tasks and to address the above issues by training each task alternately and incorporating epipolar geometric constraints into the Iterative Closest Point (ICP) based point cloud matching process. Distinct from jointly training the depth and pose networks, our key idea is to better utilize the…
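The epipolar constraint mentioned in the abstract can be sketched with the standard textbook relation x2ᵀ E x1 = 0, where E = [t]× R is the essential matrix. The snippet below is a generic illustration, not the paper's implementation; it assumes the convention X2 = R·X1 + t, normalized (intrinsics-free) image coordinates, and hypothetical helper names.

```python
def cross_matrix(t):
    """Skew-symmetric matrix [t]x such that [t]x v equals t cross v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(m, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def project(point):
    """Normalized homogeneous image coordinates (X/Z, Y/Z, 1)."""
    x, y, z = point
    return [x / z, y / z, 1.0]


def epipolar_residual(x1, x2, R, t):
    """Return x2^T E x1 with E = [t]x R; zero for a consistent match."""
    E = matmul(cross_matrix(t), R)
    Ex1 = matvec(E, x1)
    return sum(a * b for a, b in zip(x2, Ex1))


# Assumed toy motion: no rotation, translation of one unit along x.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [1.0, 0.0, 0.0]

X1 = (0.5, 0.2, 4.0)                      # 3D point in the first camera frame
X2 = tuple(p + d for p, d in zip(X1, t))  # same point in the second frame

good = epipolar_residual(project(X1), project(X2), R, t)
bad = epipolar_residual(project(X1), [0.375, 0.10, 1.0], R, t)
```

Here `good` is zero because the two projections come from the same 3D point, while `bad` is non-zero for a mismatched correspondence. Scaling `t` only rescales the residual, so the zero set of the constraint does not depend on the translation magnitude.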
Taxonomy
Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Robotics and Sensor-Based Localization
