Self-Supervised Scale Recovery for Monocular Depth and Egomotion Estimation
Brandon Wagstaff, Jonathan Kelly

TL;DR
This paper introduces a novel scale recovery loss for monocular depth and egomotion estimation that enforces known camera height constraints, enabling metric predictions and better adaptability across environments.
Contribution
The paper proposes a new scale recovery loss that improves metric depth and egomotion estimation and allows network retraining in new environments.
Findings
Achieves scale recovery competitive with techniques that require more information.
Enables network retraining in new environments.
Produces more accurate egomotion estimates than test-time scale recovery methods.
Abstract
The self-supervised loss formulation for jointly training depth and egomotion neural networks with monocular images is well studied and has demonstrated state-of-the-art accuracy. One of the main limitations of this approach, however, is that the depth and egomotion estimates are only determined up to an unknown scale. In this paper, we present a novel scale recovery loss that enforces consistency between a known camera height and the estimated camera height, generating metric (scaled) depth and egomotion predictions. We show that our proposed method is competitive with other scale recovery techniques that require more information. Further, we demonstrate that our method facilitates network retraining within new environments, whereas other scale-resolving approaches are incapable of doing so. Notably, our egomotion network is able to produce more accurate estimates than a similar method…
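The abstract describes the loss only as enforcing consistency between a known camera height and an estimated one. As a rough illustration of that idea, the NumPy sketch below back-projects a depth map to 3D, fits a plane to pixels assumed to lie on the ground, and penalizes the gap between the plane's distance from the camera and the known height. The plane fit, the ground mask, and every function name here are assumptions made for illustration and are not taken from the paper; a training implementation would also need to be written in a differentiable framework.

```python
import numpy as np

def backproject(depth, K):
    """Lift every pixel to a 3D point in the camera frame (pinhole model)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).astype(float)
    rays = pixels @ np.linalg.inv(K).T          # rays with unit z-component
    return rays * depth.reshape(-1, 1)          # scale each ray by its depth

def estimate_camera_height(ground_points):
    """Fit a plane by least squares; return the camera origin's distance to it."""
    centroid = ground_points.mean(axis=0)
    _, _, vt = np.linalg.svd(ground_points - centroid, full_matrices=False)
    normal = vt[-1]                             # direction of least variance
    return abs(normal @ centroid)

def scale_recovery_loss(depth, K, ground_mask, known_height):
    """Absolute gap between estimated and known camera height (hypothetical form)."""
    points = backproject(depth, K)[ground_mask.reshape(-1)]
    return abs(estimate_camera_height(points) - known_height)

def synthetic_ground_depth(h, w, K, cam_height):
    """Depth of a flat ground plane seen by a level camera (y axis pointing down)."""
    v = np.arange(h, dtype=float).reshape(-1, 1) * np.ones((1, w))
    y_ray = (v - K[1, 2]) / K[1, 1]
    mask = y_ray > 0.05                         # rows that actually hit the ground
    depth = np.where(mask, cam_height / np.where(mask, y_ray, 1.0), 1.0)
    return depth, mask

if __name__ == "__main__":
    K = np.array([[50.0, 0.0, 32.0], [0.0, 50.0, 24.0], [0.0, 0.0, 1.0]])
    depth, mask = synthetic_ground_depth(48, 64, K, cam_height=1.65)
    print(scale_recovery_loss(depth, K, mask, 1.65))        # metric depth
    print(scale_recovery_loss(0.5 * depth, K, mask, 1.65))  # depth off by a factor of 2
```

On the synthetic scene, depth at the correct metric scale gives a loss near zero, while depth scaled by one half yields an estimated height of half the true value and a correspondingly large loss. That gap is the signal such a loss would use to push the network toward metric predictions.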
Taxonomy
Topics: Advanced Vision and Imaging · Human Pose and Action Recognition · Video Coding and Compression Technologies
