Multimodal Scale Consistency and Awareness for Monocular Self-Supervised Depth Estimation

Hemang Chawla; Arnav Varma; Elahe Arani; Bahram Zonooz

arXiv:2103.02451 · cs.CV · February 3, 2023

TL;DR

This paper introduces a GPS-based loss for monocular self-supervised depth estimation that makes the predicted depth scale-consistent and scale-aware. GPS data is used only during training and is not required at inference.

Contribution

The paper proposes a dynamically-weighted GPS-to-Scale (g2s) loss that complements the appearance-based losses. The loss uses GPS data during training only, and the scale signal it provides is independent of the camera setup and scene distribution.

Findings

01. Improved scale-consistent depth estimation, demonstrated across multiple datasets.

02. Enhanced depth accuracy even with low-frequency GPS data during training.

03. The method does not require GPS at inference, making it practical for real-world applications.

Abstract

Dense depth estimation is essential to scene-understanding for autonomous driving. However, recent self-supervised approaches on monocular videos suffer from scale-inconsistency across long sequences. Utilizing data from the ubiquitously copresent global positioning systems (GPS), we tackle this challenge by proposing a dynamically-weighted GPS-to-Scale (g2s) loss to complement the appearance-based losses. We emphasize that the GPS is needed only during the multimodal training, and not at inference. The relative distance between frames captured through the GPS provides a scale signal that is independent of the camera setup and scene distribution, resulting in richer learned feature representations. Through extensive evaluation on multiple datasets, we demonstrate scale-consistent and -aware depth estimation during inference, improving the performance even when training with…
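
The idea in the abstract, using the GPS distance between two frames as a metric scale signal alongside the appearance-based losses, can be illustrated with a minimal sketch. This is not the authors' implementation (the official code is in the NeurAI-Lab/G2S repository). Everything here is an assumption made for illustration: the function names, the haversine conversion from latitude/longitude to metres, the squared-error form of the penalty, and the linear ramp used as the "dynamic" weight.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


def gps_distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres between two GPS fixes.

    Assumption: the source only says GPS gives the relative distance
    between frames; haversine is one simple way to obtain it.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def g2s_style_loss(pred_translation, gps_dist_m):
    """Squared gap between the predicted inter-frame translation magnitude
    and the GPS-measured distance (hypothetical form of the penalty)."""
    pred_norm = math.sqrt(sum(c * c for c in pred_translation))
    return (pred_norm - gps_dist_m) ** 2


def ramp_weight(epoch, total_epochs, max_weight=1.0):
    """Hypothetical dynamic weight: linear ramp from 0 to max_weight.

    The paper's actual weighting schedule is not given in this summary.
    """
    return max_weight * min(1.0, epoch / max(1, total_epochs - 1))


def total_loss(appearance_loss, pred_translation, gps_dist_m,
               epoch, total_epochs):
    """Appearance loss plus the dynamically weighted scale penalty."""
    weight = ramp_weight(epoch, total_epochs)
    return appearance_loss + weight * g2s_style_loss(pred_translation,
                                                     gps_dist_m)
```

In this sketch the GPS term only enters `total_loss`, which is evaluated during training. A depth network trained this way would take images alone at inference, consistent with the abstract's statement that GPS is not needed then.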

Code & Models

Repositories

NeurAI-Lab/G2S (PyTorch, official)
