NimbleD: Enhancing Self-supervised Monocular Depth Estimation with Pseudo-labels and Large-scale Video Pre-training

Albert Luginov; Muhammad Shahzad

arXiv:2408.14177·cs.CV·August 27, 2024

TL;DR

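NimbleD is a self-supervised learning framework for fast, lightweight monocular depth estimation models. It adds supervision from pseudo-labels generated by a large vision model and, because it requires no camera intrinsics, can pre-train on large collections of publicly available videos, reaching performance comparable to state-of-the-art self-supervised methods.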

Contribution

It introduces a novel, efficient self-supervised learning approach that leverages pseudo-labels and large-scale video pre-training, eliminating the need for camera intrinsics.

Findings

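01 Achieves performance comparable to state-of-the-art self-supervised monocular depth estimation models

02 Enables large-scale pre-training on publicly available videos, since no camera intrinsics are required

03 Adds no inference overhead, keeping latency low enough for AR/VR applications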

Abstract

We introduce NimbleD, an efficient self-supervised monocular depth estimation learning framework that incorporates supervision from pseudo-labels generated by a large vision model. This framework does not require camera intrinsics, enabling large-scale pre-training on publicly available videos. Our straightforward yet effective learning strategy significantly enhances the performance of fast and lightweight models without introducing any overhead, allowing them to achieve performance comparable to state-of-the-art self-supervised monocular depth estimation models. This advancement is particularly beneficial for virtual and augmented reality applications requiring low-latency inference. The source code, model weights, and acknowledgments are available at https://github.com/xapaxca/nimbled.
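
As a rough illustration of how pseudo-label supervision can be added to a self-supervised objective, the sketch below combines a photometric term with a pseudo-label term. This page does not state the loss NimbleD uses, so everything here is an assumption made for illustration: the function names, the scale-and-shift alignment, the L1 error, and the weight of 0.1 are hypothetical and are not taken from the paper or its repository.

```python
from statistics import fmean


def align_scale_shift(pred, target):
    """Least-squares scale s and shift t such that s * pred + t fits target.

    Assumption: pseudo-labels from a large vision model are treated as
    relative depth, so the prediction is aligned before it is compared.
    """
    mean_pred, mean_target = fmean(pred), fmean(target)
    var = sum((p - mean_pred) ** 2 for p in pred)
    if var == 0.0:
        return 0.0, mean_target  # constant prediction: only a shift can be fitted
    cov = sum((p - mean_pred) * (q - mean_target) for p, q in zip(pred, target))
    scale = cov / var
    return scale, mean_target - scale * mean_pred


def pseudo_label_loss(pred, pseudo):
    """Mean absolute error between the aligned prediction and the pseudo-label."""
    scale, shift = align_scale_shift(pred, pseudo)
    return fmean([abs(scale * p + shift - q) for p, q in zip(pred, pseudo)])


def total_loss(photometric, pred, pseudo, weight=0.1):
    """Hypothetical combined objective: photometric term plus weighted pseudo-label term.

    `photometric` stands in for the usual view-synthesis loss of a
    self-supervised pipeline, computed elsewhere; `weight` is illustrative.
    """
    return photometric + weight * pseudo_label_loss(pred, pseudo)


# Flattened depth values for a tiny 2x2 "image" (made-up numbers).
pred = [1.0, 2.0, 3.0, 4.0]
pseudo = [2.0 * p + 1.0 for p in pred]  # same structure, different scale and shift
loss = total_loss(photometric=0.5, pred=pred, pseudo=pseudo)
```

Because the alignment absorbs scale and shift, a prediction that matches the pseudo-label up to an affine transform incurs no pseudo-label penalty in this sketch, and the extra term is needed only during training, not at inference.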

Code & Models

Repositories

xapaxca/nimbled (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques