Enhancing Monocular Depth Estimation with Multi-Source Auxiliary Tasks
Alessio Quercia, Erenus Yildiz, Zhuo Cao, Kai Krajsek, Abigail Morrison, Ira Assent, Hanno Scharr

TL;DR
This paper introduces a method that leverages auxiliary datasets from related vision tasks to improve monocular depth estimation, achieving significant quality gains and data efficiency even with limited high-quality labeled data.
Contribution
It proposes an alternating training scheme with a shared decoder that effectively incorporates auxiliary datasets, especially semantic segmentation, to enhance MDE performance.
Findings
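MDE quality improves by approximately 11% on average when in-domain auxiliary datasets and tasks are added.
Treating semantic segmentation datasets as Multi-Label Dense Classification (MLDC) often yields additional gains.
The method reduces MDE dataset size by at least 80% while maintaining quality.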
Abstract
Monocular depth estimation (MDE) is a challenging task in computer vision, often hindered by the cost and scarcity of high-quality labeled datasets. We tackle this challenge using auxiliary datasets from related vision tasks for an alternating training scheme with a shared decoder built on top of a pre-trained vision foundation model, while giving a higher weight to MDE. Through extensive experiments we demonstrate the benefits of incorporating various in-domain auxiliary datasets and tasks to improve MDE quality on average by ~11%. Our experimental analysis shows that auxiliary tasks have different impacts, confirming the importance of task selection, highlighting that quality gains are not achieved by merely adding data. Remarkably, our study reveals that using semantic segmentation datasets as Multi-Label Dense Classification (MLDC) often results in additional quality gains. Lastly,…
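The alternating training scheme described in the abstract can be pictured with a small scheduling sketch. This is a minimal illustration and not the authors' implementation: the abstract does not say whether the higher weight for MDE is applied to the loss, to the sampling frequency, or to both, so the sketch shows both options as assumptions. The names `alternating_schedule`, `weighted_loss`, `primary_every`, and `primary_weight`, as well as the task labels `"semseg"` and `"other_aux"`, are hypothetical.

```python
import itertools


def alternating_schedule(tasks, primary, primary_every=2, steps=6):
    """Yield one task name per training step.

    Assumption: the primary task (MDE) is visited on every
    `primary_every`-th step, and the auxiliary tasks fill the
    remaining steps in round-robin order.
    """
    aux = itertools.cycle([t for t in tasks if t != primary])
    for step in range(steps):
        yield primary if step % primary_every == 0 else next(aux)


def weighted_loss(task, raw_loss, primary, primary_weight=2.0):
    """Scale the loss of the primary task by `primary_weight` (assumed value).

    Auxiliary task losses are left unscaled.
    """
    return raw_loss * (primary_weight if task == primary else 1.0)


# Hypothetical task names; each step would feed one batch of the chosen
# task through the shared backbone and decoder.
schedule = list(alternating_schedule(["mde", "semseg", "other_aux"], "mde"))
print(schedule)
```

In a real training loop, each scheduled step would draw a batch from the corresponding dataset, run it through the pre-trained backbone and shared decoder, and back-propagate the task loss scaled as in `weighted_loss`.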
Taxonomy
Topics: Advanced Vision and Imaging · Optical measurement and interference techniques · Industrial Vision Systems and Defect Detection
