# To complete or to estimate, that is the question: A Multi-Task Approach   to Depth Completion and Monocular Depth Estimation

**Authors:** Amir Atapour-Abarghouei, Toby P. Breckon

arXiv: 1908.05540 · 2019-08-16

## TL;DR

This paper introduces a multi-task learning model that jointly performs sparse depth completion and monocular depth estimation, leveraging synthetic and real data to improve scene understanding in applications like autonomous driving.

## Contribution

The novel multi-task approach combines depth completion and monocular depth estimation in a single end-to-end trainable model, enhancing accuracy and robustness.

## Key findings

- Outperforms state-of-the-art methods in depth estimation and completion
- Uses adversarial training for high-quality depth predictions
- Effective with both synthetic and real-world datasets

## Abstract

Robust three-dimensional scene understanding is now an ever-growing area of research highly relevant in many real-world applications such as autonomous driving and robotic navigation. In this paper, we propose a multi-task learning-based model capable of performing two tasks:- sparse depth completion (i.e. generating complete dense scene depth given a sparse depth image as the input) and monocular depth estimation (i.e. predicting scene depth from a single RGB image) via two sub-networks jointly trained end to end using data randomly sampled from a publicly available corpus of synthetic and real-world images. The first sub-network generates a sparse depth image by learning lower level features from the scene and the second predicts a full dense depth image of the entire scene, leading to a better geometric and contextual understanding of the scene and, as a result, superior performance of the approach. The entire model can be used to infer complete scene depth from a single RGB image or the second network can be used alone to perform depth completion given a sparse depth input. Using adversarial training, a robust objective function, a deep architecture relying on skip connections and a blend of synthetic and real-world training data, our approach is capable of producing superior high quality scene depth. Extensive experimental evaluation demonstrates the efficacy of our approach compared to contemporary state-of-the-art techniques across both problem domains.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.05540/full.md

## Figures

22 figures with captions in the complete paper: https://tomesphere.com/paper/1908.05540/full.md

## References

66 references — full list in the complete paper: https://tomesphere.com/paper/1908.05540/full.md

---
Source: https://tomesphere.com/paper/1908.05540