# Neural RGB->D Sensing: Depth and Uncertainty from a Video Camera

**Authors:** Chao Liu, Jinwei Gu, Kihwan Kim, Srinivasa Narasimhan, Jan Kautz

arXiv: 1901.02571 · 2019-01-10

## TL;DR

This paper introduces a deep learning approach that transforms a monocular video into a reliable RGB-D sensing system by estimating per-pixel depth distributions and accumulating them over time to improve accuracy and stability.

## Contribution

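The paper presents a method for continuous depth and uncertainty estimation from monocular video. For each frame it predicts a per-pixel depth probability distribution rather than a single depth value, and it accumulates these distributions over time with Bayesian filtering. The authors report more accurate and stable results than prior work, and better generalization to new datasets.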
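As a rough illustration of what accumulating distributions under Bayesian filtering means, the textbook recursive Bayes filter is written below. This is the generic form, not necessarily the paper's exact formulation; the symbols $d_t$ (depth of a pixel at frame $t$) and $I_{1:t}$ (frames observed up to time $t$) are notation assumed here for illustration.

$$
p(d_t \mid I_{1:t-1}) = \sum_{d_{t-1}} p(d_t \mid d_{t-1})\, p(d_{t-1} \mid I_{1:t-1})
$$

$$
p(d_t \mid I_{1:t}) \propto p(I_t \mid d_t)\, p(d_t \mid I_{1:t-1})
$$

The first line predicts the depth distribution for the new frame from the previous belief; the second line updates that prediction with the evidence from the new frame. Each update that agrees with the prior concentrates the distribution, which is how uncertainty shrinks as more frames arrive.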

## Key findings

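- Reports more accurate depth estimates than prior work, with better generalization to new datasets.
- Accumulating depth probability volumes over time reduces depth uncertainty and improves robustness and temporal stability.
- The output can be fed directly into classical RGB-D based 3D scanning methods for 3D scene reconstruction.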

## Abstract

Depth sensing is crucial for 3D reconstruction and scene understanding. Active depth sensors provide dense metric measurements, but often suffer from limitations such as restricted operating ranges, low spatial resolution, sensor interference, and high power consumption. In this paper, we propose a deep learning (DL) method to estimate per-pixel depth and its uncertainty continuously from a monocular video stream, with the goal of effectively turning an RGB camera into an RGB-D camera. Unlike prior DL-based methods, we estimate a depth probability distribution for each pixel rather than a single depth value, leading to an estimate of a 3D depth probability volume for each input frame. These depth probability volumes are accumulated over time under a Bayesian filtering framework as more incoming frames are processed sequentially, which effectively reduces depth uncertainty and improves accuracy, robustness, and temporal stability. Compared to prior work, the proposed approach achieves more accurate and stable results, and generalizes better to new datasets. Experimental results also show the output of our approach can be directly fed into classical RGB-D based 3D scanning methods for 3D scene reconstruction.
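The accumulation idea in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the per-frame volumes are synthetic Gaussians standing in for a network's output, both frames are assumed to be already aligned to the same camera view, and all function names (`normalize`, `bayes_update`, `depth_and_uncertainty`, `gaussian_volume`) are hypothetical.

```python
import numpy as np

def normalize(volume, eps=1e-12):
    """Rescale a (D, H, W) volume so each pixel's depth probabilities sum to 1."""
    total = volume.sum(axis=0, keepdims=True)
    return volume / np.maximum(total, eps)

def bayes_update(prior, likelihood):
    """Generic Bayesian fusion: posterior is proportional to prior * likelihood.

    Assumes both volumes are expressed in the same camera view.
    """
    return normalize(prior * likelihood)

def depth_and_uncertainty(volume, depth_bins):
    """Per-pixel expected depth and standard deviation from a (D, H, W) volume."""
    d = depth_bins[:, None, None]
    mean = (volume * d).sum(axis=0)
    var = (volume * (d - mean) ** 2).sum(axis=0)
    return mean, np.sqrt(var)

def gaussian_volume(center, sigma, depth_bins, shape):
    """Hypothetical stand-in for a per-frame depth probability volume."""
    d = depth_bins[:, None, None]
    vol = np.exp(-0.5 * ((d - center) / sigma) ** 2) * np.ones((1, *shape))
    return normalize(vol)

# Candidate depths in meters (range and bin count chosen arbitrarily).
depth_bins = np.linspace(0.5, 5.0, 64)

# Two noisy per-frame estimates of the same 4x4 image region.
frame_1 = gaussian_volume(2.0, 0.6, depth_bins, (4, 4))
frame_2 = gaussian_volume(2.2, 0.6, depth_bins, (4, 4))

# Fuse the second frame into the belief built from the first.
fused = bayes_update(frame_1, frame_2)

mean_1, std_1 = depth_and_uncertainty(frame_1, depth_bins)
mean_f, std_f = depth_and_uncertainty(fused, depth_bins)
print("single-frame std:", float(std_1.mean()))
print("fused std:", float(std_f.mean()))
```

In this toy setup the fused standard deviation is smaller than the single-frame one, which mirrors the abstract's claim that accumulating volumes reduces depth uncertainty. A real pipeline would also need to account for camera motion between frames before fusing, which this sketch deliberately leaves out.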

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1901.02571/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1901.02571/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/1901.02571/full.md

---
Source: https://tomesphere.com/paper/1901.02571