Deep Neural Networks for Accurate Depth Estimation with Latent Space Features
Siddiqui Muhammad Yasir, Hyunsik Ahn

TL;DR
This paper presents a novel deep learning framework utilizing latent space features and a dual encoder-decoder architecture to significantly improve monocular depth estimation accuracy, especially in complex indoor environments.
Contribution
It introduces a new depth estimation model with a dual encoder-decoder structure and a combined loss function, advancing the precision of monocular depth maps over existing methods.
Findings
Sets a new benchmark on the NYU Depth V2 dataset
Reduces depth ambiguities and boundary blurring
Enhances depth map accuracy in complex indoor scenes
Abstract
Depth estimation plays a pivotal role in advancing human-robot interactions, especially in indoor environments where accurate 3D scene reconstruction is essential for tasks like navigation and object handling. Monocular depth estimation, which relies on a single RGB camera, offers a more affordable solution than traditional methods that use stereo cameras or LiDAR. However, despite recent progress, many monocular approaches struggle to define depth boundaries accurately, leading to less precise reconstructions. In response to these challenges, this study introduces a novel depth estimation framework that leverages latent space features within a deep convolutional neural network to enhance the precision of monocular depth maps. The proposed model features a dual encoder-decoder architecture, enabling both color-to-depth and depth-to-depth transformations. This structure allows…
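The summary mentions a combined loss function and latent space features shared between a color-to-depth branch and a depth-to-depth branch, but does not spell out the terms. The following is a minimal sketch of what such a loss could look like, not the authors' formulation. It assumes three terms: a pixel-wise L1 depth error, a depth-gradient error that penalizes blurred boundaries, and a latent consistency term between the two branches. The function names, the choice of terms, and the weights `w_depth`, `w_grad`, and `w_latent` are all hypothetical.

```python
import numpy as np


def gradient_maps(depth):
    """Forward differences along x (columns) and y (rows) of an H x W depth map."""
    dx = depth[:, 1:] - depth[:, :-1]
    dy = depth[1:, :] - depth[:-1, :]
    return dx, dy


def combined_depth_loss(pred, target, z_color, z_depth,
                        w_depth=1.0, w_grad=1.0, w_latent=0.1):
    """Hypothetical combined loss; the terms and weights are assumptions,
    not taken from the paper."""
    # Pixel-wise L1 error between predicted and ground-truth depth.
    l_depth = np.mean(np.abs(pred - target))

    # Gradient error: grows when depth edges are blurred or misplaced.
    pdx, pdy = gradient_maps(pred)
    tdx, tdy = gradient_maps(target)
    l_grad = np.mean(np.abs(pdx - tdx)) + np.mean(np.abs(pdy - tdy))

    # Latent consistency: mean squared distance between the latent vector of
    # the color-to-depth branch and that of the depth-to-depth branch.
    l_latent = np.mean((z_color - z_depth) ** 2)

    return w_depth * l_depth + w_grad * l_grad + w_latent * l_latent


# Toy usage: a sharp vertical depth edge versus a blurred prediction of it.
target = np.zeros((4, 4))
target[:, 2:] = 1.0
blurry = np.tile(np.array([0.0, 0.25, 0.75, 1.0]), (4, 1))
z = np.ones(8)  # identical latents, so the latent term is zero here

loss_exact = combined_depth_loss(target, target, z, z)
loss_blurry = combined_depth_loss(blurry, target, z, z)
```

In this toy case the blurred prediction is penalized by both the L1 term and the gradient term, while a perfect prediction scores zero. A gradient term of this kind is one common way to discourage the boundary blurring the summary refers to.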
