Detail Preserving Depth Estimation from a Single Image Using Attention Guided Networks
Zhixiang Hao, Yu Li, Shaodi You, Feng Lu

TL;DR
This paper introduces an attention-guided neural network architecture for single image depth estimation that preserves structural details and produces high-quality depth maps efficiently without post-processing.
Contribution
It proposes a novel network that combines a Dense Feature Extractor (DFE) with an attention-based Depth Map Generator (DMG) to better preserve detail in the predicted depth map.
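As a rough illustration of what attention-based fusion of multi-scale features can look like, the sketch below weights each scale per pixel with a softmax over attention scores and sums the result. This is a generic softmax-weighted fusion, not the paper's actual Depth Map Generator; the names `softmax` and `fuse_scales` and the flat per-pixel list layout are assumptions made for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_scales(features, scores):
    """Fuse per-scale feature values with per-pixel attention weights.

    features[k][i] is the feature value of scale k at pixel i.
    scores[k][i] is the (hypothetical) attention logit of scale k at pixel i.
    Returns one fused value per pixel.
    """
    num_pixels = len(features[0])
    fused = []
    for i in range(num_pixels):
        # Normalize the scores of all scales at this pixel so they sum to 1.
        weights = softmax([scale_scores[i] for scale_scores in scores])
        fused.append(sum(w * f[i] for w, f in zip(weights, features)))
    return fused
```

With equal scores every scale contributes equally, so the fused value is the plain average; a much larger score for one scale makes the output follow that scale at that pixel.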
Findings
Achieves quantitative results competitive with state-of-the-art methods.
Produces depth maps with better structural detail preservation.
Runs at approximately 15 frames per second.
Abstract
Convolutional Neural Networks have demonstrated superior performance on single-image depth estimation in recent years. These works usually use stacked spatial pooling or strided convolution to obtain high-level information, which is common practice in classification tasks. However, depth estimation is a dense prediction problem, and low-resolution feature maps usually produce blurred depth maps, which is undesirable in applications. To produce high-quality depth maps, that is, clean and accurate ones, we propose a network consisting of a Dense Feature Extractor (DFE) and a Depth Map Generator (DMG). The DFE combines ResNet and dilated convolutions; it extracts multi-scale information from the input image while keeping the feature maps dense. In the DMG, we use an attention mechanism to fuse the multi-scale features produced by the DFE. Our network is trained end-to-end and does not need any post-processing.…
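The contrast the abstract draws between strided and dilated convolutions can be made concrete with the standard convolution output-size and receptive-field formulas. The sketch below is only an illustration under assumed layer settings (3x3 kernels, a 64-pixel input side); the layer lists `STRIDED` and `DILATED` and the helper names are hypothetical and do not describe the paper's actual ResNet configuration.

```python
def conv_output_size(size, kernel=3, stride=1, dilation=1, padding=0):
    """Spatial size after one convolution (standard formula)."""
    effective_kernel = dilation * (kernel - 1) + 1
    return (size + 2 * padding - effective_kernel) // stride + 1


def trace(size, layers):
    """Return (output size, receptive field) for a stack of conv layers.

    Each layer is a tuple (kernel, stride, dilation, padding).
    """
    receptive_field, jump = 1, 1
    for kernel, stride, dilation, padding in layers:
        size = conv_output_size(size, kernel, stride, dilation, padding)
        # Each layer widens the receptive field by (kernel - 1) dilated steps,
        # scaled by the accumulated stride of the layers before it.
        receptive_field += (kernel - 1) * dilation * jump
        jump *= stride
    return size, receptive_field


# Two stride-2 layers: the feature map shrinks by a factor of 4.
STRIDED = [(3, 2, 1, 1), (3, 2, 1, 1)]
# Two stride-1 layers, the second with dilation 2: the resolution is kept.
DILATED = [(3, 1, 1, 1), (3, 1, 2, 2)]

print(trace(64, STRIDED))  # (16, 7)
print(trace(64, DILATED))  # (64, 7)
```

Both stacks see the same 7-pixel receptive field, but only the dilated stack keeps the 64-pixel resolution, which is the sense in which dilation keeps feature maps dense.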
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques
Methods: Convolution
