Depth Monocular Estimation with Attention-based Encoder-Decoder Network from Single Image
Xin Zhang, Rabab Abdelfattah, Yuqi Song, Samuel A. Dauchert, and Xiaofeng Wang

TL;DR
This paper introduces an attention-based encoder-decoder network for monocular depth estimation from a single image. It targets grid artifacts and blurry edges with a convolutional attention block and a new loss function, and is evaluated on the KITTI and NYU-V2 datasets.
Contribution
The work proposes a convolutional attention mechanism block and a novel loss function to improve monocular depth estimation accuracy in challenging scenarios.
Findings
Outperforms several baseline methods on KITTI and NYU-V2 datasets.
Effectively reduces grid artifacts and blurry edges in depth maps.
Enhances focus on relevant image features with minimal computational overhead.
Abstract
Depth information is the foundation of perception, essential for autonomous driving, robotics, and other resource-constrained applications. Promptly obtaining accurate depth information allows for a rapid response in dynamic environments. Sensor-based methods using LIDAR and RADAR obtain high precision at the cost of high power consumption, price, and volume. Thanks to advances in deep learning, vision-based approaches have recently received much attention and can overcome these drawbacks. In this work, we explore an extreme scenario in vision-based settings: estimating a depth map from one monocular image severely plagued by grid artifacts and blurry edges. To address this scenario, we first design a convolutional attention mechanism block (CAMB), which applies channel attention followed by spatial attention, and insert these CAMBs into the skip connections. As a…
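The abstract names the CAMB's two stages (channel attention, then spatial attention) and its placement in the skip connections, but not its internals. The sketch below is one plausible reading, assuming a layout similar to the widely used CBAM design: a shared two-layer MLP over pooled channel statistics, then a small convolution over channel-pooled maps. The weight arguments `w1`, `w2`, and `kernel`, the function names, and the use of NumPy instead of a deep learning framework are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Reweight channels of feat (C, H, W) with a gate in (0, 1).

    w1 (C//r, C) and w2 (C, C//r) are hypothetical weights of a shared MLP.
    """
    avg = feat.mean(axis=(1, 2))  # (C,) average-pooled descriptor
    mx = feat.max(axis=(1, 2))    # (C,) max-pooled descriptor

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # linear -> ReLU -> linear

    gate = sigmoid(mlp(avg) + mlp(mx))       # (C,)
    return feat * gate[:, None, None]


def spatial_attention(feat, kernel):
    """Reweight spatial positions of feat (C, H, W) with a gate in (0, 1).

    kernel (2, k, k), k odd, is a hypothetical convolution weight.
    """
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = feat.shape[1:]
    response = np.zeros((h, w))
    for i in range(h):          # naive "same" convolution, kept explicit
        for j in range(w):
            response[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return feat * sigmoid(response)[None, :, :]


def camb(feat, w1, w2, kernel):
    """Channel attention followed by spatial attention, applied in sequence."""
    return spatial_attention(channel_attention(feat, w1, w2), kernel)


def skip_with_attention(encoder_feat, decoder_feat, w1, w2, kernel):
    """Refine the encoder feature with the block, then concatenate it with the
    decoder feature along the channel axis (a typical skip connection)."""
    refined = camb(encoder_feat, w1, w2, kernel)
    return np.concatenate([refined, decoder_feat], axis=0)
```

Because both gates are sigmoid outputs, the block can only scale activations down, never amplify them; it preserves the feature map's shape, so it can be dropped into an existing skip connection without changing the decoder's input sizes.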
Taxonomy
Topics: Image Processing Techniques and Applications · Advanced Vision and Imaging · Cell Image Analysis Techniques
