Depth Monocular Estimation with Attention-based Encoder-Decoder Network from Single Image

Xin Zhang; Rabab Abdelfattah; Yuqi Song; Samuel A. Dauchert; and Xiaofeng Wang

arXiv:2210.13646·cs.CV·October 26, 2022

Open Access

TL;DR

This paper introduces an attention-based encoder-decoder network for monocular depth estimation from single images, effectively handling artifacts and blurry edges with novel attention mechanisms and loss functions, validated on KITTI and NYU-V2 datasets.

Contribution

The work proposes a convolutional attention mechanism block and a novel loss function to improve monocular depth estimation accuracy in challenging scenarios.

Findings

1. Outperforms several baseline methods on KITTI and NYU-V2 datasets.
2. Effectively reduces grid artifacts and blurry edges in depth maps.
3. Enhances focus on relevant image features with minimal computational overhead.

Abstract

Depth information is the foundation of perception, essential for autonomous driving, robotics, and other resource-constrained applications. Obtaining accurate depth information promptly and efficiently allows for a rapid response in dynamic environments. Sensor-based methods using LIDAR and RADAR obtain high precision at the cost of high power consumption, price, and volume. Thanks to advances in deep learning, vision-based approaches have recently received much attention and can overcome these drawbacks. In this work, we explore an extreme scenario in vision-based settings: estimating a depth map from one monocular image severely plagued by grid artifacts and blurry edges. To address this scenario, we first design a convolutional attention mechanism block (CAMB), which consists of channel attention followed by spatial attention, and insert these CAMBs into skip connections. As a…
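The abstract describes the CAMB only at a high level: channel attention, then spatial attention, applied to the features carried by a skip connection. The NumPy sketch below illustrates that ordering and nothing more. It is not the authors' implementation. The pooling choices, the shared two-layer MLP, the reduction ratio `r`, and the two-weight `mix` that stands in for a spatial convolution are all assumptions borrowed from common channel/spatial attention designs, and the names `channel_attention`, `spatial_attention`, and `camb_sketch` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """Rescale each channel of feat (C, H, W) by a gate in (0, 1).

    Assumed design: global average and max pooling feed a shared
    two-layer MLP with weights w1 (C//r, C) and w2 (C, C//r).
    """
    avg = feat.mean(axis=(1, 2))  # (C,) global average pool
    mx = feat.max(axis=(1, 2))    # (C,) global max pool

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # linear, ReLU, linear

    gate = sigmoid(mlp(avg) + mlp(mx))       # (C,)
    return feat * gate[:, None, None]


def spatial_attention(feat, mix):
    """Rescale each pixel of feat (C, H, W) by a gate in (0, 1).

    Assumed simplification: the channel-wise average and max maps are
    combined with two scalar weights (a 1x1 mix) instead of a larger
    learned convolution kernel.
    """
    avg = feat.mean(axis=0)  # (H, W)
    mx = feat.max(axis=0)    # (H, W)
    gate = sigmoid(mix[0] * avg + mix[1] * mx)
    return feat * gate[None, :, :]


def camb_sketch(feat, w1, w2, mix):
    """Channel attention first, then spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2), mix)


# Toy skip-connection feature map with random (untrained) weights.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 5, 2
skip = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1
mix = np.array([0.5, 0.5])

refined = camb_sketch(skip, w1, w2, mix)
```

In an encoder-decoder with skip connections, a block like this would sit between the encoder feature and the decoder stage that consumes it, so the decoder receives `refined` in place of `skip`. Both gates lie strictly between 0 and 1, so the block only attenuates features and leaves their shape unchanged.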


Taxonomy

Topics: Image Processing Techniques and Applications · Advanced Vision and Imaging · Cell Image Analysis Techniques