Attention-based Context Aggregation Network for Monocular Depth Estimation

Yuru Chen; Haitao Zhao; Zhengwei Hu

arXiv:1901.10137 · cs.CV · January 30, 2019 · 24 citations

TL;DR

This paper introduces an attention-based network that adaptively models context for monocular depth estimation, reducing discretization errors and improving accuracy by combining image-level and pixel-level information.

Contribution

The paper proposes a novel ACAN model using self-attention for adaptive context aggregation and a soft ordinal inference for continuous depth prediction, advancing monocular depth estimation.
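The difference between hard and soft ordinal inference can be illustrated with a minimal sketch. It is not taken from the paper's code: the log-uniform depth bins, the function names, and the convention that `probs[i]` is the predicted probability that the depth lies beyond interior threshold `i + 1` are all assumptions made for illustration. Hard inference counts the thresholds passed and returns a bin edge, so its output is quantized; soft inference uses the sum of the probabilities as a fractional bin index and maps it to a continuous depth.

```python
import math


def log_uniform_edges(d_min, d_max, k):
    """Return k + 1 bin edges spaced uniformly in log-depth (assumed discretization)."""
    step = (math.log(d_max) - math.log(d_min)) / k
    return [math.exp(math.log(d_min) + i * step) for i in range(k + 1)]


def hard_ordinal_depth(probs, d_min, d_max):
    """Count thresholds exceeded with probability > 0.5 and return that bin's lower edge."""
    k = len(probs) + 1  # k bins have k - 1 interior thresholds
    edges = log_uniform_edges(d_min, d_max, k)
    index = sum(1 for p in probs if p > 0.5)
    return edges[index]


def soft_ordinal_depth(probs, d_min, d_max):
    """Treat the sum of probabilities as a fractional bin index and map it to depth."""
    k = len(probs) + 1
    step = (math.log(d_max) - math.log(d_min)) / k
    fractional_index = sum(probs)  # lies in [0, k - 1]
    return math.exp(math.log(d_min) + fractional_index * step)


# Hypothetical per-pixel ordinal probabilities for 5 bins over [1 m, 32 m].
probs = [1.0, 1.0, 0.6, 0.1]
hard = hard_ordinal_depth(probs, 1.0, 32.0)  # snaps to a bin edge
soft = soft_ordinal_depth(probs, 1.0, 32.0)  # falls between bin edges
```

When every probability is exactly 0 or 1 the two functions agree; they diverge only when the network is uncertain near a threshold, which is where quantization error arises.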

Findings

01 Achieves competitive results on the NYU Depth V2 and KITTI datasets.

02 Soft ordinal inference reduces the discretization error, lowering RMSE by about 1%.

03 Demonstrates the effectiveness of attention-based context modeling.

Abstract

Depth estimation is a traditional computer vision task that plays a crucial role in understanding 3D scene geometry. Recently, methods based on deep convolutional neural networks have achieved promising results in monocular depth estimation. In particular, frameworks that combine multi-scale features extracted by a dilated-convolution-based block (atrous spatial pyramid pooling, ASPP) have brought significant improvements in dense labeling tasks. However, the discretized, predefined dilation rates cannot capture the continuous context information that varies across scenes, and they easily introduce grid artifacts in depth estimation. In this paper, we propose an attention-based context aggregation network (ACAN) to tackle these difficulties. Based on the self-attention model, ACAN adaptively learns the task-specific similarities between pixels to model the…
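As a rough illustration of aggregating context through learned pixel similarities, the sketch below applies generic scaled dot-product self-attention to a flattened grid of pixel features. It is an assumption-laden toy, not the ACAN architecture: the function names are hypothetical, and the features serve directly as queries, keys, and values, whereas a trained model would use learned projections. Unlike a fixed dilation rate, the weight each pixel gives to every other pixel depends on the features themselves.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def attention_context(features):
    """Aggregate context for N pixel feature vectors (a flattened H*W grid).

    Returns the N x N attention map and the N context vectors.
    Identity query/key/value projections are used here as a simplification.
    """
    n = len(features)
    dim = len(features[0])
    scale = math.sqrt(dim)
    # Each row holds one pixel's similarity to every pixel, normalized to sum to 1.
    weights = [
        softmax([dot(features[i], features[j]) / scale for j in range(n)])
        for i in range(n)
    ]
    # Each context vector is the attention-weighted average of all pixel features.
    context = [
        [sum(weights[i][j] * features[j][c] for j in range(n)) for c in range(dim)]
        for i in range(n)
    ]
    return weights, context


# Three hypothetical pixels: the first two look alike, the third differs.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
weights, context = attention_context(features)
```

In this toy input the first pixel attends more strongly to the second pixel than to the third, so its context vector is dominated by features from similar pixels regardless of where they sit in the image.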

Code & Models

Repositories

miraiaroha/ACAN (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques