Monocular Depth Estimation Primed by Salient Point Detection and Normalized Hessian Loss

Lam Huynh; Matteo Pedone; Phong Nguyen; Jiri Matas; Esa Rahtu; Janne Heikkila

arXiv:2108.11098 · cs.CV · August 26, 2021

Open Access

TL;DR

This paper introduces a lightweight monocular depth estimation framework using salient point detection and a normalized Hessian loss, achieving state-of-the-art accuracy with fewer parameters.

Contribution

The paper proposes a novel self-attention-based model, trained with a normalized Hessian loss, for improved accuracy and efficiency in monocular depth estimation.

Findings

1. State-of-the-art results on the NYU-Depth-v2 and KITTI datasets.

2. The model has 3.1-38.4 times fewer parameters than baseline approaches.

3. Good generalization, demonstrated on the SUN-RGBD dataset.

Abstract

Deep neural networks have recently thrived on single-image depth estimation. However, current developments on this topic highlight an apparent compromise between accuracy and network size. This work proposes an accurate and lightweight framework for monocular depth estimation based on a self-attention mechanism stemming from salient point detection. Specifically, we utilize a sparse set of keypoints to train a FuSaNet model that consists of two major components: Fusion-Net and Saliency-Net. In addition, we introduce a normalized Hessian loss term that is invariant to scaling and shear along the depth direction, which is shown to substantially improve the accuracy. The proposed method achieves state-of-the-art results on NYU-Depth-v2 and KITTI while using a model that is 3.1-38.4 times smaller, in terms of the number of parameters, than baseline approaches. Experiments on the SUN-RGBD further…
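
The invariance property claimed for the normalized Hessian loss can be illustrated with a small numerical sketch. This is not the paper's implementation: the finite-difference stencils, the choice of a global L2 norm, the L1 comparison, and the function names (`hessian_maps`, `normalized_hessian`, `normalized_hessian_loss`) are all assumptions made for illustration. The idea is that a shear along the depth direction adds terms linear in the pixel coordinates, which second derivatives remove, while a positive scale factor cancels once the Hessian is divided by its norm.

```python
import numpy as np

def hessian_maps(depth):
    """Second-order finite differences of a 2D depth map (interior pixels only).

    Returns an array of shape (3, H-2, W-2) holding d_xx, d_xy, d_yy.
    """
    d = np.asarray(depth, dtype=np.float64)
    dxx = d[1:-1, 2:] - 2.0 * d[1:-1, 1:-1] + d[1:-1, :-2]
    dyy = d[2:, 1:-1] - 2.0 * d[1:-1, 1:-1] + d[:-2, 1:-1]
    dxy = 0.25 * (d[2:, 2:] - d[2:, :-2] - d[:-2, 2:] + d[:-2, :-2])
    return np.stack([dxx, dxy, dyy])

def normalized_hessian(depth, eps=1e-12):
    """Hessian maps divided by their global L2 norm (an assumed normalization)."""
    h = hessian_maps(depth)
    return h / (np.sqrt(np.sum(h ** 2)) + eps)

def normalized_hessian_loss(pred, target):
    """Mean absolute difference between normalized Hessians (illustrative only)."""
    return float(np.mean(np.abs(normalized_hessian(pred) - normalized_hessian(target))))

# Hypothetical check: scale the depth and shear it along the depth axis.
rng = np.random.default_rng(0)
depth = rng.random((16, 20)) + 1.0
ys, xs = np.mgrid[0:16, 0:20]
transformed = 3.0 * depth + 0.5 * xs - 0.2 * ys + 4.0  # a*d + b*x + c*y + e, a > 0

loss_same_shape = normalized_hessian_loss(depth, transformed)
loss_other_map = normalized_hessian_loss(depth, rng.random((16, 20)))
```

Under these assumptions, `loss_same_shape` is zero up to floating-point rounding, whereas `loss_other_map`, which compares against an unrelated depth map, is not. Note that this sketch only handles positive scale factors; a negative factor would flip the sign of the normalized Hessian.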

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques