Self-Supervised Monocular Depth Estimation with Internal Feature Fusion

Hang Zhou; David Greenwood; Sarah Taylor

arXiv:2110.09482 · cs.CV · November 22, 2021 · 58 cites

TL;DR

This paper introduces DIFFNet, a novel self-supervised monocular depth estimation network that leverages semantic segmentation features through fusion and attention mechanisms, outperforming existing methods on KITTI.

Contribution

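The paper proposes DIFFNet, a depth estimation network built on the semantic segmentation network HRNet. It uses semantic information in the down- and upsampling stages through feature fusion and an attention mechanism, improving accuracy over prior state-of-the-art methods on KITTI.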

Findings

01 Outperforms state-of-the-art monocular depth estimation methods on the KITTI benchmark

02 Shows greater potential when trained on higher-resolution data

03 Proposes an extended evaluation strategy for challenging cases

Abstract

Self-supervised learning for depth estimation uses geometry in image sequences for supervision and shows promising results. Like many computer vision tasks, depth network performance is determined by the capability to learn accurate spatial and semantic representations from images. Therefore, it is natural to exploit semantic segmentation networks for depth estimation. In this work, based on a well-developed semantic segmentation network HRNet, we propose a novel depth estimation network DIFFNet, which can make use of semantic information in down and upsampling procedures. By applying feature fusion and an attention mechanism, our proposed method outperforms the state-of-the-art monocular depth estimation methods on the KITTI benchmark. Our method also demonstrates greater potential on higher resolution training data. We propose an additional extended evaluation strategy by establishing…
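To make the phrase "feature fusion and an attention mechanism" concrete, the following is a minimal sketch in plain Python, not the DIFFNet implementation. It assumes a generic pattern: a low-resolution feature map is upsampled with nearest-neighbour interpolation and concatenated channel-wise with a high-resolution one, then a squeeze-and-excitation-style channel gate rescales each fused channel. The function names (`upsample_nearest`, `fuse_features`, `channel_attention`), the single linear gating layer, and the weights are all hypothetical; the abstract does not specify the attention design.

```python
import math

# Feature maps are nested lists shaped [channels][rows][cols].
# Everything here is an illustrative assumption, not the paper's architecture.

def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of every channel by an integer factor."""
    return [
        [[v for v in row for _ in range(factor)]
         for row in channel for _ in range(factor)]
        for channel in fmap
    ]

def fuse_features(high_res, low_res, factor):
    """Upsample the low-resolution map, then concatenate along channels."""
    return high_res + upsample_nearest(low_res, factor)

def channel_attention(fmap, weights, bias):
    """Squeeze-and-excitation-style gate (assumed form).

    weights is a [channels][channels] matrix and bias a [channels] vector
    for one hypothetical linear layer followed by a sigmoid.
    """
    # Squeeze: global average over the spatial dimensions of each channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in fmap
    ]
    # Excite: one linear layer plus sigmoid gives a gate in (0, 1) per channel.
    gates = [
        1.0 / (1.0 + math.exp(-(sum(w * p for w, p in zip(w_row, pooled)) + b)))
        for w_row, b in zip(weights, bias)
    ]
    # Rescale every value in a channel by that channel's gate.
    return [
        [[v * g for v in row] for row in channel]
        for channel, g in zip(fmap, gates)
    ]

if __name__ == "__main__":
    high = [[[1.0, 2.0], [3.0, 4.0]]]   # 1 channel, 2x2
    low = [[[10.0]]]                    # 1 channel, 1x1
    fused = fuse_features(high, low, factor=2)
    zero_w = [[0.0, 0.0], [0.0, 0.0]]
    print(channel_attention(fused, zero_w, [0.0, 0.0]))
```

With all-zero weights and biases every gate equals 0.5, so the demo simply halves each fused channel; in a trained network the gates would be learned so that more informative channels are weighted more heavily.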

Code & Models

Repositories

brandleyzhou/diffnet (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Optical measurement and interference techniques

Methods: Test · Batch Normalization · Residual Connection · Convolution · HRNet