Self-Supervised Monocular Depth Estimation with Internal Feature Fusion
Hang Zhou, David Greenwood, Sarah Taylor

TL;DR
This paper introduces DIFFNet, a novel self-supervised monocular depth estimation network that leverages semantic segmentation features through fusion and attention mechanisms, outperforming existing methods on KITTI.
Contribution
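The paper proposes DIFFNet, a depth estimation network built on the HRNet semantic segmentation architecture. It applies feature fusion and an attention mechanism to use semantic information during down- and upsampling, and reports better accuracy than state-of-the-art methods on KITTI.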
Findings
Outperforms state-of-the-art monocular depth estimation methods on the KITTI benchmark
Shows greater potential when trained on higher-resolution data
Proposes an extended evaluation strategy covering challenging cases
Abstract
Self-supervised learning for depth estimation uses geometry in image sequences for supervision and shows promising results. Like many computer vision tasks, depth network performance is determined by the capability to learn accurate spatial and semantic representations from images. Therefore, it is natural to exploit semantic segmentation networks for depth estimation. In this work, based on a well-developed semantic segmentation network HRNet, we propose a novel depth estimation network DIFFNet, which can make use of semantic information in down and upsampling procedures. By applying feature fusion and an attention mechanism, our proposed method outperforms the state-of-the-art monocular depth estimation methods on the KITTI benchmark. Our method also demonstrates greater potential on higher resolution training data. We propose an additional extended evaluation strategy by establishing…
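The fusion and attention step mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the modules, so the squeeze-and-excitation style channel attention, the nearest-neighbour upsampling, the random weights, and every function name below are assumptions chosen for illustration only.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def channel_attention(x, w1, w2):
    """Assumed squeeze-and-excitation style attention over channels."""
    squeezed = x.mean(axis=(1, 2))                   # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)          # bottleneck + ReLU
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate per channel
    return x * weights[:, None, None]                # rescale each channel

def fuse_with_attention(high_res, low_res, w1, w2):
    """Fuse two resolutions by upsampling, concatenating, then gating channels."""
    factor = high_res.shape[1] // low_res.shape[1]
    fused = np.concatenate([high_res, upsample_nearest(low_res, factor)], axis=0)
    return channel_attention(fused, w1, w2)

# Hypothetical feature maps from two branches of a multi-resolution backbone.
rng = np.random.default_rng(0)
high = rng.standard_normal((4, 8, 8))   # 4 channels at 8x8
low = rng.standard_normal((6, 4, 4))    # 6 channels at 4x4
channels = high.shape[0] + low.shape[0]
w1 = rng.standard_normal((channels // 2, channels))  # untrained, illustrative weights
w2 = rng.standard_normal((channels, channels // 2))

out = fuse_with_attention(high, low, w1, w2)
print(out.shape)  # (10, 8, 8)
```

In a trained network the gate weights would be learned, so that channels carrying useful semantic cues are emphasised before the fused features are decoded into depth.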
Taxonomy
Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Optical measurement and interference techniques
Methods: Batch Normalization · Residual Connection · Convolution · HRNet
