SemHint-MD: Learning from Noisy Semantic Labels for Self-Supervised Monocular Depth Estimation
Shan Lin, Yuheng Zhi, and Michael C. Yip

TL;DR
This paper introduces SemHint-MD, a self-supervised monocular depth estimation framework that leverages noisy semantic labels and cross-task refinement to improve depth accuracy without relying on ground truth data.
Contribution
It trains a segmentation branch on noisy semantic labels and extends parameter sharing between the depth and segmentation tasks from the encoder to the decoder, improving depth estimation in a self-supervised setting.
Findings
Improved depth estimation accuracy on the KITTI benchmark.
Effective use of noisy semantic labels for self-supervised learning.
Enhanced performance in endoscopic tissue deformation tracking.
Abstract
Without ground truth supervision, self-supervised depth estimation can be trapped in a local minimum due to the gradient-locality issue of the photometric loss. In this paper, we present a framework to enhance depth by leveraging semantic segmentation to guide the network to jump out of the local minimum. Prior works have proposed to share encoders between these two tasks or explicitly align them based on priors like the consistency between edges in the depth and segmentation maps. Yet, these methods usually require ground truth or high-quality pseudo labels, which may not be easily accessible in real-world applications. In contrast, we investigate self-supervised depth estimation along with a segmentation branch that is supervised with noisy labels provided by models pre-trained with limited data. We extend parameter sharing from the encoder to the decoder and study the influence of…
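The gradient-locality issue mentioned in the abstract can be illustrated with a toy 1D example. This is a minimal sketch under stated assumptions, not the paper's implementation: the single-bump image row, the plain L1 photometric loss, and all function names (`make_row`, `sample`, `photometric_loss`, `numeric_grad`) are hypothetical and chosen only for illustration. When the disparity guess is far from the true value, the warped source and the target do not overlap, so the loss is flat and its gradient gives no direction to move in.

```python
def make_row(width, center, half_width=2):
    """Toy 1D 'image' row: one triangular bump on a flat (zero) background."""
    return [max(0.0, 1.0 - abs(x - center) / half_width) for x in range(width)]


def sample(row, x):
    """Linearly interpolate row at real-valued position x, clamping at the borders."""
    x = min(max(x, 0.0), len(row) - 1.0)
    x0 = int(x)
    x1 = min(x0 + 1, len(row) - 1)
    w = x - x0
    return (1.0 - w) * row[x0] + w * row[x1]


def photometric_loss(target, source, disparity):
    """Mean L1 difference between the target and the source warped by `disparity`."""
    total = sum(abs(t - sample(source, x + disparity)) for x, t in enumerate(target))
    return total / len(target)


def numeric_grad(target, source, disparity, eps=1e-3):
    """Central finite-difference estimate of d(loss)/d(disparity)."""
    hi = photometric_loss(target, source, disparity + eps)
    lo = photometric_loss(target, source, disparity - eps)
    return (hi - lo) / (2.0 * eps)


# Assumed toy setup: the same bump appears at x=20 in the target and x=30 in
# the source, so the true disparity is 10 pixels.
target = make_row(64, center=20)
source = make_row(64, center=30)

# Far from the true disparity the two bumps never overlap: the loss is on a
# plateau and the gradient is approximately zero.
print("grad at d=0.0:", numeric_grad(target, source, 0.0))

# Close to the true disparity the bumps overlap and the gradient is clearly
# negative, i.e. it points toward d=10.
print("grad at d=9.5:", numeric_grad(target, source, 9.5))
print("loss at d=10 :", photometric_loss(target, source, 10.0))
```

In this toy setting, an optimizer initialized at a disparity of 0 receives no useful signal from the photometric loss alone. An additional cue with a wider spatial footprint, such as the semantic segmentation guidance the abstract describes, is one way to provide a signal where the photometric gradient vanishes.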
Taxonomy
Topics: Advanced Vision and Imaging · AI in cancer detection · Medical Image Segmentation Techniques
Methods: ALIGN
