TL;DR
This paper introduces an unsupervised domain adaptation method for depth prediction from images that leverages stereo algorithms and confidence measures to improve accuracy across different environments without requiring groundtruth labels.
Contribution
It presents a novel confidence-guided loss function for fine-tuning depth prediction models using only image pairs, addressing domain shift without supervised labels.
Findings
Outperforms existing unsupervised domain adaptation methods.
Effective for both stereo and monocular depth prediction architectures.
Successfully reduces domain shift in diverse environments.
Abstract
State-of-the-art approaches to infer dense depth measurements from images rely on CNNs trained end-to-end on a vast amount of data. However, these approaches suffer a drastic drop in accuracy when dealing with environments much different in appearance and/or context from those observed at training time. This domain shift issue is usually addressed by fine-tuning on smaller sets of images from the target domain annotated with depth labels. Unfortunately, relying on such supervised labeling is seldom feasible in practical settings. Therefore, we propose an unsupervised domain adaptation technique which does not require groundtruth labels. Our method relies only on image pairs and leverages classical stereo algorithms to produce disparity measurements alongside confidence estimators to assess their reliability. We propose to fine-tune both depth-from-stereo as well as…
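The abstract describes supervising a network with disparities from a classical stereo algorithm, trusting each pixel according to a confidence estimate. The sketch below illustrates that general idea only; it is not the paper's actual loss. The function name `confidence_guided_l1`, the threshold `tau`, and the choice of a confidence-weighted L1 error are all assumptions made for illustration, since the excerpt above does not give the formulation.

```python
import numpy as np


def confidence_guided_l1(pred_disp, stereo_disp, confidence, tau=0.8):
    """Confidence-weighted mean L1 error between two disparity maps.

    Illustrative only: `tau` and the weighting scheme are assumptions,
    not values or formulas taken from the paper.
    """
    pred_disp = np.asarray(pred_disp, dtype=np.float64)
    stereo_disp = np.asarray(stereo_disp, dtype=np.float64)
    confidence = np.asarray(confidence, dtype=np.float64)

    # Pixels below the confidence threshold get zero weight, so unreliable
    # stereo measurements do not contribute to the loss.
    weights = np.where(confidence >= tau, confidence, 0.0)

    total_weight = weights.sum()
    if total_weight == 0.0:
        # No pixel is trusted: nothing to supervise on.
        return 0.0

    # Weighted average of per-pixel absolute disparity errors.
    weighted_error = (weights * np.abs(pred_disp - stereo_disp)).sum()
    return float(weighted_error / total_weight)
```

In a real fine-tuning loop the same computation would be written with a differentiable framework so gradients flow back to the network that produced `pred_disp`; NumPy is used here only to keep the example short.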
