Unsupervised Monocular Depth Perception: Focusing on Moving Objects
Hualie Jiang, Laiyan Ding, Zhenglong Sun, Rui Huang

TL;DR
This paper introduces an outlier masking technique and a multi-scale scheme to improve unsupervised monocular depth estimation, especially for moving objects, demonstrating significant improvements on KITTI and Cityscapes datasets.
Contribution
It proposes novel outlier masking and multi-scale methods to better handle occlusion and scene dynamics in unsupervised depth learning from monocular videos.
Findings
Enhanced depth accuracy for moving objects in unsupervised learning
Effective reduction of artifacts in depth maps
Improved depth and ego-motion estimation results
Abstract
As a flexible passive 3D sensing means, unsupervised learning of depth from monocular videos is becoming an important research topic. It utilizes the photometric errors between the target view and the synthesized views from its adjacent source views as the loss instead of the difference from the ground truth. Occlusion and scene dynamics in real-world scenes still adversely affect the learning, despite significant progress made recently. In this paper, we show that deliberately manipulating photometric errors can deal with these difficulties more effectively. We first propose an outlier masking technique that considers the occluded or dynamic pixels as statistical outliers in the photometric error map. With the outlier masking, the network learns the depth of objects that move in the opposite direction to the camera more accurately. To the best of our knowledge, such cases have not…
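The outlier masking idea described in the abstract can be illustrated with a minimal sketch. The abstract is truncated and does not state which statistic defines an outlier, so the rule used here (an error above mean + k times the standard deviation) is an assumption, as are the function names, the flat-list image representation, and the default k = 1.5. This is not the authors' implementation.

```python
from statistics import mean, pstdev


def photometric_error(target, synthesized):
    """Per-pixel absolute intensity difference between two equal-length views."""
    if len(target) != len(synthesized):
        raise ValueError("views must have the same number of pixels")
    return [abs(t - s) for t, s in zip(target, synthesized)]


def outlier_mask(errors, k=1.5):
    """Return 1 for pixels kept in the loss and 0 for statistical outliers.

    Assumption (not from the paper): a pixel is an outlier when its error
    exceeds mean + k * std of the whole error map.
    """
    threshold = mean(errors) + k * pstdev(errors)
    return [1 if e <= threshold else 0 for e in errors]


def masked_photometric_loss(target, synthesized, k=1.5):
    """Average the photometric error over the pixels that survive the mask."""
    errors = photometric_error(target, synthesized)
    mask = outlier_mask(errors, k)
    kept = [e for e, m in zip(errors, mask) if m]
    return sum(kept) / len(kept) if kept else 0.0


# Toy example: nine well-reconstructed pixels and one badly reconstructed
# pixel, standing in for an occluded or moving region.
target = [0.5] * 10
synthesized = [0.52] * 9 + [0.0]
print(outlier_mask(photometric_error(target, synthesized)))
print(masked_photometric_loss(target, synthesized))
```

In this toy case the single high-error pixel is excluded, so the loss is driven only by the pixels the synthesized view explains well. A real pipeline would compute the error on image tensors and typically combine it with other terms.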
Taxonomy
Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Robotics and Sensor-Based Localization
