Joint Prediction of Monocular Depth and Structure using Planar and Parallax Geometry
Hao Xing, Yifan Cao, Maximilian Biber, Mingchuan Zhou, Darius Burschka

TL;DR
This paper introduces a novel supervised learning approach that combines structure from Plane and Parallax geometry with depth data to improve monocular depth estimation, especially on thin objects and edges.
Contribution
It integrates geometric structure information into a deep learning model, enhancing depth prediction accuracy over existing methods without relying solely on sparse LiDAR data.
Findings
Achieves the best relative error among the compared methods on the KITTI and Cityscapes datasets.
Improves depth prediction for thin objects and edges.
Demonstrates robustness compared to structure-only baselines.
Abstract
Supervised depth estimation methods can achieve good performance when trained on high-quality ground truth such as LiDAR data. However, LiDAR produces only sparse 3D maps, which loses information, and high-quality per-pixel ground-truth depth is difficult to acquire. To overcome this limitation, we propose a novel approach that combines structure information from a promising Plane and Parallax geometry pipeline with depth information in a U-Net supervised learning network, which yields quantitative and qualitative improvements over existing popular learning-based methods. In particular, the model is evaluated on two large-scale and challenging datasets, the KITTI Vision Benchmark and Cityscapes, and achieves the best performance in terms of relative error. Compared with pure depth supervision models, our model has impressive performance on…
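The abstract reports results in terms of relative error against sparse LiDAR ground truth. The exact metric definition is not given in this summary, so the following is only a minimal sketch of the commonly used absolute relative error, evaluated on pixels that have a ground-truth value. The function name `abs_rel_error`, the `min_depth` threshold, and the convention that a ground-truth value of 0 marks a pixel without a LiDAR return are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence


def abs_rel_error(pred: Sequence[float], gt: Sequence[float],
                  min_depth: float = 1e-3) -> float:
    """Mean of |pred - gt| / gt over pixels that have ground truth.

    Assumption: gt values at or below `min_depth` (e.g. 0) mark pixels
    with no LiDAR return and are skipped.
    """
    total = 0.0
    count = 0
    for p, g in zip(pred, gt):
        if g <= min_depth:  # sparse ground truth: no measurement here
            continue
        total += abs(p - g) / g
        count += 1
    if count == 0:
        raise ValueError("no valid ground-truth pixels")
    return total / count


# Toy flattened depth maps in metres; the third pixel has no ground truth.
pred = [10.0, 21.0, 5.0, 8.0]
gt = [10.0, 20.0, 0.0, 10.0]
score = abs_rel_error(pred, gt)
```

Because only pixels with a measurement contribute, a metric of this kind says nothing about regions the LiDAR missed, which is one reason sparse ground truth can hide errors on thin objects and edges.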
Taxonomy
Methods: Concatenated Skip Connection · Max Pooling · Convolution · U-Net
