TL;DR
This paper introduces a content-adaptive multi-resolution merging technique that enhances monocular depth estimation models, enabling the generation of high-resolution, detailed depth maps from single images by combining low- and high-resolution estimations.
Contribution
It proposes a novel merging approach that leverages scene structure and resolution trade-offs to produce multi-megapixel depth maps with fine details, improving upon existing models.
Findings
Achieves high-resolution depth maps with detailed structures
Demonstrates effective merging of multi-resolution estimations
Improves whole-image depth estimation via double estimation and adds local detail via patch selection
Abstract
Neural networks have shown great abilities in estimating depth from a single image. However, the inferred depth maps are well below one-megapixel resolution and often lack fine-grained details, which limits their practicality. Our method builds on our analysis of how the input resolution and the scene structure affect depth estimation performance. We demonstrate that there is a trade-off between a consistent scene structure and the high-frequency details, and merge low- and high-resolution estimations to take advantage of this duality using a simple depth merging network. We present a double estimation method that improves the whole-image depth estimation and a patch selection method that adds local details to the final result. We demonstrate that by merging estimations at different resolutions with changing context, we can generate multi-megapixel depth maps with a high level of…
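The structure-versus-detail trade-off described in the abstract can be illustrated with a small sketch. This is not the paper's learned merging network: as a stand-in, it assumes a simple frequency split in which the low-resolution estimate supplies the low frequencies (consistent scene structure) and the high-resolution estimate supplies the high frequencies (fine details). The function names, the box-blur filter, and the `radius` parameter are all hypothetical choices made for illustration.

```python
import numpy as np


def box_blur(img, radius):
    """Separable box blur with edge padding; a crude low-pass filter."""
    k = 2 * radius + 1
    kernel = np.ones(k) / k
    padded = np.pad(img, radius, mode="edge")
    # Blur along rows, then along columns; "valid" restores the input size.
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded
    )
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows
    )


def upsample_nearest(img, factor):
    """Nearest-neighbour upsampling by an integer factor."""
    return np.repeat(np.repeat(img, factor, axis=0), factor, axis=1)


def merge_depth(low_res_depth, high_res_depth, radius=4):
    """Hypothetical stand-in for a learned merge: structure comes from the
    low-resolution estimate, detail from the high-resolution estimate."""
    factor = high_res_depth.shape[0] // low_res_depth.shape[0]
    base = upsample_nearest(low_res_depth, factor)
    if base.shape != high_res_depth.shape:
        raise ValueError("high-res shape must be an integer multiple of low-res")
    # High-frequency residual of the high-resolution estimate.
    detail = high_res_depth - box_blur(high_res_depth, radius)
    # Low-frequency structure of the low-resolution estimate plus that detail.
    return box_blur(base, radius) + detail
```

A call such as `merge_depth(low, high)` with a 4x4 `low` and a 16x16 `high` returns a 16x16 map. A learned merge can adapt to image content in ways this fixed filter cannot, so the sketch only conveys the intuition.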
