TL;DR
This paper introduces a scale-invariant (SI) monocular depth estimation method that takes shift-and-scale-invariant (SSI) depth as input, improving detail generation and generalizing to real-world scenes while training only on synthetic data.
Contribution
It proposes leveraging SSI inputs for better scale-invariant depth estimation and introduces a sparse ordinal loss for high-resolution detail enhancement.
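To make the idea of an ordinal loss on sparse samples concrete, here is a minimal sketch of a generic pairwise ranking loss evaluated on randomly sampled pixel pairs. It is not the paper's formulation, which this summary does not specify. The function name sparse_ordinal_loss, the uniform random pair sampling, the tolerance tau, and the logistic form of the penalty are all assumptions made for illustration.

```python
import numpy as np

def sparse_ordinal_loss(pred, target, num_pairs=1000, tau=0.0, seed=0):
    """Generic pairwise ranking loss on sparsely sampled pixel pairs.

    Hypothetical illustration only: the sampling scheme and the penalty
    are assumptions, not the loss defined in the paper.
    """
    rng = np.random.default_rng(seed)
    p = pred.ravel()
    g = target.ravel()

    # Sample a sparse set of pixel pairs (i, j) uniformly at random.
    i = rng.integers(0, p.size, size=num_pairs)
    j = rng.integers(0, p.size, size=num_pairs)

    # Ordinal label from the reference: +1 if pixel i has the larger value,
    # -1 if smaller, 0 if the two differ by no more than tau.
    diff_gt = g[i] - g[j]
    label = np.sign(diff_gt) * (np.abs(diff_gt) > tau)

    diff_pred = p[i] - p[j]
    # Ordered pairs: logistic ranking penalty log(1 + exp(-label * diff)).
    ranked = np.logaddexp(0.0, -label * diff_pred)
    # Tied pairs: pull the two predictions together.
    tied = diff_pred ** 2

    return float(np.where(label != 0, ranked, tied).mean())
```

Because the penalty depends only on the ordering within each pair, a prediction that preserves the reference ordering scores lower than one that reverses it, regardless of absolute scale or offset.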
Findings
Achieves high-quality, detailed depth maps in diverse scenarios
Demonstrates strong zero-shot generalization in real-world applications
Outperforms existing methods in detail generation and generalization
Abstract
Existing methods for scale-invariant monocular depth estimation (SI MDE) often struggle because the task is complex and the available datasets are limited and lack diversity, which hinders generalization to real-world scenarios. In contrast, shift-and-scale-invariant (SSI) depth estimation simplifies the task, enables training with abundant stereo datasets, and achieves high performance. We present a novel approach that leverages SSI inputs to enhance SI depth estimation, streamlining the network's role and facilitating in-the-wild generalization while using only a synthetic dataset for training. Emphasizing the generation of high-resolution details, we introduce a novel sparse ordinal loss that substantially improves detail generation in SSI MDE, addressing critical limitations in existing approaches. Through in-the-wild qualitative examples and zero-shot evaluation we…
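The difference between the SI and SSI settings can be illustrated by how a prediction is aligned to a reference map before comparison. The sketch below is a standard least-squares alignment, not code from the paper. The function names align_scale_shift and align_scale_only are hypothetical. An SSI prediction is defined up to an unknown scale s and shift t, while an SI prediction is defined up to an unknown scale only.

```python
import numpy as np

def align_scale_shift(pred, target):
    """SSI case: least-squares s, t such that s * pred + t ~= target."""
    p = pred.ravel()
    g = target.ravel()
    # Design matrix with a column for the scale and a column for the shift.
    A = np.stack([p, np.ones_like(p)], axis=1)
    (s, t), *_ = np.linalg.lstsq(A, g, rcond=None)
    return float(s), float(t)

def align_scale_only(pred, target):
    """SI case: least-squares s such that s * pred ~= target (no shift)."""
    p = pred.ravel()
    g = target.ravel()
    return float(p @ g / (p @ p))
```

A prediction that needs a nonzero shift to match the reference is valid in the SSI sense but not in the SI sense, which is one way to see why the SI task is the stricter of the two.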
