TL;DR
CylinderDepth introduces a cylindrical spatial attention mechanism for self-supervised surround depth estimation that reduces depth inconsistencies between overlapping images.
Contribution
The paper proposes a geometry-guided, cylindrical spatial attention approach that enhances depth consistency and accuracy in multi-view surround depth estimation.
Findings
Improves cross-view depth consistency on DDAD and nuScenes datasets.
Enhances overall depth accuracy compared to state-of-the-art methods.
Extends receptive field across views using cylindrical mapping.
Abstract
Self-supervised surround-view depth estimation enables dense, low-cost 3D perception with a 360° field of view from multiple minimally overlapping images. Yet, most existing methods suffer from depth estimates that are inconsistent across overlapping images. To address this limitation, we propose a novel geometry-guided method for calibrated, time-synchronized multi-camera rigs that predicts dense metric depth. Our approach targets two main sources of inconsistency: the limited receptive field in border regions of single-image depth estimation, and the difficulty of correspondence matching. We mitigate these two issues by extending the receptive field across views and restricting cross-view attention to a small neighborhood. To this end, we establish the neighborhood relationships between images by mapping the image-specific feature positions onto a shared cylinder. Based on the…
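The two ideas named in the abstract, mapping per-image feature positions onto a shared cylinder and restricting attention to a small neighborhood on it, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the excerpt does not specify the cylinder parameterization, the neighborhood definition, or the attention layer, so everything below is an assumption. In particular, the function names (`backproject`, `to_cylinder`, `neighborhood_mask`, `masked_attention`), the rig-frame convention (z up, cylinder axis along z, unit radius), and the Euclidean distance in (azimuth, height) are hypothetical choices made for illustration.

```python
import numpy as np

def backproject(depth, K, cam_to_rig):
    """Lift each pixel of one camera to a 3D point in the shared rig frame.

    depth: (H, W) metric depth, K: (3, 3) intrinsics, cam_to_rig: (4, 4) pose.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w) + 0.5, np.arange(h) + 0.5)  # pixel centers
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    rays = pix @ np.linalg.inv(K).T            # camera-frame rays with z = 1
    pts_cam = rays * depth.reshape(-1, 1)      # scale each ray by its depth
    pts_h = np.concatenate([pts_cam, np.ones((len(pts_cam), 1))], axis=1)
    return (pts_h @ cam_to_rig.T)[:, :3]

def to_cylinder(pts_rig, eps=1e-9):
    """Project rig-frame points radially onto a unit cylinder around the z axis.

    Assumption: the rig frame has z pointing up. Returns (N, 2) as (azimuth, height).
    """
    x, y, z = pts_rig.T
    radial = np.maximum(np.hypot(x, y), eps)   # guard points on the axis
    return np.stack([np.arctan2(y, x), z / radial], axis=-1)

def neighborhood_mask(cyl_q, cyl_k, radius):
    """True where a key lies within `radius` of a query on the cylinder surface."""
    d_az = cyl_q[:, None, 0] - cyl_k[None, :, 0]
    d_az = (d_az + np.pi) % (2 * np.pi) - np.pi  # wrap across the +/-pi seam
    d_h = cyl_q[:, None, 1] - cyl_k[None, :, 1]
    return np.hypot(d_az, d_h) <= radius

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention; keys outside the mask get zero weight.

    Assumes every query has at least one allowed key (true for self-attention,
    where each position is its own neighbor at distance 0).
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    scores -= scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return (weights / weights.sum(axis=-1, keepdims=True)) @ v
```

In this sketch, features from all cameras would be concatenated into one set, their cylinder coordinates computed once, and the resulting mask used for self-attention over that set. Positions near an image border then attend to nearby positions from the adjacent camera, since neighborhoods are defined on the cylinder and not on each image's pixel grid.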
