TL;DR
H-OmniStereo introduces a zero-shot omnidirectional stereo matching framework using a large synthetic dataset and a heading-aligned normal estimator, achieving high accuracy and generalization without domain-specific training.
Contribution
It presents a novel synthetic dataset and a heading-aligned monocular normal estimator to improve omnidirectional stereo matching in a zero-shot setting.
Findings
Outperforms existing methods on out-of-domain datasets
Generalizes well to real-world consumer cameras
Trains on a synthetic dataset of over 2.8 million top-bottom equirectangular stereo pairs
Abstract
Stereo matching on top-bottom equirectangular images provides an effective framework for full-surround perception, as vertically aligned epipolar lines enable the use of advanced perspective stereo architectures that are largely driven by large-scale datasets and monocular priors. However, the performance of such adaptations is severely limited by the scarcity of omnidirectional stereo datasets and the degradation of perspective monocular priors under spherical distortions. To address these challenges, we propose H-OmniStereo, a zero-shot omnidirectional stereo matching framework. First, we construct a high-quality synthetic dataset comprising over 2.8 million top-bottom equirectangular stereo pairs to scale up training. Second, we introduce an equirectangular monocular normal estimator, specifically operating in a heading-aligned coordinate system. Beyond providing distortion-robust and…
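To make the top-bottom geometry concrete, here is a minimal sketch of how a vertical angular disparity can be triangulated into a range. It is not taken from the paper: the function names, the row-to-angle convention (row 0 at the zenith, rows evenly spaced in polar angle), and the placement of the two cameras on a shared vertical axis are all assumptions made for illustration. With the polar angle measured from the upward axis, a point seen at `theta_top` from the top camera and `theta_bottom` from the bottom camera has disparity `theta_top - theta_bottom`, and the law of sines gives its distance from the top camera.

```python
import math


def row_to_polar(v, height):
    """Map an equirectangular row index to a polar angle in radians.

    Assumed convention (not from the paper): 0 is straight up, pi is
    straight down, and pixel centers sit at half-integer offsets.
    """
    return math.pi * (v + 0.5) / height


def range_from_disparity(theta_top, disparity, baseline):
    """Distance from the top camera to the point, via the law of sines.

    theta_top: polar angle of the point seen from the top camera.
    disparity: theta_top - theta_bottom, in radians (positive for finite points).
    baseline:  vertical distance between the two camera centers.
    """
    if disparity <= 0.0:
        return math.inf  # zero disparity corresponds to a point at infinity
    return baseline * math.sin(theta_top - disparity) / math.sin(disparity)


def polar_angle(camera_z, point):
    """Polar angle (from the +z axis) of a 3D point seen from (0, 0, camera_z)."""
    x, y, z = point
    return math.atan2(math.hypot(x, y), z - camera_z)


# Synthetic check: place a point, measure both angles, triangulate it back.
baseline = 1.0
point = (2.0, 0.0, 0.0)
theta_top = polar_angle(+baseline / 2, point)
theta_bottom = polar_angle(-baseline / 2, point)

estimated = range_from_disparity(theta_top, theta_top - theta_bottom, baseline)
expected = math.sqrt(point[0] ** 2 + point[1] ** 2 + (point[2] - baseline / 2) ** 2)
```

In this synthetic check `estimated` and `expected` agree up to floating-point error. Because matching points share an image column in this setup, the correspondence search runs along a single axis, which is what allows perspective stereo architectures to be reused.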
