TL;DR
HITNet introduces a fast, multi-resolution neural network for real-time stereo matching that avoids full cost volume computation, achieving high accuracy with less computation.
Contribution
It proposes a novel architecture that uses geometric reasoning and slanted plane hypotheses without explicit cost volume construction for efficient stereo matching.
Findings
Ranks 1st-3rd on ETH3D benchmarks
Achieves top performance on Middlebury-v3
Faster than 100ms on KITTI benchmarks
Abstract
This paper presents HITNet, a novel neural network architecture for real-time stereo matching. Contrary to many recent neural network approaches that operate on a full cost volume and rely on 3D convolutions, our approach does not explicitly build a volume and instead relies on a fast multi-resolution initialization step, differentiable 2D geometric propagation and warping mechanisms to infer disparity hypotheses. To achieve a high level of accuracy, our network not only geometrically reasons about disparities but also infers slanted plane hypotheses allowing to more accurately perform geometric warping and upsampling operations. Our architecture is inherently multi-resolution allowing the propagation of information across different levels. Multiple experiments prove the effectiveness of the proposed approach at a fraction of the computation required by state-of-the-art methods. At the…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
Methods*Communicated@Fast*How Do I Communicate to Expedia? · Max Pooling · Convolution · Concatenated Skip Connection · U-Net · HITNet
