Hierarchical Deep Stereo Matching on High-resolution Images

Gengshan Yang; Joshua Manela; Michael Happold; Deva Ramanan

arXiv:1912.06704 · cs.CV · December 17, 2019 · 5 cites


TL;DR

This paper presents a hierarchical deep learning framework for real-time stereo matching on high-resolution images. It reports state-of-the-art accuracy on Middlebury-v3 and KITTI-15 while running significantly faster than competing methods, which makes it suitable for time-critical applications like autonomous driving.

Contribution

The authors introduce a novel coarse-to-fine hierarchical approach and a high-resolution stereo dataset, enabling faster and more accurate disparity estimation.

Findings

01 Achieves SOTA performance on the Middlebury-v3 and KITTI-15 benchmarks.

02 Runs significantly faster than competing methods.

03 Supports anytime, on-demand disparity reports with low latency (30 ms for near-range structures).

Abstract

We explore the problem of real-time stereo matching on high-res imagery. Many state-of-the-art (SOTA) methods struggle to process high-res imagery because of memory constraints or speed limitations. To address this issue, we propose an end-to-end framework that searches for correspondences incrementally over a coarse-to-fine hierarchy. Because high-res stereo datasets are relatively rare, we introduce a dataset with high-res stereo pairs for both training and evaluation. Our approach achieved SOTA performance on Middlebury-v3 and KITTI-15 while running significantly faster than its competitors. The hierarchical design also naturally allows for anytime on-demand reports of disparity by capping intermediate coarse results, allowing us to accurately predict disparity for near-range structures with low latency (30ms). We demonstrate that the performance-vs-speed trade-off afforded by…
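The coarse-to-fine search described in the abstract can be illustrated with a small classical sketch. This is not the authors' network: it swaps the learned features and cost volumes for a per-pixel absolute-difference cost on a NumPy image pyramid, and the function names (`downsample`, `cost_at`, `best_of`, `coarse_to_fine`) and parameters (`levels`, `radius`) are hypothetical. It assumes rectified grayscale inputs where `left[y, x]` matches `right[y, x - d]`. The point is only to show how a full search at the coarsest level followed by narrow refinement at each finer level keeps the search small, and how yielding each level gives an "anytime" result.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a 2D image by an integer factor (dimensions must divide evenly)."""
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def cost_at(left, right, disp):
    """Per-pixel |left[y, x] - right[y, x - disp[y, x]]|; inf where x - disp is out of bounds."""
    h, w = left.shape
    src = np.arange(w)[None, :] - disp              # column to read in the right image
    valid = (src >= 0) & (src < w)
    sampled = np.take_along_axis(right, np.clip(src, 0, w - 1), axis=1)
    return np.where(valid, np.abs(left - sampled), np.inf)

def best_of(left, right, candidates):
    """Pick, per pixel, the candidate disparity map with the lowest matching cost."""
    costs = np.stack([cost_at(left, right, c) for c in candidates])
    idx = costs.argmin(axis=0)
    return np.take_along_axis(np.stack(candidates), idx[None], axis=0)[0]

def coarse_to_fine(left, right, max_disp, levels=3, radius=1):
    """Yield (scale, disparity in full-resolution pixels) from coarsest to finest level."""
    top = 2 ** (levels - 1)
    if left.shape != right.shape or left.shape[0] % top or left.shape[1] % top:
        raise ValueError("images must share a shape divisible by 2**(levels-1)")
    disp = None
    for level in reversed(range(levels)):
        scale = 2 ** level
        l, r = downsample(left, scale), downsample(right, scale)
        if disp is None:
            # Coarsest level: exhaustive search over the (shrunken) disparity range.
            cands = [np.full(l.shape, d) for d in range(max_disp // scale + 1)]
        else:
            # Finer level: upsample the previous estimate and search only +/- radius around it.
            up = 2 * disp.repeat(2, axis=0).repeat(2, axis=1)
            cands = [np.maximum(up + o, 0) for o in range(-radius, radius + 1)]
        disp = best_of(l, r, cands)
        yield scale, disp * scale                   # an intermediate, "anytime" estimate
```

A caller with a tight latency budget could stop consuming the generator after the first (coarsest) result, which is the spirit of the on-demand reporting the abstract mentions. Real images would need cost aggregation or learned features; the raw per-pixel cost used here is only reliable on clean synthetic input.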


Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings