Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches
Jure Žbontar, Yann LeCun

TL;DR
This paper learns a similarity measure on small image patches with a convolutional neural network (CNN) and uses it as the stereo matching cost, which improves depth estimation accuracy on the KITTI and Middlebury datasets.
Contribution
The paper presents a supervised learning approach to stereo matching cost computation with CNNs. It examines two network architectures, one tuned for speed and the other for accuracy, and reports better results than existing methods.
Findings
Improves depth estimation and outperforms other methods on the KITTI and Middlebury datasets.
Provides two effective CNN architectures, one for fast matching and one for accurate matching.
Abstract
We present a method for extracting depth information from a rectified image pair. Our approach focuses on the first stage of many stereo algorithms: the matching cost computation. We approach the problem by learning a similarity measure on small image patches using a convolutional neural network. Training is carried out in a supervised manner by constructing a binary classification data set with examples of similar and dissimilar pairs of patches. We examine two network architectures for this task: one tuned for speed, the other for accuracy. The output of the convolutional neural network is used to initialize the stereo matching cost. A series of post-processing steps follow: cross-based cost aggregation, semiglobal matching, a left-right consistency check, subpixel enhancement, a median filter, and a bilateral filter. We evaluate our method on the KITTI 2012, KITTI 2015, and…
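The abstract describes training on a binary classification data set of similar and dissimilar patch pairs. The following is a minimal sketch of how one such pair of examples might be sampled from a rectified image pair with known disparity, not the authors' implementation. The function name `sample_patch_pair`, the patch size, and the offset ranges `pos_max`, `neg_low`, and `neg_high` are hypothetical choices made for illustration.

```python
import numpy as np

def sample_patch_pair(left, right, x, y, disparity, rng,
                      patch=9, pos_max=1, neg_low=4, neg_high=10):
    """Return (left_patch, right_similar, right_dissimilar), or None near a border.

    Offsets are in pixels along the horizontal axis. The default values
    are illustrative assumptions, not settings taken from the paper.
    """
    half = patch // 2

    def crop(img, cx, cy):
        # Reject centres whose patch would fall outside the image.
        if cx - half < 0 or cy - half < 0:
            return None
        if cx + half >= img.shape[1] or cy + half >= img.shape[0]:
            return None
        return img[cy - half:cy + half + 1, cx - half:cx + half + 1]

    # In a rectified pair, the match of left pixel x lies at x - d on the same row.
    x_match = x - int(round(disparity))
    # Similar example: right patch centred at (or very near) the true match.
    o_pos = int(rng.integers(-pos_max, pos_max + 1))
    # Dissimilar example: right patch shifted clearly away from the true match.
    o_neg = int(rng.integers(neg_low, neg_high + 1)) * int(rng.choice([-1, 1]))

    patches = (crop(left, x, y),
               crop(right, x_match + o_pos, y),
               crop(right, x_match + o_neg, y))
    if any(p is None for p in patches):
        return None
    return patches
```

Calling this with `rng = np.random.default_rng(0)` over pixels that have ground-truth disparity would yield one similar pair (label 1) and one dissimilar pair (label 0) per call. Shifting the dissimilar patch along the same row is a design choice that keeps the negative examples close to the mistakes a matcher can actually make on rectified images.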
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image Processing Techniques · Image Enhancement Techniques
