TL;DR
This paper presents a deep learning approach for identifying good correspondences in wide-baseline stereo matching, improving accuracy with minimal training data through a novel normalization technique.
Contribution
The paper introduces a small multi-layer perceptron that operates on the pixel coordinates of putative matches rather than on the image itself, combined with Context Normalization, to classify correspondences and estimate the relative pose.
Findings
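Substantially outperforms previous methods on multiple challenging datasets, according to the authors' experiments.
Requires little training data to achieve high accuracy.
Labels putative correspondences as inliers or outliers while jointly recovering the relative pose, encoded as the essential matrix.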
Abstract
We develop a deep architecture to learn to find good correspondences for wide-baseline stereo. Given a set of putative sparse matches and the camera intrinsics, we train our network in an end-to-end fashion to label the correspondences as inliers or outliers, while simultaneously using them to recover the relative pose, as encoded by the essential matrix. Our architecture is based on a multi-layer perceptron operating on pixel coordinates rather than directly on the image, and is thus simple and small. We introduce a novel normalization technique, called Context Normalization, which allows us to process each data point separately while imbuing it with global information, and also makes the network invariant to the order of the correspondences. Our experiments on multiple challenging datasets demonstrate that our method is able to drastically improve the state of the art with little…
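The abstract describes Context Normalization only in words, so a minimal sketch may help: each feature channel is normalized using the mean and standard deviation computed across all correspondences of one image pair, which gives every point access to global statistics while the per-point weights stay shared. The function names, layer width, random weights, and the choice to apply the normalization before a ReLU are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def context_normalize(features, eps=1e-8):
    """Normalize each channel over the correspondences of one image pair.

    features: array of shape (N, C), one row per putative correspondence.
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)


def shared_layer(features, weight, bias):
    """Hypothetical layer: shared per-point weights, Context Normalization, ReLU."""
    out = features @ weight + bias  # same weights applied to every correspondence
    return np.maximum(context_normalize(out), 0.0)


rng = np.random.default_rng(0)
matches = rng.normal(size=(500, 4))   # synthetic (x1, y1, x2, y2) rows
weight = rng.normal(size=(4, 16))     # illustrative layer width
bias = np.zeros(16)

out = shared_layer(matches, weight, bias)

# Shuffling the input rows shuffles the output rows in the same way.
perm = rng.permutation(len(matches))
out_perm = shared_layer(matches[perm], weight, bias)
```

Because the mean and standard deviation do not depend on the order of the rows, permuting the correspondences only permutes the per-point outputs, which is the order-invariance property the abstract refers to.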
