PVStereo: Pyramid Voting Module for End-to-End Self-Supervised Stereo Matching
Hengli Wang, Rui Fan, Peide Cai, Ming Liu

TL;DR
This paper introduces PVStereo, a self-supervised stereo matching method that combines a pyramid voting module (PVM) with a new DCNN architecture, OptStereo. It outperforms state-of-the-art self-supervised methods on the KITTI benchmarks without requiring large labeled datasets.
Contribution
The paper presents a novel self-supervised stereo matching approach built from a pyramid voting module (PVM) and a new DCNN architecture (OptStereo), along with HKUST-Drive, a large-scale synthetic stereo dataset.
Findings
Outperforms state-of-the-art self-supervised methods on KITTI benchmarks
Effective in diverse illumination and weather conditions
Reduces reliance on large labeled datasets
Abstract
Supervised learning with deep convolutional neural networks (DCNNs) has seen huge adoption in stereo matching. However, the acquisition of large-scale datasets with well-labeled ground truth is cumbersome and labor-intensive, making supervised learning-based approaches often hard to implement in practice. To overcome this drawback, we propose a robust and effective self-supervised stereo matching approach, consisting of a pyramid voting module (PVM) and a novel DCNN architecture, referred to as OptStereo. Specifically, our OptStereo first builds multi-scale cost volumes, and then adopts a recurrent unit to iteratively update disparity estimations at high resolution; while our PVM can generate reliable semi-dense disparity images, which can be employed to supervise OptStereo training. Furthermore, we publish the HKUST-Drive dataset, a large-scale synthetic stereo dataset, collected under…
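The abstract says the PVM's semi-dense disparity images are used to supervise OptStereo training. The sketch below shows one common way to apply such supervision: a loss that is evaluated only at pixels where a pseudo label exists. It is a minimal illustration, not the paper's loss. The function name `masked_l1_loss`, the choice of an L1 penalty, and the convention that a disparity of 0 marks a pixel with no label are all assumptions made for this example.

```python
import numpy as np

def masked_l1_loss(pred_disp, pseudo_disp, valid_mask):
    """Mean absolute disparity error over labeled pixels only.

    pred_disp   : (H, W) disparity predicted by the network
    pseudo_disp : (H, W) semi-dense pseudo ground truth
    valid_mask  : (H, W) boolean, True where pseudo_disp holds a label
    (Hypothetical helper; the L1 form is an assumption, not taken from the paper.)
    """
    valid = valid_mask.astype(bool)
    if not valid.any():
        # No labeled pixels: contribute nothing instead of dividing by zero.
        return 0.0
    return float(np.abs(pred_disp[valid] - pseudo_disp[valid]).mean())

# Toy 2x2 example. Assumption: a pseudo disparity of 0 means "no label".
pred = np.array([[10.0, 12.0],
                 [8.0, 5.0]])
pseudo = np.array([[11.0, 0.0],
                   [8.0, 0.0]])
mask = pseudo > 0

loss = masked_l1_loss(pred, pseudo, mask)
empty_loss = masked_l1_loss(pred, pseudo, np.zeros_like(mask))
```

Only the two labeled pixels in the left column contribute to `loss`, so unlabeled regions neither reward nor penalize the prediction. This is what makes a sparse but reliable label set usable as a training signal.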
Taxonomy
Methods: Diffusion-Convolutional Neural Networks
