Multi-scale Cross-form Pyramid Network for Stereo Matching
Zhidong Zhu, Mingyi He, Yuchao Dai, Zhibo Rao, Bo Li

TL;DR
This paper introduces CFP-Net, a deep learning architecture for stereo matching that aggregates multi-scale context information and regularizes the cost volume, reporting state-of-the-art results on the KITTI 2012 and 2015 benchmarks.
Contribution
The paper proposes a new multi-scale cross-form pyramid network architecture that improves disparity estimation in stereo matching tasks.
Findings
Achieves state-of-the-art performance on KITTI 2012 and 2015 datasets.
The cross-form spatial pyramid module is reported to be more effective than SPP and ASPP in ill-posed regions.
Effective multi-scale feature extraction and cost volume regularization.
Abstract
Stereo matching plays an indispensable part in autonomous driving, robotics and 3D scene reconstruction. We propose a novel deep learning architecture called CFP-Net, a Cross-Form Pyramid stereo matching network that regresses disparity from a rectified pair of stereo images. The network consists of three modules: a Multi-Scale 2D Local Feature Extraction module, a Cross-Form Spatial Pyramid module, and a Multi-Scale 3D Feature Matching and Fusion module. The Multi-Scale 2D Local Feature Extraction module extracts features at multiple scales. The Cross-Form Spatial Pyramid module aggregates context information at different scales and locations to form a cost volume. Moreover, it is shown to be more effective than SPP and ASPP in ill-posed regions. The Multi-Scale 3D Feature Matching and Fusion module is shown to regularize the cost volume using two parallel 3D deconvolution…
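To make "regressing disparity" from a cost volume concrete, here is a minimal sketch for a single pixel on one image row. It is not the CFP-Net implementation: the absolute-difference cost and the soft-argmin readout are assumptions chosen for illustration (the excerpt does not say how the network builds or reads out its cost volume), and the function names `build_cost_curve` and `soft_argmin` are hypothetical.

```python
import math


def build_cost_curve(left_row, right_row, x, max_disp):
    """Matching cost of left pixel x against right pixels x-d, for d in [0, max_disp).

    Assumption: plain absolute intensity difference stands in for a learned cost.
    """
    costs = []
    for d in range(max_disp):
        xr = x - d  # rectified pair: the match lies on the same row, shifted left
        if xr < 0:
            costs.append(float("inf"))  # candidate falls outside the image
        else:
            costs.append(abs(left_row[x] - right_row[xr]))
    return costs


def soft_argmin(costs):
    """Expected disparity under softmax(-cost); a differentiable stand-in for argmin."""
    finite = [c for c in costs if math.isfinite(c)]
    m = min(finite)  # subtract the minimum cost for numerical stability
    weights = [math.exp(-(c - m)) if math.isfinite(c) else 0.0 for c in costs]
    total = sum(weights)
    return sum(d * w for d, w in enumerate(weights)) / total


# Toy rows: a bright pixel at x=4 in the left row appears at x=2 in the right row,
# so the true disparity is 2.
left_row = [0, 0, 0, 0, 10, 0, 0, 0]
right_row = [0, 0, 10, 0, 0, 0, 0, 0]
costs = build_cost_curve(left_row, right_row, x=4, max_disp=4)
disparity = soft_argmin(costs)
print(costs, disparity)  # the estimate lands very close to 2
```

Because the readout is a weighted average rather than a hard argmin, the estimate is sub-pixel and differentiable, which is why this style of regression is common in learned stereo pipelines.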
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization
Methods: Dilated Convolution · Spatial Pyramid Pooling · Atrous Spatial Pyramid Pooling
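
As a rough illustration of the dilated (atrous) convolution listed under Methods, the sketch below applies one kernel at several dilation rates in parallel, which is the general idea behind ASPP-style multi-scale context. It is a 1D toy under stated assumptions, not the paper's module: the helper names `dilated_conv1d` and `parallel_dilated_branches`, the kernel, and the rates are all hypothetical.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1D cross-correlation with kernel taps spaced `dilation` samples apart."""
    span = (len(kernel) - 1) * dilation + 1  # receptive field of one output sample
    out = []
    for start in range(len(signal) - span + 1):
        out.append(sum(k * signal[start + i * dilation] for i, k in enumerate(kernel)))
    return out


def parallel_dilated_branches(signal, kernel, rates):
    """Run the same kernel at several dilation rates, one branch per rate."""
    return {rate: dilated_conv1d(signal, kernel, rate) for rate in rates}


signal = [1, 2, 3, 4, 5, 6, 7]
branches = parallel_dilated_branches(signal, kernel=[1, 1, 1], rates=[1, 2, 3])
for rate, response in branches.items():
    print(rate, response)
```

A larger rate widens the receptive field (3, 5, and 7 samples here) without adding kernel weights, at the price of fewer valid output positions when no padding is used.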
