MSDC-Net: Multi-Scale Dense and Contextual Networks for Automated Disparity Map for Stereo Matching

Zhibo Rao; Mingyi He; Yuchao Dai; Zhidong Zhu; Bo Li; Renjie He

arXiv:1904.12658 · cs.CV · May 1, 2019 · 1 cites

Open Access

TL;DR

This paper introduces MSDC-Net, a novel deep learning architecture that combines multi-scale fusion and residual 3D convolutions to improve disparity map prediction accuracy in stereo matching tasks, especially in non-occluded regions.

Contribution

The paper proposes a new multi-scale dense and contextual network architecture for stereo disparity estimation, integrating multi-scale fusion and residual 3D convolutions for enhanced performance.

Findings

1. Outperforms existing methods on Scene Flow and KITTI datasets
2. Achieves higher accuracy in non-occluded regions
3. Demonstrates effective multi-scale feature fusion

Abstract

Disparity prediction from stereo images is essential to computer vision applications including autonomous driving, 3D model reconstruction, and object detection. To predict an accurate disparity map, we propose a novel deep learning architecture, called MSDC-Net, for estimating the disparity map from a rectified pair of stereo images. MSDC-Net contains two modules: a multi-scale fusion 2D convolution module and a multi-scale residual 3D convolution module. The multi-scale fusion 2D convolution module exploits potential multi-scale features, extracting and fusing features at different scales with Dense-Net. The multi-scale residual 3D convolution module learns geometric context at different scales from the cost volume, which is aggregated by the multi-scale fusion 2D convolution module. Experimental results on Scene Flow and KITTI datasets demonstrate that our MSDC-Net significantly outperforms…
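
The abstract mentions a cost volume built from 2D features and then processed by 3D convolutions, but does not say how the volume is formed. As a rough illustration only, the sketch below uses a concatenation-style construction that is common in stereo networks; this choice is an assumption, not the paper's confirmed method. The function name `build_cost_volume`, the parameter `max_disp`, and the `(C, H, W)` feature layout are hypothetical.

```python
import numpy as np

def build_cost_volume(left_feat, right_feat, max_disp):
    """Concatenation-style cost volume (assumed construction, not from the paper).

    left_feat, right_feat: feature maps of shape (C, H, W) from a rectified pair.
    Returns an array of shape (2C, max_disp, H, W) that a 3D convolution
    stack could consume.
    """
    c, h, w = left_feat.shape
    volume = np.zeros((2 * c, max_disp, h, w), dtype=left_feat.dtype)
    for d in range(max_disp):
        # At candidate disparity d, left pixel x is paired with right pixel x - d.
        # Columns x < d have no valid partner and stay zero.
        volume[:c, d, :, d:] = left_feat[:, :, d:]
        volume[c:, d, :, d:] = right_feat[:, :, : w - d]
    return volume

# Toy usage with random features standing in for the 2D module's output.
rng = np.random.default_rng(0)
left = rng.standard_normal((4, 6, 8)).astype(np.float32)
right = rng.standard_normal((4, 6, 8)).astype(np.float32)
vol = build_cost_volume(left, right, max_disp=3)
print(vol.shape)  # (8, 3, 6, 8)
```

Each slice along the disparity axis pairs left features with right features shifted by one candidate disparity, which is what lets a later 3D convolution reason jointly over height, width, and disparity.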

Taxonomy

Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques

Methods: 3D Convolution · Convolution