Cascaded multi-scale and multi-dimension convolutional neural network for stereo matching
Haihua Lu, Hai Xu, Li Zhang, Yong Zhao

TL;DR
This paper introduces a novel stereo matching network that combines multi-scale and multi-dimension convolutional approaches to improve disparity estimation accuracy without post-processing.
Contribution
It proposes a multi-scale matching cost computation sub-network and a multi-dimension aggregation sub-network, enhancing receptive fields and contextual understanding.
Findings
Achieves competitive results on the KITTI benchmark
Effectively balances receptive field size and detail preservation
No additional post-processing needed for high accuracy
Abstract
Convolutional neural networks (CNNs) have been shown to perform better than conventional stereo algorithms for stereo estimation. Numerous efforts focus on pixel-wise matching cost computation, which is an important building block for many state-of-the-art algorithms. However, those architectures are limited to small, single-scale receptive fields, and they either use traditional methods for cost aggregation or ignore cost aggregation altogether. In contrast, we take both into consideration. First, we propose a new multi-scale matching cost computation sub-network, in which two different sizes of receptive fields are implemented in parallel. In this way, the network can make the best use of both variants and balance the trade-off between a larger receptive field and the loss of detail. Furthermore, we show that our multi-dimension aggregation sub-network, which contains 2D…
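The multi-scale idea in the abstract (two receptive-field sizes computed in parallel, then used together for matching cost) can be illustrated with a toy sketch on a single scanline. This is not the paper's network: fixed averaging windows stand in for the learned convolutional branches, the L1 feature distance stands in for the learned matching cost, and all function names and the radii (1 and 3) are assumptions made for illustration. The aggregation sub-network is not modelled here.

```python
def box_filter(row, radius):
    """Mean over a window of up to 2*radius+1 samples (window is clipped at the edges)."""
    n = len(row)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(row[lo:hi]) / (hi - lo))
    return out


def multi_scale_features(row, radii=(1, 3)):
    """One response per receptive-field size, stacked per pixel.

    Assumption: fixed box filters replace the learned parallel branches.
    """
    branches = [box_filter(row, r) for r in radii]
    return [tuple(b[i] for b in branches) for i in range(len(row))]


def cost_volume(left, right, max_disp):
    """cost[d][x] = L1 distance between left feature at x and right feature at x - d."""
    fl, fr = multi_scale_features(left), multi_scale_features(right)
    inf = float("inf")
    volume = []
    for d in range(max_disp + 1):
        volume.append([
            sum(abs(a - b) for a, b in zip(fl[x], fr[x - d])) if x - d >= 0 else inf
            for x in range(len(left))
        ])
    return volume


def winner_take_all(volume):
    """Pick, for every pixel, the disparity with the lowest cost (no aggregation)."""
    width = len(volume[0])
    return [min(range(len(volume)), key=lambda d: volume[d][x]) for x in range(width)]


# Hypothetical scanline pair: the left row is the right row shifted by 2 pixels.
right = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3]
left = [0, 0] + right[:-2]
disparity = winner_take_all(cost_volume(left, right, max_disp=4))
print(disparity[5:13])
```

For pixels whose windows lie fully inside both rows (indices 5 to 12 here), the sketch recovers the shift of 2; pixels near the borders are unreliable because the clipped windows differ between the two rows. The small window preserves local detail while the large one adds context, which is the trade-off the abstract describes.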
Taxonomy
Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Advanced Image Processing Techniques
Methods: 3D Convolution · Convolution
