Multi-Scale Cost Volumes Cascade Network for Stereo Matching
Xiaogang Jia, Wei Chen, Zhengfa Liang, Mingfei Wu, Yusong Tan, Libo Huang

TL;DR
The paper introduces MSCVNet, a stereo matching network that combines traditional methods with neural networks. It builds cost volumes at multiple resolutions and aggregates them with a cascade hourglass architecture, aiming to produce accurate disparity maps at low running time.
Contribution
It proposes a multi-scale cost volume cascade network, a cost aggregation method built from 2D convolutions, and an algorithm that identifies discontinuous disparity areas and computes a loss for them, with the goal of improving both speed and accuracy.
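The summary does not say how MSCVNet detects discontinuous disparity areas or how it weights them in the loss. As a rough illustration of the general idea only, the sketch below marks pixels whose ground-truth disparity jumps relative to a 4-neighbour and gives them a larger weight in an L1 loss. The function names, the `threshold` and `edge_weight` parameters, and the neighbour-difference rule are all assumptions made for this example, not the authors' algorithm.

```python
import numpy as np

def discontinuity_mask(disp, threshold=1.0):
    """Mark pixels whose disparity differs from a 4-neighbour by more than `threshold`.

    Hypothetical rule for illustration; the paper's criterion is not given here.
    """
    mask = np.zeros(disp.shape, dtype=bool)
    dx = np.abs(np.diff(disp, axis=1)) > threshold  # horizontal jumps, shape (H, W-1)
    dy = np.abs(np.diff(disp, axis=0)) > threshold  # vertical jumps, shape (H-1, W)
    # A jump between two pixels flags both of them.
    mask[:, :-1] |= dx
    mask[:, 1:] |= dx
    mask[:-1, :] |= dy
    mask[1:, :] |= dy
    return mask

def weighted_l1_loss(pred, target, mask, edge_weight=2.0):
    """Weighted mean absolute error; masked pixels count `edge_weight` times."""
    weights = np.where(mask, edge_weight, 1.0)
    return float((weights * np.abs(pred - target)).sum() / weights.sum())

# Toy disparity map with one vertical depth edge between columns 2 and 3.
target = np.zeros((4, 6))
target[:, 3:] = 10.0
edge = discontinuity_mask(target)
loss = weighted_l1_loss(target + 1.0, target, edge)
```

In this toy case only the two columns on either side of the edge are flagged, so errors there contribute twice as much to the loss as errors in the flat regions.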
Findings
24 times faster than CSPN and 44 times faster than GANet, according to the KITTI official website
Significantly more accurate than traditional methods
Outperforms other real-time stereo networks
Abstract
Stereo matching is essential for robot navigation. However, the accuracy of the traditional methods in wide use today is low, while CNN-based methods incur high computational cost and long running time. This is because different cost volumes play a crucial role in balancing speed and accuracy. We therefore propose MSCVNet, which combines traditional methods and neural networks to improve the quality of the cost volume. Concretely, our network first generates multiple 3D cost volumes at different resolutions and then uses 2D convolutions to construct a novel cascade hourglass network for cost aggregation. In addition, we design an algorithm to distinguish discontinuous areas of the disparity result and calculate the loss for them. According to the KITTI official website, our network is much faster than most top-performing methods (24 times faster than CSPN, 44 times faster than GANet, etc.). Meanwhile, compared to…
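To make the phrase "multiple 3D cost volumes with different resolutions" concrete, here is a minimal NumPy sketch. It assumes grayscale images, average-pool downsampling, and an absolute-difference matching cost; the abstract specifies none of these, and the function names and the `factors` parameter are hypothetical. Each volume has shape (D, H, W), so the disparity axis can act as the channel axis for 2D convolutions, which is one plausible reading of the aggregation step described above.

```python
import numpy as np

def downsample(img, factor):
    """Average-pool a 2D image by an integer factor, cropping any remainder."""
    h, w = img.shape
    h, w = h - h % factor, w - w % factor
    blocks = img[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

def cost_volume(left, right, max_disp):
    """Absolute-difference cost volume of shape (max_disp, H, W).

    cost[d, y, x] = |left[y, x] - right[y, x - d]|. Pixels with x < d have no
    counterpart in the right image and keep an infinite cost.
    """
    h, w = left.shape
    max_disp = min(max_disp, w)  # a disparity cannot exceed the image width
    cost = np.full((max_disp, h, w), np.inf)
    for d in range(max_disp):
        cost[d, :, d:] = np.abs(left[:, d:] - right[:, : w - d])
    return cost

def multi_scale_cost_volumes(left, right, max_disp, factors=(4, 2, 1)):
    """Build one cost volume per scale, coarse to fine.

    The disparity range shrinks with the resolution, since a shift of d pixels
    at full resolution is a shift of d / factor pixels after downsampling.
    """
    volumes = []
    for f in factors:
        l, r = downsample(left, f), downsample(right, f)
        volumes.append(cost_volume(l, r, max(1, max_disp // f)))
    return volumes

# Synthetic pair: the right view is the left view shifted by 4 pixels.
rng = np.random.default_rng(0)
left = rng.random((16, 32))
right = np.roll(left, -4, axis=1)
volumes = multi_scale_cost_volumes(left, right, max_disp=8)
disparity = volumes[-1].argmin(axis=0)  # winner-takes-all at full resolution
```

The coarse volumes are far smaller than the full-resolution one, which is the usual motivation for a coarse-to-fine cascade. The winner-takes-all step at the end stands in for the learned cascade hourglass aggregation, which this sketch does not attempt to reproduce.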
Taxonomy
Topics: Advanced Vision and Imaging · Robotics and Sensor-Based Localization · Advanced Image and Video Retrieval Techniques
