Adaptive Multi-Modal Cross-Entropy Loss for Stereo Matching
Peng Xu, Zhiyu Xiang, Chenyu Qiao, Jingyun Fu, Tianyu Pu

TL;DR
This paper introduces an adaptive multi-modal cross-entropy loss for stereo matching that models the ground-truth disparity distribution of each pixel, improving disparity accuracy and reaching state-of-the-art results on benchmarks.
Contribution
The paper proposes a novel adaptive multi-modal cross-entropy loss that models per-pixel ground-truth distributions more faithfully than a uni-modal assumption and improves stereo network accuracy.
Findings
GANet trained with the proposed loss ranks 1st on KITTI benchmarks.
The proposed loss improves disparity map accuracy.
Generalization from synthetic to real data also improves.
Abstract
Despite the great success of deep learning in stereo matching, recovering accurate disparity maps is still challenging. Currently, L1 and cross-entropy are the two most widely used losses for stereo network training. Compared with the former, the latter usually performs better thanks to its probability modeling and direct supervision to the cost volume. However, how to accurately model the stereo ground-truth for cross-entropy loss remains largely under-explored. Existing works simply assume that the ground-truth distributions are uni-modal, which ignores the fact that most of the edge pixels can be multi-modal. In this paper, a novel adaptive multi-modal cross-entropy loss (ADL) is proposed to guide the networks to learn different distribution patterns for each pixel. Moreover, we optimize the disparity estimator to further alleviate the bleeding or misalignment artifacts in inference.…
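To make the uni-modal versus multi-modal distinction in the abstract concrete, here is a minimal NumPy sketch of a cross-entropy loss over the disparity bins of one pixel. It is an illustration under stated assumptions, not the authors' implementation: the Laplacian shape, the scale `b`, the mode positions, and the mixture weights are all hypothetical choices, and the paper's adaptive rule for deciding how many modes a pixel gets is not reproduced here.

```python
import numpy as np

def laplacian_target(center, num_disp, b=0.8):
    """Uni-modal target: discrete Laplacian over bins 0..num_disp-1.
    The scale b is an assumed value, not taken from the paper."""
    d = np.arange(num_disp, dtype=np.float64)
    p = np.exp(-np.abs(d - center) / b)
    return p / p.sum()

def multimodal_target(centers, weights, num_disp, b=0.8):
    """Multi-modal target: weighted mixture of Laplacians.
    Centers and weights are hypothetical inputs chosen by the caller."""
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()
    return sum(wi * laplacian_target(c, num_disp, b) for wi, c in zip(w, centers))

def cross_entropy(target, cost_logits):
    """Cross-entropy between a target distribution and softmax(cost_logits)."""
    z = cost_logits - cost_logits.max()      # shift for numerical stability
    log_p = z - np.log(np.exp(z).sum())      # log-softmax over disparity bins
    return float(-(target * log_p).sum())

# An edge pixel whose neighborhood contains two depths (disparities 20 and 45).
num_disp = 64
uni = laplacian_target(20.0, num_disp)
bi = multimodal_target([20.0, 45.0], [0.7, 0.3], num_disp)

# Loss of a two-peak prediction and of a one-peak prediction against the bi-modal target.
loss_two_peaks = cross_entropy(bi, np.log(bi))
loss_one_peak = cross_entropy(bi, np.log(uni))
print(loss_two_peaks, loss_one_peak)
```

With a bi-modal target, a prediction that keeps both peaks receives a lower loss than one that collapses onto a single peak, which is the behavior a multi-modal ground-truth model is meant to reward at edge pixels.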
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Advanced Image Processing Techniques
