Mono2Stereo: Monocular Knowledge Transfer for Enhanced Stereo Matching
Yuran Wang, Yingping Liang, Hesong Li, Ying Fu

TL;DR
This paper introduces Mono2Stereo, a method that transfers knowledge from monocular depth estimation to stereo matching. It uses a two-stage training process (pre-training on synthesized stereo data, then fine-tuning on real-world data) combined with knowledge distillation, and reports better generalization and accuracy.
Contribution
The paper proposes a new monocular-to-stereo knowledge transfer framework with a synthetic data pipeline and a novel distillation strategy, enhancing stereo matching performance and generalization.
Findings
The pre-trained model shows strong zero-shot generalization.
Domain fine-tuning with S2DKD improves in-domain accuracy.
Synthetic data generation effectively bridges domain gaps.
Abstract
The generalization and performance of stereo matching networks are limited by the domain gap of existing synthetic datasets and the sparseness of ground-truth (GT) labels in real datasets. In contrast, monocular depth estimation has advanced significantly, benefiting from large-scale depth datasets and self-supervised strategies. To bridge the performance gap between monocular depth estimation and stereo matching, we propose Mono2Stereo, which leverages monocular knowledge transfer to enhance stereo matching. We introduce knowledge transfer with a two-stage training process, comprising synthetic data pre-training and real-world data fine-tuning. In the pre-training stage, we design a data generation pipeline that synthesizes stereo training data from monocular images. This pipeline utilizes monocular depth for warping and novel view synthesis and employs our proposed…
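The warping step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline, whose details are not given in this excerpt; it only shows the general idea of forward-warping a single image into a second view using disparity derived from monocular depth. The function name `synthesize_right_view`, the `max_disp` parameter, and the inverse-depth scaling are all assumptions made for illustration.

```python
import numpy as np

def synthesize_right_view(left, depth, max_disp=32.0):
    """Forward-warp a left image into a hypothetical right view (illustrative only).

    left:  (H, W, C) image array.
    depth: (H, W) positive relative depth, e.g. from a monocular depth model.
    Returns (right, disparity, holes), where holes marks unfilled target pixels.
    """
    h, w = depth.shape
    # Assumption: disparity is proportional to inverse depth, scaled so that
    # the nearest pixel shifts by max_disp pixels.
    inv = 1.0 / np.maximum(depth, 1e-6)
    disp = max_disp * inv / inv.max()

    right = np.zeros_like(left)
    best = np.full((h, w), -1.0)  # largest disparity written at each target pixel
    for y in range(h):
        for x in range(w):
            d = disp[y, x]
            xt = int(round(x - d))  # a left pixel at x lands at x - d in the right view
            # Nearer pixels (larger disparity) win when several land on one target.
            if 0 <= xt < w and d > best[y, xt]:
                right[y, xt] = left[y, x]
                best[y, xt] = d
    holes = best < 0  # disoccluded regions that would still need to be filled
    return right, disp, holes
```

The returned `disp` can serve as a dense disparity label for the synthesized pair, and `holes` marks the disoccluded pixels that a full pipeline would have to fill, for example by inpainting.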
Taxonomy
Topics: Image Processing Techniques and Applications · Advanced Vision and Imaging · Image and Video Stabilization
Methods: ALIGN · Knowledge Distillation · Inpainting
