TL;DR
This paper introduces BiDAStereo, a video stereo matching framework built on bidirectional alignment of adjacent frames, together with a plugin stabilizer network, BiDAStabilizer, for temporally consistent disparity estimation. It is supported by new datasets and reports state-of-the-art results.
Contribution
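The paper proposes a bidirectional alignment mechanism for adjacent frames, a video processing framework built on it (BiDAStereo), and a plugin stabilizer network (BiDAStabilizer) compatible with general image-based methods, along with synthetic and real datasets of natural outdoor scenes.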
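To make the idea of bidirectional alignment concrete, here is a minimal toy sketch, not the paper's implementation. It assumes a single 1D image row per frame, integer per-pixel motion that is already known, and fusion by simple averaging; the function names `warp_1d` and `bidirectional_align` are hypothetical. The actual method presumably operates on learned features with estimated motion.

```python
from typing import List, Sequence


def warp_1d(frame: Sequence[float], flow: Sequence[int]) -> List[float]:
    """Backward-warp a 1D row: output[x] samples frame[x + flow[x]], clamped to the border."""
    width = len(frame)
    out = []
    for x in range(width):
        src = min(max(x + flow[x], 0), width - 1)
        out.append(frame[src])
    return out


def bidirectional_align(
    prev_row: Sequence[float],
    cur_row: Sequence[float],
    next_row: Sequence[float],
    flow_to_prev: Sequence[int],
    flow_to_next: Sequence[int],
) -> List[float]:
    """Align the previous and next frames to the current one, then average all three.

    Averaging is an illustrative assumption, not the fusion used in the paper.
    """
    prev_aligned = warp_1d(prev_row, flow_to_prev)
    next_aligned = warp_1d(next_row, flow_to_next)
    return [(p + c + n) / 3.0 for p, c, n in zip(prev_aligned, cur_row, next_aligned)]


# Toy sequence: a bright point moves one pixel to the right per frame.
prev_row = [0.0, 9.0, 0.0, 0.0, 0.0]
cur_row = [0.0, 0.0, 9.0, 0.0, 0.0]
next_row = [0.0, 0.0, 0.0, 9.0, 0.0]

# For each pixel of the current frame, where to look in the neighbouring frames.
flow_to_prev = [-1] * 5
flow_to_next = [1] * 5

fused = bidirectional_align(prev_row, cur_row, next_row, flow_to_prev, flow_to_next)
naive = [(p + c + n) / 3.0 for p, c, n in zip(prev_row, cur_row, next_row)]

print("aligned fusion:", fused)
print("naive average: ", naive)
```

In this toy case, fusing after alignment keeps the point sharp at its current position, while averaging the unaligned frames smears it across three pixels. That is the intuition for aligning neighbours in both temporal directions before combining them.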
Findings
Achieves state-of-the-art results on multiple benchmarks.
Demonstrates improved prediction quality and robustness.
Provides new datasets of natural outdoor scenes.
Abstract
Video stereo matching is the task of estimating consistent disparity maps from rectified stereo videos. There is considerable scope for improvement in both datasets and methods within this area. Recent learning-based methods often focus on optimizing performance for independent stereo pairs, leading to temporal inconsistencies in videos. Existing video methods typically employ a sliding-window operation over the time dimension, which can result in low-frequency oscillations corresponding to the window size. To address these challenges, we propose a bidirectional alignment mechanism for adjacent frames as a fundamental operation. Building on this, we introduce a novel video processing framework, BiDAStereo, and a plugin stabilizer network, BiDAStabilizer, compatible with general image-based methods. Regarding datasets, current synthetic object-based and indoor datasets are commonly used for…
