TemporalStereo: Efficient Spatial-Temporal Stereo Matching Network
Youmin Zhang, Matteo Poggi, Stefano Mattoccia

TL;DR
TemporalStereo is an efficient coarse-to-fine stereo matching network that uses spatial and temporal information to improve accuracy, particularly in occluded and reflective regions. It is validated on synthetic (SceneFlow, TartanAir) and real (KITTI 2012, KITTI 2015) datasets.
Contribution
It introduces a coarse-to-fine network, built on a sparse cost volume, that exploits spatio-temporal information across stereo sequences. Trained once on stereo videos, the same model runs in both single-pair and temporal modes.
Findings
Achieves state-of-the-art results on multiple datasets
Robust to dynamic objects in videos
Effective in handling occlusions and reflective regions
Abstract
We present TemporalStereo, a coarse-to-fine stereo matching network that is highly efficient, and able to effectively exploit the past geometry and context information to boost matching accuracy. Our network leverages sparse cost volume and proves to be effective when a single stereo pair is given. However, its peculiar ability to use spatio-temporal information across stereo sequences allows TemporalStereo to alleviate problems such as occlusions and reflective regions while enjoying high efficiency also in this latter case. Notably, our model -- trained once with stereo videos -- can run in both single-pair and temporal modes seamlessly. Experiments show that our network relying on camera motion is robust even to dynamic objects when running on videos. We validate TemporalStereo through extensive experiments on synthetic (SceneFlow, TartanAir) and real (KITTI 2012, KITTI 2015)…
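To make the "sparse cost volume" idea in the abstract concrete, here is a minimal NumPy sketch of coarse-to-fine refinement in general. It is not the authors' implementation. The function names, the L1 feature distance, and the integer offset set are all assumptions chosen for illustration. The point is only that matching costs are evaluated at a handful of candidate disparities around a coarse estimate rather than over the full disparity range.

```python
import numpy as np

def sparse_cost_volume(left_feat, right_feat, coarse_disp, offsets):
    """Matching costs at a few candidate disparities per pixel.

    left_feat, right_feat: (H, W, C) feature maps (hypothetical inputs).
    coarse_disp: (H, W) disparity estimate from a coarser level.
    offsets: 1-D integer array of offsets around the coarse estimate.
    Returns (candidates, costs), each of shape (H, W, K).
    """
    H, W, _ = left_feat.shape
    # Candidate disparities: rounded coarse estimate plus a small offset set.
    base = np.rint(coarse_disp).astype(int)[..., None]
    candidates = np.clip(base + offsets[None, None, :], 0, W - 1)
    xs = np.arange(W)[None, :, None]
    ys = np.arange(H)[:, None, None]
    # In a rectified pair, left pixel x corresponds to right pixel x - d.
    xr = np.clip(xs - candidates, 0, W - 1)
    right_samples = right_feat[ys, xr]  # (H, W, K, C)
    # L1 feature distance as the matching cost (an assumed choice).
    costs = np.abs(left_feat[:, :, None, :] - right_samples).sum(axis=-1)
    return candidates, costs

def refine_disparity(left_feat, right_feat, coarse_disp, offsets):
    """Pick, per pixel, the candidate disparity with the lowest cost."""
    candidates, costs = sparse_cost_volume(
        left_feat, right_feat, coarse_disp, offsets
    )
    best = costs.argmin(axis=-1)
    return np.take_along_axis(candidates, best[..., None], axis=-1)[..., 0]
```

With K candidates per pixel, the volume has H × W × K entries instead of H × W × D for a full disparity range D, which is where the efficiency of a sparse formulation comes from. A learned network would typically replace the hard `argmin` with a differentiable regression over the candidates.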
Taxonomy
Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Advanced Image Processing Techniques
