Revisiting Stereo Depth Estimation From a Sequence-to-Sequence Perspective with Transformers

Zhaoshuo Li; Xingtong Liu; Nathan Drenkow; Andy Ding; Francis X. Creighton; Russell H. Taylor; Mathias Unberath

arXiv:2011.02910·cs.CV·August 27, 2021·5 cites

TL;DR

This paper introduces STTR, a transformer-based approach for stereo depth estimation that replaces traditional cost volume methods with dense pixel matching, enabling better occlusion handling, confidence estimation, and domain generalization.

Contribution

The paper proposes a novel sequence-to-sequence transformer model for stereo matching, relaxing disparity range limitations and improving occlusion and confidence estimation.

Findings

1. Achieves promising results on synthetic and real datasets.
2. Generalizes across domains without fine-tuning.
3. Outperforms traditional methods in occlusion detection.

Abstract

Stereo depth estimation relies on optimal correspondence matching between pixels on epipolar lines in the left and right images to infer depth. In this work, we revisit the problem from a sequence-to-sequence correspondence perspective to replace cost volume construction with dense pixel matching using position information and attention. This approach, named STereo TRansformer (STTR), has several advantages: It 1) relaxes the limitation of a fixed disparity range, 2) identifies occluded regions and provides confidence estimates, and 3) imposes uniqueness constraints during the matching process. We report promising results on both synthetic and real-world datasets and demonstrate that STTR generalizes across different domains, even without fine-tuning.
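
To make the sequence-to-sequence view concrete, the following is a minimal NumPy sketch of attention-based matching along a single rectified scanline. It is an illustration under stated assumptions, not the authors' implementation: the function name `match_scanline`, the plain dot-product attention, and the use of the peak attention weight as a confidence proxy are all hypothetical choices made here. The sketch also does not enforce the uniqueness constraint the abstract mentions, and it omits positional encoding.

```python
import numpy as np


def match_scanline(left_feats, right_feats):
    """Match pixels on one rectified scanline with dot-product attention.

    left_feats, right_feats: arrays of shape (W, C), one feature vector
    per pixel. Returns (disparity, confidence), each of shape (W,).
    Hypothetical helper for illustration only.
    """
    width, channels = left_feats.shape
    # Similarity between every left pixel and every right pixel on the line.
    scores = left_feats @ right_feats.T / np.sqrt(channels)
    # Row-wise softmax: each left pixel spreads attention over right pixels.
    scores = scores - scores.max(axis=1, keepdims=True)
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)
    # Hard assignment: the most-attended right pixel for each left pixel.
    match = attn.argmax(axis=1)
    # Disparity is the horizontal shift x_left - x_right. Because every
    # right pixel is a candidate, no maximum disparity is fixed in advance.
    disparity = np.arange(width) - match
    # Assumption: use the peak attention weight as a rough confidence score.
    confidence = attn.max(axis=1)
    return disparity, confidence
```

Because every pixel on the right scanline is a candidate match, the disparity range is bounded only by the image width rather than by a preset cost-volume depth. A low peak attention weight could serve as a crude signal that a pixel has no reliable match, for example in an occluded region, though the paper's own occlusion and confidence mechanism may differ.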

Code & Models

Repositories

mli0603/stereo-transformer (PyTorch, official)

Taxonomy

Topics: Advanced Vision and Imaging · Image Processing Techniques and Applications · Advanced Image Processing Techniques