Iterative Geometry Encoding Volume for Stereo Matching

Gangwei Xu; Xianqi Wang; Xiaohuan Ding; Xin Yang

arXiv:2303.06615 · cs.CV · March 15, 2023 · 1 citation

TL;DR

This paper introduces IGEV-Stereo, a novel deep network architecture that encodes geometry and context for improved stereo matching, achieving state-of-the-art accuracy and efficiency on benchmark datasets.

Contribution

The paper proposes IGEV-Stereo, a new architecture that builds a combined geometry encoding volume and iteratively updates disparity maps, improving accuracy and speed.

Findings

01

Ranks 1st on KITTI 2015 and KITTI 2012 (Reflective) among all published methods.

02

Fastest among the top 10 methods on KITTI benchmarks.

03

Demonstrates strong cross-dataset generalization and high inference efficiency.

Abstract

Recurrent All-Pairs Field Transforms (RAFT) has shown great potential in matching tasks. However, all-pairs correlations lack non-local geometry knowledge and have difficulty resolving local ambiguities in ill-posed regions. In this paper, we propose Iterative Geometry Encoding Volume (IGEV-Stereo), a new deep network architecture for stereo matching. IGEV-Stereo builds a combined geometry encoding volume that encodes geometry and context information as well as local matching details, and iteratively indexes it to update the disparity map. To speed up convergence, we exploit the geometry encoding volume (GEV) to regress an accurate starting point for the ConvGRU iterations. Our IGEV-Stereo ranks $1^{st}$ on KITTI 2015 and 2012 (Reflective) among all published methods and is the fastest among the top 10 methods. In addition, IGEV-Stereo has strong cross-dataset generalization as well as high inference…
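
The two ideas in the abstract (regress a starting disparity from the volume, then repeatedly index the volume around the current estimate) can be illustrated for a single pixel. This is a minimal sketch, not the authors' implementation. It assumes that the starting point is obtained with a soft-argmin over per-disparity matching scores, which is a common choice in cost-volume stereo networks but is not stated in the abstract. The function names (`soft_argmin_disparity`, `lookup_window`, `refine`) are hypothetical, and the hand-written update rule in `refine` only stands in for the learned ConvGRU.

```python
import math


def soft_argmin_disparity(scores):
    """Regress a sub-pixel disparity from per-candidate matching scores.

    scores[d] is the matching score of disparity d (higher means better).
    Assumption: a softmax-weighted average is used as the regressor.
    """
    m = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - m) for s in scores]
    return sum(d * w for d, w in enumerate(weights)) / sum(weights)


def lookup_window(scores, disparity, radius=2):
    """Linearly sample scores at disparity - radius .. disparity + radius."""
    last = len(scores) - 1
    samples = []
    for offset in range(-radius, radius + 1):
        pos = min(max(disparity + offset, 0.0), float(last))  # clamp to range
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        samples.append((1.0 - frac) * scores[lo] + frac * scores[hi])
    return samples


def refine(scores, init, iterations=8, radius=2, step=0.5):
    """Iteratively index the scores around the current estimate.

    Hypothetical stand-in for the learned update: shift the estimate
    toward the softmax-weighted offset inside the sampled window.
    """
    d = init
    last = float(len(scores) - 1)
    for _ in range(iterations):
        window = lookup_window(scores, d, radius)
        m = max(window)
        w = [math.exp(s - m) for s in window]
        shift = sum((i - radius) * wi for i, wi in enumerate(w)) / sum(w)
        d = min(max(d + step * shift, 0.0), last)
    return d


# Toy scores for one pixel, peaked at disparity 5.
scores = [-(d - 5) ** 2 for d in range(11)]
init = soft_argmin_disparity(scores)
refined_from_init = refine(scores, init)
refined_from_far = refine(scores, 3.0)
```

In this toy setting the regressed start already sits at the peak, so the refinement leaves it in place, while a start at disparity 3 needs several lookups to approach the peak. That is the motivation the abstract gives for regressing an accurate starting point before the recurrent updates.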

Code & Models

Repositories

gangweix/igev
PyTorch · Official

Models

🤗
shriarul5273/IGEV-Stereo
model

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Image and Video Retrieval Techniques · Robotics and Sensor-Based Localization

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings