AANet: Adaptive Aggregation Network for Efficient Stereo Matching

Haofei Xu; Juyong Zhang

arXiv:2004.09548·cs.CV·April 22, 2020·30 cites

TL;DR

AANet introduces a lightweight, efficient stereo matching architecture that replaces costly 3D convolutions with novel cost aggregation modules, achieving faster inference and competitive accuracy on benchmark datasets.

Contribution

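The paper proposes an architecture built on two modules: a sparse-points-based intra-scale cost aggregation and a neural-network approximation of traditional cross-scale cost aggregation. Together they replace 3D convolutions, significantly reducing computation while keeping accuracy comparable.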
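To give a feel for what aggregating over sparse points can mean, here is a minimal pure-Python sketch, not the authors' implementation. It aggregates one 2D cost slice as a weighted sum over a few sampled positions per pixel. The function names, the hand-picked integer offsets, and the fixed weights are all illustrative assumptions; the official repository is the reference for how sampling positions are actually obtained.

```python
def aggregate_sparse(cost, offset_fn, weights):
    """Weighted sum of a few sampled points per pixel on one 2D cost slice.

    cost      : list of rows; cost[y][x] is the cost for one disparity candidate
    offset_fn : hypothetical callable (y, x) -> list of (dy, dx) integer offsets
    weights   : one weight per sampled point
    """
    h, w = len(cost), len(cost[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for (dy, dx), wk in zip(offset_fn(y, x), weights):
                # Clamp so every sample stays inside the slice.
                sy = min(max(y + dy, 0), h - 1)
                sx = min(max(x + dx, 0), w - 1)
                total += wk * cost[sy][sx]
            out[y][x] = total
    return out


# Toy slice with a vertical discontinuity between x=1 and x=2.
cost = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
weights = [0.25, 0.5, 0.25]


def regular_offsets(y, x):
    # Fixed horizontal window: samples straddle the discontinuity.
    return [(0, -1), (0, 0), (0, 1)]


def edge_aware_offsets(y, x):
    # Assumption: offsets hand-picked to stay on the pixel's own side.
    side = -1 if x <= 1 else 1
    return [(0, side), (0, 0), (1, 0)]


blurred = aggregate_sparse(cost, regular_offsets, weights)
sharp = aggregate_sparse(cost, edge_aware_offsets, weights)
print(blurred[0])
print(sharp[0])
```

In this toy case the fixed window mixes costs from both sides of the discontinuity, while the position-dependent samples keep the step intact, which is the intuition behind reducing edge-fattening.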

Findings

01 Speeds up existing models by up to 41 times

02 Achieves competitive results on the Scene Flow and KITTI datasets

03 Runs at 62 ms inference time

Abstract

Despite the remarkable progress made by learning-based stereo matching algorithms, one key challenge remains unsolved. Current state-of-the-art stereo models are mostly based on costly 3D convolutions, whose cubic computational complexity and high memory consumption make them quite expensive to deploy in real-world applications. In this paper, we aim at completely replacing the commonly used 3D convolutions to achieve fast inference speed while maintaining comparable accuracy. To this end, we first propose a sparse-points-based intra-scale cost aggregation method to alleviate the well-known edge-fattening issue at disparity discontinuities. Further, we approximate the traditional cross-scale cost aggregation algorithm with neural network layers to handle large textureless regions. Both modules are simple, lightweight, and complementary, leading to an effective and efficient architecture for…
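The cost gap between 3D and 2D convolutions can be made concrete with a rough multiply-accumulate (MAC) count. The sketch below compares one 3D convolution over a feature-carrying cost volume with one 2D convolution that treats the disparity candidates as channels. All sizes, and the assumption that the 2D alternative uses disparities as channels, are illustrative choices rather than figures from the paper.

```python
def conv3d_macs(depth, height, width, c_in, c_out, k=3):
    """MACs for one k*k*k 3D convolution (stride 1, 'same' padding)."""
    return depth * height * width * c_in * c_out * k ** 3


def conv2d_macs(height, width, c_in, c_out, k=3):
    """MACs for one k*k 2D convolution (stride 1, 'same' padding)."""
    return height * width * c_in * c_out * k ** 2


# Illustrative sizes (assumptions, not taken from the paper).
D, H, W = 48, 96, 312   # disparity candidates and spatial size
FEAT = 32               # feature channels carried by a 4D cost volume

# 3D convolution over a FEAT x D x H x W volume.
macs_3d = conv3d_macs(D, H, W, FEAT, FEAT)
# 2D convolution over a D x H x W volume, disparities used as channels.
macs_2d = conv2d_macs(H, W, D, D)

ratio = macs_3d / macs_2d
print(f"3D: {macs_3d:,} MACs, 2D: {macs_2d:,} MACs, ratio: {ratio:.0f}x")
```

The ratio here depends only on the chosen channel and disparity counts, so it should be read as an order-of-magnitude illustration for a single layer, not as a prediction of end-to-end speedup.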

Code & Models

Repositories

haofeixu/aanet
PyTorch · Official

Datasets

shriarul5273/AANet
dataset · 23 downloads

Videos

AANet: Adaptive Aggregation Network for Efficient Stereo Matching · YouTube

Taxonomy

Topics: Advanced Vision and Imaging · Image Enhancement Techniques · Advanced Image and Video Retrieval Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings