DepthMamba with Adaptive Fusion

Zelin Meng; Zhichen Wang

arXiv:2412.19964·cs.CV·December 31, 2024

TL;DR

This paper introduces DepthMamba, a robust multi-view depth estimation method that adaptively fuses single-view and multi-view results using an attention mechanism, performing well under noisy pose conditions.

Contribution

It proposes a novel two-branch network with adaptive fusion and a new benchmark for evaluating depth estimation under pose noise.

Findings

01 Performs well on challenging scenes with dynamic objects and texture-less regions.

02 Achieves competitive results on KITTI and DDAD benchmarks.

03 Demonstrates robustness to noisy camera pose inputs.
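
To make "noisy camera pose inputs" concrete, the following is a minimal sketch of one way a pose could be perturbed before it is passed to a depth estimator. It is not the paper's benchmark protocol, which this summary does not specify: the Gaussian noise model, the function names `rotation_from_axis_angle` and `perturb_pose`, and the parameters `rot_sigma_deg` and `trans_sigma` are all assumptions made for illustration.

```python
import numpy as np

def rotation_from_axis_angle(rotvec):
    """Rodrigues' formula: axis-angle vector (radians) -> 3x3 rotation matrix."""
    angle = np.linalg.norm(rotvec)
    if angle < 1e-12:
        return np.eye(3)
    k = rotvec / angle
    # Skew-symmetric cross-product matrix of the unit axis k.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def perturb_pose(pose, rot_sigma_deg, trans_sigma, rng):
    """Return a copy of a 4x4 camera pose with Gaussian noise applied.

    Hypothetical noise model (an assumption, not taken from the paper):
    a random axis-angle rotation with per-axis std `rot_sigma_deg` degrees,
    and an additive translation offset with per-axis std `trans_sigma`.
    """
    noisy = pose.copy()
    rotvec = np.deg2rad(rng.normal(0.0, rot_sigma_deg, size=3))
    noisy[:3, :3] = rotation_from_axis_angle(rotvec) @ pose[:3, :3]
    noisy[:3, 3] = pose[:3, 3] + rng.normal(0.0, trans_sigma, size=3)
    return noisy

rng = np.random.default_rng(0)
pose = np.eye(4)  # placeholder ground-truth pose
noisy_pose = perturb_pose(pose, rot_sigma_deg=1.0, trans_sigma=0.05, rng=rng)
clean_pose = perturb_pose(pose, rot_sigma_deg=0.0, trans_sigma=0.0, rng=rng)
```

Sweeping `rot_sigma_deg` and `trans_sigma` over a range of values would give a family of increasingly difficult settings, which is the general idea behind evaluating robustness to pose noise.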

Abstract

Multi-view depth estimation has achieved impressive performance on various benchmarks. However, almost all current multi-view systems rely on being given ideal camera poses, which are unavailable in many real-world scenarios, such as autonomous driving. In this work, we propose a new robustness benchmark to evaluate depth estimation systems under various noisy pose settings. Surprisingly, we find that current multi-view depth estimation methods, as well as single-view and multi-view fusion methods, fail under noisy pose settings. To tackle this challenge, we propose a two-branch network architecture that fuses the depth estimation results of a single-view branch and a multi-view branch. Specifically, we introduce Mamba as the feature extraction backbone and propose an attention-based fusion method that adaptively selects the most robust estimation results between the two branches. Thus, the…
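
The idea of adaptively selecting between two branches can be illustrated with a small sketch. The summary does not describe the paper's attention-based fusion module in detail, so the code below swaps in a simpler, generic technique: a per-pixel softmax over two confidence maps, used to blend the two depth predictions. The function name `fuse_depths` and the confidence logits are hypothetical; in a real network the logits would be predicted from features rather than supplied by hand.

```python
import numpy as np

def fuse_depths(depth_single, depth_multi, logit_single, logit_multi):
    """Blend two depth maps with per-pixel softmax weights.

    Generic weighted fusion (an assumption, not the paper's module):
    each pixel gets two weights that sum to 1, derived from the
    confidence logits of the single-view and multi-view branches.
    """
    logits = np.stack([logit_single, logit_multi], axis=0)
    # Subtract the per-pixel max for numerical stability before exponentiating.
    logits = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=0, keepdims=True)
    fused = weights[0] * depth_single + weights[1] * depth_multi
    return fused, weights

# Toy 2x2 example with hand-picked values.
depth_single = np.array([[10.0, 12.0], [8.0, 20.0]])
depth_multi = np.array([[11.0, 30.0], [8.5, 19.0]])
# Equal confidence in the left column; strong preference for the
# single-view branch (top right) or multi-view branch (bottom right).
logit_single = np.array([[0.0, 50.0], [0.0, -50.0]])
logit_multi = np.zeros((2, 2))

fused, weights = fuse_depths(depth_single, depth_multi, logit_single, logit_multi)
```

With equal logits the fused depth is the average of the two branches; where one logit dominates, the output follows that branch almost exactly. A soft blend of this kind lets a model fall back on the single-view prediction wherever the multi-view estimate is unreliable, for example when the input pose is noisy.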

Taxonomy

Topics: Robotic Path Planning Algorithms · Modular Robots and Swarm Intelligence

Methods: Mamba: Linear-Time Sequence Modeling with Selective State Spaces