Unveiling the Depths: A Multi-Modal Fusion Framework for Challenging Scenarios

Jialei Xu; Xianming Liu; Junjun Jiang; Kui Jiang; Rui Li; Kai Cheng; Xiangyang Ji

arXiv:2402.11826 · cs.CV · February 20, 2024 · 1 citation

TL;DR

This paper introduces a multi-modal fusion framework that combines RGB and infrared data to improve monocular depth estimation in challenging environments like nighttime and adverse weather, leveraging confidence-guided fusion for robustness.

Contribution

The paper proposes a multi-modal depth estimation approach that computes a coarse depth map for each modality independently, predicts a confidence for each, and fuses the modalities end-to-end, improving accuracy in difficult scenarios.
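The idea of confidence-guided fusion can be illustrated with a minimal sketch. This is not the paper's method: the paper fuses modalities with a learned, end-to-end framework, whereas the sketch below simply takes a per-pixel weighted average of two coarse depth maps, with weights obtained from a two-way softmax over confidence scores. The function name `fuse_depth`, the use of nested lists in place of image tensors, and the softmax weighting are all assumptions made for illustration.

```python
import math


def fuse_depth(depth_rgb, depth_thermal, conf_rgb, conf_thermal):
    """Per-pixel confidence-weighted average of two coarse depth maps.

    Hypothetical helper, not taken from the paper. All four inputs are
    equally sized 2D lists of floats. Confidence values are unnormalized
    scores; a two-way softmax turns them into weights that sum to 1.
    """
    fused = []
    for d_r_row, d_t_row, c_r_row, c_t_row in zip(
        depth_rgb, depth_thermal, conf_rgb, conf_thermal
    ):
        row = []
        for d_r, d_t, c_r, c_t in zip(d_r_row, d_t_row, c_r_row, c_t_row):
            # Two-way softmax, written as a sigmoid of the score difference.
            w_rgb = 1.0 / (1.0 + math.exp(c_t - c_r))
            row.append(w_rgb * d_r + (1.0 - w_rgb) * d_t)
        fused.append(row)
    return fused


# Toy 1x2 "image": the RGB branch is trusted at the first pixel,
# the thermal branch at the second (for example, a dark region).
depth_rgb = [[10.0, 35.0]]
depth_thermal = [[12.0, 20.0]]
conf_rgb = [[4.0, -4.0]]
conf_thermal = [[-4.0, 4.0]]

fused = fuse_depth(depth_rgb, depth_thermal, conf_rgb, conf_thermal)
print(fused)
```

In this toy input the fused value at each pixel lands close to the depth from whichever branch has the higher confidence score, which is the behavior a confidence-guided scheme is meant to produce when one modality degrades.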

Findings

01. Effective depth estimation in challenging conditions

02. Robust performance on MS$^2$ and ViViD++ datasets

03. Outperforms single-modality methods

Abstract

Monocular depth estimation from RGB images plays a pivotal role in 3D vision. However, its accuracy can deteriorate in challenging environments such as nighttime or adverse weather conditions. While long-wave infrared cameras offer stable imaging in such conditions, they are inherently low-resolution and lack the rich texture and semantics that RGB images provide. Current methods focus solely on a single modality because of the difficulty of identifying and integrating faithful depth cues from both sources. To address these issues, this paper presents a novel approach that identifies and integrates dominant cross-modality depth features with a learning-based framework. Concretely, we independently compute the coarse depth maps with separate networks by fully utilizing the individual depth cues from each modality. As the advantageous depth spreads across both modalities, we…

Taxonomy

Topics: BIM and Construction Integration

Methods: Focus