Multimodal Object Detection via Probabilistic a priori Information Integration

Hafsa El Hafyani; Bastien Pasdeloup; Camille Yver; Pierre Romenteau

arXiv:2405.15596·cs.CV·May 27, 2024

TL;DR

This paper introduces a probabilistic approach to multimodal object detection in remote sensing, addressing alignment issues by converting contextual information into probability maps and validating the method with extensive experiments.

Contribution

It proposes a novel early fusion architecture that effectively handles low-quality, misaligned multimodal data by integrating probabilistic contextual information.

Findings

01 Effective handling of misaligned modalities

02 Improved detection accuracy on the DOTA dataset

03 Robustness to low-quality multimodal data

Abstract

Multimodal object detection has shown promise in remote sensing. However, multimodal data are frequently of low quality: the modalities lack strict cell-to-cell alignment, which leads to mismatches between them. In this paper, we investigate multimodal object detection where only one modality contains the target object and the others provide crucial contextual information. We propose to resolve the alignment problem by converting the contextual binary information into probability maps. We then propose an early fusion architecture that we validate with extensive experiments on the DOTA dataset.
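To make the two ideas in the abstract concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the abstract does not say how the binary contextual information is turned into probabilities, so Gaussian smoothing is used here purely as an assumed stand-in, and the function names (`binary_to_probability_map`, `early_fuse`), the `sigma` parameter, and the toy data are all hypothetical. The sketch softens a binary context mask so that pixels near a masked region still receive non-zero probability, which tolerates small misalignments, and then performs early fusion by stacking the map onto the image as an extra input channel.

```python
import numpy as np


def _blur_axis(a, kernel, axis):
    """Convolve `a` with a 1-D kernel along one axis, zero-padded to keep the size."""
    r = len(kernel) // 2
    pad = [(0, 0)] * a.ndim
    pad[axis] = (r, r)
    padded = np.pad(a, pad, mode="constant")
    return np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
    )


def binary_to_probability_map(mask, sigma=2.0):
    """Soften a binary context mask into values in [0, 1].

    Assumption: Gaussian smoothing stands in for whatever conversion
    the paper actually uses.
    """
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()  # normalise so the output stays within [0, 1]
    m = mask.astype(np.float64)
    blurred = _blur_axis(_blur_axis(m, kernel, axis=0), kernel, axis=1)
    return np.clip(blurred, 0.0, 1.0)


def early_fuse(image, prob_map):
    """Early fusion: append the probability map as an extra input channel."""
    if image.shape[:2] != prob_map.shape:
        raise ValueError("image and probability map must share height and width")
    return np.concatenate([image, prob_map[..., None]], axis=-1)


# Toy data (hypothetical): a 32x32 RGB image and a square context region.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
mask = np.zeros((32, 32), dtype=np.uint8)
mask[10:20, 10:20] = 1

prob = binary_to_probability_map(mask, sigma=2.0)
fused = early_fuse(image, prob)  # shape (32, 32, 4), ready for a 4-channel detector
```

In this sketch a pixel a few cells outside the masked square still carries some probability, while a pixel far from it carries none. A detector that consumes `fused` would need its first layer adapted to accept four input channels instead of three.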

Code & Models

Repositories

helhafyani/multimodal_fusion
PyTorch · Official

Taxonomy

Topics: Multi-Criteria Decision Making · Rough Sets and Fuzzy Logic