Multimodal Object Detection via Probabilistic a priori Information Integration
Hafsa El Hafyani, Bastien Pasdeloup, Camille Yver, Pierre Romenteau

TL;DR
This paper introduces a probabilistic approach to multimodal object detection in remote sensing, addressing alignment issues by converting contextual information into probability maps and validating the method with extensive experiments.
Contribution
The paper proposes a novel early fusion architecture that handles low-quality, misaligned multimodal data by integrating probabilistic contextual information.
Findings
Effective handling of misaligned modalities
Improved detection accuracy on the DOTA dataset
Robustness to low-quality multimodal data
Abstract
Multimodal object detection has shown promise in remote sensing. However, multimodal data are frequently of low quality: the modalities lack strict cell-to-cell alignment, which leads to mismatches between them. In this paper, we investigate multimodal object detection where only one modality contains the target object and the others provide crucial contextual information. We propose to resolve the alignment problem by converting the contextual binary information into probability maps. We then propose an early fusion architecture that we validate with extensive experiments on the DOTA dataset.
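The abstract does not say how the binary contextual information is converted into probability maps, nor how the early fusion is wired. As a minimal sketch only, the Python example below assumes Gaussian smoothing for the conversion and channel concatenation for the fusion. Both choices, along with the helper names `gaussian_kernel_1d`, `binary_to_probability_map`, and `early_fusion`, are illustrative assumptions and not the authors' method. The intuition it tries to convey is that a softened mask degrades gradually under small spatial shifts, whereas a hard binary mask flips abruptly at its boundary.

```python
import numpy as np


def gaussian_kernel_1d(sigma: float) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel truncated at 3 sigma."""
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def binary_to_probability_map(mask: np.ndarray, sigma: float = 2.0) -> np.ndarray:
    """Soften a 2-D binary context mask into values in [0, 1].

    Assumption: Gaussian smoothing stands in for the paper's conversion,
    which the abstract does not specify.
    """
    kernel = gaussian_kernel_1d(sigma)
    if min(mask.shape) < kernel.size:
        raise ValueError("mask is smaller than the smoothing kernel")
    blurred = mask.astype(np.float64)
    # Separable blur: filter along one axis, then along the other.
    for axis in (0, 1):
        blurred = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="same"), axis, blurred
        )
    return np.clip(blurred, 0.0, 1.0)


def early_fusion(image: np.ndarray, prob_map: np.ndarray) -> np.ndarray:
    """Append the probability map to an (H, W, C) image as an extra channel.

    Assumption: early fusion is modeled here as plain channel concatenation.
    """
    return np.concatenate([image, prob_map[..., None]], axis=-1)


# Toy data: a random 3-channel image and a square binary context mask.
rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))
mask = np.zeros((32, 32))
mask[10:20, 10:20] = 1

prob = binary_to_probability_map(mask)
fused = early_fusion(image, prob)
print(fused.shape)  # (32, 32, 4)
```

The fused array could then be fed to any detector whose first layer accepts four input channels. The value of `sigma` controls how much misalignment the soft map tolerates and would need tuning per dataset.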
Taxonomy
Topics: Multi-Criteria Decision Making · Rough Sets and Fuzzy Logic
