TL;DR
This paper introduces MultiModNet, a multi-modal remote sensing network that uses a pyramid attention fusion (PAF) module and a gated fusion unit (GFU) to improve land cover mapping by integrating diverse data sources.
Contribution
The paper proposes a new multi-modality network (MultiModNet) built on a pyramid attention fusion (PAF) module and a gated fusion unit (GFU), which together improve the integration of multi-modal remote sensing data for land cover classification.
Findings
Outperforms existing methods on benchmark datasets
Demonstrates robustness and effectiveness in multi-modal land cover mapping
Provides scalable fusion techniques for diverse remote sensing modalities
Abstract
Multi-modality data is becoming readily available in remote sensing (RS) and can provide complementary information about the Earth's surface. Effective fusion of multi-modal information is thus important for various applications in RS, but also very challenging due to large domain differences, noise, and redundancies. There is a lack of effective and scalable fusion techniques for bridging multiple modality encoders and fully exploiting complementary information. To this end, we propose a new multi-modality network (MultiModNet) for land cover mapping of multi-modal remote sensing data based on a novel pyramid attention fusion (PAF) module and a gated fusion unit (GFU). The PAF module is designed to efficiently obtain rich fine-grained contextual representations from each modality with a built-in cross-level and cross-view attention fusion mechanism, and the GFU module utilizes a novel…
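The abstract is cut off before it describes how the GFU works, so the following is only a generic illustration of the idea of gated fusion, not the paper's GFU. It is a minimal sketch, assuming two already-aligned feature vectors of equal length and a per-element sigmoid gate with hypothetical scalar weights `w_a`, `w_b`, and `bias`; the function name `gated_fusion` and the sample values are likewise made up for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(feat_a, feat_b, w_a=1.0, w_b=-1.0, bias=0.0):
    """Blend two equal-length feature vectors with a per-element gate.

    Assumption: this is a textbook-style gate, not the paper's GFU.
    The gate is computed from both inputs and decides, per element,
    how much of feat_a versus feat_b passes into the fused output.
    """
    if len(feat_a) != len(feat_b):
        raise ValueError("feature vectors must have the same length")
    fused = []
    for a, b in zip(feat_a, feat_b):
        gate = sigmoid(w_a * a + w_b * b + bias)  # value in (0, 1)
        fused.append(gate * a + (1.0 - gate) * b)  # convex combination
    return fused


# Hypothetical features from two modalities (e.g. optical and elevation).
optical = [0.9, 0.1, 0.5]
elevation = [0.2, 0.8, 0.5]
fused = gated_fusion(optical, elevation)
```

Because the gate lies strictly between 0 and 1, every fused value falls between the two inputs it blends. In a trained network the gate weights would be learned rather than fixed.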
