Deep Multimodal Fusion for Semantic Segmentation of Remote Sensing Earth Observation Data
Ivica Dimitrovski, Vlatko Spasev, Ivan Kitanovski

TL;DR
This paper introduces a late fusion deep learning model that combines high-resolution aerial imagery and satellite time series data to significantly improve semantic segmentation accuracy in remote sensing applications.
Contribution
It presents a novel dual-branch deep learning framework that effectively fuses spatial and temporal data sources for enhanced land cover segmentation.
Findings
Achieves state-of-the-art results on the FLAIR dataset.
Demonstrates the effectiveness of multi-modality fusion in remote sensing.
Improves segmentation robustness and accuracy.
Abstract
Accurate semantic segmentation of remote sensing imagery is critical for various Earth observation applications, such as land cover mapping, urban planning, and environmental monitoring. However, individual data sources often present limitations for this task. Very High Resolution (VHR) aerial imagery provides rich spatial details but cannot capture temporal information about land cover changes. Conversely, Satellite Image Time Series (SITS) capture temporal dynamics, such as seasonal variations in vegetation, but with limited spatial resolution, making it difficult to distinguish fine-scale objects. This paper proposes a late fusion deep learning model (LF-DLM) for semantic segmentation that leverages the complementary strengths of both VHR aerial imagery and SITS. The proposed model consists of two independent deep learning branches. One branch integrates detailed textures from aerial…
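To make the term "late fusion" concrete, here is a minimal sketch of combining the outputs of two independent branches only at the prediction stage. It is not the authors' LF-DLM. The abstract excerpt above does not state the fusion operator, so the choices here are assumptions: averaging class probabilities over time for the SITS branch, nearest-neighbour upsampling to the aerial grid, and a weighted sum controlled by a hypothetical `weight` parameter. The function names are hypothetical as well, and random arrays stand in for the branch outputs.

```python
import numpy as np


def softmax(logits, axis):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def upsample_nearest(class_map, factor):
    """Nearest-neighbour upsampling: (C, h, w) -> (C, h*factor, w*factor)."""
    return class_map.repeat(factor, axis=1).repeat(factor, axis=2)


def late_fusion(aerial_logits, sits_logits, weight=0.5):
    """Fuse per-class scores from two independent branches (assumed scheme).

    aerial_logits: (C, H, W) scores from the high-resolution aerial branch.
    sits_logits:   (T, C, h, w) scores from the time-series branch, one map
                   per acquisition date, with H and W integer multiples of
                   h and w by the same factor.
    weight:        share given to the aerial branch (assumption, not from
                   the paper).
    Returns an (H, W) array of predicted class indices.
    """
    factor = aerial_logits.shape[1] // sits_logits.shape[2]
    aerial_probs = softmax(aerial_logits, axis=0)
    # Assumption: collapse the temporal axis by averaging probabilities.
    sits_probs = softmax(sits_logits, axis=1).mean(axis=0)
    sits_probs = upsample_nearest(sits_probs, factor)
    fused = weight * aerial_probs + (1.0 - weight) * sits_probs
    return fused.argmax(axis=0)


# Toy inputs: 3 classes, an 8x8 aerial grid, 4 dates on a 2x2 satellite grid.
rng = np.random.default_rng(0)
aerial = rng.normal(size=(3, 8, 8))
sits = rng.normal(size=(4, 3, 2, 2))
labels = late_fusion(aerial, sits)
print(labels.shape)
```

Setting `weight=1.0` reduces the sketch to the aerial branch alone, and `weight=0.0` to the upsampled time-series branch alone, which gives a quick way to see what each source contributes on its own.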
Taxonomy
Topics: Advanced Computational Techniques and Applications
Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Layer Normalization · Dense Connections · Concatenated Skip Connection · Adam · Residual Connection · Position-Wise Feed-Forward Layer
