Position Prediction Self-Supervised Learning for Multimodal Satellite Imagery Semantic Segmentation
John Waithaka, Moise Busogi

TL;DR
This paper introduces a position prediction self-supervised learning method tailored for multimodal satellite imagery, improving semantic segmentation performance over traditional reconstruction-based methods by emphasizing spatial reasoning and cross-modal interaction.
Contribution
It adapts LOCA for satellite data by extending SatMAE's channel grouping and introducing same-group attention masking, focusing on spatial localization rather than reconstruction.
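The summary does not spell out how same-group attention masking is built, so the following is only a minimal NumPy sketch of one plausible reading: each token carries a channel-group id, and attention between two tokens of the same group is blocked so that information has to flow across groups. The names `same_group_mask`, `masked_softmax`, and `group_ids` are hypothetical, and the group layout is made up for illustration.

```python
import numpy as np

def same_group_mask(group_ids):
    """Boolean matrix: entry (i, j) is True when token i may attend to token j.

    Assumption (not confirmed by the source): attention is blocked between
    tokens that share a channel group and allowed between different groups.
    """
    g = np.asarray(group_ids)
    return g[:, None] != g[None, :]

def masked_softmax(scores, allowed):
    """Row-wise softmax over attention scores with blocked entries zeroed out.

    Requires every row to have at least one allowed key, which holds
    whenever there are two or more groups.
    """
    masked = np.where(allowed, scores, -np.inf)
    masked = masked - masked.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(masked)
    return weights / weights.sum(axis=-1, keepdims=True)

# Hypothetical layout: six tokens, two per channel group.
group_ids = [0, 0, 1, 1, 2, 2]
allowed = same_group_mask(group_ids)
scores = np.zeros((6, 6))  # uniform scores, for illustration only
weights = masked_softmax(scores, allowed)
```

With uniform scores, each token spreads its attention evenly over the four tokens in the other two groups and places none on its own group.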
Findings
Outperforms existing self-supervised methods on the Sen1Floods11 dataset.
Encourages cross-modal interaction during pretraining.
Enhances spatial reasoning for semantic segmentation.
Abstract
Semantic segmentation of satellite imagery is crucial for Earth observation applications, but remains constrained by limited labelled training data. While self-supervised pretraining methods like Masked Autoencoders (MAE) have shown promise, they focus on reconstruction rather than localisation, a fundamental aspect of segmentation tasks. We propose adapting LOCA (Location-aware), a position prediction self-supervised learning method, for multimodal satellite imagery semantic segmentation. Our approach addresses the unique challenges of satellite data by extending SatMAE's channel grouping from multispectral to multimodal data, enabling effective handling of multiple modalities, and introducing same-group attention masking to encourage cross-modal interaction during pretraining. The method uses relative patch position prediction, encouraging spatial reasoning for localisation rather than…
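Relative patch position prediction can be framed as classification: each patch of a query crop is asked which cell of a reference view's patch grid it came from. The sketch below assumes that framing and is not the authors' implementation. The helper names `position_targets` and `cross_entropy`, the grid sizes, and the crop box are all hypothetical.

```python
import numpy as np

def position_targets(ref_grid, query_box, query_grid):
    """Class index of the reference patch under each query patch centre.

    ref_grid:   (H, W) patch counts of the reference view
    query_box:  (top, left, height, width) of the query crop, in reference patch units
    query_grid: (h, w) patch counts of the query view
    """
    H, W = ref_grid
    top, left, height, width = query_box
    h, w = query_grid
    # Centres of the query patches, expressed in reference patch coordinates.
    rows = top + (np.arange(h) + 0.5) * height / h
    cols = left + (np.arange(w) + 0.5) * width / w
    r = np.clip(np.floor(rows).astype(int), 0, H - 1)
    c = np.clip(np.floor(cols).astype(int), 0, W - 1)
    # Flatten (row, col) into a single class index over the H * W reference cells.
    return (r[:, None] * W + c[None, :]).reshape(-1)

def cross_entropy(logits, targets):
    """Mean cross-entropy of per-patch logits against integer position targets."""
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

# Hypothetical setup: 4x4 reference grid, 2x2 query crop taken from its centre.
targets = position_targets((4, 4), (1, 1, 2, 2), (2, 2))
logits = np.zeros((4, 16))  # an untrained head with uniform predictions
loss = cross_entropy(logits, targets)
```

Here the four query patches map to reference cells 5, 6, 9, and 10, and the uniform prediction gives a loss of log(16), the chance level for a 16-way position classifier.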
Taxonomy
Topics: Methane Hydrates and Related Phenomena · Advanced Image and Video Retrieval Techniques · Geochemistry and Geologic Mapping
