Cross-Scale Pretraining: Enhancing Self-Supervised Learning for Low-Resolution Satellite Imagery for Semantic Segmentation
John Waithaka, Gustave Bwirayesu, Moise Busogi

TL;DR
This paper introduces a spatial affinity component for self-supervised pretraining that uses high-resolution (HR) satellite imagery to improve mid-resolution (MR) image representations and downstream segmentation performance.
Contribution
The authors propose a spatial affinity component that can be added to existing self-supervised learning frameworks. It incorporates high-resolution (HR) imagery during pretraining to improve segmentation of mid-resolution (MR) satellite images.
Findings
Models pretrained with the spatial affinity component outperform models pretrained on HR or MR images alone, in both self-supervised learning frameworks tested.
Including HR imagery in pretraining improves MR image representation learning.
The method improves downstream segmentation performance on MR tasks.
Abstract
Self-supervised pretraining in remote sensing is mostly done using mid-spatial resolution (MR) image datasets due to their high availability. Given the release of high-resolution (HR) datasets, we ask how HR datasets can be included in self-supervised pretraining to enhance MR image representation learning and downstream segmentation performance on MR tasks. We design a spatial affinity component that can be added to existing self-supervised learning frameworks and that uses HR imagery to learn better representations of MR imagery. We test the spatial affinity component on two self-supervised learning frameworks and show that it outperforms models pretrained on HR or MR images alone.
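The abstract does not say how the spatial affinity component is computed, so the following is only an illustrative sketch under a loud assumption: that the component acts as an extra loss term pulling the embedding of an MR patch toward the embedding of the HR patch covering the same ground footprint, added to whatever loss the host self-supervised framework already uses. The function names (`spatial_affinity_loss`, `total_loss`), the cosine-distance form, and the `weight` parameter are all hypothetical and are not taken from the paper.

```python
import numpy as np


def l2_normalize(x, eps=1e-8):
    """Scale each row of x to (approximately) unit L2 norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def spatial_affinity_loss(mr_emb, hr_emb):
    """Hypothetical affinity term (an assumption, not the paper's definition).

    mr_emb, hr_emb: arrays of shape (batch, dim). Row i of each array is
    assumed to embed the same geographic location at MR and HR resolution.
    Returns the mean cosine distance (1 - cosine similarity) over the batch.
    """
    mr = l2_normalize(np.asarray(mr_emb, dtype=float))
    hr = l2_normalize(np.asarray(hr_emb, dtype=float))
    cosine = np.sum(mr * hr, axis=-1)
    return float(np.mean(1.0 - cosine))


def total_loss(ssl_loss, mr_emb, hr_emb, weight=1.0):
    """Add the assumed affinity term to an existing framework's SSL loss.

    ssl_loss: scalar loss already produced by the host framework.
    weight:   hypothetical hyperparameter balancing the two terms.
    """
    return ssl_loss + weight * spatial_affinity_loss(mr_emb, hr_emb)
```

Under this assumed form, the term is near zero when co-located MR and HR embeddings point in the same direction and grows as they diverge, which is one simple way an add-on component could let HR imagery shape MR representations without changing the host framework's own objective.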
