Improving satellite imagery segmentation using multiple Sentinel-2 revisits
Kartik Jindgar, Grace W. Lindsay

TL;DR
This paper investigates how best to use multiple satellite revisits when fine-tuning pre-trained remote sensing models. It finds that fusing revisits in the latent space improves segmentation performance, and that SWIN Transformer architectures outperform U-Nets and ViT-based models.
Contribution
The paper shows that fusing representations from multiple revisits in the model's latent space outperforms the other multi-temporal input schemes tested, which informs how pre-trained models can be fine-tuned for satellite imagery analysis.
Findings
Latent space fusion of revisits yields better segmentation results than the other multi-temporal input schemes tested.
The SWIN Transformer architecture outperforms U-Nets and ViT models.
Results generalize to building density estimation.
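
To make "latent space fusion" concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the toy per-pixel encoder, the linear segmentation head, the random weights, and the choice of fusion operators (max, mean, median) are all assumptions made for illustration, and the names `encode`, `fuse_latents`, and `segment` are hypothetical. The only point it shows is the ordering: each revisit is encoded separately with shared weights, the resulting representations are combined along the revisit axis, and only then is the mask predicted.

```python
import numpy as np

def encode(image, weights):
    """Toy shared encoder (an assumption, standing in for a pre-trained backbone).
    image: (H, W, C), weights: (C, D) -> latent of shape (H, W, D)."""
    return np.maximum(image @ weights, 0.0)  # per-pixel linear map + ReLU

def fuse_latents(latents, mode="max"):
    """Combine per-revisit latents of shape (T, H, W, D) along the revisit axis."""
    if mode == "max":
        return latents.max(axis=0)
    if mode == "mean":
        return latents.mean(axis=0)
    if mode == "median":
        return np.median(latents, axis=0)
    raise ValueError(f"unknown fusion mode: {mode}")

def segment(revisits, enc_w, head_w, mode="max"):
    """revisits: (T, H, W, C). Encode each revisit with the same weights,
    fuse in latent space, then apply a per-pixel linear head."""
    latents = np.stack([encode(r, enc_w) for r in revisits])  # (T, H, W, D)
    fused = fuse_latents(latents, mode)                        # (H, W, D)
    logits = fused @ head_w                                    # (H, W)
    return 1.0 / (1.0 + np.exp(-logits))                       # per-pixel probability

# Illustrative random data: 4 revisits of an 8x8 tile with 13 spectral bands.
rng = np.random.default_rng(0)
revisits = rng.random((4, 8, 8, 13))
enc_w = rng.normal(size=(13, 16))   # hypothetical encoder weights
head_w = rng.normal(size=16)        # hypothetical segmentation head weights
mask_prob = segment(revisits, enc_w, head_w, mode="max")
print(mask_prob.shape)  # (8, 8)
```

An input-level alternative would instead composite the revisits (for example, a per-pixel median over time) before a single pass through the encoder; the sketch differs only in where the reduction over the revisit axis happens.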
Abstract
In recent years, analysis of remote sensing data has benefited immensely from borrowing techniques from the broader field of computer vision, such as the use of shared models pre-trained on large and diverse datasets. However, satellite imagery has unique features that are not accounted for in traditional computer vision, such as the existence of multiple revisits of the same location. Here, we explore the best way to use revisits in the framework of fine-tuning pre-trained remote sensing models. We focus on an applied research question of relevance to climate change mitigation -- power substation segmentation -- that is representative of applied uses of pre-trained models more generally. Through extensive tests of different multi-temporal input schemes across diverse model architectures, we find that fusing representations from multiple revisits in the model latent space is superior to…
Taxonomy
Topics: Satellite Image Processing and Photogrammetry
