Supervising Remote Sensing Change Detection Models with 3D Surface Semantics
Isaac Corley, Peyman Najafirad

TL;DR
This paper introduces Contrastive Surface-Image Pretraining (CSIP), a contrastive pretraining method that pairs optical RGB images with above ground level (AGL) surface maps to improve remote sensing change detection and building segmentation.
Contribution
The paper presents a joint pretraining approach that uses paired optical and 3D surface data to improve feature extraction for remote sensing applications.
Findings
Pretrained models outperform baseline methods on change detection datasets.
Surface semantics improve building segmentation accuracy.
Joint learning with 3D and 2D data benefits downstream tasks.
Abstract
Remote sensing change detection, identifying changes between scenes of the same location, is an active area of research with a broad range of applications. Recent advances in multimodal self-supervised pretraining have resulted in state-of-the-art methods which surpass vision models trained solely on optical imagery. In the remote sensing field, there is a wealth of overlapping 2D and 3D modalities which can be exploited to supervise representation learning in vision models. In this paper we propose Contrastive Surface-Image Pretraining (CSIP) for joint learning using optical RGB and above ground level (AGL) map pairs. We then evaluate these pretrained models on several building segmentation and change detection datasets to show that our method does, in fact, extract features relevant to downstream applications where natural and artificial surface information is relevant.
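The abstract describes joint contrastive learning over RGB and AGL map pairs but does not spell out the objective. As a rough illustration only, the sketch below assumes a CLIP-style symmetric InfoNCE loss, in which embeddings of an RGB tile and its co-located AGL map are pulled together while mismatched pairs in the batch are pushed apart. The function names, the temperature value, and the random stand-in embeddings are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.maximum(np.linalg.norm(x, axis=1, keepdims=True), eps)

def log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_contrastive_loss(image_emb, surface_emb, temperature=0.07):
    # Hypothetical helper: row i of image_emb and row i of surface_emb are
    # assumed to come from the same location (a positive pair).
    image_emb = l2_normalize(image_emb)
    surface_emb = l2_normalize(surface_emb)
    logits = image_emb @ surface_emb.T / temperature
    idx = np.arange(logits.shape[0])
    # Image-to-surface: each RGB embedding should pick out its own AGL map.
    loss_i2s = -log_softmax(logits, axis=1)[idx, idx].mean()
    # Surface-to-image: each AGL embedding should pick out its own RGB tile.
    loss_s2i = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_i2s + loss_s2i)

# Random vectors stand in for encoder outputs (assumption: 8 pairs, 32 dims).
rng = np.random.default_rng(0)
rgb = rng.normal(size=(8, 32))
agl_aligned = rgb + 0.01 * rng.normal(size=(8, 32))  # nearly matching pairs
agl_shuffled = agl_aligned[::-1]                     # deliberately mismatched

aligned_loss = symmetric_contrastive_loss(rgb, agl_aligned)
shuffled_loss = symmetric_contrastive_loss(rgb, agl_shuffled)
print(aligned_loss < shuffled_loss)
```

In this toy setup the loss is lower when each RGB embedding sits next to its own AGL embedding than when the pairing is scrambled, which is the behavior a contrastive pretraining objective of this kind is meant to reward.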
Taxonomy
Topics: Remote-Sensing Image Classification · Remote Sensing in Agriculture · Remote Sensing and LiDAR Applications
