TL;DR
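This paper evaluates contrastive self-supervised learning for city classification on remote sensing imagery, using satellite and map imagery from 3 million locations in more than 1,500 cities. Self-supervised representations reach over 95% accuracy in unseen cities, but the domain discrepancy between natural and abstract imagery leaves a significant gap relative to supervised methods.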
Contribution
It introduces a large-scale benchmark for self-supervised remote sensing classification and compares self-supervised methods against supervised ones, revealing the challenges posed by the discrepancy between natural and abstract imagery.
Findings
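Self-supervised representations learned from as few as 200 cities achieve over 95% accuracy in unseen cities with minimal additional training.
A significant performance gap relative to supervised methods arises from the domain discrepancy between natural and abstract imagery.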
Models are open-sourced for community evaluation.
Abstract
Self-supervision-based deep learning classification approaches have received considerable attention in academic literature. However, the performance of such methods on remote sensing imagery domains remains under-explored. In this work, we explore contrastive representation learning methods on the task of imagery-based city classification, an important problem in urban computing. We use satellite and map imagery across 2 domains, 3 million locations and more than 1500 cities. We show that self-supervised methods can build a generalizable representation from as few as 200 cities, with representations achieving over 95% accuracy in unseen cities with minimal additional training. We also find that the performance discrepancy of such methods, when compared to supervised methods, induced by the domain discrepancy between natural imagery and abstract imagery is significant for remote sensing…
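To make "contrastive representation learning" concrete, here is a minimal sketch of one common contrastive objective, the NT-Xent loss used by SimCLR-style methods. The abstract does not say which contrastive objective the paper uses, so this choice is an assumption made purely for illustration. The function names, the temperature value, and the toy two-dimensional embeddings are all hypothetical.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length; fall back to 1.0 to avoid dividing by zero.
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]

def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss for a batch of paired embeddings (illustrative only).

    view_a[i] and view_b[i] are assumed to be embeddings of two augmented
    views of the same image tile; every other embedding in the batch is
    treated as a negative.
    """
    n = len(view_a)
    z = [l2_normalize(v) for v in view_a + view_b]  # 2n unit vectors
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same tile
        # Cosine similarity to every other embedding, scaled by temperature.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(2 * n) if j != i
        }
        # Log-sum-exp over all candidates, computed stably.
        m = max(logits.values())
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits.values()))
        total += log_denominator - logits[pos]
    return total / (2 * n)

# Toy batch: two tiles, two views each (hypothetical values).
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[1.0, 0.1], [0.1, 1.0]]           # views paired with the matching tile
view_b_shuffled = [[0.1, 1.0], [1.0, 0.1]]  # views paired with the wrong tile

aligned_loss = nt_xent_loss(view_a, view_b)
shuffled_loss = nt_xent_loss(view_a, view_b_shuffled)
```

The loss is lower when the two views of the same tile sit close together in embedding space than when they are paired with the wrong tile. That gap is the signal a contrastive encoder is trained on, without any city labels.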
