SenCLIP: Enhancing zero-shot land-use mapping for Sentinel-2 with ground-level prompting
Pallavi Jain, Dino Ienco, Roberto Interdonato, Tristan Berchoux, Diego Marcos

TL;DR
SenCLIP enhances zero-shot land-use mapping from Sentinel-2 satellite images by aligning ground-level and satellite representations, significantly improving classification accuracy using free-form prompts.
Contribution
Introduces SenCLIP, a novel method transferring CLIP's representations to satellite imagery through paired ground-level photos, enabling effective zero-shot land-use classification.
Findings
Significantly higher zero-shot classification accuracy than existing models.
Effective use of free-form textual prompts for satellite image classification.
Successful alignment of ground-level and satellite representations.
Abstract
Pre-trained vision-language models (VLMs), such as CLIP, demonstrate impressive zero-shot classification capabilities with free-form prompts and even show some generalization in specialized domains. However, their performance on satellite imagery is limited because such data is underrepresented in their training sets, which consist predominantly of ground-level images. Existing prompting techniques for satellite imagery are often restricted to generic phrases like "a satellite image of ...", limiting their effectiveness for zero-shot land-use and land-cover (LULC) mapping. To address these challenges, we introduce SenCLIP, which transfers CLIP's representation to Sentinel-2 imagery by leveraging a large dataset of Sentinel-2 images paired with geotagged ground-level photos from across Europe. We evaluate SenCLIP alongside other SOTA remote sensing VLMs on zero-shot LULC mapping tasks…
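To make the zero-shot step in the abstract concrete, here is a minimal sketch of how a CLIP-style model typically turns free-form prompts into a class prediction: each class gets one or more prompt embeddings, these are averaged, and the image is assigned to the class whose averaged embedding has the highest cosine similarity. This is not SenCLIP's code. The function names, the three-dimensional toy vectors, and the temperature value (chosen to resemble the scale CLIP-style models commonly use) are all assumptions for illustration; in practice the embeddings would come from the model's image and text encoders.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))

def softmax(scores, temperature=0.01):
    """Temperature-scaled softmax (temperature is an assumed value)."""
    m = max(scores)
    exps = [math.exp((s - m) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_classify(image_emb, class_prompt_embs):
    """Pick the class whose averaged prompt embedding best matches the image.

    class_prompt_embs maps a class name to a list of prompt embeddings,
    one per free-form prompt written for that class.
    """
    names = list(class_prompt_embs)
    scores = []
    for name in names:
        embs = [normalize(e) for e in class_prompt_embs[name]]
        dim = len(embs[0])
        # Average the unit-length prompt embeddings into one class vector.
        mean = [sum(e[i] for e in embs) / len(embs) for i in range(dim)]
        scores.append(cosine(image_emb, mean))
    probs = softmax(scores)
    best = max(range(len(names)), key=lambda i: probs[i])
    return names[best], dict(zip(names, probs))

# Toy stand-ins for encoder outputs (hypothetical values, not real embeddings).
prompt_embs = {
    "forest": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "cropland": [[0.1, 0.9, 0.1], [0.2, 0.8, 0.0]],
    "water": [[0.0, 0.1, 0.9]],
}
image_emb = [0.7, 0.3, 0.1]

label, probs = zero_shot_classify(image_emb, prompt_embs)
print(label)
```

Averaging several prompts per class is one common way to use richer descriptions than a single generic phrase; whether SenCLIP combines prompts this way is not stated in the text above.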
Taxonomy
Topics: Satellite Image Processing and Photogrammetry · Advanced Computational Techniques and Applications · Remote Sensing and LiDAR Applications
Methods: Contrastive Language-Image Pre-training
