Multi-Spectral Remote Sensing Image Retrieval Using Geospatial Foundation Models
Benedikt Blumenstiel, Viktoria Moor, Romeo Kienzler, Thomas Brunschwiler

TL;DR
This paper demonstrates that Geospatial Foundation Models such as Prithvi can retrieve multi-spectral satellite images more accurately than RGB-based models, and that compressing the embeddings keeps retrieval efficient for large-scale remote sensing data.
Contribution
The paper applies Prithvi to multi-spectral remote sensing image retrieval without additional fine-tuning, and shows both strong retrieval performance and effective compression of the embeddings.
Findings
Prithvi achieves 97.62% mAP on BigEarthNet-43.
Prithvi outperforms RGB-based models in retrieval accuracy.
Binarized embeddings offer 32-fold compression with maintained accuracy.
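The 32-fold figure follows from storing one bit per embedding dimension instead of a 32-bit float. The following is a minimal sketch of that arithmetic, not the authors' implementation: the zero threshold, the 768-dimensional embedding size, the random data, and the function names `binarize` and `hamming_distances` are all assumptions made for illustration.

```python
import numpy as np


def binarize(embeddings: np.ndarray) -> np.ndarray:
    """Threshold each dimension at zero (assumed rule) and pack 8 bits per byte."""
    bits = (embeddings > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)


def hamming_distances(query_code: np.ndarray, db_codes: np.ndarray) -> np.ndarray:
    """Count differing bits between one packed query code and every database code."""
    xor = np.bitwise_xor(db_codes, query_code)
    return np.unpackbits(xor, axis=1).sum(axis=1)


# Hypothetical database: 1000 float32 embeddings with 768 dimensions (assumed size).
rng = np.random.default_rng(0)
db = rng.standard_normal((1000, 768)).astype(np.float32)

codes = binarize(db)                     # shape (1000, 96), dtype uint8
compression = db.nbytes / codes.nbytes   # 32 bits per value reduced to 1 bit

# Use the first database item as the query; it should rank itself first.
query_code = binarize(db[:1])[0]
ranking = np.argsort(hamming_distances(query_code, codes), kind="stable")
```

Ranking by Hamming distance only needs XOR and bit counts, which is why binary codes can be searched quickly even in large archives.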
Abstract
Image retrieval enables an efficient search through vast amounts of satellite imagery and returns similar images to a query. Deep learning models can identify images across various semantic concepts without the need for annotations. This work proposes to use Geospatial Foundation Models, like Prithvi, for remote sensing image retrieval with multiple benefits: i) the models encode multi-spectral satellite data and ii) generalize without further fine-tuning. We introduce two datasets to the retrieval task and observe a strong performance: Prithvi processes six bands and achieves a mean Average Precision of 97.62% on BigEarthNet-43 and 44.51% on ForestNet-12, outperforming other RGB-based models. Further, we evaluate three compression methods with binarized embeddings balancing retrieval speed and accuracy. They match the retrieval speed of much shorter hash codes while maintaining the…
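For readers unfamiliar with the metric, mean Average Precision can be sketched as follows. This is the common textbook definition under the assumption of binary relevance per retrieved item; the paper's exact evaluation protocol (for example any cutoff on the ranked list, or how multi-label matches are counted) is not stated in this summary, and the function names are hypothetical.

```python
import numpy as np


def average_precision(relevant) -> float:
    """Average of precision@k taken at every rank k that holds a relevant item."""
    relevant = np.asarray(relevant, dtype=bool)
    if not relevant.any():
        return 0.0  # convention assumed here: no relevant item retrieved gives AP 0
    hits = np.cumsum(relevant)                  # relevant items seen up to each rank
    ranks = np.arange(1, len(relevant) + 1)
    precision_at_k = hits / ranks
    return float(precision_at_k[relevant].mean())


def mean_average_precision(relevance_lists) -> float:
    """Mean of the per-query average precision values."""
    return float(np.mean([average_precision(r) for r in relevance_lists]))


# Two toy queries: each list marks whether the item at that rank is relevant.
toy_queries = [[1, 0, 1], [0, 1]]
toy_map = mean_average_precision(toy_queries)
```

In the toy example the first query scores (1 + 2/3) / 2 and the second scores 1/2, so the mean is 2/3.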
Taxonomy
Topics: Image Retrieval and Classification Techniques · Remote-Sensing Image Classification
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
