Towards Scalable and Generalizable Earth Observation Data Mining via Foundation Model Composition
Man Duc Chuc

TL;DR
This paper explores combining pretrained foundation models from remote sensing and general vision to improve Earth Observation data mining. It demonstrates that ensembling smaller models can match the performance of larger models while using fewer computational resources.
Contribution
It investigates the effectiveness of reusing and combining existing pretrained models for Earth Observation tasks, highlighting feature ensembling and knowledge distillation as practical strategies.
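As an illustration of the feature ensembling mentioned above, the following minimal sketch fuses the embeddings of several frozen encoders by normalizing and concatenating them. It assumes concatenation as the fusion operator, which this summary does not confirm. The names `ensemble_features`, `toy_encoder_a`, and `toy_encoder_b` are hypothetical stand-ins and do not correspond to the actual Prithvi, Hiera, or DOFA interfaces.

```python
from typing import Callable, List, Sequence

Vector = List[float]
Encoder = Callable[[Sequence[float]], Vector]


def l2_normalize(v: Vector) -> Vector:
    """Scale a vector to unit length so no single encoder dominates the fusion."""
    norm = sum(x * x for x in v) ** 0.5
    return [x / norm for x in v] if norm > 0 else list(v)


def ensemble_features(image: Sequence[float], encoders: Sequence[Encoder]) -> Vector:
    """Concatenate the normalized embeddings of several frozen encoders.

    Assumption: fusion by concatenation; other operators (sum, attention)
    are equally plausible and are not ruled out by the source.
    """
    fused: Vector = []
    for encode in encoders:
        fused.extend(l2_normalize(encode(image)))
    return fused


# Hypothetical toy encoders standing in for pretrained foundation models.
def toy_encoder_a(image: Sequence[float]) -> Vector:
    return [sum(image), max(image)]


def toy_encoder_b(image: Sequence[float]) -> Vector:
    return [min(image), float(len(image)), image[0]]


# A flattened toy "image"; a real pipeline would pass image tensors instead.
features = ensemble_features([0.2, 0.5, 0.3], [toy_encoder_a, toy_encoder_b])
```

In a setup like this, only a lightweight task head would be trained on top of `features`, while the encoders stay frozen. This is one plausible reading of why ensembling smaller models could be cheaper than training a larger one.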
Findings
Feature ensembling of smaller models matches larger models' performance.
Ensembling requires less training time and fewer computational resources than using larger models.
Knowledge distillation creates compact models retaining ensemble strengths.
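The distillation finding above can be illustrated with a minimal sketch of a soft-target distillation loss, in which a compact student is trained to match the temperature-softened output distribution of a teacher (here, the ensemble). This assumes the classic logit-matching formulation with a KL divergence. The summary does not state which distillation objective the paper uses, and the function names and the `temperature` default are illustrative choices.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    temperature: float = 2.0,
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: soft-target distillation on class logits; feature-level
    distillation would use a different objective.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        pt * math.log(pt / ps)
        for pt, ps in zip(p_teacher, p_student)
        if pt > 0.0
    )
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return kl * temperature ** 2


# Toy logits: the teacher would be the ensemble, the student the compact model.
teacher = [2.0, 0.5, -1.0]
loss_matched = distillation_loss(teacher, teacher)
loss_mismatched = distillation_loss([0.0, 0.0, 0.0], teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. In practice it is usually combined with a standard supervised loss on the ground-truth labels.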
Abstract
Foundation models are rapidly transforming Earth Observation data mining by enabling generalizable and scalable solutions for key tasks such as scene classification and semantic segmentation. While most efforts in the geospatial domain have focused on developing large models trained from scratch using massive Earth Observation datasets, an alternative strategy that remains underexplored is the reuse and combination of existing pretrained models. In this study, we investigate whether foundation models pretrained on remote sensing and general vision datasets can be effectively combined to improve performance across a diverse set of key Earth Observation tasks. Using the GEO-Bench benchmark, we evaluate several prominent models, including Prithvi, Hiera, and DOFA, on eleven datasets covering a range of spatial resolutions, sensor modalities, and task types. The results show that…
Taxonomy
Topics: Advanced Computational Techniques and Applications · Data Management and Algorithms · Geographic Information Systems Studies
Methods: Knowledge Distillation · Sparse Evolutionary Training
