TL;DR
Cryo-Bench introduces a comprehensive benchmark for evaluating foundation models on Cryosphere Earth observation tasks, highlighting their performance, limitations, and domain adaptation capabilities.
Contribution
This work provides the first dedicated benchmark for Geo-Foundation Models (GFMs) in Cryosphere applications, including diverse datasets and evaluation protocols, and offers practical insights on model fine-tuning strategies.
Findings
UNet with frozen encoder achieves 66.38 mIoU
GFMs outperform UNet in few-shot settings
Hyperparameter tuning improves GFM performance by 12.77% on average
Abstract
Geo-Foundation Models (GFMs) have been evaluated on diverse Earth observation tasks spanning multiple domains and have shown strong potential for producing reliable maps even with sparse labels. However, benchmarking of GFMs for Cryosphere applications has remained limited, primarily due to the lack of suitable evaluation datasets. To address this gap, we introduce Cryo-Bench, a benchmark compiled to evaluate GFM performance across key Cryospheric components. Cryo-Bench includes debris-covered glaciers, glacial lakes, sea ice, and calving fronts, spanning multiple sensors and broad geographic regions. We evaluate 14 GFMs alongside UNet and ViT baselines to assess their advantages, limitations, and optimal usage strategies. With a frozen encoder, UNet achieves the highest average mIoU of 66.38, followed by TerraMind at 64.02 across five evaluation dataset…
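The scores above are reported as mIoU (mean Intersection-over-Union). As a rough illustration of how such a score is typically computed for a segmentation mask, here is a minimal pure-Python sketch. The function name `mean_iou`, the toy labels, the 0-100 scaling, and the choice to skip classes absent from both prediction and label are assumptions made for illustration; the benchmark's exact averaging convention is not stated in the text above.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean IoU (on a 0-100 scale) over classes present in pred or target.

    Hypothetical helper for illustration; not the benchmark's own code.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and label are class c
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or label is class c
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # assumption: ignore classes absent from both masks
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class present in pred or target")
    return 100.0 * sum(ious) / len(ious)


# Toy flattened masks with two classes (0 = background, 1 = target class)
pred = [0, 0, 1, 1, 1, 0]
target = [0, 1, 1, 1, 0, 0]
score = mean_iou(pred, target, num_classes=2)
print(f"mIoU: {score:.2f}")
```

In this toy case each class has an intersection of 2 pixels and a union of 4, so both per-class IoUs are 0.5 and the mean is 50.00. An average over several datasets, as in the figures quoted above, would then be a further mean of per-dataset scores under whatever weighting the benchmark uses.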
