When Earth Foundation Models Meet Diffusion: An Application to Land Surface Temperature Super-Resolution
Yiheng Chen, Zihui Ma, Peishi Jiang, Yilong Dai, Qikai Hu, Xinyue Ye, Lingyao Li, Rita Sousa, Runlong Yu

TL;DR
This paper introduces EFDiff, a diffusion-based super-resolution framework guided by an Earth foundation model, which reconstructs fine-scale land surface temperature from highly degraded thermal data.
Contribution
The paper presents a novel Earth foundation model-guided diffusion framework for super-resolution, leveraging geospatial embeddings to enhance reconstruction quality in remote sensing.
Findings
EFDiff outperforms baseline super-resolution methods on a large global benchmark.
Conditioning on Earth foundation model embeddings via cross-attention is more effective than channel concatenation.
The framework applies more broadly to remote sensing problems for which pretrained geospatial representations are available.
Abstract
Land surface temperature (LST) super-resolution is important for environmental monitoring. However, it remains challenging as coarse thermal observations severely underdetermine fine-scale structure. In this paper, we propose Earth Foundation Model-guided Diffusion (EFDiff), a novel framework for super-resolution under extreme spatial degradation. EFDiff uses the Prithvi-EO-2.0 Earth foundation model to encode high-resolution multispectral reflectance into geospatial embeddings, which are injected into the denoising network via cross-attention to guide fine-scale reconstruction from highly degraded observations. We study two variants, EFDiff- and EFDiff-, which offer complementary trade-offs between perceptual realism and pixel-level fidelity. We evaluate EFDiff under an extreme scale gap using a globally diverse benchmark comprising 242,416 co-registered…
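To make the conditioning mechanism concrete, here is a minimal NumPy sketch of one single-head cross-attention step, in which flattened denoiser features act as queries and foundation-model embeddings supply keys and values. It is an illustration under stated assumptions, not the authors' implementation: the function name, the tensor sizes, the random projection matrices, and the residual connection are all hypothetical, and the random arrays stand in for real denoiser features and Prithvi-EO-2.0 embeddings.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def cross_attention(features, embeddings, w_q, w_k, w_v):
    """Single-head cross-attention (hypothetical helper).

    features:   (n_pixels, d_feat) flattened denoiser feature map (queries)
    embeddings: (n_tokens, d_emb)  geospatial embedding tokens (keys/values)
    """
    q = features @ w_q      # (n_pixels, d_attn)
    k = embeddings @ w_k    # (n_tokens, d_attn)
    v = embeddings @ w_v    # (n_tokens, d_feat)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each pixel's weights over tokens
    # Residual connection (an assumption): conditioning refines the features.
    return features + weights @ v, weights


# Toy sizes chosen for illustration only.
rng = np.random.default_rng(0)
n_pixels, n_tokens, d_feat, d_emb, d_attn = 16, 4, 8, 12, 8
features = rng.normal(size=(n_pixels, d_feat))
embeddings = rng.normal(size=(n_tokens, d_emb))
w_q = 0.1 * rng.normal(size=(d_feat, d_attn))
w_k = 0.1 * rng.normal(size=(d_emb, d_attn))
w_v = 0.1 * rng.normal(size=(d_emb, d_feat))

out, weights = cross_attention(features, embeddings, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

Unlike channel concatenation, which requires the conditioning signal to be resampled onto the feature map's spatial grid, this form lets every spatial location weight all embedding tokens, so the embedding does not need to share the feature map's resolution.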
