CrossEarth: Geospatial Vision Foundation Model for Domain Generalizable Remote Sensing Semantic Segmentation

Ziyang Gong; Zhixiang Wei; Di Wang; Xiaoxing Hu; Xianzheng Ma; Hongruixuan Chen; Yuru Jia; Yupeng Deng; Zhenming Ji; Xiangwei Zhu; Xue Yang; Naoto Yokoya; Jing Zhang; Bo Du; Junchi Yan; Liangpei Zhang

arXiv:2410.22629 · cs.CV · September 24, 2025 · 2 cites

TL;DR

CrossEarth is the first geospatial vision foundation model designed for remote sensing domain generalization (RSDG) in semantic segmentation. It improves cross-domain performance through dedicated data and model pipelines and is evaluated on a new RSDG benchmark.

Contribution

The paper introduces CrossEarth, a novel foundation model for RSDG semantic segmentation, with new data and model pipelines for improved cross-domain generalization.

Findings

01. CrossEarth outperforms existing methods on a new RSDG benchmark.

02. The Earth-Style Injection pipeline enhances domain robustness.

03. Multi-Task Training improves segmentation accuracy across diverse scenarios.

Abstract

The field of Remote Sensing Domain Generalization (RSDG) has emerged as a critical and valuable research frontier, focusing on developing models that generalize effectively across diverse scenarios. Despite the substantial domain gaps in RS images, which stem from variations in location, wavelength, and sensor type, this area remains underexplored: (1) current cross-domain methods primarily focus on Domain Adaptation (DA), which adapts models to predefined domains rather than to unseen ones; (2) few studies target the RSDG problem, especially for semantic segmentation, where existing models are developed for specific unknown domains and underfit on other unknown scenarios; (3) existing RS foundation models tend to prioritize in-domain performance over cross-domain generalization. To this end, we introduce the first vision…

Code & Models

Repositories

cuzyoung/crossearth (PyTorch, official)

Taxonomy

Topics: Geographic Information Systems Studies

Methods: Focus