RS-WorldModel: a Unified Model for Remote Sensing Understanding and Future Sense Forecasting
Linrui Xu, Zhongan Wang, Fei Shen, Gang Xu, Huiping Zhuang, Ming Li, and Haifeng Li

TL;DR
RS-WorldModel is a 2B-parameter unified remote sensing model that jointly handles spatiotemporal change understanding and text-guided future scene forecasting. Trained on the 1.1 million sample RSWBench-1.1M dataset, it outperforms larger open-source models on spatiotemporal question answering.
Contribution
The paper introduces RS-WorldModel, a unified world model for remote sensing that combines spatiotemporal change understanding with text-guided future scene forecasting, and RSWBench-1.1M, a 1.1 million sample dataset with language annotations covering both tasks.
Findings
Outperforms larger open-source models in spatiotemporal question-answering.
Achieves an FID of 43.13 on text-guided scene forecasting.
Uses only 2B parameters, demonstrating efficiency.
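For readers unfamiliar with the FID score reported above, the following is a minimal sketch of the Fréchet distance that underlies it (lower is better). It assumes image features have already been extracted by some pretrained network. The summary does not say which feature extractor or evaluation protocol the authors used, so the function name `frechet_distance` and the random test features are purely illustrative.

```python
import numpy as np

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    Each input is a 2-D array with one row per image and at least two
    feature columns. The feature extractor is assumed, not specified here.
    """
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)
    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g, which are real and non-negative for
    # positive semi-definite inputs (clipped to guard against round-off).
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * tr_sqrt)

# Illustrative usage with synthetic features (not data from the paper).
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 4))
shifted = real + np.array([1.0, 0.0, 0.0, 0.0])
same_score = frechet_distance(real, real)        # close to 0
shifted_score = frechet_distance(real, shifted)  # close to 1, the squared mean shift
```

Because the two synthetic sets share the same covariance, the distance reduces to the squared difference of their means, which makes the sketch easy to sanity-check.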
Abstract
Remote sensing world models aim to both explain observed changes and forecast plausible futures, two tasks that share spatiotemporal priors. Existing methods, however, typically address them separately, limiting cross-task transfer. We present RS-WorldModel, a unified world model for remote sensing that jointly handles spatiotemporal change understanding and text-guided future scene forecasting, and we build RSWBench-1.1M, a 1.1 million sample dataset with rich language annotations covering both tasks. RS-WorldModel is trained in three stages: (1) Geo-Aware Generative Pre-training (GAGP) conditions forecasting on geographic and acquisition metadata; (2) synergistic instruction tuning (SIT) jointly trains understanding and forecasting; (3) verifiable reinforcement optimization (VRO) refines outputs with verifiable, task-specific rewards. With only 2B parameters, RS-WorldModel surpasses…
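The third training stage relies on rewards that can be checked automatically. The abstract does not say what those rewards are, so the following is only a hypothetical sketch of one common kind of verifiable reward: an exact-match check on a multiple-choice answer. The function name `choice_reward` and the "Answer: <letter>" output convention are assumptions made for illustration, not details from the paper.

```python
import re

def choice_reward(response: str, gold: str) -> float:
    """Return 1.0 if the first 'Answer: <letter>' tag in `response` matches `gold`.

    Hypothetical convention: the model states its choice as "Answer: B"
    or "Answer: (B)", with options limited to A through D.
    """
    match = re.search(r"Answer:\s*\(?([A-D])\b", response, flags=re.IGNORECASE)
    if match is None:
        return 0.0  # unparseable responses earn no reward
    return 1.0 if match.group(1).upper() == gold.strip().upper() else 0.0

# Illustrative usage with made-up responses.
correct = choice_reward("The built-up area expanded. Answer: B", "B")
wrong = choice_reward("Answer: A", "D")
unparseable = choice_reward("The vegetation probably declined.", "A")
```

A reward of this kind is verifiable in the sense that it depends only on a deterministic comparison with a reference answer, rather than on a learned judge.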
Taxonomy
Topics: Multimodal Machine Learning Applications · Geographic Information Systems Studies · Domain Adaptation and Few-Shot Learning
