UniRS: Unifying Multi-temporal Remote Sensing Tasks through Vision Language Models
Yujie Li, Wenjia Xu, Guangzuo Li, Zijian Yu, Zhiwei Wei, Jiuniu Wang, Mugen Peng

TL;DR
UniRS is a unified vision-language model that effectively handles various multi-temporal remote sensing tasks, such as image analysis, change detection, and video classification, by integrating diverse visual inputs within a single framework.
Contribution
This work introduces UniRS, the first model to unify multi-temporal remote sensing tasks across different visual input types using a shared visual representation and specialized modules.
Findings
Achieves state-of-the-art results on multiple remote sensing tasks
Supports diverse visual inputs including images, image pairs, and videos
Demonstrates high versatility and effectiveness in multi-temporal analysis
Abstract
The domain gap between remote sensing imagery and natural images has recently received widespread attention and Vision-Language Models (VLMs) have demonstrated excellent generalization performance in remote sensing multimodal tasks. However, current research is still limited in exploring how remote sensing VLMs handle different types of visual inputs. To bridge this gap, we introduce UniRS, the first vision-language model unifying multi-temporal remote sensing tasks across various types of visual input. UniRS supports single images, dual-time image pairs, and videos as input, enabling comprehensive remote sensing temporal analysis within a unified framework. We adopt a unified visual representation approach, enabling the model to accept various visual inputs. For dual-time image pair tasks, we customize a change extraction module to further enhance…
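One way to picture the unified input handling described in the abstract is to treat every input as a time-ordered sequence of frames: a single image is a sequence of length one, a dual-time pair has length two, and a video has more. The sketch below is only an illustration under that assumption, not the authors' implementation. `VisualInput` and `to_visual_input` are hypothetical names, frames are stood in for by file-path strings, and inferring the input type from the frame count is a simplification (a two-frame video would be labeled a pair).

```python
from dataclasses import dataclass
from typing import List, Sequence, Union


@dataclass
class VisualInput:
    """Hypothetical container: time-ordered frames plus an inferred input type."""
    frames: List[str]  # frame placeholders (file paths here, for illustration only)
    kind: str          # "image", "pair", or "video"


def to_visual_input(source: Union[str, Sequence[str]]) -> VisualInput:
    """Normalize a single frame or a sequence of frames into one representation."""
    # A bare string is one frame; any other sequence is taken as ordered frames.
    frames = [source] if isinstance(source, str) else list(source)
    if not frames:
        raise ValueError("at least one frame is required")

    # Assumption: the input type is inferred purely from the number of frames.
    if len(frames) == 1:
        kind = "image"
    elif len(frames) == 2:
        kind = "pair"
    else:
        kind = "video"
    return VisualInput(frames=frames, kind=kind)
```

With a shared container like this, downstream code could branch on `kind`, for example applying a pair-specific step only when `kind == "pair"`, while the rest of the pipeline stays identical for all three input types.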
Taxonomy
Topics: Geographic Information Systems Studies · Semantic Web and Ontologies
Methods: Softmax · Attention Is All You Need · ADaptive gradient method with the OPTimal convergence rate
