SITS-DECO: A Generative Decoder Is All You Need For Multitask Satellite Image Time Series Modelling
Samuel J. Barrett, Docko Sow

TL;DR
SITS-DECO is a simple, generative, decoder-only model inspired by language models that performs multi-task satellite image time series analysis and outperforms larger models on crop-type classification without task-specific adaptation.
Contribution
The paper demonstrates that a unified, generative decoder-only architecture can handle diverse EO tasks through symbolic prompting, and it emphasizes data diversity over architectural complexity.
Findings
Outperforms larger models on crop-type classification
Can perform multiple tasks with symbolic prompts
Shows the importance of dense temporal sequence modeling
Abstract
Earth Observation (EO) Foundation Modelling (FM) holds great promise for simplifying and improving the use of EO data for diverse real-world tasks. However, most existing models require additional adaptation before they can be used and are structured rigidly around particular data sources or training approaches. To address this, we take inspiration from large language models, where diverse tasks, both pre-training and downstream, are implicitly captured through next-token prediction over unified token sequences, leveraging the structure and diversity of the training data. We introduce SITS-DECO (Satellite Image Time Series-DECoder Only), a proof-of-concept generative model that applies this unified-sequence framing to EO data. We use a simple GPT-style decoder-only architecture and demonstrate its ability to perform useful EO tasks (pixel-wise, multi-temporal, multi-modal crop-type…
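The unified-sequence framing described in the abstract can be illustrated with a minimal sketch. It is not the authors' tokenization: every token name (`<TASK_CROP>`, `<DAY_…>`, `<VAL_…>`, `<LABEL>`, `<CROP_…>`), the modality names, and both helper functions are hypothetical, chosen only to show how a symbolic task prompt, a pixel's observations, and a label could share one sequence that is trained with shifted next-token targets.

```python
# Illustrative sketch only. All token names and both helpers are assumptions,
# not the vocabulary or code used by SITS-DECO.

def build_sequence(task_token, observations, label=None):
    """Flatten one pixel's time series into a single token sequence.

    observations: list of (day_of_year, modality, value_bin) tuples.
    label: optional class name appended after a <LABEL> marker.
    """
    seq = ["<BOS>", task_token]  # the symbolic prompt selects the task
    for day, modality, value_bin in observations:
        seq += [f"<DAY_{day}>", f"<{modality}>", f"<VAL_{value_bin}>"]
    if label is not None:
        seq += ["<LABEL>", f"<CROP_{label}>"]
    seq.append("<EOS>")
    return seq


def shift_for_next_token(seq):
    """Return (inputs, targets) where targets[i] is the token after inputs[i]."""
    return seq[:-1], seq[1:]


if __name__ == "__main__":
    seq = build_sequence(
        "<TASK_CROP>",
        [(12, "optical", 3), (17, "radar", 5)],  # hypothetical modalities and bins
        label="wheat",
    )
    inputs, targets = shift_for_next_token(seq)
    for x, y in zip(inputs, targets):
        print(f"{x:>14} -> {y}")
```

Under this framing, classification would amount to generating the token that follows `<LABEL>`, so the same decoder and the same loss could in principle serve every task. Whether SITS-DECO encodes values, dates, or modalities this way is not stated in the text above.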
