BigEarthNet.txt: A Large-Scale Multi-Sensor Image-Text Dataset and Benchmark for Earth Observation

Johann-Ludwig Herzog, Mathis Jürgen Adler, Leonard Hackel, Yan Shu, Angelos Zavras, Ioannis Papoutsis, Paolo Rota, and Begüm Demir

arXiv:2603.29630 · cs.CV · April 2, 2026

TL;DR

BigEarthNet.txt is a comprehensive large-scale multi-sensor Earth observation dataset with diverse annotations, designed to improve vision-language models' performance on remote sensing tasks.

Contribution

The paper introduces a richly annotated multi-sensor image-text dataset for Earth observation, enabling instruction-driven learning and benchmarking for remote sensing applications.

Findings

01 BigEarthNet.txt surpasses existing datasets in textual richness and annotation diversity.

02 Fine-tuning models on BigEarthNet.txt improves performance across multiple tasks.

03 Benchmark results reveal current models' limitations on complex land-use/land-cover classes.

Abstract

Vision-language models (VLMs) have shown strong performance in computer vision (CV), yet their performance on remote sensing (RS) data remains limited due to the lack of large-scale, multi-sensor RS image-text datasets with diverse textual annotations. Existing datasets predominantly include aerial Red-Green-Blue imagery with short or weakly grounded captions, and provide limited diversity in annotation types. To address this limitation, we introduce BigEarthNet.txt, a large-scale, multi-sensor image-text dataset designed to advance instruction-driven image-text learning in Earth observation across multiple tasks. BigEarthNet.txt contains 464,044 co-registered Sentinel-1 synthetic aperture radar and Sentinel-2 multispectral images with 9.6M text annotations, including: i) geographically anchored captions describing land-use/land-cover (LULC) classes, their spatial relations, and…

Code & Models

Datasets

BIFOLD-BigEarthNetv2-0/BigEarthNet.txt
dataset · 432 downloads