EarthView: A Large Scale Remote Sensing Dataset for Self-Supervision
Diego Velazquez, Pau Rodriguez López, Sergio Alonso, Josep M. Gonfaus, Jordi Gonzalez, Gerardo Richarte, Javier Marin, Yoshua Bengio, Alexandre Lacoste

TL;DR
EarthView introduces a large-scale, diverse remote sensing dataset and a specialized Masked Autoencoder, EarthMAE, to advance self-supervised learning for Earth monitoring tasks, demonstrating improved downstream performance.
Contribution
The paper provides a comprehensive remote sensing dataset and a tailored self-supervised model, addressing heterogeneity in data modalities and resolutions for Earth observation applications.
Findings
Pre-training on Satellogic data enhances downstream task performance
EarthMAE effectively handles diverse remote sensing data modalities
The dataset enables scalable self-supervised learning in Earth monitoring
Abstract
This paper presents EarthView, a comprehensive dataset specifically designed for self-supervision on remote sensing data, intended to enhance deep learning applications on Earth monitoring tasks. The dataset spans 15 tera pixels of global remote-sensing data, combining imagery from a diverse range of sources, including NEON, Sentinel, and a novel release of 1m spatial resolution data from Satellogic. Our dataset provides a wide spectrum of image data with varying resolutions, harnessed from different sensors and organized coherently into an accessible HuggingFace dataset in parquet format. This data spans five years, from 2017 to 2022. Accompanying the dataset, we introduce EarthMAE, a tailored Masked Autoencoder, developed to tackle the distinct challenges of remote sensing data. Trained in a self-supervised fashion, EarthMAE effectively processes different data modalities such as…
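The abstract describes EarthMAE as a Masked Autoencoder. As a rough illustration of the general idea only, the sketch below shows MAE-style random patch masking in NumPy: an image tile is split into patches, a random subset is hidden, and only the visible patches would be passed to an encoder. This is not EarthMAE's actual implementation. The function names, the 64x64 four-band tile, the 16-pixel patch size, and the 0.75 mask ratio are all assumptions chosen for the example.

```python
import numpy as np


def patchify(image, patch):
    """Split an (H, W, C) image into (N, patch*patch*C) flattened patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    # Separate the grid axes from the within-patch axes, then group them.
    x = image.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)
    return x.reshape(gh * gw, patch * patch * c)


def random_patch_mask(num_patches, mask_ratio, rng):
    """Return (keep_idx, mask); mask is True where a patch is hidden."""
    num_keep = int(num_patches * (1.0 - mask_ratio))
    perm = rng.permutation(num_patches)
    keep_idx = np.sort(perm[:num_keep])
    mask = np.ones(num_patches, dtype=bool)
    mask[keep_idx] = False
    return keep_idx, mask


rng = np.random.default_rng(0)
image = rng.random((64, 64, 4))       # hypothetical 4-band tile (assumption)
patches = patchify(image, 16)         # 4x4 grid of 16x16 patches
keep_idx, mask = random_patch_mask(len(patches), 0.75, rng)
visible = patches[keep_idx]           # only these would reach the encoder
```

In a full masked autoencoder, a decoder would then reconstruct the hidden patches and the loss would be computed on the positions where `mask` is True.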
Taxonomy
Topics: Distributed and Parallel Computing Systems
Methods: Masked autoencoder
