DeepSatData: Building large scale datasets of satellite images for training machine learning models
Michail Tarasiou, Stefanos Zafeiriou

TL;DR
This paper discusses the creation of large-scale satellite image datasets using Sentinel-2 data for training deep learning models, focusing on dense classification tasks like semantic segmentation.
Contribution
It introduces a methodology for generating extensive satellite image datasets with freely available data and addresses challenges in data quality and scalability.
Findings
Large-scale satellite datasets can be generated using Sentinel-2 data.
The approach supports training deep neural networks for dense classification.
Code implementation is publicly available for reproducibility.
Abstract
This report presents design considerations for automatically generating satellite imagery datasets for training machine learning models, with emphasis on dense classification tasks such as semantic segmentation. The implementation uses freely available Sentinel-2 data, which allows generation of the large-scale datasets required for training deep neural networks. We discuss issues that arise in deep neural network training and evaluation, such as checking the quality of ground truth data, and comment on the scalability of the approach. Accompanying code is provided at https://github.com/michaeltrs/DeepSatData.
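To make the idea of preparing samples for dense classification more concrete, the sketch below tiles a multi-band image and its ground truth raster into fixed-size patches and keeps only patches with enough labeled pixels. It is an illustration under stated assumptions, not the pipeline implemented in the DeepSatData repository: the function name `extract_patches`, the patch size, the labeled-pixel threshold, and the convention that 0 marks unlabeled pixels are all hypothetical.

```python
import numpy as np


def extract_patches(image, labels, patch_size=48,
                    min_labeled_fraction=0.5, unlabeled_value=0):
    """Tile an image and its label raster into non-overlapping square patches.

    image:  array of shape (bands, height, width)
    labels: array of shape (height, width); `unlabeled_value` marks pixels
            with no ground truth (an assumed convention for this sketch).
    Returns a list of (image_patch, label_patch) pairs whose fraction of
    labeled pixels is at least `min_labeled_fraction`.
    """
    if image.shape[1:] != labels.shape:
        raise ValueError("image and labels must share height and width")

    _, height, width = image.shape
    kept = []
    # Step over the raster in strides of patch_size; incomplete border
    # patches are skipped so every sample has the same shape.
    for top in range(0, height - patch_size + 1, patch_size):
        for left in range(0, width - patch_size + 1, patch_size):
            rows = slice(top, top + patch_size)
            cols = slice(left, left + patch_size)
            label_patch = labels[rows, cols]
            labeled_fraction = np.mean(label_patch != unlabeled_value)
            if labeled_fraction >= min_labeled_fraction:
                kept.append((image[:, rows, cols], label_patch))
    return kept


# Synthetic example: a 13-band, 96x96 image where only the left half
# of the raster carries ground truth labels.
rng = np.random.default_rng(0)
image = rng.random((13, 96, 96), dtype=np.float32)
labels = np.zeros((96, 96), dtype=np.int64)
labels[:, :48] = 1

patches = extract_patches(image, labels)
print(len(patches))  # 2: the two left-hand patches are fully labeled
```

A simple coverage threshold like this is only one possible ground truth check; it keeps training samples from being dominated by unlabeled background but says nothing about label correctness.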
