Major TOM: Expandable Datasets for Earth Observation
Alistair Francis, Mikolaj Czerkawski

TL;DR
Major TOM introduces an extensible framework and a comprehensive open-access Earth Observation dataset to facilitate dataset integration and reduce duplication in deep learning model training.
Contribution
It proposes a shared, scalable framework for combining diverse EO datasets and provides the large MajorTOM-Core dataset as a practical resource and template.
Findings
MajorTOM-Core covers most of Earth's land surface.
The framework enables datasets with different formats and sources to be merged.
It supports scalable, collaborative EO data collection.
Abstract
Deep learning models are increasingly data-hungry, requiring significant resources to collect and compile the datasets needed to train them, with Earth Observation (EO) models being no exception. However, the landscape of datasets in EO is relatively atomised, with interoperability made difficult by diverse formats and data structures. If ever larger datasets are to be built, and duplication of effort minimised, then a shared framework that allows users to combine and access multiple datasets is needed. Here, Major TOM (Terrestrial Observation Metaset) is proposed as this extensible framework. Primarily, it consists of a geographical indexing system based on a set of grid points and a metadata structure that allows multiple datasets with different sources to be merged. Besides the specification of Major TOM as a framework, this work also presents a large, open-access dataset,…
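The abstract describes a geographical indexing system built on grid points, plus metadata that lets datasets from different sources be merged. The Python sketch below illustrates that general idea only and is not the Major TOM specification. The grid layout (equal-latitude rows, with a column count that shrinks towards the poles), the 10 km spacing, the function names `grid_cell` and `merge_by_cell`, and the sample records are all assumptions made for illustration.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def grid_cell(lat, lon, spacing_km=10.0):
    """Map a lat/lon in degrees to a hypothetical (row, col) grid-cell key."""
    # Rows are equally spaced in latitude.
    deg_per_row = math.degrees(spacing_km / EARTH_RADIUS_KM)
    row = math.floor(lat / deg_per_row)
    # Columns are spaced roughly spacing_km apart along the row's latitude
    # circle, so rows closer to the poles hold fewer columns.
    row_lat = math.radians(row * deg_per_row)
    circumference_km = 2 * math.pi * EARTH_RADIUS_KM * math.cos(row_lat)
    n_cols = max(1, round(circumference_km / spacing_km))
    col = math.floor(((lon + 180.0) % 360.0) / 360.0 * n_cols)
    return row, col


def merge_by_cell(*datasets, spacing_km=10.0):
    """Group sample ids from several sources under a shared grid-cell key."""
    merged = defaultdict(dict)
    for source_name, samples in datasets:
        for sample in samples:
            key = grid_cell(sample["lat"], sample["lon"], spacing_km)
            merged[key].setdefault(source_name, []).append(sample["id"])
    return dict(merged)


# Hypothetical records from two sources that fall in the same grid cell.
optical = ("optical", [{"id": "opt_001", "lat": 55.95, "lon": -3.19}])
radar = ("radar", [{"id": "rad_042", "lat": 55.951, "lon": -3.191}])
merged = merge_by_cell(optical, radar)
```

Because both sources are keyed by the same grid cell, co-located samples can be looked up together without either dataset needing to know the other's file format.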
Taxonomy
Topics: Geochemistry and Geologic Mapping · Geological Modeling and Analysis
Methods: Sparse Evolutionary Training
