Better, Not Just More: Data-Centric Machine Learning for Earth Observation
Ribana Roscher, Marc Rußwurm, Caroline Gevaert, Michael Kampffmeyer, Jefersson A. dos Santos, Maria Vakalopoulou, Ronny Hänsch, Stine Hansen, Keiller Nogueira, Jonathan Prexl, Devis Tuia

TL;DR
This paper advocates shifting from a model-centric to a data-centric approach in geospatial machine learning to improve accuracy, generalization, and real-world applicability, emphasizing the entire ML cycle.
Contribution
It provides a clear definition, categorization, and overview of automated data-centric learning methods specifically for geospatial data, highlighting their role alongside traditional model-centric approaches.
Findings
Data-centric approaches enhance model robustness in geospatial tasks.
Experiments demonstrate practical implementation steps for data-centric ML.
Review categorizes existing geospatial ML methods into distinct groups.
Abstract
Recent developments and research in modern machine learning have led to substantial improvements in the geospatial field. Although numerous deep learning architectures and models have been proposed, the majority of them have been solely developed on benchmark datasets that lack strong real-world relevance. Furthermore, the performance of many methods has already saturated on these datasets. We argue that a shift from a model-centric view to a complementary data-centric perspective is necessary for further improvements in accuracy, generalization ability, and real impact on end-user applications. Furthermore, considering the entire machine learning cycle, from problem definition to model deployment with feedback, is crucial for enhancing machine learning models that can be reliable in unforeseen situations. This work presents a definition as well as a precise categorization and overview of…
Taxonomy
Topics: Geographic Information Systems Studies · Data-Driven Disease Surveillance · Data Management and Algorithms
Methods: Sparse Evolutionary Training · Focus
