The CAST package for training and assessment of spatial prediction models in R
Hanna Meyer, Marvin Ludwig, Carles Milà, Jan Linnenbrink, Fabian Schumacher

TL;DR
The paper introduces the CAST R package, which provides tools for training, evaluating, and applying machine learning models for spatial prediction tasks in environmental science, addressing challenges like spatial autocorrelation.
Contribution
The paper presents a new R package, CAST, that integrates advanced spatial cross-validation, feature selection, and model assessment methods for improved spatial prediction modeling.
Findings
CAST facilitates more reliable spatial predictions.
The package supports integration into existing workflows.
Case study demonstrates improved model performance.
Abstract
One key task in environmental science is to map environmental variables continuously in space or even in space and time. Machine learning algorithms are frequently used to learn from local field observations to make spatial predictions by estimating the value of the variable of interest in places where it has not been measured. However, the application of machine learning strategies for spatial mapping involves additional challenges compared to "non-spatial" prediction tasks that often originate from spatial autocorrelation and from training data that are not independent and identically distributed. In the past few years, we developed a number of methods to support the application of machine learning for spatial data which involves the development of suitable cross-validation strategies for performance assessment and model selection, spatial feature selection, and methods to assess…
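The abstract's point about "suitable cross-validation strategies" can be illustrated with a minimal sketch of spatial block cross-validation. This is not the CAST API and not the authors' method: it is a generic Python illustration, and the names `block_id` and `spatial_block_folds`, the square-block scheme, and the synthetic coordinates are all assumptions made for the example. The idea shown is that nearby points are kept together, so a model is never tested on locations that sit right next to its training points.

```python
import random
from collections import defaultdict


def block_id(point, block_size):
    """Return the (column, row) index of the square block containing a point."""
    x, y = point
    return (int(x // block_size), int(y // block_size))


def spatial_block_folds(coords, block_size, k, seed=0):
    """Assign whole spatial blocks to k folds (hypothetical helper).

    Returns a list of (train_indices, test_indices) pairs. All points that
    fall in the same block always end up on the same side of a split.
    """
    blocks = defaultdict(list)
    for i, point in enumerate(coords):
        blocks[block_id(point, block_size)].append(i)
    if len(blocks) < k:
        raise ValueError("fewer occupied blocks than folds; reduce k or block_size")

    # Shuffle the blocks (not the points) and deal them out round-robin.
    ids = sorted(blocks)
    random.Random(seed).shuffle(ids)
    folds = [[] for _ in range(k)]
    for j, b in enumerate(ids):
        folds[j % k].extend(blocks[b])

    everything = set(range(len(coords)))
    return [(sorted(everything - set(f)), sorted(f)) for f in folds]


# Synthetic sample locations in a 100 x 100 study area (illustrative data only).
rng = random.Random(42)
coords = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(200)]
splits = spatial_block_folds(coords, block_size=25.0, k=4)

for fold, (train, test) in enumerate(splits):
    print(f"fold {fold}: {len(train)} training points, {len(test)} test points")
```

With a random (non-spatial) split, test points would typically have close neighbours in the training set, and spatial autocorrelation would make the error estimate look better than it is for unsampled areas. Holding out whole blocks is one simple way to reduce that effect; the block size is a tuning choice that this sketch does not address.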
Taxonomy
Topics: demographic modeling and climate adaptation · Spatial and Panel Data Analysis
