The Second Competition on Spatial Statistics for Large Datasets
Sameh Abdulah, Faten Alamri, Pratik Nag, Ying Sun, Hatem Ltaief, David E. Keyes, Marc G. Genton

TL;DR
This paper details the second competition on spatial statistics for large datasets, evaluating various approximation methods for complex spatial and spatio-temporal predictions, and provides publicly available datasets for benchmarking.
Contribution
It introduces a comprehensive assessment framework for approximation methods in large-scale spatial statistics and shares valuable datasets for future research.
Findings
Multiple methods evaluated with varying accuracy and efficiency
Complex spatial processes pose significant computational challenges
Public datasets facilitate benchmarking and method development
Abstract
In the last few decades, the size of spatial and spatio-temporal datasets in many research areas has rapidly increased with the development of data collection technologies. As a result, classical statistical methods in spatial statistics are facing computational challenges. For example, the kriging predictor in geostatistics becomes prohibitive on traditional hardware architectures for large datasets, as it requires high computing power and a large memory footprint when dealing with large dense matrix operations. Over the years, various approximation methods have been proposed to address such computational issues; however, the community lacks a holistic process to assess their approximation efficiency. To provide a fair assessment, in 2021, we organized the first competition on spatial statistics for large datasets, generated by our ExaGeoStat software, and asked participants to report the…
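To make the cost of the kriging predictor mentioned in the abstract concrete, here is a minimal sketch of exact zero-mean simple kriging in plain Python. It is not the ExaGeoStat implementation: the exponential covariance model, its `variance` and `length_scale` parameters, and all function names are assumptions chosen for illustration. The point is that the exact predictor builds a dense n × n covariance matrix (O(n²) memory) and factorizes it (O(n³) time), which is the bottleneck approximation methods try to avoid.

```python
import math

def exp_cov(p, q, variance=1.0, length_scale=0.1):
    """Exponential covariance between two points (an assumed model)."""
    return variance * math.exp(-math.dist(p, q) / length_scale)

def cholesky(a):
    """Dense Cholesky factor L with a = L L^T: O(n^3) time, O(n^2) memory."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = math.sqrt(s) if i == j else s / l[j][j]
    return l

def solve_spd(l, b):
    """Solve (L L^T) x = b by forward, then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x

def simple_krige(locs, values, target, **cov_params):
    """Zero-mean simple kriging prediction at a single target location."""
    n = len(locs)
    # Dense n x n covariance matrix between all observed locations.
    sigma = [[exp_cov(locs[i], locs[j], **cov_params) for j in range(n)]
             for i in range(n)]
    # Covariances between the target and each observed location.
    c0 = [exp_cov(target, p, **cov_params) for p in locs]
    alpha = solve_spd(cholesky(sigma), values)  # Sigma^{-1} z
    return sum(c * a for c, a in zip(c0, alpha))

if __name__ == "__main__":
    locs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    values = [1.0, 2.0, -1.0, 0.5]
    print(simple_krige(locs, values, (0.5, 0.5), length_scale=1.0))
```

With no nugget term, predicting at an already observed location reproduces the observed value up to rounding, which is a quick sanity check for the solver. Doubling n roughly quadruples the memory for `sigma` and multiplies the factorization time by about eight.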
Taxonomy
Topics: Human Mobility and Location-Based Analysis · Data-Driven Disease Surveillance · Spatial and Panel Data Analysis
