REX-SUB: A Scalable Subsampling Strategy for Modeling Large Spatial Datasets
Nicholas Rios, Ben Seiyon Lee

TL;DR
REX-SUB pairs a scalable, randomized subsampling method with a Vecchia approximation to fit Gaussian process models to large spatial datasets efficiently while reducing prediction error.
Contribution
It proposes a novel randomized exchange algorithm for subsampling and integrates a scalable Vecchia approximation for efficient large-scale spatial modeling.
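A randomized exchange search can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the exponential kernel, its fixed parameters, the average kriging variance criterion, and the names `exp_cov`, `avg_pred_variance`, and `randomized_exchange` are all assumptions made for the example. The summary does not state the exact criterion or exchange rule REX-SUB uses.

```python
import numpy as np

def exp_cov(a, b, range_=0.3, sigma2=1.0):
    """Exponential covariance between two sets of 2-D locations (assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / range_)

def avg_pred_variance(locs, idx, nugget=1e-6):
    """Hypothetical criterion: mean simple-kriging variance over all locations,
    given that only the locations in idx are observed (unit marginal variance)."""
    s = locs[idx]
    k_ss = exp_cov(s, s) + nugget * np.eye(len(idx))
    k_as = exp_cov(locs, s)
    solved = np.linalg.solve(k_ss, k_as.T)        # K_ss^{-1} k_sa for every location
    var = 1.0 - np.sum(k_as * solved.T, axis=1)   # sigma2 - k_as K_ss^{-1} k_sa
    return float(np.mean(var))

def randomized_exchange(locs, k, n_iter=200, seed=0):
    """Start from a random subsample of size k; repeatedly propose swapping one
    selected point for one unselected point, keeping swaps that lower the criterion."""
    rng = np.random.default_rng(seed)
    n = len(locs)
    idx = rng.choice(n, size=k, replace=False)
    best = avg_pred_variance(locs, idx)
    history = [best]
    for _ in range(n_iter):
        out_pos = rng.integers(k)                      # which selected point to drop
        candidates = np.setdiff1d(np.arange(n), idx)   # points not in the subsample
        trial = idx.copy()
        trial[out_pos] = rng.choice(candidates)
        score = avg_pred_variance(locs, trial)
        if score < best:                               # accept only improving swaps
            idx, best = trial, score
        history.append(best)
    return idx, history
```

Because only improving swaps are accepted, the criterion value in `history` never increases. Each proposal here costs one small linear solve of size `k`, which is why a search of this kind stays cheap when `k` is far smaller than the full dataset.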
Findings
REX-SUB achieves lower mean squared prediction error than competing subsampling strategies.
The method is effective on real large-scale spatial data.
Abstract
Recent advances in data collection technologies have led to the emergence of massive spatial datasets, with measurements obtained at millions of spatial locations. Geostatistical models typically employ Gaussian processes (GPs) to capture spatial dependence, but standard GP fitting becomes prohibitive at such scales. A promising solution is optimal subsampling, where a subset of locations is selected that optimizes a criterion. In this study, we propose a randomized exchange algorithm for subsampling (REX-SUB) which efficiently selects small subsamples that minimize prediction errors in the fitted spatial GP models. To further improve computational efficiency, we embed a scalable Vecchia approximation to the GP's joint likelihood, which takes advantage of sparsity in the precision matrix to enable fast inference on the selected subsamples. Through a simulation study and an application…
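The Vecchia idea mentioned in the abstract replaces the joint GP likelihood with a product of conditional densities, each conditioning on only a few earlier points in some ordering. The sketch below is one simple version under stated assumptions: a zero-mean GP, an exponential kernel with fixed parameters, the ordering in which the locations are given, and nearest previous neighbors as conditioning sets. The function names and the neighbor count `m` are hypothetical and are not taken from the paper.

```python
import numpy as np

def exp_cov(a, b, range_=0.3, sigma2=1.0):
    """Exponential covariance between two sets of 2-D locations (assumed kernel)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    return sigma2 * np.exp(-d / range_)

def vecchia_loglik(locs, y, m=10, nugget=1e-6):
    """Approximate zero-mean GP log-likelihood: each y[i] is conditioned on
    at most m nearest neighbors among the points that precede it."""
    total = 0.0
    for i in range(len(y)):
        var_i = exp_cov(locs[i:i + 1], locs[i:i + 1])[0, 0] + nugget
        if i == 0:
            mean, var = 0.0, var_i                 # first point: marginal density
        else:
            d = np.linalg.norm(locs[:i] - locs[i], axis=1)
            nb = np.argsort(d)[:m]                 # nearest previous neighbors
            k_nn = exp_cov(locs[nb], locs[nb]) + nugget * np.eye(len(nb))
            k_in = exp_cov(locs[i:i + 1], locs[nb])[0]
            w = np.linalg.solve(k_nn, k_in)        # kriging weights
            mean = w @ y[nb]                       # conditional mean
            var = var_i - w @ k_in                 # conditional variance
        total += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mean) ** 2 / var)
    return total

def exact_loglik(locs, y, nugget=1e-6):
    """Exact zero-mean GP log-likelihood, for comparison on small inputs."""
    k = exp_cov(locs, locs) + nugget * np.eye(len(y))
    _, logdet = np.linalg.slogdet(k)
    return -0.5 * (y @ np.linalg.solve(k, y) + logdet + len(y) * np.log(2 * np.pi))
```

When `m` is at least `n - 1`, every point conditions on all earlier points and the approximation coincides with the exact likelihood. With small `m`, each term needs only an `m`-by-`m` solve, so the cost grows linearly in the number of points instead of cubically.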
