Spatial Multivariate Trees for Big Data Bayesian Regression
Michele Peruzzi, David B. Dunson

TL;DR
This paper introduces SpamTrees, a scalable Bayesian multivariate regression model for high-resolution geospatial data that captures complex relationships across multiple outcomes using spatial multivariate trees, demonstrated on climate data.
Contribution
The paper develops SpamTrees, a novel Bayesian multivariate regression approach utilizing spatial multivariate trees for scalable analysis of large, complex geospatial datasets.
Findings
SpamTrees effectively model multivariate spatial data at high resolutions.
The method scales well to large datasets with complex outcome relationships.
Application to climate data demonstrates practical utility and efficiency.
Abstract
High resolution geospatial data are challenging because standard geostatistical models based on Gaussian processes are known to not scale to large data sizes. While progress has been made towards methods that can be computed more efficiently, considerably less attention has been devoted to big data methods that allow the description of complex relationships between several outcomes recorded at high resolutions by different sensors. Our Bayesian multivariate regression models based on spatial multivariate trees (SpamTrees) achieve scalability via conditional independence assumptions on latent random effects following a treed directed acyclic graph. Information-theoretic arguments and considerations on computational efficiency guide the construction of the tree and the related efficient sampling algorithms in imbalanced multivariate settings. In addition to simulated data examples, we…
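To make the conditional-independence idea in the abstract concrete, the sketch below factorizes the density of latent effects along a small tree, as a sum of log p(w_v | w_parent(v)) terms. It is an illustration under stated assumptions, not the authors' implementation. The 1-D locations, the exponential kernel, the single-parent conditioning rule, the univariate latent effect, and the names exp_cov, gauss_logpdf and treed_logpdf are all hypothetical choices made here. The graph, covariance and multivariate structure used in the paper may differ.

```python
import numpy as np

def exp_cov(a, b, phi=1.0):
    """Exponential covariance between two sets of 1-D locations (illustrative kernel)."""
    d = np.abs(a[:, None] - b[None, :])
    return np.exp(-phi * d)

def gauss_logpdf(x, mean, cov):
    """Log density of a multivariate normal, computed via a Cholesky factor."""
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, x - mean)
    return -0.5 * (z @ z) - np.log(np.diag(L)).sum() - 0.5 * len(x) * np.log(2 * np.pi)

def treed_logpdf(w, locs, parent):
    """Sum of log p(w_v | w_parent(v)) over the nodes v of a tree (assumed structure)."""
    total = 0.0
    for v, s_v in locs.items():
        C_vv = exp_cov(s_v, s_v)
        p = parent[v]
        if p is None:
            # Root node: marginal Gaussian process prior at its own locations.
            mean, cov = np.zeros(len(s_v)), C_vv
        else:
            # Child node: Gaussian conditional given the parent's latent effects.
            s_p = locs[p]
            C_vp = exp_cov(s_v, s_p)
            H = np.linalg.solve(exp_cov(s_p, s_p), C_vp.T).T  # C_vp C_pp^{-1}
            mean = H @ w[p]
            cov = C_vv - H @ C_vp.T
        total += gauss_logpdf(w[v], mean, cov)
    return total

# Hypothetical three-node tree: one root with two children.
locs = {
    "root": np.array([0.0, 1.0]),
    "left": np.array([0.3, 0.6]),
    "right": np.array([1.4, 1.9]),
}
parent = {"root": None, "left": "root", "right": "root"}
rng = np.random.default_rng(0)
w = {v: rng.normal(size=len(s)) for v, s in locs.items()}
print(treed_logpdf(w, locs, parent))
```

Each term only requires factorizing matrices the size of one node and its parent, which is where the computational saving over a single dense covariance comes from. With just a root and one child the factorization reproduces the full joint density exactly. With siblings it becomes an approximation, because the children are treated as independent given their parent.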
Taxonomy
Topics: Soil Geostatistics and Mapping · Remote Sensing in Agriculture · Data-Driven Disease Surveillance
