A spatial random forest algorithm for population-level epidemiological risk assessment
Duncan Lee, Vinny Davies

TL;DR
This paper introduces the SPAR-Forest-ERF algorithm, combining random forests with Bayesian spatial models to improve non-linear confounder-response effect estimation in spatial epidemiology, with full uncertainty quantification.
Contribution
It presents the first fusion of random forests with Bayesian spatial models for flexible, interpretable exposure response functions in population health risk assessment.
Findings
Applied to Scottish air pollution and health data
Demonstrated improved modeling of complex confounder effects
Provided full uncertainty quantification for risk estimates
Abstract
Spatial epidemiology identifies the drivers of elevated population-level disease risks, using disease counts, exposures and known confounders at the areal unit level. Poisson regression models are typically used for inference, which incorporate a linear/additive regression component and allow for unmeasured confounding via a set of spatially autocorrelated random effects. This approach requires the confounder interactions and their functional relationships with disease risk to be specified in advance, rather than being learned from the data. Therefore, this paper proposes the SPAR-Forest-ERF algorithm, which is the first fusion of random forests for capturing non-linear and interacting confounder-response effects with Bayesian spatial autocorrelation models that can estimate interpretable exposure response functions (ERF) with full uncertainty quantification. Methodologically, we extend…
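For orientation, the conventional model the abstract describes is commonly written in roughly the following form. The notation is an assumption, since the abstract gives none: Y_k and E_k are the observed and expected disease counts in areal unit k, theta_k is the disease risk, x_k collects the exposure and confounders, and phi_k is a spatially autocorrelated random effect (for example, one with a conditional autoregressive prior).

```latex
% Sketch of a standard spatial Poisson log-linear model (assumed notation,
% not taken from the paper itself).
\begin{align}
Y_k \mid E_k, \theta_k &\sim \text{Poisson}(E_k \theta_k), \qquad k = 1, \dots, K, \\
\ln(\theta_k) &= \mathbf{x}_k^{\top} \boldsymbol{\beta} + \phi_k .
\end{align}
```

The linear term in the second line is the part that has to be specified in advance. As the abstract describes it, SPAR-Forest-ERF instead learns the confounder effects with a random forest, while the Bayesian spatial model handles the autocorrelation and the exposure response function.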
Taxonomy
Topics: Statistical Methods and Bayesian Inference · Spatial and Panel Data Analysis · Data-Driven Disease Surveillance
