Flexible domain prediction using mixed effects random forests
Patrick Krennmair, Timo Schmid

TL;DR
This paper introduces mixed effects random forests as a flexible, non-parametric approach for small area estimation, effectively capturing hierarchical data structures and improving estimation accuracy over traditional linear models.
Contribution
The paper develops a framework that combines mixed effects random forests for estimating small area averages with a non-parametric bootstrap for quantifying the uncertainty of those estimates.
Findings
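Mixed effects random forests outperform traditional regression models in simulations
The method gives more accurate small area estimates than linear models when the survey data are hierarchical
An application to Mexican income data demonstrates its effectiveness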
Abstract
This paper promotes the use of random forests as versatile tools for estimating spatially disaggregated indicators in the presence of small area-specific sample sizes. Small area estimators are predominantly conceptualized within the regression-setting and rely on linear mixed models to account for the hierarchical structure of the survey data. In contrast, machine learning methods offer non-linear and non-parametric alternatives, combining excellent predictive performance and a reduced risk of model-misspecification. Mixed effects random forests combine advantages of regression forests with the ability to model hierarchical dependencies. This paper provides a coherent framework based on mixed effects random forests for estimating small area averages and proposes a non-parametric bootstrap estimator for assessing the uncertainty of the estimates. We illustrate advantages of our proposed…
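To make the modelling idea in the abstract concrete, here is a minimal sketch of how a mixed effects random forest might be written for unit j in area i. The notation is an assumption for illustration and is not taken from the abstract: f is an unknown function estimated by a random forest, u_i is an area-specific random intercept, e_ij is a unit-level error, and N_i is the population size of area i. A random intercept is the simplest case; the paper's own formulation may be more general.

```latex
% Illustrative notation (assumed): random-intercept mixed effects random forest
y_{ij} = f(x_{ij}) + u_i + e_{ij},
\qquad u_i \sim N(0, \sigma_u^2),
\qquad e_{ij} \sim N(0, \sigma_e^2)

% Plug-in estimator of the area-level mean: average the forest predictions
% over all units in area i, then add the predicted random effect
\hat{\mu}_i = \frac{1}{N_i} \sum_{j=1}^{N_i} \hat{f}(x_{ij}) + \hat{u}_i
```

Under these assumptions, the forest replaces the linear predictor of a linear mixed model, while the random effect u_i still captures the between-area dependence that the hierarchical survey structure induces.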
