Predictive Volatility of Machine Learning in Micro-Samples: A Regularised Assessment of Regional Poverty
A. H. Jamaluddin, A. T. R. Dani, N. I. Mahat, V. Ratnasari, S. S. M. Fauzi

TL;DR
This study compares statistical and machine learning models to identify which are most reliable for analysing poverty drivers in small, collinear regional datasets, and finds regularised linear models the most effective.
Contribution
It demonstrates that regularised linear models outperform complex machine learning ensembles in small, collinear regional datasets for poverty analysis.
Findings
Simple linear shrinkage models outperform complex ensembles in predictive accuracy.
ICT skills are consistently identified as a key factor in reducing poverty.
Overfitting is prevalent in complex machine learning models like BART on small regional data.
Abstract
Identifying the structural drivers of poverty in regional datasets is frequently hindered by small sample sizes and high multidimensional collinearity, which can result in unstable and misleading policy advice. This paper evaluates the provincial causes of poverty in Indonesia by addressing these specific statistical hazards. We employ a rigorous model-comparison framework designed for small samples () with high collinearity, comparing standard linear models with frequentist penalisation, Bayesian shrinkage priors, an adjusted spatial intrinsic conditionally autoregressive (ICAR) model, and complex machine learning ensembles. To ensure a robust evaluation, we measure predictive performance using strict Leave-One-Out Cross-Validation (LOOCV). The results demonstrate that algorithmic complexity is inherently risky in regional datasets: simple linear shrinkage models (Ridge, Elastic…
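The evaluation idea described in the abstract, scoring a plain linear model against a penalised one under Leave-One-Out Cross-Validation on a small, collinear sample, can be sketched as follows. This is a minimal illustration and not the authors' code or data: the synthetic dataset, the sample size of 30, the six predictors, the penalty grid, and the helper names `ridge_fit` and `loocv_rmse` are all assumptions made for the example. Only ordinary least squares (penalty 0) and ridge are shown; the Bayesian, spatial ICAR, and ensemble models from the paper are out of scope here.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge fit with an unpenalised intercept; lam=0 reduces to OLS."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean  # centring keeps the intercept out of the penalty
    p = X.shape[1]
    coef = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ yc)
    intercept = y_mean - x_mean @ coef
    return coef, intercept


def loocv_rmse(X, y, lam):
    """Leave-one-out RMSE: refit on n-1 rows, predict the single held-out row."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef, intercept = ridge_fit(X[keep], y[keep], lam)
        errors[i] = y[i] - (X[i] @ coef + intercept)
    return float(np.sqrt(np.mean(errors ** 2)))


# Hypothetical data: 30 "regions", 6 predictors that all track one latent factor,
# so the columns are strongly collinear (an assumption for illustration only).
rng = np.random.default_rng(0)
n, p = 30, 6
latent = rng.normal(size=(n, 1))
X = latent + 0.05 * rng.normal(size=(n, p))
y = X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=n)

# Compare OLS (lam=0) with increasingly strong ridge penalties.
for lam in (0.0, 0.1, 1.0, 10.0):
    print(f"lambda={lam:>5}: LOOCV RMSE = {loocv_rmse(X, y, lam):.3f}")
```

The synthetic columns share a common scale, so the sketch skips standardisation; with real indicators measured in different units, predictors would normally be standardised inside each training fold before the penalty is applied.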
