Using Machine Learning to Predict Poverty Status in Costa Rican Households
Ji Yoon Kim

TL;DR
This paper develops machine learning models to predict poverty status in Costa Rican households, demonstrating the effectiveness of Random Forest and Gradient Boosted Trees, with education identified as a key predictor.
Contribution
It introduces two supervised multiclass classification models for poverty prediction using Costa Rican household data and identifies education as the most influential predictor.
Findings
Random Forest achieved an F1 score of 64.9%.
Gradient Boosted Trees achieved an F1 score of 68.4%.
Education has the greatest impact on poverty prediction.
Abstract
This study presents two supervised multiclass classification models that predict the poverty status of Costa Rican households, with the aim of helping the government and business sectors make decisions in a rapidly changing social and economic environment. Using the Costa Rican household dataset collected via the proxy means test conducted by the Inter-American Development Bank, Random Forest and Gradient Boosted Trees achieved F1 scores of 64.9% and 68.4%, respectively. The study also finds that education has the greatest impact on predicting poverty status.
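For readers unfamiliar with how an F1 score is computed when there are more than two classes, the sketch below shows one common convention, macro-averaging, in which F1 is computed per class and then averaged with equal weight. The abstract does not say which averaging the paper uses, so macro-averaging is an assumption here. The function name `macro_f1` and the toy labels are hypothetical and are not taken from the paper or its dataset.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight.

    Assumption: the source does not state its averaging scheme; macro
    averaging is used here purely for illustration.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (hypothetical): four illustrative poverty classes, two households each.
y_true = [1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [1, 2, 2, 2, 3, 4, 4, 4]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

Because each class contributes equally to the average, a model that performs poorly on a rare class is penalized as heavily as one that performs poorly on a common class, which is one reason this convention is often chosen for imbalanced multiclass problems.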
