Using Machine Learning to Predict Poverty Status in Costa Rican Households

Ji Yoon Kim

arXiv:2111.13319·stat.AP·November 29, 2021

TL;DR

This paper develops machine learning models to predict poverty status in Costa Rican households, demonstrating the effectiveness of Random Forest and Gradient Boosted Trees, with education identified as a key predictor.

Contribution

The paper introduces two supervised multiclass classification models for poverty prediction using Costa Rican household data, and highlights the importance of education in the prediction process.

Findings

01

Random Forest achieved 64.9% F1 score.

02

Gradient Boosted Trees achieved 68.4% F1 score.

03

Education has the greatest impact on poverty prediction.

Abstract

This study presents two supervised multiclass classification machine learning models to predict the poverty status of Costa Rican households, as a way to help government and business sectors make decisions in a rapidly changing social and economic environment. Using the Costa Rican household dataset collected via the proxy means test conducted by the Inter-American Development Bank, Random Forest and Gradient Boosted Trees achieved F1 scores of 64.9% and 68.4%, respectively. This study also reveals that education has the greatest impact on predicting poverty status.
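
The abstract reports F1 scores for a multiclass problem but does not say how the per-class scores were averaged. As a rough illustration only, the sketch below computes a macro-averaged F1 (the unweighted mean of per-class F1) in plain Python. The macro averaging, the function name `macro_f1`, and the toy labels are assumptions made for this example; they are not taken from the paper and do not reproduce its results.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in y_true or y_pred.

    Assumption: macro averaging. The paper may have used a different scheme.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative

    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[label] + fp[label] + fn[label]
        per_class.append(2 * tp[label] / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical labels for eight households across four made-up classes.
y_true = [1, 1, 2, 2, 3, 3, 4, 4]
y_pred = [1, 2, 2, 2, 3, 4, 4, 4]
score = macro_f1(y_true, y_pred)
print(f"macro F1 = {score:.3f}")
```

Macro averaging weights every class equally, so a model that does poorly on a small class is penalized even if its overall accuracy is high. A support-weighted average would give a different number on imbalanced data.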
