Mapping Dengue Vulnerability in Recife, Brazil: Socioeconomic Insights from PCA and Robust Regression
Marcílio Ferreira dos Santos

TL;DR
This study combines socioeconomic data from the 2022 Brazilian Census, PCA, and three regression models (OLS, robust regression, and Random Forest) to map dengue vulnerability in Recife, Brazil. It reports that census indicators can predict and rank neighborhood-level dengue risk, which supports targeted interventions.
Contribution
The paper combines PCA with OLS, robust, and Random Forest regression to predict neighborhood-level dengue risk, supporting spatial analysis and public health planning.
Findings
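PCA reduced the dimensionality of the socioeconomic variables and addressed multicollinearity among them.
The OLS model explained 60.4% of the variance in case density (cases per square kilometer); the robust regression explained 43.2% and the Random Forest 37.3%.
A risk ranking based on PCA scores matched the observed distribution of cases with 83.5% accuracy.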
Abstract
Based on approximately 90,000 confirmed dengue cases reported in Recife - a major city in northeastern Brazil - between 2015 and 2024, we conducted a neighborhood-level spatial analysis. Socioeconomic and demographic indicators from the 2022 Brazilian Census were integrated to explore factors associated with the spatial distribution of dengue incidence. To address multicollinearity and reduce dimensionality, we applied Principal Component Analysis (PCA) to the explanatory variables. Using the resulting components, we built predictive models via Ordinary Least Squares (OLS), robust regression, and Random Forest algorithms. The OLS model explained 60.4% of the variance in case density (cases per square kilometer), while the robust model - more resilient to outliers - accounted for 43.2%. The Random Forest model, capturing nonlinear patterns, achieved 37.3%. Despite some localized gains…
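The pipeline described in the abstract (standardize the indicators, project them onto principal components, then regress case density on the component scores) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: the data are synthetic, the number of neighborhoods, indicators, and retained components are hypothetical, and the robust estimator shown is a Huber M-estimator fitted by iteratively reweighted least squares, which the source does not specify. The Random Forest step is omitted to keep the sketch dependent on NumPy only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for census indicators (hypothetical sizes and values).
n_neighborhoods, n_indicators, n_components = 100, 6, 2
latent = rng.normal(size=(n_neighborhoods, 2))
loadings = rng.normal(size=(2, n_indicators))
X = latent @ loadings + 0.1 * rng.normal(size=(n_neighborhoods, n_indicators))

# Synthetic case density (cases per km^2) with a few outlying neighborhoods.
y = 50.0 + 20.0 * latent[:, 0] + rng.normal(scale=5.0, size=n_neighborhoods)
y[:3] += 200.0


def pca_scores(X, k):
    """Standardize columns, then return the first k component scores
    and their explained-variance ratios (via SVD)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    return Z @ Vt[:k].T, explained[:k]


def ols(A, y):
    """Ordinary least squares coefficients for design matrix A."""
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta


def huber_irls(A, y, delta=1.345, n_iter=50):
    """Huber M-estimator via iteratively reweighted least squares.
    Assumed choice of robust estimator; the source does not name one."""
    beta = ols(A, y)
    for _ in range(n_iter):
        r = y - A @ beta
        # Robust residual scale from the median absolute deviation.
        scale = max(np.median(np.abs(r - np.median(r))) / 0.6745, 1e-12)
        u = np.abs(r) / scale
        w = delta / np.maximum(u, delta)  # 1 inside the threshold, delta/u outside
        sw = np.sqrt(w)
        beta = ols(A * sw[:, None], y * sw)
    return beta


def r_squared(y, y_hat):
    """Coefficient of determination."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)


scores, explained = pca_scores(X, n_components)
A = np.column_stack([np.ones(n_neighborhoods), scores])  # intercept + PC scores

r2_ols = r_squared(y, A @ ols(A, y))
r2_huber = r_squared(y, A @ huber_irls(A, y))

print("explained variance ratios:", np.round(explained, 3))
print(f"R^2 OLS:   {r2_ols:.3f}")
print(f"R^2 Huber: {r2_huber:.3f}")
```

One point worth keeping in mind when reading the reported figures: on the data used for fitting, OLS minimizes the residual sum of squares by construction, so its R² can never be lower than that of a robust fit on the same design matrix. A lower R² for the robust model is therefore expected and does not by itself mean the robust model is worse; it reflects that outlying neighborhoods are down-weighted.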
Taxonomy
Topics: Mosquito-borne diseases and control
