# Analyzing the influence of environment, demographic and socio-economic factors on Aedes albopictus (Diptera: Culicidae) mosquito density at the micro-level using XGBoost and SHAP

**Authors:** Junyi Yao, Zijun Zhou, Hongxia Liu, Shenjun Yao, Jianping Wu

PMC · DOI: 10.1186/s13071-025-07220-0 · Parasites & Vectors · 2026-01-09

## TL;DR

This study uses machine learning to understand how environmental and socioeconomic factors affect Aedes albopictus mosquito density in urban Shanghai, offering insights for targeted mosquito control.

## Contribution

The study introduces an interpretable XGBoost-Poisson model with SHAP analysis to capture nonlinear relationships in urban mosquito density.
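
To make the attribution idea behind SHAP concrete, here is a minimal sketch that computes exact Shapley values by brute force for a toy function. It is not the study's pipeline: SHAP on tree ensembles uses a much faster tree-specific algorithm and a background dataset, whereas this sketch replaces "absent" features with a single baseline vector. The names `shapley_values` and `toy_model` and all numbers are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a single baseline point.

    Features outside the coalition take their baseline value (an assumption made
    here for simplicity; SHAP averages over a background dataset instead).
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n - |S| - 1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical model: one additive term plus one pairwise interaction.
    return 2.0 * z[0] + z[1] * z[2]


phi = shapley_values(toy_model, x=[1.0, 2.0, 3.0], baseline=[0.0, 0.0, 0.0])
```

The attributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term is split evenly between the two features that produce it. These are the properties that make additive explanations usable for nonlinear models.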

## Key findings

- The XGBoost-Poisson model achieved the best accuracy (R2 = 0.73) in predicting Aedes albopictus density.
- The 14-day temperature lag was the dominant predictor, while population density and completion of compulsory education had negative effects.
- Mosquito density was higher near schools and office areas and lower in residential and park environments.

## Abstract

Effective mosquito control in urban areas requires an understanding of how climatic, ecological and socioeconomic factors shape vector abundance. However, most studies use linear or opaque models that overlook nonlinear relationships between environmental conditions and Aedes albopictus density. These complex associations remain insufficiently characterized in highly urbanized settings, where interacting environmental and human factors jointly influence mosquito habitats.

We trained a random forest model, an XGBoost model with a default squared-error objective and an XGBoost model with a Poisson count objective using adult Aedes albopictus monitoring data collected across Shanghai from April to November 2023. Model performance was evaluated with RMSE, MAE, R2 and Poisson deviance, and temporally blocked cross-validation was applied to assess temporal generalizability. SHAP analysis was used to interpret variable importance and contribution patterns. To examine operational relevance, we additionally evaluated hotspot localization accuracy using July 2024 data.
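
Two of the evaluation ingredients named above can be sketched in a few lines: the Poisson deviance as a count-appropriate error measure, and temporally blocked folds. The abstract does not say whether the reported deviance is a mean or a total, nor how the time blocks were defined, so this sketch assumes a mean deviance and one block per calendar month. The function names are hypothetical.

```python
import math


def mean_poisson_deviance(y_true, y_pred):
    """Mean Poisson deviance: 2 * mean(y * log(y / mu) - (y - mu)).

    The y * log(y / mu) term is taken as 0 when y == 0. Predictions must be > 0.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for y, mu in zip(y_true, y_pred):
        if mu <= 0:
            raise ValueError("predicted means must be strictly positive")
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += 2.0 * (log_term - (y - mu))
    return total / len(y_true)


def temporal_block_folds(months):
    """Yield (held_out_month, train_idx, test_idx), holding out each month once.

    Assumption: one block per month; the study's actual blocking may differ.
    """
    for month in sorted(set(months)):
        train_idx = [i for i, m in enumerate(months) if m != month]
        test_idx = [i for i, m in enumerate(months) if m == month]
        yield month, train_idx, test_idx


# Hypothetical sampling months for five trap records.
folds = list(temporal_block_folds([4, 4, 5, 6, 6]))
```

Holding out whole time blocks, rather than random rows, keeps records from the same period out of both training and test sets, which is why it gives a more honest view of temporal generalization.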

On the independent test set, the XGBoost-Poisson model achieved the best overall accuracy (R2 = 0.73, Poisson deviance = 4.52). SHAP analysis identified the 14-day temperature lag as the dominant predictor, followed by population density, which had a slight negative effect, and completion of compulsory education. Precipitation and NDVI showed smaller positive contributions. Several variables exhibited nonlinear trends: an inverted-U shape for children, a declining pattern for older adults and a shallow U shape for building height. By site type, mosquito density tended to be higher near schools, livestock sheds and office areas and lower in residential, farmhouse, park and hospital environments. Under temporally blocked cross-validation, the model retained moderate temporal generalization. In out-of-time hotspot validation, the top 10% of sites ranked by predicted density captured 41–50% of hotspots, rising to 60–68% at 25% coverage, suggesting moderate spatial localization.
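
The hotspot figures above are capture rates: the share of observed hotspots that fall inside the top fraction of sites ranked by predicted density. A minimal sketch of that calculation follows. How the study defined a hotspot is not stated in the abstract, so the hotspot set is taken as an input here; the function name and the example values are hypothetical.

```python
import math


def capture_rate(predicted, hotspot_ids, coverage):
    """Share of hotspots found among the top `coverage` fraction of sites.

    predicted   -- predicted density per site (index = site id)
    hotspot_ids -- set of site ids observed to be hotspots (definition assumed given)
    coverage    -- fraction of sites that can be targeted, e.g. 0.10 or 0.25
    """
    if not hotspot_ids:
        raise ValueError("at least one hotspot is required")
    n = len(predicted)
    # Round down so the selection never exceeds the coverage budget (min. 1 site).
    k = max(1, math.floor(coverage * n + 1e-9))
    ranked = sorted(range(n), key=lambda i: predicted[i], reverse=True)
    selected = set(ranked[:k])
    return len(selected & set(hotspot_ids)) / len(hotspot_ids)


# Hypothetical predictions for ten sites; sites 0 and 4 are the observed hotspots.
pred = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2, 0.6, 0.4, 0.5, 0.05]
rate_10 = capture_rate(pred, {0, 4}, 0.10)
rate_30 = capture_rate(pred, {0, 4}, 0.30)
```

Read this way, a capture rate of 41–50% at 10% coverage means that targeting one site in ten would reach roughly four to five times as many hotspots as selecting sites at random.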

The framework identified key environmental and socioeconomic drivers of Aedes albopictus density in Shanghai. Despite moderate temporal generalization, it provides interpretable, fine-scale insights to guide targeted vector control and inform urban mosquito management in dense metropolitan settings. Future research should validate the framework across additional seasons and diverse urban contexts, incorporate finer environmental and infrastructural data and enhance uncertainty quantification for improved interpretive robustness.

## Linked entities

- **Species:** Aedes albopictus (taxon 7160)

## Full-text entities

- **Species:** Aedes albopictus (Asian tiger mosquito, species) [taxon 7160], Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12882247/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12882247/full.md

---
Source: https://tomesphere.com/paper/PMC12882247