# Hybrid models combining trend and seasonality components with machine learning algorithms provide accurate forecasting of malaria incidence

**Authors:** Syed Shah Areeb Hussain, Sanchit Bedi, Chander Prakash Yadav, Ajeet Kumar Mohanty, Kalpana Mahatme, Suchi Tyagi, N. M. Anoop Krishnan, Sri Harsha Kota, Amit Sharma

PMC · DOI: 10.1371/journal.pgph.0004500 · PLOS Global Public Health · 2025-10-17

## TL;DR

This paper shows that combining machine learning with time-series models improves malaria forecasting accuracy and precision in Goa, India.

## Contribution

Hybrid models integrating machine learning and time-series components significantly enhance malaria incidence forecasting.

## Key findings

- Climatic extremes have a stronger influence on malaria transmission than average values in Goa.
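- Hybrid models (RF-ARMA, SVM-ARMA, XGB-ARMA) achieved the lowest errors (RMSE: 0.5–15, versus 13–37 for machine learning models alone and 5–41 for time-series models alone) while retaining the narrow confidence intervals of the machine learning models.
- Adding time-series components to machine learning models yields forecasts that are both more accurate and more precise, which makes them better suited to malaria elimination planning.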
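The accuracy figures above are root mean square errors (RMSE). As a reminder of what that metric measures, here is a minimal sketch of the standard RMSE formula. The function name `rmse` and the monthly case counts are hypothetical illustrations, not code or data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical monthly case counts (illustrative only, not from the paper).
observed = [120, 95, 80, 150]
forecast = [110, 100, 85, 140]
print(f"RMSE: {rmse(observed, forecast):.2f}")
```

Because RMSE is expressed in the same units as the target (here, cases per time step), a lower value means the forecasts sit closer to the observed counts on average, with large misses penalized more heavily than small ones.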

## Abstract

Forecasting malaria incidence is vital for effective resource allocation during malaria elimination. In this study, we highlight robust models for forecasting incidence using climatic and malaria data from Goa, India. Multicollinearity analysis and Shapley Additive Explanations (SHAP) were used to identify the most important predictors of malaria transmission among 15 climatic variables. Three machine learning models (support vector machines, random forest, extreme gradient boosting), three time-series models (ARIMA, SARIMA, SARIMAX), and three hybrid models (RF-ARMA, SVM-ARMA, XGB-ARMA) were then trained and tested on data spanning 2010 to 2019. Climatic extremes have a stronger influence on malaria transmission than average values in Goa. Machine learning models exhibited lower accuracy (root mean square error, RMSE: 13–37) but high precision (narrower confidence intervals). Conversely, time-series models yielded more accurate results (RMSE: 5–41), albeit with less precision (wider confidence intervals). To address this, we augmented machine learning models by incorporating time-series variables, which significantly bolstered their accuracy while retaining their inherent precision (RMSE: 0.5–15). Integrating time-series components into machine learning models harnesses the strengths of both approaches, resulting in a substantial enhancement in the accuracy and precision of forecasts. This technique has potential for wider use in planning malaria elimination and in routine epidemiological data analysis.
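The abstract does not spell out how the time-series variables were added to the machine learning models, so the following is only one plausible reading, sketched under stated assumptions: each row of climatic predictors is extended with a linear trend index, a sine/cosine seasonal pair, and lagged incidence values before being passed to a regressor. The function `build_hybrid_features`, its parameters `n_lags` and `period`, and the sample numbers are all hypothetical and are not taken from the paper.

```python
import math


def build_hybrid_features(cases, climate, n_lags=2, period=12):
    """Augment climate predictors with trend, seasonal, and lagged-case columns.

    cases   : incidence per time step (assumed monthly here)
    climate : one list of climatic predictors per time step
    Returns (X, y), where the first n_lags steps are dropped because
    their lagged values are not available.
    """
    if len(cases) != len(climate):
        raise ValueError("cases and climate need one entry per time step")
    X, y = [], []
    for t in range(n_lags, len(cases)):
        angle = 2 * math.pi * (t % period) / period
        row = list(climate[t])                          # climatic predictors
        row.append(float(t))                            # linear trend index
        row.extend([math.sin(angle), math.cos(angle)])  # seasonal cycle
        row.extend(float(cases[t - k]) for k in range(1, n_lags + 1))  # lags
        X.append(row)
        y.append(float(cases[t]))
    return X, y


# Hypothetical data: six months of cases with [max temperature, rainfall].
cases = [30, 42, 55, 61, 48, 37]
climate = [[31.0, 80.0], [32.5, 120.0], [33.0, 300.0],
           [30.5, 410.0], [29.0, 350.0], [30.0, 150.0]]
X, y = build_hybrid_features(cases, climate)
print(len(X), "rows x", len(X[0]), "features")
```

The resulting `X` and `y` could then be fed to any regressor, such as a random forest, a support vector machine, or gradient boosting, so the learner sees both the climatic drivers and the recent history of the series.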

## Linked entities

- **Diseases:** malaria (MONDO:0005136)

## Full-text entities

- **Diseases:** malaria (MESH:D008288)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12533842/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12533842/full.md

## References

40 references — full list in the complete paper: https://tomesphere.com/paper/PMC12533842/full.md

---
Source: https://tomesphere.com/paper/PMC12533842