# Predicting food prices in Kenya using machine learning: a hybrid model approach with XGBoost and gradient boosting

**Authors:** Benard O. Ogol, Evans Omondi, John Olukuru, Betsy Muriithi, Kennedy Senagi

PMC · DOI: 10.3389/frai.2025.1661989 · Frontiers in Artificial Intelligence · 2025-10-24

## TL;DR

This paper presents a hybrid machine learning model combining XGBoost and gradient boosting to predict food prices in Kenya, aiming to help policymakers address food insecurity.

## Contribution

The novel contribution is a hybrid model using XGBoost and gradient boosting with a linear regression meta-model for improved food price prediction in Kenya.

## Key findings

- The hybrid model achieved an R-squared value of 0.9940, outperforming standalone models.
- Key features influencing food prices include unit quantity, price type, commodity, and currency.
- The model was saved as pickle files for potential deployment on a web application.
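The pickle-based persistence in the last finding can be sketched as follows. This is a minimal illustration using only the Python standard library, not the authors' code: `model_stub`, `feature_names`, the file names, and both helper functions are hypothetical placeholders for the fitted stacked model and the features it expects.

```python
import pickle
import tempfile
from pathlib import Path

# Hypothetical stand-ins: a real deployment would pickle the fitted stacked
# regressor and the list of feature names it was trained on.
model_stub = {"kind": "stacked-regressor-placeholder", "meta_weights": [0.5, 0.5]}
feature_names = ["unit_quantity", "price_type", "commodity", "currency"]


def save_artifacts(directory, model, features):
    """Write the model and its feature list to two separate pickle files."""
    directory = Path(directory)
    model_path = directory / "model.pkl"        # assumed file name
    features_path = directory / "features.pkl"  # assumed file name
    with open(model_path, "wb") as fh:
        pickle.dump(model, fh)
    with open(features_path, "wb") as fh:
        pickle.dump(features, fh)
    return model_path, features_path


def load_artifacts(model_path, features_path):
    """Read both pickle files back, e.g. when a web application starts."""
    with open(model_path, "rb") as fh:
        model = pickle.load(fh)
    with open(features_path, "rb") as fh:
        features = pickle.load(fh)
    return model, features


# Round trip through a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    paths = save_artifacts(tmp, model_stub, feature_names)
    loaded_model, loaded_features = load_artifacts(*paths)
```

Storing the feature list next to the model lets the serving code build its input columns in the same order used during training. Pickle files can execute arbitrary code when loaded, so a web application should only load files it produced itself.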

## Abstract

Food price volatility continues to be a significant concern in Kenya's economic development, posing challenges to the country's economic stability.

This study examines the application of machine learning methods, employing a hybrid approach that combines XGBoost and gradient boosting, to predict food prices in Kenya. Food price data from the World Food Programme, covering January 2006 to September 2024, were combined with US dollar (USD) exchange rate data from the Central Bank of Kenya and inflation rate data. The augmented data were preprocessed and transformed, then used to train XGBoost, gradient boosting, LightGBM, decision tree, random forest, and linear regression models. A hybrid model was then developed by stacking XGBoost and gradient boosting as the base models, with linear regression serving as the meta-model that combines their predictions.

This model was then tuned using random search over hyperparameters, achieving a mean absolute error of 0.1050, a mean squared error of 0.0261, a root mean square error of 0.1615, and an R-squared value of 0.9940, thereby surpassing the performance of all standalone models. We then applied 5-fold cross-validation to check for overfitting and Diebold-Mariano tests to assess model superiority. Feature importance analysis using SHapley Additive exPlanations (SHAP) revealed that the key features influencing food prices are unit quantity, price type, commodity, and currency, while geographical factors such as county have a lesser impact. Finally, the model and its important features were saved as pickle files to facilitate the deployment of the model on a web application for food price predictions.
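The four reported error metrics and the Diebold-Mariano comparison can be illustrated with a short standard-library sketch. The numbers below are toy values, not the paper's data, and the function names are hypothetical. The Diebold-Mariano statistic shown is the simple one-step-ahead form under squared-error loss with a normal approximation; the abstract does not state which variant the authors used. As a consistency check on the reported figures, the square root of the reported mean squared error (0.0261) agrees with the reported root mean square error (0.1615) up to rounding.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R-squared for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot  # 1 - SS_res / SS_tot
    return {"mae": mae, "mse": mse, "rmse": math.sqrt(mse), "r2": r2}


def diebold_mariano(y_true, pred_a, pred_b):
    """One-step-ahead Diebold-Mariano test under squared-error loss.

    A negative statistic means model A has the lower average loss.
    The p-value is two-sided and uses the standard normal approximation.
    """
    # Loss differential: squared error of A minus squared error of B.
    d = [(t - a) ** 2 - (t - b) ** 2 for t, a, b in zip(y_true, pred_a, pred_b)]
    n = len(d)
    d_bar = sum(d) / n
    gamma0 = sum((x - d_bar) ** 2 for x in d) / n  # lag-0 autocovariance
    stat = d_bar / math.sqrt(gamma0 / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2.0))
    return stat, p_value


# Toy values (assumed for illustration only).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pred_a = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]  # e.g. a stacked model
pred_b = [1.5, 1.4, 3.6, 3.3, 5.6, 5.2]  # e.g. a standalone model

metrics_a = regression_metrics(y_true, pred_a)
dm_stat, dm_p = diebold_mariano(y_true, pred_a, pred_b)
```

With only six observations the normal approximation is rough; the sketch is meant to show the mechanics of the comparison rather than to support inference.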

This data-driven decision support system can help policymakers and agricultural stakeholders (such as the Kenyan government) plan for future trends in food prices, potentially helping to prevent food insecurity in Kenya.

## Full-text entities

- **Diseases:** food insecurity (MESH:D005517)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12592163/full.md

## Figures

13 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12592163/full.md

## References

84 references — full list in the complete paper: https://tomesphere.com/paper/PMC12592163/full.md

---
Source: https://tomesphere.com/paper/PMC12592163