# Soil classification in the Sudan Savanna using Sentinel products and topographic information with machine learning models

**Authors:** Win Sithu Maung, Ikazaki Kenta, Sakai Toru, Simporé Saïdou, Zerbo Lamine, Kone Nicolas

PMC · DOI: 10.1038/s41598-026-46259-6 · Scientific Reports · 2026-03-27

## TL;DR

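This study compares three machine learning models (Random Forest, XGBoost, and SVM) for classifying four soil types in the Sudan Savanna from Sentinel-1, Sentinel-2, and topographic covariates. XGBoost with a selected feature combination performed best (78.9% overall accuracy), and topographic parameters were the most important of the selected features.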

## Contribution

The study integrates Sentinel-1 SAR data, Sentinel-2 optical data, and topographic features with machine learning to classify soils in a data-scarce region, and compares model performance under different feature combinations.

## Key findings

- XGBoost with the selected feature combination achieved the highest overall accuracy (78.9%), followed by RF (72.3%) and SVM (65.2%).
- Among the selected features, topographic parameters appeared to be the most important and provided complementary information for soil classification.
- The integration of optical, radar, and topographic data proved effective for soil mapping in the Sudan Savanna.
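
The accuracy figures above are overall accuracies, i.e. the share of validation samples assigned to the correct soil type. As a minimal sketch of how such a figure is derived from a confusion matrix, the example below uses the four soil classes named in the paper, but the counts are hypothetical and are not results from the study. The function names are illustrative as well.

```python
# Hypothetical 4-class confusion matrix (rows = true class, columns = predicted).
# The counts are made up for illustration and are NOT results from the paper.
CLASSES = [
    "Lixisols",
    "Petric Plinthosols",
    "Pisoplinthic Petric Plinthosols",
    "Gleysols",
]

confusion = [
    [18, 2, 1, 1],
    [3, 15, 2, 0],
    [1, 2, 12, 0],
    [1, 0, 0, 9],
]


def overall_accuracy(matrix):
    """Fraction of all samples that lie on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_class_recall(matrix):
    """Correct predictions for each class divided by that class's true-sample count."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


oa = overall_accuracy(confusion)
per_class = dict(zip(CLASSES, per_class_recall(confusion)))

print(f"overall accuracy: {oa:.3f}")
for name, recall in per_class.items():
    print(f"{name}: {recall:.3f}")
```

Per-class recall is shown alongside overall accuracy because a single overall figure can hide weak performance on a less frequent soil type.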

## Abstract

Accurate soil information is crucial for sustainable agricultural planning and land management, particularly in data-scarce regions such as the Sudan Savanna, the largest sorghum-producing area in Africa. A recent study reported that soils in this region corresponded well with the topography, having formed primarily through erosion–deposition processes, resulting in systematic variation in soil types along the landscape. Therefore, this study compared the performance of three machine learning models, i.e., Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Support Vector Machine (SVM), for soil classification based on multisource remote sensing and topographic data. Ground-truth data covering four soil types (Lixisols, Petric Plinthosols, Pisoplinthic Petric Plinthosols, and Gleysols) were used to train and validate the models using 19 remote sensing-derived covariates, including Sentinel-1 SAR, Sentinel-2 bands, spectral indices, and the Topographic Wetness Index. Classification was analyzed under different scenarios of remote sensing feature combination. Results showed that XGBoost with the selected feature combination achieved the highest performance, with an overall accuracy of 78.9%, followed by RF (72.3%) and SVM (65.2%). Among the selected features, topographic parameters appeared to be the most important and provided complementary information for accurate soil classification. This study demonstrates the effectiveness of integrating optical, radar, and topographic information for soil mapping and provides a valuable management tool to support agricultural and environmental strategies in the Sudan Savanna.
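
The abstract lists the Topographic Wetness Index (TWI) among the covariates and links soil type to landscape position. As a rough sketch, the commonly used definition is TWI = ln(a / tan β), where a is the upslope contributing area per unit contour width and β is the local slope. The abstract does not say how the authors derived TWI, so the function below, its minimum-slope floor, and the two example cells are assumptions for illustration only.

```python
import math


def topographic_wetness_index(specific_catchment_area, slope_degrees,
                              min_slope_degrees=0.1):
    """Return TWI = ln(a / tan(beta)) for a single terrain cell.

    specific_catchment_area: upslope contributing area per unit contour width (m).
    slope_degrees: local slope angle in degrees.
    min_slope_degrees: illustrative floor (an assumption) that avoids division
        by zero on perfectly flat cells.
    """
    if specific_catchment_area <= 0:
        raise ValueError("specific_catchment_area must be positive")
    beta = math.radians(max(slope_degrees, min_slope_degrees))
    return math.log(specific_catchment_area / math.tan(beta))


# Hypothetical cells, not taken from the paper: a steep upper-slope position
# with little contributing area, and a nearly flat valley bottom that
# collects flow from a large upslope area.
upper_slope = topographic_wetness_index(50.0, 8.0)
valley_bottom = topographic_wetness_index(5000.0, 0.5)

print(f"upper slope TWI:   {upper_slope:.2f}")
print(f"valley bottom TWI: {valley_bottom:.2f}")
```

Higher values indicate flat, flow-accumulating positions, which is one way a topographic covariate can help separate soils in low-lying areas from those on upper slopes.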

## Full-text entities

- **Species:** Sorghum bicolor (broomcorn, species) [taxon 4558], Panicum miliaceum (broomcorn millet, species) [taxon 4540], Arachis hypogaea (goober, species) [taxon 3818], Homo sapiens (human, species) [taxon 9606], Vigna unguiculata (cowpea, species) [taxon 3917], Glycine max (soybean, species) [taxon 3847]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC13039294/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13039294/full.md

## References

2 references — full list in the complete paper: https://tomesphere.com/paper/PMC13039294/full.md

---
Source: https://tomesphere.com/paper/PMC13039294