# Mapping and modelling land degradation vulnerability in a semi-arid region: a case study from Battalgazi District, Turkiye

**Authors:** Miraç Kılıç

PMC · DOI: 10.7717/peerj.20606 · PeerJ · 2026-01-27

## TL;DR

This study creates a new framework to map land degradation risk in a semi-arid region using satellite data and machine learning, showing where soils are most vulnerable.

## Contribution

The HyStoRSM framework combines land survey data, remote sensing, and machine learning for high-resolution land degradation vulnerability mapping.

## Key findings

- HyStoRSM achieved high prediction accuracy (R² = 0.74) for land degradation vulnerability.
- Hydro-topographic variables had a greater impact on LDV than spectral indices according to SHAP analysis.
- 21.7% and 20.3% of the study area showed high and very high land degradation vulnerability, respectively.

## Abstract

Land degradation threatens the provision of ecosystem services worldwide. Land degradation vulnerability (LDV) assessments still lack the necessary spatial detail and predictive accuracy, and the integration of multiple spectral indices with machine learning remains underexplored. This study addresses the need to map vulnerability to land degradation spatially and develops a novel framework that combines advanced machine learning and uncertainty measurement with the STORIE Index Rating (SIR), a semi-quantitative method for assessing potential soil productivity. The framework aims to predict, with high spatial accuracy, how vulnerable the soils of the study area are to land degradation.

This study addresses this gap by introducing HyStoRSM, a novel framework that integrates land-survey-derived data, remote sensing, and machine learning. The framework is applied as a case study in the Battalgazi district (940.5 km²) of Malatya province, which is representative of continental semi-arid conditions in the upper reaches of the Euphrates Basin in Eastern Anatolia. The framework integrates land survey data (major soil groups, land use capability, slope-depth combination, and erosion severity), spectral indices derived from Landsat 8 OLI/TIRS imagery, and topographic indices calculated from SRTM (Shuttle Radar Topography Mission) data. Landsat 8 and SRTM data from 2023 were processed on the Google Earth Engine platform. Local LDV scores were generated using the geometric mean form of the SIR. An extreme gradient boosting (XGBoost) regression model, optimized using Optuna, estimated continuous LDV scores, while SHapley Additive exPlanations (SHAP) provided insights into feature importance.
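To make the "geometric mean form" of the index concrete, the sketch below combines per-factor ratings into a single score. It is only an illustration: the function name `geometric_mean_score`, the assumption that each rating lies in (0, 1], and the example ratings are hypothetical and are not taken from the paper, which may scale or invert the result differently to obtain LDV.

```python
import math


def geometric_mean_score(ratings):
    """Combine factor ratings (each assumed to lie in (0, 1]) by geometric mean.

    Hypothetical helper: the paper's actual rating tables are not given here.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r <= 0.0 or r > 1.0 for r in ratings):
        raise ValueError("ratings are assumed to lie in (0, 1]")
    # Averaging logarithms is equivalent to taking the n-th root of the product.
    log_sum = sum(math.log(r) for r in ratings)
    return math.exp(log_sum / len(ratings))


# Illustrative (made-up) ratings for the four land survey factors.
example_ratings = {
    "major_soil_group": 0.8,
    "land_use_capability": 0.6,
    "slope_depth_combination": 0.5,
    "erosion_severity": 0.9,
}
example_score = geometric_mean_score(list(example_ratings.values()))
```

Compared with the classic multiplicative form, where the product of four ratings shrinks quickly, the geometric mean keeps the combined score on the same scale as the individual ratings while still penalizing any single low factor.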

The XGBoost regression model, with hyperparameters tuned by Optuna under 5-fold cross-validation and validated on an independent 30% test dataset, achieved high prediction accuracy (R² = 0.74, RMSE = 0.1285, MAE = 0.1002, and Huber loss = 0.0083). SHAP analysis revealed that the length-slope factor was the most influential variable, followed by the stream power index and the Normalized Difference Vegetation Index (NDVI). These results indicate that hydro-topographic variables had a greater impact on LDV than spectral indices. An LDV map at 30 m spatial resolution was then produced. Spatial analysis indicated that 21.7% and 20.3% of the study area exhibited high and very high LDV, respectively, primarily concentrated in the southern and southeastern regions. Conversely, low and very low vulnerability covered 16.9% and 12.4% of the area, respectively.
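For readers unfamiliar with the four reported metrics, the following sketch shows how they are conventionally computed from observed and predicted scores. The function name `regression_metrics`, the Huber threshold `delta = 1.0`, and the sample values are assumptions for illustration; the abstract does not state the threshold the author used.

```python
import math


def regression_metrics(y_true, y_pred, delta=1.0):
    """Return R², RMSE, MAE, and mean Huber loss for paired values.

    `delta` is the Huber threshold; 1.0 is an assumed default, not a value
    reported in the paper.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    ss_res = sum(r * r for r in residuals)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Huber loss is quadratic for small residuals and linear beyond delta.
    huber = sum(
        0.5 * r * r if abs(r) <= delta else delta * (abs(r) - 0.5 * delta)
        for r in residuals
    ) / n
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(r) for r in residuals) / n,
        "huber": huber,
    }


# Made-up observed and predicted LDV scores, for illustration only.
sample_metrics = regression_metrics([0.2, 0.4, 0.6, 0.8], [0.25, 0.35, 0.65, 0.75])
```

When every residual is smaller than the threshold, the mean Huber loss reduces to half the mean squared error. The reported values fit that pattern, since 0.5 × 0.1285² ≈ 0.0083, which would be consistent with a threshold larger than the residuals on the test set.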

The HyStoRSM framework integrates multisource satellite data, land survey data, and advanced machine learning into a single, interpretable framework. This enables proactive, precise land degradation risk management, especially in semi-arid regions where terrain and hydrologic controls drive erosion vulnerability.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12857559/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12857559/full.md

## References

114 references — full list in the complete paper: https://tomesphere.com/paper/PMC12857559/full.md

---
Source: https://tomesphere.com/paper/PMC12857559