# Development and validation of an interpretable machine learning model for non-invasive screening of precancerous gastric lesions using symptom and lifestyle data: a multicentre cohort study

**Authors:** Lan Wang, Kaiqiang Tang, Peng Zhang, Jiasheng Liu, Bowen Wu, Jun Chen, Yan Li, Shiyu Du, Yan Wang, Shao Li

PMC · DOI: 10.1016/j.eclinm.2026.103756 · 2026-01-17

## TL;DR

Researchers developed an interpretable machine learning model that screens non-invasively for precancerous gastric lesions from symptom and lifestyle data. It outperformed guideline-based screening strategies in AUC and lowered the average cost per detected case.

## Contribution

An interpretable machine learning model for non-invasive precancerous gastric lesion screening that outperforms guideline-based strategies in internal testing, retrospective hospital-based validation, and prospective community-based validation.

## Key findings

- The model achieved AUCs of 0.82 in the internal hold-out test set, 0.80 in external retrospective validation, and 0.79 in prospective community-based validation for detecting precancerous gastric lesions.
- It exceeded both guideline-based strategies by 0.18–0.35 in AUC across all datasets and reduced the average cost per detected case by 37.1%.
- SHAP analysis identified 15 key predictors, including Helicobacter pylori infection, age, and melaena.

## Abstract

Precancerous gastric lesions (PLGC) are a critical stage in gastric cancer progression, where timely intervention can substantially reduce mortality. However, current screening relies predominantly on endoscopy, which is invasive, costly, and often inaccessible in resource-limited settings. We aimed to develop and validate an interpretable machine learning model for non-invasive PLGC screening using symptom and lifestyle data.

In this multicentre study, we enrolled eligible adults with no prior diagnosis of malignancy who were undergoing or scheduled to undergo upper gastrointestinal endoscopy. The development cohort comprised 1034 participants recruited at two hospitals between Nov 16, 2022, and Apr 7, 2023. Symptom and lifestyle data from this cohort were used to construct the development dataset, which was randomly split into a training set (n = 620), an internal validation set (n = 207), and a hold-out test set (n = 207). External performance was assessed in a retrospective hospital-based cohort from four additional hospitals (n = 630; May 21, 2018, to Jul 30, 2023) and a prospective community-based cohort from 32 screening sites (n = 847; Jun 21, 2023, to Nov 7, 2023). We developed a stacking ensemble model to predict the primary outcome (presence of PLGC) by integrating seven base learners (Gaussian Naïve Bayes, Logistic Regression, K-Nearest Neighbours, Gradient Boosting Classifier, eXtreme Gradient Boosting, Random Forest, Adaptive Boosting) and applied Shapley Additive Explanations (SHAP) for clinical interpretability. Model performance was compared with guideline-based screening strategies from the Chinese Guidelines for Gastric Cancer Screening and Early Diagnosis and Treatment and the British Society of Gastroenterology gastric cancer risk guidance, using the area under the receiver operating characteristic curve (AUC; 95% CI), sensitivity, specificity, positive predictive value, and negative predictive value.
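As a rough illustration of the modelling setup described above, the following sketch builds a stacking ensemble with scikit-learn and scores it by AUC on a 620/207/207 split. It is not the authors' implementation. The data are synthetic (`make_classification`), the logistic-regression meta-learner and all hyperparameters are assumptions, and eXtreme Gradient Boosting is left out so the example depends only on scikit-learn. The SHAP step is not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the symptom and lifestyle features (NOT the study data).
X, y = make_classification(
    n_samples=1034, n_features=15, n_informative=8, random_state=0
)

# Reproduce the reported split sizes: 620 training, 207 validation, 207 test.
X_train, X_rest, y_train, y_rest = train_test_split(
    X, y, train_size=620, stratify=y, random_state=0
)
X_val, X_test, y_val, y_test = train_test_split(
    X_rest, y_rest, test_size=207, stratify=y_rest, random_state=0
)

# Six of the seven base learners named in the abstract; XGBoost is omitted
# here (assumption made to avoid an extra dependency).
base_learners = [
    ("gnb", GaussianNB()),
    ("lr", LogisticRegression(max_iter=1000)),
    ("knn", KNeighborsClassifier()),
    ("gbc", GradientBoostingClassifier(random_state=0)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("ada", AdaBoostClassifier(random_state=0)),
]

# The meta-learner is an assumption: the abstract does not say which one was used.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(),
    cv=5,  # out-of-fold base predictions are used to train the meta-learner
    stack_method="predict_proba",
)
stack.fit(X_train, y_train)

val_auc = roc_auc_score(y_val, stack.predict_proba(X_val)[:, 1])
test_auc = roc_auc_score(y_test, stack.predict_proba(X_test)[:, 1])
print(f"validation AUC: {val_auc:.2f}, hold-out test AUC: {test_auc:.2f}")
```

The AUC values printed here describe the synthetic data only and say nothing about the study's results.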

In total, 2511 participants (male: n = 871, 34.7%; female: n = 1640, 65.3%) were included. The primary outcome, PLGC, was present in 509 of 1034 participants (49.2%) in the development cohort, in 331 of 630 participants (52.5%) in the retrospective validation cohort, and in 312 of 847 participants (36.8%) in the prospective validation cohort. The model showed robust performance for non-invasive PLGC screening, with AUCs of 0.82 (95% CI: 0.77–0.87) in the internal hold-out test set, 0.80 (95% CI: 0.78–0.82) in the external retrospective validation set, and 0.79 (95% CI: 0.77–0.81) in the prospective validation set. With AUC improvements of 0.18–0.35, our model exceeded both guideline-based strategies across all datasets (internal hold-out test: 0.82 (95% CI: 0.77–0.87) vs. 0.47 (95% CI: 0.42–0.53)/0.48 (95% CI: 0.42–0.53); external retrospective validation: 0.80 (95% CI: 0.78–0.82) vs. 0.62 (95% CI: 0.60–0.64)/0.58 (95% CI: 0.55–0.60); prospective validation: 0.79 (95% CI: 0.77–0.81) vs. 0.57 (95% CI: 0.54–0.59)/0.52 (95% CI: 0.50–0.55); all p < 0.001). In a cost-effectiveness analysis, this translated into a 37.1% reduction in the average cost per detected PLGC case versus guideline-based tools. SHAP analysis further identified 15 key predictors, including Helicobacter pylori infection, age, and melaena.
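The summary figures in this paragraph can be re-derived from the reported counts and AUC point estimates. The short check below uses only numbers quoted above. The labels `guideline_a` and `guideline_b` are placeholders, because the abstract does not state which guideline each paired value belongs to.

```python
# AUC point estimates from the abstract: (model, guideline_a, guideline_b).
reported_aucs = {
    "internal hold-out test": (0.82, 0.47, 0.48),
    "external retrospective validation": (0.80, 0.62, 0.58),
    "prospective validation": (0.79, 0.57, 0.52),
}

# Difference between the model and each guideline strategy, per dataset.
gaps = [
    round(model - guideline, 2)
    for model, *guidelines in reported_aucs.values()
    for guideline in guidelines
]
print("AUC gap range:", min(gaps), "to", max(gaps))

# (PLGC cases, participants) per cohort, as reported.
cohorts = {
    "development": (509, 1034),
    "retrospective validation": (331, 630),
    "prospective validation": (312, 847),
}
prevalence = {
    name: round(100 * cases / total, 1) for name, (cases, total) in cohorts.items()
}
total_participants = sum(total for _, total in cohorts.values())
print(prevalence, total_participants)
```

The smallest gap comes from the external retrospective cohort and the largest from the internal hold-out test set, which matches the 0.18–0.35 range stated in the text.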

An interpretable machine learning model integrating symptom and lifestyle information, some of which was implicated by traditional medicine, outperformed guideline-based strategies for non-invasive PLGC screening in both hospital-based and community-based populations. However, generalisability may be limited by the cohorts’ age and regional distribution; further studies should incorporate more non-invasive metrics to optimise the screening model and pursue broader external validation and real-world implementation.

**Funding:** National Natural Science Foundation of China; Innovation Team and Talents Cultivation Program of the National Administration of Traditional Chinese Medicine; Fundamental and Interdisciplinary Disciplines Breakthrough Plan of the Ministry of Education of China.

## Linked entities

- **Diseases:** gastric cancer (MONDO:0001056)

## Full-text entities

- **Diseases:** malignancy (MESH:D009369), Gastric Cancer (MESH:D013274), Precancerous gastric lesions (MESH:D011230), Helicobacter pylori infection (MESH:D016481)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12856190/full.md

---
Source: https://tomesphere.com/paper/PMC12856190