# An interpretable machine learning model for detecting vision-threatening diabetic retinopathy among patients with diabetic retinopathy: a web-based cross-sectional study

**Authors:** Mingyang Song, Yimeng Shi

PMC · DOI: 10.3389/fendo.2026.1776188 · Frontiers in Endocrinology · 2026-03-04

## TL;DR

This study developed an interpretable machine learning model that detects vision-threatening diabetic retinopathy (VTDR) among patients with diabetic retinopathy from routine clinical data, with the aim of enabling earlier intervention to prevent blindness.

## Contribution

The paper proposes an interpretable machine learning model that detects vision-threatening diabetic retinopathy from routine clinical data, and validates it on a held-out testing set.

## Key findings

- The SVM model achieved an AUC of 0.879 and accuracy of 0.837 in detecting vision-threatening diabetic retinopathy (VTDR).
- Key factors associated with VTDR included diabetic treatment, diabetes duration, glycated hemoglobin levels, albuminuria, and anemia.
- A simplified calculator based on SHAP rankings maintained strong diagnostic performance.

## Abstract

Vision-threatening diabetic retinopathy (VTDR) is a severe complication of type 2 diabetes mellitus (T2DM), particularly prevalent in patients with prolonged disease duration, poor glycemic control, and systemic comorbidities. This condition frequently progresses asymptomatically toward irreversible blindness without timely intervention. The early identification of VTDR is challenging due to the lack of validated biomarkers and a reliance on subjective clinical assessments. This study aimed to develop and validate an interpretable machine learning (ML) model to detect VTDR among patients with diabetic retinopathy (DR).

Retrospective clinical data from T2DM patients with DR were extracted from the electronic medical records at our hospital and categorized into VTDR and non-VTDR (defined as mild-to-moderate non-proliferative diabetic retinopathy) groups. The dataset was partitioned into training and testing sets at a 7:3 ratio. Eight ML models were trained and evaluated using metrics such as area under the curve (AUC), accuracy, and recall, and were compared using a comprehensive scoring system with a maximum total score of 64. Shapley Additive Explanations (SHAP) were used to interpret the best-performing model. A web-based application was developed to demonstrate potential clinical utility.
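The split-and-evaluate workflow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the abstract does not say whether the 7:3 split was stratified or which decision threshold was used, so the stratification, the 0.5 threshold, every function name, and the toy labels and probabilities are hypothetical. The count of 415 VTDR cases is only an approximation implied by the reported 36.9% prevalence among 1,124 patients.

```python
import random


def stratified_split(labels, test_fraction=0.3, seed=0):
    """Return (train_idx, test_idx), keeping the class ratio similar in both parts."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


def auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def brier_score(y_true, probs):
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, probs)) / len(y_true)


def threshold_metrics(y_true, probs, threshold=0.5):
    """Accuracy, precision, recall, and F1 after thresholding the probabilities."""
    pred = [1 if p >= threshold else 0 for p in probs]
    tp = sum(1 for y, yh in zip(y_true, pred) if y == 1 and yh == 1)
    tn = sum(1 for y, yh in zip(y_true, pred) if y == 0 and yh == 0)
    fp = sum(1 for y, yh in zip(y_true, pred) if y == 0 and yh == 1)
    fn = sum(1 for y, yh in zip(y_true, pred) if y == 1 and yh == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Assumed cohort shape: about 415 VTDR (1) and 709 non-VTDR (0) patients.
labels = [1] * 415 + [0] * 709
train_idx, test_idx = stratified_split(labels, test_fraction=0.3, seed=0)

# Toy predictions for eight hypothetical test patients (not study data).
toy_truth = [1, 1, 1, 0, 0, 0, 0, 1]
toy_probs = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
toy_auc = auc(toy_truth, toy_probs)
toy_brier = brier_score(toy_truth, toy_probs)
toy_metrics = threshold_metrics(toy_truth, toy_probs)
```

AUC and the Brier score are computed from the predicted probabilities directly, whereas accuracy, precision, recall, and F1 depend on the chosen threshold, which is why the two groups of metrics can rank models differently.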

Among 1,124 enrolled patients, the prevalence of VTDR was 36.9%. Key associated factors included diabetic treatment, T2DM duration, glycated hemoglobin levels, albuminuria, and anemia. The Support Vector Machine (SVM) model demonstrated superior performance, with an AUC of 0.879, accuracy of 0.837, precision of 0.833, Brier score of 0.129, and an F1 score of 0.756, outperforming the other ML models. The SVM model achieved the highest total score (57/64) in the testing cohort. Furthermore, decision curve analysis and calibration curves confirmed the robustness and reliability of the models. A simplified calculator derived from the SHAP feature importance rankings maintained strong diagnostic capacity.
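The abstract reports precision and F1 for the SVM model but not its recall. Because F1 is the harmonic mean of precision and recall, the recall implied by the two reported figures can be recovered with a short calculation. The sketch below is only an arithmetic check on the rounded numbers above; the function name is hypothetical, and the result is approximate because the inputs are rounded to three decimals.

```python
def recall_from_precision_f1(precision, f1):
    """Solve F1 = 2PR / (P + R) for recall R, given precision P and F1."""
    denom = 2 * precision - f1
    if denom <= 0:
        raise ValueError("F1 must be smaller than 2 * precision")
    return f1 * precision / denom


# Reported SVM figures from the abstract: precision 0.833, F1 0.756.
implied_recall = recall_from_precision_f1(precision=0.833, f1=0.756)
print(round(implied_recall, 3))  # 0.692
```

Under these figures the model would identify roughly 69% of VTDR cases at the operating point used, a point worth weighing for a screening tool where missed cases are costly.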

The interpretable SVM model effectively detected VTDR among patients with DR using routine clinical data. While requiring external validation, this study serves as a proof-of-concept for a cost-effective screening tool that could assist clinicians in prioritizing high-risk patients and facilitating early intervention to prevent irreversible vision impairment.

## Linked entities

- **Diseases:** type 2 diabetes mellitus (MONDO:0005148), diabetic retinopathy (MONDO:0005266), anemia (MONDO:0002280)

## Full-text entities

- **Diseases:** anemia (MESH:D000740), diabetic (MESH:D003920), DR (MESH:D003930), T2DM (MESH:D003924), albuminuria (MESH:D000419), blindness (MESH:D001766), vision impairment (MESH:D014786)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12997098/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12997098/full.md

## References

57 references — full list in the complete paper: https://tomesphere.com/paper/PMC12997098/full.md

---
Source: https://tomesphere.com/paper/PMC12997098