# Machine learning prediction models for visual impairment in Chinese adults aged ≥ 45 years with cardiovascular metabolic diseases: a population-based study using CHARLS

**Authors:** Yuhao Liu, Riyan Zhang, Duoduo Xie, Min Liu, Guanshun Yu, Zhong Lin, Jia Qu, Ronghan Wu

PMC · DOI: 10.1186/s12886-025-04596-6 · 2025-12-30

## TL;DR

This study applies machine learning to predict vision impairment in Chinese adults aged ≥ 45 years with cardiovascular metabolic diseases, identifying key risk factors and developing a model for early detection.

## Contribution

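A logistic regression-based prediction model for vision impairment in patients with cardiovascular metabolic diseases (CMD), offering interpretable results and stable performance across training, internal validation, and temporal validation sets.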

## Key findings

- Eleven significant predictors of vision impairment were identified, including hearing impairment, depressive symptoms, and glaucoma history.
- Logistic regression showed the most stable performance across the training, internal validation, and temporal validation sets, with AUCs between 0.693 and 0.705.
- A nomogram was developed for individualized risk estimation, aiding clinical decision-making in resource-limited settings.

## Abstract

The prevalence of cardiovascular metabolic diseases (CMD) is growing among adults aged ≥ 45 years, and vision impairment (VI) is highly prevalent in this population. This study aimed to identify the key determinants of VI in individuals with CMD and to develop risk prediction models.

We analyzed data collected in 2011 (n = 1,926) and 2015 (n = 3,033) within the China Health and Retirement Longitudinal Study (CHARLS). Risk factors were selected using least absolute shrinkage and selection operator (LASSO) regression followed by multivariable logistic regression. Eight machine learning (ML) algorithms were applied: logistic regression (LR), gradient boosting machine (GBM), XGBoost, LightGBM, CatBoost, AdaBoost, neural network (NN), and support vector machine (SVM). Model performance was evaluated with receiver operating characteristic (ROC) curves, calibration assessments, and decision curve analysis.
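As a rough illustration of two of the evaluation tools named above, the sketch below computes the area under the ROC curve (AUC) from its rank interpretation and the net benefit used in decision curve analysis. It is a minimal sketch in plain Python, not the authors' code: the function names `auc_rank` and `net_benefit` and the toy labels and scores are assumptions made for this example, and none of the numbers come from CHARLS.

```python
from typing import Sequence


def auc_rank(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], y_score: Sequence[float],
                threshold: float) -> float:
    """Net benefit at a risk threshold pt: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, y_score) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


# Toy data (hypothetical): 1 = vision impairment, scores = predicted risk.
y_true = [0, 0, 1, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]

print(f"AUC: {auc_rank(y_true, y_score):.3f}")
print(f"Net benefit at pt = 0.3: {net_benefit(y_true, y_score, 0.3):.3f}")
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing the model against the "treat all" and "treat none" strategies.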

Eleven predictors were significantly associated with VI in CMD patients: hearing impairment, depressive symptoms, pain, lower uric acid levels, poorer self-rated health, functional limitations, multimorbidity, reduced cognitive function, poorer sleep quality, a history of glaucoma, and a history of cataract surgery. Among the eight ML algorithms, LR achieved the most stable performance, with AUCs of 0.705 (2015 training set), 0.693 (2015 internal validation set), and 0.695 (2011 temporal validation set). Shapley Additive exPlanations (SHAP) analysis ranked the relative contribution of predictors, and a nomogram was developed for individualized risk estimation.
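To make the nomogram idea concrete, the sketch below shows how a logistic regression model can be turned into a points scale and an individual risk estimate. It is an illustration under stated assumptions, not the published nomogram: the intercept and coefficients are hypothetical placeholders (the paper's fitted values are not given here), only three of the eleven predictors are included, each is treated as binary, and the helper names `predicted_risk` and `nomogram_points` are invented for this example.

```python
import math
from typing import Dict, Tuple

# HYPOTHETICAL values for illustration only; NOT the paper's estimates.
INTERCEPT = -1.5
COEFS: Dict[str, float] = {
    "hearing_impairment": 0.6,
    "depressive_symptoms": 0.4,
    "glaucoma_history": 0.8,
}


def predicted_risk(x: Dict[str, float],
                   intercept: float = INTERCEPT,
                   coefs: Dict[str, float] = COEFS) -> float:
    """Logistic model: risk = 1 / (1 + exp(-(b0 + sum(b_k * x_k))))."""
    linear_predictor = intercept + sum(b * x.get(k, 0.0) for k, b in coefs.items())
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def nomogram_points(coefs: Dict[str, float],
                    ranges: Dict[str, Tuple[float, float]]) -> Dict[str, float]:
    """Maximum points per predictor, scaled so that the predictor with the
    largest effect over its observed range receives 100 points."""
    span = {k: abs(b) * (ranges[k][1] - ranges[k][0]) for k, b in coefs.items()}
    max_span = max(span.values())
    return {k: 100.0 * s / max_span for k, s in span.items()}


# All three illustrative predictors are coded 0/1 here (an assumption).
ranges = {k: (0.0, 1.0) for k in COEFS}
points = nomogram_points(COEFS, ranges)

# Hypothetical patient with hearing impairment and a history of glaucoma.
patient = {"hearing_impairment": 1, "glaucoma_history": 1}
risk = predicted_risk(patient)

print({k: round(v, 1) for k, v in points.items()})
print(f"Estimated risk: {risk:.3f}")
```

A printed nomogram encodes the same arithmetic graphically: the reader sums the points for each predictor and reads the corresponding probability from a total-points axis.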

We established an LR-based prediction model for VI in patients with CMD aged ≥ 45 years that shows stable performance and favorable interpretability in clinical settings. This tool may support timely recognition of, and intervention for, eye health risks in this population, particularly in settings with limited ophthalmic resources.

The online version contains supplementary material available at 10.1186/s12886-025-04596-6.

## Linked entities

- **Diseases:** glaucoma (MONDO:0005041), cataract (MONDO:0005129)
- **Species:** Homo sapiens (taxon 9606)

## Full-text entities

- **Diseases:** cardiovascular metabolic diseases (MESH:D002318), visual impairment (MESH:D014786)

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12860091/full.md

---
Source: https://tomesphere.com/paper/PMC12860091