# Integrative machine learning approach to risk prediction for dementia and Alzheimer’s disease

**Authors:** Amos Stern, Michal Linial

PMC · DOI: 10.1007/s11357-025-01828-x · GeroScience · 2025-08-27

## TL;DR

This study uses machine learning to predict the risk of dementia and Alzheimer’s disease by combining health, genetic, and lifestyle data from the UK Biobank.

## Contribution

The study introduces an integrative ML approach that combines genetic and lifestyle factors for dementia risk prediction, emphasizing sex-specific and modifiable risk factors.

## Key findings

- CatBoost achieved the best performance (ROC-AUC = 0.773) in predicting Alzheimer’s disease risk.
- ApoE-ε4 was the most predictive genetic marker, while comorbidities and lifestyle factors were key non-genetic predictors.
- Vascular dementia models outperformed Alzheimer’s-specific models despite fewer cases.
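
The ROC-AUC figures above can be read as the probability that a randomly chosen case receives a higher risk score than a randomly chosen control. The following is a minimal sketch of that calculation and of a per-sex breakdown. It is not the authors' pipeline: the function names `roc_auc` and `roc_auc_by_group`, the toy labels, the scores, and the sex codes are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def roc_auc(labels, scores):
    """ROC-AUC via the Mann-Whitney U statistic; tied scores share an average rank."""
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one control")

    # Rank all scores in ascending order (1-based), averaging ranks within ties.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    # U counts the case/control pairs in which the case is ranked higher.
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def roc_auc_by_group(labels, scores, groups):
    """Compute ROC-AUC separately for each group (for example, each sex)."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: roc_auc(ys, ss) for g, (ys, ss) in buckets.items()}


# Hypothetical toy data: 1 = case, 0 = control; scores are model risk outputs.
toy_labels = [1, 0, 1, 0, 0, 1, 0, 0]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.3, 0.4]
toy_sex = ["F", "F", "F", "F", "M", "M", "M", "M"]

overall_auc = roc_auc(toy_labels, toy_scores)
auc_by_sex = roc_auc_by_group(toy_labels, toy_scores, toy_sex)
```

On real data, a gap between the per-sex values in `auc_by_sex` would be one simple way to surface the kind of sex-specific difference in model performance that the study emphasizes.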

## Abstract

Dementia, particularly Alzheimer’s disease (AD), presents a growing global health challenge characterized by cognitive decline, behavioral changes, and loss of independence. With increasing life expectancy, early diagnosis and improved clinical strategies are urgently needed. This study developed and evaluated machine learning (ML) models to predict AD risk using UK Biobank data, integrating health, genetic, and lifestyle factors. The cohort included 2878 AD cases and 72,366 controls. Among several algorithms, CatBoost performed best (ROC-AUC = 0.773), especially in females. Inputs included ICD-10 codes from 5 years pre-diagnosis, ApoE-ε4 genotype, and a large collection of modifiable risk factors. Despite fewer cases, the risk prediction models for vascular dementia (VaD) outperformed the AD-specific models. ApoE-ε4 was the most predictive genetic marker, while other common variants had limited utility. Key non-genetic predictors included comorbidities (e.g., diabetes, hypertension), education, physical activity, and diet. These findings highlight the value of integrating diverse data sources for dementia risk prediction and emphasize the role of sex-specific modeling and modifiable factors in early, personalized intervention strategies.

The online version contains supplementary material available at 10.1007/s11357-025-01828-x.

## Linked entities

- **Diseases:** dementia (MONDO:0001627), Alzheimer’s disease (MONDO:0004975), vascular dementia (MONDO:0004648), diabetes (MONDO:0005015)

## Full-text entities

- **Genes:** APOE (apolipoprotein E) [NCBI Gene 348] {aka AD2, APO-E, ApoE4, LDLCQ5, LPG}
- **Diseases:** AD (MESH:D000544), cognitive decline (MESH:D003072), loss of independence (MESH:D064129), Dementia (MESH:D003704), diabetes (MESH:D003920), hypertension (MESH:D006973), VaD (MESH:D015140)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12972432/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12972432/full.md

## References

1 reference; full list in the complete paper: https://tomesphere.com/paper/PMC12972432/full.md

---
Source: https://tomesphere.com/paper/PMC12972432