# Utility of Clustering in Mortality Risk Stratification in Pulmonary Hypertension

**Authors:** Pasquale Tondo, Lucia Tricarico, Giuseppe Galgano, Maria Pia C. Varlese, Daphne Aruanno, Crescenzio Gallo, Giulia Scioscia, Natale D. Brunetti, Michele Correale, Donato Lacedonia

PMC · DOI: 10.3390/bioengineering12040408 · Bioengineering · 2025-04-11

## TL;DR

This study uses machine learning to identify distinct groups of pulmonary hypertension patients and predict five-year mortality based on clinical data.

## Contribution

The study combines clustering, to identify PH phenotypes, with machine learning models, to predict five-year mortality and identify its risk factors.

## Key findings

- Three distinct PH clusters were identified with varying clinical characteristics and mortality rates.
- Logistic regression achieved the best predictive performance with an AUC of 0.835 and accuracy of 0.744.
- Age, NYHA class, and number of medications were identified as significant mortality risk factors.
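The two metrics quoted above can be illustrated with a minimal sketch. It assumes a binary outcome (1 = died within five years) and a predicted risk between 0 and 1 for each patient. The function names, the 0.5 threshold, and the toy numbers are illustrative assumptions, not the study's data or code.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of patients whose thresholded prediction matches the outcome."""
    preds = [1 if p >= threshold else 0 for p in y_prob]
    return sum(int(p == t) for p, t in zip(preds, y_true)) / len(y_true)


def roc_auc(y_true, y_prob):
    """AUC as the chance that a random positive case is ranked above a
    random negative case (ties count as half a win)."""
    pos = [p for p, t in zip(y_prob, y_true) if t == 1]
    neg = [p for p, t in zip(y_prob, y_true) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up toy outcomes and predicted risks (NOT data from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.7, 0.4, 0.3, 0.6, 0.2, 0.1, 0.8]

print(f"AUC = {roc_auc(y_true, y_prob):.4f}")        # AUC = 0.9375
print(f"accuracy = {accuracy(y_true, y_prob):.3f}")  # accuracy = 0.750
```

AUC depends only on how the model ranks patients, while accuracy depends on the chosen threshold, which is why the two values can differ noticeably for the same model.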

## Abstract

Background: Pulmonary hypertension (PH) is a condition characterized by increased pressure in the pulmonary arteries and a poor prognosis, so optimal management is necessary. The study’s aim was to search for PH phenotypes and to develop a predictive model of five-year mortality using machine learning (ML) algorithms. Methods: This multicenter study was conducted on 122 PH patients. Clinical and demographic data were collected and used to identify phenotypes through clustering. Predictive models were then built with several ML algorithms. Results: Three PH clusters were identified. Cluster 1 (mean age 68.57 ± 10.54 years) includes 57% females and 69% from non-respiratory PH groups, with better cardiac function (NYHA class 2.61 ± 0.84) and better respiratory function (FEV1% 78.78 ± 21.54). Cluster 2 (mean age 71.36 ± 8.32 years) includes 50% females and 44% from PH group 3, with worse respiratory function (FEV1% 68.12 ± 10.20), intermediate cardiac function (NYHA class 3.18 ± 0.49), and significantly higher mortality (75%). Cluster 3 is the youngest cluster (mean age 61.11 ± 13.50 years), with 65% males and 81% from non-respiratory PH groups, intermediate respiratory function (FEV1% 70.51 ± 17.91), and worse cardiac performance (NYHA class 3.22 ± 0.58). Among the ML models tested, logistic regression showed the best predictive performance (AUC = 0.835, accuracy = 0.744) and identified three mortality risk factors: age, NYHA class, and number of medications taken. Conclusions: The results suggest that integrating ML into clinical practice can improve risk stratification, helping to optimize treatment strategies and improve outcomes for PH patients.
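The phenotype-discovery step can be sketched as follows. The abstract does not name the clustering algorithm, so this example assumes plain k-means (Lloyd's algorithm) on z-scored features. The feature set (age, NYHA class, FEV1%), the starting centroids, and all patient values are hypothetical and chosen only to show the mechanics.

```python
import math
import statistics


def standardize(rows):
    """Z-score each column so age, NYHA class and FEV1% are on comparable scales."""
    cols = list(zip(*rows))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against zero spread
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def kmeans(points, init_idx, max_iter=100):
    """Lloyd's algorithm; init_idx selects the rows used as starting centroids."""
    centroids = [list(points[i]) for i in init_idx]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so we have converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old centroid if a cluster empties
                centroids[k] = [statistics.fmean(c) for c in zip(*members)]
    return labels, centroids


# Hypothetical patients as (age, NYHA class, FEV1 %); NOT data from the study.
patients = [
    (45, 1, 95), (47, 1, 93), (46, 2, 96),
    (78, 4, 50), (80, 4, 48), (79, 3, 52),
    (46, 4, 51), (48, 4, 49), (47, 3, 53),
]

labels, _ = kmeans(standardize(patients), init_idx=[0, 3, 6])
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Standardizing first matters because k-means uses Euclidean distance: without it, FEV1% and age (tens of units) would dominate NYHA class (a 1 to 4 scale). In practice the number of clusters and the initialization would need to be chosen and validated rather than fixed by hand as they are here.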

## Linked entities

- **Diseases:** pulmonary hypertension (MONDO:0005149)

## Full-text entities

- **Diseases:** PH (MESH:D006976)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12024815/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12024815/full.md

## References

43 references — full list in the complete paper: https://tomesphere.com/paper/PMC12024815/full.md

---
Source: https://tomesphere.com/paper/PMC12024815