# Clustering in dilated cardiomyopathy at initial evaluation: An effective tool for clinical stratification

**Authors:** Ilaria Gandin, Maria Perotto, Alessia Paldino, Giovanni Baj, Denise Zaffalon, Andrea Pezzato, Cinzia Crescenzi, Fabiana Romeo, Annamaria Martino, Francesca Fanisio, Federica Toto, Maddalena Rossi, Marta Gigli, Matteo Dal Ferro, Leonardo Calò, Gianfranco Sinagra, Marco Merlo

PMC · DOI: 10.1002/ejhf.3780 · 2025-08-15

## TL;DR

This study uses machine learning to identify two subgroups of dilated cardiomyopathy patients, distinguished mainly by baseline ECG features, that differ in genetic variant yield and in risk of sudden cardiac death or major ventricular arrhythmias.

## Contribution

A novel machine learning-based clustering approach for clinical stratification of DCM patients using baseline ECG data.

## Key findings

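- Unsupervised clustering identified two DCM subgroups, CL1 (82%) and CL2 (18%), differentiated mainly by ECG characteristics.
- CL2 had a lower yield of pathogenic/likely pathogenic variants than CL1 (15% vs. 47%, p < 0.001) and a lower risk of sudden cardiac death/major ventricular arrhythmias (hazard ratio 0.29, 95% confidence interval 0.13–0.67).
- A simplified clustering using three ECG variables (QRS duration, presence of left bundle branch block, intrinsicoid deflection >50 ms) was validated in an external cohort of 160 patients.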

## Abstract

Dilated cardiomyopathy (DCM) has a highly variable presentation and disease course. Current stratification strategies are complex and require multimodality evaluation. Using machine learning (ML) on a large dataset obtained at first cardiological evaluation, this study aims to identify specific DCM subgroups.

In a retrospective cohort of DCM patients, baseline clinical, genetic, and outcome data were collected. Unsupervised clustering was performed and then simplified to identify patient subgroups. The subgroups were characterized based on outcomes, including all‐cause mortality/heart transplantation (HT)/left ventricular assist device implantation (LVAD), sudden cardiac death/major ventricular arrhythmias (SCD/MVA) and heart failure‐related death/HT/LVAD. These findings were then validated in an external population.

In the derivation cohort of 409 patients (mean age 46 ± 14 years, 71% male), two cluster‐subgroups were identified: CL1 (82%) and CL2 (18%), mainly differentiated by electrocardiogram (ECG) characteristics. A lower yield of pathogenic/likely pathogenic variants was found in CL2 versus CL1 (15% vs. 47%, p < 0.001). A simplified clustering using only three variables (QRS duration, presence of left bundle branch block, intrinsicoid deflection >50 ms) was equally effective and validated in the external cohort of 160 patients (mean age 54 ± 13 years, 68% male). A lower risk for SCD/MVA events was observed for CL2 in the primary (hazard ratio 0.29, 95% confidence interval 0.13–0.67) and validation cohort (p = 0.017).
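The reported hazard ratio and confidence interval for SCD/MVA in the primary cohort can be sanity-checked with a little arithmetic. The sketch below assumes the interval is a standard Wald interval, symmetric on the log scale (as is usual for Cox models, though the abstract does not say so); the function name `wald_from_ci` is purely illustrative. It recovers the standard error of the log hazard ratio, the z statistic, and an approximate two-sided p-value.

```python
import math

def wald_from_ci(hr, lo, hi, z_crit=1.959964):
    """Back out SE(log HR), z and a two-sided p-value from a hazard ratio
    and its 95% CI, assuming a Wald interval symmetric on the log scale."""
    se = (math.log(hi) - math.log(lo)) / (2 * z_crit)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return se, z, p

# Values quoted in the abstract for CL2 versus CL1 in the primary cohort.
se, z, p = wald_from_ci(0.29, 0.13, 0.67)
print(f"SE(log HR) = {se:.3f}, z = {z:.2f}, p = {p:.4f}")

# Under the symmetry assumption the point estimate should sit near the
# geometric mean of the two CI bounds.
print(f"geometric midpoint of CI = {math.sqrt(0.13 * 0.67):.3f}")
```

Under these assumptions the implied p-value is on the order of 0.003, and the geometric midpoint of the interval lies close to the quoted point estimate of 0.29, so the reported figures are internally consistent.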

Using ML, baseline ECG variables were found to effectively identify two DCM subgroups differing in disease progression and genetic background. This approach could serve as a valuable tool for improving risk stratification of DCM patients upon their initial evaluation.
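To make the idea of a three-variable ECG clustering concrete, here is a minimal sketch. The abstract does not name the clustering algorithm, so this assumes plain two-cluster k-means on standardized features, which is only one of several reasonable choices for mixed continuous/binary inputs. All patient values are synthetic and invented for illustration, the helper names (`standardize`, `kmeans2`) are hypothetical, and the output labels 0/1 are not meant to correspond to CL1/CL2.

```python
import math
import random

def standardize(rows):
    """Z-score each column so QRS duration in ms does not swamp the 0/1 flags."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans2(points, n_iter=50):
    """Two-cluster k-means with deterministic farthest-point initialisation."""
    c0 = points[0]
    c1 = max(points, key=lambda p: sq_dist(p, c0))
    centroids = [list(c0), list(c1)]
    labels = []
    for _ in range(n_iter):
        new_labels = [0 if sq_dist(p, centroids[0]) <= sq_dist(p, centroids[1]) else 1
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(c) / len(members) for c in zip(*members)]
    return labels

# Synthetic patients: [QRS duration (ms), LBBB present (0/1), intrinsicoid deflection >50 ms (0/1)].
rng = random.Random(0)
rows = [[rng.gauss(100, 10), 0, 0] for _ in range(40)]   # narrow QRS, no LBBB
rows += [[rng.gauss(160, 12), 1, 1] for _ in range(10)]  # wide QRS, LBBB, delayed deflection
labels = kmeans2(standardize(rows))
print(labels)
```

Standardizing first is a deliberate choice: without it, QRS duration in milliseconds would dominate the distance and the two binary flags would contribute almost nothing. Real ECG data would be far less cleanly separated than this toy example.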

## Linked entities

- **Diseases:** dilated cardiomyopathy (MONDO:0005021), heart failure (MONDO:0005252), sudden cardiac death (MONDO:0007264)

## Full-text entities

- **Diseases:** sudden cardiac death (MESH:D016757), left bundle branch block (MESH:D002037), MVA (MESH:C536987), major ventricular arrhythmias (MESH:D001145), death (MESH:D003643), DCM (MESH:D002311), SCD (MESH:C536778), heart failure (MESH:D006333)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12803628/full.md

---
Source: https://tomesphere.com/paper/PMC12803628