# An Open-Source Analysis of Cardiomyopathy Using Machine Learning and Electrocardiograms

**Authors:** Arda Altintepe, Asu Rustemli, Amir Reza Vazifeh, Jason W. Fleischer

PMC · DOI: 10.3390/diagnostics16050719 · Diagnostics · 2026-02-28

## TL;DR

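Using only open-access ECG data from MIMIC-IV-ECG, this study trains logistic regression and XGBoost models to separate HCM from DCM (AUC-ROC up to 0.92) and, with more difficulty, obstructive from non-obstructive HCM (AUC-ROC up to 0.81), supporting ECG as a possible screening step before echocardiography.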

## Contribution

The study introduces an open-source ECG-based machine learning pipeline to differentiate cardiomyopathy subtypes, including obstructive and non-obstructive HCM.

## Key findings

- Logistic regression achieved high discrimination of HCM from ischemic DCM (AUC-ROC = 0.92) and non-ischemic DCM (AUC-ROC = 0.90).
- Differentiating obstructive HCM from non-obstructive HCM was more challenging (XGBoost AUC-ROC = 0.81; LR = 0.75).
- ECG features showed measurable differences between cardiomyopathy subtypes, such as QRS amplitudes and T-loop complexity.
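
To make the AUC-ROC values above easier to interpret, here is a minimal sketch of the metric as the fraction of (positive, negative) pairs that a model ranks correctly. The function name `auc_roc` and the toy labels and scores are illustrative assumptions, not the study's code or data.

```python
from typing import Sequence


def auc_roc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores, NOT study data): 1 = HCM, 0 = DCM.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
toy_auc = auc_roc(toy_labels, toy_scores)
print(f"toy AUC-ROC: {toy_auc:.3f}")
```

Read this way, an AUC-ROC of 0.92 means that a randomly chosen HCM recording receives a higher model score than a randomly chosen DCM-I recording about 92% of the time, while 0.5 corresponds to chance-level ranking.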

## Abstract

**Background/Objectives:** Dilated cardiomyopathy (DCM) and hypertrophic cardiomyopathy (HCM) are common cardiomyopathies associated with heart failure. Electrocardiogram (ECG) screening before an echocardiogram could help streamline diagnosis, particularly in rural areas. Prior ECG–machine learning (ML) studies do not use open-source data when studying cardiomyopathy, and very few proprietary studies directly compare HCM and DCM or address ECG differences within obstructive (HOCM) and non-obstructive HCM (HNCM).

**Methods:** Standard and vectorcardiogram-derived (VCG) ECG features were extracted from the MIMIC-IV-ECG database. The final cohort comprised 599 patients (HCM = 208 [HOCM = 99, HNCM = 53, unknown = 56]; DCM = 391 [ischemic cardiomyopathy with left ventricular dilation = 250, non-ischemic = 141]). Logistic regression (LR) and extreme gradient boosting (XGBoost) with five-fold cross-validation separated HCM from ischemic cardiomyopathy with left ventricular dilation (DCM-I) and non-ischemic DCM (DCM-NI), and HOCM from HNCM.

**Results:** Using the area under the receiver-operating-characteristic curve (AUC-ROC) as the performance metric, LR achieved high discrimination of HCM from DCM-I (0.92) and DCM-NI (0.90). However, differentiating HOCM from HNCM proved more difficult (XGBoost = 0.81; LR = 0.75). Both DCM subtypes (especially ischemic) showed lower QRS amplitudes and right-posterior ventricular gradient orientation; HCM displayed higher amplitudes and larger, more complex T-loops. Within HCM, HOCM had stronger leftward electrical activity and more dipolar to non-dipolar QRS energy after singular value decomposition.

**Conclusions:** Using only open-access data, we demonstrate an interpretable ECG-based pipeline that discriminates cardiomyopathy and highlights distinct features. While detecting obstruction remains difficult, ECG features provide measurable separation, supporting possible diagnostic screening and offering a reproducible framework for future studies.
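
The abstract's "dipolar to non-dipolar QRS energy after singular value decomposition" can be illustrated with a short sketch. It assumes a common convention from the ECG literature, in which the first three singular components of a multi-lead segment are treated as dipolar and the remainder as non-dipolar; the paper's exact definition is not given in this summary, so the function name `dipolar_ratio`, the `n_dipolar = 3` cutoff, the per-lead mean removal, and the synthetic 8-lead signal are all assumptions.

```python
import numpy as np


def dipolar_ratio(qrs: np.ndarray, n_dipolar: int = 3) -> float:
    """Ratio of dipolar to non-dipolar energy for a (leads x samples) segment.

    Energy is taken as the squared singular values of the segment.
    ASSUMPTION: the first `n_dipolar` components are the dipolar part.
    """
    # Remove each lead's mean so a constant offset does not count as energy.
    centered = qrs - qrs.mean(axis=1, keepdims=True)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    energy = singular_values ** 2
    dipolar = energy[:n_dipolar].sum()
    non_dipolar = energy[n_dipolar:].sum()
    if non_dipolar == 0.0:
        return float("inf")
    return float(dipolar / non_dipolar)


# Synthetic 8-lead "QRS" (NOT study data): three shared sources plus noise.
rng = np.random.default_rng(0)
sources = rng.standard_normal((3, 100))   # three underlying components
mixing = rng.standard_normal((8, 3))      # projection onto 8 leads
noise = rng.standard_normal((8, 100))     # lead-specific residual activity
clean = mixing @ sources

ratio_low_noise = dipolar_ratio(clean + 0.05 * noise)
ratio_high_noise = dipolar_ratio(clean + 0.5 * noise)
print(ratio_low_noise, ratio_high_noise)
```

In this toy setup, the segment with less lead-specific noise yields the larger ratio, which matches the intuition that a higher value means more of the QRS energy is captured by a three-dimensional dipole model.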

## Linked entities

- **Diseases:** Dilated cardiomyopathy (MONDO:0005021), hypertrophic cardiomyopathy (MONDO:0005045), heart failure (MONDO:0005252)

## Full-text entities

- **Diseases:** DCM (MESH:D002311), Cardiomyopathy (MESH:D009202), heart failure (MESH:D006333), left ventricular dilation (MESH:C565277), ischemic (MESH:D002545), HCM (MESH:D002312)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12984491/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12984491/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/PMC12984491/full.md

---
Source: https://tomesphere.com/paper/PMC12984491