# Limited Performance of Machine Learning Models Developed Based on Demographic and Laboratory Data Obtained Before Primary Treatment to Predict Coronary Aneurysms

**Authors:** Mi-Jin Kim, Gi-Beom Kim, Dongha Yang, Yeon-Jin Jang, Jeong-Jin Yu

PMC · DOI: 10.3390/biomedicines13051073 · Biomedicines · 2025-04-29

## TL;DR

Machine learning models trained on pre-treatment demographic and laboratory data predict coronary artery aneurysms in children with Kawasaki disease poorly; the best model reached an AUC of only 0.661.

## Contribution

This study evaluates how well machine learning models can predict coronary artery aneurysms in Kawasaki disease patients before IVIG therapy begins, using data on 17,189 patients from two nationwide epidemiological surveys.

## Key findings

- The best machine learning model, a Light Gradient Boosting Machine, achieved an AUC of 0.661, with a sensitivity of 0.615 and a specificity of 0.647.
- Unsupervised learning found no distinct patterns between patients with and without coronary artery aneurysms.
- The Harada score had a low AUC of 0.558, indicating poor predictive power.
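
To put the reported sensitivity (0.615) and specificity (0.647) in perspective, the following minimal sketch computes Youden's J and the predictive values they would imply. The prevalence used here is a hypothetical placeholder, not a figure from the paper, and the function names are illustrative only.

```python
def youden_j(sensitivity: float, specificity: float) -> float:
    """Youden's J statistic: 0 means no better than chance, 1 means perfect."""
    return sensitivity + specificity - 1.0


def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Values reported for the Light Gradient Boosting Machine model.
SENSITIVITY = 0.615
SPECIFICITY = 0.647

# ASSUMPTION: hypothetical CAA prevalence, chosen only for illustration.
ASSUMED_PREVALENCE = 0.05

if __name__ == "__main__":
    j = youden_j(SENSITIVITY, SPECIFICITY)
    ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, ASSUMED_PREVALENCE)
    print(f"Youden's J: {j:.3f}")
    print(f"PPV at assumed prevalence {ASSUMED_PREVALENCE:.0%}: {ppv:.3f}")
    print(f"NPV at assumed prevalence {ASSUMED_PREVALENCE:.0%}: {npv:.3f}")
```

With the reported values, Youden's J is 0.262, far from the maximum of 1. When the outcome is uncommon, a specificity in this range means most positive predictions are false positives, which is one way to read the abstract's remark about limited clinical applicability.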

## Abstract

Background/objectives: Kawasaki disease is the leading cause of acquired heart disease in children within developed countries. Although treatment with intravenous immunoglobulin (IVIG) significantly reduces the incidence of coronary artery aneurysm (CAA), the risk of it persists, affecting long-term patient outcomes. While intensified primary treatment is recommended for patients at high risk of IVIG resistance or CAA development, a universally accepted predictive model for such resistance remains unestablished. This study aims to develop a machine learning model to predict the occurrence of CAAs prior to initiating IVIG therapy. Methods: Data from two nationwide epidemiological surveys conducted between 2012 and 2017 were analyzed, encompassing 17,189 patients with calculable coronary artery z-scores and Harada scores. Various supervised machine learning algorithms were applied to develop a model for predicting CAA. Afterward, unsupervised learning techniques were employed to explore the data’s inherent structure. Results: The Harada score’s receiver operating characteristic (ROC) analysis yielded an area under the curve (AUC) of 0.558. The highest AUC among the machine learning models was 0.661, achieved by the Light Gradient Boosting Machine. However, this model’s sensitivity was 0.615, and specificity was 0.647, indicating limited clinical applicability. Unsupervised learning revealed no distinct distribution patterns between patients with/without CAAs. Conclusions: Despite utilizing a large dataset to develop a machine learning-based prediction model for CAAs, the performance was unsatisfactory. Future studies should focus on enhancing predictive models by incorporating additional clinical data, such as acute-phase coronary artery diameter measurements, to improve accuracy and clinical utility.
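
The metrics quoted in the abstract (AUC, sensitivity, specificity) can be illustrated with a short, dependency-free sketch. It is not the authors' pipeline: the labels and scores below are made-up toy values, and the function names are hypothetical. AUC is computed here as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("Both classes must be present to compute AUC.")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(positives) * len(negatives))


def sensitivity_specificity(
    labels: Sequence[int], scores: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Classify score >= threshold as positive and return (sensitivity, specificity)."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = aneurysm, 0 = no aneurysm.
    toy_labels = [0, 0, 1, 1]
    toy_scores = [0.10, 0.40, 0.35, 0.80]
    print("AUC:", roc_auc(toy_labels, toy_scores))
    print("Sens/Spec at 0.5:", sensitivity_specificity(toy_labels, toy_scores, 0.5))
```

An AUC of 0.5 corresponds to random ranking, so the reported values of 0.558 (Harada score) and 0.661 (best machine learning model) both sit close to chance on this scale.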

## Linked entities

- **Diseases:** Kawasaki disease (MONDO:0012727), coronary artery aneurysm (MONDO:0006714), heart disease (MONDO:0005267)

## Full-text entities

- **Diseases:** heart disease (MESH:D006331), CAA (MESH:D003323), Kawasaki disease (MESH:D009080)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12108861/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12108861/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/PMC12108861/full.md

---
Source: https://tomesphere.com/paper/PMC12108861