A Multi-Stage Framework for Kawasaki Disease Prediction Using Clustering-Based Undersampling and Synthetic Data Augmentation: Cross-Institutional Validation with Dual-Center Clinical Data in Taiwan
Heng-Chih Huang, Chuan-Sheng Hung, Chun-Hung Richard Lin, Yi-Zhen Shie, Cheng-Han Yu, Ting-Hsin Huang

TL;DR
This paper introduces a multi-stage AI framework that predicts Kawasaki disease from routine laboratory data, using clustering-based undersampling and synthetic data augmentation to handle class imbalance, and validates it across two hospitals in Taiwan.
Contribution
A novel multi-stage AI framework that addresses class imbalance in Kawasaki disease prediction using clustering-based undersampling and synthetic data augmentation.
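To make the idea of clustering-based undersampling concrete, here is a minimal sketch. It is not the authors' implementation: the summary does not say which clustering algorithm or selection rule the paper uses, so this assumes plain k-means on the majority (non-KD) class and keeps the one real sample nearest each centroid. The names `kmeans`, `cluster_undersample`, `majority`, and `minority` are hypothetical, and the points are toy values rather than clinical data.

```python
import random
from math import dist


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of tuples; returns k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid; keep the old one if its cluster is empty.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


def cluster_undersample(majority, n_keep, seed=0):
    """Keep n_keep real majority samples: the one nearest each centroid."""
    centroids = kmeans(majority, n_keep, seed=seed)
    remaining = list(majority)
    kept = []
    for c in centroids:
        best = min(remaining, key=lambda p: dist(p, c))
        remaining.remove(best)  # never pick the same sample twice
        kept.append(best)
    return kept


# Toy data (assumed): six majority points in two groups, two minority points.
majority = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
minority = [(2.0, 2.5), (2.2, 2.4)]

kept = cluster_undersample(majority, n_keep=len(minority))
balanced = kept + minority
print(sorted(kept))
```

Selecting representatives per cluster, rather than sampling at random, is meant to preserve the spread of the majority class while shrinking it to a size comparable to the minority class.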
Findings
The model achieved 97.5% specificity and 53.6% F1-score at 95% recall on the CGMH test set.
It maintained 74.7% specificity with 23.4% F1-score on the KMUH validation set.
The framework demonstrates cross-institutional generalizability and practical utility for KD screening.
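The findings above report specificity and F1-score at a fixed recall. The sketch below shows one common way to compute such figures: choose the highest score threshold that still reaches the target recall, then read the other metrics off the confusion matrix. The function name `metrics_at_recall` and the toy labels and scores are assumptions for illustration; the paper's actual evaluation procedure is not described in this summary.

```python
import math


def metrics_at_recall(y_true, scores, target_recall=0.95):
    """Pick the highest threshold with recall >= target, then report metrics."""
    pos_scores = sorted((s for y, s in zip(y_true, scores) if y == 1),
                        reverse=True)
    # Number of positives that must score at or above the threshold.
    need = max(1, math.ceil(target_recall * len(pos_scores)))
    threshold = pos_scores[need - 1]

    tp = fp = tn = fn = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1

    return {
        "threshold": threshold,
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
    }


# Toy example (assumed values): 4 positives, 6 negatives.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.4, 0.2, 0.2, 0.1, 0.1]
result = metrics_at_recall(y_true, scores, target_recall=0.95)
print(result)
```

Because F1 depends on precision, it can stay modest even when specificity is high if positives are rare: a small false-positive rate applied to a very large negative class can still outnumber the true positives.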
Abstract
Kawasaki disease (KD) is a rare yet potentially life-threatening pediatric vasculitis that, if left undiagnosed or untreated, can result in serious cardiovascular complications. Its heterogeneous clinical presentation poses diagnostic challenges: many cases fail to meet the classical criteria, which increases the risk of a missed diagnosis. Leveraging routine laboratory tests with AI offers a promising strategy for enhancing early detection. However, because KD has an extremely low prevalence, conventional models often struggle with severe class imbalance, which limits their ability to achieve both high sensitivity and high specificity in practice. To address this issue, we propose a multi-stage AI-based predictive framework that incorporates clustering-based undersampling, data augmentation, and stacking ensemble learning. The model was trained and internally tested on clinical blood and urine test data from…
Taxonomy
Topics: Kawasaki Disease and Coronary Complications · Pneumonia and Respiratory Infections · Sepsis Diagnosis and Treatment
