Empirical investigation of multi-source cross-validation in clinical ECG classification

Tuija Leinonen; David Wong; Antti Vasankari; Ali Wahab; Ramesh Nadarajah; Matti Kaisti; Antti Airola

arXiv:2403.15012·cs.LG·October 24, 2024·1 cites

TL;DR

This study empirically compares cross-validation methods in multi-source ECG classification, revealing that leave-source-out cross-validation offers more realistic performance estimates than standard K-fold methods, which tend to be overly optimistic.

Contribution

It systematically evaluates cross-validation strategies in multi-source medical data, demonstrating the advantages of leave-source-out validation for realistic performance assessment.

Findings

01. K-fold cross-validation overestimates accuracy for new sources.

02. Leave-source-out validation provides less biased estimates.

03. Multi-source data improves evaluation reliability.

Abstract

Traditionally, machine learning-based clinical prediction models have been trained and evaluated on patient data from a single source, such as a hospital. Cross-validation methods can be used to estimate the accuracy of such models on new patients originating from the same source, by repeated random splitting of the data. However, such estimates tend to be highly overoptimistic when compared to accuracy obtained from deploying models to sources not represented in the dataset, such as a new hospital. The increasing availability of multi-source medical datasets provides new opportunities for obtaining more comprehensive and realistic evaluations of expected accuracy through source-level cross-validation designs. In this study, we present a systematic empirical evaluation of standard K-fold cross-validation and leave-source-out cross-validation methods in a multi-source setting. We…
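
To make the difference between the two designs concrete, the following is a minimal sketch of how the splits could be generated. It is not taken from the paper's repository: the function names `kfold_splits` and `leave_source_out_splits`, the toy source labels, and the fold counts are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def kfold_splits(n_samples, k, seed=0):
    """Standard K-fold: shuffle all sample indices, ignoring their source."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(
            idx for j, fold in enumerate(folds) if j != i for idx in fold
        )
        yield train, test


def leave_source_out_splits(sources):
    """Leave-source-out: each fold holds out every sample from one source."""
    by_source = defaultdict(list)
    for idx, src in enumerate(sources):
        by_source[src].append(idx)
    for held_out in sorted(by_source):
        test = sorted(by_source[held_out])
        train = sorted(
            idx
            for src, idxs in by_source.items()
            if src != held_out
            for idx in idxs
        )
        yield held_out, train, test


# Hypothetical toy data: 12 recordings from three made-up sources.
sources = ["hospital_a"] * 4 + ["hospital_b"] * 4 + ["hospital_c"] * 4

for held_out, train, test in leave_source_out_splits(sources):
    train_sources = {sources[i] for i in train}
    # The held-out source never contributes training data.
    assert held_out not in train_sources
```

In the K-fold sketch, recordings from a single source can land in both the training and test partitions, which is the setting the abstract describes as estimating accuracy on new patients from already-seen sources. In the leave-source-out sketch, the test source is entirely unseen during training, which mimics deployment to a new hospital. scikit-learn's `KFold` and `LeaveOneGroupOut` splitters offer comparable behaviour for real pipelines.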

Code & Models

Repositories

utu-health-research/dl-ecg-classifier
PyTorch · Official

Taxonomy

Topics: Radiomics and Machine Learning in Medical Imaging