A random shuffle method to expand a narrow dataset and overcome the associated challenges in a clinical study: a heart failure cohort example

Lorenzo Fassina; Alessandro Faragli; Francesco Paolo Lo Muzio; Sebastian Kelle; Carlo Campana; Burkert Pieske; Frank Edelmann; Alessio Alogna

arXiv:2012.06784·q-bio.QM·December 15, 2020

TL;DR

This paper introduces a random shuffle method that artificially expands small clinical datasets, improving predictive modeling in heart failure studies without requiring specific hypotheses or regression models.

Contribution

The study presents a random shuffle technique that increases dataset cardinality roughly tenfold while remaining statistically legitimate, and validates it against an established random repeated-measures method.

Findings

01

Dataset cardinality increased by approximately 10 times with the shuffle method.

02

Cardinality increased further, to approximately 21 times, when the shuffle method was combined with a repeated-measures approach.

03

Enhanced datasets improved the accuracy of machine learning and regression models.

Abstract

Heart failure (HF) affects at least 26 million people worldwide, so predicting adverse events in HF patients represents a major target of clinical data science. However, achieving large sample sizes sometimes represents a challenge due to difficulties in patient recruiting and long follow-up times, increasing the problem of missing data. To overcome the issue of a narrow dataset cardinality (in a clinical dataset, the cardinality is the number of patients in that dataset), population-enhancing algorithms are therefore crucial. The aim of this study was to design a random shuffle method to enhance the cardinality of an HF dataset while it is statistically legitimate, without the need of specific hypotheses and regression models. The cardinality enhancement was validated against an established random repeated-measures method with regard to the correctness in predicting clinical conditions…
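
The abstract excerpt does not spell out how the shuffle is performed, so the following is only a minimal sketch of one generic shuffle-based way to raise dataset cardinality, not the authors' algorithm. It assumes that each variable is permuted independently among patients who share the same outcome label. The helper name `shuffle_expand`, the toy patient values, and the `factor` parameter are all hypothetical; `factor=10` simply mirrors the approximately tenfold increase reported above.

```python
import random


def shuffle_expand(rows, labels, factor, seed=0):
    """Hypothetical helper, NOT the authors' published method.

    Returns the original rows plus (factor - 1) synthetic copies in which
    each variable (column) is permuted independently among the patients
    that share the same label.
    """
    rng = random.Random(seed)
    n_cols = len(rows[0])
    out_rows = [list(r) for r in rows]
    out_labels = list(labels)

    # Group patient indices by outcome label.
    groups = {}
    for i, lab in enumerate(labels):
        groups.setdefault(lab, []).append(i)

    for _ in range(factor - 1):
        synthetic = [[None] * n_cols for _ in rows]
        for idx in groups.values():
            for col in range(n_cols):
                # Shuffle this variable's values within the group only.
                values = [rows[i][col] for i in idx]
                rng.shuffle(values)
                for i, v in zip(idx, values):
                    synthetic[i][col] = v
        out_rows.extend(synthetic)
        out_labels.extend(labels)
    return out_rows, out_labels


# Toy data (made up): 4 patients, 2 variables, binary outcome.
rows = [[55, 1.0], [60, 1.2], [70, 2.1], [75, 2.4]]
labels = [0, 0, 1, 1]

big_rows, big_labels = shuffle_expand(rows, labels, factor=10)
print(len(rows), "->", len(big_rows))  # 4 -> 40
```

In this sketch the per-label distribution of every single variable is preserved exactly, but correlations between variables within a label group are not, which is one reason any such expansion needs to be validated against an independent method before use.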

Code & Models

Repositories

lorfas74/random-shuffle (Official)
