Multilabel 12-Lead Electrocardiogram Classification Using Gradient Boosting Tree Ensemble
Alexander William Wong, Weijie Sun, Sunil Vasu Kalmady, Padma Kaul, Abram Hindle

TL;DR
This paper presents a gradient boosting tree ensemble for multilabel 12-lead ECG classification that uses morphology and signal processing features to detect cardiac abnormalities. It earned an official validation score of 0.476 and placed 36th out of 41 in the PhysioNet/CinC 2020 Challenge rankings.
Contribution
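The study combines per-lead morphology and signal processing features in a gradient boosting tree ensemble. Feature selection runs in two phases: phase one models rank feature importance, and the top 1,000 features are passed to the phase two diagnosis prediction models.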
Findings
Achieved an official validation score of 0.476
Placed 36th out of 41 in the challenge rankings
Utilized a large dataset of 43,101 records for training and evaluation
Abstract
The 12-lead electrocardiogram (ECG) is a commonly used tool for detecting cardiac abnormalities such as atrial fibrillation, blocks, and irregular complexes. For the PhysioNet/CinC 2020 Challenge, we built an algorithm using gradient boosted tree ensembles fitted on morphology and signal processing features to classify ECG diagnosis. For each lead, we derive features from heart rate variability, PQRST template shape, and the full signal waveform. We join the features of all 12 leads to fit an ensemble of gradient boosting decision trees to predict probabilities of ECG instances belonging to each class. We train a phase one set of feature importance determining models to isolate the top 1,000 most important features to use in our phase two diagnosis prediction models. We use repeated random sub-sampling by splitting our dataset of 43,101 records into 100 independent runs of 85:15…
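The two procedural steps named in the abstract, repeated random sub-sampling and top-1,000 feature selection, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the rounding of the 85% partition, the use of the mean to pool importance scores across phase one models, and the example feature names are all hypothetical. The abstract is truncated before it says what the two sides of the 85:15 split are used for, so the sketch calls them only the large and small parts.

```python
import random
from statistics import mean


def random_subsample_splits(n_records, n_runs=100, large_fraction=0.85, seed=0):
    """Yield one (large_part, small_part) pair of index lists per run.

    Every run reshuffles all record indices independently, so a record can
    fall in the small part of several runs (unlike k-fold cross-validation).
    """
    rng = random.Random(seed)
    indices = list(range(n_records))
    # Assumption: the 85% partition size is rounded to the nearest integer.
    n_large = round(n_records * large_fraction)
    for _ in range(n_runs):
        shuffled = indices[:]
        rng.shuffle(shuffled)
        yield shuffled[:n_large], shuffled[n_large:]


def top_k_features(importances_per_model, k=1000):
    """Rank features by mean importance across models and keep the top k.

    importances_per_model is a list of dicts mapping feature name to an
    importance score. Averaging is an assumption; a feature missing from a
    model counts as 0.0 for that model. Ties are broken by feature name.
    """
    names = set().union(*importances_per_model)
    avg = {
        name: mean(model.get(name, 0.0) for model in importances_per_model)
        for name in names
    }
    return sorted(avg, key=lambda name: (-avg[name], name))[:k]


# Usage: take the first of the 100 runs over 43,101 records.
large, small = next(random_subsample_splits(43101))
print(len(large), len(small))

# Usage: hypothetical phase one importances from two models.
phase_one = [
    {"lead_I_hrv": 0.1, "lead_II_template": 0.9},
    {"lead_I_hrv": 0.3, "lead_II_template": 0.5, "lead_V1_signal": 0.2},
]
print(top_k_features(phase_one, k=2))
```

In a full pipeline, each run would fit the phase one models on its large part, pass the selected feature names to the phase two models, and evaluate on the small part. The model fitting itself is omitted here because it depends on the gradient boosting library in use.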
