Comparison of static and dynamic random forests models for EHR data in the presence of competing risks: predicting central line-associated bloodstream infection

Elena Albu; Shan Gao; Pieter Stijnen; Frank Rademakers; Christel Janssens; Veerle Cossey; Yves Debaveye; Laure Wynants; Ben Van Calster

arXiv:2404.16127·cs.LG·May 27, 2024·1 cites

TL;DR

This study compares static and dynamic random forest models for predicting bloodstream infections in hospital data, highlighting that complex models do not significantly outperform simpler ones in non-censored settings.

Contribution

The paper shows that, in hospital data without censoring, simple binary models perform comparably to more complex competing risks models for predicting CLABSI.

Findings

01

Binary, multinomial, and competing risks models have similar AUROC scores.

02

Survival models tend to overestimate infection risk.

03

Complex models do not significantly improve prediction over simple binary models.

Abstract

Prognostic outcomes related to hospital admissions typically do not suffer from censoring, and can be modeled either categorically or as time-to-event. Competing events are common but often ignored. We compared the performance of random forest (RF) models to predict the risk of central line-associated bloodstream infections (CLABSI) using different outcome operationalizations. We included data from 27478 admissions to the University Hospitals Leuven, covering 30862 catheter episodes (970 CLABSI, 1466 deaths and 28426 discharges) to build static and dynamic RF models for binary (CLABSI vs no CLABSI), multinomial (CLABSI, discharge, death or no event), survival (time to CLABSI) and competing risks (time to CLABSI, discharge or death) outcomes to predict the 7-day CLABSI risk. We evaluated model performance across 100 train/test splits. Performance of binary, multinomial and competing…
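The four outcome operationalizations named in the abstract can be illustrated with a minimal sketch. It assumes each catheter episode ends in exactly one observed event (CLABSI, death or discharge) at a known time in days, and that events after the 7-day horizon are treated as "no event" or administratively censored at day 7. The function name, the status codes and the censoring rule are illustrative assumptions, not taken from the paper or its repository.

```python
from dataclasses import dataclass

HORIZON_DAYS = 7  # prediction horizon stated in the abstract

# Hypothetical status codes for the competing risks outcome (0 = censored).
CR_CODES = {"CLABSI": 1, "death": 2, "discharge": 3}


@dataclass
class Outcomes:
    binary: int             # 1 if CLABSI within the horizon, else 0
    multinomial: str        # "CLABSI", "death", "discharge" or "no_event"
    survival: tuple         # (time, status): status 1 = CLABSI, 0 = censored
    competing_risks: tuple  # (time, status): status code from CR_CODES, 0 = censored


def operationalize(event_time, event_type, horizon=HORIZON_DAYS):
    """Derive the four outcome encodings for one catheter episode.

    Assumption: events after the horizon are cut off at the horizon, and the
    survival encoding treats death and discharge as censoring.
    """
    if event_type not in CR_CODES:
        raise ValueError(f"unknown event type: {event_type!r}")

    within = event_time <= horizon
    is_clabsi = within and event_type == "CLABSI"
    time = min(event_time, horizon)

    return Outcomes(
        binary=int(is_clabsi),
        multinomial=event_type if within else "no_event",
        survival=(time, int(is_clabsi)),
        competing_risks=(time, CR_CODES[event_type] if within else 0),
    )


# Example episodes: CLABSI on day 3, death on day 5, CLABSI on day 10.
for t, e in [(3, "CLABSI"), (5, "death"), (10, "CLABSI")]:
    print(t, e, operationalize(t, e))
```

Under these assumptions, a death on day 5 is a distinct class in the multinomial and competing risks encodings but is only a censored observation in the survival encoding, which is the distinction the abstract draws between the outcome types.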

Code & Models

Repositories

sibipx/clabsi_compare_rfsrc_models
none · Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Machine Learning in Healthcare · Imbalanced Data Classification Techniques