On the Importance of Diversity in Re-Sampling for Imbalanced Data and Rare Events in Mortality Risk Models
Yuxuan (Diana) Yang, Hadi Akbarzadeh Khorshidi, Uwe Aickelin, Aditi Nevgi, Elif Ekinci

TL;DR
This paper improves mortality risk prediction models by introducing a diversity-based re-sampling technique that enhances minority class detection and outperforms existing methods across multiple datasets.
Contribution
It proposes a novel diversity-based re-sampling approach that uses the Solow-Polasky measure and greedy algorithms to address class imbalance in mortality risk models.
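To make the idea concrete, here is a minimal sketch of greedy subset selection under the Solow-Polasky measure. For a set of points with pairwise distances d_ij, the measure is the sum of all entries of the inverse of the matrix M, where M_ij = exp(-theta * d_ij). The Euclidean distance, the value of `theta`, the choice of the first candidate as the seed, and the function names are all assumptions made for illustration. The summary does not say how the authors configure these or how the selection is combined with the underlying re-sampling technique.

```python
import numpy as np


def solow_polasky(points, theta=1.0):
    """Solow-Polasky diversity: sum of the entries of M^-1, M_ij = exp(-theta * d_ij).

    Euclidean distance and theta=1.0 are illustrative assumptions.
    """
    points = np.asarray(points, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    m = np.exp(-theta * dist)
    # The pseudo-inverse tolerates duplicate points, for which M is singular.
    return float(np.linalg.pinv(m).sum())


def greedy_diverse_subset(candidates, k, theta=1.0):
    """Greedily pick k candidate indices, each step maximising the diversity.

    Seeding with index 0 is an arbitrary assumption, not the authors' rule.
    """
    candidates = np.asarray(candidates, dtype=float)
    selected = [0]
    while len(selected) < min(k, len(candidates)):
        remaining = [i for i in range(len(candidates)) if i not in selected]
        best = max(
            remaining,
            key=lambda i: solow_polasky(candidates[selected + [i]], theta),
        )
        selected.append(best)
    return selected


# Hypothetical synthetic minority samples: three near-duplicates and one outlier.
synthetic = [[0.0, 0.0], [0.01, 0.0], [5.0, 5.0], [0.0, 0.01]]
chosen = greedy_diverse_subset(synthetic, k=2)
```

In this toy case the greedy step prefers the distant point over the near-duplicates, because near-duplicates add almost nothing to the measure. Each greedy step recomputes the measure for every remaining candidate, so the sketch is only practical for small candidate pools.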
Findings
Classifier performance improved across ten external datasets.
Diversity-based re-sampling improves detection of minority (mortality) events.
Performance of UK SORT improved by 1.4% after applying the method.
Abstract
Surgical risk increases significantly when patients present with comorbid conditions. This has resulted in the creation of numerous risk stratification tools with the objective of formulating associated surgical risk to assist both surgeons and patients in decision-making. The Surgical Outcome Risk Tool (SORT) is one of the tools developed to predict mortality risk throughout the entire perioperative period for major elective in-patient surgeries in the UK. In this study, we enhance the original SORT prediction model (UK SORT) by addressing the class imbalance within the dataset. Our proposed method investigates the application of diversity-based selection on top of common re-sampling techniques to enhance the classifier's capability in detecting minority (mortality) events. Diversity amongst training datasets is an essential factor in ensuring re-sampled data keeps an accurate…
Taxonomy
Topics: Imbalanced Data Classification Techniques · Medical Coding and Health Information · Machine Learning in Healthcare
