A novel interpretable machine learning system to generate clinical risk scores: An application for predicting early mortality or unplanned readmission in a retrospective cohort study

Yilin Ning; Siqi Li; Marcus Eng Hock Ong; Feng Xie; Bibhas Chakraborty; Daniel Shu Wei Ting; Nan Liu

arXiv:2201.03291 · cs.LG · December 31, 2024

TL;DR

This paper introduces an interpretable machine learning system that uses ShapleyVIC for transparent variable selection. Applied to predicting early mortality or unplanned readmission, it yields a 6-variable risk score that performs comparably to a 16-variable machine learning model.

Contribution

The study presents a robust, interpretable variable selection method that enhances transparency and simplifies risk score generation in clinical prediction models.

Findings

01. ShapleyVIC selected 6 key variables from 41 candidates.

02. The resulting model performed similarly to a 16-variable machine learning model.

03. The approach improves interpretability without sacrificing accuracy.

Abstract

Risk scores are widely used for clinical decision making and commonly generated from logistic regression models. Machine-learning-based methods may work well for identifying important predictors, but such 'black box' variable selection limits interpretability, and variable importance evaluated from a single model can be biased. We propose a robust and interpretable variable selection approach using the recently developed Shapley variable importance cloud (ShapleyVIC) that accounts for variability across models. Our approach evaluates and visualizes overall variable contributions for in-depth inference and transparent variable selection, and filters out non-significant contributors to simplify model building steps. We derive an ensemble variable ranking from variable contributions, which is easily integrated with an automated and modularized risk score generator, AutoScore, for…
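The filtering and ranking idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation or the API of the `nliulab/shapleyvic` repository: the function name `ensemble_ranking`, the variable names, the importance values, the normal-approximation interval, and the choice to rank by mean importance are all assumptions made for illustration. The sketch takes one importance value per variable per model, drops variables whose interval for the mean importance does not lie above zero, and ranks the remainder.

```python
from math import sqrt
from statistics import mean, stdev


def ensemble_ranking(importance_by_model, z=1.96):
    """Rank variables by mean importance, dropping non-significant ones.

    importance_by_model maps a variable name to a list of importance
    values, one per model (at least two values per variable).
    Hypothetical helper, not part of any published package.
    """
    kept = {}
    for name, values in importance_by_model.items():
        avg = mean(values)
        # Normal-approximation interval for the mean importance
        # (an assumption of this sketch, not the paper's exact rule).
        half_width = z * stdev(values) / sqrt(len(values))
        # Keep the variable only if the whole interval lies above zero.
        if avg - half_width > 0:
            kept[name] = avg
    # Order the surviving variables from most to least important.
    return sorted(kept, key=kept.get, reverse=True)


# Made-up importance values across four models (illustrative only).
example = {
    "age": [0.30, 0.28, 0.35, 0.31],
    "num_visits": [0.10, 0.12, 0.08, 0.11],
    "noise": [0.01, -0.02, 0.00, -0.01],
}
ranking = ensemble_ranking(example)
print(ranking)
```

In this toy input, the `noise` variable has importance values scattered around zero across models, so it is filtered out before ranking; a ranking of the remaining variables is what a downstream score generator would consume.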

Code & Models

Repositories

nliulab/shapleyvic (Official)

Taxonomy

Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)

Methods: Logistic Regression