Towards Quantification of Bias in Machine Learning for Healthcare: A Case Study of Renal Failure Prediction

Josie Williams; Narges Razavian

arXiv:1911.07679 · cs.LG · November 19, 2019 · 1 citation

TL;DR

This paper compares a traditional risk score (Tangri) with a machine learning model for renal failure prediction, using the comparison to quantify bias and to assess how well each approach generalizes.

Contribution

It introduces a case study analyzing bias in ML models versus traditional risk scores in renal failure prediction using large-scale EHR data.

Findings

01

ML model trained on 1.6 million patients outperforms traditional risk score.

02

Comparison reveals biases in current clinical practice.

03

ML models show better generalization across diverse patient data.

Abstract

As machine learning (ML) models trained on real-world datasets become common practice, it is critical to measure and quantify their potential biases. In this paper, we focus on renal failure and compare a commonly used traditional risk score, Tangri, with a more powerful machine learning model, which has access to a larger variable set and was trained on EHR data from 1.6 million patients. We compare and discuss the generalization and applicability of these two models, in an attempt to quantify the biases of status quo clinical practice relative to ML-driven models.
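One common way to make this kind of comparison concrete is to score both models on the same patients and look at how discrimination (AUC) varies across patient subgroups. The sketch below is illustrative only and is not the authors' method: the function names, the subgroup labels, and the toy scores are all hypothetical, and the "gap" statistic is just one of many possible bias summaries.

```python
from collections import defaultdict


def auc(labels, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # AUC is undefined when a class is missing
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auc_by_group(labels, scores, groups):
    """Compute AUC separately within each subgroup."""
    buckets = defaultdict(lambda: ([], []))
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc(ys, ss) for g, (ys, ss) in buckets.items()}


def auc_gap(per_group):
    """Spread between the best- and worst-served subgroup (hypothetical bias summary)."""
    vals = [v for v in per_group.values() if v is not None]
    return max(vals) - min(vals)


# Toy, made-up data: 1 = renal failure, 0 = no renal failure.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]  # hypothetical subgroups
risk_score = [0.9, 0.2, 0.8, 0.1, 0.4, 0.6, 0.7, 0.3]  # stand-in for a fixed risk score
ml_model = [0.9, 0.1, 0.7, 0.2, 0.8, 0.3, 0.6, 0.4]  # stand-in for an ML model

for name, scores in [("risk score", risk_score), ("ML model", ml_model)]:
    per_group = auc_by_group(labels, scores, groups)
    print(name, per_group, "gap:", auc_gap(per_group))
```

A smaller gap suggests more uniform performance across subgroups, though on real EHR data one would also want confidence intervals and calibration checks before drawing conclusions.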

Taxonomy

Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare · Chronic Kidney Disease and Diabetes