Monitoring machine learning (ML)-based risk prediction algorithms in the presence of confounding medical interventions

Jean Feng; Alexej Gossmann; Gene Pennello; Nicholas Petrick; Berkman Sahiner; Romain Pirracchio

arXiv:2211.09781 · stat.ML · April 17, 2023

TL;DR

This paper addresses the challenge of monitoring ML risk prediction models in healthcare when confounding medical interventions occur, proposing a new method that accounts for these confounders to ensure valid performance assessment.

Contribution

It introduces a novel score-based CUSUM monitoring procedure that accounts for confounding interventions and demonstrates its effectiveness through simulations and real-world data.

Findings

01 Monitoring remains valid despite confounding interventions, provided conditional performance is monitored and certain assumptions (such as conditional exchangeability) hold.

02 Combining model updating with monitoring improves detection of performance issues.

03 The method successfully detects calibration decay in a postoperative nausea risk model.

Abstract

Performance monitoring of machine learning (ML)-based risk prediction models in healthcare is complicated by the issue of confounding medical interventions (CMI): when an algorithm predicts a patient to be at high risk for an adverse event, clinicians are more likely to administer prophylactic treatment and alter the very target that the algorithm aims to predict. A simple approach is to ignore CMI and monitor only the untreated patients, whose outcomes remain unaltered. In general, ignoring CMI may inflate Type I error because (i) untreated patients disproportionally represent those with low predicted risk and (ii) evolution in both the model and clinician trust in the model can induce complex dependencies that violate standard assumptions. Nevertheless, we show that valid inference is still possible if one monitors conditional performance and if either conditional exchangeability or…
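
Point (i) of the abstract can be illustrated with a small simulation. This is a minimal sketch under assumptions that are not from the paper: predicted risks are drawn uniformly from [0, 1), and a hypothetical clinician treats each patient with probability equal to the predicted risk. The function name `simulate_untreated_selection` and all parameters are illustrative only.

```python
import random


def simulate_untreated_selection(n_patients=20000, seed=0):
    """Compare mean predicted risk among all patients vs. untreated patients.

    Assumptions (illustrative, not from the paper):
      - predicted risk is uniform on [0, 1)
      - probability of prophylactic treatment equals the predicted risk
    """
    rng = random.Random(seed)
    all_risks = []
    untreated_risks = []
    for _ in range(n_patients):
        risk = rng.random()            # hypothetical predicted risk
        treated = rng.random() < risk  # higher risk -> more likely treated
        all_risks.append(risk)
        if not treated:
            untreated_risks.append(risk)
    mean_all = sum(all_risks) / len(all_risks)
    mean_untreated = sum(untreated_risks) / len(untreated_risks)
    return mean_all, mean_untreated


mean_all, mean_untreated = simulate_untreated_selection()
print(f"mean predicted risk, all patients:       {mean_all:.3f}")
print(f"mean predicted risk, untreated patients: {mean_untreated:.3f}")
```

Under these assumptions the untreated group has an expected mean predicted risk of 1/3, against 1/2 for the full population, so a monitor that looks only at untreated patients sees a sample skewed toward low-risk cases.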

Code & Models

Repositories

jjfeng/monitoring_ml_cmi (official)

Taxonomy

Topics: Health Systems, Economic Evaluations, Quality of Life · Cardiac, Anesthesia and Surgical Outcomes · Advanced Causal Inference Techniques