An empirical evaluation of the risks of AI model updates using clinical data: stability, arbitrariness, and fairness

Ioannis Bilionis; Ricardo C. Berrios; Luis Fernandez-Luque; Carlos Castillo

arXiv:2604.23954·cs.AI·April 28, 2026

TL;DR

This study empirically evaluates the risks of updating AI models in clinical settings, focusing on stability, arbitrariness, and fairness. It uses continuous glucose monitoring data from children and adolescents with type 1 diabetes to inform trustworthy clinical decision support.

Contribution

It introduces a monitoring framework to detect risks from model updates, emphasizing the importance of continuous oversight for clinical AI reliability.

Findings

01. Model updates can cause prediction flips and instability.

02. Updates may increase arbitrariness and reduce fairness.

03. Continuous monitoring is crucial for trustworthy clinical AI.
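
The summary above does not state how prediction flips are measured, so the following is only a minimal sketch under stated assumptions: predictions are binary labels, a "flip" is any case whose label differs between the old and the updated model, and the subgroup labels `A` and `B` are hypothetical stand-ins for a sociodemographic attribute. The function names `flip_rate` and `flip_rate_by_group` are illustrative and do not come from the paper.

```python
from collections import defaultdict


def flip_rate(old_preds, new_preds):
    """Fraction of cases whose binary prediction changes after an update."""
    if len(old_preds) != len(new_preds):
        raise ValueError("prediction lists must have the same length")
    if not old_preds:
        raise ValueError("need at least one prediction")
    flips = sum(1 for o, n in zip(old_preds, new_preds) if o != n)
    return flips / len(old_preds)


def flip_rate_by_group(old_preds, new_preds, groups):
    """Flip rate computed separately for each (hypothetical) subgroup label."""
    if not (len(old_preds) == len(new_preds) == len(groups)):
        raise ValueError("all inputs must have the same length")
    totals = defaultdict(int)
    flips = defaultdict(int)
    for o, n, g in zip(old_preds, new_preds, groups):
        totals[g] += 1
        flips[g] += int(o != n)
    return {g: flips[g] / totals[g] for g in totals}


# Toy data (invented for illustration, not taken from the study).
old = [1, 0, 0, 1, 1, 0, 1, 0]  # predictions before the update
new = [1, 1, 0, 0, 1, 0, 1, 1]  # predictions after the update
grp = ["A", "A", "A", "A", "B", "B", "B", "B"]

overall = flip_rate(old, new)
per_group = flip_rate_by_group(old, new, grp)
```

On this toy data three of eight predictions change, and group `A` sees twice the flip rate of group `B`. A gap of that kind is one way an update could affect subgroups unevenly even when overall accuracy is unchanged, which is why a monitoring step might track flips per subgroup rather than only in aggregate.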

Abstract

Artificial Intelligence and Machine Learning (AI/ML) models are increasingly deployed to support clinical decision-making. However, when training data become stale due to changes in demographics, environment, or patient behaviors, model performance can degrade substantially. While updating models with new training data is necessary, such updates may also introduce new risks. We evaluated the proposed monitoring framework on four publicly available U.S.-based type 1 diabetes datasets containing high-resolution continuous glucose monitoring (CGM) data, comprising approximately 11,300 weekly observations from 496 participants under 20 years of age. All datasets included structured sociodemographic information. Using the prediction of severe hyperglycemia events in children with type 1 diabetes as a case study, we examine how different model update strategies can…
