Predictive divergence in machine learning models for clinical mortality risk: A multicohort study of covid-19 patients

Júlia Chaves Neuenschwander Magalhães; Alexandre Dias Porto Chiavegatto Filho

PMC · DOI: 10.1371/journal.pone.0344354 · March 6, 2026

Open Access

TL;DR

This study shows that different machine learning models can give very different mortality risk predictions for the same patients, even when overall performance is similar.

Contribution

The study reveals that ML models with similar global performance can diverge in predictions across patient subgroups, emphasizing the need for subgroup-aware model evaluation.

Findings

01

ML models showed high overall performance (mean AUC of 0.855) but significant prediction divergence at the individual level (R² from 0.56 to 0.80).

02

Five patient subgroups had mortality rates ranging from 22% to 80%, with model performance varying significantly (F = 73.18, p < 0.001).

03

TabPFN and LightGBM performed best in the 'Anemia' subgroup, while TabPFN underperformed in the 'Immunodeficient' subgroup.
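The first finding (similar overall AUC alongside limited agreement between individual predictions) can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the study's cohort or pipeline: the two "models" are hypothetical noisy views of a latent risk, the noise level of 0.15 is arbitrary, and R² is assumed here to mean the squared Pearson correlation between two models' predicted risks, which may differ from how the authors computed it.

```python
import random
from statistics import mean


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def r_squared(a, b):
    """Squared Pearson correlation between two prediction vectors (an assumed definition of R²)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov * cov / (var_a * var_b)


def simulate(n=1000, noise=0.15, seed=0):
    """Synthetic patients: a latent risk, an outcome drawn from it, and two hypothetical models."""
    rng = random.Random(seed)
    labels, model_a, model_b = [], [], []
    for _ in range(n):
        risk = rng.random()  # latent mortality risk in [0, 1]
        labels.append(1 if rng.random() < risk else 0)
        # Each hypothetical model sees the latent risk through its own independent noise,
        # clipped back into the valid probability range.
        model_a.append(min(1.0, max(0.0, risk + rng.gauss(0, noise))))
        model_b.append(min(1.0, max(0.0, risk + rng.gauss(0, noise))))
    return labels, model_a, model_b


labels, model_a, model_b = simulate()
auc_a = auc(labels, model_a)
auc_b = auc(labels, model_b)
r2 = r_squared(model_a, model_b)

print(f"AUC model A: {auc_a:.3f}")
print(f"AUC model B: {auc_b:.3f}")
print(f"R² between predicted risks: {r2:.3f}")
```

Because both simulated models carry the same amount of noise, their AUCs come out close to each other, while the independent noise keeps their patient-level predictions from lining up. Comparing a global metric alone would therefore hide the disagreement that the pairwise R² exposes.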

Abstract

Machine learning (ML) algorithms are increasingly used in healthcare to support clinical decision-making. While models with similar overall performance are often considered interchangeable for deployment, they may produce divergent predictions, a phenomenon known as algorithmic multiplicity. In such cases, the choice of algorithm may introduce bias. This study investigates the impacts of algorithmic multiplicity in mortality prediction and assesses the influence of patient characteristics on model decisions. A cohort of 4,337 adult patients (≥18 years) with RT-PCR–confirmed covid-19 from five tertiary care hospitals in Brazil was followed from March to August 2020. Five popular ML models for structured data were trained on demographic and laboratory data collected at early hospital admission to predict in-hospital mortality. Model performance, feature importance, and algorithmic…

Figures: 15

Taxonomy

Topics: Machine Learning in Healthcare · Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI