LLMs for clinical risk prediction

Mohamed Rezk, Patricia Cabanillas Silva, and Fried-Michael Dahlweid

arXiv:2409.10191·cs.CL·September 17, 2024

TL;DR

This paper compares GPT-4 and clinalytix Medical AI on predicting the clinical risk of delirium. GPT-4 showed significant deficiencies in identifying positive cases and in providing reliable probability estimates, which highlights the current limitations of LLMs in healthcare decision-making.

Contribution

The paper provides a comparative analysis of GPT-4 and clinalytix Medical AI on clinical risk prediction, identifies current limitations of LLMs, and emphasizes the need for human oversight.

Findings

01. GPT-4 had significant deficiencies in identifying positive delirium cases.

02. clinalytix Medical AI predicted delirium risk more accurately than GPT-4.

03. LLMs face challenges in diagnosing conditions and interpreting complex clinical data.

Abstract

This study compares the efficacy of GPT-4 and clinalytix Medical AI in predicting the clinical risk of delirium development. Findings indicate that GPT-4 exhibited significant deficiencies in identifying positive cases and struggled to provide reliable probability estimates for delirium risk, while clinalytix Medical AI demonstrated superior accuracy. A thorough analysis of the large language model's (LLM) outputs elucidated potential causes for these discrepancies, consistent with limitations reported in extant literature. These results underscore the challenges LLMs face in accurately diagnosing conditions and interpreting complex clinical data. While LLMs hold substantial potential in healthcare, they are currently unsuitable for independent clinical decision-making. Instead, they should be employed in assistive roles, complementing clinical expertise. Continued human oversight…
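The two deficiencies the abstract names, missed positive cases and unreliable probability estimates, correspond to standard evaluation quantities. The sketch below shows one way such a comparison could be scored: sensitivity (recall on the positive class) and the Brier score for probability estimates. It is illustrative only. The abstract does not say which metrics the authors used, the 0.5 cutoff is an assumption, and the labels and probabilities for `model_a` and `model_b` are synthetic toy values, not results from the paper.

```python
from typing import List, Sequence


def threshold(probs: Sequence[float], cutoff: float = 0.5) -> List[int]:
    """Turn predicted probabilities into 0/1 predictions (cutoff is an assumption)."""
    return [1 if p >= cutoff else 0 for p in probs]


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of actual positive cases that were predicted positive."""
    actual_pos = sum(1 for t in y_true if t == 1)
    if actual_pos == 0:
        raise ValueError("y_true contains no positive cases")
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return true_pos / actual_pos


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and 0/1 outcome."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Synthetic toy data: 1 = delirium developed, 0 = it did not.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical model outputs, invented purely for illustration.
model_a = [0.2, 0.1, 0.6, 0.1, 0.2, 0.1, 0.3, 0.1]  # under-predicts positives
model_b = [0.8, 0.7, 0.9, 0.1, 0.2, 0.1, 0.3, 0.1]  # separates the classes

for name, probs in (("model_a", model_a), ("model_b", model_b)):
    sens = sensitivity(y_true, threshold(probs))
    brier = brier_score(y_true, probs)
    print(f"{name}: sensitivity={sens:.3f}, Brier score={brier:.4f}")
```

A model that rarely flags positive cases can still look accurate when positives are uncommon, which is why sensitivity is reported separately here. A lower Brier score indicates probability estimates closer to the observed outcomes.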

Taxonomy

Topics: Machine Learning in Healthcare · Biomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Position-Wise Feed-Forward Layer · Dropout