Large Language Model Capabilities in Perioperative Risk Prediction and Prognostication

Philip Chung; Christine T Fong; Andrew M Walters; Nima Aghaeepour; Meliha Yetisgen; Vikas N O'Reilly-Shah

arXiv:2401.01620 · cs.AI · January 4, 2024 · 1 citation

TL;DR

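This study evaluates GPT-4 Turbo's ability to predict perioperative risk and postoperative outcomes from a procedure description and clinical notes. Reported F1 scores range from 0.50 (ASA Physical Status Classification) to 0.86 (hospital mortality), while performance on duration prediction is poor. The results suggest potential clinical utility for risk stratification but not for estimating durations.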

Contribution

The paper shows that general-domain large language models can assist with perioperative risk stratification and outcome prediction from clinical notes, and it identifies where they perform well (classification tasks) and where they do not (duration prediction).

Findings

01

Achieved an F1 score of 0.50 for ASA Physical Status Classification

02

Achieved an F1 score of 0.81 for ICU admission

03

Achieved an F1 score of 0.86 for hospital mortality

Abstract

We investigate whether general-domain large language models such as GPT-4 Turbo can perform risk stratification and predict post-operative outcome measures using a description of the procedure and a patient's clinical notes derived from the electronic health record. We examine predictive performance on 8 different tasks: prediction of ASA Physical Status Classification, hospital admission, ICU admission, unplanned admission, hospital mortality, PACU Phase 1 duration, hospital duration, and ICU duration. Few-shot and chain-of-thought prompting improve predictive performance for several of the tasks. We achieve F1 scores of 0.50 for ASA Physical Status Classification, 0.81 for ICU admission, and 0.86 for hospital mortality. Performance on duration prediction tasks was universally poor across all prompt strategies. Current generation large language models can assist clinicians in…
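For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F1 score is computed for a binary task such as ICU admission. It is not the authors' evaluation code: the function name `f1_score_binary` and the toy labels are assumptions made for illustration, and this listing does not state how the multi-class ASA score was averaged, so only the binary case is shown.

```python
from typing import Sequence


def f1_score_binary(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Return the F1 score (harmonic mean of precision and recall) for one positive class.

    Hypothetical helper for illustration; not taken from the paper's repository.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    if tp == 0:
        # No correct positive predictions: precision and recall are 0 or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels (made up, not study data): 1 = admitted to ICU, 0 = not admitted.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
f1 = f1_score_binary(y_true, y_pred)
print(f"F1 = {f1:.2f}")
```

Because F1 ignores true negatives, it is less flattered than accuracy by rare outcomes, which is one common reason to report it for events such as mortality.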

Code & Models

Repositories

philipchung/llm-periop-prediction
Official

Taxonomy

Topics: Machine Learning in Healthcare · Cardiac, Anesthesia and Surgical Outcomes · Artificial Intelligence in Healthcare and Education

Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Softmax · Label Smoothing · Multi-Head Attention · Adam · Dropout · Absolute Position Encodings · Layer Normalization