Trustworthy and Practical AI for Healthcare: A Guided Deferral System with Large Language Models

Joshua Strong; Qianhui Men; Alison Noble

arXiv:2406.07212 · cs.CL · February 27, 2025 · 1 citation

Open Access

TL;DR

This paper introduces a Human-AI collaboration system using large language models for healthcare, focusing on trustworthiness by deferring uncertain predictions to humans and addressing calibration issues in imbalanced data.

Contribution

It presents a novel guided deferral system for healthcare LLMs, combining medical report parsing with uncertainty-based human deferral, and proposes the Imbalanced Expected Calibration Error metric.
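The uncertainty-based deferral idea can be illustrated with a minimal sketch. This is not the authors' system: it assumes a binary disorder classifier that outputs a single probability, and the names `Decision`, `decide`, and the `threshold` value of 0.8 are hypothetical choices made for illustration. The guidance that the paper attaches to deferred cases is not modeled here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    """Outcome of one prediction (hypothetical structure, for illustration only)."""
    label: Optional[int]   # 1 = disorder present, 0 = absent, None = deferred
    confidence: float      # probability assigned to the more likely class
    deferred: bool         # True when the case is handed to a human


def decide(prob_positive: float, threshold: float = 0.8) -> Decision:
    """Accept the model's prediction only when it is confident enough.

    `prob_positive` is the model's probability for the positive class.
    `threshold` is an assumed cut-off; a real deployment would tune it on
    held-out data rather than use a fixed value.
    """
    if not 0.0 <= prob_positive <= 1.0:
        raise ValueError("prob_positive must lie in [0, 1]")

    label = 1 if prob_positive >= 0.5 else 0
    confidence = max(prob_positive, 1.0 - prob_positive)

    if confidence < threshold:
        # Too uncertain: defer to a human reviewer instead of committing.
        return Decision(label=None, confidence=confidence, deferred=True)
    return Decision(label=label, confidence=confidence, deferred=False)


if __name__ == "__main__":
    for p in (0.95, 0.55, 0.10):
        print(p, decide(p))
```

Raising the threshold sends more cases to humans and lowers the error rate on the cases the model keeps; the right trade-off depends on reviewer capacity and the cost of a wrong automated decision.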

Findings

01. Effective in classifying medical reports and deferring uncertain cases

02. Open-source LLMs tailored for healthcare deployment

03. Highlights calibration challenges in imbalanced healthcare data
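As background for the calibration finding, the sketch below computes the standard Expected Calibration Error (ECE) with equal-width confidence bins. It is not the paper's Imbalanced Expected Calibration Error, whose definition does not appear in this summary. The function name, the default `n_bins=10`, and the input format are assumptions made for illustration.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[int],
    n_bins: int = 10,
) -> float:
    """Standard ECE: weighted mean of |accuracy - confidence| over bins.

    `confidences` holds the model's confidence in its predicted class and
    `correct` holds 1 where that prediction was right, 0 otherwise.
    """
    n = len(confidences)
    if n == 0 or n != len(correct):
        raise ValueError("inputs must be non-empty and of equal length")

    conf_sum = [0.0] * n_bins   # total confidence per bin
    hit_sum = [0.0] * n_bins    # number of correct predictions per bin
    count = [0] * n_bins        # number of samples per bin

    for c, ok in zip(confidences, correct):
        # Equal-width bins over [0, 1]; a confidence of 1.0 goes in the last bin.
        idx = min(int(c * n_bins), n_bins - 1)
        conf_sum[idx] += c
        hit_sum[idx] += ok
        count[idx] += 1

    ece = 0.0
    for b in range(n_bins):
        if count[b] == 0:
            continue
        accuracy = hit_sum[b] / count[b]
        mean_conf = conf_sum[b] / count[b]
        ece += (count[b] / n) * abs(accuracy - mean_conf)
    return ece


if __name__ == "__main__":
    # Four predictions at 0.9 confidence, three of them correct.
    print(expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0]))
```

Because each bin is weighted by its sample count, the score is driven mostly by wherever the bulk of predictions fall, which is one reason calibration on imbalanced data deserves separate attention.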

Abstract

Large language models (LLMs) offer a valuable technology for various applications in healthcare. However, their tendency to hallucinate and the existing reliance on proprietary systems pose challenges in environments concerning critical decision-making and strict data privacy regulations, such as healthcare, where the trust in such systems is paramount. Through combining the strengths and discounting the weaknesses of humans and AI, the field of Human-AI Collaboration (HAIC) presents one front for tackling these challenges and hence improving trust. This paper presents a novel HAIC guided deferral system that can simultaneously parse medical reports for disorder classification, and defer uncertain predictions with intelligent guidance to humans. We develop methodology which builds efficient, effective and open-source LLMs for this purpose, for the real-world deployment in healthcare. We…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Data Quality and Management