Enhancing Healthcare LLM Trust with Atypical Presentations Recalibration

Jeremy Qin; Bang Liu; Quoc Dinh Nguyen

arXiv:2409.03225·cs.CL·September 6, 2024

TL;DR

This paper introduces Atypical Presentations Recalibration, a novel method to improve the confidence calibration of healthcare LLMs by leveraging atypical cases, significantly reducing calibration errors and outperforming existing techniques.

Contribution

The paper presents a new recalibration technique using atypical presentations to enhance healthcare LLM confidence estimates, addressing overconfidence issues in high-stakes medical applications.

Findings

01 Calibration errors reduced by approximately 60% on medical datasets

02 Outperforms existing calibration methods such as vanilla and chain-of-thought (CoT) confidence elicitation

03 Provides analysis of atypicality's role in model calibration
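To make "calibration error" concrete, here is a minimal sketch of the widely used expected calibration error (ECE), which bins predictions by stated confidence and averages the gap between confidence and accuracy in each bin. This page does not say which calibration metric the paper reports, so treating it as ECE is an assumption, and the function name, the bin count, and the toy data are all hypothetical.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Equal-width-bin ECE: sum over bins of (bin share) * |avg confidence - accuracy|.

    Assumption: confidences lie in [0, 1]. Bins are [lo, hi), with 1.0 placed
    in the last bin; other implementations may use (lo, hi] instead.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    total = len(confidences)
    if total == 0:
        raise ValueError("at least one prediction is required")

    # Group (confidence, correctness) pairs by confidence bin.
    bins: list[list[tuple[float, bool]]] = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / total) * abs(avg_conf - accuracy)
    return ece


if __name__ == "__main__":
    # Toy data, not results from the paper: a model that states 90%
    # confidence on four answers but gets only two of them right.
    stated = [0.9, 0.9, 0.9, 0.9]
    outcomes = [True, True, False, False]
    print(f"ECE = {expected_calibration_error(stated, outcomes):.3f}")
```

In the toy case the model claims 0.9 confidence at 0.5 accuracy, so the error comes out near 0.4. Under this metric, a relative reduction like the one reported above would mean the recalibrated confidences sit much closer to the observed accuracy.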

Abstract

Black-box large language models (LLMs) are increasingly deployed in various environments, making it essential for these models to effectively convey their confidence and uncertainty, especially in high-stakes settings. However, these models often exhibit overconfidence, leading to potential risks and misjudgments. Existing techniques for eliciting and calibrating LLM confidence have primarily focused on general reasoning datasets, yielding only modest improvements. Accurate calibration is crucial for informed decision-making and preventing adverse outcomes but remains challenging due to the complexity and variability of tasks these models perform. In this work, we investigate the miscalibration behavior of black-box LLMs within the healthcare setting. We propose a novel method, Atypical Presentations Recalibration, which leverages atypical presentations to adjust the model's…

Code & Models

Repositories

jeremy-qin/medical_confidence_elicitation
PyTorch · Official

Videos

Enhancing Healthcare LLM Trust with Atypical Presentations Recalibration · Underline

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Semantic Web and Ontologies · Electronic Health Records Systems