Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain

Burcu Sayin; Pasquale Minervini; Jacopo Staiano; Andrea Passerini

arXiv:2403.20288 · cs.CL · May 7, 2024 · 1 citation

TL;DR

This paper investigates how Large Language Models can assist physicians in medical decision-making by analyzing their interaction effectiveness, prompt sensitivity, and potential to improve diagnostic accuracy in clinical scenarios.

Contribution

It introduces a comprehensive evaluation of multiple LLMs in medical interactions, highlighting prompt design's impact and the models' ability to enhance diagnostic accuracy.

Findings

01. Prompt design significantly affects the downstream accuracy of LLMs.

02. When the physician is correct 38% of the time, Mistral can raise answer accuracy to as much as 74%.

03. Llama2 and Meditron are highly sensitive to prompts.

Abstract

We explore the potential of Large Language Models (LLMs) to assist and potentially correct physicians in medical decision-making tasks. We evaluate several LLMs, including Meditron, Llama2, and Mistral, to analyze the ability of these models to interact effectively with physicians across different scenarios. We consider questions from PubMedQA and several tasks, ranging from binary (yes/no) responses to long answer generation, where the answer of the model is produced after an interaction with a physician. Our findings suggest that prompt design significantly influences the downstream accuracy of LLMs and that LLMs can provide valuable feedback to physicians, challenging incorrect diagnoses and contributing to more accurate decision-making. For example, when the physician is accurate 38% of the time, Mistral can produce the correct answer, improving accuracy up to 74% depending on the…
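To make the accuracy comparison in the abstract concrete, here is a minimal sketch of how physician accuracy and post-interaction model accuracy could be compared on yes/no questions. The helper `accuracy` and the toy labels are hypothetical illustrations, not the authors' evaluation code or data.

```python
def accuracy(predictions, gold):
    """Return the fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("need at least one example")
    correct = sum(p == g for p, g in zip(predictions, gold))
    return correct / len(gold)


# Hypothetical toy data (assumption): five yes/no questions.
gold = ["yes", "no", "yes", "yes", "no"]
physician = ["no", "no", "no", "yes", "yes"]       # physician's answers
model_after = ["yes", "no", "yes", "yes", "yes"]   # model's answers after seeing the physician's

physician_acc = accuracy(physician, gold)
model_acc = accuracy(model_after, gold)

# Absolute gain in percentage points, e.g. 38% -> 74% would be a gain of 36 points.
gain_points = (model_acc - physician_acc) * 100
print(f"physician={physician_acc:.0%} model={model_acc:.0%} gain={gain_points:.0f} points")
```

Reporting the gain in percentage points rather than as a relative increase keeps the comparison unambiguous: a move from 38% to 74% is a 36-point gain.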

Code & Models

Repositories

unitn-sml/physician-medLLM-interaction
Official

Videos

Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain · Underline

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Healthcare Systems and Technology · Health Sciences Research and Education