Beyond Accuracy: Risk-Sensitive Evaluation of Hallucinated Medical Advice
Savan Doshi

TL;DR
This paper introduces a risk-sensitive evaluation method for medical language models that scores hallucinated content by the harm it could cause if acted upon, rather than by its factual correctness.
Contribution
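The paper proposes a framework that quantifies hallucinations by the presence of risk-bearing language (treatment directives, contraindications, urgency cues, and mentions of high-risk medications), moving beyond traditional correctness metrics to better evaluate clinical safety.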
Findings
Models differ significantly in risk profiles despite similar accuracy.
Standard metrics do not capture high-risk hallucinations.
Risk-sensitive evaluation reveals safety concerns overlooked by traditional methods.
Abstract
Large language models are increasingly being used in patient-facing medical question answering, where hallucinated outputs can vary widely in potential harm. However, existing hallucination standards and evaluation metrics focus primarily on factual correctness, treating all errors as equally severe. This obscures clinically relevant failure modes, particularly when models generate unsupported but actionable medical language. We propose a risk-sensitive evaluation framework that quantifies hallucinations through the presence of risk-bearing language, including treatment directives, contraindications, urgency cues, and mentions of high-risk medications. Rather than assessing clinical correctness, our approach evaluates the potential impact of hallucinated content if acted upon. We further combine risk scoring with a relevance measure to identify high-risk, low-grounding failures. We…
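The scoring idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract names the four risk categories but not their contents, so the keyword patterns, the Jaccard-overlap relevance measure, the comparison against a reference text, and both thresholds are assumptions chosen only for illustration.

```python
import re

# Hypothetical lexicons: the abstract lists these four categories,
# but the specific patterns below are illustrative assumptions.
RISK_PATTERNS = {
    "treatment_directive": r"\b(take|start|stop|increase|decrease|double)\b",
    "contraindication": r"\b(do not|avoid|never|contraindicated)\b",
    "urgency_cue": r"\b(immediately|right away|emergency|urgent)\b",
    "high_risk_medication": r"\b(warfarin|insulin|opioids?|methotrexate|digoxin)\b",
}


def risk_score(answer: str) -> float:
    """Fraction of risk categories with at least one match (0.0 to 1.0)."""
    text = answer.lower()
    hits = sum(bool(re.search(p, text)) for p in RISK_PATTERNS.values())
    return hits / len(RISK_PATTERNS)


def tokens(text: str) -> set:
    """Lowercased alphabetic tokens of a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def relevance(answer: str, reference: str) -> float:
    """Jaccard token overlap, an assumed stand-in for the relevance measure."""
    a, r = tokens(answer), tokens(reference)
    union = a | r
    return len(a & r) / len(union) if union else 0.0


def is_high_risk_low_grounding(answer: str, reference: str,
                               risk_min: float = 0.5,
                               relevance_max: float = 0.2) -> bool:
    """Flag answers that carry risk-bearing language but barely overlap
    with the reference. Both thresholds are assumed values."""
    return (risk_score(answer) >= risk_min
            and relevance(answer, reference) <= relevance_max)


question = "What helps with a mild seasonal headache?"
risky = "Stop your warfarin immediately and double the dose of insulin."
benign = "Rest and fluids usually help a mild headache."

print(risk_score(risky), is_high_risk_low_grounding(risky, question))
print(risk_score(benign), is_high_risk_low_grounding(benign, question))
```

Note that the score says nothing about whether the advice is clinically correct. It only measures how actionable the language is, which matches the abstract's framing of potential impact if the content were acted upon.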
Taxonomy
Topics: Topic Modeling · Neurobiology of Language and Bilingualism · Healthcare Decision-Making and Restraints
