Evaluation of Large Language Models for Summarization Tasks in the Medical Domain: A Narrative Review

Emma Croxford; Yanjun Gao; Nicholas Pellegrino; Karen K. Wong; Graham Wills; Elliot First; Frank J. Liao; Cherodeep Goswami; Brian Patterson; Majid Afshar

arXiv:2409.18170·cs.CL·September 30, 2024

Open Access

TL;DR

This paper reviews how large language models are evaluated for medical text summarization, highlighting current challenges and proposing future directions to improve assessment methods in high-stakes clinical applications.

Contribution

It provides a comprehensive overview of evaluation methods for clinical summarization by large language models and suggests future research directions to overcome resource limitations.

Findings

01. Current evaluation methods are resource-intensive and limited in scope.
02. There is a need for standardized, scalable evaluation frameworks.
03. Future directions include developing automated and semi-automated evaluation techniques.

Abstract

Large Language Models have advanced clinical Natural Language Generation, creating opportunities to manage the volume of medical text. However, the high-stakes nature of medicine requires reliable evaluation, which remains a challenge. In this narrative review, we assess the current evaluation state for clinical summarization tasks and propose future directions to address the resource constraints of expert human evaluation.

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies