Towards objectively evaluating the quality of generated medical summaries
Francesco Moramarco, Damir Juric, Aleksandar Savkov, Ehud Reiter

TL;DR
This paper introduces an objective evaluation method for generated medical summaries: evaluators count facts, and precision, recall, F-score, and accuracy are computed from the raw counts, with the aim of making assessment more reproducible and accurate.
Contribution
It presents a novel fact-counting evaluation approach for medical summaries, enhancing objectivity and reproducibility over traditional subjective methods.
Findings
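The authors argue that counting facts yields a more objective and more reproducible evaluation than subjective quality ratings.
The method is applied to medical report summarisation, where objective quality and accuracy are of paramount importance.
Evaluation reduces to raw fact counts, from which precision, recall, F-score, and accuracy are computed.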
Abstract
We propose a method for evaluating the quality of generated text by asking evaluators to count facts, and computing precision, recall, F-score, and accuracy from the raw counts. We believe this approach leads to a more objective and easier-to-reproduce evaluation. We apply this to the task of medical report summarisation, where measuring objective quality and accuracy is of paramount importance.
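To make the arithmetic concrete, here is a minimal sketch of how precision, recall, and F-score could be derived from raw fact counts. The three count categories (`correct`, `incorrect`, `omitted`), the class name `FactCounts`, and the example numbers are illustrative assumptions, not the authors' annotation scheme. Accuracy is left out because the abstract does not say how negatives are counted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FactCounts:
    """Hypothetical tally produced by one evaluator for one summary."""
    correct: int    # generated facts that also appear in the reference
    incorrect: int  # generated facts that are wrong or unsupported
    omitted: int    # reference facts missing from the generated summary


def precision(c: FactCounts) -> float:
    # Share of generated facts that are correct.
    generated = c.correct + c.incorrect
    return c.correct / generated if generated else 0.0


def recall(c: FactCounts) -> float:
    # Share of reference facts that the summary reproduces.
    expected = c.correct + c.omitted
    return c.correct / expected if expected else 0.0


def f_score(c: FactCounts, beta: float = 1.0) -> float:
    # Standard F-beta: weighted harmonic mean of precision and recall.
    p, r = precision(c), recall(c)
    denom = beta**2 * p + r
    return (1 + beta**2) * p * r / denom if denom else 0.0


# Made-up counts for illustration only.
counts = FactCounts(correct=8, incorrect=2, omitted=4)
print(f"precision={precision(counts):.3f}")
print(f"recall={recall(counts):.3f}")
print(f"f1={f_score(counts):.3f}")
```

With these made-up counts, precision is 8/10, recall is 8/12, and F1 is 16/22. Because every score is a function of integer counts, two evaluators who agree on the counts necessarily agree on the scores.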
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Semantic Web and Ontologies · Topic Modeling
