DocLens: Multi-aspect Fine-grained Evaluation for Medical Text Generation
Yiqing Xie, Sheng Zhang, Hao Cheng, Pengfei Liu, Zelalem Gero, Cliff Wong, Tristan Naumann, Hoifung Poon, Carolyn Rose

TL;DR
This paper introduces DocLens, a multi-aspect evaluation framework for medical text generation that assesses completeness, conciseness, and attribution, showing higher agreement with medical experts than existing metrics.
Contribution
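It proposes fine-grained metrics for the completeness, conciseness, and attribution of generated medical text. The metrics can be computed by instruction-following models (proprietary or open-source) or by supervised entailment models, and a human study shows they agree more closely with medical experts than existing metrics do.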
Findings
DocLens outperforms existing metrics in agreement with medical experts.
The framework is demonstrated with three evaluators on three tasks: clinical note generation, radiology report summarization, and patient question summarization.
Open-source evaluators require further improvement, as highlighted by the study.
Abstract
Medical text generation aims to assist with administrative work and highlight salient information to support decision-making. To reflect the specific requirements of medical text, in this paper, we propose a set of metrics to evaluate the completeness, conciseness, and attribution of the generated text at a fine-grained level. The metrics can be computed by various types of evaluators including instruction-following (both proprietary and open-source) and supervised entailment models. We demonstrate the effectiveness of the resulting framework, DocLens, with three evaluators on three tasks: clinical note generation, radiology report summarization, and patient question summarization. A comprehensive human study shows that DocLens exhibits substantially higher agreement with the judgments of medical experts than existing metrics. The results also highlight the need to improve open-source…
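The abstract does not give formulas for the metrics, so the following is only a minimal sketch of one plausible reading: completeness as the share of reference claims supported by the generated text, and conciseness as the share of generated claims supported by the reference. The function names, the evaluator signature, and the substring-based toy evaluator are all hypothetical stand-ins for illustration, not the DocLens implementation; in practice the evaluator would be an instruction-following or entailment model. Attribution is left out because the abstract does not describe how it is scored.

```python
from typing import Callable, List

# Hypothetical evaluator signature (an assumption, not the DocLens API):
# returns True if `text` supports `claim`.
Evaluator = Callable[[str, str], bool]


def support_rate(claims: List[str], text: str, entails: Evaluator) -> float:
    """Fraction of `claims` that the evaluator judges to be supported by `text`."""
    if not claims:
        return 0.0
    return sum(entails(text, claim) for claim in claims) / len(claims)


def completeness(reference_claims: List[str], generated_text: str,
                 entails: Evaluator) -> float:
    """Assumed reading: share of reference claims covered by the generated text."""
    return support_rate(reference_claims, generated_text, entails)


def conciseness(generated_claims: List[str], reference_text: str,
                entails: Evaluator) -> float:
    """Assumed reading: share of generated claims supported by the reference."""
    return support_rate(generated_claims, reference_text, entails)


def toy_entails(text: str, claim: str) -> bool:
    """Toy stand-in evaluator: case-insensitive substring match."""
    return claim.lower() in text.lower()


if __name__ == "__main__":
    reference_claims = ["patient has a fever", "patient reports a cough"]
    generated_text = "The patient has a fever and was given fluids."
    # One of the two reference claims appears in the generated text.
    print(completeness(reference_claims, generated_text, toy_entails))
```

Passing the evaluator as a function keeps the scoring logic independent of which model judges each claim, which mirrors the abstract's point that the same metrics can be computed by different types of evaluators.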
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Biomedical Text Mining and Ontologies
Methods: Sparse Evolutionary Training · Focus
