TL;DR
This paper introduces a new dataset for question-driven summarization of consumer health answers. It enables evaluation of machine-generated summaries, with the aim of making medical information more accessible.
Contribution
It presents the MEDIQA Answer Summarization dataset, the first collection of question-driven health answer summaries, and benchmarks it with baseline and state-of-the-art models.
Findings
The dataset provides gold-standard, human-generated summaries against which machine-generated health summaries can be evaluated.
Baseline models provide a reference point for future work.
State-of-the-art models show promise but leave room for improvement.
Abstract
Automatic summarization of natural language is a widely studied area in computer science, one that is broadly applicable to anyone who routinely needs to understand large quantities of information. For example, in the medical domain, recent developments in deep learning approaches to automatic summarization have the potential to make health information more easily accessible to patients and consumers. However, to evaluate the quality of automatically generated summaries of health information, gold-standard, human-generated summaries are required. Using answers provided by the National Library of Medicine's consumer health question answering system, we present the MEDIQA Answer Summarization dataset, the first summarization collection containing question-driven summaries of answers to consumer health questions. This dataset can be used to evaluate single or multi-document summaries…
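To make concrete what evaluating a generated summary against a gold-standard, human-generated one can look like, here is a minimal sketch. The excerpt above does not name a metric, so the unigram-overlap F1 used here (in the spirit of ROUGE-1) is an assumption chosen for illustration; the function name `unigram_overlap_f1` and the example strings are hypothetical and do not come from the dataset.

```python
from collections import Counter


def unigram_overlap_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a generated summary and a reference.

    Illustrative only: whitespace tokenization, lowercasing, no stemming.
    The metric choice is an assumption, not taken from the paper excerpt.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Multiset intersection: each shared token counts up to its
    # minimum frequency in the two texts.
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0

    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical strings, not drawn from the dataset.
reference_summary = "rest and fluids usually help"
generated_summary = "rest and fluids are recommended"
score = unigram_overlap_f1(generated_summary, reference_summary)
```

A score like this only measures surface word overlap with the reference, which is one reason human-written gold summaries are needed as the comparison point.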
