A Dataset for Plain Language Adaptation of Biomedical Abstracts
Kush Attal, Brian Ondov, Dina Demner-Fushman

TL;DR
This paper introduces a novel, manually aligned dataset of biomedical abstracts adapted into plain language, enabling better evaluation of deep learning models for health communication.
Contribution
The creation of the first document- and sentence-aligned dataset for biomedical abstract adaptation into plain language, facilitating improved model training and evaluation.
Findings
Benchmarking with state-of-the-art models establishes baseline performance.
The dataset contains 750 abstracts with 7643 sentence pairs.
The dataset provides a resource for future research in biomedical plain language adaptation.
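To make "document- and sentence-aligned" concrete, the following is a minimal sketch of how such a corpus could be held in memory. The class names, field names, and toy sentences are all hypothetical illustrations; they are not the dataset's actual release format or content.

```python
from dataclasses import dataclass, field


@dataclass
class SentencePair:
    """One sentence-level alignment (hypothetical schema)."""
    source: str       # sentence from the original abstract
    adaptation: str   # plain-language rewrite of that sentence


@dataclass
class AlignedAbstract:
    """One document-level alignment: an abstract and its ordered sentence pairs."""
    abstract_id: str  # hypothetical identifier, not a real record ID
    pairs: list[SentencePair] = field(default_factory=list)

    def source_text(self) -> str:
        # Rebuild the expert-level document from its sentences.
        return " ".join(p.source for p in self.pairs)

    def adapted_text(self) -> str:
        # Rebuild the plain-language document from its sentences.
        return " ".join(p.adaptation for p in self.pairs)


def count_pairs(corpus: list[AlignedAbstract]) -> int:
    """Total number of sentence pairs across all abstracts."""
    return sum(len(doc.pairs) for doc in corpus)


# Toy, made-up example; not taken from the dataset.
toy_corpus = [
    AlignedAbstract(
        abstract_id="toy-001",
        pairs=[
            SentencePair(
                source="Hypertension is a major risk factor for stroke.",
                adaptation="High blood pressure makes a stroke more likely.",
            ),
            SentencePair(
                source="Antihypertensive therapy reduces this risk.",
                adaptation="Blood pressure medicine lowers this risk.",
            ),
        ],
    )
]

print(count_pairs(toy_corpus))
print(toy_corpus[0].adapted_text())
```

Keeping the pairs in order inside each abstract means the same records serve both levels of alignment: individual pairs for sentence-level evaluation, and the joined text for document-level evaluation.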
Abstract
Though exponentially growing health-related literature has been made available to a broad audience online, the language of scientific articles can be difficult for the general public to understand. Therefore, adapting this expert-level language into plain language versions is necessary for the public to reliably comprehend the vast health-related literature. Deep Learning algorithms for automatic adaptation are a possible solution; however, gold standard datasets are needed for proper evaluation. Proposed datasets thus far consist of either pairs of comparable professional- and general public-facing documents or pairs of semantically similar sentences mined from such documents. This leads to a trade-off between imperfect alignments and small test sets. To address this issue, we created the Plain Language Adaptation of Biomedical Abstracts dataset. This dataset is the first manually…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Test
