TL;DR
This paper explores using pretrained neural language models to assist in medical text simplification, creating a semi-automated tool that improves the speed and quality of simplifying complex medical texts for broader accessibility.
Contribution
It introduces a new aligned medical dataset and demonstrates how combining multiple pretrained models enhances text simplification accuracy.
Findings
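The ensemble of the four models outperforms the best individual model by 2.1%.
Adding the context of the sentence being simplified improves accuracy by 6.17% (absolute) over the best individual model.
The ensemble reaches an overall word prediction accuracy of 64.52%.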
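To make the ensemble and accuracy figures above more concrete, here is a minimal sketch of how several models' next-word predictions could be combined and scored. It assumes the ensemble averages per-model probability distributions, which is only one possible combination rule and is not confirmed by this summary. The function names and the toy probabilities are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def ensemble_next_word(distributions):
    """Average next-word probabilities across models and return the top word.

    `distributions` is a list of dicts mapping candidate words to probabilities,
    one dict per model. Averaging is an assumed combination rule.
    """
    totals = defaultdict(float)
    for dist in distributions:
        for word, prob in dist.items():
            totals[word] += prob / len(distributions)
    return max(totals, key=totals.get)


def word_prediction_accuracy(predictions, targets):
    """Fraction of positions where the predicted word equals the reference word."""
    correct = sum(p == t for p, t in zip(predictions, targets))
    return correct / len(targets)


# Hypothetical toy outputs from three models for a single next-word slot.
model_outputs = [
    {"heart": 0.5, "cardiac": 0.4, "chest": 0.1},
    {"heart": 0.3, "cardiac": 0.6, "chest": 0.1},
    {"heart": 0.6, "cardiac": 0.2, "chest": 0.2},
]
prediction = ensemble_next_word(model_outputs)
print(prediction)
```

In an autocomplete setting, the predicted word would be offered to the human writer as a suggestion, and accuracy would be measured against the word that actually appears in the simplified reference sentence.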
Abstract
The goal of text simplification (TS) is to transform difficult text into a version that is easier to understand and more broadly accessible to a wide variety of readers. In some domains, such as healthcare, fully automated approaches cannot be used since information must be accurately preserved. Instead, semi-automated approaches can be used that assist a human writer in simplifying text faster and at a higher quality. In this paper, we examine the application of autocomplete to text simplification in the medical domain. We introduce a new parallel medical dataset consisting of aligned English Wikipedia and Simple English Wikipedia sentences and examine the application of pretrained neural language models (PNLMs) on this dataset. We compare four PNLMs (BERT, RoBERTa, XLNet, and GPT-2), and show how the additional context of the sentence to be simplified can be incorporated to achieve…
Taxonomy
Methods: Linear Layer · WordPiece · Attention Dropout · Weight Decay · Attention Is All You Need · BERT · Byte Pair Encoding · Softmax · Adam · Layer Normalization
