AutoMeTS: The Autocomplete for Medical Text Simplification

Hoang Van; David Kauchak; Gondy Leroy

arXiv:2010.10573 · cs.CL · October 22, 2020

TL;DR

This paper applies pretrained neural language models as an autocomplete aid for medical text simplification. The result is a semi-automated tool intended to help human writers simplify complex medical text faster and at higher quality, making it accessible to a broader range of readers.

Contribution

The paper introduces a new parallel medical dataset of aligned English Wikipedia and Simple English Wikipedia sentences, and shows that an ensemble of pretrained models predicts the next word more accurately than the individual models.

Findings

01 The ensemble model outperforms the individual models by 2.1%.

02 Adding the context of the sentence to be simplified improves model performance by 6.17%.

03 The best configuration achieves 64.52% word prediction accuracy.
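To make the ensemble and accuracy figures above concrete, here is a minimal sketch of how next-word predictions from several models could be combined and scored. It is an illustration under stated assumptions, not the authors' method: this summary does not describe how the paper combines its models, so simple probability averaging is used as a stand-in, and the function names and toy probabilities are hypothetical.

```python
from collections import defaultdict


def ensemble_next_word(distributions):
    """Return the word with the highest average probability across models.

    `distributions` is a list of dicts mapping candidate word -> probability,
    one dict per model. Averaging is an assumed combination rule, chosen
    here only for illustration.
    """
    if not distributions:
        raise ValueError("need at least one model distribution")
    totals = defaultdict(float)
    for dist in distributions:
        for word, prob in dist.items():
            totals[word] += prob / len(distributions)
    return max(totals, key=totals.get)


def word_prediction_accuracy(predicted, reference):
    """Fraction of positions where the predicted word equals the reference word."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        return 0.0
    hits = sum(p == r for p, r in zip(predicted, reference))
    return hits / len(reference)


# Toy next-word distributions from three hypothetical models.
model_a = {"heart": 0.5, "cardiac": 0.4, "chest": 0.1}
model_b = {"heart": 0.3, "cardiac": 0.6, "chest": 0.1}
model_c = {"heart": 0.6, "cardiac": 0.2, "chest": 0.2}

suggestion = ensemble_next_word([model_a, model_b, model_c])

# Score a sequence of suggestions against the words a writer actually typed.
accuracy = word_prediction_accuracy(
    ["the", "heart", "pumps", "blood"],
    ["the", "heart", "moves", "blood"],
)
```

In this toy case one model prefers "cardiac" but the averaged distribution favors "heart", which shows how an ensemble can override a single model's top choice. Accuracy here counts exact matches at each word position, which is one plausible reading of "word prediction accuracy".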

Abstract

The goal of text simplification (TS) is to transform difficult text into a version that is easier to understand and more broadly accessible to a wide variety of readers. In some domains, such as healthcare, fully automated approaches cannot be used since information must be accurately preserved. Instead, semi-automated approaches can be used that assist a human writer in simplifying text faster and at a higher quality. In this paper, we examine the application of autocomplete to text simplification in the medical domain. We introduce a new parallel medical data set consisting of aligned English Wikipedia with Simple English Wikipedia sentences and examine the application of pretrained neural language models (PNLMs) on this dataset. We compare four PNLMs (BERT, RoBERTa, XLNet, and GPT-2), and show how the additional context of the sentence to be simplified can be incorporated to achieve…

Code & Models

Repositories

vanh17/MedTextSimplifier (Official)

Taxonomy

Methods: Linear Layer · WordPiece · Attention Dropout · Weight Decay · Attention Is All You Need · BERT · Byte Pair Encoding · Softmax · Adam · Layer Normalization