Multilingual Simplification of Medical Texts

Sebastian Joseph; Kathryn Kazanas; Keziah Reina; Vishnesh J. Ramanathan; Wei Xu; Byron C. Wallace; and Junyi Jessy Li

arXiv:2305.12532 · cs.CL · October 19, 2023 · 1 cites

TL;DR

This paper introduces a multilingual dataset for simplifying complex medical texts into four languages and evaluates models on this task, aiming to improve health literacy across language barriers.

Contribution

It presents the first sentence-aligned multilingual medical text simplification dataset and evaluates models in multilingual and zero-shot settings.

Findings

01 Models can generate viable simplified texts

02 Multilingual dataset enables cross-language simplification research

03 Challenges remain in improving model quality and consistency

Abstract

Automated text simplification aims to produce simple versions of complex texts. This task is especially useful in the medical domain, where the latest medical findings are typically communicated via complex and technical articles. This creates barriers for laypeople seeking access to up-to-date medical findings, consequently impeding progress on health literacy. Most existing work on medical text simplification has focused on monolingual settings, with the result that such evidence is available in only one language (most often, English). This work addresses this limitation via multilingual simplification, i.e., directly simplifying complex texts into simplified texts in multiple languages. We introduce MultiCochrane, the first sentence-aligned multilingual text simplification dataset for the medical domain in four languages: English, Spanish, French, and Farsi. We evaluate…

Code & Models

Repositories

sebajoe/multicochrane (Official)

Taxonomy

Topics: Text Readability and Simplification · Topic Modeling · Natural Language Processing Techniques