MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain

Chao Jiang; Wei Xu

arXiv:2405.02144·cs.CL·October 29, 2024

TL;DR

This paper introduces MedReadMe, a comprehensive dataset and analysis framework for assessing and improving sentence-level readability in medical texts, leveraging fine-grained annotations and large language models.

Contribution

The paper provides a new dataset with detailed annotations, benchmarks existing readability metrics, and demonstrates that adding jargon span features improves their accuracy.

Findings

01

Adding jargon span counts improves readability metric correlation with human judgments.

02

Fine-grained span annotations enable better understanding of medical text complexity.

03

Benchmarking shows LLM-based methods outperform traditional metrics in medical readability.
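The first finding can be illustrated with a minimal sketch. This summary does not say how the paper combines jargon span counts with an existing metric, so the additive form below, the `weight` parameter, and the function names `pearson` and `add_jargon_feature` are all assumptions made for illustration. The numbers are invented toy values, not data from MedReadMe. The sketch only shows the general idea: compare a base score's correlation with human ratings before and after a jargon-count term is added.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)


def add_jargon_feature(base_scores, jargon_counts, weight):
    """Add a weighted jargon-span count to each base readability score.

    Hypothetical combination rule: the additive form and the weight are
    assumptions for this sketch, not the method described in the paper.
    """
    return [s + weight * c for s, c in zip(base_scores, jargon_counts)]


# Toy values invented for illustration (NOT taken from the MedReadMe dataset).
base_scores = [8, 12, 10, 15, 9]            # e.g. sentence length in words
jargon_counts = [0, 3, 1, 0, 2]             # annotated jargon spans per sentence
human_ratings = [1.0, 5.0, 2.5, 3.0, 3.5]   # higher means harder to read

combined = add_jargon_feature(base_scores, jargon_counts, weight=2)

print("base metric vs. human ratings:", round(pearson(base_scores, human_ratings), 3))
print("base + jargon vs. human ratings:", round(pearson(combined, human_ratings), 3))
```

On these toy values the combined score tracks the human ratings more closely than the base score alone. In practice the weight would presumably be fit on held-out annotated data rather than chosen by hand.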

Abstract

Medical texts are notoriously challenging to read. Properly measuring their readability is the first step towards making them more accessible. In this paper, we present a systematic study on fine-grained readability measurements in the medical domain at both sentence-level and span-level. We introduce a new dataset MedReadMe, which consists of manually annotated readability ratings and fine-grained complex span annotation for 4,520 sentences, featuring two novel "Google-Easy" and "Google-Hard" categories. It supports our quantitative analysis, which covers 650 linguistic features and automatic complex word and jargon identification. Enabled by our high-quality annotation, we benchmark and improve several state-of-the-art sentence-level readability metrics for the medical domain specifically, which include unsupervised, supervised, and prompting-based methods using recently developed…

Taxonomy

Topics: Text Readability and Simplification · Topic Modeling