Efficient Standardization of Clinical Notes using Large Language Models

Daniel B. Hier; Michael D. Carrithers; Thanh Son Do; Tayo Obafemi-Ajayi

arXiv:2501.00644 · cs.CL · January 3, 2025

Open Access

TL;DR

This paper introduces a large language model-based method to standardize clinical notes, improving their consistency, readability, and readiness for data extraction and interoperability in healthcare systems.

Contribution

The study presents a novel LLM approach for comprehensive clinical note standardization, addressing inconsistencies in grammar, spelling, terminology, abbreviations, and formatting.
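As a rough illustration of what a single standardization request covering these five categories might look like, the Python sketch below assembles an instruction prompt. The function name `build_standardization_prompt`, the task wording, and the example section headings are all hypothetical and are not taken from the paper; no LLM is called.

```python
# Hypothetical sketch: the task wording and headings are illustrative
# assumptions, not the prompt used in the paper.
TASKS = [
    "Correct grammatical errors.",
    "Correct spelling errors.",
    "Replace non-standard terms with standard terminology.",
    "Expand abbreviations and acronyms.",
    "Reorganize the note into the canonical sections listed below.",
]


def build_standardization_prompt(note: str, headings: list[str]) -> str:
    """Assemble one instruction prompt for a single clinical note."""
    # Number the tasks so the model can address them one by one.
    steps = "\n".join(f"{i}. {task}" for i, task in enumerate(TASKS, start=1))
    sections = "\n".join(f"- {h}" for h in headings)
    return (
        "Standardize the clinical note below without adding or removing "
        "clinical information.\n\n"
        f"Tasks:\n{steps}\n\n"
        f"Canonical sections:\n{sections}\n\n"
        f"Note:\n{note.strip()}\n"
    )


if __name__ == "__main__":
    # Assumed headings and an invented note fragment, for demonstration only.
    example_headings = ["History", "Examination", "Assessment", "Plan"]
    print(build_standardization_prompt("pt c/o HA x3d, no n/v", example_headings))
```

Passing the headings as a parameter keeps the sketch neutral about which canonical sections a given site would choose.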

Findings

1. Corrected an average of 4.9 grammatical errors per note
2. Expanded an average of 15.8 abbreviations and acronyms per note
3. No significant data loss observed after standardization

Abstract

Clinician notes are a rich source of patient information but often contain inconsistencies due to varied writing styles, colloquialisms, abbreviations, medical jargon, grammatical errors, and non-standard formatting. These inconsistencies hinder the extraction of meaningful data from electronic health records (EHRs), posing challenges for quality improvement, population health, precision medicine, decision support, and research. We present a large language model approach to standardizing a corpus of 1,618 clinical notes. Standardization corrected an average of $4.9 \pm 1.8$ grammatical errors, $3.3 \pm 5.2$ spelling errors, converted $3.1 \pm 3.0$ non-standard terms to standard terminology, and expanded $15.8 \pm 9.1$ abbreviations and acronyms per note. Additionally, notes were re-organized into canonical sections with standardized headings. This process prepared notes for key…
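The per-note figures above are reported as a mean with a spread. A minimal Python sketch of how such a summary could be computed from per-note correction counts follows. The helper name `summarize` and the counts are invented for illustration and are not study data; the sketch also assumes the spread is a sample standard deviation, which the abstract does not state.

```python
from statistics import mean, stdev


def summarize(counts: list[int]) -> str:
    """Format per-note counts as 'mean ± SD' with one decimal place."""
    # stdev() is the sample standard deviation (n - 1 denominator);
    # whether the paper used sample or population SD is an assumption here.
    return f"{mean(counts):.1f} ± {stdev(counts):.1f}"


# Made-up grammatical-correction counts for five notes (not study data).
grammar_fixes = [3, 5, 4, 7, 6]
print(summarize(grammar_fixes))
```

The same helper could be applied to each correction category (spelling, terminology, abbreviations) to produce one summary line per category.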

Taxonomy

Topics: Biomedical Text Mining and Ontologies