# A Natural Language Processing Approach to Identify Negative Patient Descriptors in Electronic Health Records for Maternal Care

**Authors:** Azade Tabaie, Angela D. Thomas, Emily K. Mutondo, Allan Fong

PMC · DOI: 10.1055/a-2703-7227 · Applied Clinical Informatics · 2025-10-28

## TL;DR

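This study uses NLP to find negative patient descriptors in maternal care clinical notes. The descriptors appear disproportionately in the notes of Black patients and patients with public insurance, which suggests implicit bias in documentation.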

## Contribution

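A keyword-based NLP pipeline, combining expert-labeled sentences with an elastic net logistic regression classifier, identifies negative patient descriptors in EHR clinical notes and relates their prevalence to race, age, insurance type, and pregnancy outcomes.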

## Key findings

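- Of 190,026 clinical notes from 9,302 patients, 719 notes (444 patients) contained at least one negative descriptor; 313 of those patients (70.5%) were Black.
- Negative descriptors were more common among younger patients (18–29 years: 49.3%) and those with Medicare/Medicaid insurance (65.3%).
- Case patients (postpartum readmission or severe maternal morbidity) had slightly fewer descriptors overall but higher adjusted odds of having them. Black race was associated with higher odds, and commercial insurance with lower odds, of having negative descriptors.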

## Abstract

Maternal harm, especially for Black women, is a significant health care issue. Unstructured clinical notes in electronic health records (EHRs) may reveal unsafe maternal care. Prior studies using natural language processing (NLP) have shown that tone and sentiment in notes contribute to preventable safety events.

This study aimed to examine whether negative patient descriptors in EHR clinical notes are associated with adverse maternal outcomes and how their use varies by patient demographics.

We conducted a retrospective cohort study of women who delivered at two large birthing hospitals in Washington, DC between January 1, 2016 and March 31, 2020. Using a predefined list of negative keywords (e.g., combative) and NLP, we identified sentences from clinical notes for manual review. Two subject matter experts labeled keywords as “negative descriptors” if they negatively described patients. A logistic regression model with elastic net regularization was trained on the labeled sentences to classify the remaining corpus. We evaluated the prevalence of negative descriptors by race, age, insurance type, and pregnancy outcomes, and calculated adjusted odds ratios.
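The keyword screening step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the keyword list contains only the one example the abstract names (`combative`), and the function names `split_sentences` and `find_candidate_sentences` and the naive sentence splitter are assumptions made for this sketch.

```python
import re

# Hypothetical keyword list: the abstract names only "combative" as an example;
# the study's full predefined list is not reproduced here.
NEGATIVE_KEYWORDS = ["combative"]


def split_sentences(note):
    """Naive splitter on ., ! or ? followed by whitespace (an assumption;
    real clinical text usually needs a more robust sentence tokenizer)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", note) if s.strip()]


def find_candidate_sentences(note, keywords=NEGATIVE_KEYWORDS):
    """Return (keyword, sentence) pairs for sentences containing a keyword.

    A match only flags a sentence for manual review; it does not decide
    whether the keyword actually describes the patient negatively.
    """
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    hits = []
    for sentence in split_sentences(note):
        match = pattern.search(sentence)
        if match:
            hits.append((match.group(1).lower(), sentence))
    return hits


example_note = "Patient was combative with staff. Vitals stable. Not combative today."
candidates = find_candidate_sentences(example_note)
```

In this example both the first and third sentences are flagged, even though the third one negates the keyword. That is why a pure keyword match is only a screening step, and why the study relied on expert labeling followed by a trained classifier.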

Among 190,026 clinical notes from 9,302 patients, 719 notes associated with 444 patients contained at least one negative descriptor. Of these, 313 (70.5%) were Black, 45 (10.1%) were White, and 86 (19.4%) were from Other racial groups (p < 0.001). Negative descriptors were more common among younger patients (18–29 years: 49.3%) and those with Medicare/Medicaid insurance (65.3%). Although case patients (defined as those with postpartum readmission or severe maternal morbidity) had slightly fewer descriptors overall, they had higher adjusted odds of having them. Black patients were associated with higher odds, and commercial insurance with lower odds, of having negative descriptors.

Negative descriptors appear disproportionately in the notes of Black patients and those with public insurance, suggesting implicit bias in documentation. Addressing biased language is essential for improving equity in maternal care.
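As a quick arithmetic check on the counts reported in the abstract, the short sketch below recomputes the racial breakdown of the 444 patients and derives the overall share of notes and patients with at least one negative descriptor. The two overall prevalence figures are computed here from the reported counts and are not stated in the abstract itself; the helper `pct` and the variable names are made up for this illustration.

```python
# Counts as reported in the abstract.
total_notes = 190_026
total_patients = 9_302
flagged_notes = 719
flagged_patients = 444
by_race = {"Black": 313, "White": 45, "Other": 86}


def pct(part, whole, digits=1):
    """Percentage of `whole` represented by `part`, rounded to `digits`."""
    return round(100 * part / whole, digits)


# The three racial groups should account for every flagged patient.
assert sum(by_race.values()) == flagged_patients

# Share of flagged patients in each racial group.
race_share = {race: pct(n, flagged_patients) for race, n in by_race.items()}

# Derived figures (not stated in the abstract): overall prevalence.
note_prevalence = pct(flagged_notes, total_notes, 2)
patient_prevalence = pct(flagged_patients, total_patients, 1)

print(race_share)
print(note_prevalence, patient_prevalence)
```

The recomputed shares match the reported 70.5%, 10.1%, and 19.4%. These are unadjusted proportions among flagged patients only; the adjusted odds ratios in the study come from a regression model and cannot be reproduced from these counts alone.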

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12566921/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12566921/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12566921/full.md

---
Source: https://tomesphere.com/paper/PMC12566921