# Application of a natural language processing algorithm to early asthma ascertainment for adults in the era of electronic health records

**Authors:** Chung-Il Wi, Thanai Pongdee, Hee Yun Seol, Sunghwan Sohn, Elham Sagheb, Bhavani Singh Agnikula Kshatriya, Shauna M. Overgaard, Deepak K. Sharma, Sungrim Moon, Elizabeth A. Krusemark, Dave Watson, Sergio E. Chiarella, Miguel A. Park, Jason D. Greenwood, Randy M. Foss, Zhandong Liu, Meera Gupta, Carla M. Davis, Wade Schulz, Hongfang Liu, Young J. Juhn

PMC · DOI: 10.1016/j.jacig.2025.100618 · The Journal of Allergy and Clinical Immunology: Global · 2025-11-26

## TL;DR

A natural language processing algorithm was validated to automatically identify adult asthma cases from electronic health records, showing high accuracy and potential for large-scale clinical use.

## Contribution

The study validates an NLP algorithm for adult asthma identification in EHRs, demonstrating its feasibility and accuracy.

## Key findings

- NLP-PAC identified 98 subjects with asthma, with 89 overlapping with manual chart review.
- Before the new EHR system was implemented, NLP-PAC had a sensitivity of 92% and a specificity of 99%; after implementation, the reported values were 95% and 88%.
- Risk factors for asthma identified by NLP-PAC and manual chart review were similar.

## Abstract

The natural language processing (NLP) algorithm for predetermined asthma criteria (NLP-PAC) was successfully developed and validated for automatically ascertaining pediatric asthma from electronic health record (EHR) systems. A scalable, efficient, and automated tool for ascertaining adult asthma status from EHRs does not yet exist.

We validated NLP-PAC for the ascertainment and early identification of asthma status in adults from their EHRs.

We applied the validated NLP-PAC to the EHRs of a convenience sample (adult cohorts who participated in our previous population-based studies) for which a reference standard (ie, asthma status defined by manual chart review) was available. The performance of NLP-PAC was assessed by determining criterion validity against manual chart review and construct validity before and after the new EHR (Epic) system was implemented in 2018.

The cohort consisted of 1,898 subjects, 43% of whom were male, with a median age at last follow-up of 65 years (interquartile range, 55-76). Manual chart review and NLP-PAC identified 97 (5.1%) and 98 (5.1%) subjects with asthma, respectively, with 89 subjects identified by both methods. The sensitivity, specificity, positive predictive value, and negative predictive value of NLP-PAC were 92%, 99%, 91%, and 99%, respectively, before the new EHR system was implemented, and remained similar after the system was introduced (95%, 88%, 96%, and 85%, respectively). The risk factors for asthma identified by NLP-PAC and by manual chart review were similar.

Automatic asthma ascertainment for adults based on EHR data is feasible with our NLP algorithm, offering immense scientific and clinical value for large-scale clinical research and population management for adult asthma care.
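As an illustration of how criterion validity metrics follow from the counts in the abstract, the sketch below rebuilds a 2x2 table from the pooled numbers (1,898 subjects, 97 asthma cases by manual chart review, 98 by NLP-PAC, 89 by both) and computes sensitivity, specificity, and predictive values. It assumes that manual chart review is the reference standard and that every subject was classified by both methods. The function and class names are hypothetical, and this is a reconstruction from summary counts, not the authors' analysis code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionTable:
    """2x2 table of an index test (NLP-PAC) against a reference (chart review)."""
    tp: int  # asthma by both methods
    fp: int  # asthma by the index test only
    fn: int  # asthma by the reference only
    tn: int  # asthma by neither method


def table_from_counts(total: int, ref_pos: int, test_pos: int, both_pos: int) -> ConfusionTable:
    """Derive the four cells from marginal counts and the overlap (hypothetical helper)."""
    tp = both_pos
    fn = ref_pos - both_pos
    fp = test_pos - both_pos
    tn = total - tp - fn - fp
    return ConfusionTable(tp=tp, fp=fp, fn=fn, tn=tn)


def validity_metrics(t: ConfusionTable) -> dict[str, float]:
    """Standard diagnostic accuracy measures, returned as proportions."""
    return {
        "sensitivity": t.tp / (t.tp + t.fn),
        "specificity": t.tn / (t.tn + t.fp),
        "ppv": t.tp / (t.tp + t.fp),
        "npv": t.tn / (t.tn + t.fn),
    }


# Pooled counts taken from the abstract; period-specific counts are not given there.
table = table_from_counts(total=1898, ref_pos=97, test_pos=98, both_pos=89)
result = validity_metrics(table)

for name, value in result.items():
    print(f"{name}: {value:.1%}")
```

The pooled values computed this way are close to the pre-implementation figures quoted in the abstract, but they are not necessarily the same calculation: the abstract reports metrics separately for the periods before and after the Epic implementation, and the counts for each period are not given in this summary.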

## Linked entities

- **Diseases:** asthma (MONDO:0004979)

## Full-text entities

- **Diseases:** asthma (MESH:D001249)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12769801/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12769801/full.md

## References

56 references — full list in the complete paper: https://tomesphere.com/paper/PMC12769801/full.md

---
Source: https://tomesphere.com/paper/PMC12769801