Extracting Post-Acute Sequelae of SARS-CoV-2 Infection Symptoms from Clinical Notes via Hybrid Natural Language Processing

Zilong Bai; Zihan Xu; Cong Sun; Chengxi Zang; H. Timothy Bunnell; Catherine Sinfield; Jacqueline Rutter; Aaron Thomas Martinez; L. Charles Bailey; Mark Weiner; Thomas R. Campion; Thomas Carton; Christopher B. Forrest; Rainu Kaushal; Fei Wang; Yifan Peng

arXiv:2508.12405·cs.CL·August 19, 2025

TL;DR

This paper presents a hybrid NLP pipeline that combines rule-based named entity recognition with BERT-based assertion detection to extract PASC symptoms from clinical notes, developed and evaluated on notes from 11 health systems.

Contribution

We developed a hybrid NLP approach built on a comprehensive PASC lexicon curated with clinical specialists, and validated it in one-site internal and 10-site external evaluations, reporting both assertion-detection accuracy and per-note processing time for symptom extraction from clinical notes.

Findings

01

Achieved an average F1 score for assertion detection of 0.82 in one-site internal validation and 0.76 in 10-site external validation.

02

Processed notes in approximately 2.45 seconds each.

03

Showed strong correlation ($\rho > 0.83$) between model mentions and clinical diagnoses.
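
The two kinds of metric reported above can be illustrated with a small, dependency-free sketch. It makes assumptions the page does not confirm: that the "average" F1 is an unweighted mean of per-label F1 scores, and that $\rho$ denotes Spearman's rank correlation. All labels and numbers in the example are invented toy data, not results from the paper.

```python
def f1_for_label(gold, pred, label):
    """F1 for one assertion label, from parallel lists of gold and predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: precision or recall is zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(gold, pred):
    """Unweighted mean of per-label F1 (assumed meaning of 'average F1')."""
    labels = sorted(set(gold))
    return sum(f1_for_label(gold, pred, lab) for lab in labels) / len(labels)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one input is constant")
    return cov / (sx * sy)


# Toy data only (hypothetical): assertion labels for four symptom mentions.
gold = ["positive", "positive", "negative", "negative"]
pred = ["positive", "negative", "negative", "negative"]
toy_f1 = macro_f1(gold, pred)

# Toy data only (hypothetical): per-symptom mention counts vs. diagnosis counts.
mention_counts = [1, 2, 3, 4, 5]
diagnosis_counts = [2, 1, 4, 3, 5]
toy_rho = spearman_rho(mention_counts, diagnosis_counts)

print(f"macro F1 = {toy_f1:.3f}, Spearman rho = {toy_rho:.3f}")
```

Computing the correlation on ranks rather than raw counts means only the ordering of symptoms matters, which is useful when mention counts and diagnosis counts sit on very different scales.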

Abstract

Accurately and efficiently diagnosing Post-Acute Sequelae of COVID-19 (PASC) remains challenging due to its myriad symptoms that evolve over long and variable time intervals. To address this issue, we developed a hybrid natural language processing pipeline that integrates rule-based named entity recognition with BERT-based assertion detection modules for PASC-symptom extraction and assertion detection from clinical notes. We developed a comprehensive PASC lexicon with clinical specialists. From 11 health systems of the RECOVER initiative network across the U.S., we curated 160 intake progress notes for model development and evaluation, and collected 47,654 progress notes for a population-level prevalence study. We achieved an average F1 score of 0.82 in one-site internal validation and 0.76 in 10-site external validation for assertion detection. Our pipeline processed each note at…
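
The two-stage design described in the abstract (lexicon-driven entity recognition followed by assertion detection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the three-entry `LEXICON` and the `NEGATION_CUES` list are hypothetical, and a simple negation-cue heuristic stands in for the BERT-based assertion module so that the example runs without any model weights.

```python
import re
from dataclasses import dataclass

# Hypothetical mini-lexicon (canonical symptom -> surface forms).
# The paper's actual PASC lexicon is not reproduced here.
LEXICON = {
    "fatigue": ["fatigue", "tiredness", "exhaustion"],
    "dyspnea": ["shortness of breath", "dyspnea"],
    "cough": ["cough"],
}

# Hypothetical negation cues. This heuristic is only a stand-in for the
# BERT-based assertion detection module described in the abstract.
NEGATION_CUES = ["no", "denies", "without", "negative for"]


@dataclass
class Mention:
    symptom: str    # canonical lexicon entry
    text: str       # surface form found in the note
    start: int      # character offset where the mention begins
    end: int        # character offset just past the mention
    assertion: str  # "positive" or "negative"


def compile_terms(terms):
    """Build a case-insensitive whole-word pattern; longer terms are tried first."""
    ordered = sorted(terms, key=len, reverse=True)
    body = "|".join(re.escape(t) for t in ordered)
    return re.compile(r"\b(?:" + body + r")\b", re.IGNORECASE)


PATTERNS = {symptom: compile_terms(forms) for symptom, forms in LEXICON.items()}
NEGATION_RE = compile_terms(NEGATION_CUES)


def find_mentions(note):
    """Stage 1, rule-based NER: return (symptom, start, end) for each lexicon hit."""
    hits = []
    for symptom, pattern in PATTERNS.items():
        for match in pattern.finditer(note):
            hits.append((symptom, match.start(), match.end()))
    return sorted(hits, key=lambda hit: hit[1])


def classify_assertion(note, start):
    """Stage 2 stand-in: negative if a cue precedes the mention in the same sentence."""
    sentence_start = max(note.rfind(".", 0, start), note.rfind("\n", 0, start)) + 1
    context = note[sentence_start:start]
    return "negative" if NEGATION_RE.search(context) else "positive"


def extract(note):
    """Run both stages and return one Mention per lexicon hit, in note order."""
    return [
        Mention(symptom, note[start:end], start, end, classify_assertion(note, start))
        for symptom, start, end in find_mentions(note)
    ]


note = "Patient reports fatigue and shortness of breath. Denies cough."
results = [(m.symptom, m.assertion) for m in extract(note)]
print(results)
```

Keeping the two stages behind separate functions means the heuristic in `classify_assertion` could be swapped for a trained classifier without touching the lexicon matching, which is the separation of concerns that a hybrid pipeline relies on.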

Taxonomy

Topics: Topic Modeling · Sentiment Analysis and Opinion Mining · Mental Health via Writing