Automating clinical phenotyping using natural language processing
Linea Schmidt, Susanne Ibing, Florian Borchert, Julian Hugo, Allison A. Marshall, Jellyana Peraza, Judy H. Cho, Erwin P. Böttinger, Bernhard Y. Renard, Ryan C. Ungaro

TL;DR
This study compares rule-based NLP and GPT-4 for extracting Crohn’s disease features from clinical notes, showing high accuracy and potential to automate chart reviews.
Contribution
The first study to explore LLM-based phenotyping for Crohn’s sub-phenotypes using sentence-level datasets and direct comparison with rule-based methods.
Findings
At the note level, GPT-4 achieved F1 scores of at least 0.90 for disease behavior and 0.82 for age at diagnosis.
Combining rule-based and LLM approaches improved precision and enabled prioritization of chart reviews.
Performance was comparable to human experts with no statistically significant difference.
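The second finding, that combining the two approaches improves precision and helps prioritize chart review, can be illustrated with a small sketch. It assumes one simple combination rule that this summary does not confirm as the authors' method: a label is accepted only when the rule-based and LLM outputs agree, and any disagreement sends the note to manual review. The names `Triage`, `combine`, and `review_queue`, and the behavior labels `"B2"` and `"B3"`, are hypothetical.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Triage(NamedTuple):
    label: Optional[str]   # accepted label, or None if nothing was accepted
    needs_review: bool     # True when a human should look at the note


def combine(rule_label: Optional[str], llm_label: Optional[str]) -> Triage:
    """Accept a label only when both extractors agree (assumed rule)."""
    if rule_label == llm_label:
        # Agreement, including the case where both found nothing.
        return Triage(rule_label, False)
    # Disagreement: accept nothing automatically and flag for review.
    return Triage(None, True)


def review_queue(notes: Dict[str, Tuple[Optional[str], Optional[str]]]) -> List[str]:
    """Return the IDs of notes whose two outputs disagree, in sorted order."""
    return sorted(
        note_id
        for note_id, (rule_label, llm_label) in notes.items()
        if combine(rule_label, llm_label).needs_review
    )


if __name__ == "__main__":
    example = {
        "note-1": ("B2", "B2"),   # both agree: accepted automatically
        "note-2": ("B2", "B3"),   # conflict: manual review
        "note-3": (None, "B3"),   # only the LLM found a label: manual review
    }
    print(review_queue(example))
```

Requiring agreement trades recall for precision: fewer labels are accepted automatically, but the accepted ones are supported by two independent methods, and the disagreement list gives reviewers a natural starting point.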
Abstract
Real-world studies based on electronic health records often require manual chart review to derive patients’ clinical phenotypes, a labor-intensive task with limited scalability. Here, we developed and compared computable phenotyping based on rules and a Large Language Model (LLM), GPT-4, for sub-phenotyping of patients with Crohn’s disease, considering age at diagnosis and disease behavior. For our rule-based approach, we leveraged the spaCy framework and for the LLM-based approach, we used the GPT-4 model. The underlying data included 49,572 clinical notes and 2,204 radiology reports from 584 Crohn’s disease patients. A test set of 280 clinical texts was labeled at sentence-level, in addition to patient-level ground truth data. The algorithms were evaluated based on their recall, precision, specificity values, and F1 scores. Overall, we observe similar or…
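For readers unfamiliar with the four evaluation metrics named in the abstract, the following is a minimal sketch of how they are computed from binary gold and predicted labels. It is not the authors' evaluation code; the names `Confusion`, `confusion_from_labels`, and `metrics` are hypothetical, and a metric with a zero denominator is reported as 0.0 by assumption.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class Confusion:
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def confusion_from_labels(gold: Sequence[int], pred: Sequence[int]) -> Confusion:
    """Count the confusion-matrix cells for binary labels (1 = positive)."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    tn = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 0)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)
    return Confusion(tp, fp, tn, fn)


def _safe_div(num: float, den: float) -> float:
    # Assumption: report 0.0 when the denominator is zero.
    return num / den if den else 0.0


def metrics(c: Confusion) -> Dict[str, float]:
    precision = _safe_div(c.tp, c.tp + c.fp)    # share of predicted positives that are correct
    recall = _safe_div(c.tp, c.tp + c.fn)       # share of true positives that were found
    specificity = _safe_div(c.tn, c.tn + c.fp)  # share of true negatives that were kept negative
    f1 = _safe_div(2 * precision * recall, precision + recall)  # harmonic mean of precision and recall
    return {
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    gold = [1, 1, 0, 0, 1]
    pred = [1, 0, 0, 1, 1]
    print(metrics(confusion_from_labels(gold, pred)))
```

The same functions apply at sentence, note, or patient level; only the unit that each label refers to changes.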
Taxonomy
Topics: Machine Learning in Healthcare · Electronic Health Records Systems · Artificial Intelligence in Healthcare and Education
