A Large Language Model Outperforms Other Computational Approaches to the High-Throughput Phenotyping of Physician Notes

Syed I. Munzir; Daniel B. Hier; Chelsea Oommen; Michael D. Carrithers

arXiv:2406.14757·cs.AI·June 24, 2024·1 cites

TL;DR

This paper demonstrates that a Large Language Model, specifically GPT-4, outperforms traditional NLP and hybrid methods in automating the extraction of phenotypic information from physician notes, advancing precision medicine.

Contribution

The study introduces and evaluates a Large Language Model approach for high-throughput phenotyping, showing its superiority over existing computational methods.

Findings

01. GPT-4 achieved higher accuracy in phenotyping tasks.

02. Large Language Models outperform traditional NLP methods.

03. The approach enhances automation in processing electronic health records.

Abstract

High-throughput phenotyping, the automated mapping of patient signs and symptoms to standardized ontology concepts, is essential to gaining value from electronic health records (EHR) in the support of precision medicine. Despite technological advances, high-throughput phenotyping remains a challenge. This study compares three computational approaches to high-throughput phenotyping: a Large Language Model (LLM) incorporating generative AI, a Natural Language Processing (NLP) approach utilizing deep learning for span categorization, and a hybrid approach combining word vectors with machine learning. The approach that implemented GPT-4 (a Large Language Model) demonstrated superior performance, suggesting that Large Language Models are poised to be the preferred method for high-throughput phenotyping of physician notes.
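To make the task definition concrete, the following is a minimal sketch of what mapping signs and symptoms to ontology concepts can look like in its simplest form. It is not any of the three approaches the study compares: it is a plain phrase lookup, and the ontology entries, the `EX:` identifiers, and the function name `phenotype_note` are all hypothetical placeholders chosen for illustration.

```python
import re

# Hypothetical toy ontology: surface phrase -> (concept ID, preferred label).
# The "EX:" identifiers are placeholders, not codes from a real ontology.
TOY_ONTOLOGY = {
    "ataxia": ("EX:0001", "Ataxia"),
    "unsteady gait": ("EX:0001", "Ataxia"),
    "tremor": ("EX:0002", "Tremor"),
    "weakness": ("EX:0003", "Muscle weakness"),
}


def phenotype_note(note: str) -> list[tuple[str, str]]:
    """Return the sorted, de-duplicated concepts whose phrases occur in the note."""
    text = note.lower()
    found = set()
    for phrase, concept in TOY_ONTOLOGY.items():
        # Word-boundary match so a phrase does not fire inside a longer token.
        if re.search(r"\b" + re.escape(phrase) + r"\b", text):
            found.add(concept)
    return sorted(found)


if __name__ == "__main__":
    note = "Patient reports unsteady gait and a resting tremor; no weakness."
    for concept_id, label in phenotype_note(note):
        print(concept_id, label)
```

A lookup like this ignores negation (the "no weakness" in the sample note still matches), synonyms outside the table, and inflected forms, which gives some sense of why the abstract describes the task as still challenging.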

Taxonomy

Topics: Biomedical Text Mining and Ontologies · AI in cancer detection · Electronic Health Records Systems

Methods: Attention Is All You Need · Softmax · Layer Normalization · Absolute Position Encodings · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam · Linear Layer