A Large Language Model Outperforms Other Computational Approaches to the High-Throughput Phenotyping of Physician Notes
Syed I. Munzir, Daniel B. Hier, Chelsea Oommen, Michael D. Carrithers

TL;DR
This paper demonstrates that a Large Language Model, specifically GPT-4, outperforms traditional NLP and hybrid methods in automating the extraction of phenotypic information from physician notes, advancing precision medicine.
Contribution
The study introduces and evaluates a Large Language Model approach for high-throughput phenotyping, showing its superiority over existing computational methods.
Findings
GPT-4 achieved higher accuracy in phenotyping tasks than the NLP and hybrid approaches it was compared against.
The results suggest that Large Language Models may become the preferred method over traditional NLP for phenotyping physician notes.
The approach enhances automation in processing electronic health records.
Abstract
High-throughput phenotyping, the automated mapping of patient signs and symptoms to standardized ontology concepts, is essential to gaining value from electronic health records (EHR) in the support of precision medicine. Despite technological advances, high-throughput phenotyping remains a challenge. This study compares three computational approaches to high-throughput phenotyping: a Large Language Model (LLM) incorporating generative AI, a Natural Language Processing (NLP) approach utilizing deep learning for span categorization, and a hybrid approach combining word vectors with machine learning. The approach that implemented GPT-4 (a Large Language Model) demonstrated superior performance, suggesting that Large Language Models are poised to be the preferred method for high-throughput phenotyping of physician notes.
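To make the mapping step in the abstract concrete, here is a minimal sketch of turning extracted sign and symptom phrases into ontology concepts by dictionary lookup. It is not any of the three approaches the study compares. The ontology entries, the "EX:" identifiers, the synonym table, and the function names are all hypothetical placeholders chosen for illustration.

```python
import re

# Hypothetical mini-ontology: preferred label -> placeholder concept ID.
# The "EX:" IDs are illustrative only, not codes from a real ontology.
ONTOLOGY = {
    "ataxia": "EX:0001",
    "headache": "EX:0002",
    "muscle weakness": "EX:0003",
}

# Hypothetical synonym table: surface form -> preferred label.
SYNONYMS = {
    "unsteady gait": "ataxia",
    "weakness": "muscle weakness",
}


def normalize(phrase: str) -> str:
    """Lowercase, trim, and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", phrase.lower().strip())


def map_to_concepts(phrases):
    """Return (original phrase, matched label or None, concept ID or None)."""
    results = []
    for phrase in phrases:
        key = normalize(phrase)
        label = SYNONYMS.get(key, key)      # fall back to the phrase itself
        concept_id = ONTOLOGY.get(label)    # None when no concept matches
        results.append((phrase, label if concept_id else None, concept_id))
    return results


if __name__ == "__main__":
    for row in map_to_concepts(["Unsteady  gait", "Headache", "rash"]):
        print(row)
```

An exact-match lookup like this misses paraphrases, misspellings, and negated findings, which is one reason the task remains difficult at scale.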
Taxonomy
Topics: Biomedical Text Mining and Ontologies · AI in cancer detection · Electronic Health Records Systems
Methods: Attention Is All You Need · Softmax · Layer Normalization · Absolute Position Encodings · Byte Pair Encoding · Label Smoothing · Position-Wise Feed-Forward Layer · Dropout · Adam · Linear Layer
