Automated Extraction of Fluoropyrimidine Treatment and Treatment-Related Toxicities from Clinical Notes Using Natural Language Processing
Xizhi Wu, Madeline S. Kreider, Philip E. Empey, Chenyu Li, Yanshan Wang

TL;DR
This study compares various NLP methods, including large language models, for extracting treatment and toxicity information related to fluoropyrimidines from clinical notes, demonstrating LLMs' superior performance.
Contribution
It introduces a comprehensive evaluation of NLP approaches, especially LLMs, for extracting oncology treatment and toxicity data from unstructured clinical notes.
Findings
Error-analysis prompting achieved perfect F1 scores (1.000) for treatment and toxicities.
Zero-shot prompting reached F1=1.000 for treatment and 0.876 for toxicities.
LLM-based approaches outperformed traditional machine learning and deep learning methods.
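For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a binary F1 score is computed from gold and predicted labels. The function name `f1_score` and the toy labels are illustrative assumptions; the summary does not say whether the paper averages F1 per category (micro or macro), so this shows only the single-class case.

```python
def f1_score(gold, pred, positive=1):
    """Binary F1 for one category (hypothetical helper, not the paper's code)."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)

    # Guard against empty denominators when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy labels (made up for illustration): 1 = toxicity mentioned, 0 = not mentioned.
perfect = f1_score([1, 1, 0, 0], [1, 1, 0, 0])
partial = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```

An F1 of 1.000 means that every positive note in the test set was found with no false positives, so on a small test set a handful of notes can move the score noticeably.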
Abstract
Objective: Fluoropyrimidines are widely prescribed for colorectal and breast cancers, but are associated with toxicities such as hand-foot syndrome and cardiotoxicity. Since toxicity documentation is often embedded in clinical notes, we aimed to develop and evaluate natural language processing (NLP) methods to extract treatment and toxicity information. Materials and Methods: We constructed a gold-standard dataset of 236 clinical notes from 204,165 adult oncology patients. Domain experts annotated categories related to treatment regimens and toxicities. We developed rule-based, machine learning-based (Random Forest, Support Vector Machine [SVM], Logistic Regression [LR]), deep learning-based (BERT, ClinicalBERT), and large language model (LLM)-based NLP approaches (zero-shot and error-analysis prompting). Models used an 80:20 train-test split. Results: Sufficient data existed to…
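The 80:20 train-test split mentioned in the abstract can be sketched as below. This is an illustration only: the helper name `split_notes`, the fixed seed, and the rounding rule are assumptions, and the abstract excerpt does not say whether the authors stratified by label or grouped notes by patient.

```python
import random


def split_notes(notes, test_fraction=0.2, seed=0):
    """Shuffle and split items into (train, test); hypothetical helper."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(notes)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Placeholder note IDs standing in for the 236 annotated notes.
note_ids = list(range(236))
train_ids, test_ids = split_notes(note_ids)
```

With this rounding rule, 236 items yield 189 training and 47 test items; the paper's actual counts are not given in the excerpt and may differ.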
Taxonomy
Topics: Pharmacovigilance and Adverse Drug Reactions · Colorectal Cancer Treatments and Studies · Computational Drug Discovery Methods
