A Large Language Model Pipeline for Breast Cancer Oncology

Tristen Pool; Dennis Trujillo

arXiv:2406.06455·cs.AI·June 17, 2024

Open Access

TL;DR

This paper fine-tunes large language models for breast cancer treatment decision support, achieving accuracy above 0.85 in classifying adjuvant radiation therapy and chemotherapy, and estimates how often such a model would need to outperform human oncologists to be the better option overall.

Contribution

It introduces a novel Langchain prompt engineering pipeline for fine-tuning OpenAI LLMs on a clinical dataset and a clinical guidelines text corpus, targeting the classification of adjuvant radiation therapy and chemotherapy for breast cancer patients.

Findings

01 Achieved an accuracy of 0.85 or higher in classifying adjuvant radiation therapy and chemotherapy for breast cancer patients.

02 Estimated that the model must outperform the original oncologist in 8.2% to 13.3% of scenarios to be a better solution overall.

03 Proposed a framework for assessing LLMs' clinical decision-making performance.

Abstract

Large language models (LLMs) have demonstrated potential in the innovation of many disciplines. However, how best to develop them for oncology remains underexplored. State-of-the-art OpenAI models were fine-tuned on a clinical dataset and clinical guidelines text corpus for two important cancer treatment factors, adjuvant radiation therapy and chemotherapy, using a novel Langchain prompt engineering pipeline. A high accuracy (0.85+) was achieved in the classification of adjuvant radiation therapy and chemotherapy for breast cancer patients. Furthermore, a confidence interval was formed from observational data on the quality of treatment from human oncologists. It estimates the proportion of scenarios in which the model must outperform the original oncologist in its treatment prediction to be a better solution overall as 8.2% to 13.3%. Due to indeterminacy in the outcomes of cancer…

Taxonomy

Topics: Biomedical Text Mining and Ontologies