Exploring Large Language Models for Specialist-level Oncology Care
Anil Palepu, Vikram Dhillon, Polly Niravath, Wei-Hung Weng, Preethi Prasad, Khaled Saab, Ryutaro Tanno, Yong Cheng, Hanh Mai, Ethan Burns, Zainub Ajmal, Kavita Kulkarni, Philip Mansfield, Dale Webster, Joelle Barral, Juraj Gottweis, Mike Schaekermann, S. Sara Mahdavi

TL;DR
This study evaluates AMIE, a large language model with web search capabilities, on specialist breast oncology care. AMIE outperforms trainees but still lags behind expert oncologists, which shows both its potential and its current limitations.
Contribution
The paper demonstrates the application of an enhanced LLM in a complex oncology domain without domain-specific fine-tuning, using a novel evaluation framework with synthetic cases.
Findings
AMIE outperforms trainees and fellows in management planning
Web retrieval and self-critique improve LLM responses
AMIE's performance is below that of expert oncologists
Abstract
Large language models (LLMs) have shown remarkable progress in encoding clinical knowledge and responding to complex medical queries with appropriate clinical reasoning. However, their applicability in subspecialist or complex medical settings remains underexplored. In this work, we probe the performance of AMIE, a research conversational diagnostic AI system, in the subspecialist domain of breast oncology care without specific fine-tuning to this challenging domain. To perform this evaluation, we curated a set of 50 synthetic breast cancer vignettes representing a range of treatment-naive and treatment-refractory cases and mirroring the key information available to a multidisciplinary tumor board for decision-making (openly released with this work). We developed a detailed clinical rubric for evaluating management plans, including axes such as the quality of case summarization, safety…
Taxonomy
Topics: Radiomics and Machine Learning in Medical Imaging · Artificial Intelligence in Healthcare and Education
Methods: Sparse Evolutionary Training
