Scaling Clinical Trial Matching Using Large Language Models: A Case Study in Oncology
Cliff Wong, Sheng Zhang, Yu Gu, Christine Moung, Jacob Abel, Naoto Usuyama, Roshanthi Weerasinghe, Brian Piening, Tristan Naumann, Carlo Bifulco, Hoifung Poon

TL;DR
This study explores using large language models like GPT-4 to improve clinical trial matching in oncology by structuring eligibility criteria and extracting matching logic, showing promising results and identifying key challenges.
Contribution
It demonstrates the potential of LLMs to enhance clinical trial matching processes and highlights areas for further improvement in accuracy and context handling.
Findings
LLMs can structure complex eligibility criteria effectively.
LLMs outperform previous baselines in matching tasks.
Challenges remain in context understanding and data accuracy.
Abstract
Clinical trial matching is a key process in health delivery and discovery. In practice, it is plagued by overwhelming unstructured data and unscalable manual processing. In this paper, we conduct a systematic study on scaling clinical trial matching using large language models (LLMs), with oncology as the focus area. Our study is grounded in a clinical trial matching system currently in test deployment at a large U.S. health network. Initial findings are promising: out of the box, cutting-edge LLMs, such as GPT-4, can already structure elaborate eligibility criteria of clinical trials and extract complex matching logic (e.g., nested AND/OR/NOT). While still far from perfect, LLMs substantially outperform prior strong baselines and may serve as a preliminary solution to help triage patient-trial candidates with humans in the loop. Our study also reveals a few significant growth areas for…
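To make the "nested AND/OR/NOT" matching logic mentioned in the abstract concrete, here is a minimal sketch of how structured eligibility criteria could be represented and evaluated once extracted. It is an illustration only, not the system described in the paper: the class names (Has, Not, And, Or), the matches function, the attribute strings, and the example trial are all hypothetical.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass(frozen=True)
class Has:
    """Leaf criterion: the patient record contains this attribute."""
    attribute: str


@dataclass(frozen=True)
class Not:
    """Negation of a single criterion (e.g. an exclusion criterion)."""
    operand: object


@dataclass(frozen=True)
class And:
    """All operands must hold."""
    operands: Tuple[object, ...]


@dataclass(frozen=True)
class Or:
    """At least one operand must hold."""
    operands: Tuple[object, ...]


def matches(criterion: object, patient: Set[str]) -> bool:
    """Recursively evaluate a nested criterion against a set of patient attributes."""
    if isinstance(criterion, Has):
        return criterion.attribute in patient
    if isinstance(criterion, Not):
        return not matches(criterion.operand, patient)
    if isinstance(criterion, And):
        return all(matches(c, patient) for c in criterion.operands)
    if isinstance(criterion, Or):
        return any(matches(c, patient) for c in criterion.operands)
    raise TypeError(f"unknown criterion type: {type(criterion).__name__}")


# Hypothetical trial: (breast cancer OR lung cancer) AND NOT brain metastases.
trial = And((
    Or((Has("breast cancer"), Has("lung cancer"))),
    Not(Has("brain metastases")),
))

eligible_patient = {"lung cancer", "stage IV"}
excluded_patient = {"breast cancer", "brain metastases"}

print(matches(trial, eligible_patient))
print(matches(trial, excluded_patient))
```

Under these assumptions, the first patient satisfies the trial logic and the second is ruled out by the NOT clause. A real system would also need to handle missing or uncertain patient information, which this sketch treats simply as absence of the attribute.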
Taxonomy
Topics: Topic Modeling · Machine Learning in Healthcare · Biomedical Text Mining and Ontologies
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Adam · Dense Connections · Label Smoothing · Residual Connection · Dropout · Absolute Position Encodings · Byte Pair Encoding
