Retrieval-Augmented LLMs for Evidence Localization in Clinical Trial Recruitment from Longitudinal EHR Narratives
Ziyi Chen, Mengxian Lyu, Cheng Peng, Yonghui Wu

TL;DR
This study evaluates encoder- and decoder-based large language models (LLMs), together with three long-document strategies, for screening clinical trial candidates from long electronic health record narratives. Generative LLMs paired with retrieval-augmented generation (RAG) outperform the other methods.
Contribution
The paper systematically compares encoder- and decoder-based LLMs, explores strategies for handling long documents, and reports state-of-the-art results on a clinical trial screening benchmark.
Findings
MedGemma with RAG achieved a micro-F1 score of 89.05%.
Generative LLMs excel at long-term reasoning over evidence spread across lengthy documents.
Real-world adoption should take the specific eligibility criteria into account when selecting an LLM strategy.
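For readers unfamiliar with the metric, micro-F1 pools true positives, false positives, and false negatives across all eligibility criteria before computing precision and recall, so frequent criteria weigh more than rare ones. The sketch below is illustrative only: the criterion names and labels are made up, "met" (1) is treated as the positive class, and the benchmark's official scorer may aggregate differently.

```python
from typing import Dict, List


def micro_f1(gold: Dict[str, List[int]], pred: Dict[str, List[int]]) -> float:
    """Micro-averaged F1 over all criteria, with 1 = 'met' as the positive class.

    `gold` and `pred` map a criterion name to one 0/1 label per patient.
    The dictionary layout is an assumption made for this example.
    """
    tp = fp = fn = 0
    for criterion, gold_labels in gold.items():
        for g, p in zip(gold_labels, pred[criterion]):
            if p == 1 and g == 1:
                tp += 1          # correctly predicted 'met'
            elif p == 1 and g == 0:
                fp += 1          # predicted 'met', actually 'not met'
            elif p == 0 and g == 1:
                fn += 1          # missed a 'met' label
    if tp == 0:
        return 0.0               # avoids division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for two criteria and three patients.
gold = {"criterion_a": [1, 0, 1], "criterion_b": [1, 1, 0]}
pred = {"criterion_a": [1, 1, 0], "criterion_b": [1, 0, 0]}
score = micro_f1(gold, pred)
```

Here the pooled counts are 2 true positives, 1 false positive, and 2 false negatives, so the score equals 4/7.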
Abstract
Screening patients for enrollment is a well-known, labor-intensive bottleneck that leads to under-enrollment and, ultimately, trial failures. Recent breakthroughs in large language models (LLMs) offer a promising opportunity to use artificial intelligence to improve screening. This study systematically explored both encoder- and decoder-based generative LLMs for screening clinical narratives to facilitate clinical trial recruitment. We examined both general-purpose LLMs and medical-adapted LLMs and explored three strategies to alleviate the "Lost in the Middle" issue when handling long documents: (1) Original long-context: using the default context windows of LLMs; (2) NER-based extractive summarization: converting the long document into summaries using named entity recognition; (3) RAG: dynamic evidence retrieval based on eligibility criteria. The 2018 N2C2 Track 1…
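The third strategy, retrieving evidence per eligibility criterion, can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: it swaps in a plain bag-of-words cosine similarity where a real system would likely use a dense embedding model, and the function names, chunk size, and overlap are all hypothetical choices. The idea is to split the long record into overlapping chunks, score each chunk against the criterion text, and pass only the top-scoring chunks to the LLM so that relevant evidence does not get lost in the middle of a long context.

```python
import math
import re
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_words(text: str, size: int = 40, overlap: int = 10) -> List[str]:
    """Split text into overlapping word windows (size and overlap are assumptions)."""
    words = text.split()
    step = size - overlap
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[i:i + size]) for i in starts if words[i:i + size]]


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(criterion: str, note: str, k: int = 2) -> List[Tuple[float, str]]:
    """Return the k chunks of `note` most similar to the criterion text."""
    query = Counter(tokenize(criterion))
    scored = [(cosine(query, Counter(tokenize(c))), c) for c in chunk_words(note)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical long note with one relevant sentence buried in the middle.
note = "filler " * 100 + "hemoglobin a1c was 8.2 percent " + "filler " * 100
top = retrieve("hemoglobin a1c value", note, k=1)
```

The retrieved chunks would then be placed in the prompt alongside the criterion, in place of the full record.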
