A foundation model for human-AI collaboration in medical literature mining
Zifeng Wang, Lang Cao, Qiao Jin, Joey Chan, Nicholas Wan, Behdad Afzali, Hyun-Jin Cho, Chang-In Choi, Mehdi Emamverdi, Manjot K. Gill, Sun-Hyung Kim, Yijia Li, Yi Liu, Hanley Ong, Justin Rousseau, Irfan Sheikh, Jenny J. Wei, Ziyang Xu, Christopher M. Zallek, Kyungsang Kim

TL;DR
LEADS, a specialized AI foundation model, significantly improves medical literature mining by enhancing accuracy and efficiency in study selection and data extraction tasks, outperforming generic models and streamlining expert workflows.
Contribution
We introduce LEADS, a novel foundation model trained on extensive medical literature data, tailored for study search, screening, and data extraction, demonstrating superior performance over generic models.
Findings
LEADS improves recall in study selection to 0.81 from 0.77.
LEADS reduces data extraction time by 26.9%.
Experts using LEADS achieve higher accuracy in data extraction.
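To put the recall figures above in context, here is a minimal sketch of how recall is computed and how the reported change from 0.77 to 0.81 translates into a relative gain. The function names and the per-100 counts are hypothetical, chosen only to reproduce the two reported values; the summary does not give the underlying study counts.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly relevant studies that were selected."""
    relevant = true_positives + false_negatives
    if relevant == 0:
        raise ValueError("recall is undefined when there are no relevant studies")
    return true_positives / relevant


def relative_change(before: float, after: float) -> float:
    """Change from `before` to `after`, expressed as a fraction of `before`."""
    return (after - before) / before


# Hypothetical counts (assumption): picked only so the ratios match
# the recall values reported in the findings.
baseline = recall(true_positives=77, false_negatives=23)    # 0.77
with_leads = recall(true_positives=81, false_negatives=19)  # 0.81

print(f"absolute gain: {with_leads - baseline:.2f}")                   # absolute gain: 0.04
print(f"relative gain: {relative_change(baseline, with_leads):.1%}")  # relative gain: 5.2%
```

Under these assumed counts, the improvement corresponds to four more relevant studies found per hundred, or roughly a 5% relative gain in recall.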
Abstract
Systematic literature review is essential for evidence-based medicine, requiring comprehensive analysis of clinical trial publications. However, the application of artificial intelligence (AI) models for medical literature mining has been limited by insufficient training and evaluation across broad therapeutic areas and diverse tasks. Here, we present LEADS, an AI foundation model for study search, screening, and data extraction from medical literature. The model is trained on 633,759 instruction data points in LEADSInstruct, curated from 21,335 systematic reviews, 453,625 clinical trial publications, and 27,015 clinical trial registries. We showed that LEADS demonstrates consistent improvements over four cutting-edge generic large language models (LLMs) on six tasks. Furthermore, LEADS enhances expert workflows by providing supportive references following expert requests, streamlining…
