AI-assisted Protocol Information Extraction For Improved Accuracy and Efficiency in Clinical Trial Workflows

Ramtin Babaeipour; François Charest; Madison Wright

arXiv:2602.00052 · cs.IR · April 20, 2026

TL;DR

This paper evaluates an AI system using generative LLMs with Retrieval-Augmented Generation for automated clinical trial protocol information extraction, demonstrating improved accuracy and operational efficiency over standalone LLMs.

Contribution

It introduces a clinical-trial-specific RAG process that improves extraction accuracy over standalone LLMs and speeds up protocol extraction workflows.

Findings

01

The RAG process achieves 89.0% extraction accuracy, compared with 62.6% for standalone LLMs with fine-tuned prompts.

02

AI-assisted workflows are completed 40% faster and rated as less cognitively demanding.

03

Users strongly prefer AI-assisted tasks, supporting integration into clinical workflows.

Abstract

Increasing clinical trial protocol complexity, amendments, and challenges around knowledge management create a significant burden for trial teams. Structuring protocol content into standard formats has the potential to improve efficiency, support documentation quality, and strengthen compliance. We evaluate an Artificial Intelligence (AI) system using generative LLMs with Retrieval-Augmented Generation (RAG) for automated clinical trial protocol information extraction. We compare the extraction accuracy of our clinical-trial-specific RAG process against that of publicly available (standalone) LLMs. We also assess the operational impact of AI assistance on simulated Clinical Research Coordinator (CRC) extraction workflows. Our RAG process shows higher extraction accuracy (89.0%) than standalone LLMs with fine-tuned prompts (62.6%) against expert-supported reference annotations. In…
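
The abstract reports extraction accuracy measured against reference annotations but does not say here how a match is scored. As a rough illustration only, the sketch below assumes a per-field exact match after simple normalization; the function names, field names, and toy values are all hypothetical and are not taken from the paper.

```python
def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are not counted as errors."""
    return " ".join(value.lower().split())


def extraction_accuracy(predicted: dict[str, str], reference: dict[str, str]) -> float:
    """Fraction of reference fields whose predicted value matches after normalization.

    Assumption: a field missing from `predicted` counts as incorrect.
    """
    if not reference:
        raise ValueError("reference annotations must not be empty")
    correct = sum(
        1
        for field, expected in reference.items()
        if normalize(predicted.get(field, "")) == normalize(expected)
    )
    return correct / len(reference)


# Hypothetical toy data, not from the paper.
reference = {
    "phase": "Phase 2",
    "sample_size": "120",
    "primary_endpoint": "Overall survival",
    "blinding": "Double-blind",
}
predicted = {
    "phase": "phase  2",                              # matches after normalization
    "sample_size": "120",                             # matches
    "primary_endpoint": "Progression-free survival",  # wrong value
    # "blinding" was not extracted, so it counts as incorrect
}

print(extraction_accuracy(predicted, reference))
```

Exact matching is the strictest reasonable choice; an evaluation with expert reviewers may well accept paraphrased or partially correct answers, which this sketch would mark as wrong.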
