# Using Large Language Models for Advanced and Flexible Labelling of Protocol Deviations in Clinical Development

**Authors:** Min Zou, Leszek Popko, Michelle Gaudio

PMC · DOI: 10.1007/s43441-025-00785-z · Therapeutic Innovation & Regulatory Science · 2025-05-13

## TL;DR

This paper introduces a new method using large language models to automatically classify protocol deviations in clinical trials, improving efficiency and accuracy.

## Contribution

The novel contribution is a generalizable framework using LLMs with tailored prompts for classifying protocol deviations in clinical development.

## Key findings

- The LLM-based approach flagged over 80% of protocol deviations potentially affecting disease progression assessment, so that experts could review them.
- The automated method provided actionable insights in minutes, compared to months of manual analysis.
- The solution identified gaps in first-line controls, aiding process improvement in clinical trials.

## Abstract

As described in ICH E3 Q&A R1 (International Council for Harmonisation. E3: Structure and content of clinical study reports—questions and answers (R1). 6 July 2012. Available from: https://database.ich.org/sites/default/files/E3_Q%26As_R1_Q%26As.pdf): “A protocol deviation (PD) is any change, divergence, or departure from the study design or procedures defined in the protocol”. A problematic area in human subject protection is the wide divergence among institutions, sponsors, investigators and IRBs regarding the definition of and the procedures for reviewing PDs. Despite industry initiatives like TransCelerate’s holistic approach [Galuchie et al. in Ther Innov Regul Sci 55:733–742, 2021], systematic trending and identification of impactful PDs remains limited. Traditional Natural Language Processing (NLP) methods are often cumbersome to implement, requiring extensive feature engineering and model tuning. However, the rise of Large Language Models (LLMs) has revolutionised text classification, enabling more accurate, nuanced, and context-aware solutions [Nguyen P. Text classification in the age of LLMs. 2024. Available from: https://blog.redsift.com/author/phong/]. An automated classification solution that enables efficient, flexible, and targeted PD classification is currently lacking.

We developed a novel approach using a large language model (LLM), Meta Llama 2 [Meta. Llama 2: Open source, free for research and commercial use. 2023. Available from: https://www.llama.com/llama2/], with a tailored prompt to classify free-text PDs from Roche’s PD management system. The model outputs were analysed to identify trends and assess risks across clinical programs, supporting human decision-making. This method offers a generalisable framework for developing prompts and integrating data to address similar challenges in clinical development.
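The abstract does not reproduce the authors' prompt or pipeline, so the following is only a minimal sketch of the general pattern it describes: wrap each free-text PD in a fixed prompt, ask the model for one label, parse the answer defensively, and count flagged PDs per program. The label names, the prompt wording, the `keyword_stub` stand-in for a real Llama 2 call, and the function names are all assumptions made for illustration.

```python
from collections import Counter
from typing import Callable, Iterable, List, Tuple

# Hypothetical label set; the paper's actual categories are not given here.
LABELS = ("AFFECTS_PROGRESSION_ASSESSMENT", "OTHER")

# Hypothetical prompt wording, not the authors' tailored prompt.
PROMPT_TEMPLATE = (
    "You are reviewing clinical trial protocol deviations (PDs).\n"
    "Decide whether the PD below could affect disease progression assessment.\n"
    "Answer with exactly one label: {labels}.\n\n"
    "PD description: {text}\n"
    "Label:"
)


def build_prompt(pd_text: str) -> str:
    """Insert one free-text PD description into the fixed prompt."""
    return PROMPT_TEMPLATE.format(labels=" or ".join(LABELS), text=pd_text.strip())


def parse_label(raw_output: str) -> str:
    """Accept the first token only if it is a known label, else ask for review."""
    tokens = raw_output.strip().upper().split()
    first = tokens[0].strip(".,:") if tokens else ""
    return first if first in LABELS else "NEEDS_REVIEW"


def classify(pd_texts: Iterable[str], generate: Callable[[str], str]) -> List[str]:
    """Classify each PD; `generate` is any function mapping a prompt to model text."""
    return [parse_label(generate(build_prompt(text))) for text in pd_texts]


def flagged_by_program(records: List[Tuple[str, str]],
                       generate: Callable[[str], str]) -> Counter:
    """Count flagged PDs per program from (program, pd_text) records."""
    labels = classify((text for _, text in records), generate)
    return Counter(prog for (prog, _), lab in zip(records, labels) if lab == LABELS[0])


def keyword_stub(prompt: str) -> str:
    """Stand-in for a real LLM call so the sketch runs offline."""
    description = prompt.split("PD description:", 1)[1].lower()
    return LABELS[0] if "scan" in description else LABELS[1]
```

In practice `keyword_stub` would be replaced by a call to whichever model endpoint is available, and anything parsed as `NEEDS_REVIEW` would go to a human reviewer, in keeping with the paper's emphasis on supporting rather than replacing human decision-making.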

This approach flagged over 80% of PDs potentially affecting disease progression assessment, enabling expert review. Compared to months of manual analysis, this automated method produced actionable insights in minutes. The solution also highlighted gaps in first-line controls, supporting process improvement and better accuracy in disease progression handling during trials.
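One way to read the "over 80%" figure is as a flag rate: the share of PDs that experts judged relevant which the model also flagged. The abstract does not state how the figure was computed, so the sketch below is an assumption about the metric, and the PD identifiers are toy values, not study data.

```python
from typing import Iterable


def flag_rate(expert_relevant: Iterable[str], model_flagged: Iterable[str]) -> float:
    """Fraction of expert-identified PDs that the model also flagged."""
    relevant = set(expert_relevant)
    if not relevant:
        raise ValueError("expert_relevant must contain at least one PD id")
    return len(relevant & set(model_flagged)) / len(relevant)


# Toy identifiers for illustration only.
expert_ids = ["PD-1", "PD-2", "PD-3", "PD-4", "PD-5"]
model_ids = ["PD-1", "PD-2", "PD-3", "PD-4", "PD-9"]
rate = flag_rate(expert_ids, model_ids)  # 4 of the 5 expert PDs were flagged
```

This measure ignores PDs the model flagged that experts did not consider relevant (here `PD-9`); those would matter for reviewer workload and would need a separate precision-style count.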

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12181094/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12181094/full.md

## References

24 references — full list in the complete paper: https://tomesphere.com/paper/PMC12181094/full.md

---
Source: https://tomesphere.com/paper/PMC12181094