# Building a Safe and Transparent Workflow for Large Language Model (LLM)-Assisted Clinical Trials and Prediction Models: A Technical Report

**Authors:** João Frutuoso

PMC · DOI: 10.7759/cureus.92571 · Cureus · 2025-09-17

## TL;DR

This technical report proposes a seven-step workflow for using large language models safely and transparently in clinical trials and prediction models, with the aim of preserving accountability and scientific standards.

## Contribution

The novel contribution is a structured, auditable workflow with checklists aligned to international guidelines for integrating LLMs into clinical research.

## Key findings

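- The workflow has seven sequential steps: scope definition and governance, retrieval-augmented literature review, model evaluation and benchmarking, documentation and audit trail, expert quality gates, manuscript disclosure, and privacy and security safeguards.
- Reusable checklists map study types to reporting guidelines: CONSORT-AI, SPIRIT-AI, TRIPOD+AI, PRISMA, and DECIDE-AI.
- The framework is designed to mitigate risks such as fabricated citations, biased outputs from skewed datasets, and over-reliance on automated text, while keeping human oversight in place.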

## Abstract

The use of large language models (LLMs) in clinical trials and prediction models is expanding rapidly, offering opportunities for efficiency but also raising concerns about privacy, fairness, accuracy, and accountability. This technical report proposes a structured workflow to support research teams in adopting LLMs while preserving scientific standards and public trust. The workflow is organized into seven sequential steps: (i) scope definition and governance, (ii) retrieval-augmented literature review, (iii) model evaluation and benchmarking, (iv) documentation and audit trail, (v) expert quality gates, (vi) manuscript disclosure, and (vii) privacy and security safeguards.
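
As a rough illustration of what "sequential" could mean in practice, the sketch below records a named human sign-off for each of the seven steps and refuses to accept them out of order. Only the step names come from the abstract; the `WorkflowLog` class, its methods, and the log fields are hypothetical and are not taken from the report.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Step names are quoted from the abstract; everything else is an assumption.
WORKFLOW_STEPS = (
    "scope definition and governance",
    "retrieval-augmented literature review",
    "model evaluation and benchmarking",
    "documentation and audit trail",
    "expert quality gates",
    "manuscript disclosure",
    "privacy and security safeguards",
)


@dataclass
class WorkflowLog:
    """Hypothetical append-only record of human sign-offs, one per step."""

    entries: list = field(default_factory=list)

    def next_step(self) -> Optional[str]:
        """Return the first step not yet signed off, or None when finished."""
        done = len(self.entries)
        return WORKFLOW_STEPS[done] if done < len(WORKFLOW_STEPS) else None

    def sign_off(self, step: str, reviewer: str) -> None:
        """Record a sign-off, rejecting steps taken out of order."""
        expected = self.next_step()
        if expected is None:
            raise ValueError("all steps are already signed off")
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.entries.append(
            {
                "step": step,
                "reviewer": reviewer,
                "signed_at": datetime.now(timezone.utc).isoformat(),
            }
        )


log = WorkflowLog()
log.sign_off("scope definition and governance", reviewer="principal investigator")
print(log.next_step())
```

Keeping the reviewer name and a UTC timestamp in each entry is one simple way to make the sequence auditable; a real deployment would likely need tamper-evident storage, which this sketch does not attempt.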

To facilitate adoption, we provide reusable checklists that map study types to relevant international reporting guidelines, including Consolidated Standards of Reporting Trials - Artificial Intelligence (CONSORT-AI), Standard Protocol Items: Recommendations for Interventional Trials - Artificial Intelligence (SPIRIT-AI), Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis - Artificial Intelligence (TRIPOD+AI), Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA), and Developmental and Exploratory Clinical Investigations of Decision Support Systems Driven by Artificial Intelligence (DECIDE-AI). The framework is designed to mitigate risks such as fabricated citations, biased outputs from skewed datasets, and over-reliance on automated text. Rather than replacing human reasoning, it aims to augment it, offering greater speed while maintaining accountability, reproducibility, and transparency.
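
A checklist lookup of this kind can be sketched as a small table. The pairings below follow the published scope of each guideline (protocols, trial reports, prediction models, systematic reviews, early-stage evaluation of decision support); they are an assumption about how such a mapping might look, not a copy of the report's own checklists. The study-type keys and the `guidelines_for` helper are hypothetical.

```python
# Hypothetical study-type keys mapped to the guidelines named in the abstract.
# The pairing reflects each guideline's general scope, not the report's checklist.
GUIDELINES_BY_STUDY_TYPE = {
    "trial_protocol": ["SPIRIT-AI"],
    "trial_report": ["CONSORT-AI"],
    "prediction_model": ["TRIPOD+AI"],
    "systematic_review": ["PRISMA"],
    "early_clinical_evaluation": ["DECIDE-AI"],
}


def guidelines_for(study_type: str) -> list[str]:
    """Return the reporting guidelines for a study type, or raise KeyError."""
    # Normalize free-text input such as "Prediction model" or "trial-protocol".
    key = study_type.strip().lower().replace(" ", "_").replace("-", "_")
    if key not in GUIDELINES_BY_STUDY_TYPE:
        known = ", ".join(sorted(GUIDELINES_BY_STUDY_TYPE))
        raise KeyError(f"unknown study type {study_type!r}; known types: {known}")
    # Return a copy so callers cannot mutate the shared table.
    return list(GUIDELINES_BY_STUDY_TYPE[key])


print(guidelines_for("Prediction model"))
```

Failing loudly on an unknown study type, rather than returning an empty list, keeps a missing mapping from being mistaken for "no guideline applies".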

By combining governance rules, technical safeguards, and human oversight, this workflow provides a practical and auditable path for integrating LLMs into clinical trials and prediction models without eroding confidence in scientific work.

## Full-text entities

- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12534133/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12534133/full.md

## References

20 references — full list in the complete paper: https://tomesphere.com/paper/PMC12534133/full.md

---
Source: https://tomesphere.com/paper/PMC12534133