# ‘Plugging the gap’: development of a plain language glossary for statistical methodology research

**Authors:** Sarah Booth, Molly Wells, Clareece Nevill, Lucy Teece, Barbara Czyznikowska, Gurpreet Grewal-Santini, Mary Mancini, Farheen Yameen, Suzanne C Freeman

PMC · DOI: 10.1186/s40900-025-00782-4 · 2025-10-14

## TL;DR

Researchers created a plain language glossary to help the public understand statistical research terms, filling gaps in existing resources.

## Contribution

A new plain language glossary of 64 statistical methodology terms, developed with public feedback to improve clarity and accessibility.

## Key findings

- Identified 64 statistical terms missing from existing public glossaries.
- Public feedback improved definitions by simplifying language and adding examples.
- The glossary is freely available online for public and research use.

## Abstract

Plain language glossaries of research-related terms are a useful resource for Patient and Public Involvement and Engagement (PPIE) activities. They can provide public contributors with a deeper understanding of aspects of the project such as the study design, the methods being used for data analysis, and the interpretation of results. However, whilst plain language glossaries of research-related terms exist, they do not always include definitions for concepts that are commonly used in statistical methodology research. The aims of this work were to 1) develop a plain language glossary of the statistical methodology research terms that were missing from the existing glossaries and 2) outline the process used to develop the glossary to aid researchers in producing a glossary relevant to their own research projects.

In August 2023, we carried out a scoping exercise, using online searches and consultation with members of the Biostatistics Research Group at the University of Leicester, to identify glossaries aimed at members of the public that included definitions of statistical terms. We then reviewed these glossaries to develop a list of terms that are commonly used in the group's statistical methodology research but had not already been defined. Initial definitions of these terms were generated using ChatGPT (GPT-3.5). These were then refined and discussed with public contributors. Three cycles of PPIE feedback were used to further develop and update the definitions for use in the glossary.

We reviewed gaps in five existing glossaries and identified a list of 64 statistical terms to develop definitions for. These covered a range of concepts including different types of statistical models, Bayesian analysis, meta-analysis, and prognostic modelling. The feedback we received from public contributors focused on the level of language used, shortening the length of the definitions, and including examples to give context to the definitions.

We developed a plain language glossary of terms that are commonly used in statistical methodology research as a resource for public contributors taking part in PPIE. The glossary has been made available on the NIHR Leicester Biomedical Research Centre website (https://leicesterbrc.nihr.ac.uk/ppismart/ppismart-definitions/).

The online version contains supplementary material available at 10.1186/s40900-025-00782-4.

Statistical methodology research focuses on finding the best tools to analyse data and making sure they work well. It is important for members of the public to be involved in statistical methodology research. To help and encourage people to get involved, researchers at the University of Leicester have been making resources to help explain what statistical methodology research is.

Glossaries of research terms are a simple way for members of the public to understand what is going on and join in. We found five glossaries that included some words we often use, but many words were missing. The aim of this project was to create a new glossary that included the words that are often used by researchers working in statistical methodology research but were not already included in existing glossaries.

This paper explains the process of developing the glossary. Initial definitions were created using ChatGPT (GPT-3.5). Feedback from members of the public was then used to update the definitions. We repeated the feedback and updating cycle three times. This made sure that the definitions (i) used the right level of language, (ii) were made up of short sentences, and (iii) included examples.

The final glossary includes simple definitions for 64 terms that are commonly used by statistical methodology researchers. The glossary is freely available on the NIHR website for anyone to use (https://leicesterbrc.nihr.ac.uk/ppismart/ppismart-definitions/). We hope that other researchers find the glossary a useful resource to help keep members of the public engaged with their research projects.

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12523162/full.md

---
Source: https://tomesphere.com/paper/PMC12523162