# Training and Experience in Study Selection (TESS): study protocol for a pilot randomised trial within a systematic review

**Authors:** Elayne Ahern, Temitayo Adedeji, Aoife Whiston, Sarah Dillon, Fiona Lynn, Declan Devane

PMC · DOI: 10.12688/hrbopenres.14129.1 · 2025-04-14

## TL;DR

This protocol describes a pilot trial testing whether training and the experience level of the screening partner affect how reliably novice researchers select studies for a systematic review.

## Contribution

It sets out a 2 x 2 pilot randomised trial, embedded in a host systematic review, to evaluate the impact of training and screener pairing on novice screening performance.

## Key findings

- Reliability of screening decisions will be measured using Cohen’s kappa and percentage of agreement against expert standards.
- Secondary outcomes include validity metrics like false positives/negatives and feasibility factors like task completion time.

## Abstract

Systematic reviews can be resource-intensive and require timely completion, yet limited availability of experienced reviewers often necessitates incorporating novice members into review teams. The purpose of this Study Within A Review (SWAR) will be to determine whether training and level of experience within the screening pair affect the reliability of decisions made by novice screeners during study selection for a systematic review.

A 2 (training: task-specific, minimal guidance) x 2 (experience level of screening partner, ‘Reviewer 1’: moderate experience, minimal experience) pilot randomised trial will be conducted within a host systematic review in the topic area of depression and psychosocial functioning. Participants (N = 12), consisting of higher education students with no prior experience in evidence synthesis, will be randomised to one of the four conditions to complete a standardised study selection task at title/abstract level (k = 219 records) on Covidence systematic review screening software, blindly and independently. Total participation time is estimated at 5 hours. Screening decisions made by participants will be assessed for reliability against the consensus-based decisions by two reviewers with content and methodological expertise (expert standard), through calculation of chance-corrected Cohen’s kappa and percentage of agreement, then compared across the conditions. Secondary outcomes will include reliability within the screening pair (participant and allocated screening partner), validity of screening decisions (false positives, false negatives, sensitivity, specificity), feasibility measures, including time taken to complete the study selection task and success of blinding, as well as acceptability.
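The reliability and validity outcomes named above can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the binary `"include"`/`"exclude"` labels, and the ten toy decisions are assumptions made for illustration, and the protocol does not state how screening decisions will be coded for analysis.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of records on which the two raters made the same decision."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters need the same non-zero number of decisions")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(rater_a)
    p_o = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Expected agreement if the raters decided independently,
    # each keeping their own marginal include/exclude rates.
    labels = set(counts_a) | set(counts_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if p_e == 1.0:
        return float("nan")  # kappa is undefined when both raters use one label only
    return (p_o - p_e) / (1 - p_e)


def validity(novice, expert, positive="include"):
    """False positives/negatives, sensitivity and specificity against the expert standard."""
    pairs = list(zip(novice, expert))
    tp = sum(n == positive and e == positive for n, e in pairs)
    fp = sum(n == positive and e != positive for n, e in pairs)
    fn = sum(n != positive and e == positive for n, e in pairs)
    tn = sum(n != positive and e != positive for n, e in pairs)
    return {
        "false_positives": fp,
        "false_negatives": fn,
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical toy data: ten title/abstract decisions, not study data.
novice = ["include", "include", "exclude", "exclude", "include",
          "exclude", "exclude", "exclude", "include", "exclude"]
expert = ["include", "exclude", "exclude", "exclude", "include",
          "exclude", "include", "exclude", "include", "exclude"]

print(percent_agreement(novice, expert))
print(round(cohens_kappa(novice, expert), 3))
print(validity(novice, expert))
```

In this toy example the raters agree on 8 of 10 records, yet kappa is lower than the raw agreement because both raters exclude most records, so a good share of the agreement is expected by chance. This is why the protocol reports both measures.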

Findings will be used to inform the design of subsequent trial work to determine the efficacy of training and screener pairing for study selection. Ultimately, these insights will help to build capacity among novice screeners to engage with evidence synthesis and work alongside experienced review teams.

Northern Ireland Hub for Trials Methodology Research SWAR Registry: SWAR 38.

## Linked entities

- **Diseases:** depression (MONDO:0002050)

## Full-text entities

- **Diseases:** depression (MESH:D003866)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12982981/full.md

---
Source: https://tomesphere.com/paper/PMC12982981