# Improving item pool utilization for health professions examinations under variable-length computerized adaptive testing designs: a shadow-test approach

**Authors:** Hwanggyu Lim, Kyung (Chris) Tyek Han

PMC · DOI: 10.3352/jeehp.2025.22.35 · 2025-11-03

## TL;DR

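This paper proposes 3 modifications of the α-stratification method that improve item pool utilization in variable-length shadow-test CAT for health professions exams, while preserving content validity and keeping measurement precision comparable to baseline methods.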

## Contribution

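The study proposes 3 modifications of the α-stratification method, integrating randomesque selection and multiple-form strategies, and shows in a simulation study that they improve item pool utilization and exposure control over 2 baseline algorithms in a variable-length shadow CAT framework.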

## Key findings

- Modification 2 reduced the proportion of unused items from 35.6% to 5.0% in variable-length shadow CAT.
- The proposed methods produced more uniform item exposure rates while maintaining measurement precision comparable to the baseline methods.
- The enhanced framework offers a secure and sustainable solution for high-stakes health professions assessments.

## Abstract

The shadow-test approach to computerized adaptive testing (CAT) ensures content validity in health professions examinations but may suffer from poor item pool utilization under variable-length designs, increasing operational costs and security risks. This study aimed to address this challenge by developing algorithms that enhance the sustainability of shadow CAT under variable-length designs.

A simulation study was conducted to evaluate 3 proposed modifications of the α-stratification method designed to improve item pool utilization. These methods, which integrated randomesque selection and multiple-form strategies, were compared with 2 baseline algorithms within a variable-length shadow CAT framework. Performance was assessed in terms of measurement precision, pool utilization, and test efficiency.
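To make the two building blocks named above more concrete, the following is a minimal sketch of randomesque selection inside a pool stratified by item discrimination. It is not the authors' algorithm: the shadow-test assembly step (solving for a full test that satisfies content constraints) is omitted, and the function names, the item fields `id`, `a`, and `b`, and the toy pool are all assumptions made for illustration.

```python
import random

def stratify_by_discrimination(items, n_strata):
    """Split the pool into n_strata groups ordered from low to high discrimination (a)."""
    ordered = sorted(items, key=lambda item: item["a"])
    n = len(ordered)
    return [ordered[n * s // n_strata : n * (s + 1) // n_strata] for s in range(n_strata)]

def randomesque_pick(stratum, theta, used_ids, k, rng):
    """Choose at random among the k unused items whose difficulty (b) is closest to theta."""
    candidates = [item for item in stratum if item["id"] not in used_ids]
    if not candidates:
        return None  # stratum exhausted for this examinee
    candidates.sort(key=lambda item: abs(item["b"] - theta))
    return rng.choice(candidates[:k])

# Hypothetical 12-item pool: ids, discriminations, and difficulties are made up.
pool = [{"id": i, "a": 0.5 + 0.1 * i, "b": -1.5 + 0.25 * i} for i in range(12)]
strata = stratify_by_discrimination(pool, n_strata=3)

# Early in the test, draw from the low-discrimination stratum at the current ability estimate.
rng = random.Random(0)
first_item = randomesque_pick(strata[0], theta=0.0, used_ids=set(), k=2, rng=rng)
```

Drawing at random from the top `k` candidates, rather than always taking the single best match, is what spreads exposure across items; saving high-discrimination strata for later stages is what keeps the most informative items from being overused early.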

The proposed modifications significantly outperformed the baseline methods across all measures of item pool utilization and exposure control. The most effective method (Modification 2) reduced the proportion of unused items from 35.6% to 5.0% and produced more uniform item exposure rates. These substantial gains in operational sustainability were achieved while maintaining measurement precision comparable to the baseline methods.
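The pool utilization measures reported above can be illustrated with a short sketch. The definitions used here (exposure rate as the share of examinees who saw an item, unused proportion as the share of pool items never administered) are common conventions and are assumed rather than taken from the paper; the function name and the toy data are hypothetical.

```python
from collections import Counter

def pool_utilization(administered, pool_ids):
    """Return per-item exposure rates and the proportion of pool items never used.

    administered: one list of item ids per examinee (lengths may differ,
                  as in a variable-length CAT).
    pool_ids:     ids of every item in the pool.
    """
    n_examinees = len(administered)
    if n_examinees == 0:
        raise ValueError("at least one examinee is required")
    # Count each item at most once per examinee.
    counts = Counter(item for test in administered for item in set(test))
    rates = {item: counts.get(item, 0) / n_examinees for item in pool_ids}
    unused = sum(1 for rate in rates.values() if rate == 0) / len(pool_ids)
    return rates, unused

# Made-up example: 3 examinees with tests of different lengths, 10-item pool.
tests = [[1, 2, 3], [2, 3], [3, 4, 5, 6]]
rates, unused = pool_utilization(tests, pool_ids=range(1, 11))
```

A pool with a high unused proportion and a few items at exposure rates near 1.0 is the pattern the proposed modifications aim to avoid.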

The proposed algorithms effectively mitigate poor item pool utilization in shadow CAT under variable-length designs. The enhanced framework provides a robust, secure, and sustainable basis for high-stakes adaptive assessments in the health professions, keeping them content-valid, precise, and operationally efficient.

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12976627/full.md

---
Source: https://tomesphere.com/paper/PMC12976627