# AI-enhanced adaptive testing with cognitive diagnostic feedback and its association with performance in undergraduate surgical education: a pilot study

**Authors:** Nuno Silva Gonçalves, Carlos Collares, José Miguel Pêgo

PMC · DOI: 10.3389/fnbeh.2025.1735237 · Frontiers in Behavioral Neuroscience · 2026-01-06

## TL;DR

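This pilot study examines whether a formative AI-enhanced computerized adaptive test with cognitive diagnostic feedback is associated with better summative performance in undergraduate surgical education. Students who completed it scored higher on the Progress Test, but participation was voluntary, so the result is an association rather than a demonstrated effect.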

## Contribution

The study introduces AI-assisted computerized adaptive testing with cognitive diagnostic feedback in undergraduate surgical education and reports an association between completing the formative test and higher summative performance.

## Key findings

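- Students who voluntarily completed the AI-enhanced adaptive test scored higher on the Progress Test (p = 0.021), and the difference persisted after adjusting for prior academic performance (ANCOVA p = 0.041); self-selection means causality cannot be established.
- Memory was the strongest predictor of summative outcomes (R² = 0.180, β = 0.425), followed by analysis (R² = 0.080, β = 0.283); decision was not a significant predictor (R² = 0.029, β = 0.170).
- Feedback presentation and the assessment of decision-making need refinement to increase educational impact.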

## Abstract

Effective feedback in the cognitive domain is essential for surgical education but often limited by resource constraints and traditional assessment formats. Artificial Intelligence (AI) has emerged as a catalyst for innovation, enabling automated feedback, real-time cognitive diagnostics, and scalable item generation, thereby transforming how future surgeons learn and are assessed.

An item bank of 150 multiple-choice questions was developed using AI-assisted item generation and difficulty estimation. A formative Computerized Adaptive Test (CAT), balanced across three cognitive domains (memory, analysis, and decision) and across surgical topics, was delivered via QuizOne® 3–5 days before the summative Progress Test. A total of 147 students participated, of whom 116 completed the formative CAT. Performance correlations, group comparisons, analysis of covariance (ANCOVA), and regression analyses were conducted.
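To make the idea of a domain-balanced adaptive test concrete, the following is a minimal sketch, not the study's or QuizOne's actual algorithm. It assumes a Rasch (one-parameter logistic) model, maximum-information item selection, a simple "least-used domain first" balancing rule, and an expected a posteriori (EAP) ability estimate with a standard normal prior. None of these choices is stated in the abstract, and all names (`next_item`, `eap_estimate`, the item dictionary fields) are hypothetical.

```python
import math

# Cognitive domains named in the abstract.
DOMAINS = ("memory", "analysis", "decision")
# Ability grid from -4 to +4 used for the EAP estimate (assumed range).
GRID = [-4.0 + 0.1 * k for k in range(81)]


def p_correct(theta, b):
    """Rasch probability of a correct answer for ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def item_information(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = p_correct(theta, b)
    return p * (1.0 - p)


def eap_estimate(responses):
    """EAP ability estimate from a list of (difficulty, correct) pairs.

    Uses a standard normal prior (an assumption of this sketch).
    """
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) prior
        for b, correct in responses:
            p = p_correct(t, b)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    return sum(t * w for t, w in zip(GRID, weights)) / total


def next_item(theta, bank, administered):
    """Pick the next item: least-used domain first, then maximum information.

    bank and administered are lists of dicts with keys "id", "domain", "b".
    Returns None when every item has been used.
    """
    used_ids = {item["id"] for item in administered}
    available = [item for item in bank if item["id"] not in used_ids]
    if not available:
        return None
    counts = {d: sum(1 for it in administered if it["domain"] == d) for d in DOMAINS}
    open_domains = [d for d in DOMAINS if any(it["domain"] == d for it in available)]
    target = min(open_domains, key=lambda d: counts[d])
    pool = [it for it in available if it["domain"] == target]
    return max(pool, key=lambda it: item_information(theta, it["b"]))
```

In use, a test loop would call `next_item` with the current ability estimate, record the response, and re-estimate ability with `eap_estimate` until a stopping rule (fixed length or target precision) is met. EAP is used here instead of maximum likelihood because it stays finite when all responses so far are correct or all are incorrect.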

Students who voluntarily completed the CAT had higher Progress Test scores than those who did not (p = 0.021), and the difference persisted after adjusting for prior academic performance (ANCOVA p = 0.041); because participation was self-selected, causality cannot be established. Among the cognitive domains, memory was the strongest predictor of summative outcomes (R² = 0.180, β = 0.425), followed by analysis (R² = 0.080, β = 0.283); decision was not a significant predictor (R² = 0.029, β = 0.170).
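The reported coefficients can be sanity-checked with a short calculation. In a regression with a single predictor, the standardized β equals the Pearson correlation, so R² should equal β². The sketch below checks that relationship for the three domains and computes the corresponding t statistic. It rests on two assumptions not confirmed by the abstract: that each domain was entered in its own single-predictor model, and that all 116 CAT completers were included (`N = 116`). The critical value 1.98 is the approximate two-sided 5% threshold for 114 degrees of freedom.

```python
import math

# (standardized beta, reported R^2) per cognitive domain, from the abstract.
REPORTED = {
    "memory": (0.425, 0.180),
    "analysis": (0.283, 0.080),
    "decision": (0.170, 0.029),
}
N = 116        # assumed sample size: students who completed the CAT
T_CRIT = 1.98  # approximate two-sided 5% critical t for N - 2 = 114 df


def t_statistic(r, n):
    """t statistic for a Pearson correlation r with n observations."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


def check(domain):
    """Compare beta^2 with the reported R^2 and test significance."""
    beta, r2 = REPORTED[domain]
    t = t_statistic(beta, N)
    return {
        "beta_squared": beta * beta,
        "r2_reported": r2,
        "consistent": abs(beta * beta - r2) < 1e-3,  # allow for rounding
        "t": t,
        "significant": abs(t) > T_CRIT,
    }


if __name__ == "__main__":
    for name in REPORTED:
        print(name, check(name))
```

Under these assumptions, β² matches the reported R² to within rounding for all three domains, and only the decision domain falls below the threshold (t of roughly 1.8), which agrees with the abstract's statement that decision was not significant.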

AI-enhanced CAT–Cognitive Diagnostic Modeling (CDM) represents a promising formative approach in undergraduate surgical education, being associated with higher summative performance and providing individualized diagnostic feedback. Refining feedback presentation and enhancing decision-making assessment could further optimize its educational impact.

## Full-text entities

- **Diseases:** fatigue (MESH:D005221), Trauma (MESH:D014947)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12816294/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12816294/full.md

## References

36 references — full list in the complete paper: https://tomesphere.com/paper/PMC12816294/full.md

---
Source: https://tomesphere.com/paper/PMC12816294