# Incorporating a non-cognitive selection method in the residency program: the complexities of effective situational judgment test item writing

**Authors:** Diantha Soemantri, Syntia Nusanti, Natalia Widiasih Raharjanti, Fitri Octaviana, Aulia Rizka, Prasetyanugraheni Kreshanti, Rita Mustika

PMC · DOI: 10.3389/fmed.2026.1747830 · 2026-03-19

## TL;DR

This paper examines how situational judgment test items were written for medical residency selection in Indonesia, mapping scenario content, checking for bias, and identifying pitfalls such as construct irrelevance and sparse contextual detail.

## Contribution

The study provides insights into the SJT item writing process in Indonesia, emphasizing the need for structured training to improve test validity.

## Key findings

- Commitment to professionalism was the dominant domain in the SJT scenarios.
- Scenarios varied in length and were designed to be neutral in language, gender, and sensitivity.
- Issues like construct irrelevance and lack of contextual details were identified as major pitfalls.

## Abstract

Medical residency programs have been pivoting toward incorporating non-cognitive selection methods. Since Indonesia is accelerating its residency training program, more inclusive methods such as Situational Judgment Tests (SJTs) are preferable. As relatively early adopters of SJTs, we argue for the importance of reflecting on the item writing process. This study aimed to examine the SJT item writing process for residency selection in Indonesia, specifically by mapping its content and associated features or characteristics.

Twenty-one subject matter experts from 11 residency programs served as item writers. After developing test specifications, the item writers used a provided template to create scenarios and corresponding response options. The analysis involved identifying the domain and content of the SJT scenarios and assessing potential biases related to gender, language and sensitivity.

A total of 106 SJT scenarios were developed, with commitment to professionalism as the dominant domain. The scenarios varied in length, from 21 to 145 words, and were designed to be neutral in terms of language, gender and sensitivity. Several drawbacks were identified, including an excessive focus on clinical decision making and on specialty-specific contexts, as well as limited information provided within some scenarios.

The findings revealed several pitfalls, including construct irrelevance and a lack of contextual details. A structured faculty development program on SJT writing should take into account these nuances, which can threaten the test’s validity. To sustain the development and further adoption of SJTs as a selection method, continuous support and feedback for the item writers are crucial.

---
Source: https://tomesphere.com/paper/PMC13043418