"Crash Test Dummies" for AI-Enabled Clinical Assessment: Validating Virtual Patient Scenarios with Virtual Learners
Brian Gin, Ahreum Lim, Flávia Silva e Oliveira, Kuan Xing, Xiaomei Song, Gayana Amiyangoda, Thilanka Seneviratne, Alison F. Doubleday, Ananya Gangopadhyaya, Bob Kiser, Lukas Shum-Tim, Dhruva Patel, Kosala Marambe, Lauren Maggio, Ara Tekian, Yoon Soo Park

TL;DR
This paper introduces an AI virtual patient platform and a Bayesian psychometric model to evaluate and validate AI-based clinical assessments, ensuring robustness and interpretability before deployment with human learners.
Contribution
It develops an open-source virtual assessment platform combined with a Bayesian model to reliably measure clinical competencies and validate AI assessment tools.
Findings
The model accurately recovered simulated learners' competencies.
Case difficulty was estimated separately for each competency.
Rater sensitivity and thresholds remained stable across AI raters.
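
To make the three findings above concrete, here is a minimal sketch of how learner competency, case difficulty, rater sensitivity, and rater thresholds could jointly shape an ordinal score. It assumes a graded-response-style model, which this summary does not confirm as the paper's model; the function names, parameter names, and numeric values are all hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def category_probs(theta, difficulty, sensitivity, thresholds):
    """Probabilities of ordinal scores 0..K under an ASSUMED graded-response form.

    theta       : learner competency (hypothetical latent scale)
    difficulty  : case difficulty for this competency
    sensitivity : rater sensitivity (slope); must be positive
    thresholds  : rater thresholds, strictly increasing, one per score boundary
    """
    # Cumulative probabilities P(score >= k) for k = 1..K.
    cum = [sigmoid(sensitivity * (theta - difficulty) - t) for t in thresholds]
    upper = [1.0] + cum   # P(score >= 0) = 1
    lower = cum + [0.0]   # P(score >= K + 1) = 0
    return [u - l for u, l in zip(upper, lower)]


def expected_score(theta, difficulty, sensitivity, thresholds):
    probs = category_probs(theta, difficulty, sensitivity, thresholds)
    return sum(k * p for k, p in enumerate(probs))


def simulate_score(theta, difficulty, sensitivity, thresholds, rng):
    """Draw one rating for a simulated learner on one case from one rater."""
    probs = category_probs(theta, difficulty, sensitivity, thresholds)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


# Illustrative values only (not taken from the paper).
rng = random.Random(0)
thresholds = [-1.0, 0.0, 1.0]          # four score categories: 0..3
weak = expected_score(-1.0, 0.0, 1.5, thresholds)
strong = expected_score(1.0, 0.0, 1.5, thresholds)
draw = simulate_score(1.0, 0.0, 1.5, thresholds, rng)
```

Under this assumed form, a more competent simulated learner has a higher expected score on the same case and rater, which is the kind of relationship a recovery check would look for when comparing estimated competencies against the ones used to generate the learners.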
Abstract
Background: In medical and health professions education (HPE), AI is increasingly used to assess clinical competencies, including via virtual standardized patients. However, most evaluations rely on AI-human interrater reliability and lack a measurement framework for how cases, learners, and raters jointly shape scores. This leaves robustness uncertain and can expose learners to misguidance from unvalidated systems. We address this by using AI "simulated learners" to stress-test and psychometrically characterize assessment pipelines before human use.
Objective: Develop an open-source AI virtual patient platform and measurement model for robust competency evaluation across cases and rating conditions.
Methods: We built a platform with virtual patients, virtual learners with tunable ACGME-aligned competency profiles, and multiple independent AI raters scoring encounters with…
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Simulation-Based Education in Healthcare · Clinical Reasoning and Diagnostic Skills
