Artificial Intelligence Meets Item Analysis (AI meets IA): A Study of Chatbot Training and Performance in detecting and correcting MCQ Flaws
Mashaal Sabqat, Rehan Ahmed Khan, Masood Jawaid, Madiha Sajjad

TL;DR
This study explores how ChatGPT can detect and correct flaws in multiple-choice questions, finding mixed results after training.
Contribution
The study evaluates ChatGPT's ability to detect and correct MCQ flaws before and after training, revealing specific strengths and limitations.
Findings
ChatGPT improved at detecting 'complicated stems' and 'absolute terms' after training.
It struggled with 'nonparallel options' and 'vague frequency terms' both before and after training.
Performance worsened during peak hours, and no significant overall improvement was observed.
Abstract
This study explored the potential of AI-powered chatbots, specifically ChatGPT, to identify and correct flaws in MCQs. A three-phase interventional study was conducted from February to August 2023 at Riphah International University, Islamabad. In Phase 1, flawed MCQs were selected from the NBME guide and fed into ChatGPT, which identified item flaws and suggested corrections. In Phase 2, ChatGPT was trained to detect flaws in MCQs using text from the NBME item-writing guide. In Phase 3, ChatGPT was again tested on detecting flaws and correcting MCQs. Data were analyzed using SPSS Version 26 and presented as percentages, with comparisons made using McNemar’s test with the exact conditional method. ChatGPT could identify and correct flaws such as use of “None of the above,” “Grammatical cues,” “absolute terms,” and “inconsistently presented numerical data.” However, it struggled with flaws related to “complicated…
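The abstract names McNemar's test with the exact conditional method for comparing paired before-and-after outcomes. The sketch below shows how that test can be computed from the two discordant counts. It is an illustration only: the study itself used SPSS, the function name `exact_mcnemar_p` is hypothetical, and the counts in the usage line are made up rather than taken from the paper.

```python
from math import comb


def exact_mcnemar_p(b: int, c: int) -> float:
    """Two-sided exact (conditional) McNemar p-value.

    b: pairs with a success before training and a failure after
    c: pairs with a failure before training and a success after
    Concordant pairs (same outcome both times) do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of change
    k = min(b, c)
    # Under the null hypothesis, discordant pairs split as Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the smaller tail for a two-sided test, capped at 1.
    return min(1.0, 2 * tail)


# Hypothetical counts for illustration only (not the study's data).
p_value = exact_mcnemar_p(1, 5)
print(f"exact McNemar p-value: {p_value:.5f}")
```

Only the discordant pairs matter here: MCQs whose flaw was detected in one phase but not the other. With few such pairs, as is likely when testing a small set of flawed items, the exact binomial form is preferred over the chi-square approximation.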
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education
