UnibucLLM: Harnessing LLMs for Automated Prediction of Item Difficulty and Response Time for Multiple-Choice Questions

Ana-Cristina Rogoz; Radu Tudor Ionescu

arXiv:2404.13343 · cs.CL · April 23, 2024 · 1 citation

TL;DR

This paper presents a novel data augmentation approach using Large Language Models to predict item difficulty and response time for medical exam questions, demonstrating potential improvements in automated assessment systems.

Contribution

Introduces a new LLM-based data augmentation method for predicting question difficulty and response time in medical exams, with analysis of feature combinations and model performance.

Findings

01. Predicting question difficulty remains challenging.

02. Including question text improves prediction accuracy.

03. LLM answer variability enhances model performance.

Abstract

This work explores a novel data augmentation method based on Large Language Models (LLMs) for predicting item difficulty and response time of retired USMLE Multiple-Choice Questions (MCQs) in the BEA 2024 Shared Task. Our approach augments the dataset with answers from zero-shot LLMs (Falcon, Meditron, Mistral) and employs transformer-based models built on six alternative feature combinations. The results suggest that, of the two tasks, predicting the difficulty of questions is the more challenging one. Notably, our top performing methods consistently include the question text, and benefit from the variability of LLM answers, highlighting the potential of LLMs for improving automated assessment in medical licensing exams. We make our code available at https://github.com/ana-rogoz/BEA-2024.
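As a rough illustration of the augmentation idea in the abstract, the sketch below joins a question with zero-shot LLM answers into a single input string that a transformer-based regressor could consume. Everything here is an assumption made for illustration: the function names `build_input` and `answer_variability`, the `[SEP]` separator, the lowercase model keys, and the particular feature subsets are hypothetical and are not taken from the paper, which does not list its six feature combinations on this page. The disagreement score is likewise only one possible way to quantify "variability of LLM answers".

```python
from typing import Dict, List


def build_input(
    question: str,
    llm_answers: Dict[str, str],
    use_question: bool,
    models: List[str],
    sep: str = " [SEP] ",
) -> str:
    """Join the selected features into one text input (hypothetical format)."""
    parts: List[str] = []
    if use_question:
        parts.append(question.strip())
    # Append the answer of each selected LLM, tagged with the model name.
    for name in models:
        parts.append(f"{name}: {llm_answers[name].strip()}")
    return sep.join(parts)


def answer_variability(llm_answers: Dict[str, str]) -> float:
    """Illustrative disagreement score in [0, 1]:
    0 when all models give the same answer, 1 when all answers differ."""
    distinct = {a.strip().lower() for a in llm_answers.values()}
    return (len(distinct) - 1) / max(len(llm_answers) - 1, 1)


# Toy item; the question text and the answers are made up for this example.
question = "Which of the following is the most likely diagnosis?"
answers = {"falcon": "A", "meditron": "B", "mistral": "A"}

# Two hypothetical feature combinations: question plus all answers,
# and LLM answers only.
with_question = build_input(question, answers, True, ["falcon", "meditron", "mistral"])
answers_only = build_input(question, answers, False, ["falcon", "meditron", "mistral"])
disagreement = answer_variability(answers)
```

Each resulting string would then be tokenized and passed to a regression model that predicts either difficulty or response time; that step is omitted here because it depends on the chosen model and training setup.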

Code & Models

Repositories

ana-rogoz/bea-2024 (official)

Videos

UnibucLLM: Harnessing LLMs for Automated Prediction of Item Difficulty and Response Time for Multiple-Choice Questions (Underline)

Taxonomy

Topics: Educational Technology and Assessment