Exploring the Potential of Large Language Models for Estimating the Reading Comprehension Question Difficulty
Yoshee Jain, John Hollander, Amber He, Sunny Tang, Liang Zhang, and John Sabatini

TL;DR
This study explores the use of large language models, specifically OpenAI's GPT-4o and o1, to automate the estimation of reading comprehension question difficulty, aiming to enhance scalability and personalization in educational assessments.
Contribution
It shows that GPT-4o and o1 can estimate question difficulty levels in ways that align with traditional psychometric measures such as IRT, and it highlights their potential for scalable, adaptive educational systems.
Findings
The models' difficulty estimates correlate with IRT parameters
Models show sensitivity to extreme item characteristics
Potential for scalable, automated assessment in education
Abstract
Reading comprehension is key to individual success, yet the assessment of question difficulty remains challenging due to the extensive human annotation and large-scale testing required by traditional methods such as linguistic analysis and Item Response Theory (IRT). While these robust approaches provide valuable insights, their scalability is limited. There is potential for Large Language Models (LLMs) to automate question difficulty estimation; however, this area remains underexplored. Our study investigates the effectiveness of LLMs, specifically OpenAI's GPT-4o and o1, in estimating the difficulty of reading comprehension questions using the Study Aid and Reading Assessment (SARA) dataset. We evaluated both the accuracy of the models in answering comprehension questions and their ability to classify difficulty levels as defined by IRT. The results indicate that, while the models…
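For context, IRT models the probability of a correct response as a function of reader ability and item parameters. The following is a minimal sketch of the two-parameter logistic (2PL) form together with a simple difficulty binning. The abstract does not state which IRT model or difficulty cut-offs the authors used, so the 2PL choice, the function names `p_correct` and `difficulty_label`, and the thresholds are illustrative assumptions, not the paper's method.

```python
import math


def p_correct(theta: float, a: float, b: float) -> float:
    """2PL item response function.

    theta: reader ability, a: item discrimination, b: item difficulty.
    Returns the modeled probability of a correct answer.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def difficulty_label(b: float, easy_below: float = -0.5, hard_above: float = 0.5) -> str:
    """Bin an IRT difficulty value into a coarse label.

    The cut-offs are hypothetical and chosen only for illustration.
    """
    if b < easy_below:
        return "easy"
    if b > hard_above:
        return "hard"
    return "medium"
```

Under this form, a reader whose ability equals the item difficulty has a 0.5 probability of answering correctly, and an LLM's predicted label could then be compared against the label derived from the fitted `b` value.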
Taxonomy
Topics: Text Readability and Simplification · Intelligent Tutoring Systems and Adaptive Learning
Methods: ALIGN
