Controlling Cloze-test Question Item Difficulty with PLM-based Surrogate Models for IRT Assessment
Jingshen Zhang, Jiajun Xie, Xinying Qiu

TL;DR
This paper trains pre-trained language models (PLMs) as surrogate models that stand in for human test subjects, so that multiple-choice cloze questions can be assessed with item response theory (IRT), and it proposes strategies for controlling the difficulty of both the gaps and the distractors.
Contribution
It presents a framework that uses PLMs to control and evaluate question difficulty for adaptive testing, addressing the shortage of work on generating questions of varying difficulty, especially multiple-choice cloze tests.
Findings
Difficulty of MC cloze questions is controlled effectively on a benchmark dataset
PLM surrogate models reduce the need for human test subjects in IRT assessment
The framework can evaluate the difficulty levels of MC cloze tests
Abstract
Item difficulty plays a crucial role in adaptive testing. However, few works have focused on generating questions of varying difficulty levels, especially for multiple-choice (MC) cloze tests. We propose training pre-trained language models (PLMs) as surrogate models to enable item response theory (IRT) assessment, avoiding the need for human test subjects. We also propose two strategies to control the difficulty levels of both the gaps and the distractors using ranking rules to reduce invalid distractors. Experimentation on a benchmark dataset demonstrates that our proposed framework and methods can effectively control and evaluate the difficulty levels of MC cloze tests.
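To make the surrogate-model idea concrete, here is a minimal sketch of how a matrix of right/wrong answers from several surrogate models could be turned into per-item difficulty values. The abstract does not say which IRT model or estimation procedure the authors use, so everything here is an assumption for illustration: the 1PL (Rasch) response function, the crude logit-of-proportion-correct difficulty estimate, the function names, and the toy response matrix are all hypothetical and not taken from the paper.

```python
import math
from typing import List


def rasch_prob(theta: float, b: float) -> float:
    """P(correct) under the 1PL (Rasch) model for ability theta and difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def crude_item_difficulty(responses: List[List[int]]) -> List[float]:
    """Rough per-item difficulty: negative logit of the smoothed proportion correct.

    responses[i][j] is 1 if surrogate model i answered item j correctly, else 0.
    Smoothing keeps the estimate finite when every model is right (or wrong).
    This is a simple approximation, not a full IRT fit.
    """
    n_models = len(responses)
    n_items = len(responses[0])
    difficulties = []
    for j in range(n_items):
        correct = sum(row[j] for row in responses)
        p = (correct + 0.5) / (n_models + 1.0)  # smoothed proportion correct
        difficulties.append(-math.log(p / (1.0 - p)))
    return difficulties


# Hypothetical response matrix: 4 surrogate models (rows) x 3 cloze items (columns).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 0],
    [1, 0, 1],
]
difficulties = crude_item_difficulty(responses)
```

In this toy matrix the first item is answered correctly by every surrogate and the third by only one, so the estimated difficulties increase from the first item to the third. A real pipeline would replace the crude estimate with a proper IRT fit that estimates abilities and item parameters jointly.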
Taxonomy
Topics: Educational Technology and Assessment
