Can Model Uncertainty Function as a Proxy for Multiple-Choice Question Item Difficulty?

Leonidas Zotos; Hedderik van Rijn; Malvina Nissim

arXiv:2407.05327 · cs.CL · February 3, 2025

TL;DR

This paper investigates whether the uncertainty of large generative models can serve as a proxy for the difficulty of multiple-choice questions. It finds only weak correlations, which differ depending on whether the model answers correctly and on the question type.

Contribution

The paper measures the correlation between model uncertainty and the actual student response distribution, introduces a fine-grained, previously unused dataset, and analyzes how uncertainty varies with question type and answer correctness.

Findings

01 Weak correlation between model uncertainty and question difficulty
02 Model behavior differs for correct and wrong answers
03 Correlation varies across different question types

Abstract

Estimating the difficulty of multiple-choice questions would be of great help to educators, who must spend substantial time creating and piloting stimuli for their tests, and to learners who want to practice. Supervised approaches to difficulty estimation have so far yielded mixed results. In this contribution we leverage an aspect of large generative models that might be seen as a weakness when answering questions, namely their uncertainty, and use it to explore correlations between two different metrics of uncertainty and the actual student response distribution. While we observe some weak correlations, we also find that the models' behaviour differs between correct and wrong answers, and that correlations differ substantially across the question types included in our fine-grained, previously unused dataset of 451…
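
The kind of analysis the abstract describes can be illustrated with a minimal sketch. The truncated abstract does not name the two uncertainty metrics, so the example below assumes one common choice, the Shannon entropy of the model's probabilities over the answer options, and correlates it with the share of students who answered incorrectly. The function names, the `items` structure, and all numbers are hypothetical toy values, not data or code from the paper.

```python
import math


def choice_entropy(probs):
    """Shannon entropy (in bits) of a distribution over answer options.

    Assumed uncertainty metric for illustration only; the paper's own
    metrics are not specified in the abstract.
    """
    total = sum(probs)
    normalized = [p / total for p in probs]
    return -sum(p * math.log2(p) for p in normalized if p > 0)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical items: model probabilities per option and the fraction of
# students who chose the correct option.
items = [
    {"model_probs": [0.97, 0.01, 0.01, 0.01], "student_correct": 0.92},
    {"model_probs": [0.70, 0.10, 0.10, 0.10], "student_correct": 0.75},
    {"model_probs": [0.40, 0.30, 0.20, 0.10], "student_correct": 0.55},
    {"model_probs": [0.25, 0.25, 0.25, 0.25], "student_correct": 0.35},
]

# Model uncertainty per item, and item difficulty as the share of wrong answers.
uncertainty = [choice_entropy(item["model_probs"]) for item in items]
difficulty = [1.0 - item["student_correct"] for item in items]

r = pearson(uncertainty, difficulty)
print(f"Pearson r between uncertainty and difficulty: {r:.3f}")
```

The toy data is constructed so that uncertainty and difficulty rise together; the paper reports that the correlations observed on real student responses are weak.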

Code & Models

Repositories

leonidaszotos/uncertainty-as-proxy-for-difficulty
PyTorch · Official

Taxonomy

Topics: Multi-Criteria Decision Making