Response-free item difficulty modelling for multiple-choice items with fine-tuned transformers: Component-wise representation and multi-task learning

Jan Netík; Patrícia Martinková

arXiv:2605.16991·cs.CL·May 19, 2026

TL;DR

This paper introduces a transformer-based approach to modelling item difficulty for multiple-choice reading-comprehension items. It reduces manual feature engineering, and its multi-task variant improves performance, especially in small-data regimes.

Contribution

It proposes end-to-end transformer fine-tuning with component-wise and multi-task variants, demonstrating the effectiveness of joint encoding and auxiliary tasks in item difficulty modeling.

Findings

01 Joint encoding is a viable alternative to feature engineering.

02 Multi-task learning improves performance in small-sample settings.

03 Transformer fine-tuning captures wording-derived signals effectively.

Abstract

Response-free item difficulty modelling promises to reduce reliance on response-based calibration but is intrinsically difficult on reading-comprehension multiple-choice items, where difficulty depends on inferential demands across wording components. Whereas most existing approaches extract item-text features and pass them to a separate statistical or machine-learning model, we fine-tune transformer encoders end-to-end on the item wording, eliminating the manual feature engineering and preprocessing that discards information. Moreover, two extensions to this joint-encoding approach are proposed: a component-wise variant that encodes wording components separately through a shared encoder, and a multi-task variant that retains joint encoding and adds an auxiliary multiple-choice question answering objective on the shared encoder. Each method is evaluated under a Monte Carlo subsampling…
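
The three input and objective layouts named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `toy_encoder` is a hypothetical stand-in for a fine-tuned transformer encoder, and the `[SEP]` separator, the concatenation of component vectors, the squared-error difficulty loss, and the `aux_weight` mixing coefficient are all assumptions made for illustration, since the abstract does not specify them.

```python
import math
from typing import Callable, List

Vector = List[float]
Encoder = Callable[[str], Vector]


def toy_encoder(text: str, dim: int = 4) -> Vector:
    """Hypothetical stand-in for a shared transformer encoder.

    Folds character codes into a fixed-size vector so the sketch runs
    without any model weights. A real encoder would return a pooled
    hidden state instead.
    """
    vec = [0.0] * dim
    for i, ch in enumerate(text):
        vec[i % dim] += ord(ch) / 1000.0
    return vec


def joint_encoding(components: List[str], encoder: Encoder) -> Vector:
    # Joint encoding: all wording components pass through the encoder
    # as ONE sequence (the separator token is an assumption).
    return encoder(" [SEP] ".join(components))


def component_wise_encoding(components: List[str], encoder: Encoder) -> Vector:
    # Component-wise variant: each component is encoded separately by the
    # SAME encoder; combining by concatenation is an assumption.
    combined: Vector = []
    for text in components:
        combined.extend(encoder(text))
    return combined


def softmax_cross_entropy(logits: List[float], target_index: int) -> float:
    # Numerically stable cross-entropy for a single example.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[target_index]


def multi_task_loss(
    pred_difficulty: float,
    true_difficulty: float,
    option_logits: List[float],
    correct_option: int,
    aux_weight: float = 0.5,  # assumed mixing weight, not from the paper
) -> float:
    # Multi-task variant: difficulty regression plus an auxiliary
    # multiple-choice question answering objective on the shared encoder.
    regression = (pred_difficulty - true_difficulty) ** 2
    auxiliary = softmax_cross_entropy(option_logits, correct_option)
    return regression + aux_weight * auxiliary
```

With a passage, a stem, and four options, `joint_encoding` yields a single vector for a downstream difficulty head, whereas `component_wise_encoding` yields one vector per component. Setting `aux_weight=0` reduces `multi_task_loss` to the plain difficulty regression loss.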
