Which Questions Improve Learning the Most? Utility Estimation of Questions with LM-based Simulations
Dong-Ho Lee, Hyundong Cho, Jonathan May, Jay Pujara

TL;DR
This paper introduces QUEST, a framework that uses language models to simulate learners and measures a question's utility directly, by its effect on exam performance.
Contribution
The paper presents QUEST, a novel simulation-based method for directly estimating question utility, and curates a benchmark dataset linking textbook content to exam questions across disciplines.
Findings
Questions from QUEST-trained generators improve simulated test scores by over 20%.
Estimated question utility is weakly correlated with salience and similarity metrics.
QUEST enables outcome-driven question generation that enhances learning outcomes.
Abstract
Asking good questions is critical for comprehension and learning, yet evaluating and generating such questions remains a challenging problem. Prior work on inquisitive questions focuses on learner-generated, curiosity-driven queries and evaluates them using indirect metrics, such as salience or information gain, that do not directly capture a question's impact on actual learning outcomes. We introduce QUEST (Question Utility Estimation with Simulated Tests), a framework that uses language models to simulate learners and directly quantify the utility of a question, defined as its contribution to exam performance. QUEST simulates a learner who asks questions and receives answers while studying a textbook chapter, then uses them to take an end-of-chapter exam. Through this simulation, the utility of each question is estimated by its direct effect on exam performance, rather than inferred indirectly…
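The simulation loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `exam_score`, `question_utility`, and `toy_learner` are hypothetical, the learner is a keyword-matching stub standing in for a language model, and utility is assumed here to be the change in exam score when one question-answer pair is added to the learner's notes.

```python
import re
from typing import Callable, Sequence

# A (question, answer) pair, used both for study notes and for exam items.
QAPair = tuple[str, str]
# Hypothetical learner interface: (study notes, exam question) -> answer.
Learner = Callable[[Sequence[QAPair], str], str]


def exam_score(learner: Learner, notes: Sequence[QAPair], exam: Sequence[QAPair]) -> float:
    """Fraction of exam items the simulated learner answers correctly."""
    if not exam:
        raise ValueError("exam must contain at least one item")
    correct = sum(
        learner(notes, question).strip().lower() == answer.strip().lower()
        for question, answer in exam
    )
    return correct / len(exam)


def question_utility(
    learner: Learner,
    candidate: QAPair,
    exam: Sequence[QAPair],
    base_notes: Sequence[QAPair] = (),
) -> float:
    """Assumed definition: exam score with the candidate minus score without it."""
    without_q = exam_score(learner, list(base_notes), exam)
    with_q = exam_score(learner, [*base_notes, candidate], exam)
    return with_q - without_q


def _keywords(text: str) -> set[str]:
    # Crude content-word filter: lowercase alphabetic tokens longer than 4 letters.
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) > 4}


def toy_learner(notes: Sequence[QAPair], exam_question: str) -> str:
    """Stub for an LM learner: reuse the answer of the first note sharing a keyword."""
    wanted = _keywords(exam_question)
    for question, answer in notes:
        if wanted & _keywords(question):
            return answer
    return "unknown"


exam = [
    ("What organelle produces ATP?", "mitochondria"),
    ("What molecule stores genetic information?", "dna"),
]
helpful = ("Which organelle makes ATP in the cell?", "mitochondria")
unhelpful = ("Who wrote the chapter?", "the author")

print(question_utility(toy_learner, helpful, exam))    # 0.5
print(question_utility(toy_learner, unhelpful, exam))  # 0.0
```

In this toy setup the first question raises the score from 0 to 1 of 2 exam items, while the second changes nothing. Swapping `toy_learner` for a function that prompts a language model with the notes would keep the same outcome-based scoring.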
