Quiz Design Task: Helping Teachers Create Quizzes with Automated Question Generation
Philippe Laban, Chien-Sheng Wu, Lidiya Murakhovs'ka, Wenhao Liu, Caiming Xiong

TL;DR
This paper evaluates whether improvements in question generation models for quizzes translate into practical benefits for teachers, revealing that current models still have significant room for enhancement despite progress in standard metrics.
Contribution
The study connects NLG metric improvements to real-world quiz creation, highlighting the gap between model performance and practical acceptance by teachers.
Findings
Question acceptance rate increased with recent QGen models
Best model achieved 68.4% acceptance rate among teachers
Model performance on standard NLG metrics has reached projected upper bounds, suggesting new automatic metrics are needed
Abstract
Question generation (QGen) models are often evaluated with standardized NLG metrics that are based on n-gram overlap. In this paper, we measure whether these metric improvements translate to gains in a practical setting, focusing on the use case of helping teachers automate the generation of reading comprehension quizzes. In our study, teachers building a quiz receive question suggestions, which they can either accept or refuse with a reason. Even though we find that recent progress in QGen leads to a significant increase in question acceptance rates, there is still large room for improvement, with the best model having only 68.4% of its questions accepted by the ten teachers who participated in our study. We then leverage the annotations we collected to analyze standard NLG metrics and find that model performance has reached projected upper-bounds, suggesting new automatic metrics are…
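The acceptance rate discussed in the abstract can be illustrated with a minimal sketch. It assumes each teacher annotation is recorded as a (model, accepted, reason) tuple; the record layout, the model names, the rejection reasons, and the function name `acceptance_rates` are all hypothetical and are not taken from the paper or its data.

```python
from collections import defaultdict

def acceptance_rates(annotations):
    """Return {model: fraction of that model's suggested questions accepted}.

    `annotations` is an iterable of (model, accepted, reason) tuples.
    This layout is an assumption made for illustration only.
    """
    accepted = defaultdict(int)
    total = defaultdict(int)
    for model, was_accepted, _reason in annotations:
        total[model] += 1
        if was_accepted:
            accepted[model] += 1
    return {model: accepted[model] / total[model] for model in total}

# Hypothetical toy annotations, not data from the study.
toy = [
    ("model_a", True, None),
    ("model_a", False, "off-topic"),
    ("model_a", True, None),
    ("model_a", True, None),
    ("model_b", False, "ungrammatical"),
    ("model_b", True, None),
]
rates = acceptance_rates(toy)
print(rates)
```

Under this reading, a reported rate such as 68.4% is simply the share of a model's suggestions that teachers kept, pooled over all participating teachers.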
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
