DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions
Nigel Fernandez, Alexander Scarlatos, Wanyong Feng, Simon Woodhead, Andrew Lan

TL;DR
DiVERT is a variational approach that generates plausible, interpretable distractors for math MCQs by modeling the errors behind them. It outperforms GPT-4-based approaches in distractor quality, and human evaluators rate its error labels as comparable to human-authored ones.
Contribution
Introduces DiVERT, a variational method that learns error representations for distractor generation in math MCQs, enhancing plausibility and interpretability.
Findings
DiVERT outperforms GPT-4-based approaches in distractor quality.
Human evaluators find DiVERT's error labels comparable to human-authored ones.
DiVERT is effective even with a base open-source language model of 7B parameters.
Abstract
High-quality distractors are crucial to both the assessment and pedagogical value of multiple-choice questions (MCQs), where manually crafting ones that anticipate knowledge deficiencies or misconceptions among real students is difficult. Meanwhile, automated distractor generation, even with the help of large language models (LLMs), remains challenging for subjects like math. It is crucial to not only identify plausible distractors but also understand the error behind them. In this paper, we introduce DiVERT (Distractor Generation with Variational Errors Represented as Text), a novel variational approach that learns an interpretable representation of errors behind distractors in math MCQs. Through experiments on a real-world math MCQ dataset with 1,434 questions used by hundreds of thousands of students, we show that DiVERT, despite using a base open-source LLM with 7B parameters,…
Taxonomy
Topics: Advanced Text Analysis Techniques · Topic Modeling · Natural Language Processing Techniques
Methods: Balanced Selection
