MATHWELL: Generating Educational Math Word Problems Using Teacher Annotations
Bryan R Christ, Jonathan Kropko, Thomas Hartvigsen

TL;DR
MATHWELL is a fine-tuned 70B language model designed to generate educational math word problems that are solvable, accurate, and appropriate for K-8 students, outperforming existing models in quality and safety.
Contribution
The paper introduces MATHWELL, the first K-8 targeted math word problem generator fine-tuned with teacher annotations for educational appropriateness.
Findings
MATHWELL produces more solvable and accurate problems than public models.
MATHWELL matches GPT-4 in problem quality while producing problems that are more appropriate for K-8 students.
MATHWELL avoids generating harmful questions and matches desired reading levels.
Abstract
Math word problems are critical K-8 educational tools, but writing them is time-consuming and requires extensive expertise. To be educational, problems must be solvable, have accurate answers, and, most importantly, be educationally appropriate. We propose that language models have potential to support K-8 math education by automatically generating word problems. However, educational appropriateness is hard to quantify. We fill this gap by having teachers evaluate problems generated by LLMs; the teachers find that existing models and data often fail to be educationally appropriate. We then explore automatically generating educational word problems, ultimately using our expert annotations to finetune a 70B language model. Our model, MATHWELL, is the first K-8 word problem generator targeted at educational appropriateness. Further expert studies find MATHWELL generates problems far more…
Taxonomy
Topics: Intelligent Tutoring Systems and Adaptive Learning · Educational Assessment and Pedagogy · Mathematics, Computing, and Information Processing
Methods: Attention Is All You Need · Dropout · Adam · Position-Wise Feed-Forward Layer · Linear Layer · Layer Normalization · Byte Pair Encoding · Absolute Position Encodings · Multi-Head Attention · Dense Connections
