Solving Arithmetic Word Problems Automatically Using Transformer and Unambiguous Representations
Kaden Griffith, Jugal Kalita

TL;DR
This paper presents a Transformer-based approach that solves math word problems automatically by translating them into arithmetic expressions. It combines pre-training on general text with several expression notations (infix, prefix, and postfix) and reports significant accuracy improvements.
Contribution
It introduces a Transformer model trained on both domain-specific and general text data for translating word problems into arithmetic expressions, outperforming prior methods.
Findings
Most neural configurations outperform previous approaches by over 20 percentage points.
The best models increase accuracy by nearly 10% over the previous state of the art.
Pre-training on general text improves performance on math word problem translation.
Abstract
Constructing accurate and automatic solvers of math word problems has proven to be quite challenging. Prior attempts using machine learning have been trained on corpora specific to math word problems to produce arithmetic expressions in infix notation before answer computation. We find that custom-built neural networks have struggled to generalize well. This paper outlines the use of Transformer networks trained to translate math word problems to equivalent arithmetic expressions in infix, prefix, and postfix notations. In addition to training directly on domain-specific corpora, we use an approach that pre-trains on a general text corpus to provide foundational language abilities to explore if it improves performance. We compare results produced by a large number of neural configurations and find that most configurations outperform previously reported approaches on three of four…
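To make the three target notations concrete, here is a minimal sketch of how one arithmetic expression looks in infix, postfix, and prefix form, and how the postfix form can be evaluated without parentheses. This is not the authors' model or code: the helper names (`infix_to_postfix`, `postfix_to_prefix`, `eval_postfix`) and the sample expression are assumptions chosen for illustration, and the conversion uses the standard shunting-yard algorithm rather than anything described in the paper.

```python
import operator

# Precedence and implementation for each binary operator (illustrative set).
OPS = {"+": (1, operator.add), "-": (1, operator.sub),
       "*": (2, operator.mul), "/": (2, operator.truediv)}

def infix_to_postfix(tokens):
    """Shunting-yard conversion for left-associative binary operators."""
    output, stack = [], []
    for tok in tokens:
        if tok in OPS:
            # Pop operators of equal or higher precedence before pushing.
            while stack and stack[-1] in OPS and OPS[stack[-1]][0] >= OPS[tok][0]:
                output.append(stack.pop())
            stack.append(tok)
        elif tok == "(":
            stack.append(tok)
        elif tok == ")":
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()  # discard the matching "("
        else:
            output.append(tok)  # operand
    while stack:
        output.append(stack.pop())
    return output

def postfix_to_prefix(tokens):
    """Rebuild the expression with each operator placed before its operands."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right, left = stack.pop(), stack.pop()
            stack.append([tok] + left + right)
        else:
            stack.append([tok])
    return stack[0]

def eval_postfix(tokens):
    """Evaluate a postfix token list with a single operand stack."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right, left = stack.pop(), stack.pop()
            stack.append(OPS[tok][1](left, right))
        else:
            stack.append(float(tok))
    return stack[0]

# Hypothetical target expression for a word problem such as
# "Five plus three items, doubled."
infix = "( 5 + 3 ) * 2".split()
postfix = infix_to_postfix(infix)
prefix = postfix_to_prefix(postfix)
print(" ".join(infix), "|", " ".join(postfix), "|", " ".join(prefix))
print(eval_postfix(postfix))
```

The infix form needs parentheses to fix the order of operations, while the prefix and postfix forms encode that order in token position alone, which is one sense in which they can be called unambiguous.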
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling · Software Engineering Research
Methods: Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Byte Pair Encoding · Dense Connections · Label Smoothing · Adam · Softmax
