Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning
Pan Lu, Liang Qiu, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Tanmay Rajpurohit, Peter Clark, Ashwin Kalyan

TL;DR
This paper introduces TabMWP, a new dataset for complex mathematical reasoning involving heterogeneous data, and proposes PromptPG, a policy gradient method to improve in-context example selection for GPT-3, enhancing accuracy and stability.
Contribution
The paper presents a novel dataset for semi-structured mathematical reasoning and a policy gradient-based approach to optimize in-context example selection for large language models.
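To make the idea of policy-gradient example selection concrete, here is a minimal REINFORCE sketch. It is an illustration under simplifying assumptions, not the authors' implementation: the policy is a plain softmax over one learnable score per candidate example, only one example is sampled per step, and `toy_reward` is a hypothetical stand-in for prompting the language model with the chosen example and checking whether its answer is correct. The names `reinforce_step`, `train_selector`, and `toy_reward` are invented for this sketch.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, lr, rng):
    """One REINFORCE update: sample a candidate, observe a reward, adjust scores."""
    probs = softmax(logits)
    choice = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    reward = reward_fn(choice)
    # d/d logit_j of log pi(choice) is (1[j == choice] - probs[j]).
    for j in range(len(logits)):
        indicator = 1.0 if j == choice else 0.0
        logits[j] += lr * reward * (indicator - probs[j])
    return choice, reward


def train_selector(num_candidates, reward_fn, steps=2000, lr=0.1, seed=0):
    """Train the selection policy in place and return its final probabilities."""
    rng = random.Random(seed)
    logits = [0.0] * num_candidates  # start from a uniform policy
    for _ in range(steps):
        reinforce_step(logits, reward_fn, lr, rng)
    return softmax(logits)


def toy_reward(choice):
    # Hypothetical reward (assumption): candidate 2 always yields a correct
    # answer (+1); every other candidate yields a wrong one (-1).
    return 1.0 if choice == 2 else -1.0


probs = train_selector(num_candidates=4, reward_fn=toy_reward)
```

In this toy setting the policy concentrates its probability mass on the candidate that earns positive reward. A practical selector would additionally condition the scores on the current problem and draw several in-context examples per prompt.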
Findings
PromptPG outperforms the best baseline by 5.31% in accuracy.
It significantly reduces prediction variance compared with random selection of in-context examples.
It is effective on complex reasoning tasks that combine textual and tabular data.
Abstract
Mathematical reasoning, a core ability of human intelligence, presents unique challenges for machines in abstract thinking and logical reasoning. Recent large pre-trained language models such as GPT-3 have achieved remarkable progress on mathematical reasoning tasks written in text form, such as math word problems (MWP). However, it is unknown if the models can handle more complex problems that involve math reasoning over heterogeneous information, such as tabular data. To fill the gap, we present Tabular Math Word Problems (TabMWP), a new dataset containing 38,431 open-domain grade-level problems that require mathematical reasoning on both textual and tabular data. Each question in TabMWP is aligned with a tabular context, which is presented as an image, semi-structured text, and a structured table. There are two types of questions: free-text and multi-choice, and each problem is…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
Methods: Attention Is All You Need · Test · Linear Layer · Dropout · Cosine Annealing · Byte Pair Encoding · Dense Connections · Layer Normalization · Attention Dropout · Multi-Head Attention
