Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning

Pan Lu; Liang Qiu; Kai-Wei Chang; Ying Nian Wu; Song-Chun Zhu; Tanmay Rajpurohit; Peter Clark; Ashwin Kalyan

arXiv:2209.14610 · cs.LG · March 3, 2023 · 41 cites

TL;DR

This paper introduces TabMWP, a new dataset for complex mathematical reasoning involving heterogeneous data, and proposes PromptPG, a policy gradient method to improve in-context example selection for GPT-3, enhancing accuracy and stability.

Contribution

The paper presents a novel dataset for semi-structured mathematical reasoning and a policy gradient-based approach to optimize in-context example selection for large language models.
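
To make the idea of policy-gradient example selection concrete, here is a minimal sketch of a REINFORCE-style update over a small pool of candidate in-context examples. It is not the authors' implementation: the softmax-over-logits policy, the `train_selector` and `toy_reward` names, the +1/-1 reward, and the per-candidate success rates are all assumptions made for illustration. In particular, `toy_reward` is a hypothetical stand-in for querying a language model and checking its answer.

```python
import math
import random


def softmax(scores):
    """Convert raw scores into a probability distribution."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sample_index(probs, rng):
    """Sample an index according to the given probabilities."""
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1


def train_selector(reward_fn, num_candidates, steps=2000, lr=0.1, seed=0):
    """Learn a selection policy over candidates with a REINFORCE update."""
    rng = random.Random(seed)
    logits = [0.0] * num_candidates  # one learnable score per candidate example
    for _ in range(steps):
        probs = softmax(logits)
        choice = sample_index(probs, rng)
        reward = reward_fn(choice, rng)
        # Gradient of log pi(choice) w.r.t. logit j is 1[j == choice] - probs[j].
        for j in range(num_candidates):
            grad = (1.0 if j == choice else 0.0) - probs[j]
            logits[j] += lr * reward * grad
    return softmax(logits)


def toy_reward(choice, rng):
    """Hypothetical reward: +1 if the (simulated) model answers correctly, else -1.

    The success rates below are made up; a real setup would call the language
    model with the chosen in-context example and score its prediction.
    """
    success_rate = [0.2, 0.8, 0.4][choice]
    return 1.0 if rng.random() < success_rate else -1.0


probs = train_selector(toy_reward, num_candidates=3)
print([round(p, 3) for p in probs])
```

In this toy setting the learned distribution concentrates on the candidate with the highest simulated success rate. A selector closer to the paper's setting would presumably condition the policy on the test problem rather than learn a single global distribution, which this sketch does not attempt.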

Findings

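01. PromptPG outperforms the best baseline by 5.31% in accuracy.

02. It significantly reduces prediction variance compared with random selection of in-context examples.

03. These results demonstrate its effectiveness at selecting in-context examples for complex reasoning tasks.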

Abstract

Mathematical reasoning, a core ability of human intelligence, presents unique challenges for machines in abstract thinking and logical reasoning. Recent large pre-trained language models such as GPT-3 have achieved remarkable progress on mathematical reasoning tasks written in text form, such as math word problems (MWP). However, it is unknown if the models can handle more complex problems that involve math reasoning over heterogeneous information, such as tabular data. To fill the gap, we present Tabular Math Word Problems (TabMWP), a new dataset containing 38,431 open-domain grade-level problems that require mathematical reasoning on both textual and tabular data. Each question in TabMWP is aligned with a tabular context, which is presented as an image, semi-structured text, and a structured table. There are two types of questions: free-text and multi-choice, and each problem is…

Code & Models

Videos

Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Test · Linear Layer · Dropout · Cosine Annealing · Byte Pair Encoding · Dense Connections · Layer Normalization · Attention Dropout · Multi-Head Attention