Making Large Language Models Better Reasoners with Step-Aware Verifier

Yifei Li; Zeqi Lin; Shizhuo Zhang; Qiang Fu; Bei Chen; Jian-Guang Lou; Weizhu Chen

arXiv:2206.02336·cs.CL·May 25, 2023·39 cites

TL;DR

This paper introduces DIVERSE, a step-aware verifier approach that improves the reasoning accuracy of large language models by generating diverse reasoning paths and verifying each step individually. It reaches 83.2% on GSM8K, up from the previous best of 74.4%.

Contribution

DIVERSE enhances reasoning in language models by combining diverse prompt generation with step-by-step verification, and it outperforms prior methods on six of the eight reasoning benchmarks evaluated.

Findings

1. Achieves 83.2% on GSM8K, surpassing the previous best of 74.4%.

2. Outperforms prior methods on six of the eight reasoning benchmarks evaluated.

3. Improves problem-solving accuracy across multiple datasets, not only GSM8K.

Abstract

Few-shot learning is a challenging task that requires language models to generalize from limited examples. Large language models like GPT-3 and PaLM have made impressive progress in this area, but they still face difficulties in reasoning tasks such as GSM8K, a benchmark for arithmetic problems. To improve their reasoning skills, previous work has proposed to guide the language model with prompts that elicit a series of reasoning steps before giving the final answer, achieving a significant improvement on GSM8K from 17.9% to 58.1% in problem-solving rate. In this paper, we present DIVERSE (Diverse Verifier on Reasoning Step), a novel approach that further enhances the reasoning capability of language models. DIVERSE has three main components: first, it generates diverse prompts to explore different reasoning paths for the same question; second, it uses a verifier to filter out incorrect…
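The abstract is cut off before it says how verifier scores are used, so the following is only a rough sketch under stated assumptions. It assumes each sampled reasoning path yields a final answer plus a verifier score in [0, 1], and that a path's vote is weighted by that score rather than counted equally. The function names and all scores are hypothetical illustrations, not values or code from the paper.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Baseline: return the answer produced by the most reasoning paths."""
    return Counter(answers).most_common(1)[0][0]


def verifier_weighted_vote(scored_paths):
    """Return the answer with the largest total verifier score.

    scored_paths: iterable of (answer, score) pairs, where score is a
    hypothetical verifier estimate that the path is correct.
    """
    totals = defaultdict(float)
    for answer, score in scored_paths:
        totals[answer] += score
    return max(totals, key=totals.get)


# Five sampled paths for one question (made-up answers and scores).
scored_paths = [
    ("18", 0.9),
    ("18", 0.8),
    ("20", 0.3),
    ("20", 0.2),
    ("20", 0.1),
]

baseline = majority_vote([answer for answer, _ in scored_paths])
weighted = verifier_weighted_vote(scored_paths)
print(baseline)  # "20": three of five paths agree
print(weighted)  # "18": total score 1.7 versus 0.6
```

In this toy case, plain majority voting picks the answer that more paths reach, while score-weighted voting prefers the answer backed by paths the verifier trusts. Step-level verification, which the TL;DR mentions, would score individual steps rather than whole paths and is not modeled here.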

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification

Methods: Multi-Head Attention · Attention Is All You Need · Pathways Language Model · Linear Layer · Cosine Annealing · Residual Connection · Dropout · Dense Connections · Weight Decay