Making Large Language Models Better Reasoners with Step-Aware Verifier
Yifei Li, Zeqi Lin, Shizhuo Zhang, Qiang Fu, Bei Chen, Jian-Guang Lou, Weizhu Chen

TL;DR
This paper introduces DIVERSE, a step-aware verifier approach that improves the reasoning accuracy of large language models by generating diverse reasoning paths and verifying each step individually, achieving state-of-the-art results on six of eight reasoning benchmarks.
Contribution
DIVERSE is a novel method that enhances reasoning in language models through diverse prompt generation and step-by-step verification, surpassing previous techniques.
Findings
Achieves 83.2% on GSM8K, surpassing previous best of 74.4%.
Outperforms prior methods on six of eight reasoning benchmarks.
Improves problem-solving accuracy significantly across multiple datasets.
Abstract
Few-shot learning is a challenging task that requires language models to generalize from limited examples. Large language models like GPT-3 and PaLM have made impressive progress in this area, but they still face difficulties in reasoning tasks such as GSM8K, a benchmark for arithmetic problems. To improve their reasoning skills, previous work has proposed to guide the language model with prompts that elicit a series of reasoning steps before giving the final answer, achieving a significant improvement on GSM8K from 17.9% to 58.1% in problem-solving rate. In this paper, we present DIVERSE (Diverse Verifier on Reasoning Step), a novel approach that further enhances the reasoning capability of language models. DIVERSE has three main components: first, it generates diverse prompts to explore different reasoning paths for the same question; second, it uses a verifier to filter out incorrect…
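The idea of using a verifier to weigh sampled reasoning paths can be illustrated with a minimal sketch. It assumes each sampled path has already been reduced to a final answer plus a verifier score in [0, 1]. The function names, the example scores, and the `threshold` parameter are hypothetical and chosen for illustration; they are not taken from the paper, and the sketch does not cover step-level verification.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each sampled reasoning path is reduced to (final_answer, verifier_score).
# The scores are made-up illustration values, not model outputs.
Path = Tuple[str, float]


def majority_vote(paths: List[Path]) -> str:
    """Baseline: pick the most frequent final answer, ignoring scores."""
    counts: Dict[str, int] = defaultdict(int)
    for answer, _score in paths:
        counts[answer] += 1
    return max(counts, key=counts.get)


def verifier_weighted_vote(paths: List[Path], threshold: float = 0.0) -> str:
    """Sum verifier scores per answer and return the highest-scoring answer.

    Paths scoring below `threshold` (a hypothetical filter) are dropped.
    If every path is dropped, fall back to plain majority voting.
    """
    totals: Dict[str, float] = defaultdict(float)
    for answer, score in paths:
        if score >= threshold:
            totals[answer] += score
    if not totals:
        return majority_vote(paths)
    return max(totals, key=totals.get)


# Three low-confidence paths agree on "20"; two high-confidence paths say "18".
sample_paths: List[Path] = [
    ("18", 0.9),
    ("18", 0.8),
    ("20", 0.2),
    ("20", 0.3),
    ("20", 0.1),
]

baseline_answer = majority_vote(sample_paths)
weighted_answer = verifier_weighted_vote(sample_paths, threshold=0.25)
```

In this toy input, plain majority voting follows the three low-scoring paths, while the score-weighted vote favors the answer backed by the two high-scoring paths. How the scores are produced is left open here.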
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Text Readability and Simplification
Methods: Multi-Head Attention · Attention Is All You Need · Pathways Language Model · Linear Layer · Cosine Annealing · Residual Connection · Dropout · Dense Connections · Weight Decay
