Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought

Abulhair Saparov; He He

arXiv:2210.01240·cs.CL·March 3, 2023·38 cites


TL;DR

This paper uses a synthetic dataset to analyze the reasoning of large language models step by step. It finds that they make correct individual deduction steps but struggle with proof planning when several deduction paths are available.

Contribution

The paper introduces PrOntoQA, a synthetic question-answering dataset in which each example is generated from a world model represented in first-order logic. This lets a model's chain-of-thought be parsed into a symbolic proof and analyzed formally.

Findings

01 LLMs can perform correct deduction steps

02 They are capable of reasoning in fictional contexts

03 They struggle with proof planning and exploring multiple deduction paths

Abstract

Large language models (LLMs) have shown remarkable reasoning capabilities given chain-of-thought prompts (examples with intermediate reasoning steps). Existing benchmarks measure reasoning ability indirectly, by evaluating accuracy on downstream tasks such as mathematical reasoning. However, it is unclear how these models obtain the answers and whether they rely on simple heuristics rather than the generated chain-of-thought. To enable systematic exploration of the reasoning ability of LLMs, we present a new synthetic question-answering dataset called PrOntoQA, where each example is generated from a synthetic world model represented in first-order logic. This allows us to parse the generated chain-of-thought into symbolic proofs for formal analysis. Our analysis on InstructGPT and GPT-3 shows that LLMs are quite capable of making correct individual deduction steps, and so are generally…
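
To make the idea of generating an example from a symbolic world model concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the authors' generator: the ontology, the concept names, the entity name, and the `build_proof` helper are all hypothetical. It renders a chain of single deduction steps as sentences, the kind of chain-of-thought that could then be parsed back into a proof.

```python
# Hypothetical toy ontology: each concept maps to its parent concept,
# i.e. "wumpus" -> "vumpus" encodes the rule "Every wumpus is a vumpus."
# All names are invented for illustration and are not taken from the dataset.
SUBTYPE_OF = {"wumpus": "vumpus", "vumpus": "tumpus", "tumpus": "zumpus"}


def build_proof(entity, start, goal, subtype_of):
    """Return the sentences of a deduction chain from start to goal, or None.

    Each loop iteration adds one rule and the fact it yields, so every
    derived fact follows from the previous fact by a single deduction step.
    """
    steps = [f"{entity} is a {start}."]
    current = start
    seen = {start}
    while current != goal:
        parent = subtype_of.get(current)
        if parent is None or parent in seen:
            return None  # goal is not reachable from start
        steps.append(f"Every {current} is a {parent}.")
        steps.append(f"{entity} is a {parent}.")
        seen.add(parent)
        current = parent
    return steps


proof = build_proof("Max", "wumpus", "tumpus", SUBTYPE_OF)
print("\n".join(proof))
```

Because this toy ontology gives every concept at most one parent, there is only one path to follow. The proof-planning difficulty the paper reports concerns settings where more than one deduction step is available at a time.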


Code & Models

Videos

Language Models Are Greedy Reasoners: A Systematic Formal Analysis of Chain-of-Thought · SlidesLive

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Explainable Artificial Intelligence (XAI)

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Adam · Dense Connections · Dropout · Layer Normalization