Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving
Yuliang Ji, Fuchen Shen, Jian Wu, Qiujie Xie, Yue Zhang

TL;DR
This paper introduces a new FOL dataset focusing on case-based reasoning, revealing a significant performance gap in LLMs between linear reasoning and proof by cases, supported by theoretical analysis.
Contribution
The paper presents a novel FOL dataset with case-based reasoning problems and provides a theoretical explanation for LLMs' difficulties with non-linear reasoning.
Findings
LLMs perform significantly worse on case-based reasoning tasks
A new dataset (PC-FOL) is introduced for evaluating proof by cases in FOL
Theoretical analysis explains the disparity in reasoning performance
Abstract
To comprehensively evaluate the mathematical reasoning capabilities of Large Language Models (LLMs), researchers have introduced abundant mathematical reasoning datasets. However, most existing datasets primarily focus on linear reasoning, neglecting other parts such as proof by contradiction and proof by cases, which are crucial for investigating LLMs' reasoning abilities. To address this limitation, we first introduce a novel first-order logic (FOL) dataset named PC-FOL, annotated by professional mathematicians, focusing on case-based reasoning problems. All instances in this dataset are equipped with a manually written natural language proof, clearly distinguishing it from conventional linear reasoning datasets. Our experimental results over leading LLMs demonstrate a substantial performance gap between linear reasoning and case-based reasoning problems. To further investigate this…
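To make the distinction in the abstract concrete, the following Lean 4 sketch contrasts the two proof shapes. It is an illustration only: the propositions and predicates (`P`, `Q`, `R`, `A`, `B`, `C`) are hypothetical and are not drawn from PC-FOL or from the paper's proofs.

```lean
-- Linear reasoning: each step feeds directly into the next (P, then Q, then R).
example (P Q R : Prop) (hp : P) (hpq : P → Q) (hqr : Q → R) : R :=
  hqr (hpq hp)

-- Proof by cases: the goal R is established separately under each disjunct.
example (P Q R : Prop) (h : P ∨ Q) (hpr : P → R) (hqr : Q → R) : R :=
  Or.elim h hpr hqr

-- A first-order variant (hypothetical predicates): for every x, split on A x ∨ B x.
example (α : Type) (A B C : α → Prop)
    (h : ∀ x, A x ∨ B x) (ha : ∀ x, A x → C x) (hb : ∀ x, B x → C x) :
    ∀ x, C x :=
  fun x => Or.elim (h x) (ha x) (hb x)
```

In the linear proof there is a single chain of implications, while the case-based proofs branch and must close every branch before the conclusion follows.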
Taxonomy
Topics: Mathematics, Computing, and Information Processing · Constraint Satisfaction and Optimization · Topic Modeling
